Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.fw001
Minicomputers and Large Scale Computations
In Minicomputers and Large Scale Computations; Lykos, P.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
Minicomputers and Large Scale Computations Peter Lykos, EDITOR
Illinois Institute of Technology
A symposium sponsored by the ACS Division of Computers in Chemistry at the Second Joint Conference of the Chemical Institute of Canada and the American Chemical Society, Montreal, Canada, June 1, 1977.
ACS SYMPOSIUM SERIES 57
AMERICAN CHEMICAL SOCIETY
WASHINGTON, D.C. 1977
Library of Congress Data
Minicomputers and large scale computations. (ACS symposium series; 57 ISSN 0097-6156)
Includes bibliographical references and index.
1. Chemistry--Data processing--Congresses. 2. Minicomputers--Congresses.
I. Lykos, Peter George, 1927- . II. American Chemical Society. Division of Computers in Chemistry. III. Joint Conference of the Chemical Institute of Canada and the American Chemical Society, 2nd, Montreal, Quebec, 1977. IV. Series: American Chemical Society. ACS symposium series; 57.
QD39.3.E46M56    542'.8    77-15932
ISBN 0-8412-0387-3    ACSMC8 57 1-239
Copyright © 1977 American Chemical Society
All Rights Reserved. No part of this book may be reproduced or transmitted in any form or by any means (graphic, electronic, including photocopying, recording, taping, or information storage and retrieval systems) without written permission from the American Chemical Society.
The citation of trade names and/or names of manufacturers in this publication is not to be construed as an endorsement or as approval by ACS of the commercial products or services referenced herein; nor should the mere reference herein to any drawing, specification, chemical process, or other data be regarded as a license or as a conveyance of any right or permission, to the holder, reader, or any other person or corporation, to manufacture, reproduce, use, or sell any patented invention or copyrighted work that may in any way be related thereto.
PRINTED IN THE UNITED STATES OF AMERICA
Society Library
1155 16th St. N.W., Washington, D.C. 20036
ACS Symposium Series
Robert F. Gould, Editor
Advisory Board
Donald G. Crosby
Jeremiah P. Freeman
E. Desmond Goddard
Robert A. Hofstader
John L. Margrave
Nina I. McClelland
John B. Pfeiffer
Joseph V. Rodricks
Alan C. Sartorelli
Raymond B. Seymour
Roy L. Whistler
Aaron Wold
FOREWORD

The ACS SYMPOSIUM SERIES was founded in 1974 to provide a medium for publishing symposia quickly in book form. The format of the SERIES parallels that of the continuing ADVANCES IN CHEMISTRY SERIES except that in order to save time the papers are not typeset but are reproduced as they are submitted by the authors in camera-ready form. As a further means of saving time, the papers are not edited or reviewed except by the symposium chairman, who becomes editor of the book. Papers published in the ACS SYMPOSIUM SERIES are original contributions not published elsewhere in whole or major part and include reports of research as well as reviews since symposia may embrace both types of presentation.
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.pr001
PREFACE

This symposium on "Minicomputers and Large Scale Computations" brings together a representative set of reports of concrete experiences, including cost analyses, in which computer users have turned to so-called minicomputers to handle computational problems which just a few years ago could have been handled only by large scale scientific computers. This book should be viewed as a snapshot of a dynamic situation changing fairly rapidly in time. The chapters have been arranged in sequence starting with the smallest instrument (a hand-held programmable calculator) to the largest (a dual large scale, or super, minicomputer).

Several superposed trends are operating, and it is important to sort them out so that one can intelligently analyze how to best approach a particular set of computational needs. In its first manifestation with widespread use (the DEC PDP-8) the minicomputer was physically small (made to fit in a standard instrument rack), slow in cycle time, and small in main memory size; it had a short list of machine instructions, a short word length, minimal software support, and virtually no peripherals except a teletypewriter. Its target users were experimenters interested in automated data collection and reduction and those concerned with real-time control applications, sometimes in nonfriendly physical environments.

Gradually the minicomputer evolved in several directions including toward the large scale or "super" minicomputer typified by the last four chapters (13-16). The super minicomputer class includes machines with 16-, 24-, and 32-bit word-based architectures, fast floating-point arithmetic (achieved in different ways), virtual memories, a full range of peripheral devices (mass storage, printers, card readers, etc.), and sophisticated multi-user supporting operating systems, compilers, interpreters, and data-base management systems.
Indeed the PRIME 400 even has a super-speed small (or cache) component of the main fast memory similar to the IBM 370/195. Thus the full power of the superscientific computer of 10 years ago is now available for an order of magnitude less in the costs of purchase, maintenance, and operation. In addition, the space and air conditioning requirements have been reduced to those of an ordinary small research laboratory.

Even the modern laboratory minicomputer, similar in many respects to the venerable PDP-8, is being pressed into service as a scientific calculator. Chapters 2, 3, 6, 7, 8, 10, and 11 present examples in which the computer program was reorganized compared with the way in which it would have been done for a large scale scientific computer. The small main memory forced the users to (a) make more, and more clever, use of disk storage where available, (b) search for non-conventional algorithms, in some cases more specifically problem-oriented, and (c) in two cases minimize the need for floating-point operations by scaling, and by table searching and interpolation. Of special concern here, because of the long (wall clock) running times, is the finite probability of machine failure.

Chapters 4, 5, 6, and 16 explore the trade-offs involved in using minicomputers for portions of the calculations and conventional large scale computers for the remainder. Indeed Chapter 4 introduces APL (a mathematically oriented language not so widely used by chemists as the ubiquitous FORTRAN) and also a feature of the IBM 5100 APL processor which permits the unsophisticated (i.e., higher level language) programmer to build in details of communication protocol easily where optimal distribution of computing tasks among several processors is sought.

Another trend is toward the design of special-purpose processors intended to be enslaved to conventional processors. The array processor AP-120B, as an add-on to the Harris 6024/4 at the National Astronomy and Ionosphere Center, has handled highly organized floating-point operations at 12.4 megaflops (millions of floating-point operations per second), which has been compared with the 5-megaflop CDC 7600 and the 15-megaflop ILLIAC IV (see Wolin, L., "Procedure Evaluates Computers for Scientific Applications," Computer Design (1976) 15, 93 for a more detailed comparison of minicomputers and current large scale scientific systems). Chapters 5, 8, 10, and 12 use specialized hardware to tailor a computer system to the requirements of a specific class of problems.
The quantum of computational power is shrinking in physical size and cost to the point where the choice, as well as the computer, is in the hands of the individual user. The microprocessor has burst upon the scene. The mushrooming of over 400 retail computer hobby outlets has been sparked by the large scale integrated circuit (LSI) computer-on-a-chip and the growing personal computing market. The hand-held electronic calculator has decreased in physical size to the limit that conventional computer input-output can tolerate, namely the resolving power of the human eye and the physical size of human fingers. Chapter 1 illustrates attache-case-portable programmable computers with off-line storage and built-in printer capability. However, a parallel limiting process also becomes evident, i.e., the decreasing level of software support and the need for programs in machine language.

For the conventional computer (the serial processor) the increasing sophistication of LSI chip circuitry and the decreasing cost per bit of corresponding large scale non-electromechanical mass memory make it more and more likely that the large-scale conventional computer system of today will be replaced by a small inexpensive package that can support today's complex software (see Turn, Rein, "Computers in the 1980's," Columbia University Press, 1974). But by the time that happens, who will want it?

Because of the decreasing size and cost of individual processors, computer designers can contemplate highly concurrent multiprocessor devices. However, such devices with so many degrees of freedom available in their design must be problem-oriented. In addition, the algorithms developed to solve problems on conventional serial processors are no longer optimal for more complex computer systems. The recent symposium on High Speed Computer and Algorithm Organization (proceedings to be published by Academic Press, late 1977) revealed that the surface has hardly been scratched in that regard. Furthermore, the computer designer is severely restricted because historically the user has accepted the designer's product passively and adapted his problems and algorithms to the computer rather than vice versa.

Perhaps the most important trend of all is that the awesome computer mystique is gradually being supplanted by a more healthy attitude on the part of a computer-acculturated and increasingly demanding community of users who are discovering the Golden Rule, namely, "He who has the gold . . . rules."

Chicago, Illinois
September 1977
PETER LYKOS
1
Microcomputer Plus Saul'yev Method Solves Simultaneous Partial Differential Equations of the Diffusion Type with Highly Nonlinear Boundary Conditions
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch001
R. KENNETH WOLFE, DAVID C. COLONY, and RONALD D. EATON
University of Toledo, Toledo, OH 43606

Important today is the ability to answer rapidly and inexpensively the complex questions posed by an increasingly complex society. Mathematics has played an important role in scientific problem solving. Practical solutions today rely heavily on computerized numerical approaches.

This paper extends for use with the Hewlett-Packard 67/97 a numerical method due to Saul'yev (1,2). His method is very similar to the popular method of Schmidt (3) used in graphical, numerical and computer computations to study transient heat conduction problems. This paper will illustrate the use of a small minicomputer (microprocessor) to apply the Saul'yev approach to a simple case and also to a more complex case. The complex case is that of a hot solid slab bounded on one side by a cooler semi-infinite solid and exposed at the hot surface to solar radiation, cloud cover and forced or free convective heat losses to air.

A Simple Case

Consider a solid cylinder with faces fixed at two different temperatures. The sides of the cylinder are insulated. Temperature and time are then related through the extension of Fourier's law to the parabolic partial differential equation:

$$\alpha \frac{\partial^2 T}{\partial x^2} = \frac{\partial T}{\partial t} \tag{1}$$

T = T(x,t) = temperature at a point x and a time t
x = distance, in feet
t = time in hours
α = k/ρC = thermal diffusivity
k = thermal conductivity, BTU/hr ft² °F/ft
ρ = density, lb_m/ft³
C = heat capacity, BTU/lb_m °F
Equation 1) is the diffusivity equation which applies to heat transfer as well as to the transport of matter. Assume that the cylinder has an initial constant temperature of $T_1$ and that at time zero one face is instantaneously brought to a temperature $T_s$. The time-temperature relationship can then be determined analytically by any of several methods to be:

$$T(x,t) = (T_1 - T_s)\,\operatorname{erf}\!\left(\frac{x}{2\sqrt{\alpha t}}\right) + T_s \tag{2}$$

erf(z) = error function

$$\operatorname{erf}(z) = 1 - \frac{\exp(-z^2)}{\sqrt{\pi}}\left[\frac{1}{z} - \frac{1}{2z^3} + \frac{1\cdot 3}{2^2 z^5} - \frac{1\cdot 3\cdot 5}{2^3 z^7} + \cdots\right]$$
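As an illustration of equation 2), the following is a minimal Python sketch, not the authors' HP 67/97 program. The function and parameter names are hypothetical, and the numbers in the test are illustrative only. It evaluates the analytical profile with the library error function and, separately, sums the leading terms of the large-z series quoted for erf(z).

```python
import math


def slab_temperature(x_ft, t_hr, alpha, t_initial, t_surface):
    """Equation 2: T(x,t) = (T1 - Ts) * erf(x / (2*sqrt(alpha*t))) + Ts."""
    if t_hr <= 0.0:
        # At time zero only the face itself is at the surface temperature.
        return t_initial if x_ft > 0.0 else t_surface
    z = x_ft / (2.0 * math.sqrt(alpha * t_hr))
    return (t_initial - t_surface) * math.erf(z) + t_surface


def erf_asymptotic(z, terms=4):
    """Leading terms of the large-z series: 1/z - 1/(2z^3) + 1*3/(2^2 z^5) - ..."""
    total = 0.0
    term = 1.0 / z
    for n in range(terms):
        total += term
        # Each term is the previous one times -(2n+1) / (2 z^2).
        term *= -(2 * n + 1) / (2.0 * z * z)
    return 1.0 - math.exp(-z * z) / math.sqrt(math.pi) * total
```

The series is asymptotic, so it is only useful for fairly large z; for small z a library routine or one of the rational approximations is the safer choice.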
Equation 2) is developed in most heat transfer texts. Two good references which treat this problem are Chapman (4) and Carslaw and Jaeger (5). But direct application of this analytical treatment is not often feasible. Other methods of solution are required. Approximations of the error function are given in Abramowitz and Stegun (6) which are highly accurate and very fast on computers.

Schmidt's Numerical Method

Consider, as shown in Figure 1, a cylinder that is divided into hypothetical elements which are frequently called nodes in heat transfer literature. To develop Schmidt's numerical method, an energy balance around an element i is written:

[Heat flow from i-1] + [Heat flow from i+1] = Heat accumulation in i

$$\frac{kA(T_{i-1} - T_i)}{\Delta x} + \frac{kA(T_{i+1} - T_i)}{\Delta x} = \frac{\rho C A\,\Delta x\,(T_i' - T_i)}{\Delta t} \tag{3}$$

A = area perpendicular to flow, ft²
Δx = element length, ft
Δt = small increment of time, hours
T_i' = temperature of element i at time t+Δt
T_i = temperature of element i at time t

Rearranging equation 3) gives:

$$\frac{\alpha\,\Delta t}{\Delta x^2}\left[(T_{i-1} - T_i) + (T_{i+1} - T_i)\right] = T_i' - T_i$$
Figure 1. Imaginary division of slab with finite depth and large surface dimensions in relation to the depth (nodes T_{i-1}, T_i, T_{i+1} at spacing Δx along the depth D, with isothermal lines)
$$\alpha N\,T_{i-1} + (1 - 2\alpha N)\,T_i + \alpha N\,T_{i+1} = T_i' \tag{4}$$

α = thermal diffusivity, N = Δt/Δx²
For numerical stability, the coefficients of all temperature variables must be non-negative. As far as equation 4) is concerned, this means that N must be selected such that αN ≤ 1/2, so that the coefficient (1 − 2αN) does not become negative.
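The explicit update in equation 4) and its non-negativity requirement (1 − 2αN ≥ 0) can be sketched in Python as follows. This is an illustrative sketch only: the function name is hypothetical, and holding both end temperatures fixed is an assumption matching the simple cylinder case rather than the authors' calculator program.

```python
def schmidt_step(temps, alpha_n):
    """One explicit time step of equation 4; the two end temperatures are held fixed."""
    if not 0.0 <= alpha_n <= 0.5:
        # (1 - 2*alpha*N) would be negative, which the stability condition forbids.
        raise ValueError("alpha*N must lie in [0, 0.5]")
    new = list(temps)  # a second array is needed: every node uses old neighbours
    for i in range(1, len(temps) - 1):
        new[i] = (alpha_n * temps[i - 1]
                  + (1.0 - 2.0 * alpha_n) * temps[i]
                  + alpha_n * temps[i + 1])
    return new
```

With αN = 1/2 the update reduces to the arithmetic mean of the two neighbouring temperatures, which is the form used in graphical constructions.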
Saul'yev method. Other methods, such as Schmidt's, require the use of new registers for this purpose. The agreement between the analytical results and the numerical results is excellent. Since the Saul'yev method is an alternating direction method, an even number of rows is required for accurate results. Each pair of rows is called a pass; the total number of passes, P, is equal to t/(2Δt). Programming the HP 67/97 to obtain the results in Table I, with printout of intermediate results, required 56 program steps out of 224 available. Two memory registers were used to store αN and to maintain a count of the number of rows computed.
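The alternating-direction idea can be sketched in Python as below. The authors' own Saul'yev formulas are not reproduced in this excerpt, so the update used here is the textbook Saul'yev form and should be read as an assumption; the function names are hypothetical. The sketch shows why no extra storage is needed: each row overwrites the temperatures in place, and one pass consists of a left-to-right row followed by a right-to-left row.

```python
def saulyev_pass(temps, alpha_n):
    """One pass: a forward row then a backward row, both updating temps in place."""
    n = len(temps)
    denom = 1.0 + alpha_n
    # Forward row: temps[i-1] already holds the new value, temps[i+1] the old one.
    for i in range(1, n - 1):
        temps[i] = (alpha_n * temps[i - 1] + (1.0 - alpha_n) * temps[i]
                    + alpha_n * temps[i + 1]) / denom
    # Backward row: temps[i+1] already holds the new value, temps[i-1] the old one.
    for i in range(n - 2, 0, -1):
        temps[i] = (alpha_n * temps[i + 1] + (1.0 - alpha_n) * temps[i]
                    + alpha_n * temps[i - 1]) / denom
    return temps


def solve(temps, alpha_n, passes):
    """Run P passes, where P = t / (2 * dt) for an elapsed time t."""
    for _ in range(passes):
        saulyev_pass(temps, alpha_n)
    return temps
```

For example, `solve([100.0, 0.0, 0.0, 0.0, 0.0], 0.5, 500)` relaxes toward the linear steady-state profile between the two fixed end temperatures.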
A More Complex Problem

In this section we examine a more complex case and develop some extended formulas. Figure 2 shows a recent situation where the authors (7) needed to provide a method for field engineers to predict time-temperature cooling curves for hot asphaltic surfaces placed during highway construction, in order to support decisions on whether or not to allow paving work during marginal weather conditions. Figure 2 also shows the various factors which influence the heat transfer and the temperature-time history of a hot pavement layer. The mathematical model which is applicable to this situation is summarized below:

Governing Equation for Hot Layer

$$\alpha_1 \frac{\partial^2 T}{\partial x^2} = \frac{\partial T}{\partial t} \tag{10}$$

Governing Equation for Cold Base

$$\alpha_2 \frac{\partial^2 U}{\partial x^2} = \frac{\partial U}{\partial t} \tag{11}$$

Surface Energy Balance

$$k_1 \frac{\partial T(0,t)}{\partial x} = -aMH + 0.65\,v^{0.8}\,\bigl(T(0,t) - T_{air}\bigr) + \varepsilon\sigma\,\bigl(T(0,t) + 460\bigr)^4 \tag{12}$$

Hot Layer - Cold Layer Interface

$$T(\ell,t) = U(\ell,t) \quad \text{(contact condition)} \tag{13}$$

$$k_1 \frac{\partial T(\ell,t)}{\partial x} = k_2 \frac{\partial U(\ell,t)}{\partial x} \quad \text{(energy balance condition at medium A to medium B interface)} \tag{14}$$
[Figure 2 labels: H = solar flux; cloud transmission factor M = 0.15 for clouds, 1.0 for no clouds; fraction of cloud cover = W (visually estimated); q_radiation = εσA(T_1 + 460)^4; q_convection = hA(T_1 − T_air); q_solar = aMWH. Medium A (hot) lies over Medium B (cold base), with nodes T_1 at x = 0 through T_13; T_A is the initial uniform temperature of medium A and T_B the initial temperature of the base at time zero; in the lower part of the base the step size = bΔx = 9Δx. Temperatures T_i are specified at all node points at time zero.]
Figure 2. Hot slab on cold semi-infinite base with surface radiation, convection, and insolation
Initial Conditions

$$T(x,0) = T_0 \tag{15}$$

$$U(x,0) = U_0 \tag{16}$$
$$T(0, t \to 0) = \frac{T_1 + rU_1}{1 + r}$$

When the two media have the same thermophysical properties, r = 1 and

$$T(0,0) = \frac{T_1 + U_1}{1 + 1} = \tfrac{1}{2}(T_1 + U_1)$$

It is clear from the foregoing equation that the arithmetic average correctly represents the interface temperature only when the thermophysical properties of the hot and cold media are equal.
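The interface relation quoted above is easy to check numerically. The Python sketch below is illustrative only: the function name is hypothetical, and r is taken as given because its definition is not part of this excerpt.

```python
def interface_temperature(t_hot, u_cold, r):
    """Contact temperature (T1 + r*U1) / (1 + r) of the two media at time zero."""
    return (t_hot + r * u_cold) / (1.0 + r)
```

With r = 1 the function returns the arithmetic mean of the two temperatures; any other r weights the result toward one medium, which is the point made in the text.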
Surface Equations

The surface of a hot laid asphaltic concrete pavement layer is acted upon by several environmental influences. These influences include solar radiation (insolation) and cloud cover along with air velocity and turbulence. Solar radiation varies in intensity with time of day and the season of the year, while wind conditions are even more variable. Solar radiation can be measured with a pyrheliometer, but in practice, the equations and tables in the ASHRAE Handbook of Fundamentals (8) can be used to accurately determine the solar flux.

The effect of cloud cover is perhaps more difficult to evaluate since height, thickness, water droplet size and percentage of cloud cover all influence the transmission of solar energy. For the present purposes, the cloud cover is assumed either to exist or not to exist. If it exists, solar radiation is reduced by 85% in all computations.

The following diagram depicts the surface element and node construction appropriate to the problem under discussion:

[Diagram: node 1 at the air boundary, with one-half element assigned to T_1, and node 2 one spacing Δx below it.]

The energy balance at the surface element yields:

- radiation loss + solar gain + gain from T_2 - convective loss = energy gain/loss in T_1 element

$$-\varepsilon\sigma A\,(T_1 + 460)^4 + aMAH + \frac{kA(T_2 - T_1)}{\Delta x} - hA\,(T_1 - T_{air}) = \frac{\tfrac{1}{2}A\,\Delta x\,\rho C\,(T_1' - T_1)}{\Delta t} \tag{29}$$

Rearranging terms, with T_0 denoting the air temperature, gives:

$$\frac{2h\alpha N\,\Delta x}{k}(T_0 - T_1) + 2\alpha N\,(T_2 - T_1) + \frac{2a\alpha N M\,\Delta x\,H}{k} - \frac{2\varepsilon\sigma\alpha N\,\Delta x}{k}(T_1 + 460)^4 + T_1 = T_1' \tag{30}$$
Forward and backward interpretations of the above are:

Forward

$$\frac{2\alpha N\,\Delta x}{k}\left(hT_0 + aMH - \varepsilon\sigma\,(T_1 + 460)^4\right) + 2\alpha N\,(T_2 - T_1) + T_1 = \left(1 + \frac{2h\alpha N\,\Delta x}{k}\right)T_1' \tag{31}$$

Backward

$$\frac{2\alpha N\,\Delta x}{k}\left(hT_0 + aMH - \varepsilon\sigma\,(T_1 + 460)^4\right) + \left(1 - \frac{2h\alpha N\,\Delta x}{k}\right)T_1 + 2\alpha N\,T_2' = (1 + 2\alpha N)\,T_1' \tag{32}$$

In equations 31) and 32) the radiation term, $(T_1 + 460)^4$, has not been applied in a forward and backward sense, since the solution for $T_1'$ would otherwise be unduly complicated. Tests show this omission to yield negligible errors. The largest discrepancy occurs in $T_1$ only during the early minutes after time zero.
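Equations 31) and 32) can each be solved directly for the new surface temperature. The Python sketch below shows one way to do that; it is not the authors' HP 67/97 program, all function and parameter names are hypothetical, and the radiation term is evaluated at the old surface temperature as the text describes.

```python
def source_term(t1, t_air, h, absorptivity, m, solar_flux, eps_sigma):
    """Bracketed term h*T0 + a*M*H - eps*sigma*(T1 + 460)**4, radiation at old T1."""
    return h * t_air + absorptivity * m * solar_flux - eps_sigma * (t1 + 460.0) ** 4


def surface_forward(t1, t2, alpha_n, dx, k, **env):
    """Equation 31 solved for T1'; t2 is the old value at node 2."""
    c = 2.0 * alpha_n * dx / k
    rhs = c * source_term(t1, **env) + 2.0 * alpha_n * (t2 - t1) + t1
    return rhs / (1.0 + c * env["h"])


def surface_backward(t1, t2_new, alpha_n, dx, k, **env):
    """Equation 32 solved for T1'; t2_new is the already updated value at node 2."""
    c = 2.0 * alpha_n * dx / k
    rhs = (c * source_term(t1, **env)
           + (1.0 - c * env["h"]) * t1
           + 2.0 * alpha_n * t2_new)
    return rhs / (1.0 + 2.0 * alpha_n)
```

A call such as `surface_forward(300.0, 300.0, 0.4, 0.04, 1.5, t_air=70.0, h=4.10, absorptivity=0.85, m=0.15, solar_flux=200.0, eps_sigma=1.644e-9)` uses illustrative values; the keyword names must match the parameters of `source_term`.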
Solution of Problem in Figure 2

The above equations provide a means for solving the problem depicted in Figure 2. To further illustrate this problem, the following data are hypothesized:

Environmental conditions
  Solar radiation H = 200 BTU/hr (obtained from ASHRAE Handbook of Fundamentals tables (8))
  M = 1 or 0.15; assume cloud cover with M = 0.15
  Air velocity = 10 MPH
  Air temp. = 80°F
  h = convection coefficient = 0.65 v^0.8 = 0.65(10)^0.8 = 4.10
  Air temperature = 70°F
  Surface radiation = εσ(T+460)^4 = 0.95 × 1.731·10^-9 (T+460)^4 = 1.644·10^-9 (T+460)^4

Hot Solid
  Absorptivity a for solar flux = 0.85
  Emissivity for solar radiation = 0.95
  Initial temperature = 300°F
  k = 1.5, ρ = 150, C = 0.25
  α_1 = 0.04
  Δx = 0.5 in

Cold Solid
  Initial temperature = 70°F
  k = 3, C = 0.25, ρ = 150
  α_2 = 0.08
  Δx from node 5 to node 10 = 0.5 in
  Δx from node 10 to node 13 = 9 × 0.4 = 3.6 inches

Elapsed time = 15 minutes
Δt =
Total depth of base = 1.6 + 10.8 = 12.4 inches. Table II gives the results of computations using the HP 67/97 at one minute increments up to 15 minutes. These results have been compared with highly accurate results from an IBM 360 and they agree within 2°F at all points.
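The derived quantities in the problem data can be cross-checked with a few lines of Python. This is a sketch for verification only. The variable names are hypothetical, and the exponent of the radiation constant is read here as 10^-9, which is an assumption based on the magnitude of the Stefan-Boltzmann constant in BTU/hr ft² °R⁴.

```python
# Hot and cold solid properties as listed in the problem data.
k_hot, rho_hot, c_hot = 1.5, 150.0, 0.25
k_cold, rho_cold, c_cold = 3.0, 150.0, 0.25

alpha_hot = k_hot / (rho_hot * c_hot)      # thermal diffusivity of the hot layer
alpha_cold = k_cold / (rho_cold * c_cold)  # thermal diffusivity of the cold base

wind_mph = 10.0
h = 0.65 * wind_mph ** 0.8                 # convection coefficient, h = 0.65 v^0.8

eps_sigma = 0.95 * 1.731e-9                # emissivity times radiation constant

print(alpha_hot, alpha_cold, round(h, 2), eps_sigma)
```

The computed diffusivities reproduce the listed 0.04 and 0.08, and h rounds to the listed 4.10.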
Program

The program to obtain the results in Table II took 223 steps out of an available 224 steps. The authors have made several hundred computations using the HP67 or HP97. To these authors, the results are extremely satisfactory. The computations are convenient and the method is generally superior to other approaches. The programs are appended to this paper. Current purchase price of an HP67 computer is $400 and that of an HP97 is $750. No hardware other than one of the foregoing was needed to perform the complex heat transfer calculations which have been described. Monthly maintenance cost of these instruments can be considered negligible.

Conclusion

Examples have been presented which demonstrate the usefulness of a small, handheld computer for performing numerical solutions of simultaneous partial differential equations of the diffusion type. Some mathematical development, or extension, of a standard numerical solution method was required to adapt the method to a small computer. But the results obtained compare very closely to those yielded by an IBM 360 computer; and the use of a small computer makes possible rational decisions on the site in real time by a construction project engineer. It has not been possible, hitherto, to support "go - no go" decisions at a paving site with such detailed analysis of environmental data.

Acknowledgements

The authors wish to thank Mr. Leon Talbert and Mr. Willis Gibboney of the Department of Transportation, State of Ohio, for their assistance and encouragement. The research reported in this paper was supported in part by the Department of Transportation, State of Ohio, and the U.S. Department of Transportation, Federal Highway Administration.
Table II. Temperature Profile of Problem in Figure 2 as a Function of Time
[Rows: time in minutes, 0 through 15. Columns: inches from surface: 0.0", 0.4", 0.8", 1.2", 1.6" (interface), 2.0", 2.4", 2.8", 3.2", 3.6", 7.2", 10.8", 14.4". Entries: temperatures in °F.]
Program Listing
[HP 67/97 keystroke listing with columns STEP, KEY ENTRY, KEY CODE, COMMENTS, followed by the register, label, flag, and status tables. Legible comments: temperatures distributed from R10 to R23; h = 0.65 v^0.8; display node number; compute average temperature of layer 1 using trapezoidal rule; print temperatures from R10 to R23. Legible register notes: αN; MH; T_air; (node # + 10) of solid-solid interface; P = # of passes.]
APPENDIX B

User Instructions: SAUL'YEV PROGRAM

STEP  INSTRUCTIONS
1     Put in data with Saul'yev input/output program
2     Read Saul'yev program
3     Start
      Output from Saul'yev input/output program
Program Listing (steps 001 to 111)
[HP 67/97 keystroke listing of the main Saul'yev program with columns STEP, KEY ENTRY, KEY CODE, COMMENTS. Legible comments, in order: START; calculate T_1; MAIN PROGRAM; next T; interface?; step size change?; surface T?; prepare for backward pass, set flag 0; set RI to negative values; END OF BACKWARD PASS; reduce # of passes left by one.]
7.44406E-2 7.44407E-2
1 1 43 (6)
7.44407E-2 failed 7.48813E-2
57 >1000
3.48733E-3 #.4873E-3
26
3.48748E-3
123
8.89216E-4
51 242 (4)
8.89202E-4
147
(5)
The figure in parentheses after the number of matrix products is the number of inverse iterations. In all cases the shift k=0.
x^ , the cg linear equations solution is usually accomplished in many fewer than n matrix-vector products. This has also been observed by Ruhe and Wiberg [4]. The number of inverse iterations, starting from a shift k=0, has been included in parentheses behind the number of matrix-vector products in Table III.

Acknowledgements

While this study used very minimal computing resources, we wish to acknowledge time provided on the Data General NOVA and IBM 370 (Datacrown Ltd.) at Agriculture Canada and the Amdahl V6 at the University of Alberta.
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch002
Literature Cited

[1] Hestenes M.R. and Stiefel E., J. Res. Nat. Bur. Standards Section B (1952) 49, 409-436.
[2] Fletcher R. and Reeves C.M., Computer Journal (1964) 7, 149-154.
[3] Wilkinson J.H., "The algebraic eigenvalue problem", Clarendon Press, Oxford, 1965.
[4] Ruhe A. and Wiberg T., BIT (1972) 12, 543-554.
[5] Bradbury W.W. and Fletcher R., Numer. Math. (1966) 9, 259-267.
[6] Shavitt I., Bender C.F., Pipano A., and Hosteny R.P., J. Computational Physics (1973) 11, 90-108.
[7] Nesbet R.K., J. Chem. Phys. (1965) 43, 311-312.
[8] Davidson E.R., J. Computational Physics (1975) 17, 87-94.
[9] Beale E.M.L., in Lootsma F.A., "Numerical methods for nonlinear optimization", 39-44, Academic Press, London, 1972.
[10] Nash J.C., "Compact numerical methods: linear algebra and function minimization", To be published, probably in 1978.
[11] Nash J.C., "Function minimization with small computers", submitted to ACM Trans. Math. Software, 1976.
[12] Ruhe A., in Collatz L., "Eigenwerte Probleme", 97-115, Birkhäuser Verlag, Basel, 1974.
[13] Acton F.S., "Numerical methods that work", 58-59, Harper & Row, New York, 1970.
[14] Geradin M., J. Sound. Vib. (1971) 19, 319-331.
[15] Bowdler H., Martin R.S., Reinsch C., and Wilkinson J.H., Numer. Math. (1968) 11, 293-306.
[16] Stewart G.W., in Bunch J.R. and Rose D.J., "Sparse matrix computations", 113-130, Academic Press, New York, 1976.
[17] Fried I., J. Sound. Vib. (1972) 20, 333-342.
3 Large Scale Simulation with a Minicomputer

B. E. ROSS, PAULA JERKINS, and JAMES KENDALL
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch003
College of Engineering, University of South Florida, Tampa, FL 33620
This paper describes the alteration and development of an Interdata 7/16 minicomputer to perform large-scale computations. Substantial savings in cost and overall turnaround time resulted from performing the calculations with the minicomputer instead of the previously used IBM 360.

Introduction

Since 1969, students and faculty of the College of Engineering at the University of South Florida have been developing a co-ordinated set of digital computer models for environmental simulation. Calculations of the hydrodynamical, chemical and biological aspects of estuarine areas are included. Large-scale physical areas are simulated over long real time periods, and the complexity of the interactions of the models results in large-scale computations. A natural consequence of performing the simulations with a general purpose IBM 360 is rapid execution but very delayed turnaround, due to system priorities and other user demands.

The alternative of adapting and upgrading an existing Interdata computer was studied in detail. The developments in minicomputer technology in recent years have increased the obtainable operating speed of the central processing units. Large-scale main memories and large-scale auxiliary memories have become available for economical minicomputer development.

The computer programs which are implemented with data and become the simulation models are carefully co-ordinated into subroutines which lend themselves to overlay techniques. The principal numerical schemes involved are explicit so that core requirements and run time are flexible and interchangeable. The combination of subroutines and explicit solution greatly simplified the transition of the programs from IBM to Interdata.
The Simulation Problem
The tasks to be performed by the computer involve the numerical solution of the vertically integrated equations of motion, continuity, and mass transport with chemical and biological interactions in two dimensions. The basic equations are as follows:
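$$\frac{\partial U}{\partial t} + \frac{U}{D}\frac{\partial U}{\partial x} + \frac{V}{D}\frac{\partial U}{\partial y} - \Omega V = -gD\frac{\partial H}{\partial x} - fQUD^{-2} - \frac{D}{\rho}\frac{\partial P}{\partial x} \tag{1}$$

$$\frac{\partial V}{\partial t} + \frac{U}{D}\frac{\partial V}{\partial x} + \frac{V}{D}\frac{\partial V}{\partial y} + \Omega U = -gD\frac{\partial H}{\partial y} - fQVD^{-2} - \frac{D}{\rho}\frac{\partial P}{\partial y} \tag{2}$$

$$\frac{\partial H}{\partial t} + \frac{\partial U}{\partial x} + \frac{\partial V}{\partial y} = 0 \tag{3}$$

$$\frac{\partial C_i}{\partial t} + \frac{U}{D}\frac{\partial C_i}{\partial x} + \frac{V}{D}\frac{\partial C_i}{\partial y} = \frac{1}{D}\frac{\partial}{\partial x}\left(D E_x \frac{\partial C_i}{\partial x}\right) + \frac{1}{D}\frac{\partial}{\partial y}\left(D E_y \frac{\partial C_i}{\partial y}\right) + N_i \tag{4}$$

U = ud = Transport in the x direction
V = vd = Transport in the y direction
D = Local water depth
Ω = Coriolis parameter
H = Local water surface elevation
f = Local friction factor
Q = [U² + V²]^½
P = Atmospheric pressure
C_i = Concentrations of water quality parameters or biota
N_i = Interaction process and sources or sinks
ρ = Mass density
E_x, E_y = Dispersion coefficients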
For the solution scheme the equations are reduced to finite difference form and solved on a square grid matrix. An example of the application of the model to Hillsborough Bay, Florida, is shown in Figure One. Figure One shows the distribution of dissolved oxygen in the Bay resulting from the discharge of pollutants from industrial, municipal and natural sources, and the interaction of biota.
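As a rough illustration of what reducing the equations to finite difference form on a square grid can look like, the sketch below advances the continuity equation (3) by one explicit time step. The paper does not give its differencing scheme, so the central differences, the function name `continuity_step`, the list-of-lists grid layout, and the choice to leave boundary cells unchanged are all assumptions made for this example.

```python
def continuity_step(H, U, V, dx, dt):
    """Advance H by one explicit step of dH/dt + dU/dx + dV/dy = 0.

    H, U, V are lists of rows indexed [i][j] on a square grid of spacing dx
    (i along x, j along y). Central differences are an assumption here;
    boundary cells are simply copied, which a real model would replace
    with proper boundary conditions.
    """
    nx, ny = len(H), len(H[0])
    H_new = [row[:] for row in H]
    for i in range(1, nx - 1):
        for j in range(1, ny - 1):
            dU_dx = (U[i + 1][j] - U[i - 1][j]) / (2.0 * dx)
            dV_dy = (V[i][j + 1] - V[i][j - 1]) / (2.0 * dx)
            H_new[i][j] = H[i][j] - dt * (dU_dx + dV_dy)
    return H_new
```

Because the update uses only values from the previous time level, each cell can be computed independently, which is one reason an explicit scheme fits comfortably in a small memory with overlays.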
Figure 1. Hillsborough Bay, Florida
The number of grid elements involved in the simulation is 90 × 36, or 3240 elements. There are 10 variables involved with each element. Calculations are performed for the hydraulic program in intervals of 90 seconds of real time for 24 hours. Numerous auxiliary calculations updating coefficients must be performed. The hydraulic portion (the solution of equations 1, 2, and 3) of this simulation is the set of calculations that was chosen for the computer comparison purposes.
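The size of the problem follows directly from these figures. The short sketch below works out the element count, the number of stored values, and the number of time steps in a 24 hour run. The storage estimate assumes one 32-bit (4 byte) word per variable, which is an assumption based on the 32 bit word length quoted for the machine, and the names are illustrative only.

```python
GRID_ROWS = 90
GRID_COLS = 36
VARIABLES_PER_ELEMENT = 10
TIME_STEP_SECONDS = 90
SIMULATED_HOURS = 24
BYTES_PER_VALUE = 4  # assumption: one 32-bit word per stored value


def problem_size():
    """Return the basic size figures of the hydraulic simulation."""
    elements = GRID_ROWS * GRID_COLS
    values = elements * VARIABLES_PER_ELEMENT
    time_steps = SIMULATED_HOURS * 3600 // TIME_STEP_SECONDS
    return {
        "elements": elements,                       # 3240
        "values": values,                           # 32400
        "bytes": values * BYTES_PER_VALUE,          # 129600
        "time_steps": time_steps,                   # 960
        "element_updates": elements * time_steps,   # 3110400
    }


if __name__ == "__main__":
    for name, value in problem_size().items():
        print(name, value)
```

Under the 4 byte assumption the variables alone occupy 129,600 bytes, roughly twice the 65K bytes of core in the final configuration, which is consistent with the reliance on overlays and disc storage.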
The Interdata Computer

Basically the Interdata machine was a part of an experimental data reduction system. The cpu handles numbers at 16 bits, two bytes at a time internally. The word length is 32 bits so the roundoff error is the same as that for the IBM. The original machine had 8K bytes of memory and a magnetic tape unit. After an examination of the problem involved with overlay techniques, it was found that adequate storage could be obtained by using 65K bytes of core and a 50 M byte disc. Both of these units were available for the Interdata. The configuration of Interdata computer finally implemented for the simulation comparisons is described as follows:

Supplier        Unit                                               Price
INTERDATA       M71-012  7/16 CPU                                  3,700
                M71-101  Binary Display Panel                        350
                M71-103  Automatic Loader                            400
                M71-104  Power Fail/Auto Restart                     400
                M71-105  Signed Multiply/Divide                      950
                M71-106  High Speed ALU                            5,000
                M46-004  ASR-33 Teletype                           1,950
                M48-024  Current Loop Interface                      400
                M46-500  9 Track 800 BPI Magtape Interface         2,950
                M46-501  9 Track 800 BPI Magtape Transport         6,000
                M47-102  RS-232 Interface                            500
                                                                 $22,600
BALL COMPUTER   BD-50    50M Byte 3330 Type Disc Drive             7,000
MINI-COMPUTER   TDC-803  3330 Disc Interface                       1,900
PUSHPA          PM9800   65K Byte Memory                           4,000
                                                                 $35,500
HAZELTINE       2000     Video Terminal & Printer                  4,000
                                                                 $39,500

The hardware selected is supported by a Disc Operating System (DOS) supplied by Interdata. This is not the most sophisticated operating system available but is sufficient to meet the immediate needs. The installed version has capabilities as outlined below:

D.O.S. (Disc Operating System)
I. System Utility:
   A Copy Files
   B Compress/Decompress
   C Disc Backup
II. Disc File Management:
   A Allocate Files
   B Delete Files
   C Protect Files
   D List Files
   E Position to a Subfile
III. Software:
   A FORTRAN V Level 1
   B Extended Basic
   C OS Editor
   D OS Library Loader
   E OS Assembler
   F OS Aids, Debugger

The language used in the simulation programs is FORTRAN V, level 1, which is a high level compiler supporting the requirements of ANSI standard FORTRAN and includes significant extensions in both language and subroutine library to support process control applications and multi-tasking programs.

Features of FORTRAN V which may be of major importance to the FORTRAN programmer are:
(a) Mixed mode arithmetic is allowed
(b) Array initialization and implied DOs in DATA statements are provided for
(c) Multiple entries into FORTRAN subroutines are provided
(d) Hollerith constants may be declared using the apostrophe as a delimiter

Features which may be of interest to a FORTRAN programmer using FORTRAN V as a process control language are:
(a) The use of in-line assembly language is allowed
(b) Encode/decode statements are allowed
(c) Hexadecimal and character constants may appear as arguments in expressions as well as in DATA statements and call parameter lists
(d) Analog input in a sequential order is allowed
(e) Analog input in any sequence is allowed
(f) Analog output in any sequence is allowed
(g) Logical functions intended to support the Instrument Society of America/Purdue Standards are available

FORTRAN V contains several features to simplify and expedite program debugging, such as:
(a) Over 60 compile-time diagnostics are provided
(b) 35 run-time error messages are provided
(c) Run-time trace capability is provided
(d) Optional compilation is provided, which facilitates insertion of the programmer's diagnostics and allows these to be easily deleted from a program without physical removal
Standard Operating Procedures with the Interdata

The main programs exist in source form on the 50 M Byte disc. These can be called by an operator. Basic input data are entered by the operator in an interactive mode. The program goes immediately to the compile and run modes. Intermediate calculated data such as hydraulic velocities and water depths are stored on the disc temporarily and written to magnetic tape at selected real time intervals. The total quantity of calculated data in this step is too great to be stored intact upon the 50 M Byte disc.

Upon completion of the calculation of the hydraulics of a bay, the water quality program is called. The water quality program uses the hydraulic data from the magnetic tape to calculate chemical and biological results. Longer real time intervals are used in the water quality calculations, thus less data are calculated. The calculated results are now stored upon disc for printout, or reading onto another magnetic tape.
Comparisons of Costs and Time

The simulation calculations were alternately made by use of an IBM 360, with 3 Megabyte active core and almost unlimited disc space. However, this system is in a University and hosts many languages and supports diverse users' needs, so that much of the capability is not available to a single user. The IBM system is supported by the usual IBM operating system. The usual debugging programs and some professional assistance are available by appointment.

Comparisons are made for comparable computer environmental simulations. Two times are important. These are cpu time and turnaround time. Another parameter of interest is cost.

The results indicate that the test program utilized 4612.65 cpu seconds in the IBM machine. The machine elapsed time was 3 hours. The best turnaround for the simulation was 24 hours. Usual turnaround times are on the order of 72 hours. The cost of the test calculation on the IBM system was $303.43.

A comparable run was performed on the Interdata 7/16 before the hardware high speed floating point arithmetic unit was installed. The cpu time was 74 hours, which was also the turnaround time. Costs of computation were based on the following factors:

$34,300 amortized 4 years                        $8,575
Interest 1st year                                 3,087
1/3 Technician time for general maintenance       4,000
Field repairs by Interdata per year                 500
                                                $16,162

If computer used 50% time: $88.56/day or $3.69/hr.

Thus, based on 74 hours, the cost of this long run was $273.06.

Analysis of the cpu usage during the long Interdata run indicated that 85% of the time was spent in software operations involving floating point arithmetic. A high speed floating point arithmetic unit (HSALU) was investigated. Analysis showed that if the same simulation were performed with the HSALU, the cpu time would be 5.69 hours, which is also the turnaround time. The costs are adjusted to reflect an additional investment of $5,200 and the results indicate a cost of $4.09/hour. The cost of this hypothetical run is $23.28. Installation of the HSALU and subsequent simulation confirmed the expected results. Thus, relative to the IBM 360 run at its best turnaround, the new run at high speed saved $280.15 and 18 hours of turnaround time. The results are summarized in Figure Two where time and dollars have been rounded to the nearest integer.

Figure 2. Comparisons for runs yielding identical results (turnaround time and CPU time in hours, and cost, for the IBM 360, the Interdata with HSALU, and the Interdata with no HSALU; HSALU = High Speed Arithmetic Unit)
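The hourly rates quoted above can be reproduced from the listed cost factors. The sketch below is one way to do the arithmetic. The 9% interest rate is not stated in the text and is inferred from $3,087 on $34,300; the 365 day year, the function names, and the step of rounding the hourly rate to cents before multiplying are likewise assumptions.

```python
def hourly_cost(capital, interest_rate=0.09, years=4,
                technician=4000.0, field_repairs=500.0, utilization=0.5):
    """Cost per hour of use, from annual amortization, interest and upkeep.

    interest_rate=0.09 is inferred from the quoted first-year interest;
    a 365 day year at the given utilization is assumed.
    """
    annual = capital / years + capital * interest_rate + technician + field_repairs
    days_used = 365 * utilization
    return annual / days_used / 24


def run_cost(capital, hours):
    """Cost of one run, using the hourly rate rounded to cents."""
    rate = round(hourly_cost(capital), 2)
    return round(rate * hours, 2)


if __name__ == "__main__":
    print(round(hourly_cost(34300), 2), run_cost(34300, 74))            # without HSALU
    print(round(hourly_cost(34300 + 5200), 2), run_cost(39500, 5.69))   # with HSALU
```

With these assumptions the rates come out at $3.69 and $4.09 per hour, and the 74 hour run at $273.06, matching the text. The 5.69 hour run comes out within a cent or two of the quoted $23.28, depending on where the rounding is applied.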
Conclusions

The conclusions are that large-scale computations can be accomplished on modern minicomputers with savings in time and money. Accuracy is not sacrificed in 32 bit word machines and maintenance and reliability appear to be realistic in cost. Programs and numerical techniques must be compatible with the size of the machine chosen.
4 APL Level Languages in Analysis
A Host-Microcomputer-Instrument Hierarchy in Light Scattering Spectroscopy
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch004
J. ADIN MANN, ROBERT V. EDWARDS, THOMAS GALL, H. M. CHEUNG, F. COFFIELD, C. HAVENS, and P. WAGNER
Department of Chemical Engineering, Case Western Reserve University, Cleveland, OH 44106

So far as we know this is the first report in the open literature of a laboratory experiment instrumented totally within the context of APL as the computer language driving each level of a hierarchy. The significance of our report is the demonstration that such a high level language reduces by orders of magnitude the effort required to implement complicated experiments involving elements of control, data acquisition, data processing, and modelling of complex phenomena. It is especially easy to retain the degree of human interaction required by the experiment. All of these desirable results can be accomplished by persons with no formal training in computer science.

Certainly other languages and combinations of languages have been used to write interactive systems for data collection and analysis. We have had considerable experience with FORTRAN and BASIC as well as assembly languages. Our experience with operating systems for experimental work has been limited to the DEC PDP 11/40 DOS and the DEC PDP 11/45 RSX11D operating systems. Unequivocally, APL and our hierarchy have proved to be an order of magnitude more effective in moving from first concepts to results with experimental equipment. The APL language and the concepts of the APLSV or VSAPL implementations have provided an integrity of design and ease of coding that from our experience is far ahead of FORTRAN and BASIC oriented systems. Certainly the human engineering that has gone into APL implementations, e.g. the IBM 5100, is a factor in such a significant improvement, but the major element is that the structure of the language itself and the notation are much closer to the mathematics involved in experimental work than any computer language known to us.
The penalty one pays in using a high level language is that of execution times for certain operations and, perhaps, cost of the hardware. These disadvantages were more than balanced by reduction in the time necessary to integrate the hierarchy into the experiment. Should the measurements become routine, it may be useful to have professional hardware and software technicians implement the APL algorithms in the appropriate assembly language. In that case, the APL functions serve as an unequivocal description of the "operating system" needed for the experiment.

We have chosen to describe our methods in the context of automating a specific experiment involving the determination of the diffusion coefficient of particles in a fluid by the analysis of fluctuations in scattered light. A description of the physics of the experiment is required in order to put the instrumental analysis in context. The details of the interconnection of the hierarchy will be described and excerpts of code will be given to illustrate methods. Finally, performance will be outlined.

Light Scattering Spectroscopy: Particle Diffusion
Only a brief introduction will be given here to a class of experiments that involve the time series analysis of scattered light. Consult the books by Chu [1] and by Berne and Pecora [2] for details.

The object of the experiment is the study of the response of a system to small fluctuations over a large frequency range. When the frequency region is below perhaps 1 MHz, the "response function" will yield measures of macroscopic constitutive coefficients such as diffusion coefficients, viscosity coefficients, elastic coefficients, rate constants and others. For frequencies larger than about 10 GHz we observe relaxation effects associated with molecular distortions. We will specialize to the low frequency region below 1 MHz and further outline only the problem of determining the diffusion coefficient of macroscopic particles in solution.

The diffusion coefficients of macroscopic particles (10 nm ≤ d ≤ 2000 nm, where d is the diameter of a particle) are of interest for a number of practical and theoretical reasons. Dispersions of particles are used by the medical profession, the paint industry, and the printing industry, as well as analyzed in the context of environmental safety and in human medicine. The theory of colloid stability can be studied directly, as can the implications of the theory of fluids [3].

Consider a suspension of small particles, d ~ 200 nm. Each particle will execute Brownian random motion that results from the very frequent collisions of the Brownian particle with the small molecules of the surrounding solvent. When the number density of the Brownian particles is small, the particles can be treated as individuals so that collisions between these large particles can be ignored. The desired information about diffusion can be calculated from the scattering function
$$F_s(q,\tau) = \int_0^\infty p(D)\, e^{-q^2 D \tau}\, dD$$

where p(D) is the distribution of diffusion coefficients, D, and q is the scattering vector. A complication due to the variation of scattering cross-section with size is included as part of the data analysis.

The intermediate scattering function is relatable to the results of a light scattering measurement in the following way. When the incident beam has an electric field E_i, the scattered beam, E_s, will be modulated by the particle motion so that at the detector the intensity will be i = β|E_s|² and the current autocorrelation function produced by the detector will be

$$R_i(\tau) = A\,\delta(\tau) + B\,F_s^2(q,\tau) + C \tag{1A}$$

where A, B and C are constants for a given experiment. Experimentally, R_i(τ) is computed from a time series produced by the detector that is essentially the photocurrent as a function of time. When the incident flux of the scattered light is sufficiently high, the photocurrent is put through an analog to digital conversion before the computation of the correlation function by the rule that
$$R_i(k\,\Delta\tau) = \frac{1}{N}\sum_{j=1}^{N} i_j\, i_{j+k} \tag{1B}$$
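The correlation rule can be sketched in a few lines. In the example below the samples are the digitized photocurrent (or photon counts per sample period) and the result holds one value per lag, in the manner of the correlator channels. The function name, the argument names, and the choice to average the same number of products for every lag are assumptions of this sketch, not details taken from the SAICOR hardware.

```python
def autocorrelation(series, max_lag):
    """Estimate R(k) = (1/N) * sum_j series[j] * series[j + k] for k = 0..max_lag.

    N is taken as len(series) - max_lag so that every lag averages the same
    number of products (an assumption made for this sketch).
    """
    n = len(series) - max_lag
    if n <= 0:
        raise ValueError("series must be longer than max_lag")
    return [
        sum(series[j] * series[j + k] for j in range(n)) / n
        for k in range(max_lag + 1)
    ]


if __name__ == "__main__":
    counts = [3, 5, 4, 6, 2, 5, 4, 3, 6, 5]
    print(autocorrelation(counts, 3))
```

A general purpose computer evaluating this sum needs the whole series in memory and one multiply per sample per lag, which is the load a dedicated correlator takes over.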
When the incident flux is small, photon counting is done directly but a similar formula holds with the definition that i_j is the number of photoelectrons detected during a period Δτ around j×Δτ. A computer is used for these calculations.

In practice, the relaxation times range between a few tenths of microseconds and a few tens of milliseconds. Relaxation times shorter than 1 msec require measurements of the time series in the 10 to 100 μsec range. The accuracy requirement for i_j is modest, eight bits is often sufficient, but averaging must be done with respect to a large number of time series. Even when direct memory access is fast enough, the memory of a conventional computer will be filled before a large enough time series has been collected. We have been using the SAICOR Mod 42 and 43 machines for preprocessing the time series data. While they are not programmable, control of function can be done by a host computer. The result of a determination of R_i is a set of 16 bit numbers, one for each correlator channel. As the device's memory is read, these bits are available in two's complement code on 16 pins mounted on the back panel of the correlator.

The analysis can take a number of forms and a convenient one involves the computation of cumulants. Essentially,
$$\log F_s(q,\tau) = \sum_{m=1}^{M} K_m \frac{(-\tau)^m}{m!} \tag{2A}$$

where the power series is truncated after the Mth term. Then
$$K_1 = q^2 \int_0^\infty D\,p(D)\,dD = q^2\langle D\rangle, \qquad K_2 = q^4\left\langle (D - \langle D\rangle)^2\right\rangle \tag{2B}$$
Since for τ > 0

$$\log(R_i - C) = 2\log F_s(q,\tau) + \text{constant} \tag{3}$$
it is obvious that a polynomial must be fit to what amounts to the log of the correlation function produced by the SAICOR hardware. The coefficients of the polynomial can be interpreted physically as the cumulants of the diffusion coefficient of polydispersed particles. The determination must be repeated often in the course of an experiment. The sequence of events must be:

1. Computation of R_i goes on in real time for a selected period of time and may involve 10M bytes of information on the photocurrent.
2. The 100 to 400, 16 bit R_i vector must be sent to a computer and transformed to numbers from a two's complement code.
3. The R_i vector must be subjected to a least squares analysis and the cumulants calculated.
4. The R_i vector must be archived with ID data and the cumulants made available for analysis.
5. Repeat this sequence many times for each experiment.
6. The data files for the experiment must be catalogued.
7. Repeat this entire sequence for many experiments by different users.
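Steps 2 and 3 of this sequence can be sketched as follows. The example decodes 16 bit two's complement words and then fits a straight line to log(R_i − C) against delay time, skipping channel 0 because relation (3) holds only for τ > 0. It assumes the cumulant series is truncated after its first term, so that log(R_i − C) ≈ constant − 2·K₁·τ. The function names, the known baseline C, and the first-order truncation are all assumptions of this sketch; the original work was done in APL and may have fitted higher order terms.

```python
import math


def from_twos_complement(word, bits=16):
    """Interpret an unsigned `bits`-wide word as a signed two's complement integer."""
    if word & (1 << (bits - 1)):
        return word - (1 << bits)
    return word


def fit_line(x, y):
    """Ordinary least-squares straight line y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def first_cumulant(words, baseline, delta_tau):
    """Estimate K1 from raw correlator words, one word per channel.

    Channel k is taken to correspond to delay k * delta_tau; channel 0 is
    skipped. The slope of log(R - baseline) versus tau is taken as -2 * K1
    (first-order truncation, an assumption of this sketch).
    """
    R = [from_twos_complement(w) for w in words]
    taus = [k * delta_tau for k in range(1, len(R))]
    logs = [math.log(R[k] - baseline) for k in range(1, len(R))]
    _, slope = fit_line(taus, logs)
    return -slope / 2.0
```

Given K₁, a mean diffusion coefficient would follow from K₁ = q²⟨D⟩ once the scattering vector q is known.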
We have found the VSAPL or APLSV level of APL implementations to be highly facile for handling these tasks effectively. A hierarchy with an APL host was devised for performing the data acquisition, computations and control required. Before the details are described, a description of APL is necessary.

APL (A Programming Language)

APL is an array processing language for manipulating sets of numbers or sets of characters of quite general shapes. The formal syntax of APL is based on the mathematical concepts of function and functions of functions, or operators. Much of the power of the language derives from the extensive set of primitive functions and operators as well as the notation that represents their behavior. Defined functions can be constructed simply by writing sequences of primitive functions that lead to the desired result. The defined function has the same syntax as the primitive functions.

Several examples will be sufficient to illustrate the use of the language. Suppose that the following double summation must be evaluated.

$$\sum_i \sum_j y_i\, B_{ij}\, J_j$$

In APL:

y+.×B+.×J
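For readers more familiar with conventional languages, the following sketch spells out what the APL inner product expression computes. APL evaluates from right to left, so B+.×J (a matrix-vector product) is formed first and y+.× then takes its dot product with y. The Python function names are illustrative only.

```python
def matvec(B, J):
    """B +.x J : inner product of each row of B with the vector J."""
    return [sum(b * j for b, j in zip(row, J)) for row in B]


def dot(y, v):
    """y +.x v : inner product of two vectors."""
    return sum(a * b for a, b in zip(y, v))


def double_sum(y, B, J):
    """The same quantity written as an explicit double summation."""
    return sum(y[i] * B[i][j] * J[j]
               for i in range(len(y))
               for j in range(len(J)))


if __name__ == "__main__":
    y = [1, 2]
    B = [[1, 2], [3, 4]]
    J = [5, 6]
    print(dot(y, matvec(B, J)), double_sum(y, B, J))  # both print 95
```

The explicit loops that a conventional language needs are folded into the single `+.×` operator in APL, which is much of the compactness the text describes.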
The l i n e a r l e a s t - s q u a r e s algorithm or one step i n an i t e r a t i v e nonlinear l e a s t - s q u a r e s algorithm would require e v a l u a t i o n of the f o l l o w i n g matrix problem:
the
While FORTRAN requires DO looping and a c a l l to a subroutine for computing the inverse m a t r i x , APL does the e n t i r e operation as follows: K+cm It has been our experience that i n general APL code i s more compact by a f a c t o r of ten to 100 than the equivalent FORTRAN code and takes roughly a tenth of the time to produce and debug on a computer. The large set of p r i m i t i v e s as w e l l as the syntax of the language allows for a s u r p r i s i n g l y large redundancy i n the ways one may code a p a r t i c u l a r c a l c u l a t i o n . This i s an advantage for a number of reasons, not the l e a s t of which i s that the language i s very f o r g i v i n g for the inexperienced programmer. A simple subset of the p r i m i t i v e s i s s u f f i c i e n t for handling most computations that an inexperienced programmer may want to do. As he gains experience, he w i l l n a t u r a l l y take to e x p l o r i n g some of the s o p h i s t i c a t e d p r i m i t i v e s allowed i n the language. In our experience, APL has been far e a s i e r to teach to inexperienced programmers than any of the other languages commonly i n use. The APL language i t s e l f i s i n d i f f e r e n t to i t s implementation. The language has most often been implemented for t i m e - s h a r i n g , but there i s no reason to exclude a r e a l - t i m e implementation. Since about 1972, new APL systems based on the shared v a r i a b l e concept, have been w r i t t e n for the IBM 370 s e r i e s computers. This approach allowed the APL processor to communicate to the e x t e r n a l world e a s i l y . Since shared v a r i a b l e s have e x a c t l y the same s t r u c t u r e as any other v a r i a b l e i n APL, defined functions could be w r i t t e n that use
In Minicomputers and Large Scale Computations; Lykos, P.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch004
46
MINICOMPUTERS A N D LARGE
SCALE
COMPUTATIONS
data passed back and forth between the APL processor and external processors easily. It is possible, therefore, to consider an experimental apparatus as an external processor that communicates with the APL processor through shared variables.

The difference between implementations centers on whether or not shared variables are used and whether or not certain systems functions and systems variables are defined. In practice, there is a degree of portability in user functions that is beyond most other languages. The Appendix includes a table of a number of APL systems that are supported on larger machines. The list is probably not complete.

We are of the opinion that a small addition to the set of systems functions and systems variables would provide all of the resources needed for doing real-time operation entirely within the context of an APL machine. The implementation of such a proposal is beyond the expertise that we have within the department. However, we have found an attractive alternative to a full APL real-time machine. A block diagram of the interfacing schemes that have been used successfully in our laboratory for the last year is shown and described in a later section as Figures 2 and 3.

The Hosts

The APL hierarchy is structured so that any machine running with the equivalent of APLSV can be attached as an efficient host. See table (1) in the Appendix. In particular, the IBM 5100 and the Xerox Sigma 7 machines have been used as hosts extensively. The IBM 5100 with APLSV features was considerably easier to use than the Xerox APL. However, the Xerox APL is sufficient for the purpose even though awkward by modern standards.
The various IBM 370 machines running either VSAPL or APLSV are entirely able to handle the host responsibilities. The most effective interaction of the host with the hierarchy does require communication rates above 300 baud. Since the IBM 5100 can transmit and then receive at rates programmable up to 9600 baud, it was a superior host. Although our experience is limited, our trials show that the Hewlett Packard 3000 Series II machines are also suitable as hosts. In fact, the terminal ports for the HP 3000 II can work to 2400 baud and the I/O bus to ca. 300 Kbytes/sec or faster. Use of that I/O bus for the hierarchy requires both hardware and software that do not exist.

The IBM 5100 architecture was described by Roberson [4] and will not be repeated here. This APL system is small and portable and includes a CRT display as well as a tape cartridge drive. The cartridges have a capacity of about 220,000 bytes and the system performs tape write and checking at about 900 bytes/sec and tape read at about 2500 bytes/sec. Our work required a printer as well as the auxiliary tape drive for efficiency. It
was necessary to use the maximum memory storage of 64K bytes, which gives an actual workspace size of about 57K bytes. The serial I/O adapter was required in order to attach the IBM 5100 to the hierarchy. The cost of this system is about $23,000.

The IBM 5100 was physically attached to the hierarchy through its Serial I/O port. The APL processor allows the definition of shared variables that "appear" in both the APL workspace and the I/O interface. This is done by invoking an APL systems function for generating a shared variable offer, ⎕SVO, as in the expression

    1 ⎕SVO 'MICRO'

where the variable MICRO is now shared with the serial I/O processor so that assigning MICRO to another APL variable will cause information to be transferred from the I/O processor into the APL processor. Assigning to MICRO will cause information to be transferred from the APL processor to the I/O processor. One or more variables can be declared as "shared" by ⎕SVO.

The I/O processor must have some information about the data to be transferred and that is given by assigning literal strings of control information to the shared variable. Three classes of strings must be assigned to the shared variable before I/O communications occur. Firstly, when the literal string 'OUT 31001 TYPE=I' is assigned to the shared variable, the serial I/O processor is put into command mode as designated by the 'device number' 31001. If this is in fact done, the vector 0 0 is assigned to the shared variable by the Serial I/O processor. A non-zero value implies an error and that condition can be checked by simple APL code.
Similarly, 'IN 33001' when assigned to the shared variable informs the I/O processor to prepare for input from device address 33 (input), file 001. Lastly, the assignment of 'OUT 32001 TYPE=I' states that an output operation will occur for device 32, file 001 and the data type is specified.

After the command device is opened by assigning 'OUT 31001 TYPE=I' to the shared variable, the next assignment to the shared variable is the specification of the device characteristics in the form of a character string. The input and output buffer sizes may be specified along with the data rate (0.5 baud steps). Such aspects as the prompting character, new-line character, end-of-buffer character, parity, number of stop bits and changes in the I/O translation tables can be specified at any time, including during the execution of defined functions. The device characteristics that can be included are sufficient to handle any handshake protocol of the machines we have used. In fact, one may use 5, 6, 7 or 8 bit I/O code so that, for example, the IBM 5100 can be interfaced to EBCDIC or ASCII devices easily.

It is convenient to define a small set of functions that handle the opening and closing of the Serial I/O "devices" automatically. The monadic function ΔCOMMAND requires as a right argument the literal string of device specifications, ΔOUT outputs a literal string right argument, while ΔIN does not require an argument but can be used to assign whatever is in the input buffer to a
variable. Each function checks the return code and reports error conditions in detail. The ease with which the details of the communication protocol could be built into defined functions was an important factor in producing code quickly.
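The control-string sequence described above can be sketched as a small Python model. This is only an illustration of the order of assignments and the return-code check, not the 5100 implementation: the class `SerialIOProcessor`, the helper names `command`, `out` and `inp` (loosely mirroring ΔCOMMAND, ΔOUT and ΔIN), the device-specification string `RATE=1200` and the non-zero error code are all assumptions made for the example. Only the three control strings and the `0 0` success vector come from the text.

```python
class SerialIOProcessor:
    """Toy model of the serial I/O processor behind the shared variable."""

    # Control strings quoted in the text, mapped to the mode they select.
    CONTROL = {
        "OUT 31001 TYPE=I": "command",
        "IN 33001": "input",
        "OUT 32001 TYPE=I": "output",
    }

    def __init__(self):
        self.mode = None          # no device opened yet
        self.device_spec = None   # last device-characteristics string
        self.sent = []            # strings written to the output device
        self.input_buffer = ""    # data waiting on the input device

    def assign(self, text):
        """Model assigning `text` to the shared variable.

        Returns the two-element code read back afterwards; (0, 0) means
        success. The non-zero value used here is invented for the sketch.
        """
        if text in self.CONTROL:
            self.mode = self.CONTROL[text]
            return (0, 0)
        if self.mode == "command":
            self.device_spec = text
            return (0, 0)
        if self.mode == "output":
            self.sent.append(text)
            return (0, 0)
        return (1, 0)


def _checked(io, text):
    """Assign and raise if the return code is not all zeros."""
    code = io.assign(text)
    if any(code):
        raise IOError(f"serial I/O error {code} after assigning {text!r}")


def command(io, spec):
    """Open the command device, then send the device characteristics."""
    _checked(io, "OUT 31001 TYPE=I")
    _checked(io, spec)


def out(io, text):
    """Open the output device, then send one literal string."""
    _checked(io, "OUT 32001 TYPE=I")
    _checked(io, text)


def inp(io):
    """Open the input device and return whatever is in the input buffer."""
    _checked(io, "IN 33001")
    data, io.input_buffer = io.input_buffer, ""
    return data


if __name__ == "__main__":
    io = SerialIOProcessor()
    command(io, "RATE=1200")      # hypothetical specification string
    out(io, "))BEGIN")
    io.input_buffer = "0A1F"      # pretend the microcomputer answered
    print(inp(io))
```

Wrapping every assignment in one checking helper is the point of the design: the experimenter calls three short functions and never inspects return codes directly.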
Device Control Processor
The microcomputer chosen for this study was the Motorola 6800 built up with the components listed on Figure 1. The microprocessing unit (MPU) was built up on one card with the MC 6800 as the processor. Off of the common address bus and data bus leading to the MPU were several types of I/O adapters. These chips provided I/O for two modes of terminal operation as well as interfacing to the APL host. The third mode of operation is that of asynchronous communications through a special chip called the MIKBUG ROM. This ROM provides an asynchronous program, a loader program, and a diagnostic program for use with the MPU. Two Kbytes of memory were built up from ICs on a memory card attached to the address bus and the data bus. Memory could be expanded simply by adding additional cards to the bus.

Communications to the instruments required interfacing, part of which was organized on a channel card as shown in Figure 1. The peripheral interface adapter (PIA) was used for this purpose. The PIA allows eight-bit bidirectional communication with the MPU and two bidirectional eight-bit buses for interfacing to peripherals. Handshake control logic for input and output peripheral operation is also included in the chip. We used the two eight-bit buses together for the input of 16-bit parallel I/O from the lowest level of the hierarchy, the SAI 42 or 43 correlators (Honeywell - SAICOR). The channel card of the DCP had a simple layout based on the Motorola PIA chip. We designed each channel to be of similar structure and only small adaptations, if any, had to be made in order to complete the interface.
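As a minimal sketch of how two eight-bit PIA buses can carry one 16-bit correlator word, the following Python fragment joins and splits the two bytes. The function names are hypothetical, and the choice of which port carries the high-order byte is an assumption; the text does not say how the correlator lines were wired.

```python
def combine_ports(port_a: int, port_b: int) -> int:
    """Join two 8-bit port readings into one 16-bit word.

    Assumes port A holds the high-order byte (an assumption of this sketch).
    """
    if not (0 <= port_a <= 0xFF and 0 <= port_b <= 0xFF):
        raise ValueError("each port value must fit in eight bits")
    return (port_a << 8) | port_b


def split_word(word: int) -> tuple:
    """Inverse operation: return (high byte, low byte) of a 16-bit word."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("word must fit in sixteen bits")
    return (word >> 8) & 0xFF, word & 0xFF


if __name__ == "__main__":
    word = combine_ports(0x12, 0x34)
    print(hex(word), split_word(word))
```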
Our intent is to place all of the special interfacing in the instrument and keep the channel card as clean and ubiquitous as possible. The specification of six channels does not represent a design restriction, but reflects our estimate of what is needed for the laser light scattering experiment.

The boards and power supply of the DCP were built up by Hexagram, Inc., Cleveland, Ohio for a total cost of about $2,000 including labor. Software development for this particular version of our system was done by Hexagram, Inc. and brought the entire cost of the microcomputer to $4,000. Hexagram, Inc. produced a competent design and implementation for us and in the process taught us a fair amount of the technology needed for constructing the systems. We are planning to implement additional microcomputer systems in house at a savings.

Physically, a terminal and instruments are plugged into the various channels using conventional telecommunication connectors.
The host is connected into either the MIKBUG channel or the asynchronous communications interface adapter (ACIA) depending on whether or not the purpose is to load object code, including the initialization of the program for the operating system. Once the operating system has been initialized, the host is switched into an ACIA channel for the remainder of the session. Of course, the program for the operating system could be entered through other media, but we very quickly learned that it was easy and fast to download a processing module as a hexadecimal character array. This was especially convenient to do with the IBM 5100 as the host since one could initiate a program load through assignment of the character vector to the shared variable. Transfer was accomplished at rates of 1200 baud and could be done considerably faster than that if desired.

The first operating system written for the device control processor was based on a set of commands for interacting with the correlator. The system was to be compatible with APL rules. This was easy to do once an APL function was written to emulate the behavior expected of the DCP. The )) is executed by the DCP while ) is executed by the APL processor. The following function defines the DCP.

    ∇ DCPΔP;STR;CM;NAME;CMARG
    ⍝ EMULATION OF THE DEVICE CONTROL PROCESSOR OF FIG 1.
    ⍝ NAMEΔP IS A PROCESSOR
    ⍝ NAMEΔF IS A FUNCTION
    ⍝ BLANKSΔF STRIPS OFF BLANKS AND ':',6⍴' '
    ⍝ NAME IS A VARIABLE OR LABEL
    ⍝ CM IS SHORT FOR COMMAND
    ⍝ CMARG IS THE ARGUMENT OF A COMMAND
    ⍝ ⎕ REPRESENTS THE TERMINAL I/O TO THE DCP.
    L1: STR←BLANKSΔF ⍞,0⍴⍞←':',6⍴' '
    →(∧/'))'=2↑STR)/MPU
    STR←APLΔP STR
    →(∧/'))'=2↑STR)/MPU

... causes the microcomputer to transmit information to either the host or the CRT. The command string ))BEGIN is a microcomputer call to the subroutine START and then TRANSMIT that causes the correlator to run and, when that step is completed, causes the transfer of the data up into the host. The command ))INITIATE invokes a subroutine in the microcomputer that makes the necessary links for a host to interact with the hierarchy.

The host must be able to handle block reception of the data being transmitted from the microcomputer. The time required for the 5100 to change from output to input mode is of the order of 100 to 200 milliseconds and that was slow compared to the microcomputer. However, the 5100 can be instructed to take the last character of the output string as an input prompt. The result was that the last character of the string was not sent until the IBM 5100 had switched from output to input. With this technique it was impossible for the 5100 to
miss any of the data strings being sent from the microcomputer. Examples of APL/5100 functions to communicate with the M6800 microcomputer are available from the authors. Several levels of protocol are involved: bit level handshaking, shared variable conditioning of the serial I/O interface, and the APL defined functions. However, in the end only the APL defined functions need be used by the experimentalist in writing the functions for handling the various aspects of running an experiment.

Data, instructions and functions can be stored on magnetic tape. Usually, data storage on tape was organized with one file for each data set. A file name for each data set is stored in an array called LIBRARY specific for each workspace executing the control functions. The number of lines in LIBRARY tells the system how many files there are on that tape. The size of the files should be known in advance so that the proper size can be marked on the tape. If the sizes of the data sets to be stored are unknown, the files are marked with the size of the largest data set expected. A number of functions were coded in APL to handle the archiving of data automatically.

After each run or, alternately, after a series of runs, APL functions can be invoked to perform data workup calculations. One such calculation that must be done is the conversion of the hexadecimal code transmitted from the microcomputer into the IBM 5100 internal representation for numbers. After that, the least squares curve fit of a run or a series of runs can be made automatically to the cumulant polynomial, eq. (2). A part of the code is shown below.

    K←C CUM N
    ⍝ THE SET OF DELAY TIMES, T IS COMPUTED GIVEN ΔT.
    ⍝ THE FIRST ELEMENT OF R WAS DROPPED.
    ⍝ THE WEIGHTED B VECTOR IS COMPUTED FROM THE LOG OF
    ⍝ THE CORRELATION FUNCTION SUBTRACTING OUT THE BASE LINE.
    ⍝ ALL OF THE DERIVATIVES ARE COMPUTED AND ASSIGNED TO A.
    ⍝ A IS AS LONG AS THE DATA SET AND AS WIDE AS THE NUMBER
    ⍝ OF CUMULANTS TO BE CALCULATED, N PLUS ONE.
    ⍝ B⌹A COMPUTES THE CUMULANTS
    ⍝ CUM IS PART OF A SHORT FUNCTION THAT CONTROLS THE ITERATION.
    B←(NR÷SIG)×⍟R-C
    A←(⍉((N+1),⍴R)⍴NR÷SIG)×(((⍴R),N+1)⍴1,÷!⍳N)×(-T)∘.*0,⍳N
    K←B⌹A
    →0
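The weighted least-squares step that the APL fragment performs can be sketched in plain Python as follows. This is an illustration under stated assumptions, not a translation of the authors' function: it assumes the model ln(R - C) = Σ K_k (-t)^k / k! for k = 0..N, solves the normal equations with a small Gaussian elimination written for the example, and uses the hypothetical names `solve` and `cumulant_fit`. The synthetic data in the usage section are invented.

```python
import math


def solve(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination
    with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def cumulant_fit(t, r, baseline, weights, n):
    """Weighted least-squares fit of ln(r - baseline) to a polynomial
    in (-t) with 1/k! scaling; returns K_0 .. K_n."""
    # Design matrix: one row per delay time, one column per cumulant.
    a = [[w * (-ti) ** k / math.factorial(k) for k in range(n + 1)]
         for ti, w in zip(t, weights)]
    # Weighted right-hand side from the log of the baseline-corrected data.
    b = [w * math.log(ri - baseline) for ri, w in zip(r, weights)]
    # Normal equations (A^T A) K = A^T b.
    ata = [[sum(row[i] * row[j] for row in a) for j in range(n + 1)]
           for i in range(n + 1)]
    atb = [sum(row[i] * bi for row, bi in zip(a, b)) for i in range(n + 1)]
    return solve(ata, atb)


if __name__ == "__main__":
    # Synthetic, noise-free data generated from known cumulants.
    known = [0.5, 2.0, 0.3]
    t = [0.1 * i for i in range(1, 21)]
    r = [1.0 + math.exp(known[0] - known[1] * ti + known[2] * ti ** 2 / 2)
         for ti in t]
    print(cumulant_fit(t, r, 1.0, [1.0] * len(t), 2))
```

The normal equations are adequate for the low polynomial orders used in a cumulant analysis; for higher orders an orthogonal factorization would be the safer choice.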
To make memory available for large calculations, a system was designed whereby all the data acquisition programs were themselves stored on tape as canonical representations of the functions. Only the first few lines of a calling function are
resident in the workspace. The rest are stored. The data handling functions are invoked by the calling program and so are cleared from the memory by the ⎕EX function when it is finished. Storing the functions and data on tape and having two tape drives expands the available workspace memory to approximately 400K bytes. Obviously, the execution time is slowed down when functions have to be read from tape. Even though the IBM 5100 execution time is slower on individual functions than a larger time sharing system, it does not have the transmission and sharing delays that a large system requires. Operating at 1200 bits per second transmission, the 5100 can take and store a 400 point data set every 15 sec. In contrast, the Xerox Sigma 7 as a host exchanging data at 300 bits per second required an average of 5 minutes to take and store a data set. Even though the microcomputer is slower at executing individual functions, its dedication to one user and one experiment makes its overall throughput at least 5 to 10 times that of a conventional time sharing machine.

Results and Conclusions

The hierarchical system has been operating reliably for six months and has been used by roughly a dozen different experimenters. The effectiveness of the system is demonstrated by a simple problem. In working up the autocorrelation function data to diffusion coefficients (see the first section), the weight function for the cumulant analysis must be estimated. Edwards [7] has constructed a model of the process that predicts a certain variation of the standard deviation of the correlation function (σ) with delay time.
A simple propagation-of-error analysis will show that the values of the cumulants are sensitive to the variation of the weight function when very accurate estimates of the diffusion coefficient (β
, the time the CPU spends executing a program. The mass storage transfers are handled by peripheral processors, during which time the central processor is performing housekeeping operations or working on another program. The actual "time on the machine" or turn-around time on the CYBER is difficult to obtain and varies so widely, depending on the load on the machine, as to be worthless as a performance measure.

Finally, in order to obtain a reliable estimate of actual execution time with a minimum of overhead from the operating system as well as from sources described above, a simple benchmark designed to mimic the most time consuming portion of the typical SCF calculation was run on the ECLIPSE, the CYBER 173, and the CDC 6400. As mentioned earlier, for semi-empirical methods the solution of Eq. (2) involves principally the repeated diagonalization of F(C) until self-consistency is obtained. Thus, a benchmark involving the repeated diagonalization of F plus some matrix multiplications should closely resemble step 2. This execution benchmark, then, consisted of the diagonalization of F, which was an array filled with real numbers ranging from $-10^{5}$ to $+10^{5}$ with magnitudes varying from $10^{-5}$ to $10^{5}$, followed by the back-transformation of F to recover the original matrix. We refer to this as the diagonalization benchmark. The numerical precision for various word lengths (i.e., single vs. double precision) was determined by subtracting the value of each
element in the back-transformed matrix F from the original matrix F. The largest value of this difference matrix, then, gave an estimate of the machine accuracy. Execution times for the diagonalization benchmark were measured for varying array sizes. The purpose of varying the dimension of the matrices was to eliminate the effects of operating system overhead and to provide a basis for extrapolating execution times to larger matrices. The entire diagonalization benchmark was placed in a FORTRAN DO loop and executed 100 times for each array size. The time at the start of execution and the time at the end of execution were printed. Printing of the arrays was suppressed in this series of runs so that I/O time was not a factor.
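A minimal sketch of a benchmark of this shape is given below in Python. It is not the authors' FORTRAN program: the cyclic Jacobi method is swapped in as the diagonalizer (the text does not name the routine used), the matrix is assumed symmetric, and the names `jacobi_eigh` and `benchmark`, the random fill and the convergence tolerance are choices made for the example. It diagonalizes F, back-transforms to recover F, times the repeated loop, and reports the largest element of the difference matrix.

```python
import math
import random
import time


def jacobi_eigh(a, tol=1e-14, max_sweeps=50):
    """Cyclic Jacobi diagonalization of a real symmetric matrix.

    Returns (eigenvalues, v) with eigenvectors stored as columns of v.
    """
    n = len(a)
    a = [row[:] for row in a]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    norm = math.sqrt(sum(x * x for row in a for x in row))
    for _ in range(max_sweeps):
        off = math.sqrt(sum(a[i][j] ** 2
                            for i in range(n) for j in range(n) if i != j))
        if off <= tol * norm:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                theta = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, theta) / (
                    abs(theta) + math.sqrt(theta * theta + 1.0))
                c = 1.0 / math.sqrt(t * t + 1.0)
                s = t * c
                for k in range(n):   # columns p and q: A <- A J
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):   # rows p and q: A <- J^T A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):   # accumulate eigenvectors: V <- V J
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
                a[p][q] = a[q][p] = 0.0   # annihilated element, set exactly
    return [a[i][i] for i in range(n)], v


def benchmark(n, repeats=100, seed=0):
    """Time `repeats` diagonalize/back-transform cycles on an n x n matrix.

    Returns (elapsed seconds, largest back-transformation error).
    """
    rng = random.Random(seed)
    f = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            value = rng.choice((-1.0, 1.0)) * 10.0 ** rng.uniform(-5.0, 5.0)
            f[i][j] = f[j][i] = value
    start = time.perf_counter()
    for _ in range(repeats):
        e, v = jacobi_eigh(f)
        back = [[sum(v[i][k] * e[k] * v[j][k] for k in range(n))
                 for j in range(n)] for i in range(n)]
    elapsed = time.perf_counter() - start
    error = max(abs(back[i][j] - f[i][j])
                for i in range(n) for j in range(n))
    return elapsed, error


if __name__ == "__main__":
    for size in (5, 10, 15):
        seconds, err = benchmark(size, repeats=10)
        print(size, round(seconds, 3), err)
```

Running several matrix sizes, as in the usage loop, is what allows the fixed overhead to be separated from the part of the time that grows with the dimension.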
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch009
Results and Discussion

In this section we seek answers to the following:

1. Can a more or less "standard" molecular orbital program with the ability to handle 50 basis functions be made to run at all on a minicomputer? If so, how much trouble is it to "bring up" such a program?
2. Will execution times or, more appropriately, turnaround times be reasonable?
3. Will the costs be competitive with alternative computing sources?

The first of these three questions has been answered in the previous section. By using program overlaying and disk mass storage for temporarily holding arrays, a 50 orbital basis semiempirical molecular orbital program can be made to fit in 32K of 16 bit words. Of course, one can effectively increase the size of the program by making even more extensive use of the disk. But this would entail a drastic change in the program and, as we shall see shortly, would increase the mass storage overhead to intolerable limits.

As was discussed earlier, the iterative part of the SCF problem is the most time consuming and has the largest core requirements for both data and code. The size of the largest overlay is of particular importance on the ECLIPSE since the area in core reserved as an overlay area is preset at load time to the size needed to contain the largest overlay in the corresponding segment of the user overlay file. This area is reserved in core throughout execution regardless of whether succeeding overlays are smaller or not, unlike the CYBER. The size of the SCF overlay, which determined the size of the overlay area, was 26.6K, which left a little over 5K for the main overlay and run-time stack (expandable area for temporary storage of variables and intermediate results, e.g., non-common variables). The final program required 31,778 words out of the 32,768 available under MRDOS to load.

A word might be said here regarding the internal clocks
provided by the ECLIPSE operating system. The clock interrupt frequency can be selected at 10, 100, or 1000 hertz. Although only a 6% variation of elapsed time was found between the 10 hertz and the 1000 hertz clocks, this perturbation of the system argues in favor of using an external clock for time measurements. This "clock frequency effect" was not observed for the mass storage benchmark since very little processor time is used.

In preparing the benchmark, we encountered an interesting and important difference in the way array indexing is handled by the two computers. The ECLIPSE FORTRAN manual emphatically cautions the user to ensure that the first index of arrays within a nest of DO loops corresponds to the index of the innermost loop of the nest. Complying with this, however, would have entailed a major revision of the molecular orbital program. We investigated this problem by executing each of the four programs shown in Table I 10,000 times and recording the execution times on each machine. (The statements in the square brackets were not included in the 10,000 executions of each program.) Comparing the ECLIPSE execution times for programs I and III, it is seen that the warning in the ECLIPSE manual is justified. The ECLIPSE results for programs II and IV show that this problem can be easily circumvented by referencing the array with a single subscript and handling the double subscript indexing in FORTRAN with the aid of the "look up table" INJ. When implemented in the diagonalization benchmark, a 30% reduction of execution time occurred. It was thus included in the SCF overlay of the molecular orbital benchmark.

The ECLIPSE results in Table I can be understood when it is recognized that the compiler computes a single index address by the equivalent of the formula IJ = I + 50*(J-1) and that it "optimizes" the code by removing constant expressions from loops.
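The operation counts behind the indexing comparison can be made concrete with a short Python sketch. It only counts multiplications and table look-ups for a 50 x 50 fill under the column-major formula IJ = I + 50*(J-1); it says nothing about actual ECLIPSE or CYBER timings. The function names are hypothetical, and the third function shows the effect of moving the constant product out of the inner loop, which is the kind of optimization the text attributes to the compiler.

```python
N = 50  # array dimension used in the timing programs


def fill_with_formula():
    """Compute the linear index with one multiplication per element."""
    a = [0.0] * (N * N)
    multiplications = 0
    for i in range(1, N + 1):
        for j in range(1, N + 1):
            ij = i + N * (j - 1)
            multiplications += 1
            a[ij - 1] = 1.0          # FORTRAN indices are 1-based
    return a, multiplications


def fill_with_table(inner_lookup):
    """Use a look-up table INJ(K) = 50*(K-1) instead of multiplying.

    inner_lookup=True looks the offset up inside the inner loop;
    inner_lookup=False looks it up once per outer iteration.
    """
    inj = [N * (k - 1) for k in range(1, N + 1)]   # built once, not timed
    a = [0.0] * (N * N)
    lookups = 0
    for i in range(1, N + 1):
        if not inner_lookup:
            offset = inj[i - 1]
            lookups += 1
        for j in range(1, N + 1):
            if inner_lookup:
                ij = i + inj[j - 1]
                lookups += 1
            else:
                ij = j + offset
            a[ij - 1] = 1.0
    return a, lookups


def fill_hoisted():
    """Multiply once per outer iteration, as an optimizer would."""
    a = [0.0] * (N * N)
    multiplications = 0
    for i in range(1, N + 1):
        k = N * (i - 1)
        multiplications += 1
        for j in range(1, N + 1):
            a[k + j - 1] = 1.0
    return a, multiplications


if __name__ == "__main__":
    print(fill_with_formula()[1], fill_hoisted()[1])
    print(fill_with_table(True)[1], fill_with_table(False)[1])
```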
Thus the optimized revision of program III would be

      DO 1 I = 1,50
      K = 50 * (I-1)
      DO 1 J = 1,50
    1 A(K + J) = 1.0

which requires only 50 multiplications (the slowest operation in the program) rather than the 2500 of program I. Programs II and IV require no multiplications, with program IV requiring 2450 fewer table look-ups than II.

The CYBER results are puzzling in this context since the CYBER compiler also optimizes the source code and uses the same formula for the single subscript IJ as does the ECLIPSE. On the CYBER, programs II and IV take nearly twice as long as I and III (which are now comparable), in contrast to the ECLIPSE. The explanation for this lies in the fact that the multiplications 50*(I-1) are effectively eliminated by the compiler. This is accomplished by using the principle that any positive integer can be expressed as a linear combination of powers of 2 with coefficients of plus or minus one. Thus, in our example, the integer 50 (which is known to the compiler from the DIMENSION statement) can be written as $2^{6} - 2^{4} + 2^{1}$. Multiplication of any integer (I-1) by $2^{n}$ can be very rapidly carried out on the CYBER by simply shifting the bits of the integer n spaces to the left. Thus the multiplication by 50 is replaced by three bit shifts, an integer addition and an integer subtraction. The impact of using singly dimensioned arrays on the CYBER version of the molecular orbital benchmark can be estimated from the fact that the diagonalization benchmark took 10% more time to execute when the arrays were made linear.

TABLE I. COMPARISON OF INDEXING METHODS

                                              EXECUTION TIME (SEC)
      FORTRAN CODE                            ECLIPSE      CYBER

I.    [DIMENSION A(50,50)]                      785          66
      DO 1 I = 1,50
      DO 1 J = 1,50
    1 A(I,J) = 1.0

II.   [DIMENSION A(2500),INJ(50)                625         120
      DO 5 K = 1,50
    5 INJ(K) = 50*(K-1)]
      DO 1 I = 1,50
      DO 1 J = 1,50
      IJ = I + INJ(J)
    1 A(IJ) = 1.0

III.  [DIMENSION A(50,50)]                      549          70
      DO 1 I = 1,50
      DO 1 J = 1,50
    1 A(J,I) = 1.0

IV.   [DIMENSION A(2500),INJ(50)                550         103
      DO 5 K = 1,50
    5 INJ(K) = 50*(K-1)]
      DO 1 I = 1,50
      DO 1 J = 1,50
      JI = J + INJ(I)
    1 A(JI) = 1.0

The double precision 64 bit word on the ECLIPSE gave a noticeable improvement in precision compared to the 32 bit single precision floating point word. The largest error in the diagonalization benchmark (with tolerance set to $10^{-8}$) was $0.2 \times 10^{-8}$ for the double precision word and 0.0004 for the single precision version. The CYBER (60 bit word) gave an error of $6.0 \times 10^{-8}$.

Various execution times for the molecular orbital benchmark are shown in Table II. As discussed earlier for the ECLIPSE, the total execution time is the "real time", i.e., the sum of processing time and mass storage transfer time. The contribution of overlay overhead to the latter averages about 6 seconds and is a constant independent of both the molecule and the number of iterations. Therefore, disk I/O in the form of data transfers accounts for the balance of the mass storage time. The ratio of total ECLIPSE execution time to CYBER CPU time shown in Table II varies from 25 for C2H4 to 7.3 for C6H8, where it appears to be leveling off.
This is a result of the diminishing importance of the mass storage transfer time for the larger molecules. The mass storage overhead could easily be eliminated if sufficient extended memory were available to us. Results of a short test program using window mapping for data transfers indicated that 20,000 REMAP operations using a 2K window can be performed in three seconds. The number of data transfers to disk for the molecules executed in the SCF program varied between 43 and 51 depending on the number of iterations. Therefore, using REMAPs instead of disk reads and writes would entail almost zero overhead. The mass storage overhead then would be due solely to overlaying and this is a known constant (6 seconds).

One can easily estimate the SCF ECLIPSE execution time which would be the equivalent of the CYBER CPU time by subtracting the total mass storage times obtained from the mass storage benchmark from the total ECLIPSE execution times. These times are listed in Table II as Δ. The ratio of Δ to CYBER CPU time varies from 3.8 for C2H4 to 4.7 for C6H8. These can be compared
TABLE II. COMPARISON OF EXECUTION TIMES

MOLECULE                          C2H4    C3H6    C4H6    C6H6    C6H8
# ORBITALS                          12      18      22      30      32
# ITERATIONS IN SCF                  9      13      11      10      13

ECLIPSE
  TOTAL TIME (SEC)                  72      98     106     151     188
  MASS STORAGE TIME                 61      68      65      62      68
  Δ (TOTAL - MASS STORAGE)          11      30      41      89     120

CYBER CPU TIME (SEC)              2.88    6.73    9.06   18.81   25.70

ECLIPSE/CYBER RATIOS
  TOTAL TIME/CPU TIME             25.0    14.5    11.7     8.0     7.3
  Δ/CPU TIME                       3.8     4.5     4.5     4.7     4.7
to the ECLIPSE/CYBER execution ratios obtained in the diagonalization benchmark (which has no mass storage c a l l s ) , thus provid ing an independent check of our assumption that Δ corresponds to the CYBER CPU time. The diagonalization benchmark was executed for both singly and doubly subscripted arrays because, although single subscripting was shown to be faster than double, the SCF program used a combination of both. For single subscripting, the ratios obtained in the diagonalization benchmark varied from 3.7 (15 χ 15 matrix) to 3.8 (35 χ 35 matrix) while those for double varied from 4.6 (15 χ 15 matrix) to 5.0 (35 χ 35 matrix). The ratios of Δ to CYBER CPU time given above f a l l s well within the range 3.7 to 5.0. The evaluation of the cost effectiveness^ of anything i s beset with many d i f f i c u l t i e s and the effectiveness of carrying out molecular orbital calculations on a mincomputer i s no exception, Some of these d i f f i c u l t i e s have been discussed in the section on cost analysis. We venture no further discussion here other than to remark that the results presented in this section must be regarded as crude. Table III shows the CYBER cost (as given by the University's charging algorithm) for each molecule and the ratios of the CYBER costs to the ECLIPSE costs (as computed under the cost plans A and Β described e a r l i e r ) . Table III also includes the estimated ratios of CYBER costs to the costs obtained on an ECLIPSE with 65K of extended memory. This is s u f f i c i e n t memory to eliminate the disk I/O from the molecular orbital benchmark. This estimate was obtained by f i r s t modifying cost plans A and Β to include the cost of the additional memory (and i t s maintenance) in the total five year price of the machine. The execution times on this hypothetical ECLIPSE were estimated by adding six seconds (the fixed cost of the disk overlaying overhead) to the A's of Table II. 
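The timing arithmetic described above can be retraced with a short script. The following is only a sketch, not part of the original work: the times are transcribed from Table II, the names `TABLE_II`, `timing_summary` and `OVERLAY_SECONDS` are hypothetical, and the REMAP estimate assumes, purely for illustration, that one REMAP would replace each disk transfer.

```python
# Values transcribed from Table II; all times are in seconds.
TABLE_II = {
    # molecule: (ECLIPSE total time, mass storage time, CYBER CPU time)
    "C2H4": (72, 61, 2.88),
    "C3H6": (98, 68, 6.73),
    "C4H6": (106, 65, 9.06),
    "C5H10": (151, 62, 18.81),
    "C6H8": (188, 68, 25.70),
}

OVERLAY_SECONDS = 6  # fixed overlaying overhead quoted in the text


def timing_summary(total, mass_storage, cyber_cpu):
    """Return delta, both ECLIPSE/CYBER ratios, and the extended-memory estimate."""
    delta = total - mass_storage
    return {
        "delta": delta,
        "total_ratio": total / cyber_cpu,
        "delta_ratio": delta / cyber_cpu,
        # Hypothetical ECLIPSE with extended memory: only overlaying remains.
        "extended_time": delta + OVERLAY_SECONDS,
    }


summaries = {name: timing_summary(*row) for name, row in TABLE_II.items()}

# Window mapping benchmark: 20,000 REMAP operations took three seconds.
seconds_per_remap = 3.0 / 20_000
# Assumption for illustration: one REMAP per former disk transfer (at most 51).
remap_overhead = 51 * seconds_per_remap

for name, row in summaries.items():
    print(f"{name:6s} delta={row['delta']:4d} s  "
          f"delta/CPU={row['delta_ratio']:.1f}  "
          f"extended={row['extended_time']:4d} s")
print(f"REMAP overhead for 51 transfers: {remap_overhead * 1000:.2f} ms")
```

Under that assumption the REMAP overhead comes to a few milliseconds, which is negligible next to the six seconds of overlaying.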
According to the results shown in Table III, it is far cheaper to use an ECLIPSE for these calculations than the CYBER. Even in the worst case shown, the CYBER is nearly ten times more expensive to use than the ECLIPSE. Moreover, the results also show that the additional extended memory on the ECLIPSE is well worth the extra investment, although the differences for the two ECLIPSE configurations are not great for the larger molecules.

TABLE III. COMPARISON OF COSTS

MOLECULE                   C2H4    C3H6    C4H6   C5H10    C6H8
CYBER CHARGE              $0.41   $0.74   $0.91   $1.76   $3.28
CYBER/ECLIPSE
  PLAN A                    9.3    12.3    14.0    19.0    28.5
  PLAN B                   16.9    22.4    25.5    31.6    51.9
CYBER/EXTENDED ECLIPSE
  PLAN A                   32.4    26.0    24.     23.4    33.0
  PLAN B                   59.1    47.5    44.     42.8    60.1

Conclusions

The surprising aspect of this work is not that a molecular orbital program could be run on a 16 bit minicomputer. Given a suitable length floating point word, such programs can be highly overlayed and, at the worst, the array dimensions can be lowered. We believe that even ab-initio programs can be made to run on the ECLIPSE, although the basis set size might be limited and mass storage overhead somewhat high. What was surprising to us was the sheer power of the ECLIPSE. The results of the diagonalization benchmark showed that the ECLIPSE is only 4 to 5 times slower than the CYBER 173, even though the ECLIPSE uses a 64 bit floating point word. For a further comparison, we found via the diagonalization benchmark that the CYBER 173 is 30 to 50% faster than the CDC 6400.

Because a minicomputer used in this fashion is a dedicated "hands on" machine, with the only delay due to printing, the turn-around time will often be much better than that of the CYBER. Provided that the usage demand is sufficiently high, we believe that the use of a minicomputer is both a highly convenient and cost effective alternative to using University computer centers for the type of calculation described in this paper.

Acknowledgement

We are very grateful to Dr. Stanley Bruckenstein for the use of his ECLIPSE. We also thank Mr. Greg Martinchek as well as other members of Dr. Bruckenstein's group for their valuable assistance in using this machine.

Literature Cited

(1) Dewar, M. S. and Haselbach, E., J. Amer. Chem. Soc. (1970) 92, 1285.
(2) Pople, J. A., Beveridge, D. L. and Dobosh, P. A., J. Chem. Phys. (1967) 47, 2026.
10 A Minicomputer Numbercruncher A. LINDGÅRD, P. GRAAE SØRENSEN, and J. OXENBØLL
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch010
Chemistry Laboratory III, H. C. Ørsted Institutet, University of Copenhagen, Universitetsparken 5, DK-2100 København Ø
Due to the low price of small minicomputer configurations, they have become very popular in computerized instrumentation for chemistry and physics. Large scale scientific computing has not been much influenced by this, but has mostly been done on large computers. An example of the use of a monoprogrammed minicomputer for quantum chemistry is found at Berkeley (Miller and Schaefer, 1973), but the configuration, with plenty of main memory, backing store etc., is not typical for minicomputer systems. On the other hand, the small stand-alone systems are not suited for program development due to the lack of powerful peripherals. Only interpreters like BASIC can be used with a reasonable turnaround time for program development, and BASIC is certainly not suited for anything but small programs.

A problem with the large machines is that they are very expensive for jobs having large cpu-time requirements. Monte Carlo calculations in statistical mechanics can often require weeks of cpu-time, but do not require much backing store or use of peripherals. Considering these cpu-bound problems, it became clear that a dedicated minicomputer with a reasonable amount of fast store would be sufficient to do these types of calculations at a very low cost, the only problem being how to develop programs and get data in and out of the memory.

At the H. C. Ørsted Institute there was a need for handling problems in statistical mechanics and chemical kinetics requiring weeks to months of cpu-time. Core requirements for these jobs are low. These jobs could of course run on our medium size multiprogrammed RC4000 computer (Brinch Hansen, 1967), but not in a reasonable way. Either other users would have problems getting a decent turnaround time for their computational jobs, and the RC4000 would be less attractive for doing small jobs like editing, compiling and running small programs from a terminal, or the turnaround time for the time-consuming job would have been so long that it could not have been realized.
System design.

The purpose of the system is to make long cpu-bound computations feasible. Typically a program will run for a few hours before it needs attention from the RC4000 for storing away data. The program will then go on making a new computation. This cycle may continue for weeks. From the point of view of the minicomputer, the RC4000 is a backing store. The computed data may be stored by an RC4000 control program on the backing store, i.e. the disc. The final security of the data is assured by the security dump of the whole backing store on a magnetic tape, which is done once every day.
Selecting the minicomputer.

The primary criteria used in selecting the minicomputer for this project were processing rate, instruction repertoire and cost. It was decided that hardware multiply/divide was essential for most applications, but that floating point arithmetic would be used in a few cases only. It was expected that a large amount of processing time would be used for bit manipulation and memory addressing, and an advanced addressing scheme with easy use of index registers was important. The Texas Instruments 980A was selected as a reasonable compromise between the abovementioned requirements. For instance, the shift instruction can handle a variable number of positions, and the hardware multiply/divide is not too difficult to use for multilength integer arithmetic. Further, the protection system of the TI980A was considered an advantage. Software support from the manufacturer was not considered, because we already have a general assembler for any minicomputer and microcomputer, and program development should not be done on the minicomputer.
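To illustrate what multilength integer arithmetic built on a single-word hardware multiply involves, the following sketch multiplies two integers held as lists of 16 bit words, using only 16 × 16 → 32 bit products. It is an illustration under stated assumptions, not the authors' code: the helper names are hypothetical, words are stored least significant first, and Python integers stand in for the machine registers.

```python
WORD_BITS = 16
WORD_MASK = (1 << WORD_BITS) - 1


def to_words(value, n_words):
    """Split a non-negative integer into n_words 16 bit words, least significant first."""
    if value < 0 or value >> (WORD_BITS * n_words):
        raise ValueError("value does not fit in the requested number of words")
    return [(value >> (WORD_BITS * k)) & WORD_MASK for k in range(n_words)]


def from_words(words):
    """Reassemble an integer from 16 bit words, least significant first."""
    return sum(word << (WORD_BITS * k) for k, word in enumerate(words))


def multiply_words(a, b):
    """Schoolbook product of two word lists using only single-word multiplies."""
    result = [0] * (len(a) + len(b))
    for i, a_word in enumerate(a):
        carry = 0
        for j, b_word in enumerate(b):
            # A 16 x 16 bit product plus two 16 bit addends never exceeds 32 bits.
            acc = a_word * b_word + result[i + j] + carry
            result[i + j] = acc & WORD_MASK   # low word stays in place
            carry = acc >> WORD_BITS          # high word moves to the next position
        result[i + len(b)] = carry
    return result


x, y = 0x123456789ABC, 0xFEDCBA98
product = from_words(multiply_words(to_words(x, 3), to_words(y, 2)))
print(product == x * y)
```

The inner step maps onto one multiply instruction followed by double-length additions, which is why a hardware multiply that delivers the full double-length product is convenient for this kind of arithmetic.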
Connecting the minicomputer to the RC4000.
The minicomputer may be connected either as an independent machine having a terminal for the user and only using the RC4000 as backing store, or as a slave computer completely controlled by the RC4000, with no other peripherals. We favor the last solution as it makes hardware and software simpler. A slave computer is like any other completely controlled peripheral. The difference is that a general purpose minicomputer can do very complex data transformations while other peripherals generally cannot.

The minicomputer should not have any character oriented peripherals connected. Character input/output requires a lot of software. If a terminal had been connected, users would furthermore have felt inclined to use the minicomputer for developing, editing and assembling programs. This requires a command interpreter and some program to determine whether this could be done locally or must involve the RC4000. We would certainly use the same command language on the minicomputer as on the RC4000, which implies that we would have had to develop a lot of software. It is much simpler to force the user to use the RC4000 for editing, assembling and loading programs and have no conventional peripherals on the minicomputer.

A slave computer is simple to handle. It can always be put into a well defined state, and it cannot harm the RC4000, as it cannot do anything on its own but has to ask the RC4000 to do it by sending a signal.

The TI980A controller.
Communication between the RC4000 and the TI980A takes place via the low-speed and the high-speed (DMA) data channels of the RC4000, but only via the DMA port of the TI980A. Besides the DMA capability, this port has an instruction controlled output feature and an interrupt input. These features made it easy to build the TI980A interface, because it was only necessary to implement one peripheral device to the minicomputer, namely the RC4000 through the DMA port. The interface can logically be divided into two parts, a control system and a DMA data transfer system.

In the control system the TI980A is connected to the instruction controlled low-speed data channel of the RC4000 and functions as a slave computer. The RC4000 uses five instructions to control the minicomputer: 1) reset, 2) stop, 3) start, 4) single instruction execution and 5) interrupt. Implementation of the first four instructions is performed by connecting the output of the RC4000 controller to the front panel board of the minicomputer and then simply simulating the front panel switches. The interrupt instruction is connected to the DMA port.

The DMA system controls the data transfers between the two computers. A word is loaded from the memory of one computer through its DMA port, stored temporarily in a one word buffer, and then the second computer is requested to store this word in its memory through its DMA port. In one RC4000 24 bit word only one 16 bit TI980A word is stored. No attempt has been made to make a more efficient packing, because it would complicate both software and hardware. The DMA transfer can only be initialized by the RC4000, which has four instructions for this purpose: 1) load the RC4000 address counter for input, 2) load the RC4000 address counter for output, 3) load the TI980A address counter and 4) load the word number counter. Execution of the last instruction also starts the data transfer, which is now controlled by the interface.

The automatic transfer instruction (ATI) of the minicomputer is used for activating the instruction controlled output at the DMA port. This instruction is normally used to initialize a DMA transfer to a peripheral device (e.g. a disc) when the minicomputer is used in a stand-alone system. Here the output is used for a low-speed communication from the TI980A to the RC4000.
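The word-by-word DMA transfer described above can be modelled in a few lines. This is a toy sketch under several assumptions that are not stated in the text: memories are plain Python lists indexed by word, the 16 bit TI980A word is taken to occupy the low 16 bits of the 24 bit RC4000 word, "output" is read as RC4000 to TI980A and "input" as the reverse, and the class and method names are hypothetical.

```python
TI_WORD_MASK = (1 << 16) - 1  # one 16 bit TI980A word per 24 bit RC4000 word


class DmaInterface:
    """Toy model of the interface: two address counters and a word counter."""

    def __init__(self, rc_memory, ti_memory):
        self.rc_memory = rc_memory
        self.ti_memory = ti_memory
        self.rc_address = 0
        self.ti_address = 0
        self.direction = None  # "rc_to_ti" or "ti_to_rc"

    def load_rc_address_for_input(self, address):
        """Instruction 1: data will flow into the RC4000."""
        self.rc_address, self.direction = address, "ti_to_rc"

    def load_rc_address_for_output(self, address):
        """Instruction 2: data will flow out of the RC4000."""
        self.rc_address, self.direction = address, "rc_to_ti"

    def load_ti_address(self, address):
        """Instruction 3: set the TI980A address counter."""
        self.ti_address = address

    def load_word_count(self, count):
        """Instruction 4: loading the word counter also starts the transfer."""
        if self.direction is None:
            raise RuntimeError("RC4000 address counter has not been loaded")
        for _ in range(count):
            if self.direction == "rc_to_ti":
                buffer = self.rc_memory[self.rc_address] & TI_WORD_MASK
                self.ti_memory[self.ti_address] = buffer
            else:
                buffer = self.ti_memory[self.ti_address] & TI_WORD_MASK
                self.rc_memory[self.rc_address] = buffer
            # The one word buffer is reused for every word of the block.
            self.rc_address += 1
            self.ti_address += 1


rc_memory = [0] * 8
ti_memory = [0] * 8
rc_memory[2:5] = [0x1234, 0xFFFF, 0xABCDEF]

dma = DmaInterface(rc_memory, ti_memory)
dma.load_rc_address_for_output(2)
dma.load_ti_address(0)
dma.load_word_count(3)
print([hex(word) for word in ti_memory[:3]])
```

In this model the upper eight bits of each RC4000 word are simply dropped on output, which mirrors the unpacked storage described in the text.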
The DMA port does not have an input feature, so low-speed communication the opposite way is not implemented. The ATI instruction can load two 16 bit TI980A words into peripheral registers, and at the same time sends an interrupt to the RC4000. The RC4000 can read the two registers by sense instructions. The remaining bits which can be read by a sense instruction are used for status.

Software.

The communication and control software developed for this project consists of the following programs:

1. A handler as a part of the RC4000 monitor (Brinch Hansen, 1973) which together with a process description is the peripheral process "ti980a".
2. Initialisation code in the RC4000. This is only executed at system restart in the RC4000.
3. A monitor in the TI980A. This includes a handler for the RC4000 known as "rc4000".
The TI980A monitor provides a control and communication structure similar to that of the RC4000 monitor. The TI980A user area and register file dump is conceptually a process similar to the internal process of the RC4000 monitor (Brinch Hansen, 1973). The process it may communicate with is the peripheral process "rc4000" (see figure 1), and it does so using a message buffer technique equivalent to that in the RC4000 system. Thus multibuffering of input/output is a built-in feature. The structure allows us to implement multiprogramming on the TI980A without changing external conventions and with a relatively small effort in software development.
Figure 1. Structure of a simple job using the TI980A showing the communication and control paths. Rectangular boxes are interface hardware; circles are peripheral processes.

Communication in the TI980A.

When the TI980A user program wants the attention of the RC4000 user program it sends a message. This is done by calling a procedure "send message". A buffer within the TI980A monitor is selected and the message is copied from the user program to the message buffer. The buffer address is returned to the TI980A user program. The latter may send a new message or may wait for an answer to the message sent (see figure 2b for an example). Calling the TI980A monitor procedure "wait answer" delays the TI980A user program until the RC4000 has sent an answer back to the TI980A. The answer from the RC4000 arrives in the same message buffer as used by "send message" and is copied by "wait answer" into an answer area in the TI980A user program. The TI980A user program can call the TI980A monitor to examine whether an answer has arrived.
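The message buffer technique described above can be sketched as a small pool of buffers addressed by index. This is an illustrative model only, not the TI980A monitor: the class `MessageMonitor`, its method names and the pool size are hypothetical, and where the real "wait answer" delays the calling program, this sketch raises an error if no answer is present yet.

```python
from collections import deque


class MessageMonitor:
    """Toy model of a monitor with a fixed pool of message buffers."""

    def __init__(self, n_buffers=4):
        self.buffers = [None] * n_buffers      # contents of each buffer
        self.free = deque(range(n_buffers))    # addresses of unused buffers
        self.answered = set()                  # buffers that now hold an answer

    def send_message(self, message):
        """Copy the message into a free buffer and return the buffer address."""
        if not self.free:
            raise RuntimeError("no free message buffer")
        address = self.free.popleft()
        self.buffers[address] = list(message)
        return address

    def deliver_answer(self, address, answer):
        """Stand-in for the other computer: the answer arrives in the same buffer."""
        self.buffers[address] = list(answer)
        self.answered.add(address)

    def answer_arrived(self, address):
        """Let the user program examine whether an answer has arrived."""
        return address in self.answered

    def wait_answer(self, address):
        """Copy the answer to the caller and release the buffer."""
        if address not in self.answered:
            # A real monitor would delay the caller here instead of failing.
            raise RuntimeError("answer has not arrived yet")
        answer = self.buffers[address]
        self.buffers[address] = None
        self.answered.discard(address)
        self.free.append(address)
        return answer


monitor = MessageMonitor(n_buffers=2)
first = monitor.send_message([1, 2, 3])
second = monitor.send_message([4, 5, 6])   # a new message before the first answer
monitor.deliver_answer(second, [0])
print(monitor.answer_arrived(first), monitor.answer_arrived(second))
print(monitor.wait_answer(second))
```

Because each outstanding message owns its own buffer, several transfers can be in flight at once, which is the sense in which multibuffering comes for free with this structure.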
Communication in the RC4000.
When the RC4000 user program has loaded and started the TI980A, it may send a message to the handler telling the handler to queue up the message buffer until a message arrives from the TI980A user program. When it arrives, the message is copied from the TI980A to the selected message buffer in the RC4000. The RC4000 user program will get the TI980A message copied into its answer area by executing "wait answer". An answer to the message from the TI980A can be sent by the RC4000 user by executing a new "send message", "wait answer" sequence (see figure 2a).

    rcusercomm 1 27 4 76
     1 begin
     2   comment rc4000 user program for control and
     3     communication with ti980a;
     4   integer i;
     5   integer array M,A(1:8),image(1:256),register(1:9);
     6
     6   comment fetch translated ti user program from disc;
     7   careaproc();
     8   M(1):=3 shift 12; comment input operation;
     9   M(2):=firstaddr(image);
    10   M(3):=M(2)+2*256-2;
    11   M(4):=1; comment relative segment for code;
    12   waitanswer(sendmessage(,M),A);
    13   comment the ti user program is now in image;
    14
    14   comment reserve ti980a and move code to ti980a;
    15   reserveproc(,0);
    16   M(1):=5 shift 12; comment output;
    17   comment M(2) and M(3) are unchanged;
    18   M(4):=0; comment first address in ti980;
    19   waitanswer(sendmessage(<:ti980a:>,M),A);
    20
    20   comment set registers and start ti980a;
    21   register(8):=register(9):=0;
    22   comment TI program counter:=TI status register:=0;
    23   M(1):=5 shift 12+2; comment set registers and start;
    24   M(2):=firstaddr(register);
    25   waitanswer(sendmessage(<:ti980a:>,M),A);
    26
    26   comment wait for 5 messages and generate answers;
    27   for i:=1 step 1 until 5 do begin
    28     M(1):=14 shift 12; comment wait message(<:ti980a:>);
    29     waitanswer(sendmessage(,M),A);
    30     comment a message has arrived, generate an answer;
    31     M(1):=10 shift 12; comment send answer(<:ti980a:>);
    32     M(2):=A(2); comment copy TI buffer address;
    33     waitanswer(sendmessage(,M),A);
    34   end loop;
    35
    35   comment remove program and release ti980a;
    36   M(1):=16 shift 12;
    37   waitanswer(sendmessage(<:ti980a:>,M),A);
    38 end
    algol end 15

Figure 2a. Model operating system written in the ALGOL6 dialect (Lauesen, 1969). The program reads the translated TI980A user program from the RC4000 backing store and moves it to the user area of the TI980A (lines 6-19). The TI980A register file is loaded and the minicomputer started (lines 20-25). A number of messages and answers are exchanged (lines 26-34). The minicomputer is released (lines 35-37).

Control.

The RC4000 user program is an operating system for the TI980A user program. It can do block input/output to the user area at any time. It can start and stop the TI980A user program and, when finished, remove the TI980A user program. This is done by sending messages to the handler. The TI980A user program and even the TI980A monitor can do nothing to harm the RC4000 and the activities therein. The control, both in hardware and in software, is exclusive to the RC4000.

Survival.

For long term computations, it would be convenient if the minicomputer could survive most kinds of troubles in the host system, irrespective of whether they are caused by hardware malfunctioning or by new development of hardware and basic software. In hardware the TI980A is protected against the RC4000. The communication channel is separated both from the RC4000 data channels and from the TI980A data channel through two controllers. The TI980A can run even when there is no power on the controller in the RC4000. In the design of the TI980A monitor and the RC4000 handler it was possible to design a safe strategy to keep the TI980A going independent of system deadstarts in the RC4000. This is done by having a copy of all state variables in both the TI980A monitor and the RC4000 peripheral process. At system deadstart in the RC4000 the initialization code reads the state variables from the TI980A monitor into the RC4000 peripheral process. Loading the TI980A monitor is a privileged operation which a normal user cannot execute.

The major problem that arises when handling a survival problem lies in the multiprogrammed RC4000 computer. After deadstart only the person who has last reserved the TI980A should be allowed to control it again, if a TI980A user program is running. The reservation scheme is extended as follows. When the TI980A is free, any RC4000 process may reserve the TI980A. The name of the reserving process is moved to the TI980A monitor. At system deadstart it is copied from the TI980A monitor into the peripheral process. Only an RC4000 process with the same name as the original reserver can now reserve the TI980A. The disadvantage of this scheme is that the RC4000 user process explicitly has to release the ti980a process. A user can take advantage of the automatic process start up facility in one of the operating systems (Graae Sørensen and Lindgård, 1973). The RC4000 user program can easily examine the state of the TI980A and the program therein, making the coding of a start up mechanism relatively easy.
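The extended reservation scheme can be summarized in a short model. This is a sketch under simplifying assumptions rather than the actual handler: the only state variable mirrored is the name of the reserving process, a deadstart is modelled by constructing a fresh peripheral process from the surviving monitor, and all class, function and process names are hypothetical.

```python
class TI980AMonitor:
    """Minicomputer side: keeps running and keeps its state across a host deadstart."""

    def __init__(self):
        self.reserver = None  # name of the RC4000 process holding the reservation


class PeripheralProcess:
    """RC4000 side: rebuilt by the initialization code at every system deadstart."""

    def __init__(self, monitor):
        self.monitor = monitor
        # Initialization code copies the state variables from the TI980A monitor.
        self.reserver = monitor.reserver

    def reserve(self, process_name):
        """Grant the reservation if the device is free or the name matches."""
        if self.reserver not in (None, process_name):
            return False
        self.reserver = process_name
        self.monitor.reserver = process_name  # keep both copies in step
        return True

    def release(self, process_name):
        """Only the reserving process can release; release must be explicit."""
        if self.reserver != process_name:
            return False
        self.reserver = None
        self.monitor.reserver = None
        return True


def deadstart(monitor):
    """Model an RC4000 deadstart: host state is lost and rebuilt from the monitor."""
    return PeripheralProcess(monitor)


monitor = TI980AMonitor()
device = PeripheralProcess(monitor)
print(device.reserve("job_a"))   # the device is free, so any process may reserve it
device = deadstart(monitor)      # the host restarts while the TI980A keeps running
print(device.reserve("job_b"))   # a different name is refused
print(device.reserve("job_a"))   # the original reserver regains control
```

The model also shows the disadvantage noted in the text: until "job_a" calls `release`, no other process can obtain the device, even after a deadstart.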
How to use the TI980A.
In figure 2a is given a simple example of an operating RC4000 algol program and in figure 2b a TI980A assembly language program. The operating program fetches the assembled code as generated by the general assembler (Bang, 1974). The TI980A is loaded with the program after reservation has taken place and the TI980A user program is started. The TI980A user program and the RC4000 user program exchange a number of messages. Finally the RC4000 user program releases the TI980A.

The control model program in figure 2a is a short program, and it is easy to extend it to a realistic control program by including some input/output and test of the communications. Such a program will only be a few pages long and rather trivial to write.

The TI980A model program in figure 2b is very short. It has indeed been the aim of the design to make life easy for the programmer when handling communications. Assembly language coding should be kept at a minimum. Although the communication primitives look different in the TI980A, they work basically the same way as in the RC4000. In an RC4000 assembly language program, the comments would have been the same.

    tiuser
    rep:            ldx  =message       ;
                    trap 3              ; send message(<:rc4000:>, message);
                    ste  bufferaddress  ; bufferaddress:=
                    comment computations and/or other communications
                            may take place here;
                    ldx  =answer
                    lde  bufferaddress
                    trap 4              ; wait answer(bufferaddress, answer);
                    bru  rep            ; goto rep;
    bufferaddress:  0
    message:        0, r.               ; message area
    answer:         0, r.               ; answer area

Figure 2b. Model program for the TI980A showing how to communicate with the RC4000
Discussion.

Starting out with the model operating system (figure 2a), it is a rather trivial task to write an operating system for a specific application. The system has already been successfully used to solve scientific problems in statistical mechanics (Rotne and Heilmann, 1976) for polymers on a grid. The cost per run is very low compared with the cost on a large, fast machine, considering the difference in speed. The stability of the system is extremely good. The TI980A has not failed, at least in the past year. The total hardware development cost is around half the price of the minicomputer. The basic software development cost was around two person months.

Acknowledgement.

The grant from Statens Naturvidenskabelige Forskningsråd to purchase the TI980A is gratefully acknowledged. Jørgen Bang designed and implemented the general assembler. Heinrich Bjerregaard implemented the basic software.
Abstract.
The low price of minicomputers makes them attractive for time-consuming jobs which are only cpu-bound, like Monte Carlo simulations. At the H. C. Ørsted Institute a minicomputer with 12 k main memory has been connected to the multiprogrammed RC4000 computer. All program development is done on the RC4000 and so is the control of the minicomputer.
Literature cited.
Bang, J. (1974), Report 74/15, Datalogisk Institut, København.
Brinch Hansen, P. (1967), Bit 7, 191-199.
Brinch Hansen, P. (1973), Operating Systems Principles, Prentice-Hall, Englewood Cliffs, N.J.
Graae Sørensen, P. and Lindgård, A. (1973), Computers in Chemical Research and Ed. Hadzi, Elsevier, Amsterdam.
Lauesen, S. (1969), ALGOL5 User's Manual, RCSL 55-D42, Regnecentralen, København.
Miller, W.H. and Schaefer, H.F. (1973), Quarterly Reports, Department of Chemistry, University of California, Berkeley, California.
Rotne, J. and Heilmann, O.J. (1976), Proc. VIIth International Congress on Rheology, Gothenburg.
11 Molecular Dynamics Calculations on a Minicomputer PAUL A. FLINN
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch011
Physics and Metallurgy Departments, Carnegie-Mellon University, Pittsburgh, PA 15213
Since its introduction by Rahman in 1964 (1), the technique of computer simulation of motion in liquids, generally known as "molecular dynamics", has played a vital role in increasing our understanding of the real nature of the liquid state. Applications of the technique to various liquids have been reviewed by McDonald and Singer (2), Rahman (3), and Fisher and Watts (4). The results of the calculations have been in excellent agreement with a variety of experimental measurements of the properties of liquids: the equation of state, the radial distribution function, inelastic scattering of neutrons, and diffusion. The molecular dynamics results also provide valuable tests of the adequacy of various approximate analytic theories of liquids. A major limitation of the technique has been economic: the calculations have required large amounts of time on large, expensive computers. Fortunately, it is possible to carry out useful molecular dynamics calculations at greatly reduced cost on a minicomputer (or microcomputer); much more widespread use of the technique, including instructional use, should now be possible.

The calculation is, in principle, quite simple; it consists of numerical integration of the simultaneous nonlinear differential equations of motion for a number of particles constituting a small sample of the liquid. In the original work the Newtonian form of the equations of motion was used:

$$ m \frac{d^2 \mathbf{r}_i}{dt^2} = \sum_{j \neq i} \mathbf{f}(r_{ij}) $$

where $m$ is the particle mass, $\mathbf{r}_i$ is the position of the ith particle, $r_{ij}$ is the distance between the centers of particles i and j, and $\mathbf{f}$ is the force acting between particles i and j. For the work described here, it was more convenient to use the Hamiltonian equations:
$$ \frac{d\mathbf{p}_i}{dt} = \sum_{j \neq i} \mathbf{f}(r_{ij}), \qquad \frac{d\mathbf{r}_i}{dt} = \frac{\mathbf{p}_i}{m} $$

where $\mathbf{p}_i$ is the momentum of the ith particle. The potential used for a given liquid is generally of a form suggested by theoretical arguments, but with parameters obtained from experimental data on the material. To illustrate the method we use the case of argon, with a Lennard-Jones potential:

$$ V(r) = 4\epsilon \left[ (\sigma/r)^{12} - (\sigma/r)^{6} \right] $$

and parameter values $\epsilon = 1.65 \times 10^{-21}$ J, $\sigma = 3.4$ Å taken from the work of Rahman.
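As an illustration of the potential just quoted, the following Python sketch evaluates the Lennard-Jones potential and the corresponding radial force for the argon parameters given above. The function names and the use of SI units are assumptions made for this example; the original work was written in assembly language and did not use floating point in its main loop.

```python
EPSILON_J = 1.65e-21   # well depth for argon in joules (value quoted in the text)
SIGMA_M = 3.4e-10      # length parameter for argon in metres (3.4 angstrom)


def lj_potential(r, eps=EPSILON_J, sigma=SIGMA_M):
    """Lennard-Jones pair potential V(r) = 4*eps*[(sigma/r)**12 - (sigma/r)**6]."""
    s6 = (sigma / r) ** 6
    return 4.0 * eps * (s6 * s6 - s6)


def lj_force(r, eps=EPSILON_J, sigma=SIGMA_M):
    """Radial force f(r) = -dV/dr; positive values are repulsive."""
    s6 = (sigma / r) ** 6
    return 24.0 * eps * (2.0 * s6 * s6 - s6) / r


# Separation at which the force vanishes and the potential equals -eps.
R_MIN_M = 2.0 ** (1.0 / 6.0) * SIGMA_M
```

The potential crosses zero at r = σ and reaches its minimum of -ε at r = 2^(1/6) σ, about 3.82 Å for these parameters.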
Basic Principles of Minicomputer Use. The characteristic features of most minicomputers are small word size (16 bits), limited memory, and reasonable speed for integer arithmetic. Floating point operations usually require subroutines and are quite slow. The usefulness of minicomputers for molecular dynamics calculations results from the fact that the range of values of the variables needed is sufficiently limited that integer arithmetic can be used, and 16 bit precision is adequate. The only potential difficulties arise in connection with the interatomic force function, which may be of complicated form, and has an unbounded magnitude. These problems can be fairly easily circumvented: the interatomic force function is evaluated and tabulated at the beginning of the calculation; in the body of the calculation, determination of the force is simply a look-up operation. The wide range of the magnitude of the force does not represent any real problem, since the force becomes inconveniently large only at distances considerably shorter than those which actually exist when the liquid is at or near equilibrium. We can, therefore, truncate the magnitude of the force to a constant value for distances less than some $r_s$. We also, as is customary, limit the range of interaction by setting the force equal to zero for distances beyond the cutoff distance $r_c$. Our force law then has the form:

$$ f(r) = \begin{cases} f(r_s), & r < r_s \\ f(r), & r_s \le r \le r_c \\ 0, & r > r_c \end{cases} $$

and the lookup table need cover only the range $r_s \le r \le r_c$. For this calculation, $r_s$ was taken as 2.82 Å, and $r_c$ as 5.07 Å.
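The truncated force law above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the underlying force is taken to be the Lennard-Jones force for the argon parameters quoted earlier, distances are in ångströms, and the function names are invented for the example.

```python
R_S = 2.82          # inner truncation distance r_s in angstrom (from the text)
R_C = 5.07          # cutoff distance r_c in angstrom (from the text)
SIGMA = 3.4         # Lennard-Jones length parameter in angstrom
EPSILON = 1.65e-21  # Lennard-Jones well depth in joules


def lj_force(r):
    """Lennard-Jones force -dV/dr in joules per angstrom (repulsion positive)."""
    s6 = (SIGMA / r) ** 6
    return 24.0 * EPSILON * (2.0 * s6 * s6 - s6) / r


def truncated_force(r):
    """Force law with a constant value inside r_s and zero beyond r_c."""
    if r < R_S:
        return lj_force(R_S)   # magnitude held constant at short range
    if r > R_C:
        return 0.0             # interaction cut off at long range
    return lj_force(r)
```

Because the force is constant below r_s and zero above r_c, only the interval between the two distances has to be tabulated.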
System Hardware. The minicomputer used for this work was a Texas Instruments 960A, borrowed from its normal use as a Mössbauer spectrometer and data processor (5). The system includes 8192 words (16 bit) of semiconductor memory, an interface to a teletype with paper tape punch and reader, and a CRT display. The optional extended instruction set of the 960A includes the following hardware operations: multiply two 16 bit words, 32 bit (double word) product; divide double word by single word, single word quotient and single word remainder; double word add; double word subtract; double word left and right shift operations. The display is provided by a Tektronix 603 storage display unit, driven by two Datel DAC 4410B 10 bit digital to analog converters, interfaced through the communications register unit (CRU) of the computer. Alphameric display is provided by software; no character generating hardware is used. The original cost of the computer was about $7000 (1972). The 960A is no longer made, but equipment with similar performance can now be obtained at a much lower cost. For integer arithmetic, the 960A is only moderately slower than typical large computers, such as the Univac 1108. The times in microseconds for some typical instructions are:

Instruction    960A      1108
Add            3.583     1.50
Subtract       3.583     1.50
Multiply       8.583     3.125
Divide         10.417    3.875
Load           3.333     1.50
Store          3.583     1.50
System Software. The operating system used for this work was one originally written for the Mössbauer spectrometer application, and described in more detail elsewhere (5). It consists of a monitor, i/o routines, and a floating point arithmetic package. The monitor provides for the loading of programs from paper tape, initiation of execution, recovery from error traps, dump, patch, and debug facilities. The i/o routines provide for teletype input and output of decimal, hexadecimal, and alphanumeric data, and CRT display of alphanumeric data by software character generation. The floating point package includes addition, subtraction, multiplication, division, integer to floating point, and floating point to integer conversion. A fixed point square root routine was written and included for this calculation. The system software occupies 2128 words, with an additional 82 words for the square root routine. All programming was done in assembly language, and converted to object code by a macro instruction processor and a cross assembler run on an IBM 360/67.

Molecular Dynamics Program. The working program consists of several parts: initialization, cooling, integration, calculation of statistics, CRT display, and teletype output. The program storage requirements, in 16 bit words, are: initialization, 210; cooling, 38; integration, 192; statistics, 100; display and output, 48; total, 588. The data storage requirements are: force table, 1024; position, momentum, initial position and initial momentum, 384 each; correlation functions, 1024; total data 3584. The overall space required is 4172 words.
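The storage figures above can be checked with a short piece of arithmetic. The Python sketch below simply adds up the quoted word counts; the particle count of 128 is an inference (384 words per array divided by three coordinates) rather than a number stated in the text.

```python
MEMORY_WORDS = 8192          # semiconductor memory of the 960A (from the text)
SYSTEM_WORDS = 2128 + 82     # system software plus square root routine

program_words = {
    "initialization": 210,
    "cooling": 38,
    "integration": 192,
    "statistics": 100,
    "display and output": 48,
}

N_PARTICLES = 128            # assumption: 384 words per array / 3 coordinates
data_words = {
    "force table": 1024,
    "position": 3 * N_PARTICLES,
    "momentum": 3 * N_PARTICLES,
    "initial position": 3 * N_PARTICLES,
    "initial momentum": 3 * N_PARTICLES,
    "correlation functions": 1024,
}

program_total = sum(program_words.values())
data_total = sum(data_words.values())
overall_total = program_total + data_total
words_left = MEMORY_WORDS - SYSTEM_WORDS - overall_total
```

On these figures the program, data and system software together fit within the 8192 words of memory with some room to spare.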
Units and Scaling. It is customary in molecular dynamics calculations to use a system of units based on the properties of the system under study: following the choice of Tsung and Maclin (6), we take the particle mass as the unit of mass, the parameter σ as the unit of length, and a conveniently short time (lO"^ sec.) as the unit of time. With this convention, the momentum is numerically equal to the velocity. In order to avoid unnecessary loss of precision in the integer arithmetic, it is necessary to scale the variables of the problem into proper integer units. We consider first the length scale: we have a system of N particles, with a volume V per particle, for a total volume NV. To use the full precision of the computer we represent the edge length of this volume, $(NV)^{1/3}$, by $2^{16}$. We choose our unit of time for one integration step as 6.25 femtoseconds; this is scaled as 1/16 of an integer unit, since multiplication by Δt is accomplished by a shift of 4 binary places to the right. One integer unit of time is therefore 0.1 picosecond. This choice of distance and time scales fixes the velocity (and momentum) scales. We define the "temperature" of the system in terms of the kinetic energy:

$$ T = \frac{m \langle v^2 \rangle}{3k} $$

Startup of Calculation. The first step in the calculation is the generation of the force lookup table for the range of R needed (10234 to 18426). Since the memory space available was quite limited, steps of 8 were used, so that only 1024 locations were required. Interpolation from the table was planned for intermediate values of R, but proved to be unnecessary.
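The table generation described above might look roughly like the following Python sketch. Several details are assumptions not given in the text: the length scale is inferred from r_s = 2.82 Å corresponding to R = 10234, the force is scaled so that its largest magnitude fills a signed 16 bit word, and entries are signed so that attraction is positive (which lets the later sum of F*DX/R be used without a change of sign). The names are hypothetical.

```python
R_S_INT = 10234            # integer value of r_s (from the text)
R_C_INT = 18426            # integer value of r_c (from the text)
STEP = 8                   # table spacing in integer length units (from the text)
ANGSTROM_PER_UNIT = 2.82 / R_S_INT   # assumption: 2.82 angstrom maps to 10234
SIGMA = 3.4                # Lennard-Jones length parameter in angstrom
FORCE_FULL_SCALE = 32767   # assumption: largest entry fills a signed 16 bit word


def lj_force_reduced(r):
    """Lennard-Jones force in units of epsilon per angstrom (repulsion positive)."""
    s6 = (SIGMA / r) ** 6
    return 24.0 * (2.0 * s6 * s6 - s6) / r


def build_force_table():
    """Tabulate the scaled force at R = R_S_INT, R_S_INT + 8, ... (1024 entries)."""
    n_entries = (R_C_INT - R_S_INT) // STEP
    scale = FORCE_FULL_SCALE / lj_force_reduced(R_S_INT * ANGSTROM_PER_UNIT)
    table = []
    for k in range(n_entries):
        r = (R_S_INT + STEP * k) * ANGSTROM_PER_UNIT
        # Sign flipped so that a positive entry means attraction (assumed convention).
        table.append(-int(round(scale * lj_force_reduced(r))))
    return table


def lookup_force(table, r_int):
    """Form R - RS, clamp at zero, shift right 3 places, and index the table."""
    if r_int >= R_C_INT:
        return 0                       # beyond the cutoff range
    offset = max(r_int - R_S_INT, 0)   # constant force inside r_s
    return table[offset >> 3]
```

With a spacing of 8 the range 10234 to 18426 needs exactly (18426 - 10234) / 8 = 1024 entries, which matches the force table size quoted in the storage budget.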
Next, the initial configuration of the system is constructed by assigning random values to the position and momentum coordinates of the particles. Random numbers are generated by the multiplicative congruence method: repeated multiplication by 3125 and retention of the least significant half of the double word product. This initial configuration has, of course, an extremely high energy. It is necessary to "cool" the system by gradually removing kinetic energy. This is done by periodically reducing the magnitude of each component of momentum of each particle by some fraction of its value. The cooling must be done gradually to avoid freezing in a nonequilibrium state; the cooling rate should not exceed the rate at which the initially extremely high potential energy of the system can be converted into kinetic energy, so that approximate equipartition is maintained.

Main Calculation Loop. The calculation proper is carried out in a nest of three loops. The outermost (T loop) is a time loop; each execution corresponds to one time integration step. The intermediate loop (J loop) is over all particles; one execution corresponds to a calculation of the net force on one particle, and an updating of the position and momentum of that particle. The innermost loop (I loop) is also over all particles; one execution corresponds to a calculation of the force on one particle due to one other particle.
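The startup steps that precede the main loop (random initial configuration and gradual cooling) can be sketched in Python as follows. The multiplier 3125 and the retention of the low 16 bits come from the text; the odd seed, the mapping of each word to a signed value, and the cooling fraction of 1/16 per application are assumptions chosen for illustration.

```python
MASK16 = 0xFFFF


def next_random(x):
    """Multiply by 3125 and keep the least significant 16 bits of the product."""
    return (3125 * x) & MASK16


def to_signed16(x):
    """Interpret a 16 bit word as a signed (two's complement) integer."""
    return x - 0x10000 if x & 0x8000 else x


def random_coordinates(n_values, seed=1):
    """Generate n_values signed 16 bit numbers; the seed is assumed to be odd."""
    values = []
    x = seed
    for _ in range(n_values):
        x = next_random(x)
        values.append(to_signed16(x))
    return values


def cool(momenta, shift=4):
    """Reduce each momentum component by 1/2**shift of its value (assumed fraction)."""
    return [p - (p >> shift) for p in momenta]
```

A call such as `random_coordinates(384)` would fill the three position components of 128 particles, and `cool` would be applied periodically to the momentum array until the "temperature" reaches the desired value.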
We use the following notation to describe the calculation:
X1(I), X2(I), X3(I): position coordinates of the I'th particle.
P1(I), P2(I), P3(I): momentum coordinates of the I'th particle.
DX1, DX2, DX3: components of the vector from particle J to particle I; e.g., DX1 = X1(I) - X1(J).
R: the distance from particle J to particle I.
F: the force exerted by particle I on particle J.
F1, F2, F3: the components of the net force on particle J; this is calculated as a running sum over the I particles in the inner loop.

The calculation proceeds as follows:
Zero the time register and enter the T loop.
Initialize the J register and enter the J loop.
Clear F1, F2, F3 to zero.
Initialize the I register and enter the I loop.
Test and skip if I = J.
Calculate RR = DX1**2 + DX2**2 + DX3**2 as a double word sum.
Test and skip if RR > RRC (separation beyond cutoff range).
Calculate R = SQRT(RR) and store.
Form R-RS and set equal to zero if negative.
Shift right 3 places (divide by 8) and use as index to look up F.
Form the components of F: (F*DX1/R), (F*DX2/R), (F*DX3/R), and add to F1, F2, F3.
Increment I and continue.
On exit from I loop calculate momentum changes as F1, F2, F3 shifted right 4 places (divided by 16, corresponding to Δt = 1/16) and update P1(J), P2(J), and P3(J).
Calculate position changes as P1(J), P2(J), and P3(J), shifted right 4 places, and update X1(J), X2(J), and X3(J).
Display new position.
Increment J and continue.
On exit from J loop, store panel switches and test for exit to monitor, cooling, or temperature calculation.
Increment T register and continue.
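The J and I loops listed above can be sketched in Python using integer arithmetic only. This is an illustration, not the author's assembly code, and it rests on several assumptions: coordinates wrap around as signed 16 bit words (which would give periodic boundaries when the box edge is 2^16), the force table is signed so that a positive entry pulls particle J toward particle I, coincident particles are skipped to avoid division by zero, and `math.isqrt` stands in for the fixed point square root routine. Display and panel switch handling are omitted.

```python
import math

R_S_INT = 10234                 # integer value of r_s (from the text)
R_C_INT = 18426                 # integer value of r_c (from the text)
RRC = R_C_INT * R_C_INT         # squared cutoff used in the range test


def wrap16(v):
    """Wrap an integer into the signed 16 bit range (assumed periodic boundary)."""
    return ((v + 0x8000) & 0xFFFF) - 0x8000


def div_trunc(a, b):
    """Integer division of a by positive b, truncating toward zero."""
    q = abs(a) // b
    return q if a >= 0 else -q


def time_step(x, p, force_table):
    """One pass of the J loop; x and p are lists of [c1, c2, c3], updated in place."""
    n = len(x)
    for j in range(n):
        f = [0, 0, 0]                                   # clear F1, F2, F3
        for i in range(n):
            if i == j:
                continue
            dx = [wrap16(x[i][k] - x[j][k]) for k in range(3)]
            rr = dx[0] * dx[0] + dx[1] * dx[1] + dx[2] * dx[2]
            if rr > RRC or rr == 0:                     # beyond cutoff, or coincident
                continue
            r = math.isqrt(rr)
            index = min(max(r - R_S_INT, 0) >> 3, len(force_table) - 1)
            force = force_table[index]
            for k in range(3):
                f[k] += div_trunc(force * dx[k], r)     # F*DX/R added to running sum
        for k in range(3):
            p[j][k] += f[k] >> 4                        # momentum change, dt = 1/16
            x[j][k] = wrap16(x[j][k] + (p[j][k] >> 4))  # position change, dt = 1/16
```

As in the listed procedure, each particle is moved as soon as its net force is known, so particles later in the J loop see the already updated positions of earlier ones.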
Display and Output of Results. The calculations produce two sorts of results: particle positions as a function of time, and statistical functions of the system. The particle positions as a function of time are displayed on the storage CRT. For reasons of clarity, only those particles in the first octant (all components of position positive) are displayed. After the new position of the J'th particle is calculated, the three components of position are tested, and, if all are positive, the x and y components of position are transmitted through the CRU to the 10 bit digital to analog converters which drive the CRT display unit. We thus display the projection on the x-y plane of the content of the first octant. Placing the display unit in storage mode results in the development of traces of the paths of the centers of the particles. Some typical traces after varying lengths of time (0.36 ps, 0.9 ps, 1.8 ps) are shown in Figures 1, 2 and 3.

Figure 1. Projection on the x-y plane of the traces of the motion of the centers of simulated argon atoms in the first octant of the system. Temperature is 90 K; elapsed time 0.36 picoseconds; computation time, 2 minutes.

Figure 2. Projection of traces of motion of argon atoms as in Figure 1, but after total elapsed time of 0.9 picoseconds; computation time, 5 minutes.

Figure 3. Projection of traces of motion of argon atoms as in Figures 1 and 2, but after total elapsed time of 1.8 picoseconds; computation time, 10 minutes.

Such displays are quite valuable for visualising the nature of a liquid (it rapidly becomes obvious that a liquid is neither gas-like nor solid-like), but, obviously, some quantitative characteristics are needed. Two widely used statistics of a liquid are the mean square displacement function, and the velocity autocorrelation function. We take the mean square displacement function, $x^2(t)$, as the ensemble average:

$$ x^2(t) = \left\langle \left[ \mathbf{x}_i(t) - \mathbf{x}_i(0) \right]^2 \right\rangle_i $$

It is, of course, equal to the time average for any particle:

$$ x^2(t) = \left\langle \left[ \mathbf{x}(\tau + t) - \mathbf{x}(\tau) \right]^2 \right\rangle_\tau $$

but the first form is more convenient here. To evaluate it, we choose a starting time after the system has reached equilibrium, as determined by the constancy of the "temperature". We take this time as t = 0, and store the values of X1, X2, and X3 for all the particles as XS1, XS2, and XS3. At each time step, after completion of the J loop, the current value of $x^2(t)$ is evaluated by summing (X1(I)-XS1(I))**2 + (X2(I)-XS2(I))**2 + (X3(I)-XS3(I))**2 over all particles and dividing by N. The resulting function can be displayed at any time on the CRT or punched out on paper tape at the conclusion of a run for plotting on a pen and ink plotter. A typical plot is shown in Figure 4.

Figure 4. Mean square displacement of simulated argon atoms at 113 K as a function of time (picoseconds).

The normalized velocity autocorrelation function is calculated in a similar way. It is defined as:

$$ \phi(t) = \frac{\langle \mathbf{v}(0) \cdot \mathbf{v}(t) \rangle}{\langle v^2(0) \rangle} $$

At the t = 0 chosen as described above, we store the values of P1, P2, P3 for all the particles as PS1, PS2, PS3. After each time step we form P1*PS1 + P2*PS2 + P3*PS3 for each particle, sum over all particles, and normalize by division by the sum of PS1**2 + PS2**2 + PS3**2 for all particles. This function also can be displayed on the CRT or punched out for external plotting. A typical plot of this function is shown in Figure 5.

Figure 5. Normalized velocity autocorrelation function of simulated argon atoms at 113 K as a function of time (picoseconds).

Comparison with Standard Calculations. The results obtained in this investigation are consistent with those obtained in conventional large machine calculations, but the cost per computation is very much lower. The qualitative features seen in Figures 1-5 are the same as those reported for the standard calculations; a quantitative test is provided by the diffusion coefficient, which is a sensitive test of the technique. Levesque and Verlet (6) have summarized the results of their calculations for argon with the empirical formula in reduced units:

$$ D = 0.006423\, T/\rho^2 + 0.0222 - 0.0280\, \rho . $$

Converted to SI units, this becomes:

$$ D = 5.639 \times 10^{-5}\, T/\rho^2 + 8.270 \times 10^{-9} - 6.207 \times 10^{-12}\, \rho . $$

For the conditions corresponding to the data shown in Figure 4, T = 113 K and ρ = 1.446 × 10³ kg/m³, their equation predicts D = 2.34 × 10⁻⁹ m²/s. The data of Figure 4 correspond to D = 2.77 × 10⁻⁹ m²/s. This difference is of the same order as the scatter of standard calculations, and the discrepancy between calculation and experiment.

The speed of the calculation was quite reasonable: 282 time steps in 10 minutes, or 1692 steps per hour. Each second of machine time corresponds to 3 × 10⁻¹⁵ seconds in argon. For comparison, the reported rate achieved on a large machine, a CDC 6600, was 1500 steps per hour for a somewhat larger system (864 particles) (2). With proper programming, the calculation time varies approximately as N. With this allowance, it appears that calculation on a minicomputer is slower by roughly a factor of 6 than on a large machine. To estimate the relative cost of computation, we take the initial cost of the system and distribute it over three years (a conservative procedure, since our machine has been in continuous use for five years with no maintenance contract and negligible servicing). This corresponds to a cost of $6.40 a day or $0.27 per hour. If we assume a large machine cost of about $100 per hour, the cost of equivalent calculations is lower on the small machine by a factor of about 50.
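The statistics and comparisons of the last two sections can be reproduced in outline with the Python sketch below. It computes the mean square displacement and normalized velocity autocorrelation from stored and current coordinates, evaluates the SI form of the Levesque and Verlet formula, and repeats the speed and cost arithmetic. The function names are invented for the example, floating point is used for convenience, and the particle count of 128 is inferred from the storage figures rather than stated in the text.

```python
def mean_square_displacement(x, x0):
    """Average over particles of |x_i(t) - x_i(0)|**2; x, x0 are lists of triples."""
    total = sum((a - b) ** 2 for xi, x0i in zip(x, x0) for a, b in zip(xi, x0i))
    return total / len(x)


def velocity_autocorrelation(p, p0):
    """Sum of p(t).p(0) over particles divided by the sum of p(0).p(0)."""
    numerator = sum(a * b for pi, p0i in zip(p, p0) for a, b in zip(pi, p0i))
    denominator = sum(b * b for p0i in p0 for b in p0i)
    return numerator / denominator


def levesque_verlet_d_si(temperature_k, density_kg_m3):
    """Empirical diffusion coefficient in m**2/s (SI form quoted in the text)."""
    return (5.639e-5 * temperature_k / density_kg_m3 ** 2
            + 8.270e-9
            - 6.207e-12 * density_kg_m3)


# Speed comparison, assuming run time proportional to the number of particles.
N_MINI = 128                                  # assumption inferred from storage figures
steps_per_hour_mini = 282 * 6                 # 282 steps in 10 minutes
slowdown = (864 / N_MINI) * 1500 / steps_per_hour_mini

# Cost of the minicomputer spread over three years.
cost_per_day = 7000 / (3 * 365)
cost_per_hour = cost_per_day / 24
```

For T = 113 K and a density of 1446 kg/m³ the formula gives about 2.34 × 10⁻⁹ m²/s, and the slowdown works out close to 6, in line with the figures in the text.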
Future Prospects. The use of currently available hardware, instead of the obsolete 960A, would make possible both greater savings and much more ambitious calculations. In particular, a three dimensional array of microprocessors (such as the TI 9900), each assigned a portion of the volume under study, could be used to increase the speed of calculation by more than an order of magnitude for a cost increase of about a factor of two.
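One way such an array might divide the work is sketched below in Python. The text does not specify a partitioning scheme, so this is purely hypothetical: each axis of the box is split into 2^b equal slabs using the top b bits of the 16 bit coordinate, and each resulting cell would be handled by one processor.

```python
def cell_index(coord, bits_per_axis):
    """Map a signed 16 bit coordinate to a cell number 0 .. 2**bits_per_axis - 1."""
    return (coord + 0x8000) >> (16 - bits_per_axis)


def assign_to_processors(positions, bits_per_axis=1):
    """Group particle numbers by the cell (hypothetical processor) that owns them."""
    cells = {}
    for n, (x1, x2, x3) in enumerate(positions):
        key = (cell_index(x1, bits_per_axis),
               cell_index(x2, bits_per_axis),
               cell_index(x3, bits_per_axis))
        cells.setdefault(key, []).append(n)
    return cells
```

With one bit per axis the volume is split into the eight octants; a real scheme would also have to exchange particles near cell faces, since the interaction range extends across cell boundaries.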
Literature Cited.
(1) Rahman, A., Phys. Rev. (1964), 136, A405.
(2) McDonald, I. R. and Singer, K., Quart. Rev. (1970), 24, 238.
(3) Rahman, A., in "Interatomic Potentials and Simulation of Lattice Defects", ed. by Gehlen, P. C., Beeler, J. R. Jr., and Jaffee, R. I., p. 233, Plenum, N.Y., 1972.
(4) Fisher, R. A. and Watts, R. O., Aust. J. Phys. (1972), 25, 529.
(5) Flinn, P. A., in "Mössbauer Effect Methodology", vol. 9, ed. by Gruverman, I. J., Seidel, C. W., and Dieterly, p. 245, Plenum, N.Y., 1974.
(6) Levesque, D. and Verlet, L., Phys. Rev. (1971), A2, 2514.
12 Many-Atom Molecular Dynamics with an Array Processor KENT R. WILSON
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch012
Department of Chemistry, University of California—San Diego, La Jolla, CA 92093
"The change of motion is proportional to the motive force impressed; and is made in the direction of the right line in which the force is impressed." Sir Isaac Newton, Philosophiae Naturalis Principia Mathematica, 1687. I.
Introduction and History
A. Theoretical Instruments. We chemists traditionally have built specialized instrumentation for experimental studies. We are now beginning also to build specialized instrumentation for theory (1). While we are accustomed to designing and building, for example, special spectrometers or molecular beam machines to efficiently probe the experimental side of a particular class of chemical questions, it is now becoming clear that with comparable effort we can also design and build specialized computational systems which will efficiently probe particular classes of theoretical problems. The reasons for building specialized instrumentation in either case are similar: we want to explore chemical questions beyond the range of what we can learn using general purpose commercial instrumentation, which must sacrifice specific efficiency to generalized applicability.

B. Plastic Hardware. We are accustomed to thinking of computer software as plastic, malleable; employed to adapt a general purpose computer to our specific needs. The advance of computer science and technology has now softened hardware as well, making it also plastic, moldable to effectively fit the task at hand. But while hardware is plastic, it still has restraints. It flows more easily in some directions than in others. Thus, the initial task is to find those chemical problems which are best suited to this natural direction of hardware flow. For example, it is now cheaper to replicate many identical hardware units than to produce even a few different units. Therefore, one direction of hardware flow is toward structures composed of many identical units, working in parallel (2-4). The congruent chemistry involves those theoretical problems which can be cast into forms involving many simultaneous parallel streams of computation.

C. Mechanical Molecules. One such chemical area is the classical mechanical treatment of how nuclei, or roughly speaking atoms, interact on a Born-Oppenheimer potential surface. The idea that the forces among a collection of particles determine both their static configuration (molecular structure) and their motions (molecular dynamics) is an old one. Newton, in the 17th century, already understood the fundamental concepts of classically interacting particles and considered that macroscopic properties might result from microscopic interactions. By the 19th century, with the acceptance of the atomic theory, the view that chemistry should ultimately be an exercise in mechanics became a popular one. The nature of the underlying mechanics became apparent fifty years ago with the development of quantum mechanics; it is now clear that what the electrons are doing is inherently a quantum problem, but that, given a potential surface derived either from a theoretical quantum computation of electronic energy or from a fit to experimental measurements, what the nuclei are doing both in terms of molecular structure and molecular dynamics can be handled in most cases reasonably well by that approximate form of quantum mechanics called classical mechanics. (In a sense this is unfortunate, for chemistry would be an even more subtle and interesting puzzle if Planck's constant were larger.)

We will thus concentrate here on the advantages which computer hardware plasticity can bring to classical molecular dynamics. (Molecular statics or molecular structure will be viewed in this context as that subset of molecular dynamics for which the energy has been reduced to a global minimum.) The structure of the computation is exceedingly simple, a desirable situation for a first essay into a different mode of solution. Given N atoms, we have, from Newton's Second Law,

$$ \mathbf{F}_i = m_i \frac{d^2 \mathbf{r}_i}{dt^2} ; \qquad i = 1, \ldots, N \qquad (1) $$

$$ \mathbf{F}_i(\mathbf{r}_1, \ldots, \mathbf{r}_N) = -\nabla_i V(\mathbf{r}_1, \ldots, \mathbf{r}_N) \qquad (2) $$

in which $\mathbf{F}_i$, the force on the ith atom, located at $\mathbf{r}_i$, is a function of the positions, $\mathbf{r}_1, \ldots, \mathbf{r}_N$, of the atoms, whose masses are $m_1, \ldots, m_N$, and V is the Born-Oppenheimer potential surface seen by the nuclei.
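Equations (1) and (2) can be illustrated with a short Python sketch that obtains the force on each atom as the negative gradient of a potential function and converts it to an acceleration. The central difference gradient, the step size h, and the function names are assumptions made for this example; a production code would normally use analytic derivatives.

```python
def forces(potential, positions, h=1e-6):
    """Eq. (2): F_i = -grad_i V, estimated by central differences.

    potential takes a list of [x, y, z] positions and returns a number.
    """
    result = []
    for i in range(len(positions)):
        f_i = []
        for k in range(3):
            plus = [list(r) for r in positions]
            minus = [list(r) for r in positions]
            plus[i][k] += h
            minus[i][k] -= h
            f_i.append(-(potential(plus) - potential(minus)) / (2.0 * h))
        result.append(f_i)
    return result


def accelerations(potential, positions, masses):
    """Eq. (1) rearranged: d2r_i/dt2 = F_i / m_i for each atom."""
    return [[f / m for f in f_i]
            for f_i, m in zip(forces(potential, positions), masses)]
```

Any integration scheme can then advance the positions from these accelerations; because each atom's force depends only on the shared set of positions, the N force evaluations are independent of one another within a time step.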
D. Two Molecular Dynamics. Strangely, the application of this viewpoint, that chemistry may be understood as the detailed mechanics of atomic motions, has led to two quite distinct fields, each called by the same name, molecular dynamics, which have remained quite separate for twenty years. Both fields, which are compared in Table I, grew up in the late 1950's, one (5) out of statistical mechanics (SM), largely (but not exclusively) concerned with equilibrium and steady state properties, usually of fluids composed of many simple particles: hard spheres, atoms or simplified molecules. The breakthrough which triggered the development of the field was computational, the ability provided by the electronic computer to actually calculate the trajectories of many interacting particles.

TABLE I. Comparison of two fields called molecular dynamics

Category                   Molecular Dynamics (SM)         Molecular Dynamics (CK)
historical antecedents     statistical mechanics           chemical kinetics
initiating breakthrough    computational                   experimental
major application          equilibrium and steady state    chemical reactions
number of atoms            many                            few
major state                liquid                          vacuum (isolated molecules)
The other molecular dynamics (6, 7) grew out of chemical kinetics (CK) and has been concerned with understanding the detailed mechanics of the mechanisms of chemical reactions, usually involving relatively few atoms, smaller molecules colliding and reacting in isolation, the "vacuum" phase. The development of the field was initiated by experimental advances, the ability provided by molecular beam and infrared chemiluminescence techniques to measure the results of individual chemical reaction events. What we are now attempting is a synthesis drawing from both fields of molecular dynamics, a computational advance which will allow through mechanics the study of the detailed mechanisms of chemical reactions involving many atoms, often occurring in solution.
E. Difficulties and Directions. Given that the structure of Eqs. (1) and (2) is so simple, why isn't the detailed mechanism of many-atom chemical reactions routinely studied by computing the trajectories of the atoms? Three major difficulties are as follows.

1. Potential surface. In reality, we know quantitatively relatively little about the basic determinant of molecular structure and dynamics, the forces among atoms. If we had to compute from first principles the potential surface to chemical accuracy separately for each large molecule of interest, along with all the interactions with surrounding solvent molecules, the problem would seem insurmountable. Our chemical experience, conceptualization, nomenclature and system of cataloging of molecules, however, are based on the faith that molecules can be analyzed into functional groups which retain their approximate identity and nature from molecule to molecule. Thus the force functions, $\mathbf{F}_i(\mathbf{r}_1, \ldots, \mathbf{r}_N)$, to a first approximation should be decomposable into i) local force functions which describe chemical functional groups and which are approximately transferable from molecule to molecule and ii) terms which describe the interaction among functional groups.
This transferable force function approach has been extensively developed in vibrational spectroscopy (8), organic chemistry (9-12) and biochemistry (13, 14), and the wide extent of its applicability is stressed in a recent review by Warshel (15), who describes both the usual type of fully empirical potential surface treatment and a version in which π electrons are treated in a formulation derived from semiempirical quantum mechanics. Thus a reasonable approach to potential surfaces is the patient collection and refinement, with respect to theoretical calculations and comparison of computed to measured parameters, of a library of force functions which should be at least approximately transferable from molecule to molecule.
2. Computational speed. If one wishes to study the detailed molecular dynamics of reactions of even simple molecules in solution, one must consider at least a single solvation shell around each molecule, and thus at least the order of 100 atoms. Given x, y and z components for Eqs. (1) and (2), one must solve the order of 300 coupled differential equations, integrating forward for thousands or perhaps millions of time steps. The number of arithmetic operations involved is therefore inevitably large. If one wishes to interact with the on-going calculations, viewing the trajectories of the atoms and seeing the results of modifications of parameters within a reasonable waiting time, the processing system must be a rapid one even by today's large computer standards. This difficulty, however, is overshadowed by an even more demanding and subtle one.
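The forward integration described in item 2 can be sketched in a few lines. The example below is only an illustration, assuming that the equations of motion are Newton's second law with forces derived from a potential; the velocity Verlet scheme, the two-atom harmonic bond, and all function names are assumptions made for the sketch and are not taken from the NEWTON implementation.

```python
import math

def harmonic_pair_forces(pos, k=1.0, r0=1.0):
    """Forces on two atoms joined by a harmonic bond, V = 0.5*k*(r - r0)**2.
    This toy potential is an assumption standing in for a real force function."""
    d = [pos[1][c] - pos[0][c] for c in range(3)]
    r = math.sqrt(sum(x * x for x in d))
    f_on_1 = [-k * (r - r0) * x / r for x in d]   # restoring force on atom 1
    f_on_0 = [-x for x in f_on_1]                  # equal and opposite on atom 0
    return [f_on_0, f_on_1]

def total_energy(pos, vel, masses, k=1.0, r0=1.0):
    """Kinetic plus potential energy of the two-atom toy system."""
    kinetic = sum(0.5 * m * sum(c * c for c in v) for m, v in zip(masses, vel))
    d = [pos[1][c] - pos[0][c] for c in range(3)]
    r = math.sqrt(sum(x * x for x in d))
    return kinetic + 0.5 * k * (r - r0) ** 2

def velocity_verlet(pos, vel, masses, force_fn, dt, n_steps):
    """Integrate m_i * d2r_i/dt2 = F_i(r_1, ..., r_N) forward by n_steps steps."""
    pos = [list(p) for p in pos]   # work on copies so the inputs are untouched
    vel = [list(v) for v in vel]
    forces = force_fn(pos)
    for _ in range(n_steps):
        for i, m in enumerate(masses):
            for c in range(3):
                vel[i][c] += 0.5 * dt * forces[i][c] / m   # first half kick
                pos[i][c] += dt * vel[i][c]                # drift
        forces = force_fn(pos)                             # forces at new positions
        for i, m in enumerate(masses):
            for c in range(3):
                vel[i][c] += 0.5 * dt * forces[i][c] / m   # second half kick
    return pos, vel
```

For N atoms the inner loops touch 3N coordinates per step, and a realistic force function couples all of them, which is where the operation count grows. The same `velocity_verlet` routine accepts any `force_fn` that returns one force vector per atom.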
3. Initial conditions. Unfortunately, we usually do not know in advance where to start, which set of initial positions and velocities for the atoms will lead, as time proceeds, to the chemical process of interest. For most chemical reactions we can't just assemble our molecules and allow them to rattle around toward equilibrium, for on the time scale of internal molecular motion most chemical reactions of interest will almost never occur in an equilibrium system. Thus a random approach doesn't solve the problem.

A quick calculation shows that a brute force systematic approach won't solve it either. Consider a systematic search through just 10 different initial position vectors and 10 different initial velocity vectors for each of 100 atoms. This would give $100^{100} = 10^{200}$ different initial phase space points (a number greater than the estimated number of atoms in the universe), each of which would have to be integrated forward in time to decide if it did indeed lead to the reaction of interest. Such a brute force approach is now and will always remain infeasible.

If neither random nor brute force systematic approaches are generally feasible, what can be done? One possible approach is the development of techniques to automatically identify critical configurations or saddle points (or more precisely, surfaces or regions in phase space (16) through which reaction trajectories must pass).
If one can identify such a phase space region, one can then integrate both forward and backward in time to trace out the entire trajectory, and one can explore neighboring trajectories as well. This approach can be straightforward for systems with sufficient symmetry, such as defect jumps in crystals (17), and its extension to more complex molecular systems can also be expected to be pursued.

Another alternative, perhaps complementary to the above, is to try to use the human chemist's accumulated understanding of the mechanisms of chemical reactions to guide the machine's calculations. We chemists at least think we have some knowledge of the way to relatively orient two molecules and how to shove them at one another to get them to react. We think we have some feeling for the reaction pathway from reactants to products, for the bonds which must change and for the critical configurations (transition states, activated complexes) which must be traversed. Unfortunately, this chemist's understanding is largely pictorial and intuitive, but our computers need numerical guidance as to positions and velocities in order to proceed. This need to bring together the chemist's non-numerical mechanistic understanding of the reaction pathway with the machine's ability to calculate forward and backward along the reaction trajectory, once given the potential surface and the atomic positions and velocities at any given point on the trajectory, has led us to work on techniques of closer man-machine interaction.
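As a check on the brute-force estimate in item 3 above, the count of initial phase space points can be reproduced with exact integer arithmetic. The trajectory rate used at the end is a purely hypothetical figure chosen for illustration and does not come from the text.

```python
def count_initial_conditions(n_positions, n_velocities, n_atoms):
    """Number of grid points when each atom independently takes one of
    n_positions position vectors and one of n_velocities velocity vectors."""
    return (n_positions * n_velocities) ** n_atoms

# 10 positions and 10 velocities for each of 100 atoms, as in the text.
total = count_initial_conditions(10, 10, 100)
print(total == 10 ** 200)

# Hypothetical rate, assumed only for illustration: 10**9 trajectories per second.
seconds_per_year = 365 * 24 * 3600
years_needed = total // (10 ** 9 * seconds_per_year)
print(len(str(years_needed)), "digits in the number of years required")
```

Even with a far more generous rate the exponent barely moves, which is the point of the argument: the search space grows exponentially with the number of atoms.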
The first need is vision. In order to comprehend the molecular dynamics of reactions involving a hundred or more atoms, it is imperative to be able to watch the motions, the three-dimensional (3D) trajectories of the atoms involved. Fortunately this is a well-solved problem, with several specialized display systems now being commercially available which make feasible the visualization of the 3D motions of hundreds or even thousands of atoms in real time (human, not molecular) and even in color and/or stereo, if desired. In addition, films can easily be made using even relatively simple display terminals, which can allow the off-line visualization of molecular dynamics.

We quickly discovered, however, that vision alone is insufficient. We want to manipulate atoms, fragments within molecules or entire molecules which are closely surrounded by other atoms, fragments and molecules, in order to arrive at some point on a reaction-path phase-space trajectory. To do this we must remain within the energy range which is thermally allowed. However, in a dense system, as is well known in Monte Carlo calculations (18), almost all randomly chosen new configurations are energetically inaccessible, because the atoms are almost all already up against hard repulsive walls (19) and a random displacement will almost always send the energy too high.
Thus, just as potential surface referenced importance sampling (18) is used to guide the choice of new configurations in Monte Carlo calculations, some feedback from the potential energy surface is needed to guide the human chemist in manipulating atoms, fragments and molecules to reach a point on the reaction path. We have found that vision is a poor feedback tool for maneuvering on a multidimensional potential surface and we believe that this is at least in part because touch rather than vision is the natural human sense when forces and torques are to be perceived. This has led us to the development of man-machine touch interfaces (1, 20) which more closely link man and machine beyond what is possible with vision alone.
F. Goal. Our goal thus is to develop and use an "instrument for theory" which we call NEWTON, a closer man-machine symbiosis focused on the understanding of the molecular dynamics of many-atom chemical reactions, a machine which opens a window to the microscopic world of the 3D trajectories of moving atoms, visualized as we wish, elements labeled, bonds shown. We wish to be able to build up the system of interest from atoms, fragments and molecules, adjusting the positions and velocities to correspond to our understanding of mechanism, reaction path and critical configuration in order to initiate the desired chemical reaction. We want to control energy, temperature and pressure by the turn of knobs and to display the calculated values as the process proceeds. Our viewpoint (angle and zoom) should be variable, as well as which atoms are to be
displayed. One should be able to control the speed of passage of computed time: increasing (up to the computational limit), decreasing, freeze framing, or backing up and then readjusting parameters and restudying. One would like to calculate and display derived parameters such as bond lengths and angles, progress along a defined reaction coordinate or computed spectra to compare with measured spectra. In addition, a record of the run, including all input parameters and atomic trajectories, should be stored for future additional analysis. As we will see in the following section, most of these instrumental goals have been achieved, at least in a preliminary fashion.
II. Instrumentation
Two versions of NEWTON have now been built and tested, the earlier version able to handle a few atoms and the present one a hundred or more atoms.

A. Initial Version. The first implementation of the NEWTON concept is shown schematically in Figure 1. As it is described elsewhere (1), it will only briefly be mentioned here. The equations of motion are integrated in a minicomputer, the moving atoms are displayed on an Evans and Sutherland (E&S) Picture System and the user can control the position and velocity of any selected atom by using the "Touchy-Feely" touch interface, feeling the forces imparted by neighboring atoms. This system served to show that such an instrument could be built, but was only adequate to handle a few interacting atoms and manipulate them atom by atom.

B. Present Version. The current system, which can handle a hundred interacting atoms fast enough for interactive use (at approximately 10 integration time steps per second), is shown as a block diagram in Figure 2 and as a photograph in Figure 3. Several hundred atoms can be handled at reduced speed. The equations of motion are integrated in a Floating Point Systems (FPS) AP120B Array Processor which runs for our application at a throughput of several floating point operations per microsecond and which forms, with the help of its host, essentially a general-purpose processor capable of several simultaneous operations, with parallel and pipelined floating point adder and multiplier. At present, it lacks direct higher level language capability.
Its approximate relative power may be judged by comparisons indicating a speed 3 to 4 times slower (21) than a Control Data Corporation (CDC) 7600 and 10 to 50 times faster (22) than a Data General (DG) Eclipse under Fortran V. It should be realized that all such comparisons are a function of program mix and efficiency of coding.

Visual interaction with the user is through a dynamic 3D
[Figure 1: block diagram. Labeled blocks: IBM 1800, CAMAC crates (three), Meta-4, visual processor, TTY, visual interface, touch interface.]

Figure 1. Block diagram of system used to test crudely the concept of NEWTON. The touchstone of the touch interface drives the central carbon atom of a methane molecule, allowing it to be moved and the forces on it from the other atoms to be felt by the user. The molecule is displayed on the Evans & Sutherland (E&S) Picture System, and the differential equations are integrated in real (human) time by the Digital Scientific Meta-4 computer to give the trajectories displayed on the Picture System. The Meta-4 is linked through three CAMAC crates and an IBM 1800 to the California Data Processors (CDP) 135 emulating a Digital Equipment Corporation (DEC) PDP 11/40 which in turn runs symbiotically with the Picture System processor.
[Figure 2: block diagram.]

Figure 2. Block diagram of present NEWTON instrument designed for interactive study of the molecular dynamics of chemical reactions involving a hundred or more atoms. The user interacts with NEWTON by setting parameters such as temperature, pressure, and time step through knobs and teletype, by watching the motion of the atoms and the values of calculated parameters on the screen of the E&S Picture System and by adjusting the positions and velocities of atoms with the touch interface. The coupled differential equations (Newton's Second Law) are integrated in the Floating Point Systems (FPS) Array Processor to calculate the atomic trajectories. Other parts of the Chemistry Department Computer Facility (into which NEWTON is integrated) which are used as part of NEWTON include a CDP 135 emulating a DEC PDP 11/40 which serves as host for the Array Processor and the Picture System and a Varian 72 which handles disk management.
Figure 3. Photograph of NEWTON showing E&S Picture System screen on the left, control knobs and FPS Array Processor in the background

Figure 4. Schematic of "Touchy-Twisty" designed for force-torque-position-orientation man-machine communication, a touch interface to assemble and manipulate three-dimensional objects. A handball containing force-torque vector sensors is driven to position and orientation by three nested computer-driven rotational stages carried by three nested computer-driven translational stages.
display using an Evans and Sutherland Picture System which allows the motions of the appropriately labeled atoms to be seen as they are calculated. A California Data Processors (CDP) 135 emulating a Digital Equipment Corporation (DEC) PDP 11/40 serves as host for both the Array Processor and the Picture System. Binocular stereo and color presentations are available by visual fusion of spinning-disk controlled sequential images, but in practice are only rarely used. Rotation of the system of molecules is a better depth cue and labeling of atoms is a sufficient identifier. Orientation of view, angular velocity of rotation and zoom are all controllable by knobs and buttons.

Temperature is varied by knob-controlled, mass-weighted viscosity which removes energy as viscosity is increased or adds energy if viscosity is formally made negative. External pressure is controlled by changing the size of an elastic-walled boundary cube. Other boundary conditions, for example periodic repetition or a free-floating drop, are also possible. Temperature and pressure are calculated from atomic velocities, forces and positions and are displayed on the Picture System screen.

NEWTON is integrated into the Chemistry Department Computer Facility, which includes a dozen processors interconnected through a system based on the CAMAC convention.
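The temperature control by mass-weighted viscosity described above can be illustrated with a small sketch. It assumes a drag force of the form F_i = -gamma * m_i * v_i on each atom, a simple explicit update, SI units, and the equipartition estimate of temperature over 3N degrees of freedom; these forms and the function names are assumptions made for illustration, since the text does not give the formulas used in NEWTON.

```python
K_B = 1.380649e-23  # Boltzmann constant in J/K (SI units assumed)

def kinetic_temperature(masses, vel):
    """Temperature from atomic velocities via (3/2) N k_B T = kinetic energy."""
    kinetic = sum(0.5 * m * sum(c * c for c in v) for m, v in zip(masses, vel))
    return 2.0 * kinetic / (3.0 * len(masses) * K_B)

def viscous_forces(masses, vel, gamma):
    """Assumed mass-weighted drag, F_i = -gamma * m_i * v_i.
    Positive gamma removes energy; negative gamma adds it."""
    return [[-gamma * m * c for c in v] for m, v in zip(masses, vel)]

def apply_viscosity(masses, vel, gamma, dt):
    """One explicit step of the drag alone: v <- v + dt * F / m."""
    drag = viscous_forces(masses, vel, gamma)
    return [[c + dt * f / m for c, f in zip(v, fr)]
            for m, v, fr in zip(masses, vel, drag)]
```

Because the drag is weighted by mass, every atom's velocity is scaled by the same factor (1 - gamma*dt) in this update, so the temperature changes by the square of that factor regardless of the mix of light and heavy atoms.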
Others of these processors which are used in conjunction with NEWTON include a Varian 72 which handles disk management, an IBM 1800 which controls peripherals and a second CDP 135 emulating a DEC PDP 11/40 which runs a UNIX time-shared operating system used for program editing and file manipulation.

C. Touch Interface. We wish to build up our chemical systems of interest not just atom by atom, but from fragments and whole molecules, and we wish also to be able to reach into the simulated volume and guide fragments and molecules into the desired coordinates and velocities to arrive at a point along the reactive trajectory for the chemical process of interest. The atom by atom touch interface described above, involving force and position (1, 20), is no longer sufficient if we wish to assemble and manipulate three-dimensional objects such as fragments and molecules involving force, torque, position and orientation. Therefore we are building (20) what we call a "Touchy-Twisty" which is shown in Figures 4-6. A ball for the user's hand (the handball) is driven by three nested computer-controlled translational stages carrying three nested computer-controlled rotational stages to follow the x, y, z position of the center of mass as well as the orientation of three defined axes within a designated fragment or molecule. The force and torque vectors exerted by the user on the handball will be sensed by internal flexing members with strain gauge pickups (see Figure 6) and will be added appropriately to the forces already exerted by surrounding atoms on each atom of the designated molecule, and will therefore affect the on-going
Figure 5. Photograph of "Touchy-Twisty" partially constructed
Figure 6. Photograph of force-torque resolver inside handball, under construction. Strain gauges will be mounted on the flexing members to pick up components of force and torque.
calculation of that molecule's trajectory. Thus, as the user tries to translate or rotate the handball-molecule in a way which matches chemical possibility as described by the potential surface, it will move relatively freely, being unhindered by opposing forces from surrounding atoms. Conversely, if one tries to translate or rotate the handball-molecule so that repulsive walls of surrounding atoms are impinged upon, it will move only with difficulty, as these atoms must be shoved out of the way to proceed. This type of touch interface is designed specifically to interact with a dynamic system, as its communication with the user is intimately linked to the computer's ability to simultaneously integrate the equations of motion of the objects involved in the dynamic simulation.
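One way the sensed force and torque might be "added appropriately" to the per-atom forces is sketched below. The text does not specify the rule, so the distribution used here is an assumption: the hand force is shared in proportion to mass, and the hand torque is converted to a rigid-body angular acceleration through the fragment's inertia tensor. All function names are hypothetical, and the fragment is assumed to be non-collinear so that the inertia tensor can be inverted.

```python
def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def center_of_mass(masses, pos):
    total = sum(masses)
    return [sum(m * p[c] for m, p in zip(masses, pos)) / total for c in range(3)]

def inertia_tensor(masses, rel):
    """Inertia tensor about the center of mass; rel holds positions relative to it."""
    tensor = [[0.0] * 3 for _ in range(3)]
    for m, d in zip(masses, rel):
        d2 = dot(d, d)
        for a in range(3):
            for b in range(3):
                tensor[a][b] += m * ((d2 if a == b else 0.0) - d[a] * d[b])
    return tensor

def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Cramer's rule (triple products of columns)."""
    cols = [[matrix[r][c] for r in range(3)] for c in range(3)]
    det = dot(cols[0], cross(cols[1], cols[2]))
    return [dot(rhs, cross(cols[1], cols[2])) / det,
            dot(cols[0], cross(rhs, cols[2])) / det,
            dot(cols[0], cross(cols[1], rhs)) / det]

def distribute_hand_input(masses, pos, force, torque):
    """Per-atom forces whose sum is `force` and whose torque about the
    center of mass is `torque` (assumed mass-weighted, rigid-body rule)."""
    cm = center_of_mass(masses, pos)
    rel = [[p[c] - cm[c] for c in range(3)] for p in pos]
    alpha = solve3(inertia_tensor(masses, rel), torque)  # angular acceleration
    total = sum(masses)
    per_atom = []
    for m, d in zip(masses, rel):
        spin = cross(alpha, d)  # tangential acceleration of this atom
        per_atom.append([m * force[c] / total + m * spin[c] for c in range(3)])
    return per_atom
```

The returned vectors would then be added to the forces from surrounding atoms before the next integration step. A rule of this kind leaves the internal vibrations of the fragment initially undisturbed, which is one reason to prefer it, though other choices are equally possible.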
III. Chemical Applications
While the mechanical molecule approach to the molecular dynamics of many-atom chemical reactions is in principle applicable to almost any chemical reaction, our lack of sufficient general quantitative knowledge of interatomic forces makes it wise to concentrate, at least initially, on cases in which the many-atom complexity arises largely from the repetition of simple units, for example polymers in which the monomer is the repeated unit and reactions of smaller molecules in solution in which the solvent molecule is repeated, so that the number of force parameters to be determined remains manageable. Two of our current interests are therefore dynamic approaches to vibrational spectra in solution and to the microscopic understanding of solvation.

A. Dynamic Approach to Vibrational Spectra. If we observe a small molecule, the vibrational spectrum (infrared or Raman) is a series of well-defined lines, and we know how to invert such spectra to gain information on the potential surface near the equilibrium geometry (8, 23). If we go to many-atom systems, i.e. large molecules or collections of closely interacting molecules as in a liquid, instead of well-defined lines we find broad continuous bands and we can no longer invert to the potential surface in the same direct way. However, we can still proceed in the opposite direction, calculating the vibrational spectrum from the potential energy surface.
(Such an approach was perhaps better known before the day of modern computers, when actual mechanical models of molecules were constructed from springs and masses and driven by an eccentric disk on a motor whose speed was varied to find the resonances corresponding to the normal frequencies (24, 25).) For example, we can use linear response theory (26-31) to relate the spectrum of the natural fluctuations of a parameter in a system at equilibrium to the response spectrum we would find if we drove that parameter with a weak external
perturbation. Thus we can start with a potential surface $V(\mathbf{r}_1, \ldots, \mathbf{r}_N)$, calculate the trajectories $\mathbf{r}_1(t), \ldots, \mathbf{r}_N(t)$ of atoms upon it at equilibrium at a chosen temperature, calculate (for example, in the first approximation by assigning partial atomic charges) the time-varying dipole moment $\boldsymbol{\mu}(t)$ from the trajectories, and then calculate the infrared spectrum from the power spectrum or from the Fourier transform of the time correlation of the dipole moment (29, 30).
$$V(\mathbf{r}_1, \ldots, \mathbf{r}_N) \;\rightarrow\; \mathbf{r}_1(t), \ldots, \mathbf{r}_N(t) \;\rightarrow\; \boldsymbol{\mu}(t) \;\rightarrow\; A(\omega) \qquad (3)$$
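The last two arrows of Eq. (3) can be sketched as follows, assuming fixed partial charges for the dipole moment and taking the power spectrum of the dipole fluctuations as an unnormalized stand-in for the absorption spectrum. The direct discrete Fourier transform, the omission of frequency-dependent prefactors, and the function names are simplifications assumed for this illustration only.

```python
import cmath
import math

def dipole_series(charges, trajectory):
    """mu(t) = sum_i q_i * r_i(t), one 3-vector per stored frame.
    `trajectory` is a list of frames, each a list of atomic positions."""
    return [[sum(q * r[c] for q, r in zip(charges, frame)) for c in range(3)]
            for frame in trajectory]

def power_spectrum(mu, dt):
    """Unnormalized power spectrum of the dipole fluctuations.
    Returns frequencies (in units of 1/dt's unit) and power at each one."""
    n = len(mu)
    means = [sum(m[c] for m in mu) / n for c in range(3)]
    freqs, power = [], []
    for k in range(n // 2 + 1):
        total = 0.0
        for c in range(3):
            # Direct DFT of the mean-subtracted component; O(n) per frequency.
            s = sum((mu[t][c] - means[c]) * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
            total += abs(s) ** 2
        freqs.append(k / (n * dt))
        power.append(total / n)
    return freqs, power
```

For a stationary trajectory the power spectrum and the Fourier transform of the dipole time correlation function carry the same information, so either route named in the text leads to the same band positions. A production calculation would use a fast Fourier transform and windowing rather than the direct sum shown here.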
Similarly, by assigning an approximate relationship between polarizability and atomic coordinates, one should be able to compute Raman spectra.

For example, we have used the Lemberg-Stillinger potential (32) for water to calculate infrared spectra at approximately room temperature for equilibrated isolated water molecules and then for larger and larger clusters. The spectrum shifts smoothly from the gas-phase line spectrum toward the broad bands characteristic of the liquid phase, the bending (scissors) vibration moving up in energy and broadening as expected and the asymmetric and symmetric stretches moving down in energy and melding together to form what in the liquid is a single broad peak. The Lemberg-Stillinger potential was designed for somewhat different ends, and by its nature as a central force approximation, a sum of two-body terms $V_{HH}$, $V_{OO}$ and $V_{OH}$, it cannot accurately reproduce the isolated molecule spectrum. Nonetheless, it is instructive to see that the expected gas to liquid shifts are taking place as the cluster size grows. Similar calculations with more realistic potentials are in preparation for several liquids.

There are two purposes to such calculations. The first is to improve our knowledge of interatomic forces, in particular non-bonded and intermolecular forces, which we need for further molecular dynamics studies. For example, we can set up a parameterized potential function which is constrained by what we already know, such as equilibrium bond lengths and angles and dissociation energies, but which contains adjustable parameters such as those describing non-bonded interactions. Then we can iteratively change the adjustable parameters to try to gain better agreement between calculated and measured spectra, hopefully converging on an improved potential surface.

The second purpose is to try to change our present understanding of liquid state vibrational spectra, which is mainly qualitative, into quantitative understanding based on potential surfaces and molecular dynamics. For example, if we believe we have a reasonable potential surface, we should be able to assign
spectral features by driving the simulated system of molecules with a simulated electric field oscillating at the frequency of the spectral feature and then watching and analyzing the actual computed trajectories which the atoms follow in response to this perturbation. Such an approach is not really a new one, as it resembles the technique (33) used forty years ago to analyze stroboscopically the normal motions of molecules modelled mechanically by masses and springs and driven by an external mechanical oscillatory perturbation. The advent of systematic procedures for the analysis of vibrational line spectra (8, 23) has made such mechanical molecule approaches unnecessary for few-atom systems, but has not solved the problem for many-atom systems. With the present availability of very fast computing systems such as our array processor and our ability to visually recognize complex motions with the aid of dynamic computer graphics, we can now apply this mechanical molecule approach in a new form to many-atom spectra, in particular spectra in solution.

B. Dynamics of Solvation. A second area of application is the understanding of solvation in terms of the trajectories of the atoms. Most reactions of interest to chemists and most of the chemistry in living systems occur in solution, yet we understand very little of solvation, and even less of chemical reactions in solution, in terms of a quantitative microscopic picture involving atomic motions.
The modelling of the molecular dynamics of solvation in isolated droplets of up to hundreds of solvent molecules is relatively straightforward; the large difficulty comes in trying to match the properties of bulk solutions with calculations involving finite numbers of molecules. The key to the latter appears to be in the boundary conditions: whether to choose, for example, periodic boundary conditions, a dielectric-surrounded cavity or a surface layer which is fixed in the configuration of bulk solvent (34). In the illustrations shown in Figures 7-9 we have chosen the easy way out, by modelling isolated droplets. These stereo pairs, which may be seen by most people in depth by a slight crossing of the eyes, represent individual frames from the calculated time history of a water cluster, the solvation of a chloride ion in water and the process of dissolution and solvation of an ultracrystallite of NaCl in water. The water potential is again Lemberg-Stillinger (32) with electrostatic interactions and approximate repulsive cores for the interactions with and among the ions.

IV. Some Thoughts on the Future
A. Future Applications. The author suspects that in the long run, the most interesting many-atom molecular dynamics is likely to be found in biomolecular reactions. While up to the present, biochemistry and molecular biology have concentrated on
Figure 7. A time-step in the evolution of a cluster of 31 water molecules

Figure 8. A time-step in the history of a chloride ion solvated in an isolated water droplet

Figure 9. A time-step in the dissolution and ion solvation of a crystallite of NaCl
statics, i.e. the relationship of structure and function, it seems clear that the functioning of at least many of the most interesting biomolecules must be understood in terms of dynamics, their time evolution. A very long period of selection has undoubtedly moulded many biomolecules into very efficient machines whose dynamics as yet is largely speculative. Examples of such biomachinery are to be found in enzymic action (35) and allosteric effects, muscle contraction, membrane transport (particularly active transport), aspects of drug-receptor interaction, and biomolecular self-assembly. Perhaps as the past twenty years have seen such great progress in the understanding of biomolecular structure-function relationships, the next twenty years may see similar progress in understanding the more complete picture of biomolecular structure-dynamics-function.

While some molecular dynamic calculations on biomolecules are already in progress in batch mode, for example retinal photoisomerization (36), water around a dipeptide to study the difference in dynamics near hydrophilic and hydrophobic sites (37) and motions of a simplified small protein, pancreatic trypsin inhibitor (38), such calculations are severely hindered by limits to available computational speed. How can such limits be transcended?

B. Faster Computation. With a few more orders of magnitude in computer speed, the mechanism of most reactions of interest to chemists would be accessible to study by many-atom molecular dynamics. How can such speed increases be achieved?
Two directions are apparent: more powerful elements (integrated circuits) and the interconnection of these elements in architectures which more efficiently match the problem to be solved. It is thought that there is another factor of 30 still to be realized in linear shrinkage in metal oxide semiconductor (MOS) technology before fundamental physical limits are reached (39). This translates into a 30² increase in packing density on a chip and another factor of 30 in speed, for a total gain of perhaps four orders of magnitude. Thus we can look forward to continuing substantial gains in computational power per element by this and probably by other routes as well.

A complementary approach is the architecture of interconnecting the elements. The classical mechanics of a set of interacting particles is a problem particularly amenable to specialized computer architecture because i) the algorithms are relatively
* Such increases in specialized computer power are in progress in other areas as well (1). Examples include the Parallel Element Processing Ensemble (PEPE) for missile tracking being constructed for the Army Advanced Ballistic Missile Defense Agency, which is designed (4) to run many times faster than any existing general purpose processor, as well as the special aerodynamic computer (40) being considered by NASA, which would be two orders of magnitude faster than existing general purpose machines.
simple and highly repetitive and ii) the computation can be split into parallel streams which need communicate only once (or perhaps a few times with more complex integration schemes) for each integration time step. Thus, instead of an array processor we can consider arrays of processors or even arrays of array processors (1).

When one considers such computational systems composed of so many active elements, several similarities between computer architecture and molecular architecture become evident (39). The basic determinant of structure becomes not the logical elements (atoms) themselves, but rather their interconnections (bonds), and these now become the focus of design (39) as shown in Table II.
Table II. Evolution of emphasis of computer architecture from logical elements to interconnections (39).

Characteristics            Past               Future
large, slow, expensive     logical elements   interconnections
small, fast, cheap         interconnections   logical elements
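The device-scaling estimate given earlier in this section (a further factor of 30 in linear shrinkage, giving 30² in packing density and another factor of 30 in speed) can be checked with a few lines of arithmetic. The sketch below only multiplies the quoted factors; the function name and the way the two gains are combined into a single figure of merit are illustrative assumptions, not a model taken from reference (39).

```python
import math


def mos_scaling_gain(linear_shrink=30):
    """Combine the quoted MOS scaling factors into one overall gain.

    Assumption (for illustration only): packing density scales as the
    square of the linear shrink, speed scales linearly with it, and the
    overall gain is the product of the two.
    """
    density_gain = linear_shrink ** 2   # area per element shrinks quadratically
    speed_gain = linear_shrink          # quoted "another factor of 30 in speed"
    total_gain = density_gain * speed_gain
    return density_gain, speed_gain, total_gain


density, speed, total = mos_scaling_gain()
orders_of_magnitude = math.log10(total)
```

Under these assumptions the product is 30³ = 27,000, a little over four orders of magnitude, in line with the "perhaps four orders of magnitude" quoted in the text.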
Because computer architecture can now be constructed containing so many elements and interconnections, the same problems in human conceptualization arise as in systems composed of many atoms and bonds, that no one person can possibly understand all the relationships among the individual detailed parts of the system. In response, the same approach of emphasizing the symmetry of the situation becomes useful.

For example, one obvious way of interconnecting processors in parallel is a single bus, as shown in Figure 10. To a chemist this is a linear polymer and shares its symmetry. If one branches the bus, it's a branched polymer, or one can make cyclic systems, etc.

A very appealing solution for a problem such as molecular dynamics which is to be solved in terms of Cartesian space is to map the 3D problem space onto a 3D space of an array of processors (39), an example of which is shown in Figure 11. Two ways of carrying out such a mapping for our case are as follows. First, one could map each atom onto a processor and then "dynamically reallocate processors" so as to maintain near neighbor relationships as atoms move about on their trajectories. A key question to investigate is whether there is a local reallocation algorithm which will efficiently maintain a satisfactory mapping by querying only other processors in the vicinity, and then exchanging assignments of processors to atoms. A second
Figure 10. The symmetry of an array of processors connected by a bus, or equivalently the symmetry of a linear polymer

Figure 11. The symmetry of a simple cubic 3D array of processors, or equivalently of a 3D simple cubic crystal lattice
approach is to map regions of 3D coordinate space onto specific processors; in other words, to divide all the space in which the atoms move into volumes such that each processor takes care of all atoms which happen to be in that volume. When an atom crosses the boundary of that volume, it would be reassigned to the processor handling the adjacent volume.

In considering such a scheme, it is important to note that the force on any given real atom is only a function of the positions of other atoms within some finite volume about that atom (1) and thus that each processor only need communicate with a localized set of other processors. Thus in the limit of a very large number N of atoms, the number of arithmetic operations required, if done properly, to solve the molecular dynamics increases only proportionally to N, in contrast to widely held opinion (shared until recently by the author) that it must rise faster than N. This is true both in force calculation from a realistic potential surface in classical mechanics, in that in reality all interatomic forces in dense systems are damped out at some distance by intervening movable and polarizable atoms, as well as in quantum mechanics, in that integrals among orbitals sufficiently separated can be ignored.

If we consider 3D arrays of processors, we chemists already know all the possible different symmetries of how to build the processor array (39), the "crystal computer" (41).
The possible symmetries within each unit composing the array are just the symmetries of crystal unit cells, and the symmetries with which the units can be stacked or interconnected into 3D arrays are just the lattice symmetries, the 14 Bravais lattices, the grand total of all combined unit cell and lattice symmetry possibilities being the 230 space groups (42). If we restrict ourselves to building from symmetric, identical units which stack into a space-filling 3D array, the possibilities are even more limited and in fact we can refer back to the Greeks for the solid tessellations. Out of the regular and Archimedean polyhedra there are only 5 which are space filling: the cube, triangular prism, hexagonal prism, rhombic dodecahedron and truncated octahedron (43).

C. Other Instruments for Theory. One can imagine other instruments for other theories. Instead of a NEWTON for classical mechanics, one could consider building a machine for quantum calculations, a SCHRODINGER or a HEISENBERG. One can again map 3D configuration space onto a 3D array of processors, either orbital(s) or atom(s) to processor or volume of space to processor. And again, as the number N of atoms grows large enough, one region of space will no longer directly affect another and the arithmetic operations involved in the calculation will scale, in the limit of quite large N, proportionally as N.
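The argument that the work scales proportionally to N rests on two points made above: forces are negligible beyond some finite distance, and space can be divided into volumes so that each volume only needs to consult its immediate neighbours. A minimal serial sketch of that idea follows. It is not NEWTON's algorithm or any scheme described in the references; the cubic cells with edge equal to the cutoff, the function names and the random test droplet are all assumptions made for illustration. Each cell here stands in for the region of space that one processor of a 3D array would own.

```python
import itertools
import random
from collections import defaultdict


def dist2(a, b):
    """Squared Euclidean distance between two 3D points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def build_cells(positions, cutoff):
    """Assign each atom index to the cubic cell (edge = cutoff) that contains it."""
    cells = defaultdict(list)
    for index, point in enumerate(positions):
        key = tuple(int(c // cutoff) for c in point)
        cells[key].append(index)
    return cells


def neighbor_pairs(positions, cutoff):
    """Pairs (i, j), i < j, closer than cutoff, found by looking only at
    each cell and its 26 adjacent cells."""
    cells = build_cells(positions, cutoff)
    cutoff2 = cutoff * cutoff
    offsets = list(itertools.product((-1, 0, 1), repeat=3))
    pairs = set()
    for key, members in cells.items():
        for offset in offsets:
            other = cells.get(tuple(k + d for k, d in zip(key, offset)))
            if other is None:
                continue  # empty neighbouring volume
            for i in members:
                for j in other:
                    if i < j and dist2(positions[i], positions[j]) < cutoff2:
                        pairs.add((i, j))
    return pairs


def brute_force_pairs(positions, cutoff):
    """Reference answer: test every one of the N(N-1)/2 pairs."""
    cutoff2 = cutoff * cutoff
    return {
        (i, j)
        for i, j in itertools.combinations(range(len(positions)), 2)
        if dist2(positions[i], positions[j]) < cutoff2
    }


def random_droplet(n_atoms, edge, seed=0):
    """Hypothetical test system: atoms placed uniformly in a cube."""
    rng = random.Random(seed)
    return [tuple(rng.uniform(0.0, edge) for _ in range(3)) for _ in range(n_atoms)]
```

At fixed density the number of atoms per cell is bounded, so the cell search does an amount of work per atom that does not grow with N, whereas the brute-force comparison grows as N². Both functions should return the same set of interacting pairs.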
Lastly, one might want to build a SEMI, a semiclassical instrument for solving quantum mechanically (either ab initio or semiempirically) for the electronic wavefunction and using this
wavefunction to analytically derive (44-46, 8, 15) on the fly a force function for the nuclei whose trajectories are being integrated classically. For a system of very large N it is no longer feasible to calculate and store a potential function in advance on a 3N - 6 dimensional mesh. For semiclassical dynamics, all one needs anyway are the forces at those relatively few points actually sampled by the sequence of nuclear coordinate sets generated by the classical numerical integration of the nuclear trajectories.

It should be noted that all of the instruments for theory described above could be implemented as the same 3D array of stored-program processors.
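The point that forces are needed only at the configurations the trajectory actually visits can be illustrated with a small integrator that calls a force routine once per new configuration instead of reading a precomputed mesh. This is a sketch under stated assumptions: the velocity Verlet scheme is chosen here for illustration and is not stated in the text to be the integrator used, and the harmonic `CountingForce` is a hypothetical stand-in for a force derived from an electronic wavefunction.

```python
class CountingForce:
    """Hypothetical stand-in for an on-the-fly force evaluation.

    A real semiclassical instrument would solve for the electronic
    wavefunction here; this toy returns a harmonic restoring force and
    counts how many times it was asked.
    """

    def __init__(self, force_constant):
        self.force_constant = force_constant
        self.calls = 0

    def __call__(self, x):
        self.calls += 1
        return -self.force_constant * x


def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate one coordinate, evaluating the force only at visited points."""
    f = force(x)                      # force at the starting configuration
    trajectory = [x]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass
        x = x + dt * v_half
        f = force(x)                  # one new force evaluation per step
        v = v_half + 0.5 * dt * f / mass
        trajectory.append(x)
    return trajectory, v


force = CountingForce(force_constant=1.0)
trajectory, final_v = velocity_verlet(1.0, 0.0, force, mass=1.0, dt=0.01, n_steps=1000)
final_energy = 0.5 * final_v ** 2 + 0.5 * trajectory[-1] ** 2
```

In this toy the number of force evaluations is the number of steps plus one, independent of how large a mesh would have been needed to tabulate the potential in advance, and the total energy stays close to its starting value of 0.5.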
V. Summary
While we chemists have long built specialized instruments for experimental studies, we are now discovering that we can also build specialized instruments for theory, computational apparatus designed to efficiently solve particular classes of chemical problems. An example is NEWTON, an instrument we have constructed to study the detailed mechanism, i.e. the molecular dynamics, of many-atom chemical reactions, particularly in solution. NEWTON allows the chemist to control the state of a simulated system of interacting molecules: selection of the particular molecules, initial conditions of position and velocity, parameters of the potential surface, temperature and pressure. In response, atomic trajectories are classically integrated on the interatomic potential surface in a very fast processor. The chemist can watch the evolving molecular dynamics on a 3D display and interact with the molecules through knobs, keyboard and touch interface.

Applications in progress include dynamic studies of vibrational spectra in solution and the dynamics of the solvation process. With increased computer speed, much of biochemistry might become accessible; the relation among structure, dynamics and function for example in enzymic action, active transport and biomolecular self-assembly. Hope for such speed increases lies in two directions: more power per computational unit and the adaptation of over-all computer architecture to match the structure of the problem to be solved.
A particularly appealing route is the mapping of calculations in three dimensional configuration space onto a three dimensional array of parallel processors, a route which can be applied equally to classical, semi-classical and quantum calculations, all of which can be shown to scale only proportionally to the number N of atoms in the limit of very large N.

Acknowledgement

The vibrational spectra and dynamics of solvation are by
Peter Berens. Thanks to John Cornelius and the staff of the Chemistry Department Computer Facility for their help, to Sylvia Francl for aid on vibrational spectra, and to the Division of Computer Research of the National Science Foundation and to the Division of Research Resources, National Institutes of Health (RR-00757) whose support has made this work possible.

Literature Cited

1. Wilson, K. R. in "Computer Networking and Chemistry," Lykos, P., ed., American Chemical Society, Washington, D.C., 1975, p. 17.
2. Murtha, J. C., Adv. Computers (1966) 7, 1.
3. Lorin, H., "Parallelism in Hardware and Software," Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1972.
4. Comptre Corporation, Enslow, Philip H., Jr., ed., "Multiprocessors and Parallel Processing," John Wiley & Sons, New York, 1974.
5. Berne, B. J., ed., "Statistical Mechanics, Part B: Time-Dependent Processes," Vol. 6 of "Modern Theoretical Chemistry," Plenum Publishing, New York, 1977.
6. Levine, R. D., and Bernstein, R. B., "Molecular Reaction Dynamics," Oxford University Press, New York, 1974.
7. Miller, W. H., ed., "Dynamics of Molecular Collisions, Parts A & B," Vols. 1 & 2 of "Modern Theoretical Chemistry," Plenum Publishing, New York, 1976.
8. Califano, S., "Vibrational States," John Wiley, London, 1976.
9. Williams, J. D., Stang, P. J., and Schleyer, P.v.R., Ann. Rev. Phys. Chem. (1968) 19, 531.
10. Kitaigorodsky, A. I., "Molecular Crystals and Molecules," Academic Press, New York, 1973.
11. Hopfinger, A. J., "Conformational Properties of Macromolecules," Academic Press, New York, 1973.
12. Shipman, L. L., Burgess, W., and Scheraga, H. A., Proc. Nat. Acad. Sci. USA (1975) 72, 543.
13. Blout, E. R., Bovey, F. A., Goodman, M., and Lotan, N., eds., "Peptides, Polypeptides and Proteins," John Wiley & Sons, New York, 1974.
14. Momany, F. A., McGuire, R. F., Burgess, A. W., and Scheraga, H. A., J. Phys. Chem. (1975) 79, 2361.
15. Warshel, A. in "Semiempirical Methods of Electronic Structure Calculation, Part A: Techniques," Segal, G. A., ed., Vol. 7 of "Modern Theoretical Chemistry," Plenum Publishing, New York, 1977.
16. Bunker, D. L., "Theory of Elementary Gas Reaction Rates," Pergamon Press, Oxford, 1966, Sections 2.2 and 3.2.
17. Bennett, C. H., in "Diffusion in Solids: Recent Developments," Burton, J. J., and Nowick, A. S., eds., Academic Press, New York, 1975.
18. Valleau, J. P., and Whittington, S. G., in "Statistical Mechanics, Part A: Equilibrium Techniques," Berne, B. J., ed., Vol. 5 of "Modern Theoretical Chemistry," Plenum Publishing, New York, 1977.
19. Weeks, J. D., Chandler, D., and Andersen, H. C., J. Chem. Phys. (1971) 54, 5237.
20. Atkinson, W. D., Bond, K. E., Tribale, G. L. III, and Wilson, K. R., Comput. & Graphics (1977) 2, 97.
21. Sutherland, G., Lawrence Livermore Laboratories, Livermore, California, private communication.
22. Park, T. C., Loma Linda University, Loma Linda, California, private communication.
23. Wilson, E. B. Jr., Decius, J. C., and Cross, P. C., "Molecular Vibrations," McGraw-Hill, New York, 1955.
24. Kettering, C. F., Shutts, L. W., and Andrews, D. H., Phys. Rev. (1930) 36, 531.
25. Herzberg, G. H., "Molecular Spectra and Molecular Structure II. Infrared and Raman Spectra of Polyatomic Molecules," D. Van Nostrand, Princeton, New Jersey, 1945.
26. Kubo, R. in "Lectures in Theoretical Physics Vol. 1," Brittin, W. F., and Dunham, L. G., eds., Interscience Publishers, New York, 1959.
27. Kadanoff, L. P., and Martin, P. C., Ann. Phys. (1963) 24, 419.
28. Felderhof, B. U., and Oppenheim, I., Physica (1965) 31, 1441.
29. Gordon, R. G., Advan. Magn. Resonance (1968) 3, 1.
30. Berne, B. J., in "Physical Chemistry, An Advanced Treatise, Vol. VIIIB, Liquid State," Henderson, D., ed., Academic Press, New York, 1971.
31. Kampen, N. G. van, Physica Norvegica (1971) 5, 10.
32. Lemberg, H. L., and Stillinger, F. H., J. Chem. Phys. (1975) 62, 1677.
33. Andrews, D. H., and Murray, J. W., J. Chem. Phys. (1934) 2, 634.
34. Warshel, A., University of Southern California, Los Angeles, California, private communication.
35. Warshel, A., and Levitt, M., J. Mol. Biol. (1976) 103, 227.
36. Warshel, A., Nature (1976) 260, 679.
37. Karplus, M., and Rossky, P. J., "Abstracts of Papers," Chemical Institute of Canada and American Chemical Society, Montreal, 1977, phys. 66.
38. Levitt, M., MRC Laboratory of Molecular Biology, Cambridge, U.K., private communication.
39. Sutherland, I. E., California Institute of Technology, Pasadena, California, private communication.
40. Datamation (March, 1977) 23, 150.
41. O'Leary, G., Floating Point Systems, Portland, Oregon, private communication.
42. Henry, N. F. M., and Lonsdale, K., eds., "International Tables for X-Ray Crystallography Vol. 1," Kynoch Press, Birmingham, England, 1952.
43. Cundy, H. M., and Rollett, A. P., "Mathematical Models," Oxford University Press, London, 1961.
44. Gerratt, J., and Mills, I. M., J. Chem. Phys. (1968) 49, 1719.
45. Pulay, P., Molec. Phys. (1969) 17, 197.
46. Pulay, P., and Török, F., Molec. Phys. (1973) 25, 1153.
13 Theoretical Chemistry via Minicomputer
*
PETER K. PEARSON, ROBERT R. LUCCHESE, WILLIAM H. MILLER, and HENRY F. SCHAEFER III **
***
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch013
Department of Chemistry, University of California, Berkeley, CA 94720
Certainly one of the most important and far-reaching developments in chemistry over the past decade has been the emergence of theory as a predictive tool of semi-quantitative reliability. This statement is in no way meant to detract from the pre-1960 theoretical chemistry that provided, through the work of men such as Linus Pauling, Robert Mulliken, and Henry Eyring, the modern foundations of valence theory and chemical kinetics. Contemporary theoretical research is obviously built upon the achievements of these pioneers. However the distinguishing feature of modern theoretical chemistry is the ability not only to correlate existing experimental data (and make rough qualitative predictions), but also to provide an a priori description of chemical phenomena that allows precise predictions to be tested by experiment.

The most striking example of this new age of theory is the understanding that the single-configuration self-consistent-field (SCF) approximation for electronic wave functions provides equilibrium geometries in very close agreement with available experimental data (1). If one defines chemistry as the union of structure, energetics, and dynamics on the molecular level, then it seems fair to say that theory has a firm grasp on at least one third of this branch of science. Furthermore, since SCF theory may now be applied fairly routinely (2) to systems as large as TCNQ-TTF (Figure 1) the range of applicability is clearly rather broad.
A second major insight gleaned over the past decade is the realization that the detailed dynamics of chemical reactions are well described by ordinary classical mechanics, i.e., by classical trajectory studies (3). Although most theoretical studies to date have dealt with the canonical A + BC -> AB + C reaction (for which the most detailed experimental data is available) (4), systems as large as the methyl isocyanide reaction CH₃NC -> CH₃CN
Figure 1
are readily accessible (5). In fact it is reasonable to assume that much of the future research in this area will be directed toward a theoretical understanding of model organic reactions.

The link between the above two branches of theory is clear: electronic structure theory has as a principal aim the elucidation of the potential energy surface(s), while the theory of dynamics or collision processes begins with the same potential energy surface(s). The present research project had its genesis in collaborative studies between WHM (dynamics) and HFS (electronic structure).

Here we have assembled a "final" (only in the sense of a rapidly approaching deadline) report on our use of a minicomputer for research in modern theoretical chemistry. At the outset we should state that we have already written many words on this subject, and repetition of these would not appear to serve a purpose. A modified version of the original proposal has been published in Computers and Chemistry. That proposal goes into the justification and economic motivation for this pilot project. Secondly, Appendix I contains four interim reports describing in detail our experiences with the new machine. We strongly encourage the reader to go over these documents carefully. Finally we note that the proposal for a National Resource for Computation in Chemistry (NRCC) has brought squarely to the attention of the chemical community the need for improved computational facilities. We therefore also urge the reader to give serious consideration to the reports of the Wiberg (6) and Bigeleisen (7) committees.
The Economic Argument

The minicomputer chosen was the Datacraft 6024/4, which was fully assembled at Berkeley on March 13, 1974. Thus our experience spans a period of roughly three years. Although the same
machine is still in production (ours is machine #3 of about 200 produced to date), several company changes have occurred and our minicomputer is now called the Harris Corporation Slash Four.

The cost of the machine was essentially $130,000, including California state sales tax. No overhead on the purchase price was required. Assuming amortization over a four year period, this amounts to $2708 per month. The other large cost is that of maintaining the service contract, currently $1715/month ($1280 to the Harris Corporation and $435 to the UC Berkeley overhead). On this basis the total cost is $4423/month or $7.30 per hour if we assume 20 hours of usage per day, as shown to be realistic in Appendix I. As noted by one of the reviewers, this cost might be further reduced in a chemistry department where there is already a technical staff member with extensive digital hardware expertise. Of course the insurance aspects of the maintenance contract would be lost in this case.

Extensive timing comparisons (Appendix I) have shown the minicomputer to be 25-30 times slower than the Control Data Corporation (CDC) 7600. Thus the minicomputer generates the equivalent of 1 hour of 7600 central processor (cpu) time per $200. For comparison, we cite the charge structure of the Lawrence Berkeley Laboratory (LBL) CDC 7600. This machine is generally available to NSF grantees and offers 7600 machine time at prices roughly five times less expensive than commercial rates. Nevertheless the LBL rates range from roughly $350 to $900 per hour of cpu time. The former figure refers to weekend deferred priority time.
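The cost figures quoted above can be reproduced with a short calculation. The sketch below uses only the numbers given in the text; the function and key names are illustrative, and the average month length of 365/12 days is an assumption, since the text does not say how hours per month were counted.

```python
def minicomputer_costs(purchase=130_000.0, amortization_months=48,
                       service_per_month=1715.0, hours_per_day=20.0,
                       slowdown_vs_7600=(25.0, 30.0)):
    """Recompute the monthly, hourly and 7600-equivalent costs from the text."""
    days_per_month = 365.0 / 12.0          # assumption: average month length
    amortization = purchase / amortization_months
    total_per_month = amortization + service_per_month
    cost_per_hour = total_per_month / (hours_per_day * days_per_month)
    # One hour of 7600 cpu time takes 25-30 minicomputer hours.
    cost_per_7600_hour = tuple(cost_per_hour * s for s in slowdown_vs_7600)
    return {
        "amortization_per_month": amortization,
        "total_per_month": total_per_month,
        "cost_per_hour": cost_per_hour,
        "cost_per_7600_hour": cost_per_7600_hour,
    }


costs = minicomputer_costs()
```

With these assumptions the amortization rounds to $2708 per month, the total to $4423 per month, the hourly cost comes out just under the quoted $7.30, and one 7600-equivalent hour costs roughly $180 to $220, which brackets the $200 figure in the text.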
On this basis, then, one concludes that the minicomputer is ~2-4 times more economical than the 7600. However, as we discuss in detail in the original proposal and in Appendix I, the above figures include input-output charges (especially disk accesses) for the Harris machine, but these are additional charges (often rather severe) on the CDC 7600. Thus as is seen in Appendix I, the cost effectiveness of the minicomputer sometimes exceeds that of the 7600 by a factor of six or seven.

In all fairness, the minicomputer does not provide the quality of service of the LBL CDC 7600, a smoothly functioning professionally operated computer center. Much of the savings made is simply a consequence of the fact that our operation involves no paid employees other than graduate students and postdoctorals.

Research Accomplishments

The ultimate test of the present proposal is undoubtedly whether the chemistry research completed justifies the NSF funds expended. Since this document is intended for perusal by academic and industrial research chemists, we leave this judgment to you. Available upon request is a list of seventy publications based on
research carried out using the Harris Slash Four minicomputer. In several cases the research was carried out in collaboration with theorists from other institutions. When such studies made use of machines in addition to the minicomputer, an asterisk is indicated. Papers in the course of publication will be provided on request.

Not wishing to be entirely impartial, we add the opinion that the minicomputer has allowed us to make a number of important contributions both to theory and to chemistry. With this machine, our choice of problems has been primarily based on chemical intuition and scientific inclination, rather than the pressing economic circumstances many theoretical chemists regrettably face.
Developmental Work in Progress

As mentioned in the introduction, we are just now beginning to take full advantage of the Harris machine. Bruce Garrett, a student of Professor Miller's, is continuing work on the development of a quantum mechanical transition state theory. Cliff Dykstra has developed (8) and is continuing to work on a Theory of Self Consistent Electron Pairs (TSCEP), a fundamentally new approach to the correlation problem (9). Also in Professor Schaefer's group, Robert Lucchese, Jim Meadows, Bill Swope, and Bernie Brooks are working together to develop a new system of programs for large scale configuration interaction (CI) studies of electron correlation in molecules. The latter programs are described in some detail elsewhere (10). Thus, although this report is officially labeled "final", there is much work yet to be done in the development of new theoretical methods and computational techniques. It is in such cases, where original programs have been written specifically for the minicomputer, that its advantages become most clearly apparent. In this regard it is noteworthy that most students who have taken the time (perhaps one month) to familiarize themselves with the mini actually prefer it to the CDC 7600.

Qualms

A balanced view requires us to admit that all is not sweetness and light. We have already noted that there is no convenient computer center staff to operate the machine. When problems occur we not only must call the customer engineer, but also point him rather carefully in the direction of the problem.
As one of the reviewers has pointed out, this is at least in part a result of the fact that the support services of the Harris Corporation are substantially less extensive than those of IBM or CDC. An absolute necessity is the presence of one very bright, knowledgeable, and responsible computer expert in the group. The Lord has blessed us with two such individuals, Dr. Peter Pearson (who went on to
greater things in September of 1974) and more recently Mr. Robert Lucchese. This sort of individual is required to make system changes and updates, determine whether the machine is really sick or just out of shape, and show the customer engineer exactly which machine instruction is failing when a definite problem is located.

Also, debugging a large program is much more difficult than on the CDC 7600. Programmers always blame most of their mistakes on the computer and this can be especially true when a mini is involved. Occasionally one finds a student who is simply unwilling to go through the exhaustive checking that is necessary to debug a large scale program on a machine such as the Harris Slash Four.

Successful utilization of the machine requires the physical presence of one student at any given time. For some individuals the idea of spending the night with a computer is not a pleasant one. We have found that the only satisfactory solution to this problem is to have a sufficient number of students (at least 10) using the machine that they simply cannot afford to risk the possibility of being absent in the event of a machine halt.

Two additional weaknesses of the mini relative to a large machine such as the 7600 are (a) the smaller memory and (b) the large amounts of elapsed time required to complete a given job. The former limitation restricts us, for example, to using about 80 contracted Gaussian functions in electronic structure calculations. Although Cliff Dykstra has developed a method of increasing this limit to 120 contracted functions, such a computation might run into trouble on the second point.
That is, about 24 hours is the practical limit for a single job. In general, the other users become quite hostile if a job requires even this long. In addition, 24 hours is about the mean time interval between machine failures if the machine is running a single job. It should be noted that this time restriction (to about 1 hour of 7600 time per job) would be a serious barrier in accomplishing some of the goals set out for the NRCC (6, 7).

Interfacing with Experiments

A question we are frequently asked is "Could you handle three or four on-line experiments at the same time?" The answer to this question, at least for the Harris Slash Four, is an unequivocal no. The cost effectiveness of machines such as ours is in part a result of their somewhat restricted capabilities. If one wants the flexibility of an IBM 370 system, tied in to 43 teletypes, one should probably be willing to pay ten times more to carry out a particular task in computational chemistry. Our system is ideally suited to batch operations, where only one job runs at a time. In fact, if a particular job is long and not restartable (many of our programs are now restartable), it is better not even to read in another job during execution. Thus the possibility of on-line experiments is definitely slim.
However, there are all kinds of experimental chemists who rely on computers for number crunching jobs designed to aid in the analysis of their data. Such jobs are well suited to a machine such as the Slash Four and could very well provide a major part of the justification for a proposal to the NSF.

In light of several reviewers' comments, we feel compelled to note that the newer Harris machines (especially the Slash Seven) now have virtual memory, which allows genuine time-sharing. Having observed the Slash Seven at the International Engineering Company in San Francisco, we must conclude that the simultaneous processing of three or four users is now a reality on the Slash Seven. Although virtual memory is an additional expense (perhaps $20,000), it would certainly be worthwhile in situations where on-line data acquisition is a primary task.

Environmental Impact

Until quite recently, the primary medium for the dissemination of the results of this minicomputer experiment has been personal contact. After the original proposal was submitted, copies were mailed to ~25 prominent theoretical chemists. The interim reports have been distributed on request, of which we have had ~50 from research chemists. Another ~50 visitors, including an NSF review team, have toured the Berkeley facility. A slightly modified version of the original proposal was published (Volume 1, pages 85-90) in the new journal Computers and Chemistry. Professor Schaefer presented an invited paper "Are Minicomputers Suitable for Large Scale Scientific Computation" in September 1975 at the Eleventh Annual IEEE Computer Society Conference in Washington, D.C.
The same lecture was given earlier at the IBM Research Laboratory, San Jose. The trade journal Computerworld published a popular description of the experiment in its March 8, 1976 issue. A number of recent papers have mentioned the Berkeley mini experiments. Most recent and perhaps the most interesting is that of Isaiah Shavitt (11), entitled "Computers and Quantum Chemistry." Finally, the American Chemical Society's Division of Computers in Chemistry, under the direction of Professor Peter Lykos, has organized the present symposium (June, 1977 in Montreal) on "Minicomputers and Large Scale Computations".

Several research groups (perhaps 20) have expressed serious interest in acquiring their own minicomputer for purposes comparable to our own. However, to our knowledge the only group to actually do so is that of the late Professor Don L. Bunker of the University of California at Irvine. Although the Hewlett-Packard machine purchased by Professor Bunker with NSF support was much less expensive (and proportionally slower) than the Harris Slash Four, he found it to be adequate for his research in dynamics and a vast improvement over his former dependence on an incompetent campus computer center.
Since the draft version of this final report was prepared, two groups of theoretical chemists have ordered Harris Slash Sevens. These are the groups headed by Professor Phillip Certain at the University of Wisconsin and by Drs. John Tully and Frank Stillinger at Bell Telephone Laboratories. These and other implications of our research have been noted in recent semi-popular reviews in Science (12) and Nature (13).
The Future

The controversy concerning the relative merits of minicomputers and large machines is likely to continue for some time. At present both the Slash Four and CDC 7600 appear to be relatively economical alternatives. The real losers in such comparisons are the machines between these two extremes (11). For example, several universities and research institutes (e.g. the University of California, the University of Washington, Colorado State University, and Battelle, Columbus) are currently using the CDC 6400. Although the 6400 is only about 1.5 times faster than the Harris Slash Four, the cost of using this machine can be as high (at Berkeley) as $420/hour. This is clearly an absurd state of affairs, and we would encourage the abused supporters of such machines to consider their alternatives.

Since our original proposal, several developments have occurred in the minicomputer area. At that time the Harris Slash Four was by far the fastest machine available in our price range. Since then at least four machines of nearly comparable speed have appeared: the Data General Eclipse, the Varian V75, the System Engineering Laboratories (SEL) 32/55, and the Interdata 8/32. We have been especially interested in the SEL 32 since it is a true 32 bit machine and might be significantly faster than the Harris Slash Four if a powerful 64 bit floating point processor were available. In fact, such a fast floating point processor appears to be a real possibility for SEL in the near future. In addition the new Harris Slash Seven is about 30% faster than our Slash Four machine.
Another encouraging development is the fact that memory prices have now come down by nearly a factor of two relative to our purchase price for the 64K of Datacraft 24 bit core memory. Thus it seems quite reasonable that future mini purchasers will not be required to restrict themselves to small memory machines.

Certainly the most spectacular technological achievement of the last three years is the introduction by Floating Point Systems (Portland, Oregon) of their high speed array processor. At a cost of ~$40,000, this device is able to carry out 38 bit floating point operations at essentially the speed of the 7600. Professor Kent Wilson of UC San Diego has already purchased the FPS array processor for use in simulating the classical dynamics of biological systems (14). We have studied this device carefully and, while very enthusiastic about it, have some reservations. First, the 38 bit
word, corresponding to 8 plus significant figures, is not quite adequate for our type of theoretical computations. As we have emphasized on many occasions, the 48 bit word of the Harris machine is ideal for our purposes. Secondly, interfacing the FPS device to a standard mini is going to be quite a challenge, and hand coding must be done whenever the array processor is to be used. Since the array processor is so much faster than the host mini, the FPS must be used very judiciously to avoid its degradation. In short we do not feel that the FPS array processor is suitable at present for general large scale computations. The use of such a specialized device would tend to "freeze" one into a particular theoretical approach, with future options severely limited. However, the mere fact that FPS can manufacture a device of this speed for only $40,000 is certainly a remarkable achievement. We look forward to the further development of this concept.

Finally it must be noted that a very important development has also occurred in the large scale machine area. This is the introduction of the CRAY machine, which is at least a factor of five faster than the 7600 and will be sold at essentially the same price (~$10 million). At present CDC has legally succeeded in stalling the official delivery of the first CRAY, but this should not be allowed to continue indefinitely. Our personal opinion is that by the time the CRAY machine becomes commercially available, both Harris and SEL will have introduced machines about five times the speed of the Harris Slash Four.
Thus it seems likely that the present relative economic comparisons will be valid for perhaps another five years.

After completion of our draft report, we learned of the introduction of the PDP 11T55 machine by the Digital Equipment Corporation. Although timing and pricing information is still incomplete, this new DEC mini is claimed to exceed the speed of the Harris Slash Four. We are skeptical that a corporation as large and "respectable" as DEC will be competitive with Harris or SEL, but this announcement is certainly welcome. At the very least it will force Harris and SEL to accelerate the development and release of their new faster machines.

Recommendations

The greatest challenge presently before the NSF (and ERDA) with respect to computation in chemistry is the above mentioned NRCC. We strongly recommend that these bodies agree as quickly as possible on a procedure for implementing the NRCC (hopefully for Fiscal 1978). One conclusion drawn from our investigations is that the ultimate goal of the NRCC should not be the acquisition of its own 7600, but rather of the much more powerful and economical CRAY machine. Although initial implementation will probably involve some fraction of a 7600, the CRAY alternative should be kept in the forefront of consideration.
At the same time the NSF should continue to carefully monitor new developments in the minicomputer area. A reasonable procedure would involve the funding of two such minis per year for the next five years. Our Harris Slash Four has remained fortuitously current during the four plus years since the submission of our proposal. However, as discussed in the previous section, the winds of change are now beginning to blow.

Personally we intend to submit a new proposal to NSF as soon as a reliable manufacturer meets the following specification: for less than $200,000 complete (including California sales tax) a machine four times the speed of the Slash Four. Our current opinion is that innovations of a less comprehensive nature are not worth the tribulations (see Appendix) inherent in breaking in a new machine. Since the Harris Slash Four will still be a very desirable machine (especially with its resident programs, including POLYATOM, GAUSSIAN 70, SCEP, and BERKELEY), we would leave it to the discretion of the NSF to find a suitable new owner.

Appendix I

Interim Reports on the Berkeley Minicomputer Project.

Quarterly Report No. 1, December 14, 1973
Notice was received on June 15, 1973 that the proposal "Large Scale Scientific Computation via Minicomputer" had been funded to the extent of $129,600 by the National Science Foundation. At this point final negotiations with the Datacraft Corporation were entered into. The University of California was represented by Mr. R. J. Brilliant of the Purchasing Office, while Datacraft was represented by Mr. Don Faltings, of their Walnut Creek office. A final agreement was reached on October 5, 1973. The primary change relative to the proposed system was the substitution of a 56,000,000 byte disk for the original 28,000,000 byte disk.

In a parallel development, we received a letter on June 28, 1973 from Professor D. R. Willis, Assistant to the Chancellor, Computing. On behalf of the Campus Advisory Committee on Computing, Professor Willis requested that we advise him on how progress reports could best be made, on a regular and continuing basis. On August 30, 1973, we agreed to file quarterly reports, one or two typewritten pages long, to the Berkeley Campus Computing Committee. These quarterly reports will also be sent to Dr. W. H. Cramer, Program Director for Quantum Chemistry, National Science Foundation.

The majority of the 6024/4 system was delivered at Berkeley on November 14, 1973. As discussed with Datacraft, the scientific arithmetic unit (floating point hardware) and 56 megabyte disk did not appear. These items are scheduled to be delivered in early January, 1974. In the meantime, a temporary 11 megabyte disk was supplied by Datacraft.
The Datacraft engineer, Mr. Mike Crumbliss, arrived in Berkeley on November 19 and proceeded to connect the system. Several early problems were cleared up during the first week. For example, an inability to plug in the final 8,000 words of memory was traced to a misadjustment in the power supply.

Within the first week the machine was able to diagonalize a 50 × 50 matrix in single precision (six significant figures). This calculation was done using the benchmark program HDIAG discussed in our NSF proposal. However, the machine was unable to properly diagonalize the same 50 × 50 matrix in double precision. This error, which as of today still occurs, was traced back to trouble in the square root routine, which in turn fails due to an error in the floating point divide operation. The specific problem is that the quantity (1.0 - 2^[exponent illegible])/1.0 is computed to give 1.0 - 2^-38 - 2^[exponent illegible]. The Datacraft engineers are working on this problem now and indicate that it should be resolved shortly.

Despite the peculiar divide problem outlined above, Professor Miller's classical trajectory programs appear to execute properly in both single and double precision. The complex-valued trajectories run only in single precision, since the floating point hardware is required for double precision complex operations.

The first electronic structure program we are attempting to set up is HETINT, Professor Schaefer's diatomic molecular integrals program. The program has been rearranged to fit in memory without overlaying, but does not yet execute properly due to the divide error discussed above.
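The divide error described in this report is of a kind that a very small test can expose: dividing any representable number by 1.0 must return that number unchanged. The following is only an illustrative sketch in Python, assuming IEEE 754 double precision arithmetic rather than the 48 bit format of the 6024/4; the function name is hypothetical and is not part of HDIAG or any Datacraft software.

```python
def divide_by_one_failures(max_bits=52):
    """Return every exponent k for which (1 - 2**-k) / 1.0 is not returned exactly.

    Assumption: IEEE 754 doubles, where 1 - 2**-k is exactly representable
    for all k up to 53, so a correct divide must reproduce it bit for bit.
    """
    failures = []
    for k in range(1, max_bits + 1):
        x = 1.0 - 2.0 ** -k      # a number just below 1.0
        if x / 1.0 != x:         # dividing by one must be an identity
            failures.append(k)
    return failures


if __name__ == "__main__":
    bad = divide_by_one_failures()
    if bad:
        print("divide check failed for exponents:", bad)
    else:
        print("divide check passed")
```

On correctly functioning hardware the list of failures is empty; a faulty divide of the sort reported here would show up as one or more offending exponents.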
In general we have found the double precision software to execute 150-200 times slower than the CDC 7600. This is about as expected, and a factor of 3-4 from the floating point hardware will put us in the speed range discussed in the proposal.

In our research groups, the individual most knowledgeable about computers and computing is Mr. Peter K. Pearson, and he has taken over responsibility for the care of the system and dissemination of necessary information to the other research students. At least four other students have a good grasp of the system. In that most of us know a great deal more about computers than we did one month ago, it appears that our minicomputer experiment has had considerable educational value already.

On March 29, 1974 our installation will be visited by a special National Science Foundation committee, tentatively composed of Drs. W. H. Cramer (NSF), O. W. Adams (NSF), J. C. Browne (University of Texas), and P. G. Lykos (Illinois Institute of Technology).
Quarterly Report No. 2, March 27, 1974

Our first quarterly report documented the arrival of most of the Datacraft 6024/4 system, described in our NSF proposal. This proposal has now been modified slightly so as to be suitable for
publication, and will appear in the new journal "Computers and Chemistry".

At the time of our first report, the 6024/4 had been unable to successfully complete our 50 × 50 matrix diagonalization benchmark in double precision, due to an error in the floating point divide subroutine. Shortly thereafter this error was further traced by Peter Pearson to a machine instruction, the AMD instruction, Add Memory Double. We should point out here that the Datacraft engineers (or those of any other data processing manufacturer) can usually solve a problem only after it has been traced to a specific machine instruction failure. In the present case, the AMD instruction did function properly when one of the central processor byte slice boards was put on an extender board. This being the case, the error was eliminated by positioning a piece of copper foil between the two offending CPU byte slice boards. The matrix diagonalization then executed properly at a speed 166 times slower than the CDC 7600. With the floating point hardware, however, we expect (see original proposal) the benchmark to execute at a speed 49 times slower than the 7600.

With the AMD instruction corrected, we returned to the problem of implementing HETINT, Professor Schaefer's diatomic molecular integrals program. Although the program did execute, incorrect answers were obtained. Peter Pearson eventually traced this difficulty to improper treatment of exponents by the system's arithmetic routines in underflow cases. In fact, he had to modify the floating point subroutines for double precision add, subtract, and multiply.
This was a particularly difficult job, since at that time we did not have the source program listings for the software floating point subroutines. With these corrections made, HETINT executed properly on December 19, 1973. This program executes at a speed roughly 105 times slower than the CDC 7600.

The next major program to be implemented was the Ohio State-Cal Tech-Berkeley version of POLYATOM, a general molecular program for the computation of multiconfiguration SCF wave functions (the original version of POLYATOM was developed by Jules Moskowitz and co-workers at NYU). To this end, Dean Liskow began an intensive effort on the first of the year. One of the most serious difficulties was the setup of the overlay structure, consisting of three levels with seven segments. Success was achieved on January 11 when a proper self-consistent-field wave function for the water molecule was obtained. Comparison with the 7600 results showed an accuracy of between 9 and 10 significant figures for the total energy.

Toward the end of January, we began to run POLYATOM on a production basis. One of the first problems tackled was the possible existence of two isomers of the NO2- ion. A (9s 5p/5s 3p) Gaussian basis was centered on each atom, and the three geometrical parameters optimized for the nonsymmetric form. A complete calculation at a single geometry required between four and six hours of elapsed time. This is about a factor of 85
times slower than the CDC 7600. During the same period Gretchen Schwenzer used the 6024/4 for a thorough preliminary study of H2S and the two hypothetical hypervalent molecules SH4 and SH6. Similar 7600 timing comparisons were obtained.

Due to the floating point software's inability to perform complex operations in double precision (11 significant figures), we have thus far been unable to implement Professor Miller's semiclassical programs involving complex-valued trajectories (i.e., generalized tunneling). Several real-valued trajectory programs (rotational excitation of He + H2 and trajectory "surface-hopping" calculations for O(1D) + N2 → O(3P) + N2) initially ran successfully, but numerical irreproducibilities began occurring. This was a source of much frustration, and was perhaps due to crosstalk between several additional byte slice boards. To correct this problem several additional sheets of copper foil were positioned in the central processor one week ago.

The large disk (56 megabyte) and scientific arithmetic unit (SAU) arrived at Berkeley on March 13, 1974. This was two months after the promised delivery date and a source of considerable frustration. It is important to note here that none of the time comparisons made heretofore utilized the SAU (floating point hardware). The individual hardware floating point add, subtract, multiply, and divide instructions execute at speeds 6-14 times faster than the software subroutines.
Realistically, however, we expect the SAU to result in an overall increase in speed of a factor of 2 to 3. This would put us within our original estimate of being a factor of 64 slower than the 7600. As of the time of writing of this report, neither the large disk nor the SAU is yet fully operational. The Datacraft engineers are hopeful, however, that the complete system will be functional within a week.
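The difference between the 6-14 fold gain of the individual hardware instructions and the expected overall gain of only 2 to 3 is what one would anticipate if only part of a program's run time is spent in floating point arithmetic. The sketch below illustrates this with Amdahl's law, a standard model that the report itself does not invoke; the function names and the idea of a single "floating point fraction" are assumptions made purely for illustration.

```python
def overall_speedup(fp_fraction, fp_speedup):
    """Amdahl's law: overall gain when only fp_fraction of run time is accelerated."""
    return 1.0 / ((1.0 - fp_fraction) + fp_fraction / fp_speedup)


def fp_fraction_for(target_overall, fp_speedup):
    """Inverse of Amdahl's law: the fraction of run time that must be floating
    point work for a given instruction-level gain to yield target_overall."""
    return (1.0 - 1.0 / target_overall) / (1.0 - 1.0 / fp_speedup)


if __name__ == "__main__":
    # Instruction-level gains (6 and 14) and overall gains (2 and 3) are the
    # figures quoted in the report; the resulting fractions are model output only.
    for fp_gain in (6.0, 14.0):
        for overall in (2.0, 3.0):
            frac = fp_fraction_for(overall, fp_gain)
            print(f"hardware gain {fp_gain:4.0f}x, overall {overall:.0f}x "
                  f"-> floating point share of run time {frac:.2f}")

    # Effect on the software-only benchmark ratio of 166 relative to the 7600.
    for overall in (2.0, 3.0):
        print(f"overall gain {overall:.0f}x -> about {166.0 / overall:.0f} "
              "times slower than the 7600")
```

Under this simple model, an overall gain of 2 to 3 applied to the software-only ratio of 166 brackets the original estimate of 64 quoted above.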
Quarterly Report No. 3, June 17, 1974
A well-respected book describing the first twelve months of infancy makes a statement to the effect that the third month of your child's life makes the first two seem bearable in retrospect. In a remarkably analogous manner, the frustrations of the first two quarters with our Datacraft 6024/4 minicomputer were more than compensated by the successes of the third quarter, just completed.

Our second quarterly report left off with the machine inoperative due to the recent arrival of the 56 megabyte disk and floating point processor [referred to by Datacraft as the scientific arithmetic unit (SAU)]. Since the NSF visitation committee (Drs. W. H. Cramer, O. W. Adams, J. C. Browne and P. G. Lykos) was to arrive on March 29, one might describe the situation on March 27 as being on the verge of panic. From Fort Lauderdale Datacraft flew out Mr. Russell Patton, director of field service. Working through the night, he and Mr. Ron Platz
installed a separate power supply for the SAU and located and corrected a problem with the disk automatic block controller. With these modifications implemented, Peter Pearson was able to execute the 50 × 50 matrix diagonalization benchmark program. In the last quarterly report, we noted that, without the floating point hardware (SAU), this program executes at a speed 166 times slower than the CDC 7600. With the SAU and the standard Datacraft 6024 compiler a ratio of 59 was found. Using the "optimizing" compiler (actually still a rather crude compiler), the diagonalization executed at a speed 43 times slower than the 7600. This result is consistent with the factor of 49 predicted in the original proposal, a modified version of which has now been accepted for publication in the new journal Computers and Chemistry.

The NSF visitation provided the framework for a thorough discussion of the machine's progress through March 29. Bill Cramer and Bill Adams stressed the importance of keeping an accurate record of machine utilization, a key factor in the economic analysis central to this experiment. Peter Lykos suggested we calibrate the 6024/4 using the MFLOPS (measure of floating point operations per second) benchmark. A copy of MFLOPS has been obtained and an analysis will be presented in the next quarterly report. Jim Browne gave us many useful insights from his experience at the University of Texas as both chemist and computer scientist. Don Faltings of Datacraft was on hand to answer a number of questions from the committee and briefly discuss some new features (including virtual memory) of the Datacraft line.
Finally, it was agreed that a second visitation would be advisable, after the machine is fully operational and its characteristics thoroughly documented.

Steady progress was made during the first 10 days of April. That is, a number of programs were modified to run on the complete system, including SAU and large disk. However, several nagging problems persisted, one being that 5.4 volts, 0.4 above the recommended level, were required to sustain the central processor. When this minimum functioning voltage increased to 5.5 volts, Datacraft advised us to turn the machine off. After a week of investigation, this surprisingly subtle problem was located and quickly eliminated on April 22 by replacement of an integrated circuit on the memory timing and control board. As it turned out, this small machine defect had been responsible for many of the problems encountered during the first five months of operation. Not only did the machine run properly at 5.0 volts, but it was also possible to remove the pieces of copper foil previously necessary (see Quarterly Reports 1 and 2) to shield the different boards from each other.

Since April 22, the 6024/4 has been running quite smoothly. Some occasional parity errors were put to rest by replacement of a memory board chip on May 16. A current problem with the add memory to double (AMD) instruction has been temporarily relieved
by a sheet of copper foil between byte slice boards 2 and 3. However, these are minor problems, and overall we have been very pleased with the operation of the machine during this quarter.

With technical problems pushed into the background, we were able to turn to the central goal of the experiment, the evaluation of the performance of the 6024/4 relative to the CDC 7600. For this purpose we report the results of two direct comparisons, one involving electronic structure theory and the other molecular collision theory. It is to be emphasized that the programs used are by no means optimally efficient. However, of primary interest here are the relative speeds of the two machines, and for this purpose our comparisons should be valid.

The first test case arose in Dean Liskow's study of the chemisorption of hydrogen by clusters of beryllium atoms. For the BesH system, a double zeta basis set was adopted: Be(4s 2p), H(2s 1p). The modified POLYATOM program was used to compute self-consistent-field wave functions for this open shell doublet. The results are summarized below:

                                                 Times (seconds)
                                                 6024/4     7600     Ratio
Generate list of unique nonzero integrals         3091      174      17.8
Compute unique integrals (total of 476,000)       2506      119      21.5
SCF (time per iteration)                          1548       36      43.5

This comparison indicates that the SCF iterations show the 6024/4 in the worst light. We intend to correct this weakness by recoding this section of POLYATOM in machine language. However, all our direct comparisons with the 7600 must of necessity employ the same FORTRAN programs. The complete calculation, including 17 SCF iterations, required 0.25 cpu hours on the 7600 and cost $243. The identical calculation required a total of 8.86 hours of 6024/4 time, or an overall factor of 35 longer than the 7600.

The second test case arose from George Zahr's study of the quenching of O(¹D) by N2. Assuming a simple analytical potential energy surface, classical trajectories were performed within the surface-hopping model of Preston and Tully. 330 such trajectories required 480 minutes on the 6024/4 and 18.2 minutes on the 7600. The 7600 cost was $193. The Datacraft machine is seen to be a factor of 26 slower. Note that this computation involves virtually no input/output operations.

Both of the above comparisons show the Datacraft minicomputer to be significantly faster than the factor of 64 predicted in our original proposal. There we concluded that the total monthly cost (including amortization over four years) of the 6024/4 would be $4156. Experience has shown this figure, which we now round to
$4200, to be realistic. Yet to be firmly established is the average number of hours of computing attained per day at our installation. We will discuss this point in detail in our next quarterly report. However, if we take the pessimistic view that only 12 hours of computing per day are achieved, 360 hours per month translates into a cost of $11.67/hour. Thus the BesH job cited above costs $103, as opposed to $243 for the 7600. The O(¹D) + N2 job by the same criterion cost $93, as opposed to $193 for the 7600. Again it is only fair to remark that the cited 7600 costs at the Lawrence Berkeley Laboratory include only operational expenses and completely neglect the initial purchase price of the machine.

Added in proof: A surprisingly simple reordering of the POLYATOM file structure (no changes in the FORTRAN program) by Peter Pearson has resulted in nearly a factor of two increase in the SCF speed cited above.
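The cost arithmetic in the preceding paragraph can be reproduced in a few lines. The following is a minimal sketch in Python, not part of the original report: the dollar and hour figures are those quoted in the text, the 30-day month is implied by the 360 hours per month, and the function and variable names are purely illustrative.

```python
# Illustrative check of the cost figures quoted in the text.
# All names here are hypothetical; only the numbers come from the report.
MONTHLY_COST = 4200.0     # dollars per month (rounded figure from the text)
HOURS_PER_DAY = 12.0      # the "pessimistic" usage assumption
DAYS_PER_MONTH = 30.0     # implied by 360 hours per month


def hourly_rate(monthly_cost, hours_per_day, days_per_month=DAYS_PER_MONTH):
    """Effective cost of one hour of minicomputer time, in dollars."""
    return monthly_cost / (hours_per_day * days_per_month)


def job_cost(hours, rate):
    """Cost of a job that occupies the machine for the given number of hours."""
    return hours * rate


rate = hourly_rate(MONTHLY_COST, HOURS_PER_DAY)
scf_job = job_cost(8.86, rate)               # SCF test case: 8.86 hours
trajectory_job = job_cost(480 / 60, rate)    # trajectory test case: 480 minutes

print(f"rate           = ${rate:.2f}/hour")
print(f"SCF job        = ${scf_job:.0f}  (7600: $243)")
print(f"trajectory job = ${trajectory_job:.0f}  (7600: $193)")
```

Changing HOURS_PER_DAY shows how strongly the comparison depends on the utilization actually achieved, which is why the text singles that quantity out as not yet established.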
Report No. 4, March 30, 1975
By the time of writing of our last report, it had become clear that the Datacraft 6024/4 minicomputer was meeting or exceeding the goals that had been set for it. The past nine months have served to substantially strengthen that conclusion. A particularly crucial test has been passed in that it is now apparent that relatively little maintenance of the machine is required. Typically, it is necessary to call the computer engineer once or twice per month, and repair "down time" for a typical month is roughly one day. In fact, the service contract is necessary primarily as an insurance policy, since we would otherwise be unprotected against disasters, e.g., if for some mysterious reason the entire memory were burned out.

In this regard it should be noted that the Datacraft Corporation was swallowed up by the Harris Corporation during this period. Thus our minicomputer is now marketed as the Harris Slash Four. The only effect (on us) of this takeover was the increased cost of the service contract, for which Harris proposed a price of $1500/month. This suggestion was particularly distressing to us since (a) it represented a large increase over the $1155/month we had budgeted for and (b) the University of California has during the past year changed its policy and we now pay 34% overhead on the service contract. After some delicate negotiations Harris lowered the service contract price to $1280/month and we accepted it.

Before leaving the subject of maintenance, it should be mentioned that most of our problems requiring service involve the teletype and line printer. It turns out that neither of these devices was intended for the sort of full time usage they are receiving. Incidentally, the teletype is not covered under the new service contract, but is instead serviced by University of California personnel.
In one respect, the minicomputer has proved less expensive to operate than we had predicted. In the original proposal, $300/month was allocated for "electricity, cards, paper, etc." As it turns out, although we do pay the above-mentioned overhead of $437/month on the service contract, the University pays our electrical bill, and the cost of cards, paper, etc., is less than $50/month. Thus this savings of $250/month partially cancels the high cost of the service contract.

During this period we have from time to time run programs to gather statistics on the utilization of the minicomputer. These data suggest that the machine is busy for about 90% of the time it is not being serviced. Thus the overall (including repair down time) utilization is in excess of 85%, a figure considered quite acceptable for large scale machines. This utilization rate is also remarkably close to the 20 hours/day estimated in our original proposal. It is necessary, however, to point out that such a rate could not be achieved (without a paid operator) without an aggressive and hard working group of eleven active users (students and postdoctorals). Since each user zealously guards his ~13 hours/week, he/she is quite likely to be on hand should any temporary machine problem interrupt his/her job. Due to machine demand, it should be noted that jobs rarely run longer than 13 hours; and the thought (raised in our original proposal) of jobs running consecutively for one month has been long since abandoned. A typical job now runs for about two hours.
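The utilization estimate above follows from two of the quoted figures. The sketch below, in Python, is an illustration only: the inputs are the approximate values from the text, the 30-day month is an assumption, and all names are hypothetical.

```python
# Illustrative utilization estimate; inputs are the approximate figures
# quoted in the text, and the 30-day month is an assumption.
DAYS_PER_MONTH = 30.0
DOWN_DAYS = 1.0         # "roughly one day" of repair down time per month
BUSY_FRACTION = 0.90    # busy about 90% of the time it is not being serviced


def overall_utilization(busy_fraction, down_days, days_per_month):
    """Fraction of all hours in the month during which the machine is busy."""
    available = (days_per_month - down_days) / days_per_month
    return busy_fraction * available


u = overall_utilization(BUSY_FRACTION, DOWN_DAYS, DAYS_PER_MONTH)
hours_per_day = 24.0 * u

# Hours per week claimed by the eleven active users at 13 hours each.
allocated_hours = 11 * 13
allocated_fraction = allocated_hours / (7 * 24)

print(f"overall utilization     = {u:.1%}")
print(f"busy hours per day      = {hours_per_day:.1f}")
print(f"user allocation per week = {allocated_hours} h ({allocated_fraction:.1%} of 168 h)")
```

Under these assumptions both routes, the measured busy fraction and the weekly user allocation, land in the same range as the 20 hours/day of the original proposal.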
Scheduling the computer turned out to be more of a problem than we initially anticipated. It was clear in July, 1974 that the machine had become sufficiently popular that "good will" would not be a sufficient deterrent to squabbles. The system that has now been settled upon involves giving each active user 13 hours of machine time per week. In addition four hours (10 AM - 2 PM) per weekday are available on a first come-first served basis for debug jobs, with a time limit of ten minutes. The time is signed up for on Thursday afternoons for the week beginning Saturday. The order of sign-up is a regular one, with the user having first choice one week being demoted to last choice the following week. A final restriction is that the block of time between 2 AM and 8 AM cannot be subdivided. That is, a single user takes the entire block. Although this scheduling system will probably be slightly revised on occasion, it seems to be working reasonably well at present.

Two major program conversion efforts were undertaken since the third report. The first, involving the Gaussian 70 programs of Hehre, Lathan, Ditchfield, Newton, and Pople, is now completed. The second, involving the polyatomic configuration interaction (CI) program of Charles F. Bender, began very recently and has been implemented thus far only in a restricted version. The Gaussian 70 conversion was deemed especially important since it now appears that this program will become significantly more widely distributed than any previous ab initio program for single determinant SCF studies. Thus the times we report with this program may serve as
a basis for comparison with many other types of computers. The major difficulty in the implementation of Gaussian 70 was the relatively complicated (for the 6024/4) overlay structure.

One of the earliest studies undertaken using Gaussian 70 involved the C6H6-Cl2 molecular complex. Using the standard STO-3G basis set (162 primitive gaussians, 54 contracted functions), a complete calculation at one geometry, including 8 SCF iterations, required 64 minutes of 6024/4 elapsed time. Thus it is clear that the study of reasonably complicated organic systems using minimum basis sets is quite feasible with the minicomputer. Using analogous minimum basis sets, computations have been carried out on (CH3O)2PO2 Ca Cl (67 contracted functions; 59 minutes for integrals plus 86 minutes for 13 SCF iterations) and the Be13 cluster (65 contracted functions; 80 minutes for integrals, 250 minutes for 20 SCF iterations).

We continue to investigate a large number of systems using the Berkeley-Cal Tech-Ohio State version of POLYATOM. Advantages of this program are that it yields exact spin eigenfunctions for open-shell systems and can perform limited MCSCF computations. One of the larger systems studied was the NH3-ClF charge transfer complex. A basis set of size Cl(12s 9p 1d/6s 4p 1d), N,F(9s 5p 1d/4s 2p 1d), H(4s/2s) was used, totaling 62 contracted functions. A list of nonzero unique integrals is generated in 40 minutes (this process need be done only once for the entire potential curve), integral computation required 130 minutes, and 11 SCF iterations consumed 50 minutes.

A study of trimethylene methane which we had earlier found exceedingly difficult to finish on the 7600 (due to cost considerations) has now been completed on the 6024/4. Using a double zeta basis set (120 primitive functions contracted to 52), 74 minutes were required for integral generation. Twenty SCF iterations on the ³A₂′ ground state (two SCF hamiltonians) devoured 220 minutes of elapsed time.

During the past nine months, a series of production runs was made on the glyoxal molecule (HCO)2 using a standard double zeta basis set. Since a number of runs were also made on the 7600, a comparison of the costs for the entire project is possible. The POLYATOM timing comparisons are seen in Table I.
The ratio of elapsed 6024/4 time to 7600 CPU time is 25.0, a very encouraging figure. Since the cost of machine time on the mini is about $8/hour (including amortization of the purchase price over four years), the total minicomputer cost of the project was less than $3500.

An interesting development has been the increasing use of the mini in an interactive mode. This is especially helpful in SCF calculations. The total energy is printed on the teletype after each SCF iteration and the user has four options: a) continue; b) go to a weighted averaging of orbitals; c) go to an extrapolation scheme; d) stop. Use of this interactive feature can remarkably improve the rate of convergence for certain types of molecular systems.
Table I.  POLYATOM timing comparisons for glyoxal.

                                                    7600                  6024/4
Job                                          CPU-Seconds    Cost      Minutes Elapsed Time

Lister                                             35       $8               9
Integrals (cis/trans)                             115       $34             46
Integrals (gauche)                                240       $61             86
SCF - ground state, per iteration                   6       $2.75            2.5
SCF - excited states, per iteration                17       $7.70            7

Glyoxal Project:
  3 listers                                       105       $24             27
 70 cis/trans integrals                          8050       $2380         3220
 60 gauche integrals                            14400       $3660         5160
130 SCF - ground state (a)                       7020       $3250         3510
 40 SCF - excited states, vertical (b)          13600       $6160         5800
 60 SCF - excited states, geometry search (c)    7140       $3245         3240

Total                                          50,315       $18,719 (d)  20,957
                                          (14.0 hours)                (349.3 hours)

a) Based on nine SCF iterations for convergence.
b) Based on twenty SCF iterations for convergence.
c) Based on seven SCF iterations for convergence.
d) If run exclusively on weekends and holidays, cost reduced to $9360. If in addition run at deferred priority, cost falls to $7488.
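The project totals in Table I, and the ratio of 25.0 quoted in the text, can be checked directly from the six project rows. The following Python sketch is illustrative only: the row values are transcribed from the table, the $8/hour rate is the approximate figure given in the text, and the variable names are hypothetical.

```python
# Project rows of Table I: (label, 7600 CPU-seconds, 7600 cost in dollars,
# 6024/4 elapsed minutes). Values are transcribed from the table.
ROWS = [
    ("listers", 105, 24, 27),
    ("cis/trans integrals", 8050, 2380, 3220),
    ("gauche integrals", 14400, 3660, 5160),
    ("SCF ground state", 7020, 3250, 3510),
    ("SCF excited states, vertical", 13600, 6160, 5800),
    ("SCF excited states, geometry search", 7140, 3245, 3240),
]
MINI_RATE = 8.0   # dollars per hour, the approximate figure quoted in the text

cpu_seconds = sum(row[1] for row in ROWS)
cost_7600 = sum(row[2] for row in ROWS)
mini_minutes = sum(row[3] for row in ROWS)

cpu_hours = cpu_seconds / 3600
mini_hours = mini_minutes / 60
ratio = (mini_minutes * 60) / cpu_seconds   # elapsed 6024/4 time / 7600 CPU time
mini_cost = mini_hours * MINI_RATE

print(f"7600:   {cpu_seconds} CPU-seconds = {cpu_hours:.1f} hours, ${cost_7600}")
print(f"6024/4: {mini_minutes} minutes = {mini_hours:.1f} hours, about ${mini_cost:.0f}")
print(f"ratio of elapsed 6024/4 time to 7600 CPU time = {ratio:.1f}")
```

Note that the ratio compares elapsed time on the minicomputer with CPU time on the 7600, exactly as the text does, so it is not a pure processor-speed comparison.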
Much of WHM's current research involves numerically computed classical trajectories. In "classical S-matrix" calculations, for example, classical trajectories, and the action integral along them, are used to construct quantum mechanical S-matrix elements for specific collision processes. Also, a newly formulated quantum mechanical version of transition state theory, which correctly incorporates non-separability of the transition state, uses trajectories (periodic trajectories in imaginary time) to determine the net rate constant for reaction. Although the calculation of classical trajectories themselves is fairly standard nowadays, these novel kinds of theory usually involve search
procedures, i.e., they require particular classical trajectories rather than a Monte Carlo average over them all. The ability to operate the minicomputer "hands on" has greatly facilitated the application of these new kinds of theoretical models. Also, this type of work requires a great deal of new program debugging, and the 6024/4 has proved quite adequate in this regard, even though the diagnostics are not as comprehensive as those produced by the LBL 7600.

Our final fairly typical timing comparison concerns a three-dimensional phase space integral calculation. To obtain the rate constant for D + H2 at 200°K, 237 classical trajectories (both real and imaginary) were computed. The minicomputer required 60 minutes for this job, while the 7600 used 2.42 minutes of CPU time. Thus the 7600 was a factor of 25 quicker than the 6024/4. The cost of the 7600 job was $20.57. This comparison puts the large machine in a relatively favorable light since there are essentially no 7600 input/output charges associated with trajectory-oriented jobs of this type. In closing we note that this factor of 25 is characteristic of trajectory studies, which involve the numerical integration of ordinary differential equations.

Acknowledgments

We wish to sincerely thank Drs. W. H. Cramer and O. W. Adams of NSF for their support of this project, especially during its early and more controversial stages. We also thank Professors Jim Brown, Edward Hayes, Maurice Schwartz, Don Secrest, Stanley Hagstrom, and Peter Lykos for their thoughtful comments on the draft version of this report.

* Supported by the National Science Foundation, Grant GP-39317.
** Present address: Lawrence Livermore Laboratory, University of California, Livermore, California 94550.
*** Address after September 15, 1977: Arthur Amos Noyes Laboratory of Chemical Physics, California Institute of Technology, Pasadena, California 91125.

Literature Cited

1. Pople, J. A., "Modern Theoretical Chemistry", Vol. IV, ed., H. F. Schaefer, Plenum, New York, 1977.
2. Cavallone, F., and Clementi, E., J. Chem. Phys. (1975), 63, 4304.
3. Miller, W. H., Advances in Chemical Physics (1974), 25, 69.
4. Herschbach, D. R., Faraday Discussion Chem. Soc. (1973), 55, 233.
5. Bunker, D. L., Accounts of Chemical Research (1974), 7, 195.
6. Wiberg, K. B., "A Study of a National Center for Computation in Chemistry", National Academy of Sciences, Washington, D.C., March, 1974.
7. Bigeleisen, J., "The Proposed National Resource for Computation in Chemistry: A User-Oriented Facility", National Academy of Sciences, Washington, D.C., June, 1975.
8. Dykstra, C. E., Schaefer, H. F., and Meyer, W., J. Chem. Phys. (1976), 65, 2740, 5141.
9. Schaefer, H. F., "The Electronic Structure of Atoms and Molecules: A Survey of Quantum Mechanical Results", Addison-Wesley, Reading, Massachusetts, 1972.
10. Lucchese, R. R., Brooks, B. R., Meadows, J. H., Swope, W. C., and Schaefer, H. F., J. Computational Phys., in press.
11. Shavitt, I., paper presented at the Third ICASE Conference on Scientific Computing, Williamsburg, Virginia, April 1-2, 1976.
12. Robinson, A. L., Science (1976), 193, 470.
13. Richards, G., Nature (1977), 266, 5597, 18.
14. Wilson, K. R., "Multiprocessor Molecular Mechanics", in Computer Networking and Chemistry, Peter Lykos, editor (American Chemical Society, Washington, D.C., August, 1975).
14 Large Scale Computations on a Virtual Memory Minicomputer
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch014
JOSEPH M. NORBECK and PHILLIP R. CERTAIN
Theoretical Chemistry Institute and Department of Chemistry, University of Wisconsin, Madison, WI 53706

In October, 1976, the Chemistry Department at the University of Wisconsin-Madison installed a Harris SLASH 7 computer system. The SLASH 7 is a virtual memory minicomputer and is equipped with 64K of high speed, 24 bit, memory; an 80 Mbyte disc storage module; a 9 track tape drive; and a high speed paper tape punch and reader. Other peripherals include two interactive graphics terminals, a 36" CALCOMP plotter, a 3'x4' data digitizing tablet, remote accessing capability for terminals and other departmental minicomputers and remote job entry (RJE) capability to the campus UNIVAC 1110.

The SLASH 7 is a departmental resource for the faculty, staff and graduate students as an aid in their research. The computational and data processing needs of the department can be grouped into four main categories:

(1) Real time data acquisition
(2) Data reformatting and media conversion
(3) Interactive data processing and simulation
(4) Batch computing, including large scale number crunching.

In this paper we discuss the performance of the Harris computer to date and the role it plays with respect to the categories mentioned above. The information provided is based on less than six months of operation, with much of this time used for program conversion and user education. Consequently we focus most of our attention on category (4) (batch computing and large scale number crunching), since it is in this area that cost analyses and performance criteria with respect to larger machines have been concentrated. It is also the easiest area to assess in a short period of time.

Although we concentrate on the number crunching capabilities of the computer in this paper, we first briefly discuss the role of the computer in the other categories.
There are presently 16 minicomputers in the department associated with either departmental facilities or instruments dedicated to the research groups of individual staff members. These minis are generally adequate only for control of instrumentation and data acquisition, while the
necessary processing of data in the past has been carried out at the university computing center. Nearly all of these minis are equipped with paper tape I/O, with five having magnetic tape units. In the future, we expect that much of the processing of data will be carried out on the SLASH 7. In addition, the SLASH 7 is equipped with a direct memory access device which is capable of providing a direct link between the other departmental minis and the SLASH 7. At the present time, three minis are being hardwired to the SLASH 7 through RS232C interfaces and will be capable of data transfer of up to 9600 baud. Although direct control of experimental instruments by the departmental computer is not contemplated, these direct links to the SLASH 7 will provide fast turn-around for the processing of experimental data.

A large number of instruments in the department produce graphic output. This includes a variety of spectrometers, electrochemical instrumentation, chromatographs and custom devices. Since most do not provide digital output, the Harris computer is equipped with a large data digitizing tablet, which is tied to a high quality plotter through an interactive graphics terminal. These peripherals facilitate the processing of graphic data via curve fitting, integration, and so on.

Turning now to number crunching, after a brief description of the computer hardware and the virtual memory structure, benchmarks and stand-alone run times are reported for several programs currently in use in the department. One of the most important items of information obtained to date is the extent to which "paging" of the virtual memory degrades job through-put. This has been evaluated by investigating the CPU time to Wall Clock (WC) time ratio under different operating conditions. The CPU/WC time ratios are given for jobs alone in the computer and mixed with others.
Stand-alone benchmarks correspond to the optimum conditions for each job and give the most favorable CPU/WC time ratio. This provides a bound with respect to CPU time, which is used in a cost analysis with respect to larger, "hard cash" computer facilities. We discuss this point in more detail later in the paper.

Virtual Memory and Paging

The Harris SLASH 7 Virtual Memory System (VMS) involves both hardware and software to control the transfer of user programs and data in 1K word (K=1024) segments--called "pages"--between main memory and an external mass storage device, which in our case is the 80 MB disk. This operation, termed "paging", allows a program's memory area to be noncontiguous and even nonresident, and provides a maximum utilization of available memory. This permits the computer (1) to run programs larger than the physical memory (the SLASH 7 has an 18 bit effective memory address that selects one of 256 pages, thus allowing for a maximum individual program size of 262,144 words) and (2) to "page" to disc a low
priority task (e.g. a long running number cruncher) to provide faster turn around for shorter, high priority jobs.

The paging feature is obviously a great advantage in a multi-user environment. For the individual user, the virtual memory system allows programs to directly address up to 256K words, thus avoiding the necessity of explicit overlaying. The disadvantages of the virtual memory system are that (1) the operating system occupies approximately 27K of high speed memory at all times, (2) even small programs that do not page incur a paging "overhead", and (3) it is possible to create a "thrashing" situation if the demand for paging becomes greater than a critical value. The mean seek time for a disc read is 30 milliseconds, so that more than about 30 paging operations per second will stall the system.

Our present SLASH 7 has approximately 37 user pages available for programs and data storage. Since many jobs which are executed on our system require significantly more storage area than this, we have paid particular attention to how paging affects program run-times, and to programming techniques that minimize paging.

To give an example of a thrashing situation, we present in Table I the CPU and WC times required to calculate all eigenvalues of various real, symmetric matrices. These programs were executed in double precision on our SLASH 7 with the subroutine GIVENS, distributed by the Quantum Chemistry Program Exchange. Note that as long as the matrix can be stored in 37K (the number of user pages available) the program is CPU bound. For larger matrices (dimension 200x200 or greater) the CPU/WC time ratio drops significantly due to thrashing. The reason thrashing occurs in the present example is that the matrix is stored in upper triangular form by columns, while the program code processes the matrix by rows. Consequently, depending on the row being processed, it is possible to require a new page be brought into memory with each new matrix element which is referenced.

To eliminate thrashing in the present example, it is necessary to modify the code to process the matrix in the same order in which it is stored. The run times with the modified code are given in Table I by the entries marked with an asterisk. With the modified program, the CPU/WC time ratio still decreases as the dimension of the matrix increases, but at a more acceptable rate.

In the course of converting programs to execute on the SLASH 7, we have found it necessary to modify several codes to minimize paging. In all cases encountered thus far, the changes were straightforward and required little program reorganization. Thrashing would be difficult to eliminate in a program that required rapid and random access of a large data set, but we have not encountered this problem.
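The effect of access order on paging can be illustrated with a toy simulation. The Python sketch below is not the GIVENS algorithm and not a model of the Harris operating system. It makes a single sweep over a 200 x 200 upper triangle packed by columns, two words per element, and counts page faults when the elements are visited by rows and by columns. The page size, the figure of 37 user pages, and the 30 millisecond seek time are taken from the text; the least-recently-used replacement policy, the assumption that all 37 pages are available to the matrix, and every function name are assumptions made for illustration.

```python
from collections import OrderedDict

PAGE_WORDS = 1024        # 1K-word pages, as described in the text
WORDS_PER_ELEMENT = 2    # double precision: 48 bits = two 24-bit words
USER_PAGES = 37          # approximate number of user pages quoted in the text
SEEK_SECONDS = 0.030     # mean seek time for a disc read, from the text


def page_of(i, j):
    """Page holding element (i, j), with i not exceeding j, of an upper
    triangle packed by columns (0-based indices)."""
    index = j * (j + 1) // 2 + i
    return (index * WORDS_PER_ELEMENT) // PAGE_WORDS


def count_faults(accesses, frames=USER_PAGES):
    """Count page faults for a sequence of (i, j) references, assuming
    least-recently-used replacement (an assumption, not the Harris policy)."""
    resident = OrderedDict()      # pages in memory, least recently used first
    faults = 0
    for i, j in accesses:
        page = page_of(i, j)
        if page in resident:
            resident.move_to_end(page)          # mark as most recently used
        else:
            faults += 1
            if len(resident) >= frames:
                resident.popitem(last=False)    # evict least recently used page
            resident[page] = True
    return faults


def by_rows(n):
    """Visit the upper triangle row by row (against the storage order)."""
    return ((i, j) for i in range(n) for j in range(i, n))


def by_columns(n):
    """Visit the upper triangle column by column (the storage order)."""
    return ((i, j) for j in range(n) for i in range(j + 1))


n = 200
pages_needed = page_of(n - 1, n - 1) + 1
row_faults = count_faults(by_rows(n))
col_faults = count_faults(by_columns(n))

print("pages occupied by the matrix:", pages_needed)
print("faults, row order:   ", row_faults, f"(about {row_faults * SEEK_SECONDS:.0f} s of seeks)")
print("faults, column order:", col_faults, f"(about {col_faults * SEEK_SECONDS:.1f} s of seeks)")
```

In this model the column-order sweep touches each page once, while the row-order sweep cycles through more pages than fit in memory and faults repeatedly. A real eigenvalue routine revisits the matrix many times, so the penalty compounds far beyond what a single sweep shows.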
TABLE I.  BENCHMARK RESULTS FOR GIVENS.

Matrix       Program Size                                              CPU/WC
Dimension    (Words)         CPU Time           Wall Clock Time        Time Ratio

  50            2.5K         3 sec.             12 sec.                  .25
  75            5.7K         8 sec.             19 sec.                  .42
 100           10.1K         18 sec.            36 sec.                  .50
 125           15.8K         35 sec.            51 sec.                  .69
 150           22.6K         59 sec.            1 min. 18 sec.           .76
 175           30.8K         1 min. 32 sec.     1 min. 57 sec.           .79
 200           40.2K         1 min. 56 sec.     90 min. 53 sec.          .02
 200*          40.2K         2 min. 24 sec.     3 min. 41 sec.           .65
 290*          84.4K         7 min. 11 sec.     15 min. 57 sec.          .45
 400*         160.4K         18 min. 31 sec.    47 min. 24 sec.          .39
* Modified GIVENS routine, see text.

Benchmarks

In this section we present results of programs which were run alone on our SLASH 7. Where available we also present the run times for other machines and, in particular, the UNIVAC 1110, which is the computer at the Madison Academic Computing Center. The following is a short description of each job, the purpose of running the particular task, and the results.

(1) CRUNCHER. This program is a small CPU-bound job which includes four arithmetic operations plus exponentiation. The main purpose of running this program is to establish the expected accuracy of the Harris 48-bit double precision word (39-bit mantissa) and to compare machine speeds. In Figure 1 we give the FORTRAN code for this program, and in Table II the results of CRUNCHER are compared with runs obtained on an IBM 370/195 and the UNIVAC 1110. Each computer has a different word length and the precision of the final result is as expected. The "correct" answer is 2.0. For this particular benchmark the SLASH 7 is approximately 15 times slower than the IBM 370/195 and about one-half the speed of the UNIVAC 1110.
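For readers who want to repeat the CRUNCHER experiment on a current machine, the loop listed in FORTRAN in Figure 1 can be transcribed as follows. This is a minimal sketch, assuming Python floats are IEEE 754 double precision (53-bit significand), which is true on common platforms but is not one of the word formats reported in Table II; the function name and the `iterations` parameter are illustrative.

```python
import math


def cruncher(iterations=1_000_000):
    """Python transcription of the CRUNCHER loop; the exact answer is 2.0."""
    root = math.sqrt(2.0)
    total = 0.0
    for _ in range(iterations):
        # Same four operations as the FORTRAN statement: +, *, /, -
        total = total + root * root / root - 0.0
    return (total / iterations) ** 2


print("TWO =", cruncher())
```

The deviation of the printed value from 2.0 reflects the rounding error accumulated over the additions, which is the quantity the original benchmark was designed to expose.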
      IMPLICIT DOUBLE PRECISION (A-H,O-Z)
      ROOT=DSQRT(2.0D0)
      SUM=0.D00
      DO 5 I=1,1000000
    5 SUM=SUM+ROOT*ROOT/ROOT-0.D00
      SUM=(SUM/1000000.0D0)**2
      WRITE(6,9)SUM
    9 FORMAT(5H TWO=,E30.20)
      STOP
      END

Figure 1. FORTRAN listing of benchmark CRUNCHER

TABLE II. BENCHMARK RESULTS FOR CRUNCHER

                      Harris/7           IBM 370/195    UNIVAC 1110        UNIVAC 1110
Word Size             48 Bits            64 Bits        36 Bits            72 Bits
Answer (Exact=2.0)    1.999 992 774 8    [illegible]    1.993 870 154 0    [illegible]
CPU Time              27.2 sec.          1.75 sec.      12.8 sec.          15.5 sec.

TABLE III. BENCHMARK RESULTS FOR MATMUL

                      Harris/7           UNIVAC 1110
CPU Time              5 min. 14 sec.     2 min. 18 sec.

TABLE IV. BENCHMARK RESULTS FOR SCFPGM

                      Harris/7           UNIVAC 1110
CPU Time              36 min. 3 sec.     26 min. 46 sec.
Wall Clock Time       54 min. 26 sec.    N/A
CPU/WC Time Ratio     .66                N/A
(2) MATMUL. In this program two 60x60 matrices are multiplied together 50 times. In Table III the results are given for the Harris computer and the UNIVAC 1110. Both runs are in double precision.

(3) SCFPGM. This program package calculates the one- and two-electron molecular integrals needed for an ab initio electronic structure calculation and subsequently performs a restricted self-consistent-field (SCF) calculation using the integrals. The benchmark calculation involved a Gaussian lobe basis set of 39 contracted functions appropriate to the carbon monoxide molecule. In this particular run more than 5 × 10⁶ Gaussian integrals were calculated and the SCF program ran for 20 iterations. In Table IV we present the results of the benchmark. This program is CPU bound, with a 66% CPU/WC time efficiency.
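To put the MATMUL timings in Table III in perspective, the work involved can be counted directly. The sketch below assumes the benchmark used the conventional triple loop (the source does not give the MATMUL code), so `matmul` and `multiply_add_count` are illustrative names and the derived rates are rough estimates only.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows (conventional triple loop)."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            aik = a[i][k]
            for j in range(n):
                c[i][j] += aik * b[k][j]
    return c


def multiply_add_count(n, repeats):
    """Number of multiply-add steps for `repeats` products of two n x n matrices."""
    return repeats * n ** 3


ops = multiply_add_count(60, 50)   # benchmark dimensions quoted in the text
harris_seconds = 5 * 60 + 14       # Table III
univac_seconds = 2 * 60 + 18       # Table III
print("multiply-adds:", ops)
print("Harris/7 multiply-adds per second:", round(ops / harris_seconds))
print("UNIVAC 1110 multiply-adds per second:", round(ops / univac_seconds))
```

Under the triple-loop assumption the benchmark amounts to 10.8 million multiply-add steps, so both machines were sustaining tens of thousands of double precision multiply-adds per second.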
(4) LEAST SQUARES. This program, which was provided by Dr. J. C. Calabrese of our department, does a least-squares analysis of x-ray crystallographic data and is one part of a large x-ray data analysis package. Such calculations are responsible for a substantial portion of the CPU utilization of our computer. In Table V the times for a typical LEAST SQUARES run are given for both the Harris/7 and the UNIVAC 1110. This program is also CPU bound.

TABLE V. BENCHMARK RESULTS FOR LEAST SQUARES.

                      Harris/7           UNIVAC 1110
CPU Time              26 min. 22 sec.    15 min. 9 sec.
Wall Clock Time       36 min. 39 sec.    N/A
CPU/WC Time Ratio     .72
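The CPU/WC time ratios quoted in the benchmark tables are the CPU time divided by the wall clock time. A small helper, with illustrative names, reproduces the Harris/7 entries of Tables IV and V from the listed times:

```python
def seconds(minutes=0, secs=0):
    """Convert a 'min. / sec.' table entry to seconds."""
    return 60 * minutes + secs


def cpu_wc_ratio(cpu_seconds, wall_seconds):
    """Fraction of elapsed time during which the job was using the CPU."""
    return cpu_seconds / wall_seconds


scfpgm = cpu_wc_ratio(seconds(36, 3), seconds(54, 26))          # Table IV
least_squares = cpu_wc_ratio(seconds(26, 22), seconds(36, 39))  # Table V
print(f"SCFPGM {scfpgm:.2f}  LEAST SQUARES {least_squares:.2f}")
```

A ratio near 1 means the job was CPU bound; a ratio that collapses toward 0, as in the unmodified 200x200 GIVENS run, indicates that the job spent most of its elapsed time waiting on paging.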
(5) TPROB. This program, which was provided by Professor C. F. Curtiss and Mr. R. R. Woods of our department, calculates atom-diatom rotational excitation cross sections. This program was selected as a benchmark because (1) it requires 193K words of core, which is considerably larger than our 37K physical memory (the program's paging rate is approximately 10 page requests/sec); (2) the program does a considerable amount of mixed-mode and complex arithmetic, so these functions of the Harris FORTRAN compiler could be tested; and (3) for each set of input parameters the program requires approximately 7 hours of CPU time. In normal operation, this program runs at the lowest priority to soak up unused CPU cycles. The CPU and wall clock times for this run on the Harris/7 are given in Table VI. Although this program requires more than 5 times the available core, the program received a 57% CPU/WC time ratio.
TABLE VI. BENCHMARK RESULTS FOR TPROB.

CPU Time              448 min.
Wall Clock Time       942 min.
CPU/WC Time Ratio     .57
Batch-Run Benchmarks

In this section we report the results of mixing the programs described in the previous section in order to determine the extent to which the virtual memory system can handle several jobs running simultaneously. For each run in Table VII, the total size for all programs greatly exceeds the 36 user pages of physical memory. The results in Table VII are representative of a larger set of statistics for numerous job mixes run at various priorities. These jobs are typical for our department. For each run in Table VII, the final job was aborted when the penultimate job was complete. Based on our experience to date, we make the following observations:

(1) For most job mixes (e.g., example 1 in Table VII) the total CPU/WC time ratio is close to the combined result obtained when the jobs were run alone. In fact, for some mixes (e.g., example 2) there is an overall improvement in throughput.

(2) Two or more large jobs running at the same priority result in a decrease in throughput (e.g., example 3). This occurs because in this situation the operating system time-slices the available CPU cycles by alternating between the two programs. The result is more paging and less CPU utilization.

(3) If two moderately paging jobs are mixed at the same priority, it is possible to generate a thrashing situation. For example, if two unmodified GIVENS jobs for 175x175 matrices are run alone or at different priorities, they are CPU bound. If they are run at the same priority they thrash. To correct the problem, it is necessary to suspend one job until the other is finished.

(4) Programs execute faster if the paging feature of the operating system, rather than explicit disk I/O commands, is used to reference data sets. This is not possible if the total code and data is greater than 256K.

(5) For large data sets, where it is necessary to use explicit disk I/O commands, it is inefficient to read into memory in a single command more pages of data than there are available user pages in the physical memory. For example, if there are 4 user pages available and 6 user pages are read from disk, the first two pages will be read, but then might be paged back to disk in order to generate space for the last two pages. A subsequent program reference to the first two pages results in additional swapping.
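Observation (5) amounts to capping each explicit read at the amount of physical memory actually free. A minimal sketch of that idea follows; the function name, the 4096-byte page size, and the use of an in-memory stream in place of a disk file are all assumptions for illustration and do not correspond to any Harris system call.

```python
import io


def read_in_chunks(stream, available_pages, page_bytes):
    """Yield successive blocks no larger than the memory assumed to be free."""
    chunk_bytes = available_pages * page_bytes
    while True:
        block = stream.read(chunk_bytes)
        if not block:
            break
        yield block


PAGE_BYTES = 4096                              # hypothetical page size
data = io.BytesIO(bytes(6 * PAGE_BYTES))       # a 6-page data set
sizes = [len(block) // PAGE_BYTES for block in read_in_chunks(data, 4, PAGE_BYTES)]
print(sizes)  # [4, 2]
```

With 4 pages available, the 6-page data set is brought in as a 4-page read followed by a 2-page read, so no single request forces pages that were just read to be swapped out again.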
TABLE VII. BENCHMARK RESULTS FOR BATCH RUNS.

Job                      Priority    CPU Time             Wall Clock Time     CPU/WC Time Ratio
1. SCFPGM                6           27 min. 8 sec.
   GIVENS (400)          4           18 min. 44 sec.
   TPROB                 0           11 min. 22 sec.
   TOTAL (Size=396K)                 57 min. 14 sec.      95 min. 18 sec.     .60
2. TPROB                 4           36 min. 54 sec.*
   GIVENS (400)          0           20 min. 2 sec.
   TOTAL (Size=359K)                 56 min. 56 sec.      100 min. 25 sec.    .57
3. GIVENS (400)          0           18 min. 42 sec.
   SCFPGM                0           13 min. 37 sec.*
   TOTAL (Size=205K)                 32 min. 19 sec.      71 min. 55 sec.     .45
* Aborted before completion, see text.

Cost-Effectiveness

We enter a discussion of this topic with reluctance, since the real cost of operating either our departmental computer or the central university computer is difficult to establish with precision. For the present, we restrict consideration to estimating the dollar cost to the Chemistry Department of computations performed on the SLASH 7, compared to the cost of using the central computing center (MACC).

Based on our records of actual charges, the effective cost at MACC is approximately $380 per hour at normal rates. This is a composite charge which includes CPU and memory utilization, I/O operations, and data and program storage. Most of the number-crunching calculations are run at a variety of reduced rates (overnight or weekend), so we adopt an average cost of $210/hour.

We next estimate the number of UNIVAC 1110 hours that we can generate on the SLASH 7. Based on our experience thus far, we expect to be able to achieve a maximum of 12 to 14 hours of SLASH 7 CPU time per day. This includes the estimated paging time and maintenance time. (The paging overhead could be reduced by adding more core.) Thus, at saturation we expect approximately 350 hours per month; this is a conservative estimate. The equivalent UNIVAC 1110 time is 175 hours per month, at a total cost of $450,000 per year.
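The saturation estimate above follows from a short chain of arithmetic, sketched below with illustrative function names. The factor of one-half is the SLASH 7 to UNIVAC 1110 speed ratio implied by the text (350 SLASH 7 hours equated to 175 UNIVAC hours); the other inputs are the figures quoted in this section.

```python
def equivalent_univac_hours(slash7_hours, speed_ratio=0.5):
    """UNIVAC 1110 hours doing the same work as the given SLASH 7 hours."""
    return slash7_hours * speed_ratio


def annual_macc_cost(univac_hours_per_month, rate_per_hour=210):
    """Yearly charge at the average reduced MACC rate, in dollars."""
    return univac_hours_per_month * 12 * rate_per_hour


univac_hours = equivalent_univac_hours(350)   # saturation estimate from the text
print(univac_hours, annual_macc_cost(univac_hours))
```

The computed figure is $441,000 per year, which the text rounds to $450,000.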
In the three months since the SLASH 7 has been in full operation, we have obtained an average of approximately 150 CPU hours per month, or approximately 40% of saturation. This corresponds to a cost at MACC of $180,000 per year. Interestingly, this is close to the purchase price of our SLASH 7 ($152,000). The direct costs to the Chemistry Department for operating the SLASH 7 are approximately $40,000 per year, which includes the salary of the systems manager, the on-call/complete service contract ($1160 per month), and supplies. If the system cost is amortized to zero value over a five-year period ($30,400 per year), the total cost of the SLASH 7 is approximately $71,000 per year, irrespective of the degree of utilization of the computer. Thus, at the present rate of usage, the cost effectiveness of the SLASH 7 is approximately 5:2, while at saturation it will be approximately 6:1. We emphasize that we consider this to be a conservative estimate of the effectiveness of the SLASH 7.

Discussion

After less than six months of full operation, we feel that the SLASH 7 has been an effective departmental resource for research-oriented computing. Departmental users have had little trouble in converting programs to the new machine. At present, a complete set of ab initio electronic structure programs (including configuration interaction), a complete x-ray data analysis package, the MINITAB statistical package, MINDO3, CNDO, Xα and other semiempirical programs, an NMR spectra-simulation package, and other chemistry codes are operating on the SLASH 7. For most applications the 48-bit double precision word length has provided sufficient accuracy. After an initial shake-down period of about four months, the hardware has proved reliable. All standard programming languages are included in the operating system, with FORTRAN and BASIC receiving the most use.

A significant by-product of the departmental computer has been greatly increased interaction among experimental and theoretical research groups. With more than ten groups actively using the computer, a stimulating research environment has been created in which expertise and ideas are shared across the boundaries of specialization and field. This perhaps will be the most significant and long-lasting benefit of our departmental computer.

Acknowledgements

Total funding for our SLASH 7 was provided by the University of Wisconsin-Madison. Professors Richard F. Fenske and John C. Schrag, together with the Departmental Computer Committee, were instrumental in the acquisition of the computer.
15

Computation in Quantum Chemistry on a Multi-Experiment Control and Data-Acquisition Sigma 5 Minicomputer

A. F. WAGNER, P. DAY, and R. VANBUSKIRK
Chemistry Division, Argonne National Laboratory, Argonne, IL 60439

ARNOLD C. WAHL
Science Applications, Inc., Rolling Meadows, IL 60008

Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch015

There has been considerable effort in the past few years to lower the cost of performing quantum chemistry computations. An alternative that we have examined is the utilization of a computer system whose primary task is the provision of real-time support for the experimentalist in the laboratory. There are several reasons why such a system is bound to have resources available for execution of a program on an 'as-time-is-available' basis. The usage of system resources required by many on-line experiments is usually not constant. The system is usually scaled to provide service for worst case conditions. Effective response to real-time events requires that the sum of the 'event-driven' tasks should be less than 100 percent of the system's capacity. This incidental 'free' time may then be used for doing useful work, such as quantum chemistry computations.

In a way our facility provides a service to the computationally oriented user in the same way that a mini provides the service when connected to a network where some of the minis are involved with instrument control and other minis support the computational operations. The difference is that we perform all the tasks on a single computer of somewhat larger capability than a minicomputer. For those installations interested in both greater experimental automation and quantum chemistry computing at nominal cost, our experience suggests that bootlegging batch computations on a computer dedicated to experimental control is an attractive and feasible alternative to a collection of dedicated minicomputers.
System Overview

Our chemistry division of about 120 research scientists is involved in basic research, requiring highly flexible instrument automation, experiment control and experiment analysis. In addition, there is a strong program of ab initio calculations, performed mostly on Argonne's central IBM 370/195. Frequent instrument replacement and enhancements require rapid and efficient modifications to the associated computer programs and services.
In 1967, before the proliferation of low-cost minis, a careful study of our diverse laboratory automation needs led us to the conclusion that a central computer could support all of the real-time needs of the current and projected instruments and, on the average, have enough left-over resources to support a useful amount of theoretical computation [1]. A suitable hardware configuration would require an operating system to provide effective protection, fast real-time response and efficient data transfer. An SDS Sigma 5 computer satisfied all our hardware criteria. However, it was necessary to design and write our own operating system [2]. Services include program generation, experiment control, real-time analysis, interactive graphics, batch processing and long-term computation (hundreds of hours). Our system is currently providing real-time support for 26 concurrently running experiments (see Fig. 1), including an automated neutron diffractometer, a pulsed NMR spectrometer, ENDOR and ESR spectrometers, infrared spectrophotometers and nuclear multi-particle detection systems [3]. It guarantees the protection of each user's interests and dynamically assigns core memory, disk space and 9-track magnetic tape usage. Multiplexor hardware capability allows the transfer of data between a user's device and assigned core area at rates of up to 100,000 bytes/sec. Real-time histogram generation for a user can proceed at rates of 50,000 points/sec.
The facility has been self-running (without a computer operator) for seven years with a mean time between failures of 11 days and an uptime of 99% of a weekly schedule of 160 hours.

Figure 1. Sigma 5 layout. [Diagram of the centralized real-time computing facility: a XEROX Sigma 5 computer (230K bytes) providing experimental control, data acquisition and data analysis for 21 remotely located experiments, with magnetic tape drives for data storage, a graphic display for data viewing and manipulation, Teletypes for experiment communication, and a card reader for batch processing. Experiments shown: electronic spectra of molten salts; infrared spectroscopy; molecular beam research; pulse radiolysis at the electron linac; pulsed proton and 13C NMR spectroscopy; ENDOR (electron nuclear double resonance); nuclear particle counting in the Chemistry Building, Tandem Van de Graaff and Cyclotron; low temperature laboratory; low-level radioactivity counting facility; neutron diffraction at the CP-5 research reactor; high temperature laboratory; mass spectrometers.]

Foreground Tasks. Serving the foreground tasks is the highest priority function of the system. These tasks consist of the execution of programs associated with each of the on-line instruments. A software priority is associated with each program controlling an interfaced instrument. Upon receipt of a request for execution (e.g., a data buffer is full), the user's real-time program will commence execution within about 160 microseconds if it is the highest priority "ready-to-run" job; otherwise it will commence running when all higher priority tasks are completed. Since foreground service cycles typically complete in less than 100 milliseconds (maximum allowed is one second), the lowest priority foreground task seldom remains in the "ready-to-run" state for more than a fraction of a second.

Non-Resident Program Execution. Real-time computational requirements vary over a wide range. The pulsed NMR spectrometer may require scan averaging a 16K word histogram every 300 milliseconds, taking about 100 milliseconds per update. Other experiments may require the execution of a 25K word histogram transformation program (correlated nuclear fission particles) every minute, taking about 10 seconds. Still other users require this type of execution every 10 minutes with execution times ranging from a few seconds to 30 seconds.

To satisfy this variety of demand without requiring an inordinate amount of core memory, the operating system provides for the time-shared execution of non-resident programs (not always resident in core) in the background core area (where batch and long-term jobs are executed). These programs are disk-resident core images of relatively large programs required infrequently and without severe time constraints. Two queues for this type of service are provided: one with a 1 second and the other with a 32 second time limit. These programs are usually written in FORTRAN by the individual users.

Batch Processing. An open-shop batch-processing capability is supported by the system. Queuing jobs through the card reader provides the casual user with immediate feedback for the rapid debugging of programs. Although the on-line user has the option of performing extensive analysis of an experiment from a remote terminal, the batch level is often used where large amounts of output are required or for the transferring of file data between magnetic tape and disk file storage. The batch level is also used extensively to generate and debug code for the control of on-line experiments and for performing most of the computations described in this paper.

The batch level may use all CPU cycles not used by higher priority processes: foreground execution, non-resident execution, and system loading functions. Under normal daytime loading, the foreground usage requires about 10 percent of the CPU cycles and the non-resident execution about another 40 percent.
Thus, it appears to the batch user that his program is executing on a computer with about half the speed of a Sigma 5 computer dedicated to batch processing.

Long Term Computation. Utilization of the CPU seldom exceeds 40 percent in a 24 hour period, even with considerable batch usage. The remaining CPU cycles are made available for executing very long (hours to weeks) batch-type computations running at a priority level below batch processing. These jobs differ from batch jobs in that they only have access to disk files, not the batch peripherals. Once initiated (from the card reader), the job is read into batch core memory from the disk anytime there is sufficient space and higher priority usage permits. The daily saving of disk files on magnetic tape also copies the current core image of the long-term job along with its files. Automatic file (and long-term) restoration at system boot-in supports execution extending over long periods. Table I indicates the distribution of long-term jobs that might be performed during a busy week.
Table I. Long-Term Job Length Distribution

LENGTH (HOURS)    [illegible]    [illegible]    [illegible]    100
JOBS PER WEEK     15             4              2              0.3
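The four service levels described in the preceding subsections (foreground, non-resident, batch, long-term) form a strict priority ordering, and each level sees only the CPU cycles left over by the levels above it. The sketch below illustrates both points; the numeric priority values and job names are hypothetical, and the simple run-to-completion dispatcher ignores the preemption that the real operating system performs.

```python
import heapq

# Lower number = higher priority. The ordering follows the text; the numbers are illustrative.
PRIORITY = {"foreground": 0, "non-resident": 1, "batch": 2, "long-term": 3}


def dispatch_order(ready_jobs):
    """Return job names in the order a strict-priority dispatcher would start them."""
    heap = [(PRIORITY[level], seq, name) for seq, (name, level) in enumerate(ready_jobs)]
    heapq.heapify(heap)
    order = []
    while heap:
        _, _, name = heapq.heappop(heap)
        order.append(name)
    return order


def residual_cpu_fraction(higher_priority_fractions):
    """CPU fraction left for a level after all higher-priority usage is subtracted."""
    return 1.0 - sum(higher_priority_fractions)


jobs = [("scf-run", "long-term"), ("compile", "batch"),
        ("nmr-buffer", "foreground"), ("histogram", "non-resident")]
print(dispatch_order(jobs))
print(residual_cpu_fraction([0.10, 0.40]))   # daytime foreground and non-resident loads
```

With the daytime loads quoted in the text (about 10 percent foreground and 40 percent non-resident), roughly half of the CPU remains for batch work, which is the source of the "half the speed" estimate.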
Queuing Low Priority Tasks. As the system is required to provide real-time support, the batch processing suffers. Since many of the batch jobs are I/O bound, consideration is being given to spooling all batch I/O. This would overlap the I/O with foreground and non-resident executions and thus speed up the apparent execution speed of the batch job. As a further enhancement to batch execution, consideration is also being given to including batch in the non-resident execution queue. This would further enhance batch processing speed and at the same time eliminate the priority advantage of the time-share user.

As implemented, the long-term queue consists of starting the next job from the card reader after the previous long-term job is completed. A queue is going to be set up to execute jobs in a cyclic manner, with more execution time being given to the shorter jobs.

Quantum Chemistry Computations

The usefulness of the Sigma 5 system for the quantum chemist depends on the scale of the calculations. Broadly speaking, we may distinguish large scale calculations, requiring tens of minutes on the equivalent of a fourth generation computer, and small scale calculations requiring fewer resources. Large scale work generally involves the ab initio calculation of wavefunctions for either the bound motion of electrons and nuclei in structure studies or for the unbound motion of particles on potential energy surfaces in dynamic studies. Such calculations are most conveniently performed by either a large computer (e.g., fourth generation) or a dedicated minicomputer. The Sigma 5 system is neither sufficiently powerful nor sufficiently dedicated to be conveniently used for large scale calculations.

Small scale calculations are varied and not readily categorized. They include the rigorous calculation of relatively simple wavefunctions (e.g., for diatomic nuclear motion or for atom-atom elastic scattering), the approximate calculation of wavefunctions or their informational equivalent (e.g., Hückel theory or semiclassical trajectory studies), the reduction of the wavefunction to observable quantities (e.g., equilibrium dipole moments or differential cross sections), the curve or surface fitting of wavefunction information at discrete system geometries (e.g., the dipole moment curve or the potential energy surface), and the
graphics display of the results of all the above calculations. Such calculations require a flexible but only moderately powerful computer such as the Sigma 5. In what follows we will describe several general features of FORTRAN programming for the Sigma 5 system in the batch and long-term mode. Then we will review several small scale quantum chemistry programs now in operation.

FORTRAN Programming. A FORTRAN program can be written in two ways: a deck of cards can be keypunched or card images can be entered on an interactive display terminal. The latter alternative makes use of a page editing system, TEXTEDIT, which permits the rapid typing and editing of card images followed by transmittal to a disk file. The file can be accessed with a batch job and the card images listed and punched. Three types of terminals are available: Lear-Siegler 7700, Tektronix 4023, and Tektronix 4010. TEXTEDIT also can be used with a teletype.

There are several system routines which allow the FORTRAN programmer the use of exceptionally useful I/O instructions. For reading data from cards, the system routine READ causes the FORTRAN statement CALL READ(A,B,C,...) to instruct the computer to read, in a format-free mode, A, B, C, etc., on a single card, provided at least one blank space separates each member of the argument list. For reading from or writing on the disk, no JCL is required. A program may have up to two files open at one time (file pointers are core resident).
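The format-free card read described above can be mimicked in a few lines. This is a sketch of the idea only: the text specifies just that the values share one card and are separated by at least one blank, so the 80-column limit, the numeric conversion, and the handling of FORTRAN D exponents below are assumptions, and `read_free_format` is a hypothetical name rather than the Sigma 5 system routine.

```python
def read_free_format(card, count):
    """Parse `count` blank-separated numbers from one 80-column card image."""
    fields = card[:80].split()
    if len(fields) < count:
        raise ValueError("card holds fewer values than requested")
    # Accept FORTRAN-style D exponents (e.g. 3.0D0) by mapping them to E notation.
    return [float(field.upper().replace("D", "E")) for field in fields[:count]]


a, b, c = read_free_format("  1.5   -2   3.0D0", 3)
print(a, b, c)  # 1.5 -2.0 3.0
```

The appeal of such a routine is that data cards need no column alignment, which removes a common source of keypunching errors.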
A file may be private, in which case it is defined by the statement CALL DEFDSK(NAME,NSEC), where NAME is the address of a 20 character EBCDIC file name and NSEC is the number of sectors (256 words) in the file. Up to 120 private files may be defined by each user. A scratch file is also available, and it can be accessed by the statement CALL OPNSCR. Disk I/O can involve reading, writing, or write-reading fixed or variable records. The write-read option allows one logical record to be written on the disk and the next logical record to be read into the same core occupied by the first record, thereby saving one disk revolution period (25 ms). As an example of a FORTRAN call for disk I/O, a variable record read occurs with the execution of the statement CALL DISKR(ARRAY,N,ISEC,IOVER), where ARRAY is the name of the first element in the record, N is the number of words per logical record, ISEC is the disk sector number, and IOVER is a file overflow indicator.

For magnetic tape I/O, labeled and unlabeled tape may be directly referenced by the standard FORTRAN I/O statements READ (U,F) and WRITE (U,F), where U is the unit number of one of two tape drives and F is the format statement number. Prior to execution, the relevant magnetic tapes must be reserved and mounted. A tape drive can be reserved by a single JCL assign card, for example,
! ASSIGN 111=LMT TAPELABEL where a l a b e l e d tape (LMT) with the l a b e l TAPELABEL i s reserved for d r i v e 111. The JCL f o r executing a batch or long term job i s p a r t i c u l a r l y simple. This i s i l l u s t r a t e d by the examples given i n F i g . 2. In example A , a subroutine or complete program i s stored as an obj e c t module i n a p r i v a t e l i b r a r y under the name of the program. Card 1 i n the example i s the job 'card which i s the f i r s t card i n every batch or long term submission. I t gives the u s e r ' s ID number (XXX) and name. The l a s t card i n the example i s the end-ofdata card which ends every batch submission. In example B, the main program START i s to be executed. Any unresolved e x t e r n a l references i n START l e a d to a s i n g l e pass search through subsequent e n t r i e s i n the p r i v a t e l i b r a r y . Then the p u b l i c l i b r a r y i s searched f o r the referenced u t i l i t y programs ( e . g . , DSQRT, ABS, etc.). In t h i s way i n d i v i d u a l subroutines stored as members of the p r i v a t e or p u b l i c l i b r a r y are s e l e c t e d and l i n k e d together to form an executable module. In example C, the program stored i n the p r i v a t e l i b r a r y under the entry MIDDLE w i l l be executed i n the long term mode. LT on card 2 i d e n t i f i e s the mode and NNN i s the estimated CPU time r e q u i r e d i n minutes. A l l input and output f o r a long term job must be v i a d i s k I/O, so there can be no data deck. Preceeding and f o l l o w i n g batch jobs read i n any input and p r i n t , punch, tape or p l o t any output. System r o u t i n e s permit i d e n t i f i c a t i o n of any long term job current i n the computer. The examples given i n F i g . 2 a l l d e a l with executing jobs v i a a p r i vate subroutine l i b r a r y . Many other ways of running jobs are poss i b l e and a l l have a JCL as simple as the examples i n F i g . 2. 
Example A.
    !JOB XXX MYNAME
    !FORTRAN LS PROGRAM ROM 1ST
    (FORTRAN DECK)
    !EOD

Example B.
    !JOB XXX MYNAME
    !LOAD START
    (DATA DECK)
    !EOD

Example C.
    !JOB XXX MYNAME
    !LOAD MIDDLE LT NNN
    !EOD

Figure 2. JCL examples described in text

The various FORTRAN programming features we have just described have all been used to assemble a library of operational small scale quantum chemistry programs. Several members of this library we will now discuss under the loose categories of structure studies, dynamic studies, and graphics. Several of these programs have been run on an IBM 360/195. While precise comparisons are not available, our experience indicates that the Sigma 5 is roughly 30 times slower than the IBM 360/195 for jobs that are not I/O bound.

Structure Studies. The program POTFIT will least squares fit Morse and Hulburt-Hirschfelder potential functions to a set of diatomic potential energies as a function of vibrational stretch. The nonlinear least squares code used in the fit is an adaptation of STEPIT (QCPE program #66) [4]. The method involves a pattern search for the nearest minimum in the least squares expression, starting from an initial guess of parameter values. The input to POTFIT consists of the option for a Morse or Hulburt-Hirschfelder fit, the masses of the atoms, the initial guess of the parameter values, and the set of data to be fit. All input is format free. The output consists of a listing of the input, the final parameter values, the accuracy of the fit, the spectroscopic constants derived from the parameter values, and the resulting vibrational levels. Figure 3 reproduces the last two pages of output for a Morse fit to a set of ab initio calculated potential energies for H2. POTFIT runs in 25K bytes and typical execution times are about two minutes.

The program CR360 is another nonlinear least squares fitting routine. CR360 will fit a supplied functional form to a set of functional values for one, two, or three independent variables, i.e., CR360 will produce curves, surfaces, or hypersurfaces. The method used in fitting is to distinguish linear from nonlinear parameters, to solve the linear least squares problem for a supplied grid of nonlinear values, and to display maps of the sum of the squares of the errors on the grid. There is no automatic search for the minimum nearest to the initial guess. The user, through examination of the maps, must select the next set of nonlinear parameter values to search through. The program was designed for problems where there is the possibility of many minima and the location and display of all the minima are important. Prior to execution of CR360, the user must insert into the private library a subroutine that, for any given set of fitting parameters and constants, will calculate the functional form for any combination of independent variables in the data set.
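The separation of linear from nonlinear parameters can be sketched as follows. This is a minimal illustration in Python, not the CR360 code. The model y = a*exp(-b*x), with one linear parameter a and one nonlinear parameter b, is an assumption chosen so that the linear least squares step has a closed form. The data, the grid, and the name `sse_map` are likewise hypothetical.

```python
import math

def sse_map(xs, ys, nonlinear_grid):
    """For each trial nonlinear parameter b, solve the linear least squares
    problem for the amplitude a in y = a*exp(-b*x) and record the sum of
    the squares of the errors."""
    rows = []
    for b in nonlinear_grid:
        basis = [math.exp(-b * x) for x in xs]
        # Closed-form linear least squares for a single linear parameter.
        a = sum(y * f for y, f in zip(ys, basis)) / sum(f * f for f in basis)
        sse = sum((y - a * f) ** 2 for y, f in zip(ys, basis))
        rows.append((b, a, sse))
    return rows

xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * math.exp(-1.5 * x) for x in xs]   # synthetic, noise-free data
grid = [0.5, 1.0, 1.5, 2.0, 2.5]              # user-supplied grid of b values

rows = sse_map(xs, ys, grid)
for b, a, sse in rows:                         # the "map" of errors over the grid
    print(f"b = {b:4.2f}   a = {a:8.5f}   SSE = {sse:10.3e}")
best = min(rows, key=lambda row: row[2])
```

With several linear parameters the closed-form division would be replaced by a solution of the normal equations at each grid point. The printed table stands in for the error maps from which the user selects the next grid.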
At execution, the input for CR360 consists of the number of independent variables, any bias and scaling to be applied to the data, the data and the weight that is to be attached to each data point, the grid of nonlinear parameter values, and the map resolutions for the maps of the sum of the squares of the errors over the grid. The output consists of the listing of the input data and the data biased and scaled, a listing of the final parameter values and the fitting errors, and the display of up to ten maps of different resolutions for the residuals over the grid. The program runs in 120K bytes and its execution time is strongly dependent on the amount of data and the number of fitting parameters. For 150 data points, 30 linear fitting parameters, and 200 nonlinear parameter grid points, CR360 takes between 5 and 10 minutes.

A rather specialized program used in conjunction with large scale ab initio wavefunction calculations is STVTWC, a program modified from one by Hagstrom (QCPE program #9) [5]. This program calculates selected diatomic one-electron integrals for a given basis set of atom-centered Slater type orbitals (STO). The integrals that can be requested are the overlap, the kinetic energy, the nuclear attraction, and the z-moment. For a given basis set of STOs, the selected integral for every pair of orbitals is computed to give a matrix of results. The integration is performed by expanding the STOs in elliptical orbitals followed by analytic integration.
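As an illustration of this kind of integration, the sketch below evaluates the simplest case, the overlap of two normalized 1s STOs with the same zeta, in elliptical (prolate spheroidal) coordinates. It is a textbook special case under stated assumptions and not the STVTWC algorithm, which handles general quantum numbers and zeta values. The function names are hypothetical. In these coordinates the product of the two orbitals depends only on xi, the eta and phi integrations are elementary, and the remaining xi integrals are the auxiliary functions A_n.

```python
import math

def aux_a(n_max, rho):
    """A_n(rho) = integral from 1 to infinity of xi**n * exp(-rho*xi) d(xi),
    built by the upward recursion A_n = (exp(-rho) + n*A_(n-1)) / rho."""
    values = [math.exp(-rho) / rho]
    for n in range(1, n_max + 1):
        values.append((math.exp(-rho) + n * values[n - 1]) / rho)
    return values

def overlap_1s_1s(zeta, distance):
    """Overlap of two normalized 1s STOs (same zeta) separated by distance > 0."""
    rho = zeta * distance
    a = aux_a(2, rho)
    # After the eta and phi integrations: S = (rho**3 / 4) * (2*A_2 - (2/3)*A_0)
    return (rho ** 3 / 4.0) * (2.0 * a[2] - (2.0 / 3.0) * a[0])

def overlap_closed_form(zeta, distance):
    """Known closed form, exp(-rho) * (1 + rho + rho**2 / 3), used as a check."""
    rho = zeta * distance
    return math.exp(-rho) * (1.0 + rho + rho * rho / 3.0)

# Hypothetical input: zeta = 1.0 and an internuclear distance of 1.4 bohrs.
print(overlap_1s_1s(1.0, 1.4), overlap_closed_form(1.0, 1.4))
```

The two printed values should agree to rounding error, which is a convenient check on the recursion before it is applied to higher powers of xi.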
The input is format free and consists of the internuclear distance, the charge on the two nuclei, the selection flag for the integral desired, the number of STOs, and the quantum numbers and zeta value for each STO. The printed output consists of a list of the input followed by a listing of the calculated integral matrix. The program runs in 25K bytes and takes on the order of 3 minutes to execute a typical job.

Figure 3. The last two pages of printed output from a typical run of POTFIT (a Morse curve fit for H2 listing the binding energy, Re, asymptotic energy, beta parameter, spectroscopic constants, fit analysis, and vibrational levels in hartrees and wavenumbers)

A final program that relates to structure studies is FCF, a routine to calculate the Franck-Condon factors connecting the vibrational states of two different Morse potentials. The calculation consists of the numerical determination of the Morse vibrational wavefunctions followed by Simpson integration of the product. The input consists of the reduced mass followed by the identification title, the Morse parameters, and the maximum vibrational level of each electronic state. The integration range and grid size complete the input. At most 1000 grid points and 15 vibrational states in each electronic state are allowed. The output consists of a listing of the input and the calculated Franck-Condon factor matrix. There is an option to punch the matrix if desired. The program runs in 30K bytes and requires about five minutes for a typical case.

Dynamic Studies. The program PHASE will calculate the elastic cross section and differential cross section as a function of collision energy for an atom-atom collision system. This is done by calculating the quantum phase shift for an input interaction potential for each angular momentum quantum number of importance at the given collision energy.
The phase shift calculation can be done either rigorously by finite difference solution of the Schroedinger equation or approximately by a JWKB solution involving special quadrature formulas to handle the classical turning point singularity in the JWKB integrand. Once all the phase shifts are obtained at a given energy, the cross section and differential cross section are obtained by standard formulas. Before the program can be executed, the private subroutine library must contain the appropriate routine to read and display the parameters of the desired interaction potential and to calculate the potential and its derivative at any point in space. Routines already available include those for Lennard-Jones and EXP-6 potentials as well as a spline potential for a numerically calculated set of potential points. Given the potential routine in the library, the input to PHASE consists of the reduced mass, the energy (or velocity) spectrum, the potential parameters, and parameters governing the finite difference or quadrature solution. The output consists of a listing of the input followed by a list, for each energy, of the phase shift as a function of orbital angular momentum. Along with each phase shift, the program also lists the potential at the turning point, the centrifugal potential at the turning point, the contribution of the phase shift to the cross section, and the accumulated cross section from all the preceding phase shifts.
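The standard formulas referred to above are presumably the usual partial wave sums, and the sketch below assumes so. It is an illustration in Python, not the PHASE code. The function names, the wavenumber, and the phase shifts are hypothetical. The total cross section is taken as (4*pi/k**2) times the sum over l of (2l+1)*sin(delta_l)**2, and the differential cross section as the squared modulus of the scattering amplitude.

```python
import math

def partial_wave_cross_sections(k, phase_shifts):
    """Return rows (l, delta_l, contribution, accumulated cross section),
    using sigma = (4*pi/k**2) * sum over l of (2l+1) * sin(delta_l)**2."""
    rows, total = [], 0.0
    for ell, delta in enumerate(phase_shifts):
        part = 4.0 * math.pi / k ** 2 * (2 * ell + 1) * math.sin(delta) ** 2
        total += part
        rows.append((ell, delta, part, total))
    return rows

def differential_cross_section(k, phase_shifts, theta):
    """|f(theta)|**2 with
    f = (1/k) * sum over l of (2l+1) * exp(i*delta_l) * sin(delta_l) * P_l(cos theta)."""
    x = math.cos(theta)
    p_prev, p_curr = 0.0, 1.0            # placeholder and P_0
    amplitude = 0j
    for ell, delta in enumerate(phase_shifts):
        if ell > 0:
            # Legendre recursion: l*P_l = (2l-1)*x*P_(l-1) - (l-1)*P_(l-2)
            p_prev, p_curr = p_curr, ((2 * ell - 1) * x * p_curr
                                      - (ell - 1) * p_prev) / ell
        phase = complex(math.cos(delta), math.sin(delta))
        amplitude += (2 * ell + 1) * phase * math.sin(delta) * p_curr
    return abs(amplitude / k) ** 2

k = 1.0                       # hypothetical wavenumber
shifts = [0.3, 0.2, 0.1]      # hypothetical phase shifts for l = 0, 1, 2
for ell, delta, part, running in partial_wave_cross_sections(k, shifts):
    print(f"l = {ell}  delta = {delta:.3f}  part = {part:.5f}  sum = {running:.5f}")
```

The running total in the last column corresponds to the accumulated cross section that PHASE lists alongside each phase shift.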
At the end of the phase shift list, the total cross section and its log are printed and punched if desired. Optional printout consists of a listing of the differential cross section over an input range of scattering angle, a listing of the extrema in the differential cross section, and a line printer plot
of the log of the differential cross section versus scattering angle. Figure 4 reproduces such a plot from a study of the Ar-H elastic scattering. The program runs in 100K bytes. Execution times per energy vary with the energy, the collision system, and the solution method (rigorous or JWKB). Rigorous calculations take longer and vary from 20 seconds to ten minutes, with a typical time on the order of two minutes.

Another operational dynamics program is TRAJ3D, a three dimensional classical trajectory routine. The program is an adaptation of Muckerman's routine in QCPE (program #229) [6]. Given the potential energy surface, any three atom collision system can be studied. For a given energy, the standard semiclassical initial conditions are used for each trajectory and the calculated final conditions are analyzed according to the bin method. After calculating the desired number of trajectories, the program analyzes the bins for the nonreactive, reactive, and dissociative cross sections and differential cross sections. The method of calculation is a combination of a Runge-Kutta and an 11th order predictor-corrector solution to Hamilton's equations. Before the program can be executed, the private library must contain a package of routines to read and display the potential energy surface parameters and to calculate the potential energy and its derivative at any point in space.
Given this package, the input consists of the reduced mass, the collision energy, the initial state of the diatomic molecule, the range of impact parameters to be studied, the initial separation of the reactants, the number of trajectories, parameters relating to the method of calculation, and parameters relating to the analysis and display of the results. The output consists of the above mentioned cross sections and differential cross sections as well as the translational energy loss as a function of scattering angle and the correlation of rotational and vibrational energy gain or loss. TRAJ3D runs in 100K bytes. For a given energy, the execution time varies with the energy and the collision system. A typical time per trajectory is about 1 minute. TRAJ3D is not a good program to run in the batch mode as the usual trajectory study would involve up to a few thousand trajectories, i.e., a number of hours of CPU time. However, the long term mode is ideal for trajectory studies. Under ordinary circumstances the initial and final conditions of each trajectory would be saved for any additional analysis desired later. Thus TRAJ3D can be readily decomposed into an input program that places all input on the disk, a central program that reads the input, calculates the trajectories, and stores the information for each trajectory on the disk, and finally an analysis program that reduces the trajectory information to measurable quantities.
The central program is run on long term and in this way several thousand trajectories can typically be run overnight between two work days.
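The Runge-Kutta part of the TRAJ3D integration scheme can be illustrated with a one dimensional sketch. This is a fourth-order Runge-Kutta step for Hamilton's equations written in Python. It is not the TRAJ3D code, it omits the 11th order predictor-corrector, and it uses a harmonic oscillator as a stand-in for the user-supplied potential derivative routine. All names and numerical values are hypothetical.

```python
import math

def rk4_step(q, p, dt, mass, dvdq):
    """One fourth-order Runge-Kutta step of dq/dt = p/m, dp/dt = -dV/dq."""
    def deriv(q_, p_):
        return p_ / mass, -dvdq(q_)

    k1q, k1p = deriv(q, p)
    k2q, k2p = deriv(q + 0.5 * dt * k1q, p + 0.5 * dt * k1p)
    k3q, k3p = deriv(q + 0.5 * dt * k2q, p + 0.5 * dt * k2p)
    k4q, k4p = deriv(q + dt * k3q, p + dt * k3p)
    q_new = q + dt * (k1q + 2.0 * k2q + 2.0 * k3q + k4q) / 6.0
    p_new = p + dt * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
    return q_new, p_new

def harmonic_dvdq(q, force_constant=1.0):
    """Derivative of the stand-in potential V = 0.5*k*q**2 (hypothetical surface)."""
    return force_constant * q

def energy(q, p, mass=1.0, force_constant=1.0):
    """Total energy, used here only as a conservation check."""
    return p * p / (2.0 * mass) + 0.5 * force_constant * q * q

q, p = 1.0, 0.0                 # hypothetical initial conditions
e0 = energy(q, p)
for _ in range(1000):           # 1000 steps of 0.01 time units
    q, p = rk4_step(q, p, 0.01, 1.0, harmonic_dvdq)
print(q, math.cos(10.0), energy(q, p) - e0)
```

Conservation of the total energy along the trajectory is the customary check on step size. In a three atom system the same step would be applied to the full vectors of coordinates and momenta.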
Figure 4. Line printer display of a differential cross section produced by PHASE
Graphics. The program WSPLOT fits cubic spline polynomials to sets of data and plots the resulting curves in page size (8 1/2 x 11) figures. There is an option to make the X or Y axis 8 inches long with the other axis 6 inches long. Tick marks automatically occur every inch on both axes. The input consists of the limits of the X and Y axes, the figure title, the axes titles, and the input for each set of data. This data input consists of first a selection of a solid or dashed line with or without symbols marking the data points or no line at all with the data marked by diamonds. Then the actual data can be submitted in two ways: one format free data card for each abscissa-ordinate pair or under the control of a subroutine placed in the private library prior to execution of WSPLOT. Options allow for the internal biasing and scaling of both the ordinate and the abscissa. The data points must be arranged in order of increasing abscissas. Up to two hundred data points can be accommodated in a single set. Each set of data produces one curve on the figure. The printed output consists of a listing of all the input, a listing of the biased and scaled data, and a listing of the spline fit to the data points to test for any numerical errors in the spline fit. The plot output is on the plot disk file. In a separate job, a system routine will direct the Calcomp plotter to plot what is on the file; this separate job requires only the job card (see Fig. 2) followed by a plot card:

    !LOAD PLOT

As many figures and as many curves on each figure can be run in a single job as desired.
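A cubic spline fit of the kind WSPLOT performs can be sketched as follows. The text does not state which end conditions WSPLOT uses, so the natural spline (zero second derivative at both ends) is an assumption here. The function name and the sample data are hypothetical. As in WSPLOT, the abscissas must be supplied in increasing order.

```python
import bisect

def natural_cubic_spline(xs, ys):
    """Return a function that evaluates the natural cubic spline through (xs, ys)."""
    n = len(xs) - 1
    if n < 1 or any(xs[i] >= xs[i + 1] for i in range(n)):
        raise ValueError("need at least two points with increasing abscissas")
    h = [xs[i + 1] - xs[i] for i in range(n)]
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = (3.0 * (ys[i + 1] - ys[i]) / h[i]
                    - 3.0 * (ys[i] - ys[i - 1]) / h[i - 1])
    # Forward sweep of the tridiagonal system for the quadratic coefficients c.
    diag = [1.0] * (n + 1)
    mu = [0.0] * (n + 1)
    z = [0.0] * (n + 1)
    for i in range(1, n):
        diag[i] = 2.0 * (xs[i + 1] - xs[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / diag[i]
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / diag[i]
    # Back substitution; c[0] = c[n] = 0 is the natural end condition.
    b = [0.0] * n
    c = [0.0] * (n + 1)
    d = [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (ys[j + 1] - ys[j]) / h[j] - h[j] * (c[j + 1] + 2.0 * c[j]) / 3.0
        d[j] = (c[j + 1] - c[j]) / (3.0 * h[j])

    def evaluate(x):
        # Locate the interval containing x, clamping to the first or last one.
        j = min(max(bisect.bisect_right(xs, x) - 1, 0), n - 1)
        t = x - xs[j]
        return ys[j] + t * (b[j] + t * (c[j] + t * d[j]))

    return evaluate

xs = [0.0, 1.0, 2.0, 3.0]            # hypothetical data set
ys = [0.0, 1.0, 4.0, 9.0]
spline = natural_cubic_spline(xs, ys)
print([spline(x) for x in xs])       # listing of the fit at the data points
```

Evaluating the spline at the data points, as in the last line, corresponds to the listing that WSPLOT prints to test for numerical errors in the fit.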
WSPLOT runs in 25K bytes and takes about 30 seconds to process a typical curve.

Another graphics program that displays surfaces instead of curves is KPLOT, which makes a contour plot of any function of the polar coordinates (R, theta). The titling in KPLOT assumes what is being plotted is the potential energy surface of an atom approaching a diatom frozen at a fixed vibrational stretch. However the contour plot itself can be for any surface. Prior to the execution of KPLOT, a surface subroutine must be stored in the private library. This routine must handle all information regarding the surface to be plotted, i.e., it must read and display all surface parameters and calculate the surface at any arbitrary point. KPLOT then searches for contour values along given radial vectors. When a desired contour value is discovered, it is numerically traced and the resulting curve is stored in the plot file. If the trace is lost due to kinks in the surface that are missed, an error message is given. The input to KPLOT consists first of title cards for the figure, for the radial scale insert in the figure and for the chemical symbols of the AB+C potential energy surface assumed in the titling. Then come the maximum and minimum radial values within which the contours will be plotted and the dimensions of the figure. Next are given the angles for which radial vector searches for contour values are to be performed. The
second to last piece of information is the number of positive contours, the largest contour, the fraction relating adjacent contours, and the percent fit of the computed contour trace to the actual contour. Both positive and negative contours are searched for and traced. Finally any surface parameters are submitted under the control of the subroutine discussed above. The printed output consists of a listing of the input and a digest of the trace information for each contour. The plot output consists of the figure and, as an option, to the right side of the figure, a summary of the contours found and where they were found. As always the plot information is placed in a plot disk file to be accessed in a second batch job by the system routine PLOT. Figure 5 reproduces a figure produced by KPLOT; the plotted surface is the potential energy surface for Li + H2 with H2 frozen at 1.4 bohrs. The program runs in 30K bytes and takes about 3 minutes to execute the plot in Fig. 5.

Assessment for Quantum Chemists

The major advantages of the Sigma 5 system are its power, flexibility, simplicity of operation, and nominal cost. Most FORTRAN programs for small scale quantum chemistry calculations require little reworking to become operational on the system. The JCL, as illustrated by Fig. 2, is exceedingly simple and direct. The system is open shop and thus each person directly runs his own job without the delay of working through an intermediate staff of computer operators.
The nominal cost of the batch and long term computations is due to the fact that these calculations use extra capability unavoidable in achieving the primary mission of direct experimental control.

Figure 5. A plot produced by KPLOT (contour plot titled LI + H2, with a radial scale insert)

The disadvantages of the system for the quantum chemist come in two forms: foreground interference and peer pressure. Foreground interference of background batch and long term jobs occurs whenever the foreground tasks and non-resident program executions for experimental control assert their priority in the use of the CPU. On the average, this interference ties up the CPU 50% of the time during regular working hours (8 AM to 5 PM, Monday through Friday). It is also highly variable, ranging from no interference to as much as 55 minutes of interference per hour during regular hours. After regular hours, foreground interference is not a substantial problem. As described earlier, spooling, to permit I/O during foreground interference, and time sharing batch with certain foreground jobs will alleviate some of the pressure of foreground interference. However, foreground interference is a fundamental feature of the system.

Peer pressure constrains batch or long term usage because, in the open shop system, the length of time one user can tie up the batch or long term facilities is inversely proportional to the number of people in line waiting for the same facilities. Since there are 120 research scientists in the division, this is a substantial problem. During regular working hours, the number of batch users per hour ranges from 1 to 20 with an average of 12. In practice, a job requiring more than 5 or 10 minutes generally attracts a crowd of users waiting to run jobs of less duration. After regular working hours, this is much less of a problem. For the long term mode of operation, a week's usage has been given in Table I. In practice, a long term job in the system for longer than 24 hours during the work week would cause others with shorter long term jobs to complain. As described earlier, the establishment of a queue would loosen the constraints of peer pressure by allowing very long long term jobs to run with reduced priority relative to shorter long term jobs.

Foreground interference and peer pressure make it inconvenient at best and impossible at worst to run large scale quantum chemistry calculations on the Sigma 5 system. Such large scale computing requires access to either a standard large computer or a dedicated minicomputer. However, as our examples indicate, the Sigma 5 system is very well suited for small scale quantum chemistry calculations. It has a power, flexibility, and simplicity of operation, all at nominal cost, that would be difficult and expensive to match with dedicated minicomputers. Thus for those laboratories interested in both greater experimental automation and a wide range of small scale quantum chemistry computations, our experience suggests that bootlegging batch and long term computing on a system dedicated to experimental control is a feasible alternative to a collection of minicomputers.

Abstract

Computation in quantum chemistry and dynamics is being performed in batch and long term mode on a Sigma 5 computer whose primary task is to provide real-time instrument control, data-acquisition and final analysis for 26 on-line experiments.
A brief discussion will be given of the multi-programming operating system which provides, in order of priority, real-time interaction with a large number of concurrently running instruments, interactive graphics, time-sharing, batch and long term computation. The efficacy of this facility in three areas of computational chemistry will be reviewed. First, the analysis of wavefunctions and associated energies will be considered with several examples involving property calculations, analysis of potential curves, and least-squares fitting routines for potential energy surfaces. Next, dynamics programs for quantum elastic scattering and three body trajectory studies will be examined. Last, graphics (Calcomp Plots) programs will be discussed in regard to the display of potential energy curves and surfaces. The use of both batch and long term modes will be illustrated and several typical calculations discussed.
Acknowledgements

The primary programmer for POTFIT, STVTWC, WSPLOT, and KPLOT was Dr. Walter J. Stevens, now of the National Bureau of Standards in Boulder, Colorado. The program FCF was a minor adaptation of a program written by Dr. Patricia Dehmer of the Physics Division at Argonne National Laboratory.

Literature Cited

[1] Day, P. and Krejci, H., Proc. AFIPS FJCC (1968) 33, 1187-1196.
[2] Day, P. and Hines, J., Operating Systems Review (1973) 7 (4) 28-37.
[3] Day, P., Computer Networking and Chemistry, ACS Symposium Series 19, Peter Lykos, ed., (1975) 85-107.
[4] Chandler, J. P., Program #66 in QCPE Catalogue and Procedures (1974), X, 29.
[5] Hagstrom, Stanley, Program #9 in QCPE Catalogue and Procedures (1974), X, 19.
[6] Muckerman, J. T., Program #229 in QCPE Catalogue and Procedures (1974), X, 85.
16
An Effective Mix of Minicomputer Power and Large Scale Computers for Complex Fluid Mechanics Calculations
R. J. FREULER* and S. L. PETRIE**
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch016
The Aeronautical and Astronautical Research Laboratory, Ohio State University, 2300 West Case Road, Columbus, OH 43220

A "hybrid" computer system employing minicomputers linked to various large scale computers has been implemented to perform varied, complex calculations in fluid mechanics. The requirement for and application of such a system is not unique to the area of fluid mechanics. Similar motivation exists in theoretical chemistry and advanced physics. Basically, the increased power of computing equipment, the exponential rise in costs associated with experimental analyses, and the greater versatility of numerical experimentation have led researchers to turn to various computing techniques to examine physical phenomena. Although costs are also rising in many areas of the computer industry, the fact that computational speed has been increasing much faster than computational cost explains the trend of increased use of computers for research based on theoretical computations. The purpose of the present paper is to describe a unique "hybrid" computing system which has been assembled at the Aeronautical and Astronautical Research Laboratory (AARL) of The Ohio State University to perform numerical experimentation with fluid flows.

System Description and Background
The computing system is configured with two minicomputer mainframes: a Harris Corporation SLASH 5 and a Harris SLASH 4 connected in a non-redundant dual processor arrangement. Synchronous communication devices are utilized to link the SLASH computers with the desired large scale machine. AARL typically employs dial-up Remote Job Entry (RJE) to an IBM System/370 Model 168, although other types of mainframes such as those from Control Data Corporation (CDC) or Sperry Univac may be called upon as needed. The communication is accomplished with conventional dial-up modems so that the optimum connection of the minicomputers to a large scale machine can be obtained for the particular problem at hand. A detailed description of the two SLASH computers system of AARL is included as Appendix A to this paper. It should be noted that the description in Appendix A makes reference to a Harris SLASH 6, not a SLASH 4. The SLASH 4 was made available to AARL until the SLASH 6 delivery could be effected. As a result, the work reported here deals mainly with observations about and comparisons between the Harris SLASH 4 and the previously mentioned IBM System/370 Model 168. Some direct comparisons between the SLASH 4 and the SLASH 6 have been made, however, and will be reviewed later. It is notable that the Harris SLASH 4 has been selected by others for use in large scale computations (1), particularly in theoretical chemistry (2).
*Senior Computer Specialist, Member AIAA.
**Professor, Associate Director, AARL.
In 1973, AARL established its Digital Computer and Data Acquisition System after an extensive survey and benchmarks by Petrie (3). The original intent of the computer system was to provide an on-line real-time data acquisition and reduction facility utilizing a digital computer based system, replacing an analog computer of limited capability and questionable maintainability. This digital system, which is used to acquire and reduce experimental data from the varied wind tunnel testing facilities at AARL (4), is based on the Harris SLASH 5 and was installed for approximately $130,000 (in 1973), including the analog signal conditioning equipment and real-time peripheral gear. As is so often the case when the first digital computer is installed at a site, the routine use of the computer system was expanded into new areas at AARL. Soon, the real need for the availability of extensive computer power in support of theoretical fluid mechanics and other large scale numerical calculations was recognized. This need is being satisfied by the addition of a Harris SLASH 6 processor configured in the dual processor arrangement with the SLASH 5. The addition of the SLASH 6, including an 80 Mbyte disc drive, another 9 track magnetic tape drive, a 36 inch drum plotter, and several other peripherals, amounts to approximately $180,000. Thus the entire system represents investments totaling $310,000 in two phases.
SLASH 4 vs. IBM System/370 Model 168
As related to large scale computations, the AARL SLASH computers are currently employed to examine the performance of aerodynamic surfaces, usually airfoil sections or wings, under varying fluid mechanical conditions. These analyses are accomplished with a library of some two dozen or more computer programs, each one of which can perform specific types of calculations. All of these programs are coded in the FORTRAN language. Some are derivatives and modifications of earlier versions and undoubtedly contain portions of inefficient and dead code. Most, including those used for comparisons to be made here, are more or less typical of FORTRAN based large scale computational codes written by researchers first and computer programmers second. The codes are typified by moderate to large memory requirements due to heavy usage of arrays, and they require substantial floating point processing power. The input/output (I/O) requirements are moderate in most cases, usually consisting of a few card images input and a couple of thousand lines printed output. Since the programs vary greatly in their memory requirements, numerical stability, and run times, no one machine can be expected to perform in an optimum way for all programs which might be used in the analysis of an airfoil or wing section.
The SLASH 4, however, with its 48 bit floating point word with a 39 bit mantissa providing 10+ digit accuracy, is well suited to most fluid mechanics calculations. On the other hand, the IBM System/370 single precision floating point word length of 32 bits with 24 bit mantissa is often not long enough, requiring use of double precision (64 bit floating point word length with 56 bit mantissa) with an accompanying increase in memory needs and run times. Maximum use of the SLASH computers is employed since the cost per run is generally less than that for the larger computer, even though the run times are longer with the SLASH 4.
A representative list of execution time comparisons between the SLASH 4 and the IBM 370/168 for several cases of four different airfoil analysis codes is presented in Table I. Each code is identified by a unique letter and each case executed by the code is summarized. The execution times indicated are for case execution only; compile and link-edit or catalog time is not included.

TABLE I
HARRIS SLASH 4 AND IBM SYSTEM/370 MODEL 168 COMPARISONS

Program   Case       Execution Time in CPU sec.     Percent
Code      Number     IBM 370/168      SLASH 4       Slower*
C**       I             114.30         728.20        537.1%
C**       II             82.19         498.26        506.2%
C         III           149.84        1912.55       1176.4%
E         I              18.89         157.28        732.6%
E         II             18.78         172.86        820.5%
E         III            26.49         296.26       1018.4%
E         IV             45.47         427.05        839.2%
K         I             136.28        1207.26        785.9%
K         II             69.12         732.79        960.2%
N         I             123.53        1031.71        735.2%
Totals/Average          784.86        7164.22        812.8%

*SLASH 4 is slower than IBM 370/168 by X%, based on IBM 370/168.
**Compiler used on IBM for this case was FORTRAN G1. All other cases used FORTRAN H Extended on IBM with the maximum optimization level.

As mentioned earlier, all the codes listed are written in FORTRAN and draw on a manufacturer supplied library of FORTRAN arithmetic support routines (SIN, COS, ALOG, etc.). All programs executing on the IBM machine were run in IBM single precision. On the SLASH 4, although simple arithmetic operations are always performed with a 39 bit mantissa, the arithmetic support library offers either single precision routines with a 24 bit mantissa accuracy, or double precision routines with the full 39 bit mantissa accuracy. It was determined that results produced by Programs C and K would be improved by using 39 bit mantissa accuracy for all calculations. On the SLASH 4, the change from single precision to double precision arithmetic routines is a simple matter of selecting a compile option, and so this was done. Since the floating point word on a Harris SLASH computer is always 48 bits, the increase in memory requirements as a result of selecting this double precision compile-time option is very small and is caused by the slightly longer double precision arithmetic routines. The mechanism here is that the 24 bit mantissa accuracy in the single precision library routines is extended to the full 39 bit accuracy of the normal Harris SLASH series floating point word.
The programs were run in an overlay structure on the SLASH 4 because of their memory requirements. No overlay structure was used on the IBM machine. No attempt has been made to adjust execution times to account for non-overlaying on the IBM 370/168 and overlaying on the SLASH 4. Overlaying on the SLASH 4 is the difference between being able to obtain results or not being able to run at all for these programs. A fairer comparison of what is required in terms of CPU seconds for each of the computers results from not adjusting for such factors as non-overlay vs. overlay. It would be expected that overlay structures usually require more disc I/O operations and longer wall clock execution times, but have only a slight effect on actual CPU seconds.
Referring to Table I, it can be seen that the SLASH 4 runs only about 8 times slower than the IBM System/370 Model 168 on the average for the cases presented.
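The "Percent Slower" column of Table I follows from the footnoted definition, (SLASH 4 time - IBM time) / IBM time. The short Python sketch below recomputes it for a few rows and for the printed totals. It is offered only as an illustrative check; the function and variable names are assumptions of this sketch and do not come from the original programs.

```python
# CPU seconds transcribed from Table I: (code, case, IBM 370/168, SLASH 4).
# Only a few rows are shown; the totals are the values printed in the table.
ROWS = [
    ("C", "II", 82.19, 498.26),
    ("C", "III", 149.84, 1912.55),
    ("E", "I", 18.89, 157.28),
]
IBM_TOTAL = 784.86
SLASH4_TOTAL = 7164.22


def percent_slower(slow_time, fast_time):
    """Percent by which slow_time exceeds fast_time, based on fast_time."""
    return 100.0 * (slow_time - fast_time) / fast_time


for code, case, ibm, slash4 in ROWS:
    print(f"{code} {case:>3}: {percent_slower(slash4, ibm):7.1f}% slower")

overall = percent_slower(SLASH4_TOTAL, IBM_TOTAL)
time_ratio = SLASH4_TOTAL / IBM_TOTAL
print(f"Totals: {overall:.1f}% slower, time ratio {time_ratio:.2f}")
```

On the totals, the SLASH 4 takes roughly nine times the IBM CPU time, which corresponds to the figure of about 800% slower quoted in the text.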
Since the same FORTRAN Compiler was used for all cases on the SLASH 4, it is notable that the SLASH 4 runs only about 5 times slower than the IBM 370/168 when the FORTRAN G1 Compiler is used on the IBM machine. Alternatively, it would appear that the G1 Compiler generates much less efficient machine code than the H Extended Compiler. The worst case comparison points out that the difference between the SLASH 4 and IBM 370/168 is only a factor of 12.
The Effective Mix
Because the SLASH 4 compares favorably with the IBM 370/168, maximum use of the SLASH computer is employed since the cost per run is generally less than for the IBM machine, even though the run times are longer by an average factor of 8 with the SLASH 4. Program development and initial checkout are conducted with the SLASH computers with a resultant lowering of program development costs. The Harris interactive alphanumeric editing package combined with the FORTRAN IV extended compiler running under control of the Harris Disc Monitor System (DMS) provides an excellent means for program development, including program source editing and updating, program compiling for syntactical error correcting, and program execution for debugging and checkout purposes. Because the computer utilization charging rates are considerably lower for the smaller AARL computing system than for most large mainframes, program development costs have been reduced appreciably.
While many of the calculations can be conducted with the SLASH computers system, there are still a few of the programs within the fluid mechanics library which have excessive memory requirements and/or very long run times. In these cases, each program is optimized for whichever large mainframe produces the best cost-performance. Usually, this requires minor rewriting of the program code to provide the best tradeoffs between execution speed and storage requirements. This involves taking advantage of hardware or software features of the particular computer system on which the program is being run. It has been found that programs which do not suffer from numerical significance problems will show marked improvement in performance when fine-tuned for use on an IBM System/370 Model 168 as compared to a version for use on the available CDC Cyber 73 machine, and will result in a lowered cost per run. These results stem directly from the differences in word size employed for single precision arithmetic between the IBM 370/168 and the CDC Cyber 73, and also from the performance differences between the two mainframes.
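The role of word size in numerical significance can be made concrete by converting a mantissa length in bits to an approximate number of decimal digits, digits = bits x log10(2). The sketch below applies this to the mantissa lengths quoted earlier in the paper. It is an upper-bound estimate under stated assumptions: it ignores whether a sign bit is counted in the quoted length, and it ignores the hexadecimal normalization used by IBM System/370 floating point, which can reduce the effective single precision accuracy further. The names used are illustrative only.

```python
import math

# Mantissa lengths in bits as quoted in the paper.
MANTISSA_BITS = {
    "IBM System/370 single precision": 24,
    "Harris SLASH floating point word": 39,
    "IBM System/370 double precision": 56,
}


def decimal_digits(mantissa_bits):
    """Approximate decimal digits represented by a binary mantissa."""
    return mantissa_bits * math.log10(2)


for name, bits in MANTISSA_BITS.items():
    print(f"{name:34s} {bits:2d} bits  ~{decimal_digits(bits):4.1f} digits")
```

Under these assumptions the 39 bit mantissa carries at most about 11.7 decimal digits, consistent with the "10+ digit accuracy" stated for the SLASH 4, while the 24 bit mantissa carries at most about 7.2 digits.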
The large mainframe fine-tuning or optimization process is most often applied to versions of programs running on the IBM 370/168. The major optimization is performed automatically by using the IBM FORTRAN H Extended Compiler. The comparison cases reviewed earlier demonstrated a significant performance increase obtained by using H Extended instead of G1. The extra time and cost required for an H Extended compile step is repaid, sometimes several times over, by the savings obtained in the resultant execution. The H Extended Compiler is routinely used for "production" versions of programs and is even helpful during final stages of program checkout by offering good diagnostics and a cross reference capability.
Minor rewriting of the code is also performed to allow the compiler optimization process to be carried to the fullest possible extent. Increased execution efficiency results when input/output operations are performed on variables stored in contiguous storage locations, such as a COMMON block. The number of separate COMMON blocks is kept as small as possible and the variables ordered such that the largest arrays occur last in the block. This allows fewer base registers and less frequent base register loads, improving performance. In general, references to higher-dimensional arrays are slower than references to lower-dimensional arrays. Thus use of several one-dimensional arrays is more efficient than a single two-dimensional array if the two-dimensional array can logically be treated as a set of one-dimensional arrays. The use of EQUIVALENCE statements is avoided where possible since equivalenced variables weaken the optimization processes. Finally, on IBM machines, a logical IF statement will generate equivalent or better machine code than the corresponding arithmetic IF statement for simple comparisons. These considerations then form the basis for fine-tuning programs for an IBM machine.
A philosophy has been developed and is being refined for the routine use of interactive graphic display devices for the viewing of the numerical results. The results of, say, a pressure distribution calculation over a complex aerodynamic surface can be viewed more effectively with graphic displays rather than in conventional tabular forms. Such review of the data usually mandates that a minicomputer system be available; the general paucity of interactive graphics capability and the high cost of such graphic operations in large central systems are well known to central system users.
The SLASH computers system outlined here includes a large drum plotter, a high speed storage type CRT display with graphic capability, and several interactive terminals of either the teletype or alphanumeric CRT variety. AARL is in the process of fully implementing a software system which can access data returning from either a remote host site or from the in-house SLASH computers. The mechanism here is that a dynamically created disc file on the SLASH computers system is used to save the data, regardless of which machine was used to generate the results. This file can then be accessed by a post-processing program operating in the minicomputer system for the purposes of previewing the results, usually displayed in a plotted form, prior to committing the data to hard copy on the drum plotter. This allows the researcher to have a closer interaction with his program, which usually operates in a batch environment because of its memory and/or run time requirements. If the previewed results indicate that perhaps the program input specifications should be modified, this can be done by editing the input parameters at the interactive terminal and then resubmitting the job to whichever machine (i.e., locally to the SLASH 4 or via RJE to the large mainframe) is being used for the calculations.
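The save-then-preview mechanism described above can be sketched in a few lines of modern Python. This is a minimal illustration only, not the AARL software: the file format, the function names `save_results` and `preview`, and the sample values are all hypothetical. The point shown is that the results land in one common file whichever machine produced them, and that a post-processing step reads that file to give a quick look before any hard copy is made.

```python
import json
import os
import tempfile


def save_results(path, source, values):
    """Write results to the common file, recording which machine produced them."""
    with open(path, "w") as f:
        json.dump({"source": source, "values": values}, f)


def preview(path, width=40):
    """Return a crude bar plot (one text line per value) for inspection."""
    with open(path) as f:
        data = json.load(f)
    peak = max(abs(v) for v in data["values"]) or 1.0
    return ["#" * round(width * abs(v) / peak) for v in data["values"]]


# Hypothetical values returned by either the local machine or the RJE host.
path = os.path.join(tempfile.mkdtemp(), "results.json")
save_results(path, "RJE host", [-2.0, -1.0, 0.0, 0.5])
lines = preview(path)
for line in lines:
    print(line)
```

If the preview suggests a change to the inputs, the same file path can simply be overwritten by the next run, which mirrors the edit-and-resubmit cycle described in the text.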
The operating system of the SLASH computers allows any interactive terminal to submit jobs to the local batch stream or to the RJE queue. By this "interactive batch" technique, the results from large scale computations can be stored, saved, previewed, and optionally committed to hard copy at a significant cost savings over that incurred by using only the capabilities of a large central system mainframe.
The SLASH Family and a SLASH 6 vs. SLASH 4 Comparison
The Harris family of computers began with the Datacraft 6024/1 processor announced in 1968. (Datacraft Corporation became a division of the Harris Corporation in 1974.) The 6024/1 was a 600 nanosecond machine, an excellent processor for large scale computations. The 6024/3 (1 µsec.) followed in 1970, with the SLASH 5 (950 nsec.) coming in 1971, the SLASH 4 (750 nsec.) with virtual memory in 1973, and the SLASH 7 (400 nsec.) in 1975. Each new machine basically offered a central processor architecture similar to its predecessor while incorporating some available new technology and adding new features. The SLASH 6, announced in June 1976, is based on a completely different processor architecture, utilizing a microprogrammed asynchronous CPU with a 48 bit central system bus structure. Additional SLASH 6 information appears in Appendix A.
The earliest information about the Harris SLASH 6 processor indicated that perhaps it might be as much as 20% faster than the SLASH 4. As shown by Table II, this is clearly not the case, as the SLASH 6 runs slightly but consistently slower for all the jobs listed in the table. The times shown are for the total job time for identical jobs on the two SLASH computers. The versions of the FORTRAN Compiler, Assembler, Cataloger, and support library were the same on the SLASH 6 as on the SLASH 4 with one exception as noted, and compiler options were identical. The jobs reflected a range from simple compile and catalog (JOB5, JOB6) to a compile, catalog and execution of one of the airfoil analysis codes (JOB4) requiring heavy floating point operations.

TABLE II
Harris SLASH 4 and Harris SLASH 6 Comparisons

Job Stream        Job Time (sec)   Job Time (sec)   Percent
Identification    SLASH 4          SLASH 6          Slower+
Job 1                80.591           85.008         5.481%
Job 2                15.450           16.157*        4.576%
Job 3                19.360           20.983         8.383%
Job 4              1174.862         1210.234         3.011%
Job 5                 4.676            4.981         6.523%
Job 6                 5.337            5.714         7.064%
Totals             1300.276         1343.077         3.292%**

NOTE: All comparisons are for FORTRAN Compiler Version 24 except as indicated below.
*Version 26 Compiler.
**Reflects mostly time of Job 4.
+SLASH 6 is slower than SLASH 4 by X%, based on SLASH 4.

AARL has tested a single program on several computers and the results are given in Table III. This "benchmark" program was used in evaluating the candidate computer systems for the first phase of the AARL Digital Computer and Data Acquisition System in 1973. More recently, it has been run on the SLASH 4, the SLASH 6, an IBM 370/168, and a CDC Cyber 73. While there is always a question as to what any given single benchmark program actually tests, the speed of execution indicates the efficiency of the instruction set and how well the compiler optimizes the coding. This FORTRAN program was constructed with no particular machine in mind, and it should not be used as an overall recommendation nor condemnation for any specific computer. The execution times listed are for the case solution time only; no compile time or link-edit time is included. The program has no required inputs and produces little printed output; it is therefore compute bound and the results reflect mostly compiler generated code efficiency and central processor speed differences.

TABLE III
Comparisons of Several Computers for a Single Program

Computer        Hardware Floating Pt.   Word Size   Execution Time
Tested          Available/Used          (Bits)      (Seconds)
IBM 370/165     Yes/Yes                 32            13.24
IBM 370/168     Yes/Yes                 32            11.39
CDC 6400        Yes/Yes                 60            68.43
CDC Cyber 73    Yes/Yes                 60            52.28
DEC PDP-15      Yes/Yes                 18           340*
GA SPC-16/65    No/No                   16           970*
SLASH 1         Yes/No                  24           146.65
SLASH 3         Yes/Yes                 24            90**
SLASH 3         Yes/No                  24           244**
SLASH 5         No/No                   24           244.29
SLASH 4         Yes/Yes                 24            60.11
SLASH 4         Yes/No                  24           183.22
SLASH 6         Yes/Yes                 24            64.75

NOTE: Time determined by timing subroutine unique for each machine, except as indicated below.
*Timing mechanism unknown.
**Timing determined by stop watch.
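As an aid to reading Table III, the sketch below normalizes a few of the listed execution times to the IBM 370/168 and estimates the gain from using the hardware floating point unit on the SLASH 4. It is an illustrative calculation only; the dictionary and function names are assumptions of this sketch, and the caveats stated above about single benchmark programs apply equally to these ratios.

```python
# Execution times in seconds as listed in Table III
# (rows in which hardware floating point was used).
TIMES = {
    "IBM 370/168": 11.39,
    "CDC Cyber 73": 52.28,
    "SLASH 4": 60.11,
    "SLASH 6": 64.75,
}
# SLASH 4 row in which hardware floating point was available but not used.
SLASH4_WITHOUT_HARDWARE_FP = 183.22


def relative_time(machine, reference="IBM 370/168"):
    """Execution time of `machine` as a multiple of the reference machine."""
    return TIMES[machine] / TIMES[reference]


for name in TIMES:
    print(f"{name:13s} {relative_time(name):5.2f} x IBM 370/168")

fp_gain = SLASH4_WITHOUT_HARDWARE_FP / TIMES["SLASH 4"]
slash6_vs_slash4 = TIMES["SLASH 6"] / TIMES["SLASH 4"]
print(f"SLASH 4 hardware floating point gain: {fp_gain:.2f}x")
print(f"SLASH 6 time relative to SLASH 4:     {slash6_vs_slash4:.3f}x")
```

For this one program the SLASH 4 takes about 5.3 times the IBM 370/168 time, and the SLASH 6 is a few percent slower than the SLASH 4, in line with the trend of Table II.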
Maintenance and Operating Costs
Both software and hardware maintenance on the dual SLASH processor system are done in-house. A computer technician devotes approximately 80% of full time to corrective and preventative maintenance and to the design of new device interfaces. A single system analyst spends approximately 50% of full time on software maintenance and development related to the overall system (i.e., work that cannot be related to a single research project). These efforts plus the nominal cost of expendable supplies result in an average monthly maintenance and operating cost of approximately $1800. This cost appears to be relatively independent of the system size. That is, our operating costs did not change appreciably when the SLASH 4 processor subsystem was added.
The computer facility was financed by The Ohio State University. The capital equipment and implementation costs are being recovered with connect rate charges to all users. Since the system is used for data acquisition as well as straight numerical computations, wall-clock time accounting cannot be used. Instead, the concept of connect time is employed, where users are charged if they are connected to the system. For example, a user who is connected to an A/D converter must be charged for use of the system, even though data acquisition is not in progress, since his connection precludes use of that portion of the system by others.
The charging scheme is designed to recover the installation costs over a five year period. For the original SLASH 5 system this required an average recovery of $2000 per month at a connection rate charge of $45/hour. For the dual processor configuration, a recovery of $5200 per month is required with a connection rate charge of $76/hour. Over the last year, we have had little difficulty in meeting the scheduled cost recovery.
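The recovery figures above imply a certain number of billed connect hours per month, which can be worked out directly. The sketch below is an illustrative calculation using only the monthly recovery amounts, hourly rates, and five year period quoted in the text; the names are assumptions of this sketch, and it does not account for the separate maintenance and operating cost of approximately $1800 per month.

```python
RECOVERY_MONTHS = 5 * 12  # five year recovery period

# (required monthly recovery in dollars, connect rate in dollars per hour)
PHASES = {
    "original SLASH 5 system": (2000, 45),
    "dual processor configuration": (5200, 76),
}


def recovered_total(monthly_recovery):
    """Total dollars recovered over the full recovery period."""
    return monthly_recovery * RECOVERY_MONTHS


def connect_hours_per_month(monthly_recovery, hourly_rate):
    """Billed connect hours needed each month to meet the recovery target."""
    return monthly_recovery / hourly_rate


for name, (monthly, rate) in PHASES.items():
    total = recovered_total(monthly)
    hours = connect_hours_per_month(monthly, rate)
    print(f"{name:30s} ${total:,} over 5 years, {hours:.1f} connect hours/month")
```

The dual processor target of $5200 per month comes to $312,000 over five years, close to the $310,000 total investment quoted earlier, and corresponds to roughly 68 billed connect hours per month. The $2000 per month figure comes to $120,000, somewhat below the $130,000 first phase cost; the paper does not say how the difference was treated.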
Conclusions
The AARL approach to complex numerical experimentation allows great flexibility in the choice of computer to conduct a set of specific calculations. The dial-up capability to other computing systems, combined with the in-house computing power available, has provided significant advantages over more conventional arrangements which employ either a large, single mainframe computer or a dedicated minicomputer system. In the dial-up mode of operation utilizing modems operating over standard telephone lines, the AARL computing system can interact with any host computing site which can support communications from a remote terminal. The SLASH computers used at AARL offer a cost-effective alternative to large scale machines for a large majority of the fluid mechanics calculations performed at AARL. The Harris SLASH 4 or the Harris SLASH 6 is well suited for the varied tasks of program development, "production" running, and Remote Job Entry communications with the large machines. While the approach described has been used for numerical experiments in fluid mechanics, it can be applied to any discipline requiring extensive numerical calculations.
Appendix A. The Digital Computer and Data Acquisition System.
The AARL Digital Computer and Data Acquisition System is an example of state-of-the-art techniques in the computer and electronics fields applied to experimentally and theoretically oriented research. The major components of the data acquisition and reduction portion of the system and the inter-relationships among the devices and the central processing units are shown schematically in Figure App-1 below. The system can be broken into four groups of components for descriptive purposes: (1) the analog front end consisting of various analog and signal conditioning devices; (2) the central processing units (CPU); (3) the various input and output peripheral devices (I/O devices) to handle assorted I/O functions associated with more typical computer systems; and (4) the Remote Job Entry (RJE) subsystem which enables communication with any remote host computer in a dial-up mode of operation.

Figure 1. The AARL digital computer and data acquisition system

The analog front end serves to interface continuous analog signals to the data acquisition central processing unit in digital (discrete) form. Analog signals enter the central patch panel where they may be routed as desired to the various signal conditioning devices and/or converters. Two analog-to-digital (A/D) converter systems are available: (1) a high speed multiplexed A/D converter system which accepts up to 128 differential high-level inputs with a full scale voltage of +10V, provides a resolution of 10 bits, has an input impedance of 100 megohms, and has a throughput rate of 100 kHz; (2) a medium speed multiplexed converter system which accepts up to 64 differential low-level inputs with a full scale voltage of +1000 millivolts, provides a resolution of 12 bits, and has a throughput rate of 8 kHz. The latter system has 8 program-controllable gain ranges and is transformer coupled to allow high common mode voltages. Eight pre-amplifiers are currently available for signal conditioning and ten bridge balance and span control units are included to accommodate strain gage bridge type sensors. A 5 channel digital-to-analog (D/A) system is also available which can be used for transmission of various control signals. Part of the D/A system is presently used to drive an analog X-Y plotter. Also included are 16 digital relay outputs, 8 discrete input switches, and a programmable 10 kHz interval timing unit.
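The converter figures above can be turned into two quantities an experimenter usually wants: the sampling rate left for each channel when a multiplexed converter scans all of its inputs, and the voltage step represented by one count. The sketch below is illustrative only. It assumes the throughput is shared evenly among the scanned channels and treats the full scale value as a unipolar span unless `bipolar=True` is passed; the source text does not settle whether the printed "+" denotes a unipolar or a bipolar range. Function names are hypothetical.

```python
def per_channel_rate_hz(throughput_hz, channels_scanned):
    """Sampling rate per channel, assuming an even scan of all channels."""
    return throughput_hz / channels_scanned


def lsb_size(full_scale, bits, bipolar=False):
    """Size of one converter count in the units of full_scale.

    Assumption: ideal converter; span is full_scale (unipolar)
    or 2 * full_scale (bipolar).
    """
    span = 2 * full_scale if bipolar else full_scale
    return span / (2 ** bits)


# High speed system: 128 inputs, 100 kHz throughput, 10 bits, 10 V full scale.
fast_rate = per_channel_rate_hz(100_000, 128)
fast_lsb_volts = lsb_size(10.0, 10)

# Medium speed system: 64 inputs, 8 kHz throughput, 12 bits, 1000 mV full scale.
slow_rate = per_channel_rate_hz(8_000, 64)
slow_lsb_millivolts = lsb_size(1000.0, 12)

print(f"high speed:   {fast_rate:.2f} Hz/channel, {fast_lsb_volts * 1000:.2f} mV/count")
print(f"medium speed: {slow_rate:.2f} Hz/channel, {slow_lsb_millivolts:.3f} mV/count")
```

Scanning fewer channels raises the per-channel rate proportionally, which is why the channel count is a parameter rather than a constant.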
The AARL Digital Computer and Data Acquisition System utilizes two central processing units which are operated in a non-redundant dual processor configuration. One processor is assigned the on-line data acquisition and reduction tasks while the second generally is assigned most other tasks including but not limited to off-line data reduction, program development and maintenance, and heavy floating point scientific calculations. The processors are directly connected via a CPU-to-CPU link and in addition, they share a disc cartridge mass storage device.

The data acquisition central processing unit is a Harris Corporation SLASH 5 and consists of an arithmetic unit, control unit, interface elements for the planar core memory, and the input-output channel interface. The processor has a 950 nanosecond full cycle time and a fixed word length of 24 bits plus parity. There are over 120 generic instruction types available at the assembly language level. The SLASH 5 operates on and from 24 bit data and instruction words. The SLASH 5 employs a multi-access bus structure, fully parallel binary arithmetic, fully buffered I/O channels, and single address capability direct to 96 K bytes and indirect and/or indexed to 192 K bytes.
There are five general purpose registers, three of which may be used for indexing which is performed without a speed penalty. Memory may be accessed at the word, double word, and byte levels. Memory size in this processor is 32,768 words (32K) and is expandable to 64K words.

The second central processing unit is a Harris Corporation SLASH 6, the newest member of the Harris SLASH Series family. The SLASH 6 offers total software compatibility with the SLASH 5 but offers a microprogrammed architecture asynchronous CPU with a central system bus structure. Other state-of-the-art SLASH 6 features include MOS memory with error correction, bipolar microprocessor Arithmetic-Logic Unit (ALU), and microcode execution PROMS. The ALU is comprised of six high-speed microprocessor chips, each representing a 4 bit logic slice. The auxiliary PROMS are utilized for instruction decoding and subsequent microcode execution, resulting in program and feature compatibility with the Harris SLASH 5 processor. Memory size in this processor is 65,536 words (64K) and is expandable to 256K words via a demand paging virtual memory option.
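The word counts given for the two processors can be cross-checked against the byte figures quoted for SLASH 5 addressing (96 K bytes direct, 192 K bytes indirect or indexed). The sketch below does the conversion. It assumes 8-bit bytes, so three bytes per 24 bit word, and K = 1024; the parity bit is left out. The helper name is hypothetical.

```python
WORD_BITS = 24                    # word length stated in the text (parity excluded)
BYTES_PER_WORD = WORD_BITS // 8   # assumption: 8-bit bytes, so 3 bytes per word


def words_to_kbytes(words):
    """Convert a word count to K bytes (K = 1024)."""
    return words * BYTES_PER_WORD / 1024


slash5_installed = words_to_kbytes(32_768)      # 32K words of core memory
slash5_maximum = words_to_kbytes(64 * 1024)     # expansion limit of 64K words
slash6_installed = words_to_kbytes(65_536)      # 64K words of MOS memory
slash6_virtual = words_to_kbytes(256 * 1024)    # 256K words with demand paging

# Six 4 bit slices give an ALU as wide as one machine word.
alu_width_bits = 6 * 4

print(slash5_installed, slash5_maximum, slash6_installed, slash6_virtual)
print(alu_width_bits == WORD_BITS)
```

On these assumptions the installed 32K words correspond to the 96 K bytes reachable by direct addressing, and the 64K word maximum corresponds to the 192 K bytes reachable by indirect or indexed addressing.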
The input-output system, exclusive of the devices which comprise the analog front end, consists of the following peripheral devices: (1) a removable pack disc system with 70.5 megabytes formatted capacity and a 342.7 kHz transfer rate; (2) a cartridge disc system including one fixed disc platter and one removable disc cartridge each with a 5.4 megabytes formatted capacity and a 89.5 kHz transfer rate; (3) a dual density 800/1600 bits/inch 9 track industry compatible magnetic tape drive with a nominal tape speed of 75 inches/second and vacuum column tape handling; (4) an 800 bits/inch 9 track magnetic tape drive with a nominal tape speed of 45 inches/second; (5) a 300 cards/minute card reader; (6) a 135 characters/line, 400 lines/minute line printer; (7) an ASR-33 standard teletype with paper tape facilities; (8) a Tektronix 4010 cathode ray tube (CRT) operating over an asynchronous interface at a 9600 baud rate providing graphic as well as alphanumeric display capabilities; and (9) a four pen 36 inch drum type plotter offering 0.0025 inch resolution and drawing speeds of up to 8 inches/second on a major axis.

The Remote Job Entry (RJE) subsystem, which is shown schematically in Figure App-2, supports communication with a remote host computer. Such communications are carried out concurrently with other computer tasks including real-time data acquisition and reduction. Included in the RJE subsystem are: (1) a synchronous controller with baud rate to 9600 bits/second; (2) a Bell system compatible modem with dial-up telephone data set; and (3) a CRT display device providing 24 lines with 80 characters/line of display.
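A rough estimate shows how the RJE link speed compares with the local card reader and line printer that feed it. The sketch below is a back-of-the-envelope calculation, not a model of any RJE protocol. It assumes 80-column cards, 8 bits per character, the maximum 9600 bits/second line rate, and no protocol or turnaround overhead; the deck size is a hypothetical example.

```python
LINK_BITS_PER_SECOND = 9600   # maximum synchronous controller rate from the text
CARDS_PER_MINUTE = 300        # card reader rate from the text
LINES_PER_MINUTE = 400        # line printer rate from the text
CHARS_PER_LINE = 135          # line printer width from the text
CHARS_PER_CARD = 80           # assumption: standard 80-column cards
BITS_PER_CHAR = 8             # assumption: 8 bits per character, no overhead


def link_seconds(cards):
    """Time to send a card deck over the link at the full line rate."""
    return cards * CHARS_PER_CARD * BITS_PER_CHAR / LINK_BITS_PER_SECOND


def reader_seconds(cards):
    """Time for the card reader to read the same deck."""
    return cards * 60 / CARDS_PER_MINUTE


deck = 2000  # hypothetical job deck size
send_time = link_seconds(deck)
read_time = reader_seconds(deck)

link_chars_per_second = LINK_BITS_PER_SECOND / BITS_PER_CHAR
printer_chars_per_second = LINES_PER_MINUTE * CHARS_PER_LINE / 60

print(f"link:   {send_time:.1f} s, reader: {read_time:.1f} s")
print(f"link:   {link_chars_per_second:.0f} chars/s, printer: {printer_chars_per_second:.0f} chars/s")
```

Under these assumptions the card reader and the line printer, rather than the telephone line, set the pace of a job at the full 9600 bits/second rate; at lower dial-up rates the balance shifts toward the line.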
Figure 2. Remote job entry subsystem of AARL digital computer and data acquisition system
The CRT is used for RJE operator communications but may be utilized in an interactive fashion when RJE is not in progress. The card reader and line printer are used to support RJE activities as needed. The RJE subsystem can operate under three disciplines:
* CDC UT-200 for the 6000/7000 series
* IBM 2780/3780 for the System 360/370 series
* UNIVAC 1004 for the 1100 series

Each processor in the AARL Digital Computer and Data Acquisition System is under the control of the Harris Series 6000 Disc Monitor System (DMS). DMS is a real-time operating system that provides foreground multiprogramming concurrent with sequential batch processing in the background. The foreground is designed for application-related programs which could control a wind tunnel, process real-time data from an acoustic experiment, or interact with multiple terminal users in either a local or remote fashion. These programs receive highest priority and their requirements are met first. Batch processing is conducted in the background and is never time-critical so that background is serviced when processor time and memory space are available.
Salient features of the DMS are:
* Dynamic loading of foreground programs
* Dynamic memory allocation services
* Dynamic spooled I/O for any list output device
* Optional spooled job stream input from any input device
* Full file security for every user including read, write and delete protection modes with optional password
* Re-entrant foreground program capabilities
* Program priority structure that governs the allocation of memory, disc files, and processor time; 255 priority levels
* Time slicing among programs executing at the same priority
* Program communications via a special Common area, initiation parameter passing, or a program switch word
* Timer scheduling of periodic foreground programs
* Automatic checkpointing and reloading of the background memory area as required by activation of non-resident foreground programs
* Re-entrant editor package for terminal users
* Complete memory protection of all inactive programs from currently active programs
* Concise job control language for batch processing
* Complete operator control over the system environment via the console typewriter or CRT
* System file manager that maintains program and data files in source and object formats
* FORTRAN IV interface routines for foreground services
* Sequential, indexed sequential, and direct random access methods for data files on disc
* Optional automatic disc file compression and blocking
* Overlay Link Cataloger that prepares and stores programs on disc in a format designed for rapid loading and relocation
* RJE subsystem protocol interpreter which performs most of the normal operator functions automatically but does not require a dedicated terminal
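Two of the features listed above, the 255-level priority structure and time slicing among programs at the same priority, can be illustrated with a toy dispatcher. The sketch below is not the DMS implementation and does not use any Harris interface. It assumes that a larger number means a higher priority, models the background batch job simply as the lowest-priority program, and ignores preemption, memory, and disc allocation. All class, method, and program names are hypothetical.

```python
from collections import deque


class PriorityScheduler:
    """Toy dispatcher: highest priority first, round-robin within a level."""

    LEVELS = 255  # number of priority levels quoted in the text

    def __init__(self):
        # One FIFO queue per priority level; index 0 is unused.
        self.queues = [deque() for _ in range(self.LEVELS + 1)]

    def add(self, name, priority, slices_needed):
        if not 1 <= priority <= self.LEVELS:
            raise ValueError("priority must be in 1..255")
        self.queues[priority].append([name, slices_needed])

    def run(self):
        """Return the order in which time slices are granted."""
        trace = []
        while True:
            # Assumption: larger number means higher priority.
            level = next((q for q in reversed(self.queues) if q), None)
            if level is None:
                return trace
            job = level.popleft()
            trace.append(job[0])
            job[1] -= 1
            if job[1] > 0:
                # Unfinished work goes to the back of its own level,
                # which gives time slicing among equal priorities.
                level.append(job)


sched = PriorityScheduler()
sched.add("wind_tunnel_control", 200, 2)   # foreground program
sched.add("acoustic_data", 200, 2)         # foreground program, same priority
sched.add("batch_job", 1, 2)               # background batch work
trace = sched.run()
print(trace)
```

In this run the two foreground programs alternate slices until both finish, and only then does the batch job receive processor time, which mirrors the statement that background is serviced when processor time is available.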
The computer system is operated in an "open-shop" mode. All users have full access at both the hardware and software levels to the majority of the features of the system. The computer system is used extensively in on-line, real-time, interactive data acquisition and reduction.

Literature Cited

1. Robinson, A. L.: "Computational Chemistry: Getting More from a Minicomputer", Science, (1976), 193, pp. 470-472.
2. Schaeffer, H. F.: "Are Minicomputers Suitable for Large Scale Scientific Computation?", paper presented to 11th Annual IEEE Computer Society Conference, Washington, D. C., September 1975.
3. Petrie, S. L.: "Design of a Digital Data Acquisition System", paper presented to the 39th Semi-Annual Meeting of the Supersonic Tunnel Association, Bethesda, Maryland, March 1973. (Referenced by author's permission).
4. Freuler, R. J.: "State of the Art Data Acquisition and Reduction Techniques for Transonic Airfoil Testing", paper presented to 6th International Congress on Instrumentation in Aerodynamic Simulation Facilities (ICIASF), Ottawa, September 1975.