Reviews in Computational Chemistry Volume 12
Edited by
Kenny B. Lipkowitz and Donald B. Boyd
WILEY-VCH
NEW YORK · CHICHESTER · WEINHEIM · BRISBANE · SINGAPORE · TORONTO
Kenny B. Lipkowitz
Department of Chemistry
Indiana University-Purdue University at Indianapolis
402 North Blackford Street
Indianapolis, Indiana 46202-3274, U.S.A.
[email protected]

Donald B. Boyd
Department of Chemistry
Indiana University-Purdue University at Indianapolis
402 North Blackford Street
Indianapolis, Indiana 46202-3274, U.S.A.
[email protected]

The authors, editors, and John Wiley and Sons, Inc., its subsidiaries, or distributors assume no liability and make no guarantees or warranties, express or implied, for the accuracy of the contents of this book, or the use of information, methods, or products described in this book. In no event shall the authors, editors, and John Wiley and Sons, Inc., its subsidiaries, or distributors be liable for any damages or expenses, including consequential damages and expenses, resulting from the use of the information, methods, or products described in this work.
This book is printed on acid-free paper.

Copyright © 1998 by Wiley-VCH, Inc. All rights reserved.

Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4744. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 605 Third Avenue, New York, NY 10158-0012, (212) 850-6011, fax (212) 850-6008, E-Mail: PERMREQ@WILEY.COM.

ISBN 0-471-24671-9
ISSN 1069-3599

Printed in the United States of America

10 9 8 7 6 5 4 3 2 1
Preface

The sun is shining brightly on computational chemistry these days. One reason for believing this is that the methodologies of computational chemistry have now become intricately woven into the fabric of other disciplines of chemistry. Another reason is that computational chemistry is having a significant impact on the chemical literature, which we documented in Volumes 2 and 8.* A third reason for being optimistic is the current favorable climate in job opportunities for scientists who can boast on their résumés an expertise in computational chemistry.

In this preface to our twelfth volume, we present data on the job market for computational chemists. Professors and career counselors may find the information useful when advising their students. For students thinking about career directions, the data give an indication of the value of knowledge in computational chemistry. Experienced laboratory chemists who are thinking of reinventing themselves for the information age may find the data helpful in their decision making. At the same time, however, we are not advocating that everyone become a computational chemist. Certainly experts in this area are finding jobs, but scientists with skills in both computational chemistry and some other area of molecular science are expanding their job horizons.

A revealing indicator of job opportunities for people with computational chemistry expertise is the number of advertised openings in Chemical and Engineering News (C&EN), the official weekly magazine of the American Chemical Society. Figure 1 shows the total number of jobs advertised each year from 1983 to 1997, essentially the entire era of modern computational chemistry. It should be noted that these numbers measure only partially the total number of positions available in any given year because many positions are filled by personal contacts and are not publicized.
Additionally, job opportunities in other nations are generally not advertised in C&EN, unless a search committee is seeking candidates to return to their homeland after having obtained an education in the United States. Still other job openings not advertised in C&EN are those announced on the World Wide Web home pages of companies and
*D. B. Boyd, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1991, Vol. 2, pp. 461-479. The Computational Chemistry Literature. See also, K. B. Lipkowitz and D. B. Boyd, Reviews in Computational Chemistry, VCH Publishers, New York, 1996, Vol. 8, pp. v-ix. Preface.
universities or posted on electronic bulletin boards, such as the Computational Chemistry List (CCL) at the Ohio Supercomputer Center.* The job descriptions in the C&EN advertisements cover various facets of computational chemistry including molecular modeling, computer-aided molecular design, molecular library design, quantitative structure-activity relationships, electronic structure calculations, protein homology modeling, simulations, and algorithm development. For purposes of constructing Figure 1, the advertised jobs are grouped into several categories: positions in industry (other than the software companies), tenure-track academic positions, nontenured academic staff positions, postdoctoral research positions, positions at software or hardware companies, and positions in government laboratories. We have not classified the data by degree level, but all the jobs included in the figure required a chemistry degree, rather than, say, a computer science degree. A majority of the jobs required a Ph.D. degree. In 1997 more than 75% of the advertised industrial jobs required a doctorate.

What we see from Figure 1 is that the curve for the total number of jobs has been bumpy, but there exists a positive trend line. The total number of jobs has generally been increasing except for the wide and deep valley in the period 1992-1994. Back in Volume 7 of this series, we reported on a deteriorating job situation for computational chemists.† The Volume 7 preface was written early in 1995, just after those three dismal years when no sign of an improving job situation was evident. At that point in time, the glorious growth of the 1980s, which was accompanied by a rapidly increasing demand for computational chemists, appeared to be running out of steam. Recall that in the 1980s, the number of computational chemists employed in industry was doubling every five or so years.‡ Of course, that rate of increase was starting from a very small base.
Companies hiring computational chemists were involved in a wide range of businesses, including agrochemicals, chemicals, petroleum, lubricants, polymers, explosives, and flavorings. But it became evident in the 1980s that the pharmaceutical industry would be the largest employer of computational chemists outside of the universities that produced them. Notice in Figure 1 the little peak in 1985 and the much larger one around 1989-1990; these peaks are due mainly to expansion at software companies catering to the pharmaceutical industry. However, in the early 1990s when these software companies saw the prospect of slower sales of programs for molecular modeling and compound database management, the total number of ad-
*See URLs listed by D. B. Boyd, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., Wiley-VCH, New York, 1997, Vol. 11, pp. 373-399. Appendix: Compendium of Software and Internet Tools for Computational Chemistry.
†D. B. Boyd and K. B. Lipkowitz, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1995, Vol. 7, pp. v-xi. Preface.
‡D. B. Boyd, Quantum Chemistry Program Exchange (QCPE) Bulletin, 5, 85-91 (1985). Profile of Computer-Assisted Molecular Design in Industry.
[Figure 1 plot: number of jobs versus year; categories shown: hardware, government, industrial postdoc, academic staff, tenure track, software, academic postdoc, industry]

Figure 1 Annual number of jobs for scientists in the field of computational chemistry advertised in Chemical and Engineering News for the period 1983-1997. The positions available are categorized as to whether they were for software or hardware companies, government laboratories, academia (nontenured staff, tenure-track professorial, or postdoctoral appointments), or industrial research laboratories (permanent or postdoctoral appointments). The data for software vendors includes some postdoctoral-type positions primarily in the years 1988-1990. In a few cases, jobs advertised near the end of one year were also advertised early in the following year; in these situations the positions are arbitrarily counted in both years. In cases of advertisements for an unspecified number of open positions, an estimate was made. Therefore, the data are approximate, but representative.
vertised jobs for computational chemists shriveled. The pharmaceutical companies were less interested in buying software because of dark clouds appearing on their horizon. In the period 1992-1994 the prospects for growth in sales and income in the pharmaceutical industry were dimmed by market and political factors. In
[Figure 2 bar chart: percentage change versus year]

Figure 2 Year-by-year change in research and development (R&D) expenditures of pharmaceutical companies in the United States. In each year, spending on R&D has increased; the change is given as the percentage increase compared to the prior year. The speckled bars are based on dollars for a given year; the solid bars are for constant 1970 dollars (thereby correcting for the effect of inflation). The percentages are derived from data obtained from the Pharmaceutical Research and Manufacturers of America (PhRMA), 1100 Fifteenth Street NW, Washington, DC. Its members include American Home Products, Bristol-Myers Squibb, Glaxo Wellcome, Johnson & Johnson, Lilly, Merck, Pfizer, Schering-Plough, SmithKline Beecham, Warner-Lambert, and about 90 other companies with operations in the United States.
1993 the people of the United States were confronted with the possibility of a major step toward centralized autocratic regimentation of medical care. The pharmaceutical industry's optimism in investing in science reached a nadir that year (Figure 2); the increase in pharmaceutical research and development spending was the lowest in recent memory. This period was witnessing large job cuts throughout the American economy; layoffs nationwide peaked at 600,000 in 1993. Corporate executives at pharmaceutical companies followed suit and reduced their staffs. Despite those obstacles, the pharmaceutical industry continued to hire young computational chemists even while many of the older scientists at the companies were being offered early retirement options.

Following the dip in 1992-1994, growth resumed, and the number of job advertisements reached new highs in 1996 and 1997 (Figure 1). In 1997 the demand for computational chemists reached an all-time high. The reason can be seen from the component curves in Figure 1. Almost all the growth in job opportunities for computational chemists in 1996 and 1997 was due to industry, with upward of 90% of the positions being at pharmaceutical companies of all sizes. Once the dark clouds hanging over American pharmaceutical companies had passed, these businesses were free to invest in their futures. These companies recently announced plans to increase their R&D staffs 15-50%. Corporate executives, as usual, display herd instinct: when they see their counterparts at competing companies laying off workers, they do likewise; when they see their
[Figure 3 plot: number of jobs versus year]

Figure 3 Comparison of the total number of jobs and the total number of academic jobs for scientists in the field of computational chemistry advertised in Chemical and Engineering News.
competitors hiring, they do likewise. Because the companies that hire computational chemists shrank too much a few years ago, they have had to replenish their work force, thereby helping to create the current climate of plentiful jobs. Companies and job recruitment personnel are having trouble finding enough computational chemists to fill their needs. The supply-demand ratio is now clearly in favor of the scientist seeking a job.

The situation in academia has been somewhat different. Illustrating the employment opportunities at colleges and universities, Figure 3 compares the number of academic positions (combining tenure-track, staff, and postdoctoral data from Figure 1) versus the total number of jobs advertised in C&EN for computational chemists. We see that the number of academic job openings is not tracking upward. During the period 1983-1997, the number has averaged about 20 per year and peaked back in 1988. One reason for the low advertising in more recent years is that the faculty ranks are already occupied, and new positions will not open up until some of the current professors have retired. Thus, we conclude that in the period covered by our survey, most of the growth in the total number of openings came initially from the software vendors that were developing products to sell to the pharmaceutical industry and then more recently from the pharmaceutical industry itself.

An overall indication of where computational chemists have found jobs is given in Figure 4. We show the percentage of jobs in each of the relatively permanent occupational categories based on accumulated totals for the entire period 1983-1997. Thus, we do not include academic and industrial postdoctoral appointments because these are usually temporary “way stations” until the holders of the new doctorates can find more permanent positions. Of course, it must be pointed out that in recent years no job can be considered too permanent, with an increasing number of scientists expecting to work for several employers during their lifetimes. And there are even some computational chemists who seem to change jobs every two or three years.

The data in Figure 4 show that 45% of the jobs have been in industry, and, as we have indicated, most of these were associated with molecular design at pharmaceutical companies. Twenty-two percent of the advertised jobs were at software companies. Tenure-track academic positions amounted to 16%. Many chemistry departments and even a few other academic units realized the need to have an expert on site who could maintain the molecular modeling software, workstations, and networks used in research and teaching. For the period 1983-1997, 12% of the jobs were for such academic staff positions. As also seen in Figure 4, the number of advertised positions in government laboratories has not been significant, and, with the government less wanton in spending money it does not have, future growth in this job category is not in sight. New jobs in the computer hardware industry for computational chemists have essentially disappeared. Back in the 1980s, the hardware vendors were hiring a few computational chemists for window dressing and marketing.
This strategy is no longer as important to the hardware companies because they have consolidated, and the firms remaining have already established their credibility with the computational chemistry community.

Like it or not, the fate of the job market for computational chemists has been and is intimately linked to the health of the pharmaceutical industry. This is the industry where most of the computational chemists work, so it impacts the universities producing new computational chemists and the software vendors developing and maintaining programs for use in molecular design and data management. As long as the pharmaceutical and biotechnology companies can continue to increase their investments in R&D (Figure 2), the job market for computational chemists should remain robust. If society continues to value modern medicines, companies will seek to discover new products, and scientists with computational chemistry expertise will find a place on the research teams.

One of the most important goals of Reviews in Computational Chemistry is to provide instruction and help researchers keep up with progress in computational chemistry techniques. We are constantly reminded that education is unending; we must continually learn new things and hone our skills to assure our
[Figure 4 pie chart; legend: industry, software, tenure track, academic staff, government, hardware]

Figure 4 Pie chart showing accumulated total distribution of “permanent” jobs advertised in Chemical and Engineering News during the period 1983-1997 for scientists with computational chemistry expertise. The software vendor category includes some temporary postdoctoral-type jobs, so its percentage may be overstated.
employability. This twelfth volume of Reviews in Computational Chemistry presents chapters that offer our readers a spectrum of key topics ranging from molecular simulations to quantum mechanics. Computational chemistry has permeated many areas of molecular science beyond pharmaceuticals, and this volume looks at some of these topics. Each chapter is written by well-known experts to be part tutorial, part review, so it will have sustained value. Chapter lengths are designed to provide an overview rather than an overload of information. Moreover, it is hoped that these chapters will teach what not to do, as well as what to do, to save the reader time and perhaps embarrassment in their scientific studies.

Entropy has always been the bogeyman of computational chemistry. In the early days of computational quantum chemistry and molecular mechanics, it was usually neglected. Deviations of predicted results from experiment could be blamed on the failure to account for entropy, but this was a less than satisfactory explanation. Fortunately, molecular dynamics and Monte Carlo simulations allow entropy, and hence free energy, to be computed. In Chapter 1, Dr. Hagai Meirovitch gives a detailed exposition of computing entropy and free energy. This chapter lays a solid foundation of thermodynamics and shows the power and the limitations of the calculations. Developments such as umbrella sampling techniques and nonphysical (alchemical) perturbations within the framework of thermodynamic cycles have made possible new ways to compute the free energy of ligand-receptor binding, conformational energies of proteins, solvation free energies, and the modeling of reaction mechanisms in enzymes. Simulations of polymers and Ising spin systems are also covered.

Chapter 2 also relates to molecular dynamics simulations and free energy
calculations. Drs. Ramzi Kutteh and T. P. Straatsma present the theory underlying simulations run with constraints whereby atomic coordinates are related by algebraic equations. Some of this theory is heretofore unpublished. Constraints can be used to freeze uninteresting degrees of freedom, such as high frequency vibrations of light atoms. Constraints require selected degrees of freedom to be fixed at their respective target values or held nearby. Familiar constraint algorithms such as SHAKE and RATTLE are among the methods discussed in this chapter. Because of the wide use of constraints, the influence they have on a simulation trajectory should be understood. Dr. Straatsma also contributed to Volume 9 of our book series.

The proper ways to model a metallic surface in the presence of water are described in Chapter 3 by Drs. John C. Shelley and Daniel R. Bérard. Considering all the situations in which metal comes in contact with water, it is clear that the understanding of interfacial regions between water and metals has implications for electrochemistry, corrosion, catalysis, and other phenomena. Effective methods for performing molecular dynamics and Monte Carlo simulations on interfaces are explained. Heat baths and other pertinent techniques for calculation and analysis are described.

In Chapter 4, Professor Donald W. Brenner and his co-workers Olga A. Shenderova and Denis A. Areshkin explore density functional theory and quantum-based analytic interatomic forces as they pertain to simulations of materials. The study of interfaces, fracture, point defects, and the new area of nanotechnology can be aided by atomistic simulations. Atom-level simulations require the use of an appropriate force field model because quantum mechanical calculations, although useful, are too compute-intensive for handling large systems or long simulation times. For these cases, analytic potential energy functions can be used to provide detailed information.
Use of reliable quantum mechanical models to derive the functions is explained in this chapter. Professor Henry A. Kurtz and Dr. Douglas S. Dudis give a tutorial in Chapter 5 on quantum mechanical methods for predicting nonlinear optical properties of compounds and materials. The interaction of matter with electromagnetic radiation, as from a laser, can be put to use in signal processing. Modeling can help materials scientists understand how switches and shutters are controlled by light. Whereas molecular simulations play an increasingly important role in studying the properties of complicated systems such as materials and biomolecules, sorting out the critical factors affecting predicted properties of a complex molecular system may not be obvious because the interactions giving rise to the observables are dependent on many factors, such as the many force field parameters. Even with a relatively simple force field, a researcher would be hard pressed to know which parameters are most influential in determining the property of interest. In Chapter 6, Dr. Chung F. Wong, Dr. Tom Thacher, and Professor Herschel Rabitz provide a tutorial on how sensitivity analysis can be used in biomolecular simulations.
Sometimes the difference between success and failure of a pharmaceutical development project will depend on obtaining an appropriate crystalline form of a compound. Properties such as mixability of the substance with other ingredients of a capsule, the rate of dissolution, or the stability will depend on the polymorph. Similarly, materials researchers may want to control the polymorph that is being produced. Understanding why and how compounds crystallize the way they do is an area of research to which computations can contribute. In Chapter 7, Drs. Paul Verwer and Frank J. J. Leusen discuss methods for predicting crystal polymorphs by computer simulation.

Finally, in Chapter 8, Professors Jean-Louis Rivail and Bernard Maigret provide an essay on the historical development of computational chemistry in France. This chapter complements essays on the history in the United States (Volume 5) and in the United Kingdom (Volume 10). France was one of the first epicenters in developing and applying quantum chemical methods to biomolecules, and Chapter 8 gives some of these highlights as well as other important contributions to the field.

Information about Reviews in Computational Chemistry is available on the World Wide Web. The home page includes the author and subject indexes of all volumes as a free online service. The home page is also used to present color graphics and other material as adjuncts to the chapters. Your Web browser will find us at http://chem.iupui.edu/rcc/rcc.html or a search engine may be used. A brief tutorial about the Web can be found in the Appendix of Volume 11. The Appendix included links to suppliers of computational chemistry software.

As anyone knows who has tried to obtain a patent, it is not sufficient to just have a good idea; it must be reduced to practice. Helping reduce these volumes to a physical reality is Mrs. Joanne Hequembourg Boyd. We are grateful to the expert authors who wrote the outstanding chapters in this volume.
We also appreciate the kind words we have received from our readers. We care about our readers and authors, and trust that these books will serve them well in their learning, teaching, and research. Donald B. Boyd and Kenny B. Lipkowitz Indianapolis January 1998
Contents

1. Calculation of the Free Energy and the Entropy of Macromolecular Systems by Computer Simulation
Hagai Meirovitch
    Introduction
    Statistical Mechanics of Fluids and Chain Systems
    The Partition Function and the Boltzmann Probability Density
    The Absolute Entropy and Free Energy as Ensemble Averages
    Fluctuations
    Entropy and Free Energy Differences by “Calorimetric” Thermodynamic Integration
    The Kirkwood and Zwanzig Equations
    Basic Sampling Theory and Simulation
    Importance Sampling
    The Monte Carlo and Molecular Dynamics Methods
    Application of the MC and MD Methods to Macromolecular Systems
    Direct Methods for Calculating the Entropy of Proteins
    The Harmonic Approximation
    The Quasi-Harmonic Approximation
    Free Energy from <exp[+E/kBT]>
    Applications of Integration and Importance Sampling Techniques
    Calculations by Calorimetric Integration and Perturbation Methods
    Umbrella Sampling and the Potential of Mean Force
    Thermodynamic Cycles
    Historical Perspective
    Free Energy of Enzyme-Ligand Binding
    Application of Thermodynamic Cycles
    New Perturbation-Related Procedures
    Entropy from Linear Buildup Procedures
    Step-by-step Construction Methods for Polymers
    Direct Methods for Calculating the Entropy from MC and MD Samples
    The Stochastic Models Method of Alexandrowicz and Its Implications
    Additional Methods for Calculating the Entropy
    The Multicanonical Approach
    Calculation of Entropy by Adiabatic Switching
    Four Additional Methods
    Summary
    Acknowledgments
    References

2. Molecular Dynamics with General Holonomic Constraints and Application to Internal Coordinate Constraints
Ramzi Kutteh and T. P. Straatsma
    Introduction
    The Analytical Method of Constraint Dynamics
    Computation of the Forces of Constraints and Their Derivatives
    Numerical Integration of the Equations of Motion
    Error Analysis of the Analytical Method
    Method of Edberg, Evans, and Morriss in Context
    The Method of Undetermined Parameters
    Computation of the Partially Constrained Coordinates
    Computation of the Undetermined Parameters and the Constrained Coordinates
    Error Analysis of the Method of Undetermined Parameters
    Using the Method of Undetermined Parameters with the Basic Verlet Integration Algorithm
    The Matrix Method
    SHAKE
    Physical Picture of SHAKE for Internal Coordinate Constraints
    Method of Tobias and Brooks in Context
    Application to Internal Coordinate Constraints
    Bond-Stretch Constraints
    Angle-Bend Constraints
    Torsional Constraints
    Angle Constraint Versus Triangulation
    Using the Method of Undetermined Parameters with the Velocity Verlet Integration Algorithm
    RATTLE for General Holonomic Constraints
    Application to Bond-Stretch, Angle-Bend, and Torsional Constraints
    Further Developments and Future Prospects
    Acknowledgments
    References

3. Computer Simulation of Water Physisorption at Metal-Water Interfaces
John C. Shelley and Daniel R. Bérard
    Introduction
    Modeling
    Treatment of Water
    Treatment of Metal-Water Interactions
    Simulation Methods
    Common Aspects of the Methods
    Molecular Dynamics
    Monte Carlo Methods
    Analysis and Results for Metal-Water Interfaces
    Visualization
    Electron Density for Jellium
    Structure
    Dynamics
    Miscellaneous Properties
    General Discussion of the Properties of Metal-Water Interfaces
    Summary and Perspective
    Future Developments
    Acknowledgments
    References

4. Quantum-Based Analytic Interatomic Forces and Materials Simulation
Donald W. Brenner, Olga A. Shenderova, and Denis A. Areshkin
    Introduction and Background
    Historical Perspective
    Analytic Potentials and Materials Simulation
    Framework for Bonding: Density Functional Theory
    Bridge to Analytic Forms: The Harris Functional
    Tight Binding Method
    Second Moment Approximation and Finnis-Sinclair Model
    Empirical Bond-Order Model
    Effective Medium Theory
    Embedded-Atom Method
    Fitting Databases
    Acknowledgments
    References

5. Quantum Mechanical Methods for Predicting Nonlinear Optical Properties
Henry A. Kurtz and Douglas S. Dudis
    Introduction
    Nonlinear Optical Property Basics
    Examples of Applications of Nonlinear Optics
    Second Harmonic Generation (SHG)
    Electrooptic Modulation
    Optical Bistability (Optical Signal Processing)
    Degenerate Four-Wave Mixing/Phase Conjugation (Imaging Enhancements)
    Frequency Upconversion Lasing
    Definitions of Molecular Properties
    Methods for Molecular Electronic (Hyper)Polarizability Calculations
    Finite Field Method
    Sum-Over-States Methods
    Time-Dependent Hartree-Fock
    Other Methods
    Practical Considerations
    Basis Sets
    Other Considerations
    Beyond Molecular Electronic Calculations
    Molecular Vibrational Calculations
    Condensed Phase Problems
    Summary
    Acknowledgments
    References

6. Sensitivity Analysis in Biomolecular Simulation
Chung F. Wong, Tom Thacher, and Herschel Rabitz
    Introduction
    Methods
    Dependence of Sensitivity Results on the Choice of Force Fields
    Convergence Issues
    Applications
    Determinants of (Bio)molecular Properties
    Molecular Recognition
    Green's Function/Principal Component Analysis and Essential Dynamics
    Error Propagation
    Potential Energy Function Refinement
    Conclusions
    Acknowledgments
    References

7. Computer Simulation to Predict Possible Crystal Polymorphs
Paul Verwer and Frank J. J. Leusen
    Introduction
    Theory and Computational Approaches
    Crystals
    Thermodynamics
    Computational Techniques
    Crystal Structure Prediction Methods
    Related Software
    Comparison of Different Techniques
    Using Experimental Data
    Predicting and Evaluating Crystal Structures
    Example: Polymorph Prediction for Estrone
    Application Examples
    Acknowledgments
    References

8. Computational Chemistry in France: A Historical Survey
Jean-Louis Rivail and Bernard Maigret
    Introduction
    Early Age of Theoretical Chemistry
    Computational Quantum Chemistry
    Statistical Mechanics
    Software Development
    Computational Facilities
    Industry
    Teaching Computational Chemistry
    Government Funding
    Conclusion
    Acknowledgments
    References

Author Index

Subject Index
Contributors

Denis A. Areshkin, Department of Materials Science and Engineering, North Carolina State University, Raleigh, North Carolina 27695-7907, U.S.A. (Electronic mail: [email protected])

Daniel R. Bérard, Molecular Simulations Incorporated, San Diego, California 92121, U.S.A. (Electronic mail: [email protected])

Donald W. Brenner, Department of Materials Science and Engineering, North Carolina State University, Raleigh, North Carolina 27695-7907, U.S.A. (Electronic mail: [email protected])

Douglas S. Dudis, Materials Laboratory/Polymer Branch, Wright Laboratory (WL/MLBP), Wright Patterson Air Force Base (WPAFB), Ohio 45433, U.S.A. (Electronic mail: [email protected])

Henry A. Kurtz, Department of Chemistry, University of Memphis, Memphis, Tennessee 38152, U.S.A. (Electronic mail: [email protected])

Ramzi Kutteh, Department of Physics, Queen Mary and Westfield College, University of London, Mile End Road, London, E1 4NS, U.K. (Electronic mail: [email protected])

Frank J. J. Leusen, Molecular Simulations Ltd., 240/250 The Quorum, Barnwell Road, Cambridge, CB5 8RE, United Kingdom (Electronic mail: [email protected])

Bernard Maigret, Laboratoire de Chimie théorique, Unité Mixte de Recherche au Centre National de la Recherche Scientifique (CNRS) No. 7565, Institut Nancéien de Chimie moléculaire, Université Henri Poincaré, Nancy 1, Domaine Universitaire Victor Grignard, BP 239, 54506 Vandœuvre-lès-Nancy, France (Electronic mail: [email protected])

Hagai Meirovitch, Supercomputer Computations Research Institute, Florida State University, Tallahassee, Florida 32306-4052, U.S.A. (Electronic mail: [email protected])

Herschel Rabitz, Department of Chemistry, Princeton University, Princeton, New Jersey 08544, U.S.A. (Electronic mail: [email protected])

Jean-Louis Rivail, Laboratoire de Chimie théorique, Unité Mixte de Recherche au Centre National de la Recherche Scientifique (CNRS) No. 7565, Institut Nancéien de Chimie moléculaire, Université Henri Poincaré, Nancy 1, Domaine Universitaire Victor Grignard, BP 239, 54506 Vandœuvre-lès-Nancy, France (Electronic mail: rivail@lctn.u-nancy.fr)

John C. Shelley, The Procter & Gamble Company, Miami Valley Laboratories, P.O. Box 538707, Cincinnati, Ohio 45253-8707, U.S.A. (Electronic mail: [email protected])

Olga A. Shenderova, Department of Materials Science and Engineering, North Carolina State University, Raleigh, North Carolina 27695-7907, U.S.A. (Electronic mail: [email protected])

T. P. Straatsma, High Performance Computational Chemistry, Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, Washington 99352, U.S.A. (Electronic mail: [email protected])

Tom Thacher, Virtual Chemistry, Inc., 7770 Reagents Road #251, San Diego, California 92122, U.S.A. (Electronic mail: [email protected])

Paul Verwer, CAOS/CAMM Center, University of Nijmegen, P.O. Box 9010, 6500 GL Nijmegen, The Netherlands (Electronic mail: [email protected])

Chung F. Wong, SUGEN, Inc., 230 East Grand Street, South San Francisco, California 94080, U.S.A. (Electronic mail: [email protected])
Contributors to Previous Volumes*

Volume 1
David Feller and Ernest R. Davidson, Basis Sets for Ab Initio Molecular Orbital Calculations and Intermolecular Interactions.
James J. P. Stewart,† Semiempirical Molecular Orbital Methods.
Clifford E. Dykstra,‡ Joseph D. Augspurger, Bernard Kirtman, and David J. Malik, Properties of Molecules by Direct Calculation.
Ernest L. Plummer, The Application of Quantitative Design Strategies in Pesticide Design.
Peter C. Jurs, Chemometrics and Multivariate Analysis in Analytical Chemistry.
Yvonne C. Martin, Mark G. Bures, and Peter Willett, Searching Databases of Three-Dimensional Structures.
Paul G. Mezey, Molecular Surfaces.
Terry P. Lybrand,§ Computer Simulation of Biomolecular Systems Using Molecular Dynamics and Free Energy Perturbation Methods.
Donald B. Boyd, Aspects of Molecular Modeling.
Donald B. Boyd, Successes of Computer-Assisted Molecular Design.
Ernest R. Davidson, Perspectives on Ab Initio Calculations.

*When no author of a chapter can be reached at the addresses shown in the original volume, the current affiliation of the senior or corresponding author is given here as a convenience to our readers.
†Current address: 15210 Paddington Circle, Colorado Springs, CO 80921 (Electronic mail: [email protected]).
‡Current address: Indiana University-Purdue University at Indianapolis, Indianapolis, IN 46202 (Electronic mail: [email protected]).
§Current address: University of Washington, Seattle, WA 98195 (Electronic mail: lybrand@proteus.bioeng.washington.edu).
Volume 2
Andrew R. Leach,* A Survey of Methods for Searching the Conformational Space of Small and Medium-Sized Molecules.
John M. Troyer and Fred E. Cohen, Simplified Models for Understanding and Predicting Protein Structure.
J. Phillip Bowen and Norman L. Allinger, Molecular Mechanics: The Art and Science of Parameterization.
Uri Dinur and Arnold T. Hagler, New Approaches to Empirical Force Fields.
Steve Scheiner, Calculating the Properties of Hydrogen Bonds by Ab Initio Methods.
Donald E. Williams, Net Atomic Charge and Multipole Models for the Ab Initio Molecular Electric Potential.
Peter Politzer and Jane S. Murray, Molecular Electrostatic Potentials and Chemical Reactivity.
Michael C. Zerner, Semiempirical Molecular Orbital Methods.
Lowell H. Hall and Lemont B. Kier, The Molecular Connectivity Chi Indexes and Kappa Shape Indexes in Structure-Property Modeling.
I. B. Bersuker† and A. S. Dimoglo, The Electron-Topological Approach to the QSAR Problem.
Donald B. Boyd, The Computational Chemistry Literature.

*Current address: Glaxo Wellcome, Greenford, Middlesex, UB6 0HE, U.K. (Electronic mail: [email protected]).
†Current address: University of Texas, Austin, TX 78712 (Electronic mail: bersuker@eeyore.cm.utexas.edu).

Volume 3
Tamar Schlick, Optimization Methods in Computational Chemistry.
Harold A. Scheraga, Predicting Three-Dimensional Structures of Oligopeptides.
Andrew E. Torda and Wilfred F. van Gunsteren, Molecular Modeling Using NMR Data.
David F. V. Lewis, Computer-Assisted Methods in the Evaluation of Chemical Toxicity.
Volume 4
Jerzy Cioslowski, Ab Initio Calculations on Large Molecules: Methodology and Applications.
Michael L. McKee and Michael Page, Computing Reaction Pathways on Molecular Potential Energy Surfaces.
Robert M. Whitnell and Kent R. Wilson, Computational Molecular Dynamics of Chemical Reactions in Solution.
Roger L. DeKock, Jeffry D. Madura, Frank Rioux, and Joseph Casanova, Computational Chemistry in the Undergraduate Curriculum.

Volume 5
John D. Bolcer and Robert B. Hermann, The Development of Computational Chemistry in the United States.
Rodney J. Bartlett and John F. Stanton, Applications of Post-Hartree-Fock Methods: A Tutorial.
Steven M. Bachrach, Population Analysis and Electron Densities from Quantum Mechanics.
Jeffry D. Madura, Malcolm E. Davis, Michael K. Gilson, Rebecca C. Wade, Brock A. Luty, and J. Andrew McCammon, Biological Applications of Electrostatic Calculations and Brownian Dynamics Simulations.
K. V. Damodaran and Kenneth M. Merz Jr., Computer Simulation of Lipid Systems.
Jeffrey M. Blaney and J. Scott Dixon, Distance Geometry in Molecular Modeling.
Lisa M. Balbes, S. Wayne Mascarella, and Donald B. Boyd, A Perspective of Modern Methods in Computer-Aided Drug Design.

Volume 6
Christopher J. Cramer and Donald G. Truhlar, Continuum Solvation Models: Classical and Quantum Mechanical Implementations.
Clark R. Landis, Daniel M. Root, and Thomas Cleveland, Molecular Mechanics Force Fields for Modeling Inorganic and Organometallic Compounds.
Vassilios Galiatsatos, Computational Methods for Modeling Polymers: An Introduction.
Rick A. Kendall, Robert J. Harrison, Rik J. Littlefield, and Martyn F. Guest, High Performance Computing in Computational Chemistry: Methods and Machines.
Donald B. Boyd, Molecular Modeling Software in Use: Publication Trends.
Eiji Osawa and Kenny B. Lipkowitz, Appendix: Published Force Field Parameters.
Volume 7
Geoffrey M. Downs and Peter Willett, Similarity Searching in Databases of Chemical Structures.
Andrew C. Good and Jonathan S. Mason, Three-Dimensional Structure Database Searches.
Jiali Gao, Methods and Applications of Combined Quantum Mechanical and Molecular Mechanical Potentials.
Libero J. Bartolotti and Ken Flurchick, An Introduction to Density Functional Theory.
Alain St-Amant, Density Functional Methods in Biomolecular Modeling.
Danya Yang and Arvi Rauk, The A Priori Calculation of Vibrational Circular Dichroism Intensities.
Donald B. Boyd, Appendix: Compendium of Software for Molecular Modeling.

Volume 8
Zdenek Slanina, Shyi-Long Lee, and Chin-hui Yu, Computations in Treating Fullerenes and Carbon Aggregates.
Gernot Frenking, Iris Antes, Marlis Böhme, Stefan Dapprich, Andreas W. Ehlers, Volker Jonas, Arndt Neuhaus, Michael Otto, Ralf Stegmann, Achim Veldkamp, and Sergei F. Vyboishchikov, Pseudopotential Calculations of Transition Metal Compounds: Scope and Limitations.
Thomas R. Cundari, Michael T. Benson, M. Leigh Lutz, and Shaun O. Sommerer, Effective Core Potential Approaches to the Chemistry of the Heavier Elements.
Jan Almlöf and Odd Gropen,* Relativistic Effects in Chemistry.
Donald B. Chesnut, The Ab Initio Computation of Nuclear Magnetic Resonance Chemical Shielding.

*Address: Institute of Mathematical and Physical Sciences, University of Tromsø, N-9037 Tromsø, Norway (Electronic mail: [email protected]).

Volume 9
James R. Damewood, Jr., Peptide Mimetic Design with the Aid of Computational Chemistry.
T. P. Straatsma, Free Energy by Molecular Simulation.
Robert J. Woods, The Application of Molecular Modeling Techniques to the Determination of Oligosaccharide Solution Conformations.
Ingrid Pettersson and Tommy Liljefors, Molecular Mechanics Calculated Conformational Energies of Organic Molecules: A Comparison of Force Fields.
Gustavo A. Arteca, Molecular Shape Descriptors.

Volume 10
Richard Judson,† Genetic Algorithms and Their Use in Chemistry.
Eric C. Martin, David C. Spellmeyer, Roger E. Critchlow Jr., and Jeffrey M. Blaney, Does Combinatorial Chemistry Obviate Computer-Aided Drug Design?
Robert Q. Topper, Visualizing Molecular Phase Space: Nonstatistical Effects in Reaction Dynamics.
Raima Larter and Kenneth Showalter, Computational Studies in Nonlinear Dynamics.
Stephen J. Smith and Brian T. Sutcliffe, The Development of Computational Chemistry in the United Kingdom.

†Current address: CuraGen Corporation, 322 East Main Street, Branford, CT 06405 (Electronic mail: [email protected]).

Volume 11
Mark A. Murcko, Recent Advances in Ligand Design Methods.
David E. Clark, Christopher W. Murray, and Jin Li, Current Issues in De Novo Molecular Design.
Tudor I. Oprea and Chris L. Waller, Theoretical and Practical Aspects of Three-Dimensional Quantitative Structure-Activity Relationships.
Giovanni Greco, Ettore Novellino, and Yvonne Connolly Martin, Approaches to Three-Dimensional Quantitative Structure-Activity Relationships.
Pierre-Alain Carrupt, Bernard Testa, and Patrick Gaillard, Computational Approaches to Lipophilicity: Methods and Applications.
Ganesan Ravishanker, Pascal Auffinger, David R. Langley, Bhyravabhotla Jayaram, Matthew A. Young, and David L. Beveridge, Treatment of Counterions in Computer Simulations of DNA.
Donald B. Boyd, Appendix: Compendium of Software and Internet Tools for Computational Chemistry.
CHAPTER 1
Calculation of the Free Energy and the Entropy of Macromolecular Systems by Computer Simulation
Hagai Meirovitch
Supercomputer Computations Research Institute, Florida State University, Tallahassee, Florida 32306-4052

Reviews in Computational Chemistry, Volume 12. Kenny B. Lipkowitz and Donald B. Boyd, Editors. Wiley-VCH, John Wiley and Sons, Inc., New York, © 1998.

INTRODUCTION
Computer simulation has become a standard approach for studying the thermodynamic behavior of complex systems that are otherwise difficult to treat by analytical techniques. Commonly used methods for such problems include Metropolis Monte Carlo¹,² (MC) and molecular dynamics³⁻⁵ (MD). These methods are of a dynamical type, meaning that the simulation starts from a configuration (geometry) of the system, which is changed repeatedly under certain rules; after a long enough time the system is expected to relax to the most probable region of configurational space according to P^B, the Boltzmann probability density (PD). The strength of these methods lies in their ability to select configurations according to P^B without the need to know the value of P^B itself. It is thus easy to calculate averages of thermodynamic quantities such as the energy E and of structural quantities such as the radius of gyration, which are evaluated straightforwardly from each configuration. For a common model of liquid argon, for example, the energy of a configuration is the sum of the Lennard-Jones interactions between all pairs of molecules. For a protein the energy is defined by the “force field,” which consists of the bonded interactions between neighbor atoms and the nonbonded ones, such as the electrostatic and Lennard-Jones interactions. Thus, the energy depends on the distances between the atoms or molecules, which are defined by the configuration. The average energy is obtained from the contributions of the individual configurations, weighted by their Boltzmann PD. On the other hand, the absolute entropy S, which requires calculating ln P_i^B for configuration i, cannot be obtained in such a direct manner because the value of P_i^B is unknown. This means that the Helmholtz free energy F = E − TS, where T is the absolute temperature, is also unknown. For some models (e.g., proteins in vacuum), this problem can be alleviated somewhat by applying harmonic or quasi-harmonic approximations in which P^B is assumed to be a Gaussian.⁶⁻⁸ The commonly used methods for calculating F and S are based on reversible thermodynamic integration over physical quantities, such as the energy, temperature, and specific heat,⁹⁻¹² as well as over nonphysical parameters¹³⁻¹⁹ (free energy perturbation methods are also included in this category). Thermodynamic integration in most cases provides the difference in entropy (or free energy) between two states, and only when the absolute entropy of one state is known can that of the other be obtained. In general, these methods are inefficient because of the need to carry out a large number of simulations along the integration path. The importance of developing efficient methods for evaluating S or F stems from the unique information they provide. In general, S is a measure of order, which, for example, constitutes a measure for the flexibility of loops in proteins.²⁰⁻²³ The usual thermodynamic properties can be derived from F, which also serves as a criterion of stability: as F decreases, the system stability increases.
This is particularly important for a stability analysis of protein structures. The energy surface of a protein consists of a tremendous number of potential energy wells, of which the most stable is expected to correspond to the native structure. In many studies, however, the energy E rather than the free energy F has been adopted as the criterion of stability, mainly because of the difficulties in calculating the free energy.²⁴

Whereas the dynamical MC and MD methods and thermodynamic integration techniques have become the main tools for studying fluids and biological macromolecules, a different type of method has been developed for synthetic polymers. Here a chain configuration is grown step by step with the help of transition probabilities; a simple case is a random walk in which each step (direction) is determined by a transition probability. The product of the transition probabilities used to build a specific chain configuration is the chain probability, which leads to the absolute entropy and free energy. These methods are less general than the MC and MD methods because they are applicable to relatively simple models of a single chain or multiple chains, but not to complex systems of chains solvated in water, for example.²⁵⁻³⁰ Nonetheless, a hybrid of these buildup ideas with the dynamic procedures has led to efficient methods for calculating the absolute S and F directly from a single MC or MD sample without the need to resort to thermodynamic integration. A sample is a group of configurations selected, for example, from an MD trajectory at constant time intervals. From the MC or MD sample, one can calculate a set of transition probabilities that correspond to the scanning method, which is one of the step-by-step buildup methods mentioned above.³¹⁻³³ These probabilities are expressed in terms of the frequency of occurrence of certain local states corresponding to groups of neighbor angles along the chain. The product of these probabilities for configuration i constitutes an approximation for P_i^B, from which an approximate entropy is obtained. These ideas, which are implemented in the local states (LS) and hypothetical scanning methods³⁴⁻³⁸ to be described later, are also applicable to bulk three-dimensional (3D) spin or fluid systems. In fact, they were first developed for Ising models by relying on Alexandrowicz's model, which views a 3D spin system as a long linear chain. In this way the system can be constructed by growing the chain ab initio with the use of transition probabilities.³⁹⁻⁴¹ Again, these probabilities can be recovered from an MC sample and the entropy can be obtained as described above.⁴²,⁴³

The foregoing discussion demonstrates the significant interplay between different scientific fields. Yet another example is the multicanonical algorithm, developed originally for spin systems⁴⁴⁻⁴⁶ but now transferred to proteins.⁴⁷⁻⁴⁹ Accordingly, the central aim of this chapter is to review a wide range of techniques used for calculating entropies and free energies for fluids, biological macromolecules, polymers, and even discrete magnetic spin systems. (The latter systems are interesting because some of them are equivalent to lattice gas models for fluids.⁵⁰) The emphasis is more on the methodology and less on a detailed discussion of various applications. Thus, the theoretical bases of the methods are described and their advantages, limitations, and efficiencies are compared; some avenues for further research are also suggested. It should be pointed out that excellent reviews on free energy calculations have appeared in the last 10 years, but most of them have focused on the perturbation and thermodynamic integration methods as applied to the calculation of the free energy of hydration of small molecules or the free energy of protein-ligand binding (see, e.g., Refs. 18, 19, and 51-61). Although those methods are discussed here, the emphasis is on recent developments. The reader is advised to consult the cited reviews for information about technical details, as well as for a more complete bibliography.

The first part of this chapter contains a short introduction to the statistical mechanics of continuum models of fluids and macromolecules. The next section presents a discussion of basic sampling theory (importance sampling) and the Metropolis Monte Carlo and molecular dynamics methods. The remainder of the chapter is devoted to descriptions of methods for calculating F and S, including those mentioned above as well as others.
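The step-by-step buildup idea described in the Introduction can be illustrated with a minimal sketch. It assumes an ideal (non-self-avoiding) random walk on a square lattice with fixed, hypothetical transition probabilities; the names `STEP_PROBS`, `chain_probability`, and `entropy_by_enumeration` are illustrative and do not come from the chapter. The chain probability is the product of the transition probabilities, and the absolute entropy follows from S = −k_B Σ_i P_i ln P_i.

```python
import itertools
import math

# Hypothetical transition probabilities for the four directions of a
# square-lattice random walk (illustrative values, not from the chapter).
STEP_PROBS = {"N": 0.4, "E": 0.3, "S": 0.2, "W": 0.1}


def chain_probability(steps):
    """Probability of one chain: product of the transition probabilities used to build it."""
    prob = 1.0
    for step in steps:
        prob *= STEP_PROBS[step]
    return prob


def entropy_by_enumeration(n_steps):
    """S/k_B = -sum_i P_i ln P_i over all 4**n_steps chains."""
    entropy = 0.0
    for chain in itertools.product(STEP_PROBS, repeat=n_steps):
        p = chain_probability(chain)
        entropy -= p * math.log(p)
    return entropy


def entropy_per_step():
    """Entropy of a single step; independent steps give S(n) = n * S(1)."""
    return -sum(p * math.log(p) for p in STEP_PROBS.values())


if __name__ == "__main__":
    n = 6
    print("S/k_B by enumeration:", entropy_by_enumeration(n))
    print("S/k_B from n * S(1): ", n * entropy_per_step())
```

Because every chain probability is known by construction, the entropy is available without any integration path; for realistic chains with excluded volume the transition probabilities are no longer independent of the chain history, which is where the methods reviewed later come in.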
STATISTICAL MECHANICS OF FLUIDS AND CHAIN SYSTEMS
The Partition Function and the Boltzmann Probability Density
For simplicity we discuss a classical fluid system of N identical particles of mass m contained in a “box” of volume V and interacting through a two-body potential that is velocity independent (e.g., a 6-12 Lennard-Jones potential). The system is in equilibrium with a reservoir at temperature T [i.e., an (NVT) ensemble]. A configuration of the N particles is defined by their Cartesian coordinates and is denoted by the 3N vector x; the ensemble of these vectors defines the configurational space Ω of volume V^N. The momenta of the particles are denoted by the 3N vector p and the corresponding space by Ω_p. Because the forces do not depend on the velocities, the contributions of the kinetic energy, p²/2m, and the interaction energy, E(x), to the canonical partition function Q are separated,

Q = \frac{1}{N!\,h^{3N}} \int_{\Omega_p} \exp[-p^2/2mk_BT]\,dp \int_{\Omega} \exp[-E(x)/k_BT]\,dx    [1]

where h is Planck's constant, k_B is the Boltzmann constant, and the integration is over the respective spaces. The factor N! is included because the particles are indistinguishable; it is omitted for a single macromolecule, for example. The probability density to find the system within x and x + dx and p and p + dp is

\Pi(x,p) = \frac{\exp\{-[p^2/2m + E(x)]/k_BT\}}{N!\,h^{3N}\,Q}    [2]
The first integration in Eq. [1] is trivial and leads to most of the ideal gas contribution. More difficult to evaluate is the second integral, the configurational partition function Z,

Z = \int_{\Omega} \exp[-E(x)/k_BT]\,dx    [3]

which, in contrast to Q, is not dimensionless. Z can also be expressed as

Z = \int \nu(E)\exp[-E/k_BT]\,dE    [4]
where ν(E)dE is the volume in configurational space of the configurations with energy between E and E + dE. One can define the Boltzmann probability density (PD) over the ensemble of all configurations (compare with Π(x,p) of Eq. [2]),

P^B(x) = \exp[-E(x)/k_BT]/Z    [5]

where P^B(x)dx is the probability for the system to be between x and x + dx; P^B(x) is a normalized PD,

\int_{\Omega} P^B(x)\,dx = 1    [6]
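As a small numerical illustration of Eqs. [3], [5], and [6], the sketch below evaluates Z and P^B(x) for a single particle in one dimension by trapezoidal quadrature. The double-well potential, the reduced value k_BT = 1, and the integration limits are assumptions made only for this example.

```python
import math

K_B_T = 1.0  # k_B*T in reduced units (assumed value)


def energy(x):
    """Hypothetical double-well potential E(x), for illustration only."""
    return (x * x - 1.0) ** 2


def grid(x_min, x_max, n_points):
    """Equally spaced points and trapezoidal weights on [x_min, x_max]."""
    h = (x_max - x_min) / (n_points - 1)
    points = [x_min + i * h for i in range(n_points)]
    weights = [h] * n_points
    weights[0] = weights[-1] = 0.5 * h
    return points, weights


def partition_function(x_min=-3.0, x_max=3.0, n_points=6001):
    """Z = integral over Omega of exp[-E(x)/k_B T] dx (Eq. [3])."""
    points, weights = grid(x_min, x_max, n_points)
    return sum(w * math.exp(-energy(x) / K_B_T) for x, w in zip(points, weights))


def boltzmann_pd(x, z):
    """P^B(x) = exp[-E(x)/k_B T] / Z (Eq. [5])."""
    return math.exp(-energy(x) / K_B_T) / z


def normalization(z, x_min=-3.0, x_max=3.0, n_points=6001):
    """Integral of P^B(x) dx over Omega, which should equal 1 (Eq. [6])."""
    points, weights = grid(x_min, x_max, n_points)
    return sum(w * boltzmann_pd(x, z) for x, w in zip(points, weights))


if __name__ == "__main__":
    z = partition_function()
    print("Z =", z)
    print("integral of P^B =", normalization(z))
```

Direct quadrature of this kind is feasible only in very few dimensions, which is why simulation methods that sample P^B without knowing Z are needed for realistic systems.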
The Absolute Entropy and Free Energy as Ensemble Averages
P^B(x) enables one to define ensemble averages of various thermodynamic quantities that are defined over the configurational space. For example, the average configurational energy is

\langle E \rangle = \int_{\Omega} P^B(x)E(x)\,dx    [7]

Note that in the terminology of the theory of statistics, a function such as E(x) that is defined over Ω is called a random variable; the brackets denote a statistical average defined with the Boltzmann PD. We now define a total absolute entropy S^t that depends on both the coordinates and the momenta (see Eqs. [1] and [2]),⁶²

S^t = -k_B \int \Pi(p,x)\ln[\Pi(p,x)\,h^{3N}N!]\,dx\,dp    [8]

where the integration is carried out over Ω and Ω_p, and S^t is thus formally expressed as a statistical average over phase space.⁶³ Again we are mainly interested in the absolute configurational entropy S, which can be viewed as a statistical average ⟨S⟩ of the quantity ln P^B(x),

S = \langle S \rangle = -k_B \int_{\Omega} P^B(x)\ln P^B(x)\,dx    [9]

For simplicity, throughout this chapter we shall always use the common notation S rather than ⟨S⟩. It should be pointed out that whereas ⟨E⟩ (Eq. [7]) is a well-defined function, S (Eq. [9]) is defined up to an additive constant that depends on the units used for x. Equation [8] is useful because it enables one to calculate the contribution of the momenta to the entropy. The configurational Helmholtz free energy F can also be expressed formally as a statistical average, ⟨F⟩, of the random variable [E(x) + k_BT ln P^B(x)],

F = \langle F \rangle = -k_BT \ln Z = \langle E \rangle - TS = \int_{\Omega} P^B(x)[E(x) + k_BT \ln P^B(x)]\,dx    [10]
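The chain of equalities in Eq. [10] can be checked on a toy system. The sketch below replaces the integrals over Ω by sums over a handful of discrete states, which is an assumption made purely for illustration; the energy values in `LEVELS` and the reduced units are hypothetical.

```python
import math

K_B = 1.0  # Boltzmann constant in reduced units (assumed)
LEVELS = [0.0, 0.5, 1.3, 2.0]  # hypothetical configuration energies


def boltzmann(energies, temperature):
    """Return (probabilities P_i^B, partition function Z) for discrete states."""
    weights = [math.exp(-e / (K_B * temperature)) for e in energies]
    z = sum(weights)
    return [w / z for w in weights], z


def averages(energies, temperature):
    """Return <E>, S, <F>, Z (discrete analogues of Eqs. [7], [9], and [10])."""
    probs, z = boltzmann(energies, temperature)
    mean_e = sum(p * e for p, e in zip(probs, energies))
    entropy = -K_B * sum(p * math.log(p) for p in probs)
    free_energy = sum(
        p * (e + K_B * temperature * math.log(p)) for p, e in zip(probs, energies)
    )
    return mean_e, entropy, free_energy, z


if __name__ == "__main__":
    t = 0.8
    mean_e, entropy, free_energy, z = averages(LEVELS, t)
    print("<E> - T*S   =", mean_e - t * entropy)
    print("<F>         =", free_energy)
    print("-k_B T ln Z =", -K_B * t * math.log(z))
```

All three printed values agree to rounding error, as Eq. [10] requires when the exact P^B is used.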
Again, for simplicity we shall use the common notation F rather than ⟨F⟩. Some theories enable one to define an approximate PD that depends on a set of parameters m and thus leads to an approximate free energy F(m). Such a parameter, for example, is the long-range order in the Bragg-Williams approximation of the Ising model.⁵⁰ The “minimum free energy principle”⁶⁴,⁶⁵ states that F(m) ≥ F, and thus the lower F(m), the better the approximation. Relying on this principle, we see that the best free energy F(m*) is obtained for the optimal set m* that minimizes F(m). The principle is used in the stochastic models method of Alexandrowicz³⁹⁻⁴¹ (see the section following Eq. [79]) and the scanning method³¹⁻³³ (see the section following Eq. [65]).

F can be expressed in terms of another statistical average. By using the equality V^N = ∫_Ω exp[−E(x)/k_BT] exp[+E(x)/k_BT] dx, one can write the identity Z = ZV^N/J, where J is the latter integral; this leads to

F = -k_BT \ln Z = -k_BT \ln \frac{V^N}{\langle \exp[+E/k_BT] \rangle}    [11]

F is thus expressed as the statistical average of the random variable exp[+E(x)/k_BT] with the Boltzmann PD. Notice that F (like S) is defined up to an additive constant. ⟨E⟩, S, and F are extensive variables; that is, they are proportional to the number of particles N. One can also define the Gibbs free energy G = F + PV of the (NPT) ensemble. Even though G is calculated in simulations more often than F, for the sake of simplicity (and without loss of generality) we shall describe the general theory and the various methods with respect to F.
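The identity behind Eq. [11] can be verified numerically for a case where the integrals are easy. The sketch below assumes a single particle (N = 1) in a one-dimensional box of length `BOX_LENGTH`, so that V^N is simply the box length; the potential and all parameter values are hypothetical.

```python
import math

K_B_T = 1.0       # k_B*T in reduced units (assumed)
BOX_LENGTH = 4.0  # "volume" V of a one-dimensional box (assumed), with N = 1


def energy(x):
    """Hypothetical potential inside the box, for illustration only."""
    return 2.0 * math.sin(math.pi * x / BOX_LENGTH) ** 2


def midpoint_grid(n_cells=4000):
    """Cell midpoints and cell width for midpoint quadrature over the box."""
    h = BOX_LENGTH / n_cells
    return [(i + 0.5) * h for i in range(n_cells)], h


def free_energy_from_z():
    """F = -k_B T ln Z, with Z from midpoint quadrature (Eq. [3])."""
    xs, h = midpoint_grid()
    z = sum(math.exp(-energy(x) / K_B_T) for x in xs) * h
    return -K_B_T * math.log(z)


def free_energy_from_exp_average():
    """F = -k_B T ln(V^N / <exp[+E/k_B T]>), the second representation (Eq. [11])."""
    xs, h = midpoint_grid()
    boltz = [math.exp(-energy(x) / K_B_T) for x in xs]
    z = sum(boltz) * h
    # Boltzmann average of exp[+E/k_B T]; each term of the sum reduces to h / Z.
    avg = sum((b / z) * math.exp(energy(x) / K_B_T) for x, b in zip(xs, boltz)) * h
    return -K_B_T * math.log(BOX_LENGTH / avg)


if __name__ == "__main__":
    print("F from Z:            ", free_energy_from_z())
    print("F from <exp[+E/kT]>: ", free_energy_from_exp_average())
```

With exact quadrature the two routes coincide. In a simulation, however, the average of exp[+E/k_BT] must be estimated from sampled configurations, and the statistical behavior of that estimate is a separate question from the identity itself.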
Fluctuations
Fluctuations in the thermodynamic functions determine the efficiency with which the statistical averages of the latter can be estimated (see the next section). For example, the variance of the energy is

\sigma^2(E) = \int_{\Omega} P^B(x)[E(x) - \langle E \rangle]^2\,dx = \langle E^2 \rangle - \langle E \rangle^2    [12]
and it is easy to show that σ²(E) is proportional to the specific heat C,⁶² which in most cases is an extensive variable, that is, C ∝ N. (Notice, however, that in a second-order phase transition at the critical point, C can increase faster, C ∝ N^{1+α*}, where α* > 0 is a critical exponent.) Therefore, in most cases the standard deviation of the energy, σ(E), and the standard deviation of the entropy, σ(S), are proportional to √N (see below). Thus, while the absolute fluctuation increases with increasing system size, the relative fluctuation decreases as √N/N = N^{-1/2}. In homogeneous systems such as fluids, one is generally interested in the energy (or entropy, or other property) per particle of very large systems (i.e., in the thermodynamic limit); therefore, simulating large systems is desirable, not only because of the smaller finite size effects, but also from the statistical point of view. As discussed later, for protein systems one seeks to minimize the absolute fluctuation.

The fluctuation of the free energy depends on the representation of F. In Eq. [10] the random variable [E(x) + k_BT ln P^B(x)] is equal to −k_BT ln Z (i.e., it is constant for any x); this can be obtained by expressing P^B(x) in terms of Eq. [5] and taking the logarithm. The fact that the random variable is constant means that the fluctuation is zero; that is, σ(F) = 0. In other words, the fluctuations of the energy and entropy cancel each other: σ(E) = Tσ(S). The zero fluctuation of F is not only of theoretical interest but also has a practical use.⁶⁶,⁶⁷ For approximate procedures, the fluctuation of the approximate F is not zero, and one can propose, in addition to the “minimum free energy principle” (see discussion following Eq. [10]), the “principle of minimum free energy fluctuation,” which states that as σ(F(m)) decreases, the approximation improves.⁶⁷ This principle enables one to improve the analysis of simulation data obtained by the stochastic models method³⁹⁻⁴¹ and the scanning method.³¹⁻³³

Before discussing the fluctuation of the second representation of F (Eq. [11]), it is helpful to emphasize some of the properties of the Boltzmann PD. The sharp decrease of the ratio σ(E)/⟨E⟩, which is proportional to 1/√N, as N is increased means that for a large enough system at a finite temperature the main contribution to Z comes from a narrow range of energies around ⟨E⟩. In other words, the function f_T(E) = ν(E) exp[−E/k_BT] (Eq. [4]) has a sharp peak at a typical temperature-dependent energy E*_T = ⟨E⟩ (see Figure 1). Thus, to a very good approximation,

Z = \nu(E_T^*)\exp[-E_T^*/k_BT]    [13]

and the entropy can be approximated by

S = k_B \ln \nu(E_T^*)    [14]

and the free energy by

F = E_T^* - k_BT \ln \nu(E_T^*)    [15]

Another expression of Eq. [13] is ν(E*_T)P^B[x(E*_T)] = 1; therefore,

S = -k_B \ln P^B[x(E_T^*)]    [16]
This equation means that for a large system, the entropy can be obtained from the probability of a “typical” configuration, that is, one with energy E*_T. This formula was used by Alexandrowicz to obtain estimates of the entropy of the Ising model.³⁹

[Figure 1 plot labels: ν(E); Max f_T(E) = ν(E*_T) exp[−E*_T/k_BT].]
Figure 1 Schematic behavior of the function f_T(E) = ν(E) exp[−E/k_BT] at a finite temperature T (see discussion about Eqs. [13]-[16]). For a large system, f_T(E) has a sharp peak at the typical energy E*_T, and ν(E*_T) is very small, which means that the part of configurational space contributing significantly to this function is exceedingly small. In the case of a first-order phase transition, two peaks exist. A peptide can reside in several different stable states in thermodynamic equilibrium corresponding to several peaks of f_T(E); however, because of the relatively small system size, the maxima of f_T(E) will not be sharp (compare with Figure 2).

The discussion above demonstrates that for large systems the main contribution to the integrals defining ⟨E⟩, S, and Z comes from a very small region of Ω that depends on T. The corresponding configurations share the same energy E*_T and other macroscopic properties (e.g., the magnetization of a spin system). Although their microscopic structures can be different, there may not exist large energy barriers between them. This picture is different for a first-order phase transition, for example, where typically two different energies contribute most significantly to the partition function. In systems such as peptides and proteins, the situation is even more complex because of the existence of multiple potential energy wells. Thus, structures belonging to the same well are similar (microscopically), but those belonging to different wells with approximately the same minimum energy and large energy barriers separating the wells might be very different. Moreover, several such wells can contribute significantly to the partition function, making simulation of protein systems extremely difficult (discussed in later sections of this chapter).

Finally, we return to the fluctuation in the second representation of F (Eq. [11]). At finite temperatures the largest contributions to the average ⟨exp[+E/k_BT]⟩ come from the high energy region, where the Boltzmann PD is exceedingly small. Therefore, if the latter average is ca. exp[a(T)N], the fluctuation is larger, being approximately exp[b(T)N], where a(T) and b(T) are temperature-dependent constants and b(T) > a(T). This makes Eq. [11] impractical for estimating F, an issue that is further discussed below (see “Free Energy from ⟨exp[E/k_BT]⟩”).
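The contrast between the two kinds of fluctuation discussed in this section can be made concrete with a deliberately simple model. The sketch below assumes N independent, identical units, each with a few hypothetical energy levels, so that all moments factorize and can be computed exactly without sampling. It compares σ(E)/⟨E⟩, which decays as N^{-1/2}, with the relative fluctuation of the random variable exp[+E/k_BT] of Eq. [11], which grows exponentially with N.

```python
import math

K_B_T = 1.0  # k_B*T in reduced units (assumed)
UNIT_LEVELS = [0.0, 1.0, 2.0]  # hypothetical energy levels of one unit


def unit_moments(levels=UNIT_LEVELS, kt=K_B_T):
    """Single-unit Boltzmann averages: <e>, var(e), <exp(e/kT)>, <exp(2e/kT)>."""
    weights = [math.exp(-e / kt) for e in levels]
    z = sum(weights)
    probs = [w / z for w in weights]
    mean_e = sum(p * e for p, e in zip(probs, levels))
    var_e = sum(p * (e - mean_e) ** 2 for p, e in zip(probs, levels))
    m1 = sum(p * math.exp(e / kt) for p, e in zip(probs, levels))
    m2 = sum(p * math.exp(2.0 * e / kt) for p, e in zip(probs, levels))
    return mean_e, var_e, m1, m2


def relative_fluctuation_energy(n_units):
    """sigma(E)/<E> for n independent units; decays as n**-0.5."""
    mean_e, var_e, _, _ = unit_moments()
    return math.sqrt(n_units * var_e) / (n_units * mean_e)


def relative_fluctuation_exp(n_units):
    """sigma(X)/<X> for X = exp[+E/kT]; <X> = m1**n and <X^2> = m2**n."""
    _, _, m1, m2 = unit_moments()
    return math.sqrt((m2 / (m1 * m1)) ** n_units - 1.0)


if __name__ == "__main__":
    for n in (10, 100, 400):
        print(n, relative_fluctuation_energy(n), relative_fluctuation_exp(n))
```

In this model the average of exp[+E/k_BT] grows as exp[aN] with a = ln m1, while its standard deviation grows roughly as exp[bN] with b = (1/2) ln m2, and b > a, in line with the argument above. Interacting systems will differ in detail, but the exponential growth of the relative fluctuation is the practical obstacle.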
Entropy and Free Energy Differences by “Calorimetric” Thermodynamic Integration
Thus far we have discussed the absolute free energy and entropy. In many cases one is interested only in the difference of these quantities between two states, and this difference can be obtained by suitable thermodynamic integration schemes. The entropy difference of a system between temperatures T_1 and T_2 is

\Delta S = S(T_2) - S(T_1) = \int_{T_1}^{T_2} \frac{\partial S}{\partial T}\,dT    [17]

Expressing S = (⟨E⟩ − F)/T and taking the derivative of S with respect to T (using Eqs. [3], [5], and [7]) leads to ∂S/∂T = T^{-1}(∂⟨E⟩/∂T) = T^{-1}C, where C is the specific heat. Therefore

\Delta S = \int_{T_1}^{T_2} T^{-1} C\,dT    [18]

Equation [18] enables one to obtain the absolute entropy at T_2 if the absolute entropy at T_1 is known; note that for a classical system, T_1 = 0 cannot be chosen because C is finite at this temperature and the integral is undefined. Calculation of ΔS enables one to obtain ΔF because in simulations E_1 and E_2 (hence ΔE) are obtained easily. The difference in −ln Z between two temperatures (at constant N and V) can be obtained by integrating −(∂ ln Z)/∂β = ⟨E⟩ with respect to β, where β = 1/k_BT. Thus

-\ln Z(\beta_2) - [-\ln Z(\beta_1)] = \int_{\beta_1}^{\beta_2} \langle E \rangle\,d\beta    [19]
Another integration, between two densities ρ_1 and ρ_2 (ρ = N/V) at constant T, leads to the corresponding difference in ln Z(ρ),

\frac{k_BT}{N}\{-\ln Z(\rho_2) - [-\ln Z(\rho_1)]\} = \int_{\rho_1}^{\rho_2} \frac{P}{\rho^2}\,d\rho    [20]

where P is the pressure. Although the integration in both Eqs. [18] and [19] is carried out over extensive variables, the use of Eq. [19] is preferred because in simulations the energy converges faster than the specific heat.
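The two temperature integrations, Eqs. [18] and [19], can be tried out on a toy system for which the exact answer is available. The sketch below assumes a system with a few hypothetical discrete energy levels and obtains the specific heat from the energy variance through the standard relation C = σ²(E)/(k_BT²), which is the proportionality mentioned in the Fluctuations section; the trapezoidal rule and all numerical values are illustrative choices.

```python
import math

K_B = 1.0  # Boltzmann constant in reduced units (assumed)
LEVELS = [0.0, 1.0, 1.5]  # hypothetical energy levels


def moments(temperature):
    """Return ln Z, <E>, and the energy variance at the given temperature."""
    beta = 1.0 / (K_B * temperature)
    weights = [math.exp(-beta * e) for e in LEVELS]
    z = sum(weights)
    mean_e = sum(w * e for w, e in zip(weights, LEVELS)) / z
    mean_e2 = sum(w * e * e for w, e in zip(weights, LEVELS)) / z
    return math.log(z), mean_e, mean_e2 - mean_e ** 2


def entropy(temperature):
    """Reference absolute entropy S = (<E> - F)/T with F = -k_B T ln Z."""
    ln_z, mean_e, _ = moments(temperature)
    return mean_e / temperature + K_B * ln_z


def trapezoid(func, a, b, n=2000):
    """Trapezoidal rule on [a, b]; also handles b < a (signed integral)."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b)) + sum(func(a + i * h) for i in range(1, n))
    return total * h


def delta_s_calorimetric(t1, t2):
    """Eq. [18]: integrate C/T, with C = sigma^2(E) / (k_B T^2) from fluctuations."""
    return trapezoid(lambda t: moments(t)[2] / (K_B * t ** 3), t1, t2)


def delta_minus_ln_z(t1, t2):
    """Eq. [19]: integrate <E> over beta from 1/(k_B T1) to 1/(k_B T2)."""
    def mean_e_of_beta(beta):
        return moments(1.0 / (K_B * beta))[1]

    return trapezoid(mean_e_of_beta, 1.0 / (K_B * t1), 1.0 / (K_B * t2))


if __name__ == "__main__":
    t1, t2 = 0.5, 2.0
    print("Delta S, Eq. [18]:", delta_s_calorimetric(t1, t2))
    print("Delta S, exact:   ", entropy(t2) - entropy(t1))
    print("Eq. [19] integral:", delta_minus_ln_z(t1, t2))
    print("exact difference: ", -moments(t2)[0] + moments(t1)[0])
```

Here every point along the integration path is computed exactly; in a simulation each point would require its own run, which is the cost referred to in the Introduction.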
The Kirkwood and Zwanzig Equations
Important integration schemes by Kirkwood$^{13}$ and perturbation schemes by Zwanzig$^{14}$ were proposed and applied extensively and with great success, first to fluid models, but more recently to small molecules, peptides, and proteins in aqueous solutions. With these methods, one calculates the difference in free energy, not due to a change in thermodynamic variables, but rather from a change in parameters of the Hamiltonian, such as the Lennard-Jones constants $\varepsilon$ and $\sigma$. The word “Hamiltonian” was used because such changes can include the masses of the particles as well as changes in the potential functions. We shall use this word more often in this chapter as a synonym for the potential energy function $E(x)$ of Eq. [1]. Denoting two different potential energies by $E_1(x)$ and $E_2(x)$, one obtains
$$\Delta F = F_2 - F_1 = -k_BT \ln\frac{Z_2}{Z_1} = -k_BT \ln\frac{\int_\Omega \exp[-E_2(x)/k_BT]\,dx}{\int_\Omega \exp[-E_1(x)/k_BT]\,dx} \qquad [21]$$
Multiplying the integrand in the numerator by $\exp[-E_1(x)/k_BT]\exp[+E_1(x)/k_BT] = 1$ leaves it unchanged. Denoting $\Delta E(x) = E_2(x) - E_1(x)$ leads to
$$\Delta F = -k_BT \ln \langle \exp[-\Delta E(x)/k_BT]\rangle_1 \qquad [22]$$
This is the basic perturbation formula of Zwanzig,$^{14}$ in which $\Delta F$ is obtained from the ensemble average (at $T$) of the exponential of the difference in energy with a Boltzmann PD defined by $E_1(x)$; this average is denoted by $\langle\ \rangle_1$. Clearly, if the two Hamiltonians are significantly different (i.e., if $\Delta E(x)$ is large), the fluctuation of the exponential will be extremely large; in this respect Eq. [22] is similar to Eq. [11] (see discussion following Eq. [16]). Therefore, Eq. [22] is useful only for small differences in the Hamiltonian. We show later (see Eq. [33]) that the fluctuation of this exponential can be decreased by expressing the average of Eq. [22] in terms of statistical averages defined with a non-Boltzmann PD. In many applications a continuous transition between $E_1$ and $E_2$ can be defined with a parameter $\lambda$ ($0 \le \lambda \le 1$), such that $E(x, \lambda = 0) = E_1(x)$ and $E(x, \lambda = 1) = E_2(x)$. This range can be subdivided further into smaller ranges ($\Delta\lambda_1$, $\Delta\lambda_2$, etc.), and $\Delta F$ can be expressed in terms of the corresponding differences $\Delta F(\Delta\lambda_i)$. For very small increments $\Delta\lambda_i$, one can expand the exponent in Eq. [22] and consider only the first term in that expansion. This leads to a summation over the terms $\langle \Delta E(\Delta\lambda_i)\rangle_{\lambda_i}$, which can be approximated by the integral
$$\Delta F = \int_0^1 \left\langle \frac{\partial E(x,\lambda)}{\partial\lambda}\right\rangle_\lambda d\lambda \qquad [23]$$
This thermodynamic integration equation was first derived by Kirkwood$^{13}$; it can also be obtained directly by integrating $\partial F(\lambda)/\partial\lambda$ with respect to $\lambda$ from $\lambda = 0$ to 1, where $F$ is expressed in terms of $Z$ (Eq. [3]). Equation [23] is similar to Eq. [18], where the specific heat, which is the derivative of the energy with respect to the temperature, is integrated. However, the latter derivative is generally smoother than that of Eq. [23] when creation and annihilation of particles is involved (see the section “Thermodynamic Cycles”). The Zwanzig perturbation scheme can also be applied to changes in the temperature, where the Hamiltonian is kept constant. Thus, for $T_1$ and $T_2$ one obtains Eq. [24] in a similar manner to that used for Eq. [22]:
$$\frac{F(T_2)}{k_BT_2} - \frac{F(T_1)}{k_BT_1} = -\ln\left\langle \exp\left[-E(x)\left(\frac{1}{k_BT_2} - \frac{1}{k_BT_1}\right)\right]\right\rangle_{T_1} \qquad [24]$$
Again, if $\Delta T$ is large, intermediate temperatures $T_i$ can be defined and the difference in free energy expressed in terms of intermediate free energy differences. For a fine mesh, expanding the exponential in Eq. [24] to first order and taking the logarithm leads to $\langle E\rangle_{T_i}\,\Delta T/T_i^2$ (up to the factor $-1/k_B$), where $\Delta T = T_{i+1} - T_i$, and $T_i$ and $T_{i+1}$ are two close successive temperatures. One can sum up these differences between $T_1$ and $T_2$ or integrate them, which leads to Eq. [19] (see also the derivation of Eq. [23]). To decrease the fluctuation of the perturbation average (Eq. [22]), Bennett proposed the following scheme$^{68}$:
$$\Delta F = -k_BT \ln\frac{\langle w \exp[-E_2/k_BT]\rangle_1}{\langle w \exp[-E_1/k_BT]\rangle_2} \qquad [25]$$
Notice that here two averages are defined with respect to $E_1$ and $E_2$, whereas the Zwanzig equation is based on only one average. The function $w$ is chosen to minimize the variances of both the numerator and denominator. Finally, it should be pointed out that a derivation similar to that of Eq. [22] was also proposed by Widom$^{69}$ and by Jackson and Klein$^{70}$ for calculating the chemical potential $\mu$. Here, however, the variable is the number of particles, $N$, and so one calculates $\mu = \Delta F = F(N + 1) - F(N)$ at constant $T$ and $V$. This method has been further developed over the intervening years and is now used extensively in simulations of fluid systems (see Refs. 71-75).
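The perturbation average of Eq. [22] can be sketched numerically as follows. This is a minimal illustration under assumptions that are not part of the original text: two one-dimensional harmonic energies $E_1 = k_1x^2/2$ and $E_2 = k_2x^2/2$, $k_BT = 1$, direct Gaussian sampling of the Boltzmann PD of $E_1$, and a hypothetical helper `zwanzig_delta_f`. The exact result for this toy pair is $\Delta F = \frac{1}{2}k_BT\ln(k_2/k_1)$.

```python
import numpy as np

def zwanzig_delta_f(delta_e, kT):
    """Eq. [22]: -kT * ln < exp(-delta_e/kT) >_1, with the maximum factored out
    of the exponentials so that the average is evaluated stably."""
    a = -np.asarray(delta_e) / kT
    a_max = a.max()
    return float(-kT * (a_max + np.log(np.mean(np.exp(a - a_max)))))

kT, k1, k2 = 1.0, 1.0, 1.5          # assumed toy parameters
rng = np.random.default_rng(1)

# Configurations selected with the Boltzmann PD of E_1 (Gaussian for this model).
x = rng.normal(0.0, np.sqrt(kT / k1), size=100_000)
delta_e = 0.5 * (k2 - k1) * x**2    # E_2(x) - E_1(x)

fep = zwanzig_delta_f(delta_e, kT)
exact = 0.5 * kT * np.log(k2 / k1)
print(f"perturbation estimate = {fep:.4f}, exact = {exact:.4f}")
```

The two force constants were chosen close to each other on purpose; as noted above, the fluctuation of the exponential grows quickly when the two Hamiltonians differ substantially.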
BASIC SAMPLING THEORY AND SIMULATION
In the preceding sections we defined the configurational space $\Omega$, the Boltzmann PD, and random variables such as the configurational energy $E(x)$, the entropy $-k_B\ln P^B(x)$, and the free energy $E(x) + k_BT \ln P^B(x)$, where our interest is to calculate their ensemble averages and fluctuations. However, for realistic models these integrals cannot be evaluated analytically and their estimation becomes a target of various numerical statistical methods. The mathematical formalism behind the basic estimation theory requires defining the “product” space denoted by $(\Omega_1 \times \Omega_2 \times \cdots \times \Omega_n)$, where all the $\Omega_i$ are equal to $\Omega$. This is needed because $\Omega$ will be sampled $n$ times, and these samples should be distinguishable. A configuration of this space is represented by the vector $(x_1, x_2, \ldots, x_n)$, where $x_i$ is a $3N$ coordinate vector of $\Omega_i$. A random variable, such as the energy, defined on $\Omega_i$ is denoted by $E_i$. A random variable $E^n$ called “the sample mean of the energy” is defined over the product space,
$$E^n = \frac{1}{n}\sum_{i=1}^{n} E_i \qquad [26]$$
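It can be seen that $\langle E^n\rangle = \langle E\rangle$, and if the energies are uncorrelated (i.e., if for each pair $i$, $j$, $\langle E_iE_j\rangle = \langle E_i\rangle\langle E_j\rangle = \langle E\rangle^2$), the standard deviation of $E^n$ decreases with increasing $n$ as$^{76}$
$$\sigma(E^n) = \frac{\sigma(E)}{n^{1/2}} \qquad [27]$$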
If the energy is correlated, the decrease in $\sigma(E^n)$ is smaller. To estimate $\langle E^n\rangle$, one can select from $\Omega$ $n$ configurations with the Boltzmann PD and calculate the arithmetic average of the energy, $\bar{E}_n$,
$$\bar{E}_n = \frac{1}{n}\sum_{t=1}^{n} E_{i(t)} \qquad [28]$$
where $i(t)$ is configuration $i$ obtained at the $t$th step of the process. $\bar{E}_n$ will converge to $\langle E\rangle$ as $n$ is increased because there is a corresponding decrease in $\sigma(E^n)$ (Eq. [27]). It is important to make the distinction between the ensemble average $\langle E\rangle$ (the integral), which is denoted with angle brackets, and its estimation $\bar{E}_n$, which appears with the bar. Obviously, the estimation described by Eq. [28] becomes practical only if one knows how to select configurations with their Boltzmann PD. At finite temperatures such a sampling might not be easy because of the need to find the very small region of $\Omega$ that contributes most significantly to $P^B$ (see discussion preceding and following Eq. [13]). Fortunately, the Metropolis Monte Carlo (MC) procedure and the molecular dynamics (MD) method enable one to achieve this goal under certain conditions (see section “The Monte Carlo and Molecular Dynamic Methods”). Finally, we write the estimator of the entropy $\bar{S}_n$,
$$\bar{S}_n = -\frac{k_B}{n}\sum_{t=1}^{n} \ln P^B_{i(t)} \qquad [29]$$
which differs from $\bar{E}_n$ (Eq. [28]) in that one needs to know not only how to sample with $P^B$, but also the value of $\ln P^B$. The latter value in most cases is not available, as mentioned in the Introduction and as discussed later (see the section entitled “Properties of MC and the Difficulty of Obtaining S”). It should be pointed out that for fluid systems, where the energy (or $S$ and $F$) per particle is of interest (see discussion following Eq. [12]), Eq. [27] yields $\sigma(E^n) \sim (nN)^{-1/2}$, which means that decreasing the fluctuation can be achieved equivalently by increasing $N$ or $n$ of an uncorrelated sample. On the other hand, the size of a protein system cannot be increased, and one is generally interested in the difference in energy (or $F$) between two states. If each of these states is simulated separately, one seeks to decrease the absolute errors (Eq. [27]) by increasing the sample size $n$ until the two errors (fluctuations) become significantly smaller than the expected difference in energy.
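The $n^{-1/2}$ behavior of Eq. [27], and its weakening for correlated samples, can be checked with a small numerical experiment. The sketch below is illustrative only: the "energies" are unit-variance Gaussian numbers, correlation is imitated with a first-order autoregressive series (an assumption standing in for the correlations produced by MC or MD), and the helper names `std_of_sample_mean` and `ar1` are hypothetical.

```python
import numpy as np

def std_of_sample_mean(samples):
    """samples has shape (replicas, n); return the std of the n-sample means."""
    return float(np.std(samples.mean(axis=1), ddof=1))

def ar1(replicas, n, phi, rng):
    """Stationary AR(1) series with unit variance; phi is the correlation
    between successive values (phi = 0 gives uncorrelated samples)."""
    x = np.empty((replicas, n))
    x[:, 0] = rng.normal(size=replicas)
    noise = rng.normal(size=(replicas, n)) * np.sqrt(1.0 - phi**2)
    for t in range(1, n):
        x[:, t] = phi * x[:, t - 1] + noise[:, t]
    return x

rng = np.random.default_rng(2)
replicas, n = 2000, 100

sigma_uncorr = std_of_sample_mean(rng.normal(size=(replicas, n)))
sigma_corr = std_of_sample_mean(ar1(replicas, n, phi=0.9, rng=rng))
expected = 1.0 / np.sqrt(n)          # Eq. [27] with sigma(E) = 1

print(f"uncorrelated: {sigma_uncorr:.4f} (Eq. [27] predicts {expected:.4f})")
print(f"correlated:   {sigma_corr:.4f}")
```

For the same $n$, the correlated series gives a noticeably larger spread of the sample mean, which is the reason an uncorrelated sample is specified in the discussion above.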
Importance Sampling
Importance sampling is a practical tool for estimating integrals such as $Z$ (Eq. [3]). The first step is to express $Z$ as a statistical average by multiplying and dividing the integrand by any PD, denoted $P(x)$, which is normalized over $\Omega$ ($\int_\Omega P(x)\,dx = 1$),
$$Z = \int_\Omega P(x)\left\{\frac{\exp[-E(x)/k_BT]}{P(x)}\right\} dx \qquad [30]$$
The expression in the braces can be considered to be a random variable, which is averaged with $P(x)$. Selecting $n$ configurations with this PD enables one to estimate $Z$ by $\bar{Z}_n$, which is defined similarly to $\bar{E}_n$ (Eq. [28]). Notice, however, that unlike the latter case [but like for $\bar{S}_n$ (Eq. [29])] one must know the value of $P(x)$, which is part of the random variable. The efficiency of this procedure depends on the variance, $\sigma^2[Z(P)]$,
$$\sigma^2[Z(P)] = \int_\Omega P(x)\left\{\frac{\exp[-E(x)/k_BT]}{P(x)} - Z\right\}^2 dx \qquad [31]$$
Thus, with “simple sampling,” which is based on a uniform PD (i.e., $P^u(x) = \Omega^{-1} = V^{-N}$, which means that all points of configurational space are equally probable; see discussions preceding Eqs. [1] and [11]), the estimation of $Z$ is efficient only at very high temperatures, where the system configurations are approximately equally probable. At finite temperatures, however, $\sigma^2[Z(P)]$ becomes very large (for a large system), because the probability for selecting configurations from the small region of $\Omega$ with energy $E^*$ is slim. However, defining (sometimes guessing) PDs that give higher probability than $P^u(x)$ to this region will decrease $\sigma^2[Z(P)]$ and improve the sampling efficiency. This practice is known as “importance sampling,” as opposed to the simple sampling described above (a detailed discussion on importance sampling can be found in Chapter 4 of Ref. 78, and in Ref. 79). Notice that a perfect sampling is obtained if $P(x) = P^B(x)$ because the variance vanishes. This means that a sample of a single configuration estimates $Z$ exactly, which is equivalent to the result of $\sigma(F) = 0$, derived in the section following Eq. [12]. In practice this is not of much help because the normalization factor of $P^B$ should be known, which is the integral $Z$ itself! Obviously, perfect sampling will be obtained for any function $G$ (replacing $\exp[-E(x)/k_BT]$ in Eq. [30]) for the PD $G/Z$. In an ensuing section entitled “The Scanning Method,” we show that within the framework of this method one defines PDs for polymer chains that can be improved systematically.
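The contrast between simple and importance sampling in Eq. [30] can be made concrete with a toy calculation. The following sketch rests on assumptions chosen only for illustration: a one-dimensional energy $E(x) = x^2/2$ in a box of length 20, a low temperature $k_BT = 0.01$, and a guessed Gaussian PD of width 0.15 (its normalization over the whole line rather than over the box is a negligible approximation here). The helper name `estimate_z` is hypothetical.

```python
import numpy as np

kT, box = 0.01, 20.0                  # assumed low temperature and box length

def boltzmann_factor(x):
    """exp(-E/kT) for the toy energy E(x) = 0.5*x**2."""
    return np.exp(-0.5 * x**2 / kT)

def estimate_z(x, p):
    """Sample mean and sample std of the random variable exp(-E/kT)/P (Eq. [30])."""
    ratio = boltzmann_factor(x) / p
    return float(ratio.mean()), float(ratio.std(ddof=1))

rng = np.random.default_rng(3)
n = 10_000

# Simple sampling: uniform PD over the box, P_u = 1/box.
x_u = rng.uniform(-box / 2, box / 2, size=n)
z_simple, s_simple = estimate_z(x_u, np.full(n, 1.0 / box))

# Importance sampling: a guessed Gaussian PD, somewhat wider than the
# Boltzmann peak, which has width sqrt(kT) = 0.1.
s = 0.15
x_g = rng.normal(0.0, s, size=n)
p_g = np.exp(-0.5 * (x_g / s) ** 2) / (s * np.sqrt(2.0 * np.pi))
z_imp, s_imp = estimate_z(x_g, p_g)

exact = np.sqrt(2.0 * np.pi * kT)     # the box edges contribute negligibly
print(f"exact Z = {exact:.4f}")
print(f"simple:     Z = {z_simple:.4f}, std of random variable = {s_simple:.3f}")
print(f"importance: Z = {z_imp:.4f}, std of random variable = {s_imp:.3f}")
```

The closer the guessed PD is to the Boltzmann PD, the smaller the standard deviation of the random variable; in the limit $P = P^B$ it vanishes, as stated above.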
The Umbrella Sampling Equation
The discussion above highlights the fact that estimating $Z$ by importance sampling requires knowledge of the value of a normalized PD, $P$. However, the normalization factor is not needed for estimating a statistical average. For example, the average in Eq. [22] is defined with the Boltzmann PD. To decrease the variance, one can use a different PD such as
$$P^w(x) = \frac{w(x)\exp[-E_1(x)/k_BT]}{\int_\Omega w(x)\exp[-E_1(x)/k_BT]\,dx} = \frac{w(x)\exp[-E_1(x)/k_BT]}{Z_w} \qquad [32]$$
where $w(x)$ is a known function of the coordinates. Calculation of the normalization factor $Z_w = \int_\Omega w(x)\exp[-E_1(x)/k_BT]\,dx$ might be as difficult as calculation of $Z$ itself. However, each of the integrals in the numerator and denominator of Eq. [22] can be converted into a statistical average defined with $P^w$. Thus, Eq. [22] becomes Eq. [33], where unlike in Eq. [30], $Z_w$ does not appear in the random variables, $\exp[-\Delta E/k_BT]/w$ and $1/w$. Equation [33] is
$$\Delta F = -k_BT \ln\frac{\int_\Omega w(x)\exp[-E_1(x)/k_BT]\,\{\exp[-\Delta E(x)/k_BT]/w(x)\}\,dx\,/\,Z_w}{\int_\Omega w(x)\exp[-E_1(x)/k_BT]\,\{1/w(x)\}\,dx\,/\,Z_w} = -k_BT \ln\frac{\langle \exp[-\Delta E/k_BT]/w\rangle_w}{\langle 1/w\rangle_w} \qquad [33]$$
where $\langle\ \rangle_w$ denotes an average defined with $P^w$.
These methods led to only a moderate improvement in efficiency, meaning that for macroscopic perturbations, several windows should be used between the initial and target Hamiltonians, and their contributions to the free energy should be combined.
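A short numerical sketch may help to show how Eq. [33] is used. All specifics are assumptions introduced for illustration: two identical one-dimensional harmonic wells displaced by $d = 3$ (so that the exact $\Delta F$ is zero), $k_BT = 1$, and the particular choice $w = \exp[-\Delta E/2k_BT]$, for which $P^w$ is a Gaussian centered midway between the wells and can be sampled directly. The helper `log_mean_exp` is hypothetical.

```python
import numpy as np

def log_mean_exp(a):
    """ln of the mean of exp(a), with the maximum factored out for stability."""
    a = np.asarray(a)
    a_max = a.max()
    return float(a_max + np.log(np.mean(np.exp(a - a_max))))

kT, k, d = 1.0, 1.0, 3.0             # assumed toy parameters

def e1(x):
    return 0.5 * k * x**2

def e2(x):
    return 0.5 * k * (x - d) ** 2

rng = np.random.default_rng(4)
n = 200_000

# Direct use of Eq. [22]: configurations selected with the Boltzmann PD of E_1.
x1 = rng.normal(0.0, np.sqrt(kT / k), size=n)
df_direct = -kT * log_mean_exp(-(e2(x1) - e1(x1)) / kT)

# Eq. [33] with the assumed choice w = exp(-dE/2kT): P^w is proportional to
# exp[-(E_1 + E_2)/2kT], a Gaussian centered at d/2 with variance kT/k.
xw = rng.normal(0.5 * d, np.sqrt(kT / k), size=n)
de = e2(xw) - e1(xw)
log_inv_w = 0.5 * de / kT                                  # ln(1/w)
df_umbrella = -kT * (log_mean_exp(-de / kT + log_inv_w)    # <exp(-dE/kT)/w>_w
                     - log_mean_exp(log_inv_w))            # <1/w>_w

exact = 0.0                          # identical wells: Z_2 = Z_1
print(f"direct Eq. [22]:   {df_direct:+.3f}")
print(f"umbrella Eq. [33]: {df_umbrella:+.3f}   (exact {exact:+.3f})")
```

Note that $Z_w$ never has to be computed: it cancels between the numerator and denominator, as stated for Eq. [33].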
Potential of Mean Force: Theory
Umbrella sampling has been found to be especially useful in simulations where the changes are of a local character, such as in the calculation of the potential of mean force (PMF) along a reaction coordinate. The general definition of the PMF, $W(x^1,x^2)$ (see Chapter 5 of Ref. 80), is
$$W(x^1,x^2) = -k_BT \ln g(x^1,x^2) \qquad [46]$$
where $x^1$ and $x^2$ are two position vectors in the three-dimensional volume, and $g(x^1,x^2)$ is the radial distribution function; more specifically,
$$W(x^1,x^2) = -k_BT \ln\left\{\frac{N(N-1)}{\rho^2}\,\frac{\int \exp[-E(x\,|\,(x^1,x^2))/k_BT]\,dx^{(N-2)}}{Z}\right\} \qquad [47]$$
where $E(x\,|\,(x^1,x^2))$ is the system energy when the two particles are held fixed at their positions, and $dx^{(N-2)}$ denotes an integration over the $N - 2$ moving particles only. The integral divided by $Z$ is the probability density to find the two particles fixed at their positions, and the logarithm of this PD (multiplied by $-k_BT$) is the free energy of the system under the imposed restriction. For a simple fluid, $W$ depends only on the distance $r$ between the two particles and will be denoted $W(r)$. Notice that at large separation $g \to 1$, and therefore $W(r) \to 0$; this means that $W(r)$ is the work required to bring the two particles from infinite separation to distance $r$ (see Chapter 5 of Ref. 80). However, in practice one is interested not in the absolute value of $W$ but in its shape, from which differences in free energy can be obtained. One can study, for example, the PMF of two hydrophobic particles in water as a function of their distance $r$,$^{131}$ which defines in this case the reaction coordinate $\xi$. The definition of $W$ can also be extended to other systems and various reaction coordinates, such as one or more dihedral angles $\chi$ of a side chain of a protein. The PMF for changing $\chi$ from one torsional potential well to another can thus be calculated. However, for more complex reaction coordinates, defining the PD function $P(\xi)$ in Eq. [47] might not be straightforward, because of the need to calculate the Jacobian of the transformation of the coordinates. In practice, a standard (unbiased) long MC or MD simulation is not expected to cover the full range of $\xi$ of interest. One way to compensate for this shortcoming and to estimate efficiently the PD of Eq. [47] is by carrying out several simulations along the required region of $\xi$ using umbrella sampling.
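In a simple implementation, one selects several values $\xi_0^i$ along the reaction coordinate, which constitute “seeds” around which simulations will be performed. To keep the system around $\xi_0$ (for simplicity the superscript $i$ is omitted), a harmonic restoring potential $U(\xi_0, \xi)$ is applied,
$$U(\xi_0, \xi) = K[\xi_0 - \xi(x)]^2 \qquad [48]$$
where $K$ is a parameter, and the biasing function $w$ is
$$w(x) = \exp\{-K[\xi_0 - \xi(x)]^2/k_BT\} \qquad [49]$$
In practice, one must estimate the PD of Eq. [47] at several values of $\xi$ around $\xi_0$. $P(\xi)$ can be written as
$$P(\xi) = \frac{\int_\Omega \delta[\xi(x) - \xi]\exp[-E(x)/k_BT]\,dx}{Z} \qquad [50]$$
where $\delta$ is the Dirac $\delta$ function. To obtain the umbrella sampling equation, one multiplies the integrand above and that defining $Z$ by $w/w$ and divides the numerator and the denominator by $\int w \exp[-E(x)/k_BT]\,dx$, which leads to
$$P(\xi) = \frac{\langle \delta[\xi(x) - \xi]/w\rangle_w}{\langle 1/w\rangle_w} = \frac{P^w(\xi)\exp[U(\xi_0,\xi)/k_BT]}{\langle \exp[U/k_BT]\rangle_w} \qquad [51]$$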
where $P^w$ is the biased PD measured directly from the simulation and defined by replacing $E$ in Eq. [50] by $E + U$; $\langle\ \rangle_w$ means averaging with $w \exp[-E(x)/k_BT]/\int w \exp[-E(x)/k_BT]\,dx$. $P(\xi)$ is calculated around each of the seeds $\xi_0^i$, where the constant $K$ of Eq. [48] determines the range of $\xi$ around $\xi_0^i$ that will be sampled effectively (i.e., different $\delta$ functions appear in Eq. [50]). Notice that the normalizations of the probability density functions of different windows differ because only a small part of conformational space is visited during each simulation. Therefore the distance between the seeds should not be too large because one wants to allow overlap of results for $P(\xi)$ from simulations performed around neighbor $\xi_0^i$ (windows), eventually enabling one to match them to form one PD function. The match between $P_1(\xi)$ and $P_2(\xi)$ (around neighbor seeds, $\xi_0^1$ and $\xi_0^2$, respectively) can be obtained by selecting a $\xi_a$ from the overlap region, calculating the ratio $P_1(\xi_a)/P_2(\xi_a)$, and defining $P_2'(\xi)$ by $P_2(\xi)P_1(\xi_a)/P_2(\xi_a)$. The PMF is $-k_BT \ln P(\xi)$, where $P(\xi)$ is the matched function. Finally, it should be pointed out that simulation of Eq. [51] is carried out with MD or the usual MC procedure (Eq. [34]) because the function $w$ is implemented here in terms of a restraining energy $U$ that is added to the interaction energy $E$. The facts that the energies at different windows are not significantly different (unless near a transition state) and that $U$ is a local interaction (meaning that the fluctuation of $\exp[U/k_BT]$ is relatively small) contribute to the success of this umbrella sampling technique. However, one must bear in mind that the system still should be relaxed for each window, and such relaxation may require long simulation times. Accordingly, the usual convergence checks should be applied carefully. The procedure above was first suggested by Pangali, Rao, and Berne.$^{131}$ It is well explained (with further details) in their paper.
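The unbiasing and window-matching steps described above can be sketched as follows. To keep the example deterministic, the biased PD of each window is computed analytically from an assumed double-well PMF instead of being accumulated as a histogram from a simulation; the grid, the restraint constant, the window width, and the helper names `window` and `match` are all hypothetical choices made for illustration.

```python
import numpy as np

kT = 1.0
xi = np.linspace(-2.0, 2.0, 401)             # grid along the reaction coordinate
w_true = 3.0 * (xi**2 - 1.0) ** 2            # assumed double-well PMF (toy input)

def window(center, K=10.0, half_width=0.6):
    """Unbiased but arbitrarily normalized P(xi) from one restrained window."""
    u = K * (center - xi) ** 2                   # restraint of Eq. [48]
    mask = np.abs(xi - center) <= half_width     # region this window "samples"
    p_biased = np.exp(-(w_true + u) / kT)
    p_biased = p_biased / p_biased[mask].sum()   # each window has its own norm
    return mask, p_biased * np.exp(u / kT)       # remove the bias (Eq. [51])

def match(centers):
    """Chain neighboring windows by the ratio taken at one overlap point xi_a."""
    p = np.full(xi.shape, np.nan)
    for c in centers:
        mask, p_win = window(c)
        covered = ~np.isnan(p)
        overlap = np.flatnonzero(covered & mask)
        if overlap.size:                         # rescale to agree at xi_a
            a = overlap[overlap.size // 2]
            p_win = p_win * p[a] / p_win[a]
        new = mask & ~covered
        p[new] = p_win[new]
    return p

p_matched = match(np.arange(-1.8, 1.81, 0.4))    # ten overlapping windows
pmf = -kT * np.log(p_matched)
pmf -= np.nanmin(pmf)                            # only the shape is meaningful
print(f"barrier height at xi = 0: {pmf[200]:.3f} kT")
```

With real histograms the ratio at a single $\xi_a$ is noisy, which is one motivation for the improved matching schemes mentioned later in this section; the single-point ratio is used here only because it is the procedure described in the text.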
These authors studied various aspects of the hydrophobic effect using MC for calculating the PMF of two nonpolar particles approaching each other in a bath of water molecules. A nice account of the method (and further developments) is also provided by Northrup et al.,$^{132}$ who studied the PMF of rotation of a tyrosine ring in the protein BPTI by MD. Again, the efficiency of this method stems from the relatively small size of the random variable, $\delta(\xi - \xi_0)\exp[U/k_BT]$, compared to the large random variable $\exp[-\Delta E/k_BT]$ for fluids averaged in Eq. [33], as discussed earlier. In this context, it should be pointed out that other methods for calculating the PMF by MC$^{133}$ and MD$^{134}$ simulations have been proposed, which do not apply umbrella sampling but instead calculate differences in free energy for small deviations $\pm\Delta\xi$.
Selecting the reaction coordinate and choosing an efficient umbrella sampling function is not always straightforward for a complex system. Several attempts have been made to develop general procedures for obtaining improved bias functions $w$. Mezei$^{135}$ suggested an iterative procedure in which the PMF results obtained from one set of simulations are used to define a better $w$ for the next set. He found this “adaptive umbrella sampling” to be better than the usual umbrella sampling procedure in his application of this strategy to two conformational states of alanine dipeptide in water simulated by MC. Hooft, van Eijck, and Kroon$^{136a}$ suggested another method in this spirit in which, unlike Mezei's method, information from all the sets of simulations is taken into account. Their method was used for calculating the PMF of the central torsional angle of glycol in vacuum, in water, and in CCl$_4$. For these systems, simulated by MD, the efficiency of the method was found to be comparable to that of another method they had used earlier.$^{136b}$ Efficiency studies concerning umbrella sampling were also carried out by Beutler and van Gunsteren, who applied various procedures to 1,2-dichloroethane and to glycine dipeptide both in vacuum and in water.$^{138}$ An important part of these studies was to check the performance of two-dimensional bias functions. In a more recent paper Beutler et al.$^{139}$ further applied such functions to study PMFs of side chain rotations in the peptide antamanide and compared their results to those obtained from NMR experiments. Finally, methods for better matching the results of different windows to form a single PD function have also been suggested (see Refs. 140, 141, and van Gunsteren in Ref. 53).
Potential of Mean Force: Applications
Thus far we discussed developments in the methodology of umbrella sampling and PMF calculations. An extensive literature exists, but we mention only several examples to provide an idea about the types of system one can study. Further references are cited elsewhere. A large amount of work has been devoted to explaining the association of NaCl in water using different models for water and the solutes and varying computational parameters, such as the cutoff distances of the interactions and the boundary conditions, applying both the MC and MD techniques.
It should be pointed out that in the work of Friedman and Mezei$^{147}$ the PMF obtained is approximate, because it was calculated by a slow growth thermodynamic integration procedure, where $\xi$ is not determined at fixed values. In the work of Resat, Mezei, and McCammon,$^{150}$ the PMF is calculated from grand canonical MC simulations. Chandrasekhar et al. studied the PMF of an S$_N$2 reaction between chloride ion and methyl chloride in aqueous solution and demonstrated the large influence of the solvent on the PMF as compared to the corresponding vacuum energy profile.$^{151,152}$ Another reaction, studied by Rossky's group, is of sodium-dimethyl phosphate ion pairing.$^{153,154}$ A well-studied system is alanine dipeptide (AcAlaNHMe). The relative stabilities of different conformational states in vacuum and in water were obtained from PMF calculations,$^{140,156}$ and again different models and simulation parameters were applied. In a recent study Marrone, Gilson, and McCammon$^{157}$ calculated the PMF of alanine dipeptide by using the Poisson-Boltzmann method with a hydrophobic term and by using explicit water and found comparable results. Fraternali and van Gunsteren studied PMFs of glycine dipeptide in water for two reaction coordinates.$^{158}$ Tobias and Brooks used their own technique$^{134}$ to calculate the PMF of the central torsional angle
of n-butane in vacuum,$^{155}$ in water, and in CCl$_4$$^{159}$ (see also Refs. 160 and 161). They also applied umbrella sampling to evaluate the free energy of folding/unfolding of one turn of a helix of alanine and valine in water,$^{162}$ along with the conformational equilibria of two blocked dipeptides as models for reverse turns.$^{163}$ We also mention the work of Roux and Karplus,$^{164}$ who calculated the PMF of Na$^+$ and K$^+$ ions as a function of position in a model of gramicidin A, and that of Dang and Kollman, who calculated the free energy of association of 9-methyladenine and 1-methylthymine bases in water.$^{165}$ Finally, Warshel and collaborators have used umbrella sampling for calculating free energy profiles in chemical reactions in solutions and proteins, as well as for ligand binding to proteins.$^{166-168}$
In summary, umbrella sampling has been found to be a useful tool for calculating PMFs of relatively simple systems, where the reaction coordinate can be defined efficiently. For complex systems, such a coordinate might depend on several structural parameters simultaneously and defining an efficient importance sampling function $w$ might become difficult. In this context, it is of interest to mention the work of Wallqvist and Covell,$^{169}$ who studied by MD the free energy profiles of a relatively large conformational change of dodecane, from an extended to a hairpin state in vacuum and in a box of 717 water molecules. The reaction coordinate used was the end-to-end distance $R$, and the free energy differences were calculated for small changes in $R$ by a perturbation scheme (Eq. [22]) similar to that suggested in Ref. 134. However, to handle such a large system they had to define 500 small windows, each being 0.02 Å, and they spent ca. 40 CPU days on that calculation.
Boczko and Brooks studied the PMF of folding and unfolding of a three-helix bundle where the reaction coordinate was the radius of gyration.$^{141d}$ Another study in which the PMF of a global conformational change was calculated is that of Wang et al.,$^{170}$ who investigated the twisting of $\beta$ sheets in polypeptides containing up to 15 residues.
THERMODYNAMIC CYCLES
Historical Perspective
So far we have mainly described perturbations (and integrations) with respect to real physical properties, such as a reaction coordinate or the temperature. However, in the section on umbrella sampling we provided several applications of theories derived by Zwanzig$^{14}$ (Eq. [22]) and Kirkwood$^{13}$ (Eq. [23]) for fluids that are based on nonphysical transformations of the Hamiltonian.$^{16,17,68,127,128}$ Such transformations constitute the basis of thermodynamic cycles, which have been used extensively for evaluating the free energy of binding of ligands to an enzyme, hydration free energies of small molecules, enzymatic reactions, etc.
Therefore it is of interest to review the applications along with ensuing modifications of the Kirkwood-Zwanzig ideas to computational chemistry. These ideas were introduced to simulations of simple liquids in 1969 by Hansen and Verlet$^{10}$ and Levesque and Verlet,$^{127}$ who built on the theoretical work of Barker and Henderson.$^{171,172}$ Hansen, Levesque, and Verlet integrated $\partial E/\partial\lambda$ with respect to a parameter $\lambda$ of a fluid system of $N = 864$ particles. The ability to perturb such relatively large macroscopic energy changes has two sources: first, the Hamiltonian is a smooth function of $\lambda$, and second, the investigators were interested in the free energy per particle, which is an intensive property (see discussion following Eqs. [12] and [29]). Another nonphysical macroscopic perturbation was carried out later by Mruzik,$^{173}$ who calculated the free energy of water clusters containing 8 and 64 molecules by thermodynamic integration. Likewise, Mezei et al.$^{174}$ calculated the free energy of 64 water molecules by integrating their interaction energy from zero (i.e., from an ideal gas of the same density) to its full value using the MC method (see also Ref. 128). Squire and Hoover$^{15}$ used this idea to induce a local, rather than a macroscopic change, by adding reversibly a particle into a vacancy in a crystal. Mruzik et al.$^{175}$ used thermodynamic integration to calculate the free energy required to add a water molecule to ion-water systems. Owicki and Scheraga$^{176}$ grew a cavity in a Lennard-Jones solvent by gradually changing its radius from zero to the required size and calculating the free energy by umbrella sampling. Similar studies were carried out by Postma et al.$^{177}$ (for a more detailed review, see Ref. 18). Further developments of these ideas took place in computational structural biology, where nonphysical local transformations were implemented within the framework of thermodynamic cycles.
These nonphysical transformations were introduced in 1981 by Warshel,$^{178}$ who studied ionization in acidic residues in proteins (p$K_a$ calculations). Although the cycle included nonphysical transformations, they were not carried out by the perturbation technique. A year later$^{166}$ Warshel used the perturbation method together with umbrella sampling to study the solvation free energy contribution to an electron transfer reaction coordinate, using two spheres for donor and acceptor in water; the perturbation, however, was performed along a physical path. Warshel also modeled some enzymatic reactions that involve nonphysical processes.$^{167}$ In 1984 Tembe and McCammon$^{179}$ combined all these ingredients into a method that has made a large impact in the computational chemistry community. They suggested using thermodynamic cycles for calculating the relative binding of two ligands to the active site of an enzyme by converting one ligand to the other by nonphysical perturbations. This idea, which they applied initially to a simplified model, was adopted immediately by others. Jorgensen and Ravimohan$^{180}$ calculated the difference in the solvation free energy of ethane and methanol in water from Monte Carlo simulations. Solvation free energy studies of small molecules and ions were carried out by McCammon's group$^{181,182}$ and by Kollman's group,$^{184,185}$ who calculated the
free energy difference of a leucine-alanine mutation in chymotrypsin. Wong and McCammon$^{186}$ studied the change of benzamidine to p-fluorobenzamidine and glycine to alanine in trypsin. Hermans and Shankar calculated the free energy of binding of xenon to myoglobin.$^{187}$ The number of applications of this methodology to a wide range of problems has increased rapidly, and a new subfield in computational structural biology has emerged based on these concepts (for detailed reviews see Refs. 18, 19, and 51-61).
Free Energy of Enzyme-Ligand Binding
To obtain the free energies of binding of two ligands, L1 and L2, to a protein P experimentally, one measures $\Delta F_1$ and $\Delta F_2$ (i.e., the free energies required to transfer the ligands from the solvent, where they are separated from the protein, to the active site; Figure 3). This physical transformation, however, is difficult to perform computationally. Instead, one can carry out a nonphysical perturbation in both the protein and the solvent environments, in which L1 is converted into L2, and calculate thereby $\Delta F_p$ and $\Delta F_s$, respectively. The free energy is a state function, which means that the total free energy of the cycle is zero:
$$\Delta\Delta F = \Delta F_1 - \Delta F_2 = \Delta F_s - \Delta F_p \qquad [52]$$
In other words, the relative free energy of binding is obtained from the free energy differences of the nonphysical transformations. This process can involve the annihilation and creation of atoms. In practice, considering for example the solvent case, one defines a linear hybrid potential energy function $E_s(x, \lambda)$ ($0 \le \lambda \le 1$),
$$E_s(x, \lambda) = (1 - \lambda)E_1(x_1) + \lambda E_2(x_2) \qquad [53]$$
where $x_1$ and $x_2$ both consist of the coordinates of the solvent molecules and those of L1 and L2, respectively; $x = x_1 \cup x_2$. It is seen immediately that the solvent-solvent interactions are unaffected: the solvent molecules interact with both L1 and L2, and L1 and L2 do not “feel” each other. Similar discussion applies to the hybrid energy in the complexed protein. With the perturbation method, the range [0, 1] of $\lambda$ is divided into $m$ smaller segments of size $\Delta\lambda = 1/m$ defined by $\lambda_0 = 0$, $\lambda_1 = \Delta\lambda$, ..., $\lambda_m = 1$; the differences in the segments' energies are denoted by $\Delta E_{\lambda_j} = E_{\lambda_{j+1}}(x) - E_{\lambda_j}(x)$, where the subscript $s$ is omitted from the energy for simplicity. Notice that $\Delta E_{\lambda_j}$ includes only the intraligand energy and the interaction of the ligands with the environment; if only parts of the ligand are changed, this comment applies to these parts. $\Delta F_s$ is then obtained from Eq. [22],
$$\Delta F_s = -k_BT \sum_{j=0}^{m-1} \ln \langle \exp[-\Delta E_{\lambda_j}/k_BT]\rangle_{\lambda_j} \qquad [54]$$
Figure 3 A thermodynamic cycle for the binding of two ligands, L1 and L2, to a protein P. In the experiment the ligands are transferred from the solvent to the active site (P + L1 to P:L1 and P + L2 to P:L2), and the difference $\Delta\Delta F = \Delta F_1 - \Delta F_2$ is measured. In simulations the nonphysical transformation of L1 to L2 is carried out in the protein [sim. (p)] and in solution [sim. (s)], and the corresponding free energies $\Delta F_p$ and $\Delta F_s$ are calculated. The thermodynamic cycle leads to the desired $\Delta\Delta F$ in terms of the latter free energy differences, $\Delta\Delta F = \Delta F_s - \Delta F_p$ (Eq. [52]).
and a similar equation holds for $\Delta F_p$. For small enough segments, one can approximate each summand by taking into account the first term of the Taylor expansions of both the exponent and the logarithm, which leads to
$$\Delta F_s \approx \sum_{j=0}^{m-1} \langle \Delta E_{\lambda_j}\rangle_{\lambda_j} \qquad [55]$$
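Thus, Eq. [55] has been derived like Eq. [23], which means that $\Delta F_s$ and $\Delta F_p$ can also be obtained by thermodynamic integration (Eq. [23]). In this case, one can also carry out the partial derivative of the hybrid energy (Eq. [53]) with respect to $\lambda$, which leads to $\Delta E = E_2 - E_1$, and Eq. [23] becomes
$$\Delta F_s = \int_0^1 \langle \Delta E\rangle_\lambda\, d\lambda = \int_0^1 \langle E_2 - E_1\rangle_\lambda\, d\lambda \qquad [56]$$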
Therefore, the statistical average of $\Delta E$ is evaluated during the integration for different ensembles defined by $\lambda$. Equation [56] is based on the linear hybrid Hamiltonian defined in Eq. [53], which is simple but might cause singularities in the integrand of Eq. [56] at the end points, in particular when the creation or annihilation of atoms is involved. To avoid this, a nonlinear hybrid energy function can be defined in various ways. Thus, one can replace $\lambda$ in Eq. [53] by $\lambda^k$, where $k > 1$ is sufficiently large, or apply the above $\lambda$ parameterization to each of the potential energy parameters rather than to the whole Hamiltonian as in Eq. [53].$^{15,18,51,57,175,177,188,189}$ Beutler et al. proposed a nonlinear hybrid energy function that is based on both $\lambda^k$ ($k > 1$) and Lennard-Jones and electrostatic interactions with soft core potentials.$^{190}$ It should also be pointed out that the integration of Eq. [23] is frequently carried out by a “slow growth” procedure, which is based on very small but constant changes of $\lambda$ during the simulation.
Before we discuss the efficiency of this methodology, it should be pointed out that the thermodynamic cycle described above is not unique and other adequate cycles can be defined for various problems. For example, to compare the free energy of solvation of molecules A and B, one performs the nonphysical conversion of A into B in water and vacuum, instead of calculating the physical transformation, $\Delta F$(vacuum $\to$ water), for both molecules. One can also obtain the absolute free energy of solvation of, say, A, by annihilating it in both environments (i.e., B becomes a ghost molecule). To compare the stability of a mutated protein to that of the native one, the original side chain should be mutated in the folded and the unfolded structures. The unfolded protein is commonly approximated by a chain of three (or more) residues, consisting of the mutated residue flanked by its neighbor residues in the protein.$^{84,194-197}$
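The three routes of Eqs. [54]-[56] can be compared on a toy model. The sketch below is not taken from any of the cited studies; it assumes a one-dimensional linear hybrid of two harmonic energies (force constants $k_1 = 1$ and $k_2 = 4$, $k_BT = 1$), for which the Boltzmann PD at every $\lambda$ is Gaussian and can be sampled directly, and for which the exact result is $\Delta F = \frac{1}{2}k_BT \ln(k_2/k_1)$. The names `sample` and `e2_minus_e1` are hypothetical.

```python
import numpy as np

kT, k1, k2 = 1.0, 1.0, 4.0            # assumed toy parameters
m, n = 10, 50_000                     # number of lambda segments, sample size
lambdas = np.linspace(0.0, 1.0, m + 1)
d_lam = 1.0 / m
rng = np.random.default_rng(5)

def sample(lam):
    """Boltzmann sample of the linear hybrid (1-lam)*E_1 + lam*E_2 (Eq. [53])."""
    k_lam = (1.0 - lam) * k1 + lam * k2
    return rng.normal(0.0, np.sqrt(kT / k_lam), size=n)

def e2_minus_e1(x):
    return 0.5 * (k2 - k1) * x**2

fep_terms, mean_de = [], []
for j, lam in enumerate(lambdas):
    de = e2_minus_e1(sample(lam))     # dE/dlambda for the linear hybrid
    mean_de.append(de.mean())
    if j < m:                         # segment j -> j+1: energy change d_lam*de
        fep_terms.append(-kT * np.log(np.mean(np.exp(-d_lam * de / kT))))

mean_de = np.array(mean_de)
df_fep = float(np.sum(fep_terms))                              # Eq. [54]
df_first_order = float(d_lam * np.sum(mean_de[:-1]))           # Eq. [55]
df_ti = float(d_lam * np.sum(0.5 * (mean_de[1:] + mean_de[:-1])))  # Eq. [56]

exact = 0.5 * kT * np.log(k2 / k1)
print(f"exact {exact:.4f} | Eq.[54] {df_fep:.4f} | "
      f"Eq.[55] {df_first_order:.4f} | Eq.[56] {df_ti:.4f}")
```

With only ten segments the first-order sum of Eq. [55] is visibly biased for this model, whereas the windowed perturbation formula and the trapezoidal integration agree with the exact value within the statistical noise; finer segments reduce the difference.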
Application of Thermodynamic Cycles

Why Thermodynamic Cycles Are Successful

The success of using the thermodynamic cycles as described above stems from the local character of the induced perturbations and is clearly demonstrated by Eq. [55]. There, $\Delta F$ is expressed as a sum of energy differences based only on the interactions between the $n$ created and annihilated atoms and the rest of the system. Hence, if the number of atoms $n$ is small, the energy fluctuation ($\propto \sqrt{n}$) due to these atoms is likewise small, and the averages $\langle \Delta E \rangle_\lambda$ (hence $\Delta F$) can be obtained with relatively low statistical noise. This is important because the difference in the free energy (and the energy) of binding of chemically similar ligands is small, typically on the order of only several kilocalories per mole. However, no such expressions (which depend only on the $n$ atoms mentioned above) exist for the differences in energy or entropy (see Refs. 183 and 198 and van Gunsteren in Ref. 53a). Thus, if one were to calculate $\Delta E = E(\mathrm{PL}_1) - E(\mathrm{PL}_2)$ from two separate simulations of PL1 and PL2, one would encounter relatively large fluctuations ($\sqrt{N} \approx 100$) for the whole system of $N$ atoms, because typically $N \approx 10^4$ for a large solvated protein. Two simulations would require generating two uncorrelated samples, each of ca. $2 \times 10^4$ configurations, which would be excessively time-consuming because of the strong energy correlations (i.e., large $t_c$) induced by both the MD and MC methods (see discussion above: “Properties of MC and the Difficulty of Obtaining S”). Also, as discussed below, in a long simulation the system might escape from its original wide microstate, and if this happened, the results would not converge. Finally, it should be emphasized again that the computational advantage of $\Delta F$ over $\Delta E$ exists as long as the chemical change is small (i.e., $n \ll N$). Thus, estimating energies in the binding problem is difficult
because of the relatively large (absolute) fluctuations. This problem is alleviated in homogeneous fluid systems, where the energy per particle (with fluctuation $\propto N^{-1/2}$) is commonly calculated, as discussed earlier.
Perturbation in the Protein Environment

The low statistical noise of $\Delta F$ has enabled the calculation of the relative free energy of binding of various ligands, thereby sparking a great deal of enthusiasm in the early stages of the method's development. Subsequently accumulated experience has indicated that the analysis should be carried out with caution because $\Delta F_p$ is also affected in a complicated way by the dynamics of the rest of the system (i.e., the protein and the solvent). Formally, the averages (denoted by $\langle\ \rangle$ in Eqs. [54]-[56]) are integrals over the configurational space of the whole system, including the entire conformational space of the protein; in practice, though, the simulation (for $\lambda = 0$) starts from the X-ray structure of the protein, which is optimally complexed with ligand L1, and the simulation is expected to span only the wide microstate of the native structure of the protein (denoted by $\Omega_0$), while a full relaxation of the solvent is obtained. (Typically the X-ray structure is an average of conformational fluctuations over a wide microstate.) The contribution of other wide microstates to the partition function is ignored, given the experimental fact that the protein is folded, and therefore $\Omega_0$ is the most stable microstate. Similarly, it is assumed that for each $\lambda > 0$ there is a corresponding dominant wide microstate $\Omega_\lambda$, which is close to $\Omega_0$ and is adequately spanned by the simulation.

Convergence of the Nonphysical Transformations

Convergence of the free energy difference for each perturbation generally requires long simulation times, and, if crossing of energy barriers is involved (e.g., torsional barriers of side chains), a temporary (presumed) stability of the results might be mistakenly interpreted as real convergence.
However, with the assumption that during integration the protein stays in the most stable wide microstate, a perfect force field should lead to the correct differences in the free energy for long enough simulation times. With the usual approximate force fields, however, further problems may arise. For example, if the most stable wide microstate of such an approximate force field does not correspond to the native structure, the system might escape from the starting microstate during a long simulation. Thus, in practice, the free energy difference may converge to the wrong answer or may not converge at all. Further caveats demonstrate that free energy perturbations in a protein environment should be carried out with caution. In particular, it is possible that short simulations might lead to a better agreement with experiment than longer ones. Also, it is not necessarily correct to assert that a reliable criterion of convergence for the calculated free energy difference exists when the forward perturbation (i.e., $\lambda = 0 \rightarrow \lambda = 1$) and the backward one ($\lambda = 1 \rightarrow \lambda = 0$) give the same result without hysteresis being detected. A lack of hysteresis suggests that the forward and backward paths are similar and were sampled comparably, but not that they must be the correct ones. This means that unless $\Delta\Delta F$ shows stability during extremely long simulations, and no hysteresis is detected for both $\Delta F_p$ and $\Delta F_s$, the reliability of the results should be viewed with some reservation. In practice, these conditions are never fulfilled. Both systematic and statistical errors, which are difficult to estimate, are always involved, as demonstrated by the following examples.

A thermodynamic cycle was defined in which oxidized Asn47 of azurin is mutated to oxidized Leu and both amino acid side chains are reduced; in this case $\Delta\Delta G$ can be calculated by two different transformations (say, A and B). A thermodynamic integration study by Mark and van Gunsteren led to $\Delta\Delta G(\mathrm{A}) = -24$ kJ/mol and $\Delta\Delta G(\mathrm{B}) = -15$ kJ/mol, whereas the experimental value is $-11$ kJ/mol. Another example is the conversion of 2'GMP to 3'GMP in solution and in RNase T1. Using thermodynamic integration, MacKerell et al.200a found in solution $\Delta G_s = 6.24 \pm 0.15$ and $-3.25 \pm 0.17$ kcal/mol for the forward and reverse transformation, respectively; in the bound state the corresponding results are $12.69 \pm 0.19$ and $-7.65 \pm 0.19$ kcal/mol. Again, these results demonstrate typical inconsistencies that can stem from an approximate force field, escape of the protein from the initial wide microstate, or incomplete sampling. Lee and Warshel pointed out, however, that in the case of charges, proper treatment of boundary conditions and long-range effects should reduce the errors.200b

The escape of the protein from its original wide microstate might be prevented by adding, for example, restraining potentials to the force field, keeping each dihedral angle in some region around its starting value. However, this alleged remedy is problematic for two reasons. First, the wide microstate of a large protein is unknown a priori, and therefore the definition of the restraints is not straightforward.
Second, any addition to the potential energy function changes the computed entropy, energy, and free energy unless this effect is adequately corrected. Such restraint potentials have been used by Hermans and collaborators for calculating the stability of relatively short helices of polypeptides upon replacement of one residue by another.201-203 In this case, however, the wide microstate is well defined.

Thermodynamic Integration Versus Perturbation

A great deal of work has been devoted to comparing and contrasting the perturbation and integration procedures and improving their convergence.61b In contrast to the perturbation procedure, thermodynamic integration has the advantage that it can be decomposed into separate integrations over specific types of interaction (van der Waals, electrostatic, etc.) or over the interactions of certain groups of atoms.84,204-205,206b Such decomposition not only enhances convergence, but also provides insight into the effect of various free energy contributions on the mechanism of the molecular transformations involved. The significance of such individual contributions has been questioned recently, however, because of their dependence on the integration path196,199,206b,207 (see also van Gunsteren et al. and Straatsma et al. in Ref. 53b). Other studies have
concluded that such decompositions can be physically meaningful if defined and treated carefully.208,209 Many attempts were made to determine the computed errors and to assess their correlation with hysteresis, the sample size, and various transformation paths.210-216 Likewise, the efficiency of slow growth and the effect of different numerical quadratures on the convergence of thermodynamic integration have been investigated, and efficient ways to handle bond lengths and constrained coordinates have been suggested. In general, thermodynamic integration is considered to be the more efficient procedure,19,53c but the efficiency is system dependent and the advantage is usually small. Nonetheless, both techniques are used extensively in the field of computational chemistry. It should be pointed out that most of these studies are based on testing small systems (e.g., alanine dipeptide210,218 and N-acetyl-threonyl-N-methylamide210) that are computationally controllable; therefore, some of the conclusions derived from those studies may not apply to proteins. Indeed, the strong correlation found for these systems between sample size and accuracy is not necessarily satisfied in proteins, as argued in the preceding section.
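As a worked illustration of hysteresis, the forward and reverse thermodynamic integration results quoted earlier for the 2'GMP to 3'GMP conversion can be combined directly. The numbers are those given in the text; the dictionary layout and function name are hypothetical and serve only this example.

```python
# Forward / reverse thermodynamic integration results quoted in the text (kcal/mol).
results = {
    "solution": {"forward": 6.24, "reverse": -3.25},
    "bound": {"forward": 12.69, "reverse": -7.65},
}

def hysteresis(forward, reverse):
    """Forward and reverse free energies of a closed path should cancel;
    the magnitude of their sum is the hysteresis."""
    return abs(forward + reverse)

hyst = {env: hysteresis(v["forward"], v["reverse"]) for env, v in results.items()}

# Relative binding free energy, ddF = dF(bound) - dF(solution), from each direction.
# The reverse-direction value is sign-flipped so that both refer to 2'GMP -> 3'GMP.
ddf_forward = results["bound"]["forward"] - results["solution"]["forward"]
ddf_reverse = -(results["bound"]["reverse"] - results["solution"]["reverse"])

print(hyst, ddf_forward, ddf_reverse)
```

The hysteresis in each environment (about 3 and 5 kcal/mol) is more than an order of magnitude larger than the quoted statistical uncertainties of 0.15 to 0.19 kcal/mol, and the two estimates of the relative binding free energy differ by about 2 kcal/mol.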
Thermodynamic Cycles in Small Systems

The strong correlation between sample size and accuracy discussed above makes both perturbation and integration techniques powerful tools for studying the free energy of solvation of small ionic, polar, and nonpolar systems in various solvents221-231 and for studying the relative free energy of binding of different guest molecules to a host. Free energy studies of somewhat larger systems (e.g., peptides of up to 15 amino acid residues) also fit into this category.201-203,237 Comprehensive assessments of free energy studies of small systems are available in the recent reviews of Kollman60 and Jorgensen and Tirado-Rives. It is of interest to note the extent of the errors involved: the average error between calculated and experimental absolute free energies of solvation for 16 small organic molecules in water is 1.0 kcal/mol.58 This error, which is considered to be small, was obtained for two force fields differing mainly in their partial atomic charges and in the cutoff distance used for the water-water interactions. However, for some molecules the deviations between theoretical and experimental values are large: $-13.4 \pm 0.4$ versus $-9.7$ kcal/mol for acetamide and $3.3 \pm 0.5$ versus $1.8$ kcal/mol for ethane, respectively. These deviations can be blamed mainly on the approximate force fields rather than on inadequacies in the free energy calculation. Therefore, free energy calculation constitutes an important tool for checking and thus improving the quality of force fields.
New Perturbation-Related Procedures

The perturbation and integration procedures discussed above have the advantage that in principle they are exact, and they can be applied to systems of high complexity. However, only minor chemical and structural changes can be
treated reliably, and even in these cases the calculations are time-consuming because of the large number of integration steps required. As discussed earlier, convergence is problematic in a protein environment. Therefore, several techniques have been proposed to remedy at least part of these problems.

Peptide Growth Simulation

The peptide growth simulation (PGS) was suggested by Tropsha et al.238 The simulation starts from a molecule composed of only the end groups of the peptide. The peptide is then constructed step by step, where at each step an amino acid residue is slowly grown by a thermodynamic integration procedure and added to the partially constructed chain. The free energy required to grow each residue is calculated, and the absolute free energy of the whole chain is obtained by summing the individual free energies; thus, PGS is related to other growth procedures developed for synthetic polymers25-35,239 (see the later section “Entropy from Linear Buildup Procedures”). Clearly, the main advantage of PGS is that it enables one to calculate the free energy difference between significantly different conformations, each constructed by PGS. The method was used for studying the helix-coil transition of polyalanine molecules of different size both in vacuum and in water. It should be pointed out, however, that to hold the molecule in the helical or extended (coil) states, restraint potentials were added to the Hamiltonian, clearly affecting the results.
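Expansion in $\lambda$

Another approach, which is aimed at reducing the number of integration steps, was proposed by van Gunsteren's group. The original idea was to expand the free energy difference $F(\lambda) - F(0)$ in a Taylor series in $\lambda$ around $\lambda = 0$:

$$F(\lambda) - F(0) = \Delta F(\lambda) = F'\big|_{\lambda=0}\,\lambda + \frac{1}{2!}\,F''\big|_{\lambda=0}\,\lambda^{2} + \frac{1}{3!}\,F'''\big|_{\lambda=0}\,\lambda^{3} + \cdots \qquad [57]$$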
The derivatives $F'$, $F''$, $F'''$, etc. at $\lambda = 0$ can be expressed in terms of statistical averages and fluctuations of the energy difference $E(\lambda) - E(0)$.242,243 The latter are estimated from a single long MD (or MC) simulation at $\lambda = 0$. (In a similar strategy proposed earlier by Levy et al.,244 expansion only up to second order was used with respect to the electrostatic energy; this method is related to the linear response approximation described below.) Moreover, results for $\Delta F(\lambda)$ for several mutations can be obtained from this single MD sample, in contrast to the usual procedure that calls for several different integrations (one for each mutation). It was realized later that within the range of convergence, the expansion of Eq. [57] is equivalent to the perturbation formula, $\Delta F(\lambda) = -k_B T \ln \langle \exp[-(E(\lambda) - E(0))/k_B T] \rangle_{\lambda=0}$. The problem remains that one is generally interested in results for $\lambda = 1$, for which the typical conformations are significantly different from those of $\lambda = 0$, meaning that in practice the free energy difference will not converge. Indeed, Eq. [57] failed when applied to a macroscopic change, such as inverting the signs of the atomic charges of 216 water molecules.242 On the other hand, applying this charge inversion locally to a single dipole immersed in a box of water molecules, using both Eq. [57] and the foregoing perturbation formula,242,245 yielded good results, but only for $\lambda \le 0.5$. This means that the method can be effective only if the reference and the target states do not differ significantly, leading to the idea of designing nonphysical reference states that are close to both the physical reference state and the target state. One then reliably calculates $\Delta F$ values between the physical state and the nonphysical reference states and adds them to the corresponding $\Delta F$ values between the nonphysical and target states. For example, to calculate the free energy change associated with the mutations of para-substituted phenol molecules in water, two different nonphysical reference states were defined that were based on suitable soft-core Lennard-Jones potentials.245 A similar strategy was employed for calculating differences in the binding affinities of small aromatic ligands to a hydrophobic cavity in T4 lysozyme. In general, the perturbation formula was found to be better suited than Eq. [57], and the use of nonphysical reference states improved the results, so that for most compounds they were satisfactory. In some cases, though, the deviations of the calculated results from the experimental values or from the thermodynamic integration values were relatively large, suggesting that the procedure should be used with caution. In practice, two or more reference states should be investigated.
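The relation between the Taylor expansion and the perturbation formula can be checked numerically on a toy case. The sketch below is an illustration under stated assumptions, not the procedure of the cited work: it assumes linear coupling, so that $E(\lambda) - E(0) = \lambda\,\Delta E$, and a hypothetical Gaussian distribution of $\Delta E$ in the $\lambda = 0$ ensemble, for which the series terminates at second order. With linear coupling, $F'(0) = \langle \Delta E \rangle_0$ and $F''(0) = -(\langle \Delta E^2 \rangle_0 - \langle \Delta E \rangle_0^2)/k_B T$.

```python
import math
import random

def free_energy_estimates(delta_e, lam, kT=1.0):
    """Return (perturbation, second-order Taylor) estimates of F(lam) - F(0)
    from energy differences sampled at lam = 0, assuming linear coupling
    E(lam) - E(0) = lam * delta_e."""
    n = len(delta_e)
    mean = sum(delta_e) / n
    var = sum((d - mean) ** 2 for d in delta_e) / (n - 1)
    # Perturbation formula: -kT ln < exp[-(E(lam) - E(0)) / kT] >_0
    boltz = sum(math.exp(-lam * d / kT) for d in delta_e) / n
    fep = -kT * math.log(boltz)
    # Taylor series truncated after the second-order (fluctuation) term
    taylor2 = mean * lam - 0.5 * var * lam ** 2 / kT
    return fep, taylor2

rng = random.Random(1)
mu, sigma = 2.0, 1.0    # hypothetical mean and width of the energy-difference distribution
samples = [rng.gauss(mu, sigma) for _ in range(50000)]
fep, taylor2 = free_energy_estimates(samples, lam=1.0)
exact = mu - 0.5 * sigma ** 2    # closed form for a Gaussian with kT = 1 and lam = 1
print(fep, taylor2, exact)
```

Both estimators agree here because the two ensembles overlap strongly; as the text notes, the agreement degrades when the typical configurations at the target $\lambda$ differ significantly from those sampled at $\lambda = 0$.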
Free Energy Calculation Based on the Linear Response Approximation

An approximate procedure, related to the Taylor expansion in $\lambda$ described above and based on the linear response approximation (LRA) for electrostatic interactions, has been developed in the last 10 years by several researchers. Hwang and Warshel and Kuharski et al. demonstrated the utility of LRA in polar solvents by verifying the microscopic validity of the Marcus relationship. Levy et al. demonstrated the formal relation between LRA and free energy perturbation.244 Lee et al.206a used this method for calculating the free energy of binding of two ligands to an antibody, and Sham et al. used it to calculate p$K_a$'s in proteins.251 Systematic studies of this procedure have been carried out by Åqvist and collaborators and by Jorgensen's group.

With the LRA-based method, the absolute free energy of binding due to the electrostatic interactions of a ligand with a protein is obtained from two simulations, one of the free ligand in solvent and the other of the ligand-protein complex. To effect this, one calculates the ligand-surrounding (l-s) average electrostatic energy for the bound ligand (denoted p), $\langle E^{\mathrm{el}}_{\mathrm{p}}(\text{l-s}) \rangle$, and for the unbound ligand in solution (denoted s), $\langle E^{\mathrm{el}}_{\mathrm{s}}(\text{l-s}) \rangle$. The free energy of binding due to the electrostatic interaction, $\Delta F^{\mathrm{el}}_{\mathrm{bind}}$, is
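$$\Delta F^{\mathrm{el}}_{\mathrm{bind}} = \tfrac{1}{2}\,\Delta\langle E^{\mathrm{el}}(\text{l-s}) \rangle = \tfrac{1}{2}\left[\langle E^{\mathrm{el}}_{\mathrm{p}}(\text{l-s}) \rangle - \langle E^{\mathrm{el}}_{\mathrm{s}}(\text{l-s}) \rangle\right] \qquad [58]$$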
The advantages of this procedure are that only two simulations per ligand are required, and the physical transformation is calculated, meaning that the absolute free energy of binding (with respect to the solvent) is obtained. Further, if the ligand is not too large, the fluctuations of the l-s energies are relatively small and a reliable estimation of a small difference $\Delta F_{\mathrm{bind}}$ is feasible, as already discussed. Still, this procedure is approximate, in contrast to those based on perturbation or integration concepts, which, in principle, are exact (the approximations involved in the derivation of Eq. [58] are described in Refs. 252-256). To validate the LRA, Åqvist and Hansson calculated by free energy perturbation the free energy associated with charging various solutes in water and in other solvents.254 They found that the LRA prediction (i.e., $\Delta F^{\mathrm{el}}/\langle E^{\mathrm{el}}(\text{l-s}) \rangle = 1/2$) is satisfied for monovalent ionic solutes but is less accurate for dipolar ones. This conclusion agrees with earlier calculations of King and Barford,258 who proposed yet another method based on LRA for calculating differences in the electrostatic free energy. In their method, the difference in the electrostatic free energy for a nonphysical mutation is obtained from the energy difference at the midpoint $\lambda = 1/2$: that is, $\langle E^{\mathrm{el}}(\lambda = 1) - E^{\mathrm{el}}(\lambda = 0) \rangle_{\lambda = 1/2}$. To account also for the van der Waals (vdw) interactions, Åqvist et al.252 calculated the difference of the (l-s) vdw average energies in the two environments and added that value to Eq. [58]; thus, $\Delta F_{\mathrm{bind}}$ becomes

$$\Delta F_{\mathrm{bind}} = \tfrac{1}{2}\,\Delta\langle E^{\mathrm{el}}(\text{l-s}) \rangle + \alpha\,\Delta\langle E^{\mathrm{vdw}}(\text{l-s}) \rangle \qquad [59]$$
The factor $\alpha$ multiplying the van der Waals energy difference was found to be 0.162 by best-fitting experimental binding data.252 Equation [59] was used to calculate the absolute binding free energy of several sugars to the periplasmic glucose/galactose receptor from Salmonella typhimurium; the calculated results were within 10-15% of the experimental values.255 In another study the absolute free energy of binding of two charged benzamidine inhibitors to trypsin was calculated. The agreement with experiment was good and was superior to that obtained by free energy perturbation.256 Carlson and Jorgensen applied the foregoing LRA ideas to the calculation of the hydration free energies of small molecules. However, they used the following modified equation,

$$\Delta F = \beta\,\Delta\langle E^{\mathrm{el}}(\text{l-s}) \rangle + \alpha\,\Delta\langle E^{\mathrm{vdw}}(\text{l-s}) \rangle + \gamma A_i \qquad [60]$$

where $\beta$ (replacing 1/2 in Eqs. [58] and [59]) is now a free parameter. The parameter $\gamma$ multiplies $A_i$, which can be the solute surface area ($i = 1$), the solvent-accessible surface area (SASA) ($i = 2$), or the solute volume ($i = 3$) (for
a transformation from the gas phase to water, the averages $\langle E^{\mathrm{el}}(\text{l-s}) \rangle$ and $\langle E^{\mathrm{vdw}}(\text{l-s}) \rangle$ calculated in water replace the corresponding differences in Eq. [60]). The three parameters of Eq. [60] were best-fitted to experimental and perturbation data for the free energies of hydration of 16 small organic molecules. The calculations were carried out for the Gibbs free energy of hydration rather than $F$. The best results were obtained with SASA, whose average error of 0.6 kcal/mol versus experiment is somewhat better than the 1 kcal/mol error obtained with perturbation (see Table 3 of Ref. 58), and both these results are significantly better than those obtained by using Eq. [59] with a best-fitted $\alpha$. Carlson and Jorgensen found that if only 13 of these molecules are considered, the optimized parameters with SASA are $0.35 \le \alpha \le 0.49$, $\beta = 0.43$, and $\gamma = 0.02$ (i.e., $\beta$ is still close to 1/2), whereas the optimized parameters for the whole group of 16 molecules (i.e., including the omitted aniline, nitrobenzene, and chlorobenzene) are significantly different: $\alpha = 0.11$, $\beta = 0.29$, and $\gamma = 0.01$. This raises a question about the generality of these parameters for a large set of molecules.

In summary, although using Eqs. [58] or [60] for calculation of free energy differences is appealing, the general validity of these equations should be studied further. A systematic effort in this direction was recently made by Hummer and Szabo. At this point, it is safe to state that the equations above enable one to obtain approximate differences of free energy relatively quickly, but those values should be verified by other methods.
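The three LRA-type estimates discussed in this section differ only in their coefficients, which the following minimal sketch makes explicit. The function name, argument names, and the average energies used in the example are hypothetical; only the coefficient values 1/2 and 0.162 are taken from the text.

```python
def lra_free_energy(el_bound, el_free, vdw_bound=0.0, vdw_free=0.0,
                    beta=0.5, alpha=0.0, gamma=0.0, area=0.0):
    """Linear-response-type estimate built from average ligand-surrounding energies.

    beta=0.5, alpha=0, gamma=0   : purely electrostatic LRA estimate
    beta=0.5, alpha=0.162        : with the fitted van der Waals term
    free beta, alpha, gamma      : extended form with a surface-area or volume term
    """
    d_el = el_bound - el_free        # difference of average electrostatic energies
    d_vdw = vdw_bound - vdw_free     # difference of average van der Waals energies
    return beta * d_el + alpha * d_vdw + gamma * area

# Hypothetical average energies in kcal/mol, chosen only to exercise the formula.
dF_el = lra_free_energy(el_bound=-40.0, el_free=-30.0)
dF_el_vdw = lra_free_energy(el_bound=-40.0, el_free=-30.0,
                            vdw_bound=-25.0, vdw_free=-15.0, alpha=0.162)
print(dF_el, dF_el_vdw)
```

Because the estimate is linear in the averaged energies, only two simulations per ligand are needed; the price is the dependence on fitted coefficients whose transferability, as noted above, is open to question.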
ENTROPY FROM LINEAR BUILDUP PROCEDURES

Thus far, we have mainly discussed methods for estimating the free energy and entropy within the framework of the MC and MD dynamical simulation techniques. We pointed out the inefficiency of methods based on thermodynamic integration, which is due to the need to perform a large number of simulations of intermediate states. Thermodynamic integration, though, was developed because of the difficulty in calculating the Boltzmann probability density (PD) $P^B$ (Eq. [5]) (hence the absolute entropy) of a given state from a single MD or MC sample. Indeed, only three direct methods for $F$ or $S$ have been described in this chapter thus far: the one based on calculating $\exp[+E/k_B T]$ is impractical, and the other two are based on a Gaussian distribution assumption.6-8

A different class of simulation methods, which are not of a dynamical type, was developed for the study of synthetic polymers. In this approach, a polymer is grown step by step with the help of transition probabilities (in most cases approximate), whose product is the construction probability $P_i$ of the whole chain $i$. Because the value of $P_i$ is known, the partition function, hence the absolute free energy, can be estimated by importance sampling (Eq. [30]) from a single sample without the need to resort to thermodynamic integration.
The basic ideas of this approach are far-reaching. They can be extended and used for extracting the entropy from samples generated by MC, MD, or any other simulation technique, as demonstrated by the local states and the hypothetical scanning38 methods of Meirovitch. The idea of a linear buildup was transferred by Alexandrowicz to magnetic and fluid models39-41 and has enabled calculation of the entropy in new ways in such systems as well.36,42,43

Many of the methods used in the area of polymer simulation have been developed using lattice models, which have also become popular in studies of protein folding. Accordingly, we describe the basic methods in this section as applied to simplified lattice models and discuss later how they can be extended to continuum systems. Notice that the equations introduced in the sections “Statistical Mechanics of Fluids and Chain Systems” and “Basic Sampling Theory and Simulation” still hold, where the integrations are replaced by summations over a discrete space, and the probability densities thus become probabilities.
Step-by-Step Construction Methods for Polymers

Simple Sampling

The simplest polymer lattice model is that of an ideal chain, that is, a chain of $N$ connected bonds ($N + 1$ monomers) whose terminus is placed on the origin of a square lattice. Immediate chain reversals are not permitted, but no interaction energy is defined, which means that the chain can intersect and retrace itself (Figure 4). Although this model is unrealistic, its partition function can be easily calculated. It is a summation over all the different chain conformations; thus, $Z_{\mathrm{id}} = 4 \times 3^{N-1}$. Each chain conformation is equally probable, and therefore the Boltzmann probability of ideal chain $j$ is

$$P^{B} = \frac{1}{Z_{\mathrm{id}}} = \frac{1}{4 \times 3^{N-1}} \qquad [61]$$
where for simplicity the subscript $j$ is omitted in Eq. [61] and what follows. A sample of $n$ ideal chains can be obtained by growing each chain from the origin step by step as a random walk, where, at each step after the first, one direction out of three is determined with a probability 1/3 by a random number. Obviously, in this case each chain is generated according to its Boltzmann probability $P^B$. We shall use the terms “ideal chains” and “random walks” (RWs) interchangeably.

A more realistic model is that of chains with excluded volume; that is, self-intersections of the chain are not allowed because two monomers occupying the same lattice site have infinite interaction energy. These chains, which constitute a subgroup of the ideal chains, are also called self-avoiding walks (SAWs) (see Figure 4). Because of the lack of finite interaction energy in this model, the SAWs are equally probable, and the partition function $Z_{\mathrm{SAW}}$ is the total number of different SAWs, which, in contrast to that of ideal chains, is not known exactly
Figure 4 Chains of $N = 11$ steps (bonds), that is, 12 monomers (solid circles) on a square lattice (open circles). Immediate chain reversals are not allowed; therefore the maximum number of directions $\nu$ available is 4 for the first step and 3 for the later steps. (a) An ideal chain (random walk) starting from the origin (1); the chain intersects itself and the last step (dashed line) retraces the third one. (b) A self-avoiding walk (SAW), which is not allowed to self-intersect. The SAWs are a subgroup of the ideal chains. (c) A SAW with a finite interaction $\varepsilon$ between nonbonded nearest-neighbor monomers. The total energy of the chain is $6\varepsilon$.
for long chains. Therefore, the Boltzmann probability $P_i^B$ is unknown as well (we omit the abbreviation “SAWs”; they are denoted by $i$ and the random walks by $j$),

$$P_i^{B} = \frac{1}{Z_{\mathrm{SAW}}} \qquad [62]$$
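Although $Z_{\mathrm{SAW}}$ is not known exactly, it can be estimated by importance sampling (Eq. [30]) from a sample of random walks generated step by step, as explained earlier in this section. To accomplish that calculation, it is convenient first to express $Z_{\mathrm{SAW}}$ in terms of the ensemble of random walks,

$$Z_{\mathrm{SAW}} = \sum_{\mathrm{RWs}} \exp[-E_j/k_B T] = \sum_{\mathrm{RWs}} P^{B} \left\{ \frac{\exp[-E_j/k_B T]}{P^{B}} \right\} \qquad [63]$$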
where $E_j = 0$ for a SAW and $\infty$ for a self-intersecting walk. Consequently, the latter walks do not contribute to the summation. In practice, the elimination of these walks is carried out at each step of the chain construction: if a new bond added to the walk violates the excluded volume condition, the partial walk is discarded and a new one is started. Therefore, only $n_{\mathrm{succ}} < n_{\mathrm{start}}$ SAWs will be generated out of the $n_{\mathrm{start}}$ random walks started. $Z_{\mathrm{SAW}}$ is estimated by evaluating the average value of the random variable defined in the braces in Eq. [63]. The estimate $\bar{Z}_{\mathrm{SAW}}$ is (see Eqs. [28] and [29])

$$\bar{Z}_{\mathrm{SAW}} = \frac{1}{n_{\mathrm{start}}} \sum_{t=1}^{n_{\mathrm{succ}}} \frac{\exp[-E_{i(t)}/k_B T]}{P^{B}} = \frac{n_{\mathrm{succ}}}{n_{\mathrm{start}}} \times 4 \times 3^{N-1} \qquad [64]$$
where the summation is carried out over the SAWs generated in the process ($i(t)$ means that SAW $i$, generated in the $t$th experiment, contributes 1 to the partition function). Since $P^B$ is known (Eq. [61]), $\bar{Z}_{\mathrm{SAW}}$ and the entropy ($\propto \ln \bar{Z}_{\mathrm{SAW}}$) are known as well. It should be pointed out that according to Eq. [30], the probability should have been normalized over the ensemble of SAWs, the chains of interest. Instead, it is normalized here over the larger set of random walks, where the self-intersecting ones are eliminated by the interaction $E_j$. Notice that contrary to the MC or MD procedures, the chains are statistically independent because each one of them is constructed ab initio. This “simple sampling” or “direct Monte Carlo” procedure was introduced by Wall, Hiller, and Wheeler25 in 1954.

“Simple sampling” is an exact procedure. This means that the chain configurations are constructed with equal probability; however, it is very inefficient for generating long SAWs because the probability $p_{\mathrm{attr}}$ of generating the latter decreases exponentially with increasing chain length $N$,

$$p_{\mathrm{attr}} = \frac{n_{\mathrm{succ}}}{n_{\mathrm{start}}} = \exp[-\lambda N] \qquad [65]$$
where $\lambda$ is called the attrition constant. For SAWs on the square lattice, $\lambda = 0.128$ is relatively large, and the chain length that can be handled does not exceed 90. This “chain attrition problem” stems from the fact that the selection of a direction is done “blindly,” that is, without first checking the occupancy of neighbor sites. Consequently, the walk has a high probability of getting trapped in a cul-de-sac. On the other hand, the blind selection of steps guarantees that the SAWs are generated with equal probability, as required. Several methods have been suggested to overcome the attrition problem. One is the scanning method, which is an extension of the Rosenbluth and Rosenbluth procedure.26 The scanning method forms the basis of the local states method and the hypothetical scanning method, enabling one to estimate the entropy from MC and MD simulations. The fundamental concepts of these techniques are described next.
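The simple sampling procedure described above can be sketched in a few lines. The code below is an illustrative implementation with hypothetical function names: it grows non-reversal random walks on the square lattice, discards a walk as soon as it intersects itself, and converts the success ratio into an estimate of the number of SAWs by multiplying by the known number of ideal chains, $4 \times 3^{N-1}$. For $N = 10$ the result can be compared with the exactly enumerated count of 44,100 SAWs, a value that comes from exact enumeration tables and not from the chapter.

```python
import math
import random

STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def grow_nonreversal_walk(n_bonds, rng):
    """Grow one walk blindly; return True if it stays self-avoiding, False if discarded."""
    pos = (0, 0)
    occupied = {pos}
    prev = None
    for _ in range(n_bonds):
        # 4 directions for the first step, 3 afterwards (no immediate reversal)
        choices = [s for s in STEPS if prev is None or s != (-prev[0], -prev[1])]
        step = rng.choice(choices)
        pos = (pos[0] + step[0], pos[1] + step[1])
        if pos in occupied:          # excluded volume violated: discard the partial walk
            return False
        occupied.add(pos)
        prev = step
    return True

def simple_sampling(n_bonds, n_start, seed=0):
    rng = random.Random(seed)
    n_succ = sum(grow_nonreversal_walk(n_bonds, rng) for _ in range(n_start))
    p_attr = n_succ / n_start                       # attrition ratio
    z_ideal = 4 * 3 ** (n_bonds - 1)                # number of ideal chains
    z_saw = p_attr * z_ideal                        # estimated number of SAWs
    attrition_constant = -math.log(p_attr) / n_bonds
    return z_saw, p_attr, attrition_constant

z_saw, p_attr, attrition_constant = simple_sampling(n_bonds=10, n_start=20000)
entropy_over_kB = math.log(z_saw)
print(z_saw, p_attr, attrition_constant, entropy_over_kB)
```

For such a short chain the effective attrition constant is smaller than the asymptotic value quoted in the text, because the exponential law holds only for long chains.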
The Scanning Method

The scanning method was developed by Meirovitch. As with “simple sampling,” a SAW is grown step by step, but to decrease future trapping of the chain in a dead end, a direction is selected after a suitable search of the surroundings, rather than blindly. More specifically, at step $k$ of the growth, $k - 1$ directions $\nu$ ($\nu = 1, \ldots, 4$ for the square lattice) will have already been constructed (they are denoted $\nu_1, \ldots, \nu_{k-1}$), and the value of $\nu_k$ should be determined out of the four possible directions (steps) $\nu$, where chain reversal is not allowed. To determine the direction $\nu_k$, one enumerates, for each $\nu$, all possible continuations of the chain in $f$ future steps that start from $\nu$ at step $k$; these continuations are called future SAWs, and their number is denoted $n_\nu^k(f)$. The scanning parameter $f$ (which stands for future) was denoted $b$ in earlier work. The values of $n_\nu^k(f)$ enable one to define transition probabilities for $\nu$,

$$p_k(\nu \mid \nu_{k-1}, \ldots, \nu_1; f) = \frac{n_\nu^k(f)}{\sum_{\nu'=1}^{4} n_{\nu'}^k(f)} \qquad [66]$$
Thus, $\nu$ is selected by a random number according to this probability and defined as $\nu_k$ (see Figure 5). For a long chain only part of the future can be scanned, owing to the exponential increase of the number of future SAWs with $f$. Therefore the chain may still get trapped in a dead end; in this case the chain is discarded, and the construction of a new one is started. The attrition ratio $p_{\mathrm{attr}}$, Eq. [65], still decreases exponentially with increasing $N$, but $\lambda$ is much smaller than that of simple sampling. Obviously, the larger $f$ is, the smaller $\lambda$ is. For example, for SAWs of $N = 700$ using $f = 6$, $\lambda = 0.002$, compared with $\lambda = 0.128$ for simple sampling ($\lambda$ increases slightly with increasing $N$).260 The construction probability $P_i^0(f)$ of SAW $i$ is

$$P_i^{0}(f) = \prod_{k=1}^{N} p_k(\nu_k \mid \nu_{k-1}, \ldots, \nu_1; f) \qquad [67]$$
where $P_i^0(f)$ is normalized over a subgroup of the RWs that also includes self-intersecting walks. The construction probability $P_i(f)$ that is normalized over the SAWs is

$$P_i(f) = \frac{P_i^{0}(f)}{\sum_{\mathrm{SAWs}} P_i^{0}(f)} \qquad [68]$$
It should be pointed out that while the Boltzmann probability $P_i^B$ (Eq. [62]) is the same for all the SAWs, $P_i(f)$ (Eq. [68]) is larger for the compact SAWs than for the open ones. This bias is the price paid for the ability to generate longer chains than with simple sampling. However, the bias can be decreased systematically by increasing the scanning parameter $f$ and by adding certain mean field parameters $m$.32,33,260 The optimal set of $m$ is obtained by maximizing with respect to $m$ the approximate entropy functional $S(f, m)$, where $S(f, m) = -k_B \sum_i P_i(f, m) \ln P_i(f, m)$ [note that $-TS(f, m)$ is the free energy for this model], and minimizing its fluctuation, for a given $f$. These optimizations are based on the minimum free energy principle and the minimum free energy fluctuation principle discussed after Eq. [10] and prior to Eq. [14]. Such optimizations are
46
Calculation of Free Energy and Entropy by Computer Simulation
1'
zt. II
I
I
v=3
2
Figure 5 The scanning construction at step k = 12 based on a scanning parameter f = 2. The partially built SAW of 11 steps is denoted by solid lines; the future SAWs are denoted by broken arrows. There are all together four future SAWs of two steps, three starting from direction u = 1 and one from u = 3; u = 2 is forbidden. The transition probabilities at step k = 12 are thus p ( u = 1) = 3/4 and p ( u = 3) = 1/4. Assuming that energy e is defined between nonbonded nearest-neighbor monomers (see Figure 44, the future SAW starting at u = 3 has energy of SE, whereas the future SAWs that start from u = 1 have energies E, 0, and E. In this case, the transition probabilities are p ( u = 1) = (1 + Zexp[-dk,T])/(l + 2 exp[-e/k,T] + exp[-5dkBT]), and p ( u = 3 ) = (exp[- 5 d k , T ] ) / ( 1 + 2 exp[ -dRBT] + exp[-Sdk,T]).
also carried out in the stochastic models method of Alexandrowicz, discussed later in the section following Eq. [79]. For a complete scanning (i.e., fma, = N - k + l ) ,the transition probabilities (Eq. [66]) became exact.32 Hence P i ( f ) becomes exact-that is, P j ( f )is , the attrition disappears (pat,, = 1). Obviously, exequal to PH (Eq. [ 6 2 ] ) and act scanning is not feasible for a long chain, because the number of future chains grows exponentially with increasing f.
Again, the bias can be removed rigorously by applying importance sampling, that is, by replacing $P_i^B$ in Eq. [63] by $P_i^0(f)$ (Eq. [67]), as was first done by the Rosenbluths.26 Thus, an estimation for the entropy $S$ is
where $i(t)$ is SAW $i$ obtained at time $t$ of the process. Notice that in Eq. [69] the contribution of the compact SAWs is small owing to their small $1/P_i^0$ factors, whereas the open SAWs are associated with large factors and therefore dominate the summation. Obviously, if the sample is highly biased and does not include the open chains (which are the most probable in a Boltzmann sample), this compensation mechanism would fail, and the entropy would remain biased. More formally, the convergence of the estimate $\bar S$ to $S$ depends on the fluctuation of $\bar S$, which is determined by the importance sampling probability $P_i^0(f)$. Thus, the larger $f$ is, the closer $P_i(f)$ is to $P_i^B$ (Eq. [62]), the smaller the fluctuation, and the smaller the sample size $n$ required to obtain a certain precision. For $P_i^B$ the fluctuation is zero, and a perfect sampling is achieved already with $n = 1$ (see discussions following Eqs. [12] and [31]).

Because both the bias and the attrition increase with chain length, the method is practical for SAWs of up to $N = 700$ and 1500 on a square and a simple cubic lattice, respectively. Obviously, as with simple sampling, the generated chains are statistically independent. Several techniques such as the "Schmidt procedure" have been included within the framework of the method to assess the extent of convergence of the results.32

We emphasize that the Rosenbluth method,26 proposed in 1955, is the scanning method based on $f = 1$. However, using $f > 1$ increases the efficiency significantly over the original concept.32 Finally, the method of strides, which also consists of future scanning, was suggested by Wall et al. in 1957,28 but this method differs from the scanning method and is not discussed here.

The scanning method can easily be extended to a chain model with finite interactions (e.g., attractions, see Figure 4). In this case, one also calculates the interaction energy $E_j$ of each future chain $j$ with itself and with the partially constructed chain.
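The transition probability becomes

$$ p(\nu_k \mid \nu_{k-1}, \ldots, \nu_1, f) = \frac{\sum_{j(\nu)} \exp[-E_j/k_BT]}{\sum_{\nu'=1}^{4} \sum_{j(\nu')} \exp[-E_j/k_BT]} \qquad [70] $$

where $j(\nu)$ are the future SAWs of length $f$ that start at direction $\nu$ at step $k$ (see Figures 4 and 5). In this implementation, the free energy $F$ rather than the entropy is estimated by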
where $E_i$ is the energy of SAW $i$ (see also Ref. 30, where the Rosenbluth method is extended to self-attracting SAWs). This equation is equivalent to Eq. [69]. For a continuum chain model with constant bond lengths and angles, for example, the range $[-\pi, \pi]$ of each dihedral angle is discretized. Chain construction is performed in the same way as on the square lattice, where the number of directions at each step increases above 4 according to the extent of discretization (see the application of the method to decaglycine described by the ECEPP81,82 potential).261,262

The scanning method has been applied successfully to a wide range of models, including self-attracting SAWs, trails, and random walks in the bulk and in the presence of an adsorbing surface. For these models, the partition function of a very long chain is expressed as $Z = A \mu^N N^{\gamma - 1}$, where $\mu$ is the growth parameter, $\gamma$ is a critical exponent, and $A$ is a prefactor. These parameters are extrapolated from results obtained for finite chains, and the expected systematic and statistical errors for $\mu$ lie between 0.2 and 0.008%. The errors of $\ln Z(N)$ for specific chain lengths are an order of magnitude smaller (see references cited in Refs. 33 and 263). The scanning method was also applied to a model of multiple SAWs enclosed in a "box" on a square lattice.264 The chains are added gradually to an initially empty box, where each chain is grown by a scanning procedure as described above. One calculates the system's probability, which is the product of the construction probabilities of the individual chains.
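The Boltzmann-weighted transition probabilities of Eq. [70] can be checked against the example given in the caption of Figure 5. The short sketch below is only illustrative; the function name and the dictionary layout (direction mapped to the list of future-SAW energies) are assumptions made for this example, and the energies are taken in units where the attraction is $\epsilon = -1$.

```python
import math


def boltzmann_transition_probabilities(future_energies, kT):
    """Transition probabilities from the energies of the future SAWs.

    `future_energies` maps each allowed direction v to the list of
    energies E_j of the future SAWs j(v) that start in that direction.
    """
    weights = {
        v: sum(math.exp(-e / kT) for e in energies)
        for v, energies in future_energies.items()
    }
    norm = sum(weights.values())
    return {v: w / norm for v, w in weights.items()}


# Figure 5: three future SAWs from direction 1 (energies eps, 0, eps)
# and one from direction 3 (energy 5*eps); eps = -1 is an assumed value.
eps = -1.0
figure5 = {1: [eps, 0.0, eps], 3: [5 * eps]}
p_cold = boltzmann_transition_probabilities(figure5, kT=1.0)
p_hot = boltzmann_transition_probabilities(figure5, kT=1e9)
```

At very high temperature the weights become equal and the probabilities reduce to the counting result of the purely excluded-volume case, 3/4 and 1/4; at low temperature the compact continuation through direction 3 dominates.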
The Enrichment Method

An important step-by-step procedure that also provides the entropy is the enrichment method proposed in 1959 by Wall and Erpenbeck, who developed the method in order to overcome the strong sample attrition discussed above. The idea is simple and elegant. If one attempts to generate $n_0$ SAWs of $N$ steps by simple sampling, one will end up with only $n_N = n_0 \exp[-\lambda N]$ SAWs. The number of chains that will survive after $N/2$ steps is $n_{N/2} = n_0 \exp[-\lambda N/2]$, indicating that the unsuccessful attempts to construct $n_0 - n_{N/2}$ chains of length smaller than $N/2$ wasted computer time. To increase the efficiency of the process one can, for example, begin by generating only $n_0/2$ chains that are grown only up to half their required length (i.e., $N/2$). The number of SAWs of this length that are successfully generated is $n_{N/2}/2$. Then, if one grows the second half of those chains twice, one obtains the same number of SAWs of length $N$ as in the original procedure, but the number of wasted chains is half of that in the first procedure. This increase in efficiency is somewhat offset, however, by the fact that the SAWs are no longer statistically independent because some of them share the same first half. Obviously, one can increase the efficiency further by growing $p > 2$ half-chains.

Using the enrichment method, one determines a priori a value of $p$ and $k - 1$ branching points along the chain at monomers $s = (N/k) + 1$, $2s = (2N/k) + 1$, ..., and the total number of chains started is $n_{start} = n_0 p^{(k-1)}$ rather than $n_0$. This $n_{start}$ is used to obtain the partition function (hence the entropy) with Eq. [64]. The total number of SAWs generated with an enrichment procedure is $n_0 p^{(k-1)} \exp[-\lambda s k]$. Obviously, if $p > 1$ is used, the attrition will decrease, but if $p$ increases such that $p \exp[-\lambda s] > 1$, an explosion in the number of chains will occur. Even though many chains can be produced in this way, they become highly correlated. An optimal choice of the parameters would be $p = \exp[\lambda s]$. The ideas underlying the enrichment method have been used within the framework of other methods and have been extended recently for chains with finite interactions on the lattice and in continuum. The potential of the enrichment method has not been fully exploited yet, and we expect to see future developments opening new ways to handle complex polymer and protein systems.

Direct Methods for Calculating the Entropy from MC and MD Samples

Two methods are described here: the hypothetical scanning method and the local states (LS) method. Both were developed originally for spin systems and are discussed in a later section, but for simplicity we illustrate how they have been applied to a model of SAWs on a square lattice. These methods enable one to extract the approximate entropy from a sample simulated by any technique, in particular by the MC or the MD procedures. They are based on the concept that two samples in equilibrium generated by different simulation methods are equivalent in the sense that they lead to the same estimates (within statistical error) of average properties, such as the entropy, energy, and their fluctuations.
The Hypothetical Scanning Method

The hypothetical scanning method, based on the above-mentioned equivalence, assumes that a given sample of SAWs (produced correctly with $P_i^B$ by MC or MD or any other exact simulation method) has instead been generated with the scanning method. This assumption enables one to reconstruct for each step the (hypothetical) scanning transition probabilities. One starts with the first chain of the given sample and imagines how this chain would have been constructed with the scanning method. Thus, assuming that the chain does not yet exist, one calculates the four transition probabilities (Eq. [66]) of the first step. Since the first step of the chain is known, its transition probability becomes known too. This first step is then constructed; the probability of the second step is calculated the same way, and the step is added to the first. The process continues until the first chain has been completely reconstructed and the scanning probabilities of all its steps have been calculated; their product leads to the approximate probability $P_i^0(f)$ of the chain (Eq. [67]). This procedure thus defines a probability, not only for each chain of the sample but, in principle, for the whole ensemble of SAWs, leading to an approximate absolute entropy functional $S^A$,

$$ S^A(f) = -k_B \sum_i P_i^B \ln P_i^0(f) \qquad [72] $$
where $i$ runs over the ensemble of SAWs. Notice that because the sample is generated with an exact simulation procedure, $S^A$ is a statistical average defined with the Boltzmann probability, which is normalized over the ensemble of SAWs. Each SAW $i$ is associated with the random variable $\ln P_i^0(f)$, where $P_i^0(f)$ is normalized over a larger ensemble that also includes self-intersecting walks; in the correct entropy calculation, $P_i^0(f)$ is replaced by $P_i^B$ (Eq. [62]). $S^A(f)$ is an upper bound for the exact entropy $S$ (see Ref. 38) and can be improved (i.e., decreased) systematically by increasing $f$. Another criterion for the improvement of $S^A(f)$ is the decrease of its fluctuation $\sigma[S^A(f)]$ as $f$ is increased. For SAWs, the entropy $S$ plays the role of the free energy in a system with finite interactions; therefore the fluctuation of the exact $S$ is zero (see discussion following Eq. [12]), which is also demonstrated by the fact that $P_i^B$ is equal for all the SAWs. On the other hand, $\sigma(S^A)$ is larger than zero but is expected to decrease as the approximation improves, which means that the sample size required for estimating $S^A$ with a given precision decreases as well. Indeed, such a decrease of $\sigma[S^A(f)]$ has been found in all systems studied thus far.38,265,266 $S^A$ is estimated from the given sample by $\bar S^A$ (see Eq. [29]),

$$ \bar S^A(f) = -\frac{k_B}{n} \sum_{t=1}^{n} \ln P_{i(t)}^0(f) \qquad [73] $$
One can define another approximation for the absolute entropy, denoted $S^B(f)$ (for details see Ref. 38), estimated by $\bar S^B(f)$:
Estimation of $S^B$ is more difficult than that of $S^A$; also, it can be shown rigorously that $S^B \le S^A$. However, for all the systems studied, where $S^B$ could be estimated reliably, it has been found that $S^B \le S$, and the absolute values of the deviations of $S^A$ and $S^B$ from $S$ are approximately equal. This is an important result because the average value of $S^A$ and $S^B$, denoted by $S^M$, becomes a better approximation than either of them individually. Typically, several approximations for $S^A$, $S^B$, and $S^M$ are calculated as a function of $f$, and their convergence enables one to determine the correct entropy with high accuracy. As an example, for SAWs of $N = 79$ on a square lattice, one finds for $f = 1$, 4, and 7, $S^M = 0.1035 \pm 0.0003$, $0.11081 \pm 0.00003$, and $0.11089 \pm 0.00002$, respectively. The latter estimate is equal to the correct value obtained by a simple sampling calculation (notice that $N$ was limited to 79 only because of the inefficiency of the simple sampling method). $S^A$, $S^B$, and $S^M$ were also calculated for a system of many SAWs in a "box" on the square lattice.266

The hypothetical scanning method is especially well suited for treating random coil chains because the future scanning is carried out in all directions. On the other hand, if the chain resides in a confined region of configuration space (e.g., the $\alpha$-helical region of a peptide), it is difficult to limit the scanning to this region, as required; in such cases the system can be better handled by the local states method.
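The reconstruction step of the hypothetical scanning method can be sketched as follows. The code assumes that each chain of the sample is given as a list of square-lattice sites; the function names are hypothetical, the future SAWs are counted by brute-force recursion, and the result is the sample average of $-\ln P_i^0(f)$ in units of $k_B$ for the whole chain (not per step). Only the functional $S^A$ is covered; $S^B$ is not attempted here.

```python
import math

STEPS = [(1, 0), (0, 1), (-1, 0), (0, -1)]


def n_future(head, occupied, f):
    """Count self-avoiding continuations of f further steps from `head`."""
    if f == 0:
        return 1
    total = 0
    for dx, dy in STEPS:
        site = (head[0] + dx, head[1] + dy)
        if site not in occupied:
            occupied.add(site)
            total += n_future(site, occupied, f - 1)
            occupied.remove(site)
    return total


def log_scanning_probability(chain, f):
    """ln P0(f) that the scanning method would assign to an existing SAW."""
    n_steps = len(chain) - 1
    occupied = {chain[0]}
    log_p = 0.0
    for k in range(n_steps):
        head, target = chain[k], chain[k + 1]
        f_eff = min(f, n_steps - k)  # truncate the look-ahead at the chain end
        counts = {}
        for dx, dy in STEPS:
            site = (head[0] + dx, head[1] + dy)
            if site not in occupied:
                occupied.add(site)
                counts[site] = n_future(site, occupied, f_eff - 1)
                occupied.remove(site)
        # The step actually taken by the chain is known, so its
        # hypothetical transition probability can be read off directly.
        log_p += math.log(counts[target] / sum(counts.values()))
        occupied.add(target)
    return log_p


def entropy_upper_bound(sample, f):
    """Sample average of -ln P0(f), in units of k_B."""
    return -sum(log_scanning_probability(c, f) for c in sample) / len(sample)
```

For the four-step chain (0,0), (1,0), (1,1), (0,1), (0,2), the reconstruction with `f = 1` gives a probability of 1/72, whereas a complete scanning (`f = 4`) gives 1/100, the reciprocal of the number of four-step SAWs, which illustrates how the approximation changes with the scanning parameter.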
The Local States Method

The exact transition probabilities of the scanning method, $p(\nu_k \mid \nu_{k-1}, \ldots, \nu_1, f_{max})$, can be used only for short chains. Therefore, the scanning and hypothetical scanning methods are based on the approximate transition probabilities $p(\nu_k \mid \nu_{k-1}, \ldots, \nu_1, f)$ (Eq. [66]), where in practice $f < N$. In principle, one can define another set of approximate transition probabilities, $p(\nu_k \mid \nu_{k-1}, \ldots, \nu_{k-b}, f_{max})$, which are based on a complete future scanning $f_{max}$, but only the $b$ steps preceding step $k$ are taken into account; $b$ (signifying "back") is the correlation parameter. These $b$ preceding steps can have many different configurations, which are called local states (their number is smaller than $3^b$ for a nonreversal SAW on the square lattice). It is impossible to construct long chains on the basis of these transition probabilities because a complete future scanning is impractical. However, one can interpret a given sample, generated with any simulation technique, as if it were generated with a step-by-step construction procedure based on $p(\nu_k \mid \nu_{k-1}, \ldots, \nu_{k-b}, f_{max})$, using the same philosophy used in the hypothetical scanning method.

Assume first that the chains are not too long and one seeks to obtain $p(\nu_k \mid \nu_{k-1}, \ldots, \nu_1, f_{max})$, the exact transition probability, from a given sample. To determine that, one can calculate the numbers of occurrences, $n(\nu_{k-1}, \ldots, \nu_1)$ and $n(\nu_k, \nu_{k-1}, \ldots, \nu_1)$, of the local states $(\nu_{k-1}, \ldots, \nu_1)$ and $(\nu_k, \nu_{k-1}, \ldots, \nu_1)$, respectively, in the sample. Because the sample was generated by an exact method (e.g., MC), the following ratios should lead to the correct transition probabilities:

$$ p(\nu_k \mid \nu_{k-1}, \ldots, \nu_1, f_{max}) \approx \frac{n(\nu_k, \nu_{k-1}, \ldots, \nu_1)}{n(\nu_{k-1}, \ldots, \nu_1)} \qquad [75] $$
where the equal sign will replace $\approx$ for an infinite sample. Thus, in principle, all the exact transition probabilities can be recovered in this way. Obviously, for long chains this is impractical because of the exponential increase in the number of local states. However, one can obtain the approximate transition probability, $p(\nu_k \mid \nu_{k-1}, \ldots, \nu_{k-b}, f_{max})$:

$$ p(\nu_k \mid \nu_{k-1}, \ldots, \nu_{k-b}, f_{max}) \approx \frac{n(\nu_k, \nu_{k-1}, \ldots, \nu_{k-b})}{n(\nu_{k-1}, \ldots, \nu_{k-b})} \qquad [76] $$
The process is carried out in two stages. First, one calculates the numbers of occurrence of all the local states, $n(\nu_k, \ldots, \nu_{k-b})$, from which a set of transition probabilities is obtained and stored in the computer memory. Then, each chain is visited again and the transition probabilities corresponding to the chain steps are obtained; their product defines an approximate probability of the chain:

$$ P_i(b) = \prod_{k=1}^{N} p(\nu_k \mid \nu_{k-1}, \ldots, \nu_{k-b}, f_{max}) \qquad [77] $$
As for the hypothetical scanning method, one defines an entropy functional $S^A(b)$ by substituting $P_i(b)$ for $P_i^0(f)$ in Eq. [72]; $S^B(b)$ and $S^M(b)$ are defined correspondingly and have the same properties as those described for their counterparts defined for the hypothetical scanning method. For models with finite interactions, the average energy and each of these functionals lead to approximations for the free energy. For example, one can define $F^A(b)$ by

$$ F^A(b) = \langle E \rangle - T S^A(b) \qquad [78] $$
where $F^A(b)$ provides an underestimation of the correct $F$. The local states (LS) method was applied to decaglycine and the pentapeptide Leu-enkephalin86,87 simulated by MC using the potential energy function ECEPP.81,82 This potential is based on rigid geometry, having constant bond lengths and angles, but variable dihedral angles. To apply the LS method, the dihedral angles were discretized by a parameter $l$ ($l \le 40$, vs. $l = 4$ on the square lattice), providing a suitable definition of local states and transition probabilities, not only for the backbone but also for the side chains; in this case the probabilities become probability densities. The method was also used to calculate the entropy and free energy of the cyclic hexapeptide cyclo-(Ala-Pro-D-Phe)$_2$ in vacuum and in a crystalline environment,101 and of analogs of the decapeptide gonadotropin-releasing hormone. More recently the LS method was used for calculating the entropy of loops in a ras protein23 and the free energy of wide microstates of a cyclic hexapeptide in DMSO.268 In these works, the simulations were carried out using MD where the internal variables also included the bond angles, while bond lengths were assumed to be constant.

In several of the studies above, estimation of $S^B$ was found to be unreliable, indicating that one must rely on the approximate $S^A$ alone. This is not a serious limitation, because, in general, one is interested in the difference in the free energy between two or more wide microstates, rather than in their absolute values. In many cases it has been found that such differences $\Delta F_{ij}(b, l)$ between the wide microstates $i$ and $j$ converge rapidly as the approximation improves (i.e., as $b$ and $l$ increase). Therefore the converged value is expected to remain unchanged even for larger values of these parameters, thus representing faithfully the correct difference, within the statistical error. Finally, it should be pointed out that because the transition probabilities are calculated in the LS method from the existing conformations and not from the future ones, the method is suited to handle a wide microstate that is defined on a limited region of conformational space (e.g., an $\alpha$-helical state).37,86,87,101,267
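The two-stage counting procedure of the LS method (Eqs. [76] and [77]) lends itself to a compact sketch. The version below makes several simplifying assumptions that are not taken from the original work: each chain is a tuple of discrete direction labels, the occurrence counts are pooled over all positions $k$ along the chain, the first steps (which have fewer than $b$ predecessors) use their shorter history as the local state, and the entropy functional is returned in units of $k_B$. All function names are hypothetical.

```python
import math
from collections import Counter


def local_state_counts(sample, b):
    """Stage 1: count local states over the whole sample.

    joint[(history..., v)] counts a step v preceded by `history`;
    marginal[history] counts the history alone.
    """
    joint, marginal = Counter(), Counter()
    for chain in sample:
        for k, v in enumerate(chain):
            history = tuple(chain[max(0, k - b):k])
            joint[history + (v,)] += 1
            marginal[history] += 1
    return joint, marginal


def log_chain_probability(chain, joint, marginal, b):
    """Stage 2: ln of the product of the counted transition probabilities."""
    log_p = 0.0
    for k, v in enumerate(chain):
        history = tuple(chain[max(0, k - b):k])
        log_p += math.log(joint[history + (v,)] / marginal[history])
    return log_p


def entropy_functional(sample, b):
    """Sample average of -ln P_i(b), in units of k_B."""
    joint, marginal = local_state_counts(sample, b)
    total = sum(log_chain_probability(c, joint, marginal, b) for c in sample)
    return -total / len(sample)
```

For the toy sample `[(0, 1), (0, 1), (0, 2), (1, 1)]`, the functional with `b = 1` equals `1.5 * ln 2`, and ignoring the correlations (`b = 0`) gives a larger value, consistent with the systematic decrease of the functional as the approximation improves. With a real sample the number of local states grows quickly with `b`, so the counts must be checked for adequate statistics.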
The Stochastic Models Method of Alexandrowicz and Its Implications

Alexandrowicz proposed the stochastic models (SM) simulation method based on the premise that a magnetic system, such as the Ising model, can be described in terms of a long linear chain of spins.39 Thus, a system configuration can be obtained by generating the chain step by step with the help of transition probabilities, leading to the absolute entropy as previously described for the Rosenbluth or the scanning methods. For simplicity, the SM method is discussed here as applied to an Ising model, which is of interest because of its equivalence to a lattice gas model.50 However, the ideas of the SM method can be further extended to fluid systems as well.40

Assume an Ising model on a square lattice of $N = L \times L$ spins. At site $k$ on the lattice the possible spin orientations are $\sigma_k = \pm 1$. Neighbor spins $m$ and $l$ interact with ferromagnetic energy $-J\sigma_m\sigma_l$ ($J > 0$), and the reciprocal temperature is defined by $K = J/k_BT$. With the SM method one starts from an empty lattice that is filled with spins step by step and row by row, with the help of transition probabilities (Figure 6). At step $k$ of the process, $k - 1$ spins have already been constructed, and one wants to determine the spin at the vacant site $k$. The last $L$ spins added to the lattice are called the "uncovered" spins. Because of the nearest-neighbor interaction, the transition probabilities depend only on the uncovered spins (see Ref. 269). Thus, if the uncovered nearest-neighbor spins to site $k$, $\sigma_{k-1}$ and $\sigma_{k-L}$, are both $+1$ ($-1$), the probability for $\sigma_k = +1$ ($-1$) is large because the corresponding interaction energy is negative; this probability is further enhanced if the more distant uncovered spins ($\sigma_{k-2}$, etc.) are also equal to $+1$ ($-1$). Obviously, the influence of the remote uncovered spins is weaker. Alexandrowicz defined transition probabilities taking into account as many uncovered spins as possible around site $k$.
For each of these spins around site $k$, he assigned a free parameter. If the configuration of this group of spins is denoted by $\bar\sigma$ and the corresponding set of parameters by $\mathbf{a}$, the transition probability becomes $p(\sigma_k) = p(\sigma_k \mid \bar\sigma, \mathbf{a})$. The set of parameters is determined prior to the simulation, thus defining a set of transition probabilities. An initially empty lattice can be filled with spins using the $p(\sigma_k)$s, the product of which leads to the probability of construction $P_i(\mathbf{a})$ of configuration $i$ (compare with Eq. [67]). A sample of configurations thus constructed leads to an approximation for the free energy $F(\mathbf{a})$, which is estimated by

$$ \bar F(\mathbf{a}) = \frac{1}{n} \sum_{t=1}^{n} \left[ E_{i(t)} + k_BT \ln P_{i(t)}(\mathbf{a}) \right] \qquad [80] $$

Figure 6 Diagram illustrating the $k$th step of the construction of a square Ising lattice of $L \times L$ spins with the stochastic models (SM) method: solid circles denote lattice sites already filled with spins ($\pm 1$) in preceding steps of the process; open circles denote the still empty lattice sites. The linear nature of the buildup construction is achieved by using "spiral" boundary conditions (i.e., the first spin in a row interacts with the last spin of the preceding row). Whereas all the $L$ "uncovered" spins (at sites $k - L$, $k - L + 1$, ..., $k - 1$) determine the transition probability for selecting spin $k$, the spins in close proximity to $k$ ($k - 1$, $k - L$, etc.) have the largest effect. The local states method is based on the SM construction. Thus, the transition probabilities for spin $k$ are obtained from a Metropolis Monte Carlo sample by calculating the number of occurrences of the various local states, $n(\sigma_k, \bar\sigma)$. The transition probability is $p(\sigma_k \mid \bar\sigma) = n(\sigma_k, \bar\sigma)/[n(\sigma_k = 1, \bar\sigma) + n(\sigma_k = -1, \bar\sigma)]$. These transition probabilities lead to the probability of lattice configuration $i$ and to its contribution to the entropy (see text).
One generates several samples with different sets of parameters, and the set leading to the lowest value of $\bar F(\mathbf{a})$ is the optimal one in accordance with the "minimum free energy principle" (see Eq. [10]). That set is then used in the production runs. In principle, one can estimate the correct $F$ by importance sampling (as in Eq. [71]); however, satisfactory results for $F$ have already been obtained from Eq. [80]. This method was improved later by Meirovitch, who calculated the transition probabilities differently, by looking ahead as with the scanning method.269 Finally, we point out that the transformation from an Ising model with an external magnetic field to a lattice gas model is basically achieved by identifying a positive spin with a particle and a negative spin with a vacancy.

Calculation of Entropy from an MC Sample

Although the SM method has the important advantage of providing the entropy, it is less convenient to use than the usual Metropolis MC method. However, its step-by-step construction scheme enables one to apply both the LS and the hypothetical scanning methods to samples obtained with MC in the same way described earlier for polymers. In fact, these methods were developed originally for Ising systems, and only later extended to polymer chains. Furthermore, because Ising and lattice gas systems do not have long-range interactions, both methods are more efficient for these systems than for polymers. For example, application of the hypothetical scanning method to MC samples of the simple cubic Ising lattice of size $L = 25$ and 30 has led to $S^A(f)/k_BN = 0.5856 \pm 0.0001$, $0.552 \pm 0.002$, and $0.5052 \pm 0.0003$ (Eq. [72]) for $K = 0.213$, $K_c = 0.22169$, and 0.226, respectively, where the corresponding best series expansion estimates are 0.5856, 0.558, and 0.497 ($K_c$ is the approximate critical temperature). Highly accurate results for $S^M$ (see definition after Eq. [74]) were obtained for the square Ising lattice using the LS method. For $L = 100$ and $K = 0.4$, the free energy obtained is $F^M = -0.87937 \pm 0.00001$ versus the exact value $F = -0.87936$; at $K_c$ ($L = 40$), $F^M = -0.9300 \pm 0.0003$ versus $F = -0.9301$.270 The LS method was also applied to study first-order phase transitions in the face-centered-cubic Ising antiferromagnet.

It would be of interest to extend the LS method and the hypothetical scanning method to MC and MD samples of continuum models of fluids (e.g., argon and water) and peptides in aqueous solutions. These methods enable one to calculate differences in free energy between significantly different wide microstates directly, that is, from two samples, and in principle can handle even small differences in free energy of such microstates because the free energy fluctuations decrease with enhancement of the quality of the approximation.
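The free energy estimator of the SM method (Eq. [80]) and the minimum free energy principle can be illustrated on a deliberately simplified system. The sketch below is not the square-lattice construction of Figure 6: it assumes an open one-dimensional Ising chain, where a single parameter $a$ (an assumption of this example) defines the transition probability $\exp[a\sigma_k\sigma_{k-1}]/(2\cosh a)$ and the choice $a = K$ is exact. The function names are hypothetical.

```python
import math
import random


def build_chain(n_spins, a, rng):
    """Grow an open 1D Ising chain spin by spin; return (spins, ln P)."""
    spins = [rng.choice((-1, 1))]
    log_p = math.log(0.5)
    p_same = math.exp(a) / (2.0 * math.cosh(a))  # next spin parallel to previous
    for _ in range(n_spins - 1):
        same = rng.random() < p_same
        spins.append(spins[-1] if same else -spins[-1])
        log_p += math.log(p_same if same else 1.0 - p_same)
    return spins, log_p


def energy(spins, J=1.0):
    """Nearest-neighbor ferromagnetic energy of an open chain."""
    return -J * sum(s * t for s, t in zip(spins, spins[1:]))


def free_energy_estimate(n_samples, n_spins, a, K, rng, J=1.0):
    """Sample average of E + kT ln P and the spread of the sampled values."""
    kT = J / K
    values = []
    for _ in range(n_samples):
        spins, log_p = build_chain(n_spins, a, rng)
        values.append(energy(spins, J) + kT * log_p)
    return sum(values) / n_samples, max(values) - min(values)


def exact_free_energy(n_spins, K, J=1.0):
    """Exact F = -kT ln Z for the open chain, Z = 2 (2 cosh K)^(N-1)."""
    return -(J / K) * (math.log(2.0) + (n_spins - 1) * math.log(2.0 * math.cosh(K)))
```

With `a = K` every constructed configuration yields the same value of `E + kT ln P`, namely the exact free energy, so the fluctuation vanishes; with a poorer parameter (for example `a = 0.2` at `K = 0.5`) the estimate lies above the exact value and fluctuates, which is the behavior exploited when the parameters are optimized.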
ADDITIONAL METHODS FOR CALCULATING THE ENTROPY

In this section we describe several methods not pertaining to the techniques described earlier. We discuss the multicanonical method of Berg as applied to macromolecules, and the adiabatic switching procedure of Reinhardt,271 which is related to the thermodynamic integration approach but is based on different grounds. For completeness, we mention four additional techniques, three of which were developed originally for spin models.
The Multicanonical Approach

The multicanonical algorithm of Berg and collaborators was originally developed for spin systems, but those ideas were later transferred to proteins. Formally, this technique is a Metropolis Monte Carlo procedure that is applied to a probability distribution that is different from the commonly used Boltzmann PD, and because of this the entropy is obtained as a by-product of the simulation. Later, Lee46 found an alternative implementation of the multicanonical algorithm, the "entropic sampling method," that is simpler for some applications than the original one. The multicanonical method was first applied to peptides by Hansmann and Okamoto, who used the method to study the helix-coil transition273 and used the pentapeptide Met-enkephalin to compare its efficiency to that of other techniques. Meirovitch et al.98 applied the multicanonical method to Leu-enkephalin. Hao and Scheraga48 improved the method and applied it to simplified lattice chain models to study various aspects of protein folding. A similar study was carried out by Kolinski et al.49

The "entropic sampling" version46 of the multicanonical algorithm, as applied to a discrete lattice model of a chain with attracting interactions, is of interest and is described here. Let the number of different energies of the system be $L$, and let $i$ and $j$ be chain conformations, with the degeneracy of energy $E$ denoted by $\nu(E)$ (compare with Eqs. [3], [13], and [14]). The multicanonical probability of conformation $i$ is

$$ P_i^M = \frac{P^M(E_i)}{\nu(E_i)} = \frac{1}{L\,\nu(E_i)} = \frac{\exp[-S(E_i)]}{L} \qquad [81] $$
where $P^M(E_i) = 1/L$ is the probability of finding the system in any conformation with energy $E_i$. Thus, all the energies have equal probability, compared with the highly uneven Boltzmann probability for $E$, $P^B(E) = \nu(E) \exp[-E/k_BT]/Z$ (where the subscript $i$ is omitted). $S(E)$ is the microcanonical entropy of energy $E$ divided by $k_B$. One can simulate the system with the Metropolis method according to $P_i^M$ instead of $P_i^B$, using transition probabilities that satisfy the detailed balance condition (Eq. [35]). $L$ is canceled, giving (compare with Eq. [34]),

$$ p_{ij} = \min(1, \exp[-S(E_j) + S(E_i)]) = \min[1, \nu(E_i)/\nu(E_j)] \qquad [82] $$
Whereas chain conformations that pertain to the energies with high degeneracy will be selected more frequently as trial configurations with $T_{ij}$ (Eq. [35]), $p_{ij}$ will favor the structures of the low degenerate energies, so that the net result is that all the energies will be visited "democratically" with the same probability. This way the simulation reaches the low energy structures without becoming trapped in potential energy wells with high barriers, as occurs with the usual MC procedure.

The problem of course lies in the fact that the entropies are not known a priori. The "entropic sampling" version46 provides a simple prescription enabling one to build, in a recursive way, a function $J(E)$ that is proportional to $S(E)$. More specifically, the process consists of several separate simulations carried out with transition probabilities

$$ p_{ij} = \min(1, \exp[-J(E_j) + J(E_i)]) \qquad [83] $$
where the improved values of $J(E)$ obtained from simulation $k$ are used to perform the $(k + 1)$th simulation, and the process continues until all the energies are visited approximately with the same probability. Then a long production simulation is carried out. More specifically, one can start the process with a usual MC run at a very high temperature, where the Boltzmann probability is approximately random. The energy range obtained in the simulation is divided into bins, and the number of visits (i.e., MC steps) $H(E)$ to each energy bin is calculated. Each bin is assigned an initial value $J(E) = 1$, which after this first run is corrected in the following way:
The next simulation is carried out with the new set of $J(E)$ using Eq. [83], where more bins are visited, and the histogram $H(E)$ is calculated and used to update $J(E)$ again, and so on. If the total number of configurations $\Omega$ is known, at the end of the process one can obtain the absolute entropies,
In any case, the final set of $J(E)$ enables one to estimate the ensemble average of any function $X$ at any temperature $T$,

$$ \langle X(T) \rangle = \frac{\sum_E X(E) \exp[J(E) - E/k_BT]}{\sum_E \exp[J(E) - E/k_BT]} \qquad [86] $$
It should be pointed out, however, that for a large system, most of the contribution to $\langle X(T) \rangle$ comes from a very narrow energy region around $E^*$ (see Eq. [13]) (or several such regions). Therefore, the size of the bins should be decreased to gain accuracy, but this requires longer simulations to accumulate enough statistics. In this case the efficiency can be enhanced by limiting the simulations to a small range of energies around $E^*$, but the results for $\langle X(T) \rangle$ will be limited to a narrow range of temperatures.

The ability of the multicanonical algorithm to cover the whole energy range (especially the low end) has not been well documented. It depends on the particular system studied and can be affected significantly by the procedure used for selecting trial conformations with the probabilities $T_{ij}$. Indeed, to enhance efficiency, Hao and Scheraga implemented a biased set of $T_{ij}$ and "jump walking techniques" in their simulations of a 38-residue lattice protein model48,277-280 (see also Ref. 49). Hansmann and Okamoto, who carried out efficiency studies and comparisons with other techniques, used essentially the same relatively short peptide, Met-enkephalin. The multicanonical ideas were also used by Kumar, Payne, and Vásquez for calculating the potential of mean force.141c

In summary, the multicanonical method provides unique features that are important for investigating first-order phase transitions and phenomena that occur over a large range of energies, such as protein folding. Upon further optimization, the method is expected to become an important tool for conformational search and for studying phase transitions in macromolecular systems.
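Once a converged set of $J(E)$ is available, the acceptance rule of Eq. [83] and the reweighting of Eq. [86] are simple to apply. The sketch below assumes that the energies are already binned and that $J$ is supplied as a list aligned with those bins; the function names are hypothetical. The normalization to absolute entropies uses the condition that the degeneracies sum to the total number of configurations $\Omega$, which is an assumption of this example about how the absolute scale is fixed.

```python
import math


def acceptance_probability(J_current, J_trial):
    """Metropolis acceptance min(1, exp[J(E_i) - J(E_j)]) for a move i -> j."""
    if J_trial <= J_current:
        return 1.0
    return math.exp(J_current - J_trial)


def canonical_average(energies, J, X, kT):
    """Reweighted average of X(E) at temperature T from the set J(E)."""
    log_w = [j - e / kT for j, e in zip(J, energies)]
    shift = max(log_w)  # subtract the maximum to avoid overflow
    w = [math.exp(lw - shift) for lw in log_w]
    return sum(x * wi for x, wi in zip(X, w)) / sum(w)


def absolute_entropies(J, log_omega):
    """Shift J(E) so that sum_E exp[S(E)] equals Omega (S in units of k_B)."""
    shift = max(J)
    log_norm = shift + math.log(sum(math.exp(j - shift) for j in J))
    return [j - log_norm + log_omega for j in J]
```

A convenient check is a set of $N$ independent two-level units with energy equal to the number of excited units: supplying $J(E)$ equal to the logarithm of the binomial degeneracy plus an arbitrary constant recovers the binomial entropies after normalization, and the reweighted mean energy at $k_BT = 1$ equals $N/(1 + e)$.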
Calculation of Entropy by Adiabatic Switching

An interesting method for calculating differences in entropy and free energy, and in some cases their absolute values, was developed by Watanabe and Reinhardt.271 The method is based on the observation of Hertz (see discussion in Ref. 271) that for Hamiltonian dynamics, the energy shell in phase space $H(p, q) = E$ (where $q$ is a vector of the generalized coordinates of an $N$-particle system) for a given energy $E$ can be an adiabatic invariant. More specifically, assume the parameters $a(t)$ of a Hamiltonian depend on the time $t$ and for each set $a(t)$ the Hamiltonian is ergodic; in this case a trajectory started on the energy shell of $H(p, q, a(t_0))$ at time $t_0$, where $a(t)$ is changed adiabatically (i.e., very slowly in time), will end up on the energy shell of the Hamiltonian $H(p, q, a(t_{final}))$, with a different energy. According to Liouville's theorem,62 the starting and final energy shells will have the same phase space volume, i.e., the same entropy.

Watanabe and Reinhardt271 used these two invariants for calculating the entropy and free energy of water models by MD simulations. In that study, the simulation started with the full water potential at temperature $T$ and constant density. The potential energy parameters were decreased very slowly to zero during the (microcanonical) simulation using some switching functions. The final system thus became an ideal gas with some kinetic energy and known entropy that is also the absolute entropy of the water system. The computed entropy for three water systems deviated by 0-20% from those obtained by simulations based on other techniques. The authors found that the accuracy of the method depends on the choice of switching functions. This interesting approach has been developed further for Ising and other simple models281-285 but has not yet been applied to realistic macromolecular or fluid systems.
Four Additional Methods
For the sake of completeness, we mention here without elaboration four additional methods for calculating the free energy. These methods are related to methods described throughout this chapter.
The first is the multistage sampling method of Valleau and Card (Ref. 286; see also Ref. 123), which was applied initially to a fluid model of hard spheres with finite interactions. In its simplest form, the method is based on two MC simulations, one at $T = \infty$ and the other at the temperature of interest $T$. The corresponding energy distributions are matched, enabling one to obtain the absolute free energy at $T$ (and at higher temperatures), based on the known free energy of the hard sphere model. For large systems several simulations at different temperatures are required, and the method becomes impractical.
The second method was suggested by Bhanot et al. (Ref. 287) and applied to lattice spin models. For an Ising model, the smallest energy change that can be produced by flipping a single spin is $\pm 4J$, where $J$ is the interaction constant. The energy range is divided into overlapping groups of four energies, $(E, E + 4J, E + 8J, E + 12J)$, $(E + 12J, E + 16J, E + 20J, E + 24J)$, etc., where one is interested in the ratios $p(E) = u(E)/u(E')$ (Eq. [13]) between neighboring energies of the group. This is achieved by the following MC procedure: one starts from a spin configuration that pertains to one of the groups; a randomly chosen spin is flipped, and the move is accepted if the energy remains within the group energies (otherwise the move is rejected). A large number of such MC trials enables one to estimate the ratios $p(E)$ within all the different groups, which are matched to give the absolute values of $u(E)$ if $u(E)$ for one energy is known. This method is also limited to relatively small systems.
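The restricted random walk of the second method can be sketched on a very small system. The example below is a simplified illustration rather than the procedure of Bhanot et al.: it assumes a one-dimensional Ising ring (where single-spin flips also change the energy by $0$ or $\pm 4J$), uses groups of two energies instead of four, and labels each energy by the number $k$ of broken bonds, $E = -J(N - 2k)$. The names `window_histogram`, `exact_degeneracy`, and `chained_ratio` are hypothetical; the exact degeneracy $2\binom{N}{k}$ of the ring is used only to check the estimate.

```python
import random
from math import comb


def window_histogram(n_spins, k_low, k_high, n_steps, seed=0):
    """Single-spin-flip walk on an Ising ring, confined to configurations
    with k_low or k_high broken bonds (energy E = -J * (n_spins - 2k)).

    Returns the number of visits to each of the two energies.
    """
    rng = random.Random(seed)
    # Start with exactly k_low broken bonds: alternate the first k_low spins.
    spins = [-1 if (i < k_low and i % 2 == 0) else 1 for i in range(n_spins)]
    k = k_low
    visits = {k_low: 0, k_high: 0}
    for _ in range(n_steps):
        i = rng.randrange(n_spins)
        left, right = spins[i - 1], spins[(i + 1) % n_spins]
        broken_before = (spins[i] != left) + (spins[i] != right)
        # Flipping spin i turns its broken bonds into intact ones and vice versa.
        k_new = k + 2 - 2 * broken_before
        if k_new in visits:  # accept only if the energy stays in the group
            spins[i] = -spins[i]
            k = k_new
        visits[k] += 1  # a rejected move counts the current state again
    return visits


def exact_degeneracy(n_spins, k):
    # Choose which k bonds are broken, times 2 for the global spin flip.
    return 2 * comb(n_spins, k)


def chained_ratio(n_spins, n_steps=200_000):
    """Estimate u(k=6)/u(k=2) by matching two groups that share k=4."""
    first = window_histogram(n_spins, 2, 4, n_steps, seed=1)
    second = window_histogram(n_spins, 4, 6, n_steps, seed=2)
    return (first[4] / first[2]) * (second[6] / second[4])
```

Because proposals are symmetric and moves are rejected only when they leave the group, every configuration inside a group is visited with equal probability, so the visit counts estimate the degeneracy ratios; multiplying the ratios of overlapping groups is the matching step described above.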
The third method is the histogram method of Ferrenberg and Swendsen (Refs. 288-290). These authors first show that the free energy (up to an additive constant) can be obtained from a usual MC simulation at temperature $T$. The energy range is divided into bins around the values $E_i$, and the probability $p_i(T)$ of energy $E_i$ is obtained from the number of times the bin of $E_i$ was visited during the simulation. The degeneracy $u(E_i) = \exp[S_i]$ of $E_i$ (see Eqs. [4], [11], [13], and [81]) is $u(E_i) = p_i(T)\,Z(T)\exp[+E_i/k_B T]$, where the partition function is $Z(T) = \sum_i u(E_i)\exp[-E_i/k_B T]$. $Z$ and $u(E_i)$ are unknown but can be calculated in a recursive way. One starts from some value of $Z$ and calculates the resulting $u(E_i)$ using the first equation above, which leads to a new value of $Z$ through the second equation. The new value of $Z$ is then used for recalculating the $u(E_i)$, and the process continues until $Z$ and the $u(E_i)$ do not change, meaning that their values are the correct ones up to a multiplicative factor. Ferrenberg and Swendsen then show how to evaluate the quantities above from results obtained from several MC simulations at different temperatures in a manner that serves to minimize the statistical error. This enables one to calculate differences in free energy at different temperatures and, for systems that are not too large, to cover a large range of $T$. The histogram method was extended and used for calculating the potential of mean force in macromolecules (see Refs. 141a-141c).
The last method, suggested by [...], uses the perturbation idea but is applied to two Hamiltonians corresponding to systems that differ in size. The exact absolute free energy of the smaller system (an Ising model in this study) is known, so that of the larger one can be determined. The process uses several perturbations in which the size of the system is increased, and the efficiency is enhanced using Bennett’s procedure (Ref. 68; Eq. [25]).
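The single-simulation relation of the histogram method, $u(E_i) \propto p_i(T)\exp[+E_i/k_B T]$, can be checked numerically on a toy model. The sketch below is only an illustration under stated assumptions: the energy levels and degeneracies are those of an 8-spin Ising ring with $J = 1$, exact Boltzmann probabilities stand in for the histogram that an MC run would provide, and the function names are hypothetical. It does not implement the multiple-histogram error minimization of Ferrenberg and Swendsen.

```python
from math import comb, exp


def boltzmann_probabilities(energies, degeneracies, kT):
    """p_i(T) = u(E_i) exp(-E_i/kT) / Z(T)."""
    weights = [u * exp(-e / kT) for e, u in zip(energies, degeneracies)]
    z = sum(weights)
    return [w / z for w in weights]


def relative_degeneracies(energies, probabilities, kT):
    """u(E_i) up to a common factor: p_i(T) exp(+E_i/kT), scaled so u[0] = 1."""
    raw = [p * exp(e / kT) for e, p in zip(energies, probabilities)]
    return [r / raw[0] for r in raw]


def mean_energy(energies, degeneracies, kT):
    probs = boltzmann_probabilities(energies, degeneracies, kT)
    return sum(p * e for p, e in zip(probs, energies))


# Toy input: energy levels of an 8-spin Ising ring (J = 1), E = -(8 - 2k).
energies = [-8, -4, 0, 4, 8]
exact_u = [2 * comb(8, k) for k in (0, 2, 4, 6, 8)]

# "Simulation" histogram at kT = 2 (exact probabilities replace MC counts).
p_sampled = boltzmann_probabilities(energies, exact_u, kT=2.0)
u_estimate = relative_degeneracies(energies, p_sampled, kT=2.0)

# The unknown factor Z(T) cancels in averages, so the same histogram
# predicts the mean energy at another temperature.
print(mean_energy(energies, u_estimate, kT=1.0))
```

With real MC data the histogram is noisy in the rarely visited bins, which is why the reweighting is reliable only over a limited temperature range for a single run.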
SUMMARY
In the last 40 years significant progress has been made in developing computer simulation techniques for macromolecular systems and methods for calculating the entropy and free energy from simulation samples. These methods originated from various disciplines, where scientists were interested in studying diverse materials, such as fluids, synthetic polymers, and magnetic systems. The importance of interaction between different scientific fields is thus evident. The developments of the last two decades, especially the advent of umbrella sampling techniques and the combination of nonphysical perturbations within the framework of thermodynamic cycles, are particularly impressive. They have opened new avenues for calculating, for example, the free energy of binding of ligands to an enzyme, the relative stability of mutated proteins, the relative and absolute free energy of solvation of small molecules, and the potential of mean force along a reaction coordinate. The ability to compare these theoretical results with experimental data also provides new ways to check the reliability of force fields and thus to improve their quality. Unfortunately, these methods are based on integration steps that are time-consuming, and they are limited to relatively small chemical and structural changes. Recent attempts to overcome this problem rely on direct methods (e.g., the linear response technique), which provide the free energy from a single sample. Development of efficient methods of this type is one of the most important challenges in theoretical structural biology.
ACKNOWLEDGMENTS
I am grateful to Dr. Canan Baysal for her comments regarding the manuscript, and to Dr. Max Vásquez for valuable discussions. I acknowledge the hospitality of the Physics Department at Bar Ilan University, Israel, where part of this work was written. Support is acknowledged from the Florida State University Supercomputer Computations Research Institute, which is partially funded by the U.S. Department of Energy (DOE) under contract number DE-FC05-85ER250000. This work was also partially supported by DOE grant DE-FG05-95ER62070.
REFERENCES
1. N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller, J. Chem. Phys., 21, 1087 (1953). Equation of State Calculations by Fast Computing Machines.
2. K. Binder, Ed., Application of the Monte Carlo Method in Statistical Physics, Springer Verlag, Berlin-Heidelberg, 1987.
3. B. J. Alder and T. E. Wainwright, J. Chem. Phys., 27, 1208 (1957). Phase Transition of Hard Sphere System.
4. B. J. Alder and T. E. Wainwright, J. Chem. Phys., 31, 459 (1959). Studies of Molecular Dynamics. I. General Method.
5. J. A. McCammon, B. R. Gelin, and M. Karplus, Nature, 267, 585 (1977). Dynamics of Folded Proteins.
6. N. Gō and H. A. Scheraga, J. Chem. Phys., 51, 4751 (1969). Analysis of the Contribution of Internal Vibrations to the Statistical Weights of Equilibrium Conformations of Macromolecules.
7. A. T. Hagler, P. S. Stern, R. Sharon, J. M. Becker, and F. Naider, J. Am. Chem. Soc., 101, 6842 (1979). Computer Simulation of the Conformational Properties of Oligopeptides. Comparison of Theoretical Methods and Analysis of Experimental Results.
8. M. Karplus and J. N. Kushick, Macromolecules, 14, 325 (1981). Method for Estimating the Configurational Entropy of Macromolecules.
9. I. R. McDonald and K. Singer, J. Chem. Phys., 47, 4766 (1967). Machine Calculation of Thermodynamic Properties of a Simple Fluid.
10. J.-P. Hansen and L. Verlet, Phys. Rev., 184, 151 (1969). Phase Transition of the Lennard-Jones System.
11. W. G. Hoover and F. H. Ree, J. Chem. Phys., 47, 4783 (1967). Use of Computer Experiments to Locate the Melting Transition and Calculate the Entropy in the Solid Phase.
12. M. P. Allen and D. J. Tildesley, Computer Simulation of Liquids, Clarendon Press, Oxford, 1987.
13. J. G. Kirkwood, J. Chem. Phys., 3, 300 (1935). Statistical Mechanics of Fluid Mixtures.
14. R. W. Zwanzig, J. Chem. Phys., 22, 1420 (1954). High-Temperature Equation of State by a Perturbation Method. I. Nonpolar Gases.
15. D. R. Squire and W. G. Hoover, J. Chem. Phys., 50, 701 (1969). Monte Carlo Simulation of Vacancies in Rare-Gas Crystals.
16. G. M. Torrie and J. P. Valleau, Chem. Phys. Lett., 28, 578 (1974). Monte Carlo Free Energy Estimates Using Non-Boltzmann Sampling: Application to the Sub-Critical Lennard-Jones Fluid.
17. G. M. Torrie and J. P. Valleau, J. Comput. Phys., 23, 187 (1977). Nonphysical Sampling Distributions in Monte Carlo Free-Energy Estimation: Umbrella Sampling.
18. D. L. Beveridge and F. M. DiCapua, Annu. Rev. Biophys. Chem., 18, 431 (1989). Free Energy Via Molecular Simulation: Application to Chemical and Biomolecular Systems.
19. P. A. Kollman, Chem. Rev., 93, 2395 (1993). Free Energy Calculations: Applications to Chemical and Biochemical Phenomena.
20. M. Akke, R. Brüschweiler, and A. G. Palmer, J. Am. Chem. Soc., 115, 9832 (1993). NMR Order Parameters and Free Energy; An Analytic Approach and Application to Cooperative Ca2+ Binding by Calbindin D9k.
21. D. Yang and L. E. Kay, J. Mol. Biol., 263, 369 (1996). Contribution to Conformational Entropy Arising from Bond Vector Fluctuations Measured from NMR-Derived Order Parameters: Application to Protein Folding.
22. J. T. Stivers, C. Abeygunawardana, A. S. Mildvan, and C. P. Whitman, Biochemistry, 35, 16036 (1996). 15N NMR Relaxation Studies of Free and Inhibitor-Bound 4-Oxalocrotonate Tautomerase: Backbone Dynamics and Entropy Changes of an Enzyme upon Inhibitor Binding.
23. H. Meirovitch and T. F. Hendrickson, Proteins, 29, 127 (1997). The Backbone Entropy of Loops as a Measure of Their Flexibility. Application to a ras Protein Simulated by Molecular Dynamics.
24. M. Vásquez, G. Némethy, and H. A. Scheraga, Chem. Rev., 94, 2183 (1994). Conformational Energy Calculations on Polypeptides and Proteins.
25. F. T. Wall, L. A. Hiller, and D. J. Wheeler, J. Chem. Phys., 22, 1036 (1954). Statistical Computation of Mean Dimensions of Macromolecules.
26. M. N. Rosenbluth and A. W. Rosenbluth, J. Chem. Phys., 23, 356 (1955). Monte Carlo Calculation of the Average Extension of Molecular Chains.
27. F. T. Wall and J. J. Erpenbeck, J. Chem. Phys., 30, 634 (1959). New Method for the Statistical Computation of Polymer Dimensions.
28. F. T. Wall, R. J. Rubin, and L. M. Isaacson, J. Chem. Phys., 27, 186 (1957). Improved Statistical Method for Computing Mean Dimensions of Polymer Molecules.
29. F. T. Wall, S. Windwer, P. J. Gans, and L. M. Isaacson, Methods Comput. Phys., 1, 217 (1963). Monte Carlo Methods Applied to Configurations of Flexible Polymer Molecules.
30. F. L. MacCrackin, J. Mazur, and C. M. Guttman, Macromolecules, 6, 859 (1973). Monte Carlo Studies of Self-Interacting Polymer Chains with Excluded Volume. I. Squared Radii of Gyration and Mean-Square End-to-End Distances and Their Moments.
31. H. Meirovitch, J. Phys. A, 15, L735 (1982). A New Method for Simulation of Real Chains. Scanning Future Steps.
32. H. Meirovitch, J. Chem. Phys., 89, 2514 (1988). Statistical Properties of the Scanning Simulation Method for Polymer Chains.
33. I. Chang and H. Meirovitch, Phys. Rev. E, 48, 3656 (1993). The Collapse Transition of Self-Avoiding Walks on a Square Lattice in the Bulk and Near a Linear Wall: The Universality Class of the Θ and Θ′ points.
34. J. Bascle, T. Garel, H. Orland, and B. Velikson, Biopolymers, 33, 1843 (1993). Biasing a Monte Carlo Chain Growth Method with Ramachandran’s Plot: Application to Twenty-L-Alanine.
35. P. Grassberger and R. Hegger, J. Phys. A, 27, 4069 (1994). Chain Polymers Near an Adsorbing Surface.
36. H. Meirovitch, Chem. Phys. Lett., 45, 389 (1977). Calculation of Entropy with Computer Simulation Methods.
37. H. Meirovitch, S. C. Koerber, J. Rivier, and A. T. Hagler, Biopolymers, 34, 815 (1994). Computer Simulation of the Free Energy of Peptides with the Local States Method: Analogues of Gonadotropin Releasing Hormone in the Random Coil and Stable States.
38. H. Meirovitch, Phys. Rev. A, 32, 3709 (1985). Computer Simulation of the Free Energy of Polymer Chains with Excluded Volume and with Finite Interactions.
39. Z. Alexandrowicz, J. Chem. Phys., 55, 2765 (1971). Stochastic Models for the Statistical Description of Lattice Systems.
40. Z. Alexandrowicz and M. Mostow, J. Chem. Phys., 56, 1274 (1972). Stochastic Model for a Fluid of Hard Cubes with Attractive Potential.
41. H. Meirovitch and Z. Alexandrowicz, J. Stat. Phys., 15, 121 (1977). The Stochastic Models Method Applied to the Critical Behavior of Ising Lattices.
42. H. Meirovitch, J. Phys. A, 16, 839 (1983). Methods for Estimating the Entropy with Computer Simulation. The Simple Cubic Ising Lattice.
43. H. Meirovitch, Phys. Rev. B, 30, 2866 (1984). Computer Simulation Study of Hysteresis and Free Energy in the fcc Ising Antiferromagnet.
44. B. A. Berg and T. Neuhaus, Phys. Lett., B267, 249 (1991). Multicanonical Algorithms for First Order Phase Transition.
45. B. A. Berg, Int. J. Mod. Phys. C, 3, 1083 (1992). The Multicanonical Ensemble: A New Approach for Computer Simulations.
46. J. Lee, Phys. Rev. Lett., 71, 211 (1993). New Monte Carlo Algorithm: Entropic Sampling.
47. U. H. E. Hansmann and Y. Okamoto, J. Comput. Chem., 14, 1333 (1993). Prediction of Peptide Conformation by a Multicanonical Algorithm: New Approach to the Multiple Minima Problem.
48. M.-H. Hao and H. A. Scheraga, J. Phys. Chem., 98, 4940 (1994). Monte Carlo Simulation of a First-Order Transition for Protein Folding.
49. A. Kolinski, W. Galazka, and J. Skolnick, Proteins, 26, 271 (1996). On the Origin of the Cooperativity of Protein Folding: Implications from Model Simulations.
50. K. Huang, Statistical Mechanics, Wiley, New York, 1963.
51. M. Mezei and D. L. Beveridge, Ann. N.Y. Acad. Sci., 482, 1 (1986). Free Energy Simulations.
52. C. L. Brooks III, M. Karplus, and B. M. Pettitt, Adv. Chem. Phys., 71 (1988). Proteins: A Theoretical Perspective of Dynamics, Structure, and Thermodynamics.
53. (a) W. F. van Gunsteren and P. K. Weiner, Eds., Computer Simulation of Biomolecular Systems, ESCOM, Leiden, 1989, Vol. 1. (b) W. F. van Gunsteren, P. K. Weiner, and A. J. Wilkinson, Eds., Computer Simulation of Biomolecular Systems: Theoretical and Experimental Applications, ESCOM, Leiden, 1993, Vol. 2. (c) A. E. Mark and W. F. van Gunsteren, in New Perspectives in Drug Design, Proceedings of the Ninth International Roundtable, April 11-13, 1994, Turnbury, Scotland, P. M. Dean, G. Jolles, and C. G. Newton, Eds., Academic Press, San Diego, CA, 1995, pp. 185-200. Free Energy Calculations in Drug Design: A Practical Guide.
54. W. L. Jorgensen, Acc. Chem. Res., 22, 184 (1989). Free Energy Calculations: A Breakthrough for Modeling Organic Chemistry in Solution.
55. J. Hermans and A. G. Anderson, in Theoretical Biochemistry and Molecular Biophysics, D. L. Beveridge and R. Lavery, Eds., Adenine Press, Guilderland, NY, 1990, pp. 45-51. Microfolding: Use of Simulations to Study Peptide/Protein Conformational Equilibria.
56. T. P. Straatsma and J. A. McCammon, Annu. Rev. Phys. Chem., 43, 407 (1992). Computational Alchemy.
57. M. Mezei, Mol. Simulation, 10, 225 (1993). Calculation of Solvation Free-Energy Differences for Large Solute Change from Computer Simulations with Quadrature-Based Nearly Linear Thermodynamic Integration.
58. W. L. Jorgensen and J. Tirado-Rives, Perspect. Drug Discovery Design, 3, 123 (1995). Free Energies of Hydration for Organic Molecules from Monte Carlo Simulations.
59. (a) A. Warshel and Z. T. Chu, in Structure and Reactivity in Aqueous Solution. Characterization of Chemical and Biological Systems, C. J. Cramer and D. G. Truhlar, Eds., ACS Symposium Series No. 568, American Chemical Society, Washington, DC, 1994, pp. 71-94. Calculations of Solvation Free Energies in Chemistry and Biology. (b) A. Warshel, Computer Modeling of Chemical Reactions in Enzymes and Solutions, Wiley, New York, 1991.
60. P. A. Kollman, Acc. Chem. Res., 29, 461 (1996). Advances and Continuing Challenges in Achieving Realistic and Predictive Simulations of the Properties of Organic and Biological Molecules.
61. (a) T. P. Lybrand, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1990, Vol. 1, pp. 295-320. Computer Simulation of Biomolecular Systems Using Molecular Dynamics and Free Energy Perturbation Methods. (b) T. P. Straatsma, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1996, Vol. 9, pp. 81-127. Free Energy by Molecular Simulation.
62. T. L. Hill, Statistical Mechanics: Principles and Selected Applications, Dover, New York, 1956.
63. (a) R. Q. Topper, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1997, Vol. 10, pp. 101-176. Visualizing Molecular Phase Space: Nonstatistical Effects in Reaction Dynamics. (b) R. Larter and K. Showalter, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1997, Vol. 10, pp. 177-270. Computational Studies in Nonlinear Dynamics.
64. R. Kubo, H. Ichimura, T. Usui, and N. Hashitsume, Statistical Mechanics, North Holland, Amsterdam, 1974.
65. Z. W. Salsburg, J. D. Jacobson, W. Fickett, and W. W. Wood, J. Chem. Phys., 30, 65 (1959). Application of the Monte Carlo Method to the Lattice Gas Model. I. Two-Dimensional Triangular Lattice.
66. J. W. Gibbs, Elementary Principles in Statistical Mechanics, Yale University Press, New Haven, CT, 1902, Chapter XI, pp. 129-138.
67. H. Meirovitch and Z. Alexandrowicz, J. Stat. Phys., 15, 123 (1976). On the Zero Fluctuation of the Microscopic Free Energy and Its Potential Use.
68. C. H. Bennett, J. Comput. Phys., 22, 245 (1976). Efficient Estimation of Free Energy Differences.
69. B. Widom, J. Chem. Phys., 39, 2808 (1963). Some Topics in the Theory of Fluids.
70. J. L. Jackson and L. S. Klein, The Physics of Fluids, 7, 228 (1964). Potential Distribution Method in Equilibrium Statistical Mechanics.
71. K. S. Shing and K. E. Gubbins, Mol. Phys., 43, 717 (1981). The Chemical Potential from Computer Simulation. Test Particle Method with Umbrella Sampling.
72. K. S. Shing and K. E. Gubbins, Mol. Phys., 46, 1109 (1982). The Chemical Potential in Dense Fluids and Fluid Mixtures via Computer Simulation.
73. F. A. Escobedo and J. J. de Pablo, Mol. Phys., 89, 1733 (1996). Chemical Potential and Dimensions of Chain Molecules in Athermal Environments.
74. N. G. Parsonage, Mol. Phys., 89, 1133 (1996). Computation of the Chemical Potential in High Density Fluids by a Monte Carlo Method.
75. P. Bereolos, J. Talbot, and K.-C. Chao, Mol. Phys., 89, 1621 (1996). Simulation of Free Energy Without Particle Insertion in the NPT Ensemble.
76. H. Cramér, The Elements of Probability Theory and Some of Its Applications, Robert E. Krieger, Huntington, NY, 1973.
77. M. Kahn and A. W. Marshall, Oper. Res., 1, 263 (1953). Methods of Reducing Sample Size in Monte Carlo Computations.
78. M. H. Kalos and P. A. Whitlock, Monte Carlo Methods, Vol. I: Basics, Wiley, New York, 1986.
79. J. M. Hammersley and D. C. Handscombe, Monte Carlo Methods, Wiley, New York, 1965.
80. A. Ben-Naim, Statistical Thermodynamics for Chemists and Biochemists, Plenum Press, New York, 1992.
81. F. A. Momany, R. F. McGuire, A. W. Burgess, and H. A. Scheraga, J. Phys. Chem., 79, 2361 (1975). Energy Parameters in Polypeptides. VII. Geometric Parameters, Partial Atomic Charges, Nonbonded Interactions, Hydrogen Bond Interactions, and Intrinsic Torsional Potentials for the Naturally Occurring Amino Acids.
82. M. J. Sippl, G. Némethy, and H. A. Scheraga, J. Phys. Chem., 88, 6231 (1984). Intermolecular Potentials from the Crystal Data. 6. Determination of Empirical Potentials for O-H···O=C Hydrogen Bonds from Packing Configurations.
83. B. R. Brooks, R. E. Bruccoleri, B. D. Olafson, D. J. States, S. Swaminathan, and M. Karplus, J. Comput. Chem., 4, 187 (1983). CHARMM: A Program for Macromolecular Energy, Minimization and Dynamics Calculations.
84. J. A. McCammon and S. C. Harvey, Dynamics of Proteins and Nucleic Acids, Cambridge University Press, Cambridge, UK, 1987.
85. J. Brady and M. Karplus, J. Am. Chem. Soc., 107, 6103 (1985). Configuration Entropy of the Alanine Dipeptide in Vacuum and in Solution: A Molecular Dynamics Study.
86. E. Meirovitch and H. Meirovitch, Biopolymers, 38, 69 (1996). New Theoretical Methodology for Elucidating the Solution Structure of Peptides from NMR Data. II. Free Energy of Dominant Microstates of Leu-enkephalin and Population-Weighted Average Nuclear Overhauser Effects Intensities.
87. H. Meirovitch and E. Meirovitch, J. Phys. Chem., 100, 5123 (1996). New Theoretical Methodology for Elucidating the Solution Structure of Peptides from NMR Data. III. Solvation Effects.
88. F. H. Stillinger and T. A. Weber, Science, 225, 983 (1984). Packing Structures and Transitions in Liquids and Solids.
89. R. Elber and M. Karplus, Science, 235, 318 (1987). Multiple Conformational States of Proteins: A Molecular Dynamics Analysis of Myoglobin.
90. K. D. Gibson and H. A. Scheraga, J. Physiol. Chem. Phys., 1, 109 (1969). Minimization of Polypeptide Energy. V. Theoretical Aspects.
91. N. Gō and H. A. Scheraga, Macromolecules, 9, 535 (1976). On the Use of Classical Statistical Mechanics in the Treatment of Polymer Chain Conformation.
92. D. A. Case, Curr. Opin. Struct. Biol., 4, 285 (1994). Normal Mode Analysis of Protein Dynamics.
93. M. Gō, N. Gō, and H. A. Scheraga, J. Chem. Phys., 52, 2060 (1970). Molecular Theory of the Helix-Coil Transition of Polyamino Acids. II. Numerical Evaluations of s and σ for Polyglycine and Poly-L-alanine in the Absence (for s and σ) and Presence (for σ) of Solvent.
94. F. T. Hesselink, T. Ooi, and H. A. Scheraga, Macromolecules, 6, 541 (1973). Conformational Energy Calculations. Thermodynamic Parameters of the Helix-Coil Transition for Poly(L-lysine) in Aqueous Salt Solution.
95. M. Gō, F. T. Hesselink, N. Gō, and H. A. Scheraga, Macromolecules, 7, 459 (1974). Molecular Theory of the Helix-Coil Transition of Polyamino Acids. IV. Evaluation and Analysis of s for Poly(L-valine) in the Absence and Presence of Water.
96. M. Gō and H. A. Scheraga, Biopolymers, 23, 1961 (1984). Molecular Theory of the Helix-Coil Transition of Polyamino Acids. V. Explanation of the Different Conformational Behavior of Valine, Isoleucine, and Leucine in Aqueous Solution.
97. M. Vásquez, E. Meirovitch, and H. Meirovitch, J. Phys. Chem., 98, 9380 (1994). A Free Energy Based Monte Carlo Minimization Procedure for Macromolecules.
98. H. Meirovitch, E. Meirovitch, and J. Lee, J. Phys. Chem., 99, 4847 (1995). New Theoretical Methodology for Elucidating the Solution Structure of Peptides from NMR Data. I. The Relative Contribution of Low-Energy Microstates to the Partition Function.
99. H. Meirovitch and E. Meirovitch, J. Comput. Chem., 18, 240 (1997). Efficiency of Monte Carlo Minimization Procedures and Their Use in the Analysis of NMR Data Obtained from Flexible Peptides.
100. J. Rizo, S. C. Koerber, R. J. Bienstock, J. Rivier, A. T. Hagler, and L. M. Gierasch, J. Am. Chem. Soc., 114, 2860 (1992). Conformational Analysis of the Highly Potent Constrained Gonadotropin-Releasing Hormone Antagonist. 2. Molecular Dynamics Simulations.
101. H. Meirovitch, D. H. Kitson, and A. T. Hagler, J. Am. Chem. Soc., 114, 5386 (1992). Computer Simulation of the Entropy of Polypeptides using the Local States Method: Application to cyclo-(Ala-Pro-D-Phe)2 in Vacuum and the Crystal.
102. P. Dauber-Osguthorpe, V. A. Roberts, D. J. Osguthorpe, J. Wolff, M. Genest, and A. T. Hagler, Proteins, 4, 31 (1988). Structure and Energetics of Ligand Binding to Proteins: Escherichia coli Dihydrofolate Reductase-Trimethoprim, A Drug-Receptor System.
103. K. K. Irikura, B. Tidor, B. R. Brooks, and M. Karplus, Science, 229, 571 (1985). Transition from B to Z DNA: Contribution of Internal Fluctuations to the Configurational Entropy Difference.
104. B. Tidor and M. Karplus, Proteins, 15, 71 (1993). The Contribution of Cross-Links to Protein Stability: A Normal Mode Analysis of the Configurational Entropy of the Native State.
105. B. Tidor and M. Karplus, J. Mol. Biol., 238, 405 (1994). The Contribution of Vibrational Entropy to Molecular Association. The Dimerization of Insulin.
106. J. Wang, Z. Szewczuk, S.-Y. Yue, Y. Tsuda, Y. Konishi, and E. O. Purisima, J. Mol. Biol., 253, 473 (1995). Calculation of Relative Binding Free Energies, and Configurational Entropies: A Structural and Thermodynamic Analysis of the Nature of Non-Polar Binding of Thrombin Inhibitors Based on Hirudin.
107. M. Fixman, Proc. Natl. Acad. Sci. U.S.A., 71, 3050 (1974). Classical Statistical Mechanics of Constraints: A Theorem and Application to Polymers.
108. R. M. Levy, M. Karplus, J. Kushick, and D. Perahia, Macromolecules, 17, 1370 (1984). Evaluation of Configurational Entropy for Proteins: Application to Molecular Dynamics Simulations of α-Helix.
109. O. L. Rojas, R. M. Levy, and A. Szabo, J. Chem. Phys., 85, 1037 (1986). Corrections to the Quasiharmonic Approximation for Evaluating Molecular Entropies.
110. R. M. Levy, O. L. Rojas, and R. A. Friesner, J. Phys. Chem., 88, 4233 (1984). Quasi-Harmonic Method for Calculating Vibrational Spectra from Classical Simulations on Multidimensional Anharmonic Potential Surfaces.
111. A. DiNola, H. J. C. Berendsen, and O. Edholm, Macromolecules, 17, 2044 (1984). Free Energy Determination of Polypeptide Conformations Generated by Molecular Dynamics.
112. R. A. Friesner and R. M. Levy, J. Chem. Phys., 80, 4488 (1984). An Optimized Harmonic Reference System for the Evaluation of Discretized Path Integrals.
113. O. Edholm and H. J. C. Berendsen, Mol. Phys., 51, 1011 (1984). Entropy Estimation from Simulations of Non-Diffusive Systems.
114. K. Binder, Z. Phys. B, 45, 61 (1981). Monte Carlo Study of Entropy for Face-Centered-Cubic Ising Antiferromagnet.
115. M. L. Mansfield, Macromolecules, 27, 4699 (1994). Concentrated, Semiflexible Lattice Chain Systems and Criticism of the Scanning Technique.
116. H. Meirovitch, Macromolecules, 29, 475 (1996). A Response to Mansfield’s Paper “Concentrated, Semiflexible Lattice Chain Systems and Criticism of the Scanning Method.”
117. I. R. McDonald and K. Singer, Discuss. Faraday Soc., 43, 40 (1967). Calculation of Thermodynamic Properties of Liquid Argon from Lennard-Jones Parameters by Monte Carlo Method.
118. Z. Li and H. A. Scheraga, J. Phys. Chem., 92, 2633 (1988). Monte Carlo Recursion Evaluation of Free Energy.
119. Z. Li and H. A. Scheraga, Chem. Phys. Lett., 154, 516 (1989). Calculation of the Free Energy of Liquid Water by the Monte Carlo Recursion Method.
120. J.-P. Hansen, Phys. Rev. A, 2, 221 (1970). Phase Transition in the Lennard-Jones System. II. High Temperature Limit.
121. D. Levesque and L. Verlet, Phys. Rev., 182, 307 (1969). Perturbation Theory and Equation of State for Fluids.
122. W. G. Hoover, M. Ross, K. W. Johnson, D. Henderson, J. A. Barker, and B. C. Brown, J. Chem. Phys., 52, 4931 (1970). Soft-Sphere Equations of State.
123. G. N. Patey and J. P. Valleau, Chem. Phys. Lett., 21, 297 (1973). The Free Energy of Spheres with Dipoles: Monte Carlo with Multistage Sampling.
124. B. Smit, K. Esselink, and D. Frenkel, Mol. Phys., 87, 159 (1996). Solid-Solid and Liquid-Solid Phase Equilibria for the Restricted Primitive Model.
125. R. K. Bowles and R. J. Speedy, Mol. Phys., 87, 1349 (1996). The Vapour Pressure of Glassy Crystals of Dimers.
126. C. Kriebel, A. Müller, J. Winkelmann, and J. Fischer, Mol. Phys., 87, 151 (1996). Excess Properties of Dipolar and Non-Polar Fluid Mixtures from NpT Molecular Dynamics Simulations.
127. H. L. Scott and C. Y. Lee, J. Chem. Phys., 73, 4591 (1980). The Surface Tension of Water: A Monte Carlo Calculation Using an Umbrella Sampling Algorithm.
128. M. Mezei, Mol. Phys., 47, 1307 (1982). Excess Free Energy of Different Water Models Computed by Monte Carlo Methods.
129. G. Jacucci and N. Quirke, Mol. Phys., 40, 1005 (1980). Monte Carlo Calculation of the Free Energy Difference Between Hard and Soft Core Diatomic Liquids.
130. N. Quirke and G. Jacucci, Mol. Phys., 45, 823 (1982). Energy Difference Functions in Monte Carlo Simulations. Application to (1) The Calculation of the Free Energy of Liquid Nitrogen, (2) The Fluctuation of Monte Carlo Averages.
131. C. Pangali, M. Rao, and B. J. Berne, J. Chem. Phys., 71, 2975 (1979). A Monte Carlo Simulation of the Hydrophobic Interaction.
132. S. H. Northrup, M. R. Pear, C.-Y. Lee, J. A. McCammon, and M. Karplus, Proc. Natl. Acad. Sci. U.S.A., 79, 4035 (1982). Dynamical Theory of Activated Processes in Globular Proteins.
133. A. F. Voter, J. Chem. Phys., 82, 1890 (1985). A Monte Carlo Method for Determining Free-Energy Differences and Transition State Theory Rate Constants.
134. D. J. Tobias and C. L. Brooks III, Chem. Phys. Lett., 142, 472 (1987). Calculation of Free Energy Surfaces Using the Methods of Thermodynamic Perturbation Theory.
135. M. Mezei, J. Comput. Phys., 68, 237 (1987). Adaptive Umbrella Sampling: Self-Consistent Determination of the Non-Boltzmann Bias.
136. (a) R. W. W. Hooft, B. P. van Eijck, and J. Kroon, J. Chem. Phys., 97, 6690 (1992). An Adaptive Umbrella Sampling Procedure in Conformational Analysis Using Molecular Dynamics and Its Application to Glycol. (b) R. W. W. Hooft, B. P. van Eijck, and J. Kroon, J. Chem. Phys., 97, 3639 (1992). Use of Molecular Dynamics Methods in Conformational Analysis. Glycol. A Model Study.
137. T. C. Beutler and W. F. van Gunsteren, J. Chem. Phys., 100, 1492 (1994). The Computation of a Potential of Mean Force: Choice of the Biasing Potential in the Umbrella Sampling Technique.
138. T. C. Beutler and W. F. van Gunsteren, Chem. Phys. Lett., 237, 308 (1995). Umbrella Sampling Along Linear Combinations of Generalized Coordinates. Theory and Application to a Glycine Dipeptide.
139. T. C. Beutler, T. Bremi, R. R. Ernst, and W. F. van Gunsteren, J. Phys. Chem., 100, 2637 (1996). Motion and Conformation of Side Chains in Peptides. A Comparison of 2D Umbrella-Sampling Molecular Dynamics and NMR Results.
140. M. Mezei, P. K. Mehrotra, and D. L. Beveridge, J. Am. Chem. Soc., 107, 2239 (1985). Monte Carlo Determination of the Free Energy and Internal Energy of Hydration for the Ala Dipeptide at 25 °C.
141. (a) S. Kumar, D. Bouzida, R. H. Swendsen, P. A. Kollman, and J. M. Rosenberg, J. Comput. Chem., 13, 1011 (1992). The Weighted Histogram Analysis Method for Free Energy Calculations on Biomolecules. I. The Method. (b) E. M. Boczko and C. L. Brooks III, J. Phys. Chem., 97, 4509 (1993). Constant-Temperature Free Energy Surfaces for Physical and Chemical Processes. (c) S. Kumar, P. W. Payne, and M. Vásquez, J. Comput. Chem., 17, 1269 (1996). Method for Free Energy Calculations Using Iterative Techniques. (d) E. M. Boczko and C. L. Brooks III, Science, 269, 393 (1995). First-Principles Calculation of the Folding Free Energy of a Three-Helix Bundle Protein.
142. M. Berkowitz, O. A. Karim, J. A. McCammon, and P. J. Rossky, Chem. Phys. Lett., 105, 577 (1984). Sodium Chloride Ion Pair Interaction in Water: Computer Simulation.
143. A. C. Belch, M. Berkowitz, and J. A. McCammon, J. Am. Chem. Soc., 108, 1755 (1986). Solvation Structure of Sodium Chloride Ion Pair in Water.
144. E. Guàrdia, R. Rey, and J. A. Padro, Chem. Phys., 155, 187 (1991). Potential of Mean Force by Constrained Molecular Dynamics: A Sodium Chloride Ion-Pair in Water.
145. D. E. Smith and L. X. Dang, J. Chem. Phys., 100, 3757 (1994). Computer Simulations of NaCl Association in Polarizable Water.
146. J. Gao, J. Phys. Chem., 98, 6049 (1994). Simulation of the Na+Cl- Ion Pair in Supercritical Water.
147. R. A. Friedman and M. Mezei, J. Chem. Phys., 102, 419 (1995). The Potentials of Mean Force of Sodium Chloride and Sodium Dimethylphosphate in Water: An Application of Adaptive Umbrella Sampling.
148. L. X. Dang, J. E. Rice, and P. A. Kollman, J. Chem. Phys., 93, 7528 (1990). The Effect of Water Models on the Interaction of the Sodium Chloride Ion Pair in Water: Molecular Dynamics Simulations.
149. J. V. Eerden, W. J. Briels, S. Harkema, and D. Feil, Chem. Phys. Lett., 164, 370 (1989). Potential of Mean Force by Thermodynamic Integration: Molecular Dynamics Simulation of Decomplexation.
150. H. Resat, M. Mezei, and J. A. McCammon, J. Phys. Chem., 100, 1426 (1996). Use of the Grand Canonical Ensemble in Potential of Mean Force Calculations.
151. J. Chandrasekhar, S. F. Smith, and W. L. Jorgensen, J. Am. Chem. Soc., 106, 3049 (1984). SN2 Reaction Profiles in the Gas Phase and Aqueous Solution.
152. J. Chandrasekhar, S. F. Smith, and W. L. Jorgensen, J. Am. Chem. Soc., 107, 154 (1985). Theoretical Examination of the SN2 Reaction Involving Chloride Ion and Methyl Chloride in the Gas Phase and Aqueous Solution.
153. S. E. Huston and P. J. Rossky, J. Phys. Chem., 93, 7888 (1989). Free Energies of Association for the Sodium-Dimethyl Phosphate Ion Pair in Aqueous Solution.
154. S.-w. Chen and P. J. Rossky, J. Phys. Chem., 97, 6078 (1993). Potential of Mean Force for a Sodium Dimethyl Phosphate Ion Pair in Aqueous Solution: A Further Test of the Extended RISM Theory.
155. D. J. Tobias and C. L. Brooks III, J. Chem. Phys., 89, 5115 (1988). Molecular Dynamics with Internal Coordinate Constraints.
156. H. Resat, P. V. Maye, and M. Mezei, Biopolymers, 41, 73 (1997). The Sensitivity of Conformational Free Energies of the Alanine Dipeptide to Atomic Site Charges.
157. T. J. Marrone, M. K. Gilson, and J. A. McCammon, J. Phys. Chem., 100, 1439 (1996). Comparison of Continuum and Explicit Models of Solvation: Potentials of Mean Force for Alanine Dipeptide.
158. F. Fraternali and W. F. van Gunsteren, Biopolymers, 34, 347 (1994). Conformational Transitions of a Dipeptide in Water: Effects of Imposed Pathways Using Umbrella Sampling Techniques.
159. D. J. Tobias and C. L. Brooks III, J. Chem. Phys., 92, 2582 (1990). The Thermodynamics of Solvophobic Effects: A Molecular-Dynamics Study of n-Butane in Carbon Tetrachloride and Water.
160. C. D. Bell and S. C. Harvey, J. Phys. Chem., 90, 6595 (1986). Comparison of Free Energy Surfaces for Extended-Atom and All-Atom Models of n-Butane.
161. W. L. Jorgensen and J. K. Buckner, J. Phys. Chem., 91, 6083 (1987). Use of Statistical Perturbation Theory for Computing Solvent Effects in Molecular Conformation. Butane in Water.
162. D. J. Tobias and C. L. Brooks III, Biochemistry, 30, 6059 (1991). Thermodynamics and Mechanism of α Helix Initiation in Alanine and Valine Peptides.
163. D. J. Tobias, S. F. Sneddon, and C. L. Brooks III, J. Mol. Biol., 216, 783 (1990). Reverse Turns in Blocked Dipeptides Are Intrinsically Unstable in Water.
164. B. Roux and M. Karplus, Biophys. J., 59, 961 (1991). Ion Transport in a Model Gramicidin Channel Structure and Thermodynamics.
165. L. X. Dang and P. A. Kollman, J. Am. Chem. Soc., 112, 503 (1990). Molecular Dynamics Simulation Study of the Free Energy of Association of 9-Methyladenine and 1-Methylthymine Bases in Water.
166. A. Warshel, J. Phys. Chem., 86, 2218 (1982). Dynamics of Reactions in Polar Solvents. Semiclassical Trajectory Studies of Electron-Transfer and Proton-Transfer Reactions.
References
69
167. A. Warshel, Working Group on Specificity in Biological Interactions, C. Chagas and B. Pullman, Eds., Pontificiue Academiue Scientiurum Scriptu Variu, 55, 59 (1983). Simulating the Energetics and Dynamics of Enzymatic Reactions. 168. (a) J. Aqvist and A. Warshel, Biophys. I. 56, , 171 (1989). Energetics of Ion Permeation Through Membrane Channels. Solvation of Na+ by Gramicidin A. (b) A. Warshel, F. Sussman, and J.-K. Hwang, J. Mol. Biol., 201, 139 (1988). Evaluation of Catalytic Free Energies in Genetically Modified Proteins. 169. A. Wallqvist and D. G. Covell, J. Phys. Chem., 99, 13118 (1995). Free-Energy Cost of Bending n-Dodecane in Aqueous Solution. Influence of the Hydrophobic Effect and Solvent Exposed Area. 170. L. Wang, T. O’Connell, A. Tropsha, and J. Hermans, J. Am. Chem. SOC.,262,283 (1996). Molecular Simulations of p-Sheet Twisting. 171. J. A. Barker and D. Henderson, J. Chem. Phys., 47,2856 (1967). Perturbation Theory and Equation of State for Fluids: The Square-Well Potential. 172. J. A. Barker and D. Henderson, J. Chem. Phys., 47,4714 (1967). Perturbation Theory and Equation of State for Fluids. 11. A Successful Theory of Liquids. 173. M. R. Mruzik, Chem. Phys. Lett., 48, 171 (1977). A Monte Carlo Study of Water Clusters. 174. M. Mezei, S. Swaminathan, and D. L. Beveridge, 1. A m Chem. Soc., 100, 3255 (1978). Ab Initio Calculation of the Free Energy of Water. 175. M. R. Mruzik, F. E. Abraham, D. E. Schreiber, and G. M. Pound, J. Chem. Phys., 64, 481 (1976). A Monte Carlo Study of Ion-Water Clusters. 176. J. C. Owicki and H. A. Scheraga, J. Phys. Chem., 82, 1257 (1978).Monte Carlo Free Energy Calculations on Dilute Solutions in the Isothermal-Isobaric Ensemble. 177. J. P. M. Postma, H. J. C. Berendsen, and J. R. Haak, Faruday Symp. Chem. Soc., 17, 55 (1982). Thermodynamics of Cavity Formation in Water, a Molecular Dynamics Study. 178. A. Warshel, Biochemistry, 20, 3167 (1981). 
Calculations of Enzymatic Reactions: Calculations of pK,, Proton Transfer Reactions, and General Acid Catalysis Reactions in Enzymes. 179. B. L. Tembe and J. A. McCammon, Comput. Chem., 8,281 (1984). Ligand-Receptor Interactions. 180. W. L. Jorgensen and C. Ravimohan,J. Chem. Phys., 83, 3050 (1985). Monte Carlo Simulation of Differences in Free Energies of Hydration. 181. T. P. Lybrand, I. Gosh, and J. A. McCammon, J . Am. Chem. Soc., 107, 7793 (1985). Hydration of Chloride and Bromide Anions. Determination of Relative Free Energy by Computer Simulation.
182. T. P. Lybrand, J. A. McCammon, and G. Wipff, Proc. Nutl. Acad. Sci. U.S.A., 83,833 (1986). Theoretical Calculation of Relative Binding Affinity in Host-Guest Systems. 183. C. L. Brooks III,]. Phys. Chem., 90, 6680 (1986). Thermodynamics of Ionic Solvation: Monte Carlo Simulations of Aqueous Chloride and Bromide Ions. 184. P. A. Bash, U. C. Singh, R. Langridge, and P. A. Kollman, Science, 236,564 (1987). Free Energy Calculations by Computer Simulation. 185. U. C. Singh, F. K. Brown, P. A. Bash, and P. A. Kollman,J. Am. Chem. Soc., 109,1607 (1987). An Approach to the Application of Free Energy Perturbation Methods Using Molecular Dynamics: Applications to the Transformations of CH,OH + CH,CH,, H,O+ NH;, Glycine --* Alanine, and Alanine + Phenylalanine in Aqueous Solution and to H,O’(H,O), NH;(H,O), in the Gas Phase. 186. C. F. Wong and J. A. McCammon, J . Am. Chem. Soc., 108,3830 (1986). Dynamics and Design of Enzymes and Inhibitors. 187. J. Hermans and S. Shankar, Isr. J. Chem., 27,225 (1986). The Free Energy of Xenon Binding to Myoglobin from Molecular Dynamics Simulation. 188. A. J. Cross, Chem. Phys. Lett., 128, 198 (1986). Influence of Hamiltonian Parametrization on Convergence of Kirkwood Free Energy Calculations. -+
+
70
Calculation of Free Energy and Entropy by Computer Simulation
189. H. Resat and M. Mezei,]. Chem. Phys., 99, 6052 (1993). Studies on Free Energy Calculations. I. Thermodynamic Integration Using Polynomial Path. 190. T. C. Beutler, A. E. Mark, R. C. van Schaik, P. R. Gerber, and W. F. van Gunsteren, Chem. Phys. Lett., 222, 529 (1994). Avoiding Singularities and Numerical Instabilities in Free Energy Calculations Based on Molecular Simulations. 191. D. A. Pearlman and P. A. Kollman, J. Chem. Phys., 90, 2461 (1989). A New Method for Carrying Out Energy Perturbation Calculations: Dynamically Modified Windows. 192. P.Cieplak and P. A. Kollman,]. Am. Chem. SOC., 100, 3734 (1988). Calculation of the Free Energy of Association of Nucleic Acid Bases in Vacuo and Water Solution. 193. W. Jorgensen, J. K. Buckner, S . Boudon, and J. Tirado-Rives,]. Chem. Phys., 89,3742 (1988). Efficient Computation of Absolute Free Energy of Binding by Computer Simulations. Application to the Methane Dimer in Water. 194. L. X. Dang, K. M. Merz Jr., and P. A. Kollman, /. Am Chem. SOC., 111, 8505 (1989). Free Energy Calculations on Protein Stability: Thr-157 --t Val-157 Mutation of T4 Lysozyme. 195. B. Tidor and M. Karplus, Biochemistry, 30, 3217 (1991). Simulation Analysis of the Stability Mutant R96H of T4 Lysozyme. 196. S . Yun-yu, A. E. Mark, W. Cun-xin, H. Fuhua, H. J. C. Berendsen, and W. F. van Gunsteren, Protein Eng., 6, 289 (1993). Can the Stability of Protein Mutants Be Predicted by Free Energy Calculations? 197. N. Yamaotsu, I. Moriguchi, P. A. Kollman, and S . Hirono, Biochim. Biophys. Actu, 1163, 81 (1993). Molecular Dynamic Study of the Stability of Staphylococcal Nuclease Mutants: Component Analysis of the Free Energy Difference of Denaturation. 198. C. L. Brooks 111, /. Chem. Phys., 87,3029 (1987). Thermodynamics of Aqueous Solvation: Solution Properties of Alcohols and Alkanes. 199. A. E. Mark and W. F. van Gunsteren, ]. Mol. Biol., 240, 167 (1994). Decomposition of the Free Energy of a System in Terms of Specific Interactions. 
Implication for Theoretical and Experimental Studies. 200. (a) A. D. MacKerell Jr., M. S. Sommer, and M. Karplus, 1.Mol. Biol., 247, 774 (1995). pH Dependence of Binding Reactions from Free Energy Simulations and Macroscopic Continuum Electrostatic Calculations: Application to 2'GMP/3'GMP Binding to Ribonuclease T, and Implications for Catalysis. (b)F. S . Lee and A. Warshel,]. Chem. Phys., 97,3100 (1992). A Local Reaction Field Method for Fast Evaluation of Long-Range Electrostatic Interactions in Molecular Simulations. 201. R. H. Yun, A. G. Anderson, and J. Hermans, Proteins, 10, 219 (1991). Proline in a-Helix: Stability and Conformation Studied by Dynamics Simulation. 202. R. H. Yun and J. Hermans, Protein Eng., 4,761 (1991). Conformational Equilibria of Valine Studied by Dynamics Simulation. 203. J. Hermans, A. G . Anderson, and R. H. Yun, Biochemistry, 31,5646 (1992).Differential Helix Propensity of Small Apolar Side Chains Studied by Molecular Dynamics Simulations. 204. V. Daggett, F. Brown, and P. A. Kollman,]. Am. Chem. SOC., 111,8247 (1989). Free Energy Component Analysis: A Study of Glutamic Acid 165 + Aspartic Acid 165 Mutation in Triosephosphate Isomerase. 205. J. Gao, K. Kuczwra, B. Tidor, and M. Karplus, Science, 244,1069 (1989). Hidden Thermodynamics of Mutant Proteins: A Molecular Dynamics Analysis. 206. (a) F. S . Lee, Z.-T. Chu, M. B. Bolger, and A. Warshel, Protein Eng., 5,215, (1992). Calculations of Antibody-Antigen Interactions: Microscopic and Semi-Microscopic Evaluation of the Free Energies of Binding of Phosphorylcholine Analogs to McPC603. (b) T. Simonson and A. Briinger, Biochemistry, 31,8661 (1992). Thermodynamics of Protein-Peptide Interactions in the Ribonuclease-S System Studied by Molecular Dynamics and Free Energy Calculations. 207. P.E. Smith and W. F. van Gunsteren,/. Phys. Chem., 98,13735 (1994). When Are Free Energy Components Meaningful?
References
71
208. S. Boresch, G. Archontis, and M. Karplus, Proteins, 20,25 (1994).Free Energy Simulations: The Meaning of the Individual Contributions from Component Analysis. 209. S. Boresch and M. Karplus, J. Mol. Biol. 254, 801 (1995). The Meaning of Component Analysis: Decomposition of the Free Energy in Terms of Specific Interactions. 210. M. J. Mitchell and J. A. McCammon,]. Comput. Chem., 12,271 (1991).Free Energy Difference Calculations by Thermodynamic Integration: Difficulties in Obtaining a Precise Value. 21 1. A. E. Mark, W. F. van Gunsteren, and H. J. C. Berendsen, J. Chem. Phys., 94,3808 (1991). Calculation of Relative Free Energy via Indirect Pathways. 212. A. E. Mark, S. P.van Helden, P.E. Smith, L. H. M. Janssen, and W. F. van Gunsteren, J. Am. Chem. SOC., 116,6293 (1994).Convergence Properties of Free Energy Calculations: a-Cyclodextrin Complexes as a Case Study. 213. J. Hermans, R. H. Yun, and A. G. Anderson, J. Comput. Chem., 13,429 (1992). Precision of Free Energies Calculated by Molecular Dynamics Simulations of Peptides in Solution. 214. A. Tropsha and J. Hermans, Protein Eng., 5,29 (1992).Application of Free Energy Simulations to the Binding of a Transition-State-Analog Inhibitor to HIV Protease. 215. J. Hermans,]. Phys. Chem., 98, 13735 (1994).Simple Analysis of Noise and Hysteresis in (Slow Growth) Free Energy Simulations. 216. R. H. Wood,]. Phys. Chem., 95,4838 (1991).Elimination of Errors in Free Energy Calculations Due to the Lag Between the Hamiltonian and the System Configuration. 217. S. R. Durell and A. Wallqvist, Biophys. J,, 254, 1695 (1996).Atomic Scale Analysis of the Solvation Thermodynamics of Hydrophobic Hydration. 218. H. Resat and M. Mezei,]. Chem. Phys., 101,6126 (1994).Studies on Free Energy Calculations. 11. A Theoretical Approach to Molecular Solvation. 219. L. Wang and J. Hermans,j. Chem. Phys., 100,9129 (1994).Change of Bond Length in Free Energy Simulations: Algorithmic Improvements, But When Is It Necessary? 220. D. L. 
Severance, J. W. Essex, and W. L. Jorgensen, J. Comput. Chem., 16,311 (1995).Generalized Alteration of Structure and Parameters: A New Method for Free-Energy Perturbations in Systems Containing Flexible Degrees of Freedom. 221. W. L. Jorgensen, J. F. Blake, D. Lim, and D. L. Severance,]. Chem. SOC., Faruduy Trans., 90, 1727 (1994).Investigation of Solvent Effects on Pericyclic Reactions by Computer Simulations. 222. D. K. Jones-Hertzog and W. L. Jorgensen,J. Am. Chem. SOC., 117,9077 (1995).Elucidation of Transition Structures and Solvent Effects for the Mislow-Evans Rearrangement of Allylic Sulfoxides. 223. H. A. Carlson and W. L. Jorgensen,J. Am. Chem. SOC., 118, 8475 (1996).Monte Carlo Investigations of Solvent Effects on the Chorismate to Prephenate Rearrangement. 224. W. L. Jorgensen, N. A. McDonald, M. Selmi, and P. R. Rablen, J. Am. Chem. SOC., 117, 11809 (1995).Importance of Polarization for Dipolar Solutes in Low-Dielectric Media: 1,2Dichloroethane and Water in Cyclohexane. 225. T. Z. M. Denti, T. C. Beutler, W. F. van Gunsteren, and F. Diederich, J. Phys. Chem., 100, 4256 (1996).Computation of Gibbs Free Energies of Hydration for Simple Aromatic Molecules: A Comparative Study Using Monte Carlo and Molecular Dynamics Computer Simulation Techniques. 226. X. Daura, P. H. Hiinenberger, A. E. Mark, E. Querol, F. X. AvilCs, and W. F. van Gunsteren, J. Am Chem. SOC., 188,6285 (1996).Free Energies of Transfer of Trp Analogs from Chloroform to Water: Comparison of Theory and Experiment and the Importance of Adequate Treatment of Electrostatic and Internal Interactions. 227. J. S. Perkyns, Y. Wang, and M. Pettitt,J. Am. Chem. SOC., 118,1164 (1996).Salting in Peptides: Conformationally Dependent Solubilities and Phase Behavior of a Tripeptide Zwitterion in Electrolyte Solution.
72
Calculation of Free Energy and Entropy by Computer Simulation
228. M. Mezei and G. Jancs6, Chem. Phys. Lett., 239,237 (1995).Free Energy Simulation Studies on the Hydration of Tetramethylurea and Tetramethylthiourea. 229. P. V. Maye and M. Mezei, J. Mol. Struct. (Theochem),362, 317 (1996).Calculation of the Free Energy of Solvation of Li’ and Na’ Ions in Water and Chloroform. 230. A. Wallqvist and D. G. Covell J . Phys. Chem., 99, 5705 (1995). Cooperativity of Water-Solute Interactions at Hydrophilic Surface. 231. H. Resat and J. A. McCarnmon,J. Chem. Phys., 104,7645 (1996).Free Energy Simulations: Correcting for Electrostatic Cutoffs by Use of the Poisson Equation. 232. T. Z. M. Denti, W. F. van Gunsteren, and F. Diederich,/. Am. Chem. Soc., 118,6044 (1996). Computer Simulations of the Solvent Dependence of Apolar Association Strength: Gibbs Free Energy Calculations on a Cyclophane-Pyrene Complex in Water and Chloroform. 233. H. A. Carlson and W. L. Jorgensen, Tetrahedron, 51, 449 (1995). Investigations into the Stereochemistry of Cyclophane-Steroid Complexes Via Monte Carlo Simulations. 234. L. Troxler and G. Wipff, J. Am. Chem. Soc., 116,1468 (1994).Conformation and Dynamics of 18-Crown-6, Cryptand 222, and Their Cation Complexes in Acetonitrile Studied by Molecular Dynamics Simulations. 235. E. M. Duffy and W. L. Jorgensen,/. Am. Chem. Soc., 116,6337 (1994).Structure and Binding for Complexes of Rebek’s Acridine Diacid with Pyrazine, Quinoxaline, and Pyridine from Monte Carlo Simulations with an All-Atom Force Field. 236. W. L. Jorgensen and T. B. Nguyen, Proc. Natl. Acad. Sci. U.S.A., 90, 1194 (1993).Modeling the Complexation of Substituted Benzenes by a Cyclophane Host in Water. 237. B. Prod’hom and M. Karplus, Protein Eng., 6, 585 (1993).The Nature of the Ion Binding Interactions in EF-Hand Peptide Analog: Free Energy Simulation of Asp to Asn Mutations. 238. A. Tropsha, Y. Yan, S. E. Schneider, L. Li, and B. W. Erickson, in Peptides: Chemistry, Strtrcture and Biology, R. S . Hodes and J. A. 
Smith, Eds., ESCOM, Leidon, 1994, pp. 883-885. Relative Free Energies of Folding and Refolding of Model Secondary Structure Elements in Aqueous Solution. 239. S. K. Kumar, I. Szleifer, and A. 2. Panagiotopoulos, Phys. Rev. Lett., 66, 2935 (1991). Determination of the Chemical Potential of Polymeric Systems from Monte Carlo Simulations. 240. L. Wang, T. O’Connell, A. Tropsha, and J. Hermans, Proc. Natl. Acad. Sci. U.S.A., 92,10924 ( 1 995). Thermodynamic Parameters for the Helix-Coil Transition of Oligopeptides: Molecular Dynamics Simulation with the Peptide Growth Method. 241. L. Wang, T. O’Connell, A. Tropsha, and J. Hermans, Biopolymers, 39,479 (1996).Energetic Decomposition of the a-Helix-Coil Equilibrium of a Dynamic Model System. 242. P. E. Smith and W. F. van Gunsteren, /. Chem. Phys., 100, 577 (1994). Predictions of Free Energy Differences from a Single Simulation of the Initial State. 243. G. Archontis and M. Karplus,/. Chem. Phys., 105, 11246 (1996). Cumulant Expansion of the Free Energy: Application to Free Energy Derivatives and Component Analysis. 244. R. M. Levy, M. Belhadj, and D. B. Kitchen,/. Chem. Phys. 95,3627 (1991). Gaussian Fluctuation Formula for Electrostatic Free-Energy Changes in Solution. 245. H. Liu, A. E. Mark, and W. F. van Gunsteren,]. Phys. Chem., 100,9485 (1996). Estimating the Relative Free Energy of Different Molecular States with Respect to a Single Reference State. 246. A. E. Mark, Y. Xu, H. Liu, and W. F. van Gunsteren, Acta Biochim. Polon., 42, 525 (1995). Rapid Non-Empirical Approaches for Estimating Relative Binding Free Energies. 247. J.-K. Hwang and A. Warshel, J. Am. Chem. Soc., 109, 715 (1987). Microscopic Examination of Free-Energy Relationships for Electron Transfer in Polar Solvents. 248. R. A. Kuharski, J. S. Bader, D. Chandler, M. Sprik, M. L. Klein, and R. W. Impey, J. Chem. Phys., 89, 3248 ( 1988). Molecular Model for Aqueous Ferrous-Ferric Electron Transfer. 249. R. A. Marcus,]. Chem. 
Phys., 24, 966 (1956).On the Theory of Oxidation-Reduction Reactions Involving Electron Transfer.
References
73
250. R. A. Marcus,]. Chem. Phys., 24,979 (1956).Electrostatic Free Energy and Other Properties of States Having Non-Equilibrium Polarization, 251. Y. Y. Sham, Z. T. Chu, and A. Warshel,]. Phys. Chem. B , 101,4458 (1997).Consistent Calculations of pK,’s as Ionizable Residues in Proteins: Semi-Microscopic and Microscopic Approaches. 252. J. Aqvist, C. Medina, and J.-E. Samuelsson, Protein Eng., 7,385 (1994). A New Method for Predicting Binding Affinity in Computer-Aided Drug Design. 253. T. Hansson and J. Aqvist, Protein Eng., 8, 1137 (1995).Estimation of Binding Free Energies for HIV Proteinase Inhibitors by Molecular Dynamics Simulations. 254. J. Aqvist and T. Hansson, J. Phys. Chem., 100, 9512 (1996).On the Validity of Electrostatic Linear Response in Polar Solvents. 255. J. Aqvist and S. L. Mowbray,]. Biol. Chem., 270,9978 (1995). Sugar Recognition by a GlucoselGalactose Receptor. Evaluation of Binding Energetics from Molecular Dynamics Simulations. 256. J. Aqvist,J. Comput. Chem., 17, 1587 (1996).Calculation of Absolute Binding Free Energies for Charged Ligands and Effects of Long-Range Electrostatic Interactions. 257. H. A. Carlson and W. L. Jorgensen,J. Phys. Chem., 99,10667 (1995).An Extended Linear Response Method for Determining Free Energies of Hydration. 258. G. King and R. A. Barford, J . Phys. Chem., 97, 8798 (1993). Calculation of Electrostatic Free Energy Differences with a Time-Saving Approximate Method. 259. G. Hummer and A. Szabo, J. Chem. Phys., 105,2004 (1996).Calculations of Free Energy Differences for Computer Simulations of Initial and Final States. 260. H. Meirovitch, Macromolecules, 18, 563 (1985). The Scanning Method with a Mean-Field Parameter: Computer Simulation Study of the Critical Exponents of Self-AvoidingWalks on a Square Lattice. 261. H. Meirovitch, M. Visquez, and H. A. Scheraga, Biopolymers, 27,1189 (1988).Stability of Polypeptides Conformational States. 11. 
The Free Energy of the Statistical Coil Obtained by the Scanning Simulation Method. 262. H. Meirovitch, M. Visquez, and H. A. Scheraga,]. Chem. Phys., 92,1248 (1990).Free Energy and Stability of Macromolecules Studied by the Scanning Method. 263. H. Meirovitch, Znt. J. Mod. Phys. C, 1, 119 (1990). The Scanning Simulation Method for Macromolecules. 264. H. Meirovitch, J. Chem. Phys., 97, 5803 (1992).Entropy, Pressure and Chemical Potential of Multiple Chain Systems from Computer Simulation. I. Application of the Scanning Method. 265. H. Meirovitch and H. A. Scheraga,J. Chem. Phys., 84,6369 (1986). Computer Simulation of the Entropy of Continuum Chain Models: The Two-Dimensional Freely-Jointed Chain of Hard Disks. 266. H. Meirovitch, J. Chem. Phys., 97, 5816 (1992).Entropy, Pressure and Chemical Potential of Multiple Chain Systems from Computer Simulation. 11. Application of the Metropolis and the Hypothetical Scanning Method. 267. H. Meirovitch, M. Vasquez, and H. A. Scheraga, Biopolymers, 26,651 (1987).Stability of Polypeptide Conformational States as Determined by Computer Simulation of the Free Energy. 268. C. Baysal and H. Meirovitch, submitted for publication. 269. H. Meirovitch, J. Phys. A, 15, 2063 (1982). An Approximate Stochastic Process for Computer Simulation of the Ising Model at Equilibrium. 270. H. Meirovitch, unpublished results, 1986. 271. M. Watanabe and W. P. Reinhardt, Phys. Rev. Lett., 65,3301 (1990). Direct Dynamical Calculation of Entropy and Free Energy by Adiabatic Switching.
74
Calculation of Free Energy and Entropy b y Computer Simulation
272. B. A. Berg, U. H. E. Hansmann, and Y. Okamoto,]. Phys. Chem., 99,2236 (1995).Comment on “Monte Carlo Simulation of a First-Order Transition for Protein Folding.” 273. Y. Okamoto and U. H. E. Hansmann, J. Phys. Chem., 99,11276 (1995).Thermodynamics of Helix-Coil Transitions Studied by Multicanonical Algorithms. 274. U. H. E. Hansmann and Y. Okarnoto, Physica A, 212,415 (1994). Comparative Study of Multicanonical and Simulated Annealing Algorithms in the Protein Folding Problem. 275. U. H. E. Hansrnann, Y. Okamoto, and F. Eisenmenger, Chem. Phys. Lett., 259,321 (1996). Molecular Dynamics, Langevin and Hybrid Monte Carlo Simulations in Multicanonical Ensemble. 276. U. H. E. Hansrnann and Y. Okamoto,]. Comput. Chem., 18,920 (1997).Numerical Comparisons of Three Recently Proposed Algorithms in the Protein Folding Problem. 277. M.-H. Hao and H. A. Scheraga, J . Phys. Chem., 98, 9882 (1994). Statistical Therrnodynamics of Protein Folding: Sequence Dependence. 278. M.-H. Hao and H. A. Scheraga, ]. Chem. Phys., 102, 1334 (1995). Statistical Thermodynamics of Protein Folding: Comparison of Mean Field Theory with Monte Carlo Simulation. 279. M.-H. Hao and H. A. Scheraga, Proc. Natl. Acad. Sci. U.S.A., 93,4984 (1996).How Optimization of Potential Functions Affects Protein Folding. 280. M.-H. Hao and H. A. Scheraga, ]. Phys. Chem., 100, 14540 (1996). Optimizing Potential Functions for Protein Folding. 281. M. Watanabe, A. M. Brodsky, and W. P. Reinhardt, J. Phys. Chem., 95, 4593 (1991). Dielectric Properties and Phase Transitions of Water Between Conducting Plates. 282. W. P. Reinhardt and J. E. Hunter III,]. Chem. Phys., 97, 1599 (1992). Variational Path Optimization and Upper and Lower Bounds of Free Energy Changes via Finite Time Minimization of External Work. 283. J. E. Hunter 111, W. P. Reinhardt, and T. F. Davis,]. Chem. Phys., 99,6856 (1993). 
A Finite Time Variational Method for Determining Optimal Path and Obtaining Bounds on Free Energy Changes from Computer Simulations. 284. G. J. Hogenson and W. P. Reinhardt,]. Chem. Phys., 102,4151 (1995). Variational Upper and Lower Bounds on Quantum Free Energy and Energy Differences via Path Integral Monte Carlo. 285. J. C. Schon, J. Cbem. Pbys., 10.5, 10072 (1996).A Thermodynamic Distance Criterion of Optimality for the Calculation of Free Energy Changes from Computer Simulations. 286. J. P. Valleau and D. N. Card,]. Chem. Phys., 57, 5457 (1972).Monte Carlo Estimation of the Free Energy by Multistage Sampling. 287. G. Bhanot, S. Black, P. Carter, and R. Salvador, Phys. Lett. 8, 183, 331 (1987). A New Method for the Partition Function of Discrete Systems with Application to the 3D Ising Model. 288. A. M. Ferrenberg and R. H. Swendsen, Phys. Rev. Lett., 61,2635 (1988).New Monte Carlo Technique for Studying Phase Transition. 289. A. M. Ferrenberg and R. H. Swendsen, Phys. Rev. Lett., 63,1195 (1989). Optimized Monte Carlo Data Analysis. 290. R. H. Swendsen, Physica A, 194, 53 (1993). Modern Methods of Analyzing Monte Carlo Computer Simulations. 291. K. K. Mon, Pbys. Rev. Lett., 54, 2671 (1985). Direct Calculation of Absolute Free Energy for Lattice Systems by Monte Carlo Sampling of Finite Size Dependence.
CHAPTER 2
Molecular Dynamics with General Holonomic Constraints and Application to Internal Coordinate Constraints
Ramzi Kutteh* and T. P. Straatsma
High Performance Computational Chemistry, Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, Washington 99352; *present address: Department of Physics, Queen Mary and Westfield College, University of London, Mile End Road, London E1 4NS, United Kingdom
INTRODUCTION

The earliest molecular dynamics (MD) simulations on polyatomic molecular systems date back less than three decades.1-3 Since then, MD simulations have been applied to increasingly complex molecules, such as n-alkanes4-13 and proteins.14-16 In contrast to the earlier simulations of atomic systems, the simulation of polyatomic molecules is complicated by the existence of both intramolecular and intermolecular degrees of freedom. The presence of this variety of degrees of freedom leads to a wide range of time scales of molecular motions.11 Because the size of the time step used in MD simulations is limited by the shortest period of motion present, simulations of long time scale
behavior are computationally expensive, if not impractical. For example, one may be interested in studying the slower modes in a system (e.g., torsional conformation interconversion), which necessitates a relatively long simulation time. Unfortunately, the upper limit imposed on the simulation time step by the presence of the fast modes, including bond-stretch and bond-angle vibrations, implies that a large number of short time steps is required to perform such a long simulation, thereby creating a major computational expense. Substantial improvement in efficiency can be achieved15 by freezing the fast modes, such as bond-stretching and possibly bond-angle vibrations, by constraining the appropriate hard (i.e., high frequency) degrees of freedom.

Imposing constraints on the system to achieve greater computational efficiency changes the physical character of the system: the dynamics of the constrained system differs from that of the original unconstrained system. Nevertheless, if the constrained degrees of freedom are not strongly coupled to the unconstrained degrees of freedom, and hence can be frozen without greatly affecting them, the differences in the long-term dynamical behavior of the system when treated with and without constraints should be relatively small. Generally, this weak coupling condition holds for modes of considerably higher frequency than the remaining low frequency modes. Bond-stretching modes, and to a lesser extent certain angle-bending modes, are usually of high enough frequency to satisfy this criterion. Furthermore, the amplitudes of the high frequency bond-stretching and some bond-bending modes are much smaller than those of the lower frequency modes. The physical effects of introducing constraints into a molecular model have been discussed by several authors.11,15,17-25

This chapter is concerned mainly with the methods of constraint dynamics.
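The time-step argument above can be made concrete with a minimal Python sketch. It converts vibrational wavenumbers into periods and takes the step as a fixed fraction of the shortest period present. The wavenumbers, the mode labels, and the choice of ten steps per period are illustrative assumptions, not values taken from this chapter; a 3000 cm^-1 stretch corresponds to a period of about 11 fs.

```python
C_CM_PER_S = 2.99792458e10  # speed of light in cm/s


def period_fs(wavenumber_cm1):
    """Vibrational period in femtoseconds for a mode of given wavenumber (cm^-1)."""
    return 1.0e15 / (C_CM_PER_S * wavenumber_cm1)


def max_time_step_fs(wavenumbers_cm1, steps_per_period=10):
    """Rule-of-thumb time step: shortest period divided by steps_per_period.

    The factor of 10 is an assumed heuristic, not a prescription from the text.
    """
    return min(period_fs(w) for w in wavenumbers_cm1) / steps_per_period


# Hypothetical, order-of-magnitude wavenumbers (cm^-1) for an alkane-like molecule.
modes = {
    "C-H stretch": 3000.0,
    "C-C stretch": 1000.0,
    "H-C-H bend": 1450.0,
    "torsion": 100.0,
}

# Fully flexible model: the fastest stretch limits the step.
flexible = max_time_step_fs(modes.values())

# Bond stretches frozen by constraints: the fastest remaining mode limits the step.
constrained = max_time_step_fs(
    w for name, w in modes.items() if "stretch" not in name
)

print(f"flexible model:      {flexible:.2f} fs")
print(f"stretches frozen:    {constrained:.2f} fs")
```

With these assumed numbers, freezing the stretches roughly doubles the admissible step, because the fastest remaining mode is the bend rather than the stretch.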
In addition to descriptions of bond-stretch and angle-bend constraints, dihedral (or torsional) constraints are explicitly considered. Torsional modes generally have frequencies comparable to those of the remaining unconstrained modes, and the weak coupling condition is not satisfied in this case. Hence the constraint approximation is not justified for torsions. This fact is particularly important because torsional motions play a major role in conformation interconversion in small molecules as well as polymers, and constraining them can seriously alter the dynamics of the original, unconstrained system. Nevertheless, the ability to constrain dihedral angles can be useful in simulations in which one needs to assess the behavior of a molecule as a function of conformation. An important application of this type of constraint is the evaluation of potentials of mean force (PMFs), which provide the work required to move the system along a predefined dihedral coordinate. The thermal probability is directly related to this free energy profile. PMFs provide a means of estimating relative stabilities of conformers as well as the kinetics of conformational changes. One of the approaches to determine PMFs is the application of thermodynamic perturbation or thermodynamic integration.26 These techniques are especially useful if the “reaction coordinate” is known analytically and if this reaction coordinate can be expressed as a combination of constraints on a small number of internal coordinates. An example is the calculation of the potential of mean force for rotation around a bond as described by a dihedral torsion. More complex analytical reaction paths can be formulated based on gas phase calculations, exploratory simulations, or chemical intuition. In certain cases the reaction path is not readily defined in terms of internal coordinates. Instead, the sequence of structures (as in larger systems) along a reaction coordinate may be more conveniently defined in a Cartesian coordinate representation. Techniques employing PMFs using linear constraints have been described extensively by Elber and co-workers.27-29

Evaluating constraint contributions to calculated free energy differences in a consistent and efficient manner is problematic. The main conceptual difficulty is that the change in free energy of a system must be evaluated for a change in a coordinate that has been removed from the allowable phase space by means of the constraint, and thus has no corresponding term in the Hamiltonian. For thermodynamic perturbation, a numerical potential of mean force correction to evaluate constraint contributions has been suggested by Kollman and co-workers.30,31 However, their method is not generally applicable because of its approximations. In a method to evaluate constraint contributions in thermodynamic integration simulations suggested by Straatsma, Zacharias, and McCammon,32 this contribution to free energy differences is evaluated from an analytical formulation of the derivative of the free energy with respect to the constrained degrees of freedom. All existing methods for evaluation of constraint contributions to free energy differences focus on bond-stretching constraints.
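As an illustration of how a PMF along a constrained dihedral might be assembled by thermodynamic integration, the following Python sketch integrates a mean derivative of the Hamiltonian with respect to the dihedral angle and converts the resulting profile W into thermal probabilities proportional to exp(-W/kT). It is a minimal sketch under stated assumptions: the mean derivative is taken to be already available at each constrained angle, synthetic data from a hypothetical threefold torsional profile stand in for simulation averages, and no metric-tensor or constraint-contribution corrections are applied. All function and variable names are illustrative.

```python
import math

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def pmf_by_integration(phi, mean_dh_dphi):
    """Trapezoidal integration of <dH/dphi> over phi, shifted so that min(W) = 0."""
    w = [0.0]
    for i in range(1, len(phi)):
        mean_slope = 0.5 * (mean_dh_dphi[i] + mean_dh_dphi[i - 1])
        w.append(w[-1] + mean_slope * (phi[i] - phi[i - 1]))
    w_min = min(w)
    return [x - w_min for x in w]


def thermal_probabilities(w, temperature):
    """Normalized Boltzmann weights exp(-W/kT) on the dihedral grid."""
    weights = [math.exp(-x / (KB * temperature)) for x in w]
    total = sum(weights)
    return [x / total for x in weights]


# Synthetic stand-in for simulation averages (assumption): the derivative of
# the hypothetical profile W(phi) = (V3 / 2) * (1 + cos(3 * phi)).
V3 = 3.0  # assumed barrier height in kcal/mol
phi = [2.0 * math.pi * i / 360 for i in range(361)]  # one-degree grid, radians
mean_force = [-1.5 * V3 * math.sin(3.0 * x) for x in phi]

w = pmf_by_integration(phi, mean_force)
p = thermal_probabilities(w, 300.0)

print(f"barrier recovered by integration: {max(w):.3f} kcal/mol")
print(f"population ratio, phi = 180 deg to phi = 0 deg: {p[180] / p[0]:.1f}")
```

In an actual calculation the list `mean_force` would be replaced by ensemble averages from separate simulations, each with the dihedral constrained to one grid value.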
The main reason for the focus of existing methods on bond-stretch constraints is that these constraints are simpler to implement than other constraints. As this chapter shows, equations for constraints other than bond-stretch constraints can be formulated and implemented in molecular simulations. These equations can also be used, akin to the treatment of bond-stretch constraints, to evaluate free energy contributions.

The present treatment covers mainly general holonomic constraints, with only a brief mention of nonholonomic constraints in the context of the application of Gauss’s principle of least constraint in MD simulations. A holonomic constraint, σ = 0, is defined33 as an algebraic equation connecting the coordinates of the particles (and in general also the time, but not here). To completely understand the various constraint methods used in modern simulations, it is appropriate to start the discussion from first principles. Lagrangian dynamics provides two approaches for dealing with systems with general holonomic constraints:33,34

1. The first approach of Lagrangian dynamics consists of transforming to a set of independent generalized coordinates and making use of Lagrange’s equations of the first kind, which do not involve the forces of constraint. The equations of constraint are implicit in the transformation to independent generalized coordinates, hence are also implicit in the Lagrange equations of the first kind.
2. The second approach, which uses the Lagrange multiplier technique, consists of retaining the set of constrained coordinates and making use instead of Lagrange’s equations of the second kind, which involve the forces of constraint. The Lagrange equations of the second kind together with the equations of constraint are used to solve for both the coordinates and the forces of constraint. Use of this approach with Cartesian coordinates has come to be known as constraint dynamics. This chapter is concerned with the various methods of constraint dynamics.

Each of these two approaches is in principle applicable to molecular models of both the types encountered in MD simulations: the totally rigid molecular models used in simulations of, for example, nitrogen35 and water,3 and the partially rigid molecular models (i.e., where only some of the internal degrees of freedom are constrained) used in early simulations of butane, for example. In practice, the first approach above is truly effective only when applied to totally rigid molecular models, whereas the second approach is equally effective for molecular models of both types, as described next.

1. Totally rigid models
A special case of a system of particles with holonomic constraints is the rigid body. A rigid body can be thought of as a system of particles with a distance constraint between every pair of particles. A (nonlinear) rigid model has six degrees of freedom: three translational and three rotational.
(a) Application to a rigid body of either Newtonian mechanics or the first approach of Lagrangian dynamics leads to the formalism of rigid body dynamics and the well-known Euler equations of rotational motion.33 The Euler equations, together with the equations relating the angular velocities to the Euler angles, can be used in principle to solve numerically for the molecular orientations. In practice, however, singularity problems arise in the equations connecting the angular velocities and the Euler angles. Although these problems can be remedied,35 a better solution was proposed by Evans37 and by Evans and Murad,38 whereby the Euler angles are abandoned as a set of three independent generalized coordinates in favor of the redundant four-component quaternions. Well-behaved equations relate the angular velocities and the quaternions. Improvements on the original quaternion formalism followed.39-41 These rigid body methods are evidently not applicable to partially constrained systems. In addition, as discussed below, application of the first approach of Lagrangian dynamics to partially constrained systems does not yield the computationally powerful formalism it gives in the case of totally rigid bodies.
Although rigid body methods can be effective for simulations of rigid models, we focus here on methods that also apply effectively to flexible models. Therefore, this chapter is not concerned with the methods of rigid body dynamics, either in the form of the older and more problematic Euler angles method or as represented by the more recent and more effective quaternion methods.
(b) The second approach of Lagrangian dynamics is also applicable to the study of rigid bodies. Appropriate equations of constraint are used to impose the condition of rigidity on the molecular model. To impose total rigidity on a model, particular care must be given to the issue of redundancy of constraints, keeping in mind that for a (nonlinear) system of $N$ particles, the minimum number of constraints needed is $3N - 6$. Hammonds and Ryckaert42 have explored the use of redundant constraints to improve the efficiency of evaluating the forces of constraints. The use of constraint dynamics to impose total rigidity on molecular models is illustrated in detail for a rigid water model in a later section. Ciccotti et al.43 dealt with the complications that arise when methods of constraint dynamics are applied and bond-stretch constraints alone are used to impose total rigidity on linear and planar molecules. The circumstances under which the first approach of Lagrangian dynamics (in particular, quaternion methods) is better or worse than the constraint dynamics approach for treating rigid bodies are discussed elsewhere.36 As will be seen below, only constraint dynamics is truly effective for dealing with partially rigid models. Consequently, a comparison between the two approaches for rigid bodies is tangential to the focus of this chapter on methods effectively applicable to both rigid and flexible models.
2. Partially rigid models. In the general case of a molecular model that is only partially rigid (i.e., has more than six degrees of freedom, if nonlinear):
(a) The first approach of Lagrangian dynamics is in principle applicable to such partially constrained systems and, in fact, was used in early simulations of n-butane. For every system of interest, a convenient set of independent generalized coordinates must be selected, and the equations of motion in these coordinates must be derived from the Lagrange equations of the first kind. Unfortunately, as molecular size and number of constraints increase, this procedure becomes increasingly complex,4,18,19,22,44 both in the choice of suitable generalized coordinates and in the derivation of the equations of motion.
(b) Because the rigid body formalisms are limited to totally rigid models, and because the first approach of Lagrangian dynamics does not yield equally powerful techniques for partially rigid models, this chapter, as mentioned, focuses on the techniques of constraint dynamics, equally useful for totally and partially rigid models.
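The counting argument used above (six degrees of freedom for a nonlinear rigid model, hence a minimum of $3N - 6$ independent constraints) can be checked with a few lines of arithmetic. The sketch below is only an illustration; the function names are hypothetical, and the value 5 used for linear molecules is the standard rigid-rotor count rather than a number quoted in this chapter.

```python
def min_constraints_for_rigidity(n_particles, linear=False):
    """Minimum number of independent constraints that makes a model totally rigid."""
    # A nonlinear rigid body keeps 6 degrees of freedom; a linear one keeps 5
    # (assumption: standard rigid-rotor counting, not taken from the chapter).
    rigid_body_dof = 5 if linear else 6
    return 3 * n_particles - rigid_body_dof


def remaining_degrees_of_freedom(n_particles, n_independent_constraints):
    """Degrees of freedom left after imposing independent holonomic constraints."""
    return 3 * n_particles - n_independent_constraints


# Three-site rigid water model: 3*3 - 6 = 3 constraints are needed.
print(min_constraints_for_rigidity(3))      # 3
# Four-site chain with five independent distance constraints: 12 - 5 = 7,
# i.e., the six rigid-body degrees of freedom plus one internal (torsional) one.
print(remaining_degrees_of_freedom(4, 5))   # 7
```

A model with more than six remaining degrees of freedom is partially rigid in the sense used here, so constraint dynamics rather than rigid body dynamics is the natural tool.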
The literature regrettably lacks an exposition of the fundamental theory behind constraint dynamics. In this chapter, we develop the basic theory for general forms of holonomic constraints and arbitrary integration algorithms. The benefits of such a general exposition are threefold. First, the methods of constraint dynamics, which to date have been used predominantly with simple bond-stretch constraints and integration algorithms requiring coordinate derivatives up to second order only, can now be readily implemented for any forms of coupled holonomic constraints and for integration schemes requiring coordinate derivatives up to any desired order. In MD simulations of systems involving constraints, the computation of the constraint forces typically takes far less CPU time than the computation of the forces deriving from the potential energy of the system. As larger systems with constraints are considered, or molecular models involving larger sets of constraints are simulated, the computation of constraint forces becomes increasingly CPU intensive and can rival in computational cost the evaluation of the potential energy forces. To deal with this problem, Hammonds and Ryckaert42 discussed techniques for improving the efficiency of computing the constraint forces in MD simulations. One suggested strategy calls for the use of equivalent alternative constraints. The formulation of constraint dynamics given for general holonomic constraints in this chapter provides an array of methods for implementing such equivalent alternative constraints. The resulting gain in efficiency can be dramatic, as shown later for a rigid water model. Hence the ability to implement arbitrary forms of holonomic constraints provides greater freedom in their application, often resulting in improved computational efficiency.
Second, understanding the common methodological origin of the various techniques provides deeper insight than is available from purely numerical tests alone when accuracies and efficiencies are being compared. Third, the use of coupled holonomic constraints having more complex forms than bond-stretch constraints highlights the limitations of the methods, which are not obvious when only simple bond-stretch constraints are employed. Accordingly, in discussing the various algorithms of constraint dynamics, it is instructive to classify them and relate them to one another, as described next.
1. The basic constraint dynamics method employed. There are two distinct methods of constraint dynamics: the analytical method and the method of undetermined parameters. The reason for the names will become apparent.
(a) The analytical method was briefly described by Ryckaert et al.,5 and the next section of this chapter is devoted to a detailed discussion of this method in its most general form. The analytical method consists of first selecting an integration algorithm requiring the input of the total forces and their derivatives with respect to time up to some order $s_{\max}$. The forces of constraints and their derivatives with respect to time up to order $s_{\max}$ are then computed and used with the potential energy forces in the adopted integration scheme, to generate the constrained coordinates. The analytical method was applied first by Orban and Ryckaert.48 Without the use of a correction scheme, however, the analytical method leads to constrained degrees of freedom that diverge with time from their constraint values,48 because of the error introduced by the unavoidable numerical integration of the equations of motion. To deal with the problem of numerical drift, Ryckaert et al.5 introduced the method of undetermined parameters. As discussed in the next section, years later Edberg et al.49 revived the analytical method (for $s_{\max} = 0$ and bond-stretch constraints only) by introducing a constraint correction scheme that compensates for the error from the numerical integration. The analytical method deserves a detailed discussion for at least two major reasons. First, if used in conjunction with some constraint correction scheme,49,50 it is important in its own right as a practical method of solution of the constrained dynamics problem. Second, the method of undetermined parameters, central to the subject matter of this chapter, is an outgrowth of the analytical method; hence a thorough understanding of the analytical method is an essential prerequisite for understanding the method of undetermined parameters.
(b) As mentioned above, Ryckaert et al.5 introduced the method of undetermined parameters, which is essentially the analytical method modified to ensure that the constraints are satisfied exactly at each time step. The method of undetermined parameters is discussed in detail, in its most general form, in a later section.
Basically, the aforementioned modification requires that the highest derivatives with respect to time of the Lagrangian multipliers (i.e., of order $s_{\max}$) be replaced by a set of undetermined parameters with values to be determined such that the constraints are satisfied exactly at each time step. As will be seen, the algorithm does not introduce into the trajectories additional numerical errors any worse than the error already present in the integration scheme itself. Again, the method of undetermined parameters deserves a detailed discussion in its most general form because most researchers are interested in its implementation with different integration algorithms (e.g., basic Verlet,51 velocity Verlet52), with holonomic constraints of various types (e.g., bond-stretch, angle-bend, and torsional constraints), and with particular techniques of solution (e.g., the matrix method,5 SHAKE,5 RATTLE,53 and SETTLE45). A thorough understanding of the method of undetermined parameters in its most general form provides an instructive picture of how its special cases relate to each other. In addition, as for the analytical method, the rigorous treatment given here supplies missing steps and corrects minor errors in Reference 5.
The general equations of both the analytical method and the method of undetermined parameters contain two parameters: $s_{\max}$ and the $\sigma_k$. These are uniquely determined by the particular algorithm for the integration of the equations of motion, and by the particular forms of holonomic constraints, respectively, as discussed in points 2 and 3 below.
2. The type of integration algorithm used. In the context of the mathematical formulation of constraint dynamics, the relevant property of the adopted integration algorithm is the highest order of time derivatives of the forces required by the algorithm. This property is represented by the parameter $s_{\max}$, and $s_{\max} = 0$ is widely encountered. The initial treatment given here is for unspecified $s_{\max}$, which in principle can have any nonnegative integer value. For each method of constraint dynamics, after the final set of equations has been obtained, specialization to any integration scheme is achieved by setting $s_{\max}$ to its appropriate value.
3. The types of holonomic constraint applied. As mentioned, most of the work performed to date has been restricted mainly to constraining bond-stretch distances. Angle-bend constraints have been implemented with appropriate bond-stretch constraints using the triangulation procedure.36 A later section compares the angle-constraint and triangulation approaches for angle-bend constraints. A procedure for performing constraint dynamics with general holonomic constraints was outlined by Ryckaert,8 but implementations of this method and performance figures have not been reported for the important internal coordinate constraints, such as angle-bend and torsional constraints. In this chapter, the analytical method and the method of undetermined parameters are first described and their equations derived for the case of general holonomic constraints, denoted by $\sigma_k$.
The treatment is then specialized to three types of constraint of great interest in MD simulations: the bond-stretch, angle-bend, and torsional constraints.
4. The numerical procedure used to solve the final equations. The analytical method leads to a system of equations linear in the unknowns (i.e., the Lagrangian multipliers and their time derivatives up to order $s_{\max}$). Therefore standard numerical techniques for solving such systems can be employed. The method of undetermined parameters leads to an additional system of equations generally nonlinear in the unknowns (i.e., the derivatives of the Lagrange multipliers of order $s_{\max}$). The order of nonlinearity depends on the particular type(s) of holonomic constraints used. For example, using the method of undetermined parameters for bond-stretch constraints together with the basic Verlet integration scheme51 leads to a system of equations quadratic in the unknown Lagrange parameters. Two numerical procedures can be used for solving these equations. The first, known as the matrix method,5 is an iterative coupled approach, whereas the second, termed SHAKE,5,15 is an iterative decoupled approach. A third approach, SETTLE,45 is analytical and is applicable only to three-point rigid bodies with bond-stretch constraints.
Although this chapter is written as a review of the methods of constraint dynamics, a substantial part of the material is new. In the next section, the analytical method is described in detail in its most general form. The gradual divergence of the constraints and the need for a constraint correction scheme are discussed. Finally, the method of Edberg et al.49 is discussed in the context of the analytical method, as a special case with $s_{\max} = 0$ and holonomic bond-stretching constraints, together with a constraint correction scheme. In the section following that, the method of undetermined parameters is treated in detail, also in its most general form. An error analysis of the method of undetermined parameters is given, and the method is shown to introduce a numerical error no worse than that already inherent in the integration algorithm. With all the general formalism laid out, the following sections are concerned with important special cases. First, because of the particularly simple form assumed by the method of undetermined parameters when used with the basic Verlet algorithm, a section is devoted to a discussion of this case. The matrix method and the SHAKE algorithm are described for the case of general holonomic constraints, and a useful physical picture of the SHAKE procedure is given.
The method of Tobias and Brooks54 is discussed in the context of the method of undetermined parameters with the basic Verlet algorithm. The section that follows discusses the application of the method of undetermined parameters with the basic Verlet scheme to the internal coordinate constraints of bond-stretch, angle-bend, and torsion. The material there connected with angle-bend and torsional constraints has not been treated in the literature. A comparative study of using angle constraints versus triangulation to impose angle-bend constraints is given in a separate section. Because of the computational advantages the velocity Verlet integration algorithm offers over the basic Verlet scheme, the section following is devoted to the description of the method of undetermined parameters with the velocity Verlet algorithm. The extension of SHAKE into RATTLE, given to date for the simple case of bond-stretch constraints, is described for general holonomic constraints. As with the basic Verlet algorithm, the treatment is then specialized to the important cases of bond-stretch, bond-angle, and torsional constraints.
Finally, further refinements and developments, such as the SETTLE algorithm and the special treatment of planar molecules in constraint dynamics, are briefly covered. A number of areas of possible progress are also mentioned.
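The Introduction notes that combining bond-stretch constraints with the basic Verlet scheme leads to equations quadratic in the unknown Lagrange parameters. For a single bond the quadratic can be solved in closed form, which gives a compact illustration of where the nonlinearity comes from. The sketch below is not the matrix method or SHAKE; it assumes the constraint $\sigma = |\mathbf{r}_1 - \mathbf{r}_2|^2 - d^2$, a constraint force $-\gamma \nabla_i \sigma$ evaluated at time $t$, and a constrained update $\mathbf{r}_i(t + \delta t) = \mathbf{r}_i' - (\delta t^2 / m_i)\,\gamma\,\nabla_i \sigma(t)$, where $\mathbf{r}_i'$ is the unconstrained Verlet position. The function name and argument layout are hypothetical, and the chapter's own normalization of the parameter may differ.

```python
import math


def constrain_bond_verlet(r1_old, r2_old, r1_unc, r2_unc, m1, m2, d, dt):
    """Solve the single-bond quadratic for gamma and return corrected positions.

    r*_old: positions at time t (they fix the direction of the constraint force)
    r*_unc: unconstrained Verlet positions at time t + dt
    """
    s = [a - b for a, b in zip(r1_old, r2_old)]   # r12(t)
    q = [a - b for a, b in zip(r1_unc, r2_unc)]   # unconstrained r12(t + dt)
    ss = sum(x * x for x in s)
    qs = sum(x * y for x, y in zip(q, s))
    qq = sum(x * x for x in q)
    # With g = 2 dt^2 gamma (1/m1 + 1/m2), the constraint |q - g s|^2 = d^2
    # becomes ss*g^2 - 2*qs*g + (qq - d^2) = 0.
    disc = qs * qs - ss * (qq - d * d)
    if disc < 0.0:
        raise ValueError("no real root: the step is too large for this bond")
    g = (qs - math.sqrt(disc)) / ss   # root that vanishes when |q| = d (qs > 0)
    gamma = g / (2.0 * dt * dt * (1.0 / m1 + 1.0 / m2))
    r1_new = [x - (dt * dt / m1) * gamma * 2.0 * y for x, y in zip(r1_unc, s)]
    r2_new = [x + (dt * dt / m2) * gamma * 2.0 * y for x, y in zip(r2_unc, s)]
    return gamma, r1_new, r2_new


# Illustrative numbers only: a bond of length 1 stretched by an unconstrained step.
gamma, r1, r2 = constrain_bond_verlet(
    [1.0, 0.0, 0.0], [0.0, 0.0, 0.0],
    [1.03, 0.06, 0.0], [0.01, -0.04, 0.0],
    m1=1.0, m2=2.0, d=1.0, dt=0.01,
)
bond_length = math.sqrt(sum((a - b) ** 2 for a, b in zip(r1, r2)))
```

With several coupled bonds the same construction yields a coupled quadratic system, which is the situation the iterative procedures named in the Introduction are designed for.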
THE ANALYTICAL METHOD OF CONSTRAINT DYNAMICS
This section describes in detail the analytical method for any forms of holonomic constraints $\sigma_k$ and with an integration algorithm requiring derivatives of forces with respect to time up to any order $s_{\max}$. The treatment is based mainly on the method described by Ryckaert et al.5 and used earlier by Orban and Ryckaert,48 but it contains added details and minor corrections. Where fundamental equations are reached, the corresponding equations of Reference 5 (often with slight notational differences) are noted for comparison. As stated before, the treatment is by definition couched throughout in Cartesian coordinates. Consider a system of $N$ interacting particles subject to $l$ general holonomic constraints
$$\sigma_k(\{\mathbf{r}(t)\}) = 0 \qquad (k = 1, \ldots, l)$$ [1]
where $\{\mathbf{r}(t)\}$ denotes the subset of coordinates of the $N$ particles involved in the particular constraint $\sigma_k$. Throughout this chapter, the symbol $\{\ \}$ denotes either a set or a subset of quantities. For example, $\{A\}$ stands for $A_1, A_2, \ldots, A_n$. The context will usually be sufficient to indicate to the reader the intended meaning. For a system of $N$ particles subject to the $l$ holonomic constraints in Eq. [1], the Cartesian form of the Lagrangian equations of motion of the second kind,33 with the coordinates connected by the constraint equations and the constraint forces appearing explicitly, is
$$m_i \ddot{\mathbf{r}}_i(t) = -\nabla_i U - \sum_{k=1}^{l} \lambda_k(t)\, \nabla_i \sigma_k = \mathbf{F}_i(t) + \mathbf{G}_i(t) \qquad (i = 1, \ldots, N)$$ [2]
where $m_i$ is the mass of particle $i$, $U$ is the potential energy of the system, $\lambda_k(t)$ is the Lagrangian multiplier corresponding to constraint $\sigma_k$, and $\mathbf{F}_i(t)$ and $\mathbf{G}_i(t)$ are the potential energy forces and constraint forces on particle $i$, respectively. Throughout this chapter, a single dot on top of a coordinate denotes the first derivative with respect to time of that coordinate, and a double dot denotes the second derivative with respect to time. The forces $\{\mathbf{F}(t)\}$ and $\{\mathbf{G}(t)\}$ depend on time implicitly through the coordinates. Equations [1] and [2] form a set of $N + l$ equations that can be solved for the $N$ coordinates $\{\mathbf{r}(t)\}$ and the $l$ multipliers $\{\lambda\}$. The general formalism of the analytical method consists of two steps: (1) the computation of the forces of constraints and their derivatives up to $s_{\max}$ and (2) the numerical integration of the equations of motion.
1. The computation of the forces of constraints and their derivatives. This step begins with the selection of a numerical algorithm for the integration of the equations of motion. Assuming that the chosen integration algorithm requires the total forces and their derivatives with respect to time up to order $s_{\max}$, the forces of constraint and their derivatives with respect to time up to order $s_{\max}$ must therefore be computed, implying that the Lagrangian multipliers and their derivatives with respect to time up to order $s_{\max}$ must also be computed. The purpose of this step, then, is to derive a set of equations to be solved for the Lagrangian multipliers and their time derivatives up to order $s_{\max}$. The forces of constraint and their derivatives up to order $s_{\max}$ are subsequently determined, for use in the adopted integration scheme. Computing the constraint forces and their derivatives is accomplished in three steps:
(a) An expression is derived from the equations of motion (Eq. [2]), giving time derivatives of the coordinates of any order $n \ge 2$ in terms of the forces $\{\mathbf{F}(t)\}$ and their derivatives, and the unknown Lagrangian multipliers $\{\lambda(t)\}$ and their derivatives up to order $n - 2$.
(b) The $s + 2$ time derivative of the constraint equations (Eq. [1]) is evaluated, with $0 \le s \le s_{\max}$, to include the derivatives of the coordinates with respect to time up to order $s_{\max} + 2$. As mentioned in step 1a, these include the Lagrangian multipliers and their time derivatives up to order $s_{\max}$, as required by the integration algorithm.
(c) The expression obtained in step 1a is inserted into the $s + 2$ time-differentiated Eq. [1] from step 1b, providing the desired set(s) of equations to be solved for the Lagrangian multipliers $\{\lambda\}$ and their time derivatives up to order $s_{\max}$. As will be seen, this set of equations is linear in the Lagrangian multipliers and their derivatives; it is solved recursively by setting $s = 0, \ldots, s_{\max}$ to obtain the corresponding $\{\lambda^{(s)}\}$.
Throughout this chapter, $\phi^{(n)}$ denotes an $n$th-order derivative with respect to time of a quantity $\phi$, and $\phi^n$ denotes an $n$th power of $\phi$. For the first and second time derivatives of the coordinates, the dot and parentheses notations are used interchangeably.
2. The integration of the equations of motion. In this step, the forces of constraint and their time derivatives up to order $s_{\max}$, obtained from step 1, are used as input to the selected integration scheme, to generate the constrained coordinates. Note that the particular choice of numerical integration algorithm in step 2 determines the parameter $s_{\max}$ of step 1.
We term the approach described here the analytical method to emphasize that the Lagrange multipliers and their derivatives are computed up to order $s_{\max}$ in the analytical method, but up to order $s_{\max} - 1$ only in the method of undetermined parameters, with the derivatives of order $s_{\max}$ computed ad hoc to satisfy the constraints, as described in the next section. In particular, for $s_{\max} = 0$ the actual forces of constraint must be computed (a priori) in the analytical method, in contrast to the method of undetermined parameters, where the approximate forces of constraint can be computed (a posteriori) if desired. We turn now to the details of step 1.
Computation of the Forces of Constraints and Their Derivatives
We derive here the set of equations for computing the Lagrange multipliers and their derivatives with respect to time up to order $s_{\max}$. An integration algorithm requiring the forces and their derivatives with respect to time up to order $s_{\max}$ is adopted. Taking the $n - 2$ time derivative of Eq. [2] and applying Leibniz's rule55 for higher derivatives of the product of two functions, we obtain
$$m_i\, \mathbf{r}_i^{(n)}(t) = \mathbf{F}_i^{(n-2)}(t) - \sum_{k=1}^{l} \sum_{p=0}^{n-2} \binom{n-2}{p}\, \lambda_k^{(p)}(t)\, [\nabla_i \sigma_k]^{(n-2-p)}(t) \qquad (n \ge 2;\ i = 1, \ldots, N)$$ [3]
which accomplishes the aim of step 1a. The assumed integration algorithm requires for input the Lagrangian multipliers and their derivatives with respect to time up to order $s_{\max}$. It is seen from Eq. [3] that to solve for the Lagrangian multipliers and their derivatives $\{\lambda^{(p)}\}$ up to order $s_{\max}$, the derivatives of the coordinates with respect to time up to order $s_{\max} + 2$ are needed. In steps 1b and 1c below, these derivatives of the coordinates are substituted into an appropriately differentiated form, with respect to time, of the constraint equation, Eq. [1], to yield the desired system of equations for the Lagrangian multipliers and their derivatives up to order $s_{\max}$. Turning to step 1b, we now seek a form of Eq. [1] appropriately differentiated with respect to time, to include the Lagrangian multipliers and their derivatives up to order $s_{\max}$. This requires taking the $s + 2$ derivative with respect to time of the general constraint equation, Eq. [1], with $s = 0, \ldots, s_{\max}$. The first derivative with respect to time of a function $f(\{\mathbf{r}(t)\})$ of $N$ position vectors $\{\mathbf{r}(t)\}$ is
$$\frac{d f}{d t} = \sum_{i=1}^{N} \dot{\mathbf{r}}_i(t) \cdot \nabla_i f$$ [4]
Taking the $s + 1$ derivative with respect to time of Eq. [4], we have
$$f^{(s+2)}(t) = \sum_{i=1}^{N} \left[ \dot{\mathbf{r}}_i \cdot \nabla_i f \right]^{(s+1)}(t) = \sum_{i=1}^{N} \sum_{\alpha=0}^{s+1} \binom{s+1}{\alpha}\, \mathbf{r}_i^{(s+2-\alpha)}(t) \cdot [\nabla_i f]^{(\alpha)}(t)$$ [5]
where the last equality follows from Leibniz’s rule.55 Using Eq. [5], the $s + 2$ derivative with respect to time of the general constraint equation, Eq. [1] (which holds for all times $t$), evaluated at $t_0$, is
$$\sigma_k^{(s+2)}(t_0) = \sum_{i=1}^{N} \sum_{\alpha=0}^{s} \binom{s+1}{\alpha}\, \mathbf{r}_i^{(s+2-\alpha)}(t_0) \cdot [\nabla_i \sigma_k]^{(\alpha)}(t_0) + \sum_{i=1}^{N} \dot{\mathbf{r}}_i(t_0) \cdot [\nabla_i \sigma_k]^{(s+1)}(t_0) = 0 \qquad (s = 0, \ldots, s_{\max};\ k = 1, \ldots, l)$$ [6]
where the term containing $\dot{\mathbf{r}}_i(t_0)$ was isolated and the remaining term explicitly contains second and higher time derivatives of the coordinates. Equation [6] accomplishes the aim of step 1b. Turning finally to step 1c, the expression for $\mathbf{r}_i^{(s+2-\alpha)}(t_0)$ (where $\alpha \le s$), from Eq. [3] of step 1a, is inserted into Eq. [6] of step 1b, leading to
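$$\sum_{i=1}^{N} \sum_{\alpha=0}^{s} \binom{s+1}{\alpha} \frac{1}{m_i} \left\{ \mathbf{F}_i^{(s-\alpha)}(t_0) - \sum_{k'=1}^{l} \sum_{p=0}^{s-\alpha} \binom{s-\alpha}{p}\, \lambda_{k'}^{(p)}(t_0)\, [\nabla_i \sigma_{k'}]^{(s-\alpha-p)}(t_0) \right\} \cdot [\nabla_i \sigma_k]^{(\alpha)}(t_0) + \sum_{i=1}^{N} \dot{\mathbf{r}}_i(t_0) \cdot [\nabla_i \sigma_k]^{(s+1)}(t_0) = 0 \qquad (s = 0, \ldots, s_{\max};\ k = 1, \ldots, l)$$ [7]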
Equation [7] represents a sequence of $s_{\max} + 1$ systems, one system for each value of $s = 0, \ldots, s_{\max}$, of $l$ equations each, one equation for each value of $k = 1, \ldots, l$, to be solved for the $(s_{\max} + 1) \times l$ unknowns $\{\lambda_{k'}^{(p)}(t_0);\ p = 0, \ldots, s_{\max};\ k' = 1, \ldots, l\}$. The equations are linear in the $\{\lambda^{(p)}(t_0)\}$. In principle then, Eq. [7] is the desired sequence of systems of equations that accomplishes the aim of step 1c. For each value of $s$, Eq. [7] includes the values of $\{\lambda^{(p)}(t_0);\ p = 0, \ldots, s\}$. This suggests that the sequence of systems in Eq. [7] must be solved successively using $s = 0, \ldots, s_{\max}$ for $\{\lambda^{(p)}(t_0);\ p = 0, \ldots, s_{\max}\}$, respectively. To this end, Eq. [7] should be recast in a more convenient form. The highest order $\{\lambda^{(p)}(t_0)\}$ term in Eq. [7] is for $p = s - \alpha$, with $\alpha = 0$. Extracting and isolating the first term on the left of Eq. [7] corresponding to these values (i.e., containing $\{\mathbf{F}^{(s)}(t_0)\}$ and $\{\lambda^{(s)}(t_0)\}$) gives
where the lower limit on $s$ reflects the existence of at least one lower order term, as implied by the extraction of the highest order $\{\lambda^{(p)}(t_0)\}$ term. Recalling that both $\mathbf{F}_i(t)$ and $\nabla_i \sigma_k(t)$ depend on time only implicitly through the position vectors, we can write the expressions:
[9]
Note that the expressions for $\mathbf{F}_i^{(p)}(t_0)$ and $[\nabla_i \sigma_k]^{(p)}(t_0)$ given in Reference 5 are incorrect. It is seen that the last three terms on the left-hand side in Eq. [8] are functions of $\{\lambda^{(p)}(t_0);\ p < s\}$ and $\{\mathbf{r}^{(0)}(t_0), \ldots, \mathbf{r}^{(s+1)}(t_0)\}$. Equation [8] can therefore be written more compactly as follows:
where $\mathcal{F}(\{\lambda^{(p<s)}(t_0)\}, \{\mathbf{r}^{(0)}(t_0), \ldots, \mathbf{r}^{(s+1)}(t_0)\})$ is defined as the sum of the last three terms in Eq. [8]. Equation [10] is consistent with Eq. [2.5b] of Reference 5. Equation [10] is a sequence of linear systems of $l$ equations each, which can be solved recursively for $\{\lambda^{(s)}(t_0);\ s = 1, 2, \ldots, s_{\max}\}$ by successively setting $s = 1, 2, \ldots, s_{\max}$. The computation of the $\{\lambda^{(1)}(t_0)\}$ requires $\mathcal{F}(\{\lambda^{(0)}(t_0)\}, \{\mathbf{r}(t_0), \dot{\mathbf{r}}(t_0), \ddot{\mathbf{r}}(t_0)\})$ and hence computation of the $\{\lambda^{(0)}(t_0)\}$. For $s = 0$, Eq. [7] yields
Equation [11] is a linear system of $l$ equations that can be solved for the $l$ unknowns $\{\lambda^{(0)}(t_0)\}$. Once the $\{\lambda^{(0)}(t_0)\}$ have been obtained, Eq. [10] with $s = 1$ is solved for the $\{\lambda^{(1)}(t_0)\}$, with $\mathcal{F}(\{\lambda^{(0)}(t_0)\}, \{\mathbf{r}(t_0), \dot{\mathbf{r}}(t_0), \ddot{\mathbf{r}}(t_0)\})$ evaluated using the $\{\lambda^{(0)}(t_0)\}$ obtained earlier. Once the $\{\lambda^{(1)}(t_0)\}$ have been obtained, we proceed recursively as before to compute $\{\lambda^{(2)}(t_0)\}$ and so on, until the $\{\lambda^{(s_{\max})}(t_0)\}$ have been computed. Note that Eq. [7] ($s = 0, \ldots, s_{\max}$) was merely recast into the equivalent but computationally more useful forms of Eqs. [10] ($s = 1, \ldots, s_{\max}$) and [11] ($s = 0$). Equations [10] and [11] also accomplish the aim of step 1c and are applicable to any forms of $l$ coupled holonomic constraints $\sigma_k$. The convenient linear property of Eqs. [10] and [11] is absent from the last set of equations of the method of undetermined parameters described in the next section. Having computed the Lagrangian multipliers and their derivatives with respect to time up to order $s_{\max}$, we can compute the forces of constraint and their derivatives up to order $s_{\max}$ for use by the chosen integration algorithm in step 2.
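For $s_{\max} = 0$ the whole of step 1 amounts to one linear solve. The sketch below illustrates this for bond-stretch constraints $\sigma_k = |\mathbf{r}_a - \mathbf{r}_b|^2 - d_k^2$. It assumes the $s = 0$ system arranged as $\sum_{k'} A_{kk'} \lambda_{k'} = b_k$, with $A_{kk'} = \sum_i m_i^{-1}\, \nabla_i \sigma_k \cdot \nabla_i \sigma_{k'}$ and $b_k = \sum_i m_i^{-1}\, \mathbf{F}_i \cdot \nabla_i \sigma_k + \sum_i \dot{\mathbf{r}}_i \cdot [\nabla_i \sigma_k]^{(1)}$, which follows from requiring $\ddot{\sigma}_k = 0$ with constraint forces $\mathbf{G}_i = -\sum_k \lambda_k \nabla_i \sigma_k$. This arrangement, the function names, and the numerical values are assumptions made for illustration and are not taken from Reference 5.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sub(a, b):
    return [x - y for x, y in zip(a, b)]


def solve_linear(A, b):
    """Gaussian elimination with partial pivoting; returns x such that A x = b."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def bond_gradients(bonds, pos):
    """For each bond (a, b): {particle index: gradient of |r_a - r_b|^2 - d^2}."""
    grads = []
    for a, b in bonds:
        r_ab = sub(pos[a], pos[b])
        grads.append({a: [2.0 * x for x in r_ab], b: [-2.0 * x for x in r_ab]})
    return grads


def lagrange_multipliers(bonds, pos, vel, forces, masses):
    """Solve the s = 0 linear system for the multipliers lambda_k."""
    grads = bond_gradients(bonds, pos)
    n_c = len(bonds)
    A = [[0.0] * n_c for _ in range(n_c)]
    rhs = [0.0] * n_c
    for k, (a, b) in enumerate(bonds):
        v_ab = sub(vel[a], vel[b])
        # sum_i rdot_i . d/dt[grad_i sigma_k] equals 2 |v_ab|^2 for a bond constraint
        rhs[k] = 2.0 * dot(v_ab, v_ab)
        for i, g_ki in grads[k].items():
            rhs[k] += dot(forces[i], g_ki) / masses[i]
            for kp in range(n_c):
                if i in grads[kp]:
                    A[k][kp] += dot(g_ki, grads[kp][i]) / masses[i]
    return solve_linear(A, rhs)


def constraint_forces(bonds, pos, lam):
    """G_i = -sum_k lambda_k grad_i sigma_k."""
    G = [[0.0] * len(p) for p in pos]
    for lam_k, g_k in zip(lam, bond_gradients(bonds, pos)):
        for i, g in g_k.items():
            G[i] = [G_x - lam_k * x for G_x, x in zip(G[i], g)]
    return G


# Arbitrary illustrative values for a three-site model made rigid by three bonds.
bonds = [(0, 1), (0, 2), (1, 2)]
pos = [[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]]
vel = [[0.1, 0.0, 0.2], [0.0, 0.3, -0.1], [-0.2, 0.1, 0.0]]
forces = [[1.0, -2.0, 0.5], [0.3, 0.4, -1.0], [-0.7, 0.2, 0.9]]
masses = [16.0, 1.0, 1.0]
lam = lagrange_multipliers(bonds, pos, vel, forces, masses)
G = constraint_forces(bonds, pos, lam)
```

The matrix $A$ couples two constraints only when they share a particle, which is why the system becomes sparse for chain molecules. A dense solver is used here purely for brevity.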
Numerical Integration of the Equations of Motion
Having evaluated the forces of constraint and their derivatives up to order $s_{\max}$ in step 1, we can integrate numerically the constrained equations of motion, Eq. [2]:
1. Assuming that the algorithm uses time derivatives of the coordinates up to order $s_{\max} + 2$, we compute the $\{\lambda^{(p)}(t_0);\ p = 0, 1, \ldots, s_{\max}\}$ from Eqs. [10] and [11].
2. With the Lagrange multipliers $\{\lambda(t_0)\}$ available, the forces of constraint can now be computed using
$$\mathbf{G}_i(t_0) = -\sum_{k=1}^{l} \lambda_k(t_0)\, \nabla_i \sigma_k(t_0) \qquad (i = 1, \ldots, N)$$ [12]
3. With the forces $\{\mathbf{F}(t_0)\}$ and $\{\mathbf{G}(t_0)\}$, and their derivatives with respect to time up to order $s_{\max}$ available, the new position vectors $\{\mathbf{r}(t_0 + \delta t)\}$ can be computed.
Error Analysis of the Analytical Method
When Orban and Ryckaert48 first applied the analytical method, they found that although the constraints were satisfied exactly at some initial time, as time went on the constrained degrees of freedom diverged progressively from their constraint values. This discrepancy is due to the error inherent in the numerical integration algorithms. If the computed trajectory were identical to the exact one, the constraints would be exactly satisfied at every time step. However, because the computed trajectory deviates gradually from the exact one, the constrained degrees of freedom deviate progressively from their constraint values. After some time, the parts of the system involved in the constraints will look substantially different from their exactly constrained configurations. It follows that the system being studied will look as a whole different from the constrained or partially constrained system desired, and the use of a larger time step is not justified in this case. It is possible to reduce the size of the time step such that the system configuration better approximates the constrained system configuration. However, this defeats the practical reason for introducing constraints in the first place, namely, the ability to use a larger time step for increased computational efficiency. If the integration algorithm has an error in the coordinates of order $O(\delta t^m)$ in the time step $\delta t$, then in the worst case of holonomic constraints linear in the coordinates we have
The constraints are therefore satisfied only to $O(\delta t^m)$, and the size of the error in Eq. [13] grows with every time step. Equation [13] should be compared to Eq. [1]. Clearly, for the analytical method to be of practical use, it must be coupled with a correction algorithm to keep the constraints satisfied within a desired tolerance. We describe an approach proposed by Edberg, Evans, and Morriss (EEM)49 and show that it is the simplest special case of the analytical method, with an added constraint correction scheme.
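The drift discussed in this error analysis is easy to reproduce numerically. The sketch below integrates a force-free rigid diatomic molecule with $s_{\max} = 0$, computing the multiplier analytically at every force evaluation and applying no constraint correction. It assumes the constraint $\sigma = |\mathbf{r}_1 - \mathbf{r}_2|^2 - d^2$, uses a midpoint (second-order Runge-Kutta) rule as a stand-in integrator, and reports $\sigma$ at the end of the run. All names and parameter values are illustrative assumptions.

```python
def accelerations(r1, r2, v1, v2, m1, m2):
    """Accelerations of a force-free rigid diatomic from the constraint force alone."""
    r12 = [a - b for a, b in zip(r1, r2)]
    v12 = [a - b for a, b in zip(v1, v2)]
    r_sq = sum(x * x for x in r12)
    v_sq = sum(x * x for x in v12)
    # lambda from the twice-differentiated constraint with F = 0:
    # 2 |v12|^2 - 4 lambda (1/m1 + 1/m2) |r12|^2 = 0
    lam = v_sq / (2.0 * (1.0 / m1 + 1.0 / m2) * r_sq)
    a1 = [-2.0 * lam * x / m1 for x in r12]   # G_1 = -lambda grad_1 sigma
    a2 = [+2.0 * lam * x / m2 for x in r12]   # G_2 = -lambda grad_2 sigma
    return a1, a2


def advance(x, dx, h):
    return [a + h * b for a, b in zip(x, dx)]


def constraint_drift(n_steps, dt, d=1.0, m1=1.0, m2=1.0):
    """Integrate n_steps with the midpoint rule and return sigma = |r12|^2 - d^2."""
    r1, r2 = [0.5 * d, 0.0], [-0.5 * d, 0.0]
    v1, v2 = [0.0, 0.5], [0.0, -0.5]   # pure rotation: constraint satisfied at t = 0
    for _ in range(n_steps):
        a1, a2 = accelerations(r1, r2, v1, v2, m1, m2)
        r1h, r2h = advance(r1, v1, 0.5 * dt), advance(r2, v2, 0.5 * dt)
        v1h, v2h = advance(v1, a1, 0.5 * dt), advance(v2, a2, 0.5 * dt)
        a1h, a2h = accelerations(r1h, r2h, v1h, v2h, m1, m2)
        r1, r2 = advance(r1, v1h, dt), advance(r2, v2h, dt)
        v1, v2 = advance(v1, a1h, dt), advance(v2, a2h, dt)
    r12 = [a - b for a, b in zip(r1, r2)]
    return sum(x * x for x in r12) - d * d


# Same total simulated time with two different time steps.
drift_coarse = constraint_drift(1000, 0.01)
drift_fine = constraint_drift(2000, 0.005)
```

In this setup the bond is never restored once it has stretched, so the violation accumulates over the run, and halving the time step reduces it without removing it. That is the behavior that motivates a constraint correction scheme.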
Method of Edberg, Evans, and Morriss in Context The EEM algorithm49 is shown here to be a special case of the analytical method described earlier by Ryckaert et aL5 and applied first by Orban and Ryckaert,4* albeit without a constraint correction scheme. As already discussed, the general formalism of the analytical method contains the parameters smax(the highest order of time derivative of the forces, needed by the integration algorithm) and the uk's (the forms of holonomic constraints). For a given system of interacting particles, a combination of choices for smaxand uk specifies uniquely the final sets of equations, Eqs. [ l o ] and [ll],to be solved for the Lagrangian multipliers and their derivatives up to smax.In this context, the EEM method results from the particular combination of (1)smaX = 0, meaning no derivatives of the forces of constraint need to be computed (the EEM method
The Analytical Method of Constraint Dynamics
91
uses a second-order Runge-Kutta integration algorithm,49 and (2) the holonomic constraints ukare bond-length constraints. To clearly illustrate the special case status of the EEM method, we consider two systems originally treated with EEM: a rigid diatomic molecule and an n-butane molecule. In each case, the fundamental analytical method equations described in steps l a , l b , and lc, are shown to reduce identically to the corresponding equations of the EEM approach, for [5max = 0, ak = bond stretch}. The rigid diatomic molecule provicies a simple context for comparison, involving only a single bond-stretching constraint, whereas the n-butane model offers a more complex system with five coupled bond-stretching constraints. It will also be evident from the following treatment that the EEM method is a special case of the analytical method for any general system to which the EEM approach is applicable. For the rigid diatomic molecule, the general holonomic constraint, Eq. [l], takes the special form
$$\sigma(\{\mathbf{r}(t)\}) \equiv \mathbf{r}_{12}^{2} - d^{2} = 0 \qquad [14]$$
where $\mathbf{r}_{12} = \mathbf{r}_2 - \mathbf{r}_1$, and $d$ is the constant bond length between the two atoms. The notation $\mathbf{r}_{ab} = \mathbf{r}_a - \mathbf{r}_b$ is adopted throughout this chapter, except for the present analysis involving the EEM method, where the notation of Reference 49 is followed instead. The subscript $k$ has also been dropped from Eq. [1] because the system involves a single constraint. Equation [14] above corresponds to Eq. [1] of Reference 49. For $s_{\max} = 0$ (i.e., only the second time derivative of the coordinates is required) and a single holonomic constraint, Eq. [3] from step 1a of the analytical method reduces to
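With the additional EEM restriction that $\sigma$ is the bond-stretching constraint given by Eq. [14], Eq. [15] becomes

$$\ddot{\mathbf{r}}_1 = \mathbf{F}_1(t_0) - \lambda(t_0)\,\mathbf{r}_{12}, \qquad \ddot{\mathbf{r}}_2 = \mathbf{F}_2(t_0) + \lambda(t_0)\,\mathbf{r}_{12} \qquad [16]$$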
where the masses $m_1$ and $m_2$ have been set to unity, consistent with Reference 49. Equations [16] are simply the equations of motion for the rigid diatomic molecule. For the present trivial case of $s_{\max} = 0$, they could have been obtained alternatively from the Lagrangian equations of motion, Eq. [2], directly. This is done in Reference 49, and our Eq. [16] is identical to Eq. [3] of Edberg et al.49

For $s_{\max} = 0$ and a single holonomic constraint, Eq. [6] from step 1b of the analytical method reduces to
With the additional EEM restriction on the constraint given by Eq. [14], Eq. [17] becomes
Again, for $s_{\max} = 0$, Eq. [18] could have been obtained directly by differentiating Eq. [14] twice with respect to time. This is done in Reference 49, and Eq. [18] is identical to Eq. [2] of Edberg et al. Equation [11] (obtained from Eq. [7] of step 1c of the analytical method for $s = 0$) for a single holonomic constraint reduces to
With the additional EEM restriction given by Eq. [14], Eq. [19] becomes
where, consistent with the locally adopted convention, $\mathbf{F}_{12} \equiv \mathbf{F}_2 - \mathbf{F}_1$. From Eq. [20], it follows that
Again, for $s_{\max} = 0$, Eq. [21] could have been obtained directly by inserting Eq. [16] into Eq. [18] to get Eq. [20]. This is done in Reference 49, and Eq. [21] is identical to EEM's Eq. [4].49

For the n-butane molecule, the molecular model adopted in Reference 49 is the same as that introduced and used in References 4, 6, and 7. It is a “united atom” model with every CH$_3$ and CH$_2$ group represented by a single particle of appropriate mass. The C$_1$-C$_2$, C$_2$-C$_3$, and C$_3$-C$_4$ bond lengths are held fixed to eliminate the C-C stretching modes (~700 cm$^{-1}$). The two C-C-C bending angles are held constant to freeze the bending modes (~300 cm$^{-1}$), by constraining the C$_1$-C$_3$ and C$_2$-C$_4$ distances (i.e., by triangulation). This model of butane therefore involves five holonomic bond-stretching constraints, and the general holonomic constraint, Eq. [1], takes the special form

$$\sigma_k(\{\mathbf{r}\}) \equiv \mathbf{r}_{ij}^{2} - d_{ij}^{2} = 0 \qquad (k = 1, \ldots, 5) \qquad [22]$$
where $\mathbf{r}_{ij} \equiv \mathbf{r}_j - \mathbf{r}_i$, $\mathbf{r}_i$ and $\mathbf{r}_j$ are the coordinates involved in the particular constraint $\sigma_k$, and $d_{ij}$ is the constant distance between the $(ij)$ pair of united atoms. For $s_{\max} = 0$ and five holonomic constraints, Eq. [3] from step 1a of the analytical method reduces to
With the additional EEM restrictions that the holonomic constraints be given by Eq. [22], $m_i = 1$ with $i = 1, \ldots, 4$ (consistent with Ref. 49), and $\lambda_k = \lambda_{\alpha\beta}$ with $k = \alpha + \beta - 2$, Eq. [23] becomes the following set of simultaneous equations:

[24]
Equations [24] are simply the equations of motion for the four-particle n-butane molecule, and for $s_{\max} = 0$, Eq. [24] could have been obtained directly from the Lagrangian equations of motion, Eq. [2]. This is the procedure by which it is obtained in Reference 49, and Eq. [24] is identical to Eq. [5] of that work. To be consistent with Reference 49, Eq. [24] is recast in the more compact form:

$$\ddot{\mathbf{r}}_\alpha = \mathbf{F}_\alpha + \sum_n M_{\alpha n}(\lambda\mathbf{R})_n \qquad (\alpha = 1, \ldots, 4;\ n = 1, \ldots, 5) \qquad [25]$$
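where $M_{\alpha n}$ is the matrix of ones and zeros in Eq. [24], and $\mathbf{R}_n \equiv \mathbf{r}_{\alpha\beta} = \mathbf{r}_\beta - \mathbf{r}_\alpha$ with $n = \alpha + \beta - 2$. Taking the difference of two terms of Eq. [25] gives

$$\ddot{\mathbf{r}}_{\alpha\beta} = \mathbf{F}_{\alpha\beta} + \sum_n L_{(\alpha\beta)n}(\lambda\mathbf{R})_n \qquad [\alpha\beta = (12), (13), (23), (24), (34);\ n = 1, \ldots, 5] \qquad [26]$$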
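where the square matrix $L_{(\alpha\beta)n} = M_{\beta n} - M_{\alpha n}$. Setting $m = \alpha\beta$ with $m = \alpha + \beta - 2$ in Eq. [26] gives

$$\ddot{\mathbf{R}}_m = \mathbf{F}_m + \sum_n L_{mn}(\lambda\mathbf{R})_n \qquad (m, n = 1, \ldots, 5) \qquad [27]$$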
For $s_{\max} = 0$ and five holonomic constraints, Eq. [6] from step 1b of the analytical method reduces to
With the additional EEM restriction on the constraints given by Eq. [22], and using the notational convention introduced above, Eq. [28] becomes

$$\mathbf{R}_m \cdot \ddot{\mathbf{R}}_m + (\dot{\mathbf{R}}_m)^{2} = 0 \qquad (m = 1, \ldots, 5) \qquad [29]$$
For $s_{\max} = 0$, Eq. [29] could have been obtained directly by differentiating Eq. [22] twice with respect to time. This is the procedure used in Reference 49, and Eq. [29] is identical to the corresponding EEM equation. Equation [11] for five holonomic constraints reduces, after a rearrangement of terms, to
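With the additional EEM restriction of Eq. [22], Eq. [30] becomes

$$-(\mathbf{F}_m \cdot \mathbf{R}_m + \dot{\mathbf{R}}_m^{2}) = \sum_n (\mathbf{R}_m L_{mn} \cdot \mathbf{R}_n)\,\lambda_n \qquad (m, n = 1, \ldots, 5) \qquad [31]$$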
which is the linear system of equations to be solved for the $\lambda_n$. For $s_{\max} = 0$, Eq. [31] could have been obtained directly by inserting Eq. [27] into Eq. [29], as was done in Reference 49; Eq. [31] is identical to Eq. [11] of Edberg et al.49

So far we have shown that for a rigid diatomic molecule and a constrained n-butane molecule, the fundamental analytical method equations reduce with $\{s_{\max} = 0,\ \sigma_k = \text{bond stretch}\}$ to the corresponding EEM equations. For butane, the Lagrangian multipliers and the forces of constraint are computed by solving Eq. [31]. The forces of constraint and the potential energy forces are then used in the $s_{\max} = 0$ integration algorithm, which in the EEM method49 is a second-order Runge-Kutta scheme, to give the constrained positions. This procedure is repeated at every time step. It is clear that the special-case status of EEM holds also for any system with arbitrary bond-length constraints to which the EEM method is applicable.

As discussed before, when Orban and Ryckaert48 first applied the analytical method, they discovered that the constraints diverged progressively from their constant values as a result of the numerical integration of the equations of motion. To deal with this problem, Ryckaert et al.5 developed the method of undetermined parameters, described in the next section, together with its two well-known solution techniques: the matrix method and the SHAKE algorithm, described for the basic Verlet scheme and for general holonomic constraints in the section following that. When EEM applied their method,49 they too observed this type of constraint divergence. However, in contrast to the solution proposed by Ryckaert et al.,5 EEM introduced a correction scheme to compensate for this numerical error. The scheme consists of introducing for each molecule “penalty functions” for the constrained distances and for the rate at which they change, so-called bond and velocity penalty functions, respectively.
The positions and velocities of each molecule are corrected to ensure that the penalty functions are kept below specific tolerances. To reduce program complexity and remove the restriction to self-starting integration algorithms (e.g., Runge-Kutta), Baranyai and Evans50 introduced a continuous proportional feedback scheme to correct for the numerical drift in the constraints.

Although this chapter is concerned primarily with holonomic constraints, we comment on the role of nonholonomic constraints in the present context. Because the analytical dynamics theory of a system of particles subject to holonomic constraints is well established, the issues in (holonomic) constraint dynamics are mainly algorithmic in nature, as seen in this chapter. The situation is more complex for molecular dynamics with nonholonomic constraints, where theoretical difficulties exist. In the EEM method, Gauss’s principle of least constraint34 is invoked to derive the equations of motion of the system of particles with holonomic constraints. However, it is well known33,34 that when holonomic constraints are involved, the equations of motion can be derived from D’Alembert’s principle, from Hamilton’s principle, or by means of a third approach.58 Application of Gauss’s principle in this case offers no advantage over these other approaches. Gauss’s principle is also exploited in the EEM method to enforce a nonholonomic temperature constraint57 in constant-temperature MD simulations. Again, the same equations of motion can be obtained by alternative means.58

Now we turn to the alternative solution proposed by Ryckaert et al.5 for ensuring satisfaction of the constraints at every time step.
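Before leaving the EEM special case, the butane linear system of Eq. [31] can be illustrated numerically. The following is a minimal sketch, not the EEM implementation: it assumes unit masses, assumes the sign convention $M_{\alpha n} = -1$, $M_{\beta n} = +1$ for the constraint $n$ joining sites $\alpha < \beta$ (generalizing Eq. [16]), and uses arbitrary test positions, velocities, and forces. The function names `solve` and `eem_multipliers` are hypothetical.

```python
# Sketch of the EEM linear system, Eq. [31], for a four-site butane model.
# Assumptions (not from the source): unit masses, M[alpha][n] = -1 and
# M[beta][n] = +1 for constraint n between sites alpha < beta, test data below.

PAIRS = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]  # (12),(13),(23),(24),(34), zero-based


def sub(a, b):
    return [x - y for x, y in zip(a, b)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def eem_multipliers(pos, vel, force):
    """Return the multipliers of Eq. [31] and the constrained accelerations."""
    n_sites, n_con = len(pos), len(PAIRS)
    # M[a][n]: how constraint n enters the equation of motion of site a (Eq. [25]).
    M = [[0.0] * n_con for _ in range(n_sites)]
    for n, (a, b) in enumerate(PAIRS):
        M[a][n], M[b][n] = -1.0, 1.0
    R = [sub(pos[b], pos[a]) for a, b in PAIRS]        # R_n = r_beta - r_alpha
    Rdot = [sub(vel[b], vel[a]) for a, b in PAIRS]
    Fm = [sub(force[b], force[a]) for a, b in PAIRS]   # F_m = F_beta - F_alpha
    # L[m][n] = M[beta][n] - M[alpha][n], as in Eq. [26].
    L = [[M[b][n] - M[a][n] for n in range(n_con)] for a, b in PAIRS]
    A = [[L[m][n] * dot(R[m], R[n]) for n in range(n_con)] for m in range(n_con)]
    rhs = [-(dot(Fm[m], R[m]) + dot(Rdot[m], Rdot[m])) for m in range(n_con)]
    lam = solve(A, rhs)
    acc = [[force[a][x] + sum(M[a][n] * lam[n] * R[n][x] for n in range(n_con))
            for x in range(3)] for a in range(n_sites)]
    return lam, acc


# Arbitrary (hypothetical) test data for a nonplanar four-site chain.
pos = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [2.0, 1.4, 0.0], [3.4, 1.6, 0.8]]
vel = [[0.1, 0.0, 0.2], [0.0, -0.1, 0.1], [0.05, 0.05, 0.0], [-0.1, 0.2, 0.1]]
force = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.5], [0.3, 0.2, -0.4], [-0.5, 0.1, 0.2]]
lam, acc = eem_multipliers(pos, vel, force)
# Residuals of Eq. [29]: R_m . Rddot_m + Rdot_m^2 for each constraint.
residuals = [dot(sub(pos[b], pos[a]), sub(acc[b], acc[a]))
             + dot(sub(vel[b], vel[a]), sub(vel[b], vel[a])) for a, b in PAIRS]
```

With the multipliers from Eq. [31], the accelerations satisfy the twice-differentiated constraints, Eq. [29], to rounding error at the instant they are computed. Nothing in this step prevents the constraints themselves from drifting once the accelerations are integrated numerically, which is why a correction scheme is needed.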
THE METHOD OF UNDETERMINED PARAMETERS

This section describes in detail the method of undetermined parameters for any form of holonomic constraints $\sigma_k$ and with an integration algorithm requiring derivatives of the forces up to arbitrary order $s_{\max}$. The treatment is again based mainly on the work of Ryckaert et al.,5 but more detailed mathematical derivations are given. As before, where fundamental equations are reached, corresponding equations in Reference 5 (often with some notational differences) are noted for comparison.

Recall that the analytical method without a constraint correction scheme compensating for the numerical integration error is not computationally practical. To address this problem, Ryckaert et al.5 proposed a modification of the analytical method that ensures exact satisfaction of the constraints at every time step, without introducing an additional numerical error in the trajectory. Using Eq. [3], a (truncated) Taylor series solution of Eq. [2] can be written as follows:

$(i = 1, \ldots, N)$ [32]
Rearranging the terms in Eq. [32] such that terms with the same order $\{\lambda^{(p)}(t_0)\}$ appear together, we have

where $\delta\mathbf{r}_i(t_0 + \delta t, \{\lambda^{(0)}(t_0)\})$ contains $s_{\max}$ terms linear in $\{\lambda^{(0)}(t_0)\}$, $\delta\mathbf{r}_i(t_0 + \delta t, \{\lambda^{(1)}(t_0)\})$ contains $s_{\max} - 1$ terms linear in $\{\lambda^{(1)}(t_0)\}$, and so on; $\delta\mathbf{r}_i(t_0 + \delta t, \{\lambda^{(s_{\max})}(t_0)\})$ contains a single term linear in $\{\lambda^{(s_{\max})}(t_0)\}$. Equation [33] can be rewritten as follows:
where the partially constrained position vector $\mathbf{r}'_i(t_0 + \delta t, \{\lambda^{(0)}(t_0), \ldots, \lambda^{(s_{\max}-1)}(t_0)\})$ is defined as the sum of all terms except the last in Eq. [33]. In the case of $s_{\max} = 0$, as encountered, for example, when a basic Verlet or a velocity Verlet integration algorithm is used, as described in following sections, $\mathbf{r}'_i(t_0 + \delta t, \{\lambda^{(0)}(t_0), \ldots, \lambda^{(s_{\max}-1)}(t_0)\})$ reduces to the purely unconstrained position vector.

In the analytical method, the $\{\mathbf{r}_i(t_0 + \delta t, \{\lambda^{(0)}(t_0), \ldots, \lambda^{(s_{\max}-1)}(t_0), \lambda^{(s_{\max})}(t_0)\})\}$ were evaluated by first computing $\{\lambda^{(p)}(t_0);\ p = 0, 1, \ldots, s_{\max}\}$ and then numerically integrating the equations of motion. Without a constraint correction scheme, this approach led to diverging constraints. Instead, the proposed modification5 to the analytical method consists of computing the constrained coordinates from the two contributions in Eq. [34]. In the first contribution, the $\{\mathbf{r}'_i(t_0 + \delta t, \{\lambda^{(0)}(t_0), \ldots, \lambda^{(s_{\max}-1)}(t_0)\})\}$ are evaluated by computing only $\{\lambda^{(p)}(t_0);\ p = 0, 1, \ldots, (s_{\max} - 1)\}$ and integrating the equations of motion, just as in the analytical method. In the second contribution, the $\{\delta\mathbf{r}_i\}$ are chosen to enforce the exact satisfaction of the constraints at every time step. In other words, the $\{\lambda^{(s_{\max})}(t_0)\}$ are not computed as in the analytical method, but are instead replaced by a new set of parameters $\{\gamma\}$

that are required to have values such that Eq. [1] is satisfied exactly:
Equation [36] should be compared with Eq. [13] of the analytical method. In Eq. [35], the $\delta\mathbf{r}_i(t_0 + \delta t, \{\gamma\})$ term is obtained by replacing $\{\lambda^{(s_{\max})}(t_0)\}$ by $\{\gamma\}$ in the expression for $\delta\mathbf{r}_i(t_0 + \delta t, \{\lambda^{(s_{\max})}(t_0)\})$, obtained by comparing Eq. [32] with Eq. [33]:
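Equation [37] is the general expression for the displacements required to satisfy the constraints. From Eq. [37], it is seen that $\delta\mathbf{r}_i$ evaluated at $t_0 + \delta t$ is linear in the $\{\gamma\}$ and a function of the $\{\mathbf{r}(t_0)\}$. Adoption of a particular integration algorithm and particular forms of holonomic constraints determines the value of $s_{\max}$ and the functional dependence of $\delta\mathbf{r}_i(t_0 + \delta t, \{\gamma\})$ on the $\{\mathbf{r}(t_0)\}$, respectively. By means of Eq. [37], Eq. [35] becomes

$$\mathbf{r}_i(t_0 + \delta t, \{\lambda^{(0)}(t_0), \ldots, \lambda^{(s_{\max}-1)}(t_0), \gamma\}) = \mathbf{r}'_i(t_0 + \delta t, \{\lambda^{(0)}(t_0), \ldots, \lambda^{(s_{\max}-1)}(t_0)\}) + \delta\mathbf{r}_i(t_0 + \delta t, \{\gamma\}) \qquad [38]$$

with $\delta\mathbf{r}_i(t_0 + \delta t, \{\gamma\})$ given explicitly by Eq. [37].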
Computation of the Partially Constrained Coordinates

The partially constrained coordinates $\{\mathbf{r}'_i(t_0 + \delta t, \{\lambda^{(0)}(t_0), \ldots, \lambda^{(s_{\max}-1)}(t_0)\})\}$ are computed as described for the analytical method. The Lagrangian multipliers and their time derivatives up to order $s_{\max} - 1$ are computed using Eqs. [10] and [11] as in step 1 of the preceding section. The forces of constraint and their time derivatives up to order $s_{\max}$ are then evaluated (with the time derivatives of order $s_{\max}$ of the forces of constraint missing the $\{\lambda^{(s_{\max})}(t_0)\}$ contributions, of course) for use in the integration scheme.

Having computed the forces of constraint and their time derivatives up to order $s_{\max}$, we can integrate numerically the equations of motion, Eq. [2], for the partially constrained coordinates $\{\mathbf{r}'_i(t_0 + \delta t, \{\lambda^{(0)}(t_0), \ldots, \lambda^{(s_{\max}-1)}(t_0)\})\}$ as in step 2 of the preceding section. Using the forces $\{\mathbf{F}(t_0)\}$ and their time derivatives up to order $s_{\max}$ together with the forces $\{\mathbf{G}(t_0)\}$ and their time derivatives up to order $s_{\max}$ (with the time derivatives of order $s_{\max}$ missing the $\{\lambda^{(s_{\max})}(t_0)\}$ contributions), the partially constrained position vectors $\{\mathbf{r}'(t_0 + \delta t)\}$ are computed.
Computation of the Undetermined Parameters and the Constrained Coordinates

When the partially constrained coordinates $\{\mathbf{r}'_i(t_0 + \delta t, \{\lambda^{(0)}(t_0), \ldots, \lambda^{(s_{\max}-1)}(t_0)\})\}$ have been computed, the undetermined parameters $\{\gamma\}$ can be obtained by inserting Eq. [38] into Eq. [36], giving

$(k = 1, \ldots, l)$ [39]
The most general form of holonomic constraint is nonlinear in the particle positions. Even the simple bond-stretch constraint is nonlinear. Consequently, Eq. [39] is in general a system of $l$ coupled nonlinear equations, to be solved for the $l$ unknowns $\{\gamma\}$. This nonlinear system of equations must be contrasted with the linear system of equations, Eqs. [10] and [11] (which is also in general part of the method of undetermined parameters), used in the analytical method to solve for the Lagrangian multipliers and their derivatives. A solution of Eq. [39] can be achieved in two steps:

1. Taylor expansion of the holonomic constraints $\sigma_k$. The Taylor expansion for a function of $N$ position vectors, about $\{\mathbf{r}\}$, is:
Equation [40] is used to Taylor-expand every holonomic constraint $\sigma_k$ in the molecular model, in Eq. [39], about the partially constrained coordinates $\{\mathbf{r}'_i(t_0 + \delta t, \{\lambda^{(0)}(t_0), \ldots, \lambda^{(s_{\max}-1)}(t_0)\})\}$. Standard methods must be used to establish the domain of validity or convergence of the Taylor expansion representation for every form of holonomic constraint $\sigma_k$ involved.

2. Linearization and iteration. In the Taylor expansion of $\sigma_k$ in Eq. [39], the terms linear in $\{\gamma\}$ are proportional to $[\delta t]^{s_{\max}+2}$, the terms quadratic in $\{\gamma\}$ are proportional to $[\delta t]^{2(s_{\max}+2)}$, the terms cubic in $\{\gamma\}$ are proportional to $[\delta t]^{3(s_{\max}+2)}$, and so on. In MD simulations, the time step used is typically $\delta t = 1$ fs, and therefore the constraint corrections $\{\delta\mathbf{r}(t_0 + \delta t, \{\gamma\})\}$ added to the partially constrained coordinates $\{\mathbf{r}'_i(t_0 + \delta t, \{\lambda^{(0)}(t_0), \ldots, \lambda^{(s_{\max}-1)}(t_0)\})\}$, to satisfy the constraint equations, are relatively small. Consequently, the terms nonlinear in $\{\gamma\}$ in Eq. [39] will have increasingly negligible contributions with increasing order of nonlinearity. For a first estimate of $\{\gamma\}$, all nonlinear terms can be neglected and the linearized Eq. [39] solved for $\{\gamma\}$. To compensate for the linearization, Eq. [39] is then solved iteratively by inserting the first estimate into the retained nonlinear term(s), solving the resulting linear equations for an improved estimate of $\{\gamma\}$, and so on. Usually, for computational efficiency, only the lowest nonlinear term is retained in the iterative solution. The procedure is repeated until the $\{\gamma\}$ converge within a desired tolerance. Typically, the nonlinear contributions to Eq. [39] are relatively small, and few iterations are needed for convergence. In summary, this iterative procedure of solution involves two approximations: (1) neglecting all nonlinear terms in the Taylor expansion of $\sigma_k$ for a first-estimate solution, and (2) retaining only the lowest nonlinear term in further iterations.
As for the Taylor representation itself in step 1 above, the validity of these approximations underlying the iterative method must be carefully examined for every form of holonomic constraint $\sigma_k$. In particular, two related points must be considered: (a) For given sizes of constraint corrections, the higher the order of nonlinearity inherent in the holonomic constraints $\sigma_k$, the less justifiable are the aforementioned approximations. (b) For a given form of holonomic constraint $\sigma_k$ with a certain order of inherent nonlinearity, the larger the constraint corrections, the less justifiable are the two approximations. Together, these points stipulate that the larger the nonlinearity inherent in $\sigma_k$, the smaller the allowed sizes of constraint corrections required to justify these approximations, hence to justify the use of the iterative method itself.

The $\{\gamma\}$, obtained by solving Eq. [39], are substituted into Eq. [37] to provide the displacements necessary to satisfy the constraints. Subsequently, the constrained position vectors $\{\mathbf{r}(t_0 + \delta t)\}$ are obtained from Eq. [38] by adding these constraint corrections to the partially constrained position vectors. The actual derivatives of order $s_{\max}$ of the forces of constraint must be computed (a priori) in the analytical method, whereas in the method of undetermined parameters, the approximate derivatives of order $s_{\max}$ of the forces of constraint can be computed (a posteriori) if desired, by replacing the $\{\lambda^{(s_{\max})}(t_0)\}$ by the $\{\gamma\}$.
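The linearization-and-iteration procedure can be made concrete for the simplest case of a single bond-stretch constraint with $s_{\max} = 0$, where the constraint equation is exactly quadratic in $\gamma$ and an exact root is available for comparison. The sketch below is illustrative only. It assumes a displacement of the form $\delta\mathbf{r}_i = -(\delta t^2/m_i)\,\gamma\,\nabla_i\sigma(t_0)$ with $\sigma = |\mathbf{r}_1 - \mathbf{r}_2|^2 - d^2$, so that the corrected separation is $\mathbf{s} - c\gamma\,\mathbf{r}_0$ with $c = 2\delta t^2(1/m_1 + 1/m_2)$; the function names and test values are hypothetical.

```python
import math


def solve_single_bond(s, r0, d, m1, m2, dt, tol=1e-13, max_iter=50):
    """Linearize-and-iterate solution for gamma of one bond-stretch constraint.

    s  : unconstrained separation r'_1 - r'_2 at t0 + dt
    r0 : constrained separation r_1 - r_2 at t0
    Assumed form: corrected separation = s - c * gamma * r0.
    """
    c = 2.0 * dt * dt * (1.0 / m1 + 1.0 / m2)
    s_dot_r0 = sum(a * b for a, b in zip(s, r0))
    r0_sq = sum(a * a for a in r0)
    sigma0 = sum(a * a for a in s) - d * d   # constraint at unconstrained positions
    gamma = 0.0                              # first pass drops the nonlinear term
    for n_iter in range(1, max_iter + 1):
        quad = (c * gamma) ** 2 * r0_sq      # quadratic term, previous estimate
        new = (sigma0 + quad) / (2.0 * c * s_dot_r0)
        if abs(new - gamma) < tol * max(1.0, abs(new)):
            return new, n_iter
        gamma = new
    raise RuntimeError("iteration did not converge")


def exact_gamma(s, r0, d, m1, m2, dt):
    """Smaller root of the quadratic constraint equation, for comparison."""
    c = 2.0 * dt * dt * (1.0 / m1 + 1.0 / m2)
    s_dot_r0 = sum(a * b for a, b in zip(s, r0))
    r0_sq = sum(a * a for a in r0)
    sigma0 = sum(a * a for a in s) - d * d
    return (s_dot_r0 - math.sqrt(s_dot_r0 ** 2 - r0_sq * sigma0)) / (c * r0_sq)


# Hypothetical test case: a bond of length 1 stretched by about 2 percent.
s = (1.02, 0.03, 0.0)
r0 = (1.0, 0.0, 0.0)
gamma, n_iter = solve_single_bond(s, r0, 1.0, 1.0, 1.0, 0.001)
reference = exact_gamma(s, r0, 1.0, 1.0, 1.0, 0.001)
```

In this example each pass shrinks the change in $\gamma$ by roughly the relative size of the correction, which illustrates why small constraint corrections lead to convergence in a few iterations.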
Error Analysis of the Method of Undetermined Parameters

The formal equivalence between the adopted integration algorithm and the Taylor expansion, Eq. [32], is used to derive the order of the error in $\delta t$ that is carried by the $\{\lambda^{(s_{\max})}(t_0)\}$ in the analytical method. This automatically yields the order of the error in $\delta t$ that was introduced into the method of undetermined parameters by the replacement of the $\{\lambda^{(s_{\max})}(t_0)\}$ with the parameters $\{\gamma\}$. The error in the method of undetermined parameters is shown to be of the same order as the error in the analytical method, but with the former approach the constraints are satisfied exactly at every time step.

In the analytical method, the $\{\lambda^{(p)}(t_0)\}$ $(p = 0, 1, \ldots, s_{\max})$, obtained from Eqs. [10] and [11], were used as input to the integration algorithm whose highest time derivative of the coordinates is of order $s_{\max} + 2$, which is equivalent to a derivative of the forces of order $s_{\max}$. This integration algorithm is formally equivalent to the Taylor expansion, Eq. [32]. The last term in the Taylor expansion, Eq. [33], contains the $\{\lambda^{(s_{\max})}(t_0)\}$ and the highest power $\delta t^{s_{\max}+2}$. Assume that the adopted integration algorithm has an error in the coordinates of $O(\delta t^{m+1})$. For example, the basic Verlet algorithm described in the next section requires no time derivatives of the forces (i.e., $s_{\max} = 0$) and has an error in the coordinates of $O(\delta t^{4})$ (i.e., $m = 3$). From the equivalence of the Taylor expansion and the integration algorithm, it follows that the highest term in the Taylor expansion is of $O(\delta t^{m})$, and therefore $\delta t^{s_{\max}+2}\{\lambda^{(s_{\max})}(t_0)\}$ is of $O(\delta t^{m})$. This implies that $\{\lambda^{(s_{\max})}(t_0)\}$ is of $O(\delta t^{m-(s_{\max}+2)})$. In other words,
where the $\{p\}$ are some estimated values of the $\{\lambda^{(s_{\max})}(t_0)\}$. In the method of undetermined parameters, the $\{\lambda^{(s_{\max})}(t_0)\}$ are replaced by the $\{\gamma\}$, computed as discussed before. When the $\{p\}$ are set equal to the $\{\gamma\}$, Eq. [41] becomes
Inserting Eq. [42] back into Eq. [37] of the method of undetermined parameters gives:

$(i = 1, \ldots, N)$ [43]
Using Eq. [43], the coordinates given by the method of undetermined parameters, Eq. [35] (or Eq. [38]), can be written as follows:

$+\ O(\delta t^{m+1}) \qquad (i = 1, \ldots, N)$ [44]
Equivalently, we can write:

$$\mathbf{r}_i\,[\text{method of undetermined parameters}] = \mathbf{r}_i\,[\text{analytical method}] + O(\delta t^{m+1}) \qquad (i = 1, \ldots, N) \qquad [45]$$
Therefore, the difference between the trajectory computed with the method of undetermined parameters (i.e., using the $\{\gamma\}$) and the trajectory computed with the analytical method (i.e., using the $\{\lambda^{(s_{\max})}(t_0)\}$) is of $O(\delta t^{m+1})$. This is the same order of error assumed in the integration algorithm and present in the analytical method before this modification. Hence the modification of the analytical method leading to the method of undetermined parameters does not introduce an error of lower order in $\delta t$ than the error already present. The errors in the trajectory computed with the analytical method and the trajectory computed with the method of undetermined parameters are both of $O(\delta t^{m+1})$, whereas in the latter method the constraints are perfectly satisfied at every time step (compare Eq. [13] with Eq. [36]).
USING THE METHOD OF UNDETERMINED PARAMETERS WITH THE BASIC VERLET INTEGRATION ALGORITHM

Ryckaert et al.5 initially incorporated the basic Verlet integration algorithm,51 known also as the Störmer algorithm,60,61 into the method of undetermined parameters. In the basic Verlet scheme, the highest time derivative of the coordinates is of second order, and Eq. [37] with $s_{\max} = 0$ reduces to:
With $s_{\max} = 0$, no $\{\lambda^{(0)}(t_0), \ldots, \lambda^{(s_{\max}-1)}(t_0)\}$ and no forces of constraints and their derivatives need to be computed to evaluate the coordinates $\{\mathbf{r}'_i(t_0 + \delta t, \{\lambda^{(0)}(t_0), \ldots, \lambda^{(s_{\max}-1)}(t_0)\})\}$ in Eq. [38]. Instead, for $s_{\max} = 0$ the partially constrained coordinates $\{\mathbf{r}'_i(t_0 + \delta t, \{\lambda^{(0)}(t_0), \ldots, \lambda^{(s_{\max}-1)}(t_0)\})\}$ reduce to the unconstrained coordinates $\{\mathbf{r}'(t_0 + \delta t)\}$, and Eq. [38] becomes
As described in the preceding section, the unconstrained coordinates are obtained numerically, using here the basic Verlet recipe:

$$\mathbf{r}'_i(t_0 + \delta t) = -\mathbf{r}_i(t_0 - \delta t) + 2\mathbf{r}_i(t_0) + \frac{[\delta t]^{2}}{m_i}\,\mathbf{F}_i(t_0) \qquad [48]$$
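The basic Verlet recipe for the unconstrained coordinates, Eq. [48], can be sketched as a short function. This is a minimal illustration with a hypothetical function name and plain nested lists for the coordinates; it deliberately involves no forces of constraint.

```python
def verlet_unconstrained(r_prev, r_curr, force, mass, dt):
    """Unconstrained positions r'(t0 + dt) from the basic Verlet recipe, Eq. [48].

    r_prev : positions at t0 - dt, one [x, y, z] list per particle
    r_curr : positions at t0
    force  : potential-energy forces F_i(t0) only (no forces of constraint)
    mass   : one mass per particle
    """
    return [[-p + 2.0 * c + (dt * dt / m) * f for p, c, f in zip(rp, rc, fi)]
            for rp, rc, fi, m in zip(r_prev, r_curr, force, mass)]


# Hypothetical check: one particle of mass 2 under a constant force, for which
# the exact trajectory z(t) = -t^2 is reproduced by the Verlet recipe.
dt = 0.1
r_next = verlet_unconstrained([[0.0, 0.0, -dt * dt]], [[0.0, 0.0, 0.0]],
                              [[0.0, 0.0, -4.0]], [2.0], dt)
```

The constant-force case is a convenient sanity check because the recipe is exact for trajectories that are quadratic in time.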
For the Verlet scheme with $s_{\max} = 0$, the $\lambda_k(t_0)$ are replaced by the $\gamma_k$ $(k = 1, \ldots, l)$, and the method of undetermined parameters incorporating the basic Verlet scheme could equally well be termed, and is often referred to in the literature as, the method of undetermined (Lagrangian) multipliers. However, in general where $s_{\max} > 0$, the Lagrangian multipliers and their derivatives $\{\lambda^{(0)}(t_0), \ldots, \lambda^{(s_{\max}-1)}(t_0)\}$ need to be computed, in addition to the undetermined parameters $\{\gamma\}$, and the latter designation is misleading. The method should then be referred to strictly as the method of undetermined parameters. For the basic Verlet scheme, $s_{\max} = 0$ and $m = 3$ in Eq. [42] lead to

$$\gamma_k = \lambda_k^{(0)}(t_0) + O(\delta t^{2}) \qquad (k = 1, \ldots, l) \qquad [49]$$
Inserting Eq. [49] into Eq. [46] shows that the coordinates given by the method of undetermined parameters are accurate to $O(\delta t^{3})$. The error in the coordinates (local error) is of the $O(\delta t^{4})$ present in the basic Verlet scheme. Therefore, consistent with the preceding error analysis, no additional error is introduced in the method of undetermined parameters, and the constraints are exactly satisfied at every time step. If the approximate forces of constraints are desired, they can be computed a posteriori as
Inserting Eq. [49] into Eq. [50] shows that the forces of constraint are accurate to the same order as the integration algorithm adopted. Hence, with the Verlet algorithm the computation of the approximate forces of constraint at $t_0$ reduces to the computation of the $\{\gamma\}$. Equivalently, by inserting Eq. [48] into Eq. [47], one obtains
Inserting $s_{\max} = 0$ into Eq. [32] and replacing the first three terms on the right of the equal sign by their basic Verlet recipe gives
Comparing Eq. [51] with Eq. [52] leads to the identical expression, Eq. [50], for the approximate forces of constraints. Note the factor of 1/2 in front of the last term in Eq. [52]. A common error in the literature is to “extract” the constraint forces $\mathbf{G}_i(t_0)$ from the basic Verlet expression for the unconstrained coordinates, Eq. [48], and to write the constrained coordinates as follows:
Not surprisingly, comparing Eq. [53] with Eq. [51] leads to the forces of constraint in Eq. [50], but with a factor of 1/2 discrepancy. This discrepancy is due to unwarranted attempts to apply Eq. [48], which should be used only in the computation of the unconstrained coordinates $\{\mathbf{r}'(t_0 + \delta t)\}$, to the constrained coordinates $\{\mathbf{r}(t_0 + \delta t, \{\gamma\})\}$. Unfortunately, a factor of 1/2 is often artificially introduced into the equations to mask this inconsistency. For convenience and conformity with the most widely adopted convention, the rest of this chapter redefines the undetermined parameters $\{\gamma\}$ such that their new values are equal to half their previously defined values. With this new definition, Eq. [46] takes the form
and Eq. [47] becomes
We next consider two techniques for determining the $\{\gamma\}$ and obtaining the constrained position vectors of Eq. [55].
The Matrix Method

The first approach for computing the undetermined parameters $\{\gamma\}$ is an application of the general method of solution, discussed in the preceding section, to the basic Verlet scheme. The total displacement in Eq. [54] containing all the constraints involving $\mathbf{r}_i$ is considered, and the nonlinear system of coupled equations, Eq. [39], is solved for $\{\gamma\}$. For the Verlet algorithm, Eq. [39] becomes

$(k = 1, \ldots, l)$ [56]
Again, a solution of Eq. [56] can be achieved in two steps.

1. Taylor expansion of the holonomic constraints $\sigma_k$. When $n_k$ is taken to be the subset of particles involved in $\sigma_k$, and Eq. [40] is used to Taylor-expand $\sigma_k(\{\mathbf{r}(t_0 + \delta t, \{\gamma\})\})$ about $\{\mathbf{r}'(t_0 + \delta t)\}$, Eq. [56] becomes

$(k = 1, \ldots, l)$ [57]
where cubic and higher order terms are not shown explicitly, and the last term contains a scalar double-dot product ":" of the dyads $\{[\nabla_i \sigma_{k'}](t_0)\,[\nabla_j \sigma_{k''}](t_0)\}$ and $[\nabla_i \nabla_j \sigma_k](\{\mathbf{r}'(t_0 + \delta t)\})$. Again, for every particular form of holonomic constraint $\sigma_k$ involved, the domain of validity of the Taylor representation in Eq. [57] needs to be carefully established. The bond-stretch, angle-bend, and torsional constraints treated with Eq. [57] in the following section have Taylor expansions valid for constraint displacements of any size.

2. Linearization and iteration. The nonlinear system of equations, Eq. [57], is linearized and solved for a first-estimate solution of $\{\gamma\}$, as discussed in connection with Eq. [39]. The solution is then inserted in the retained quadratic terms, and the linear system is solved for an improved estimate of the $\{\gamma\}$. This iterative procedure is repeated until the $\{\gamma\}$ converge within a desired tolerance. For the bond-stretch constraint, there is just one nonlinear (quadratic) term in its Taylor expansion (see later, Eq. [95]), and the linearization and iteration procedure is a fairly good approximation, justified even for relatively large corrections. For the bond-angle and torsional constraints, with infinite series Taylor representations, tighter limits are imposed on the allowable constraint correction sizes, although in typical MD simulations the displacements fall well within these limits, and the linearization and iteration procedure is still effective. With only the lowest order nonlinear term retained, as required by the iterative method of solution, Eq. [57] in matrix notation becomes
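where $\bar{\sigma}$ and $\bar{\gamma}$ are vectors of dimension $l$ and have the following components:

$$\bar{\sigma}_k = \sigma_k(\{\mathbf{r}'(t_0 + \delta t)\}) \quad \text{and} \quad \bar{\gamma}_{k'} = \gamma_{k'} \qquad (k, k' = 1, \ldots, l) \qquad [59]$$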
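The bracketed term in Eq. [58] is an $l \times l$ generally nonsymmetric square matrix with elements

$\bar{Q}$ is a vector of dimension $l$ containing the contributions to Eq. [57] quadratic in $\{\gamma\}$, with components

and $\bar{0}$ is the zero vector of dimension $l$. Equation [58] can be solved iteratively for $\bar{\gamma}$:

$$\bar{\gamma} = -\left[\frac{\partial\bar{\sigma}}{\partial\bar{\gamma}}\right]^{-1}\left[\bar{\sigma} + \bar{Q}\right] \qquad [62]$$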
The most recent estimates of the $\{\gamma\}$ are inserted into $\bar{Q}$ to yield improved estimates until convergence within a desired tolerance is achieved. When the solutions $\{\gamma\}$ have been obtained, they are substituted into Eq. [55] to get the position vectors at $t_0 + \delta t$. From Eq. [62], one sees that this procedure requires evaluation and inversion of the $l \times l$ matrix $[\partial\bar{\sigma}/\partial\bar{\gamma}]$ for every MD time step. Accordingly, this method of solution is referred to as the matrix method. Equations [60] and [61] show that implementation of the matrix method requires computation of $[\nabla_i \sigma_k](t_0)$, $[\nabla_i \sigma_k](\{\mathbf{r}'(t_0 + \delta t)\})$, and $[\nabla_i \nabla_j \sigma_k](\{\mathbf{r}'(t_0 + \delta t)\})$ for $k = 1, \ldots, l$; $i, j = 1, \ldots, n_k$. In the simplest case of $l$ uncoupled constraints, $[\partial\bar{\sigma}/\partial\bar{\gamma}]$ is diagonal with elements
and the vector $Q_k$ reduces to
In general, some of the $l$ constraints will share particles and will consequently have some coupling among them. In that case $[\partial\bar{\sigma}/\partial\bar{\gamma}]$ will have coupled nondiagonal elements, in addition to the diagonal elements given by Eq. [63], and $Q_k$ will contain coupled contributions, in addition to the self-contributions given by Eq. [64]. For large molecules, with large sets of constraints, the inversion of the $l \times l$ matrix $[\partial\bar{\sigma}/\partial\bar{\gamma}]$ at every MD time step becomes computationally expensive. If the constraints are localized, relatively little coupling exists between them; consequently the matrix $[\partial\bar{\sigma}/\partial\bar{\gamma}]$ will be sparse, and efficient inversion is possible. In most situations of physical interest, however, the coupling among constraints is substantial, and the resulting high computational cost of the matrix method has prompted the development of an alternative method of solution known as the SHAKE algorithm.5
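The matrix method can be sketched for coupled bond-stretch constraints as follows. This is an illustrative sketch rather than a reference implementation. It assumes the redefined Verlet displacement $\delta\mathbf{r}_i = -(\delta t^2/m_i)\sum_k \gamma_k \nabla_i\sigma_k(t_0)$, evaluates $\bar{Q}$ as the full nonlinear remainder (which for bond-stretch constraints is exactly the quadratic term), and uses hypothetical function names and test data.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sep(pos, a, b):
    return [pos[a][x] - pos[b][x] for x in range(3)]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))) / aug[r][r]
    return x


def sigma(pos, bonds):
    """Bond-stretch constraints sigma_k = |r_a - r_b|^2 - d_k^2."""
    return [dot(sep(pos, a, b), sep(pos, a, b)) - d * d for a, b, d in bonds]


def grad(pos, bond):
    """Nonzero gradients of one bond constraint, keyed by particle index."""
    a, b, _ = bond
    g = [2.0 * v for v in sep(pos, a, b)]
    return {a: g, b: [-v for v in g]}


def displace(r_unc, r_ref, bonds, inv_m, dt, gamma):
    """r' + dr, with dr_i = -(dt^2 / m_i) sum_k gamma_k grad_i sigma_k(t0)."""
    pos = [p[:] for p in r_unc]
    for bond, g in zip(bonds, gamma):
        for i, gi in grad(r_ref, bond).items():
            for x in range(3):
                pos[i][x] -= dt * dt * inv_m[i] * g * gi[x]
    return pos


def matrix_method(r_unc, r_ref, bonds, inv_m, dt, tol=1e-12, max_iter=50):
    l = len(bonds)
    sig0 = sigma(r_unc, bonds)
    g_unc = [grad(r_unc, b) for b in bonds]
    g_ref = [grad(r_ref, b) for b in bonds]
    # A[k][j] = d sigma_k / d gamma_j at gamma = 0 (the l x l matrix of Eq. [62]).
    A = [[sum(-dt * dt * inv_m[i] * dot(g_unc[k][i], g_ref[j][i])
              for i in g_unc[k] if i in g_ref[j])
          for j in range(l)] for k in range(l)]
    gamma = [0.0] * l
    for n_iter in range(max_iter):
        sig = sigma(displace(r_unc, r_ref, bonds, inv_m, dt, gamma), bonds)
        if max(abs(v) for v in sig) < tol:
            return gamma, n_iter
        # Q: nonlinear remainder evaluated with the most recent estimate of gamma.
        Q = [sig[k] - sig0[k] - dot(A[k], gamma) for k in range(l)]
        gamma = solve(A, [-(sig0[k] + Q[k]) for k in range(l)])
    raise RuntimeError("matrix method did not converge")


# Hypothetical triatomic with two unit-length bonds sharing particle 1.
r_ref = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
r_unc = [[0.01, 0.0, 0.0], [1.02, 0.01, 0.0], [1.0, 1.03, 0.0]]
bonds = [(0, 1, 1.0), (1, 2, 1.0)]
gamma, n_iter = matrix_method(r_unc, r_ref, bonds, [1.0, 1.0, 1.0], 0.01)
r_new = displace(r_unc, r_ref, bonds, [1.0, 1.0, 1.0], 0.01, gamma)
```

Because the matrix is built once at $\gamma = 0$ and only $\bar{Q}$ is refreshed, each pass costs one linear solve with the same matrix; in a production code that matrix would typically be factorized once per time step.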
SHAKE

The matrix method solution of Eq. [56] consists of finding the total displacement, Eq. [54], of a particle, needed to satisfy simultaneously all the constraints in which the particle is involved. In contrast, the SHAKE procedure consists of finding incrementally the displacement a particle needs to satisfy successively all those constraints. From Eq. [54], the displacement needed to satisfy separately a constraint $\sigma_k$ is
The simultaneous matrix solution, Eq. [62], of the matrix method is replaced by iterations over the sequence of constraints. The SHAKE algorithm consists of an iterative loop inside which the constraints are considered individually and successively. That is, the constraints are decoupled. SHAKE was initially described for the case of bond-stretch constraints5 and later generalized to handle general forms of holonomic constraint.8 The algorithm is discussed here for the case of general holonomic constraints $\sigma_k$. Beginning with the starting point of the matrix method, Eq. [56], a solution can be achieved in three steps.
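The loop over individually treated constraints can be sketched for bond-stretch constraints as follows. This is a minimal illustration, not the published SHAKE code. It assumes the redefined Verlet displacement $\delta\mathbf{r}_i = -(\delta t^2/m_i)\,\gamma_k \nabla_i\sigma_k(t_0)$, solves each linearized constraint for its own $\gamma_k$ using positions already updated within the current sweep, and raises an error in the singular case where the denominator vanishes. The function name and test data are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sep(pos, a, b):
    return [pos[a][x] - pos[b][x] for x in range(3)]


def shake_bonds(r_unc, r_ref, bonds, inv_m, dt, tol=1e-12, max_sweeps=500):
    """Satisfy bond constraints one at a time, sweeping until all hold within tol.

    r_unc : unconstrained positions r'(t0 + dt)
    r_ref : constrained positions at t0 (directions of the corrections)
    bonds : list of (a, b, d) with target distance d between particles a and b
    inv_m : inverse masses
    """
    pos = [p[:] for p in r_unc]
    for sweep in range(1, max_sweeps + 1):
        converged = True
        for a, b, d in bonds:
            s_old = sep(pos, a, b)
            sig = dot(s_old, s_old) - d * d          # sigma_k at the "old" positions
            if abs(sig) <= tol:
                continue
            converged = False
            s_ref = sep(r_ref, a, b)
            # dt^2 * sum_i (1/m_i) grad_i sigma_k(old) . grad_i sigma_k(t0)
            denom = 4.0 * dt * dt * (inv_m[a] + inv_m[b]) * dot(s_old, s_ref)
            if denom == 0.0:
                raise ZeroDivisionError("old and reference bond vectors are orthogonal")
            g = sig / denom                          # linearized gamma_k for this bond
            for x in range(3):
                pos[a][x] -= dt * dt * inv_m[a] * g * 2.0 * s_ref[x]
                pos[b][x] += dt * dt * inv_m[b] * g * 2.0 * s_ref[x]
        if converged:
            return pos, sweep
    raise RuntimeError("SHAKE did not converge")


# Hypothetical triatomic with two unit-length bonds sharing particle 1.
r_ref = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
r_unc = [[0.01, 0.0, 0.0], [1.02, 0.01, 0.0], [1.0, 1.03, 0.0]]
bonds = [(0, 1, 1.0), (1, 2, 1.0)]
r_new, n_sweeps = shake_bonds(r_unc, r_ref, bonds, [1.0, 1.0, 1.0], 0.01)
```

No matrix is formed or inverted here; the price is that correcting one bond disturbs its neighbours, so several sweeps over the list of constraints are needed.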
Method of Undetermined Parameters 107 1. Decoupling of the uk and iteration The generally nonlinear system of coupled holonomic constraints Eq. [56] is decoupled, and the constraints are satisfied sequentially rather than simultaneously as in the matrix method. To compensate for the decoupling of Eq. [56], the process of satisfying the constraints is iterated over the sequence of constraints until convergence has been achieved within a desired tolerance. The iteration in SHAKE over the nonlinear constraints, to compensate for the decoupling of Eq. [56], should be contrasted with the iteration in the matrix method over the coupled Eq. [56], to compensate for its linearization. During an iteration, the algorithm successively selects every constraint and corrects the positions of the subset of particles involved in that constraint, to satisfy it. Consider a certain iteration and a particular constraint uk.Let (rold(to+ tit)}be the subset of nk particle positions involved in uk with values obtained from either the preceding iteration or the current iteration, including all changes made up to this point in the iteration. Both cases are discussed below, although the latter is the more common. From Eq. [65], the new positions of the particles (rnew(tO + tit)]obtained in the current iteration
should satisfy the constraint $\sigma_k$:
Equation [67] is generally nonlinear in $\gamma_k^{\mathrm{new}}$, even in the common case of a bond-stretch constraint.
2. Taylor expansion of each holonomic constraint $\sigma_k$
As with the Taylor expansion of Eq. [56] into Eq. [57], taking $n_k$ to be the subset of particles involved in $\sigma_k$ and using Eq. [40] to Taylor-expand $\sigma_k(\{\mathbf{r}^{\mathrm{new}}(t_0+\delta t)\})$ about $\{\mathbf{r}^{\mathrm{old}}(t_0+\delta t)\}$, Eq. [67] becomes
where the nonlinear terms are not shown explicitly. Again, for every particular form of holonomic constraint $\sigma_k$, the domain of validity of the Taylor representation in Eq. [68] needs to be carefully established.
3. Linearization
For computational efficiency, all terms higher than first order in Eq. [68] are neglected, because the iterative process over constraints, introduced to compensate for the decoupling of Eq. [56], ensures that the solution obtained in this manner will satisfy the nonlinear equation Eq. [68]. Solving for $\gamma_k^{\mathrm{new}}$, one obtains
Again, the validity of neglecting all nonlinear terms in Eq. [68] must be carefully examined for every form of holonomic constraint $\sigma_k$. As discussed in connection with Eq. [39], the larger the nonlinearity inherent in the constraint $\sigma_k$, the smaller the constraint corrections must be to justify neglecting the nonlinear terms. With some change in notation, Eq. [69] is consistent with Eq. [9] of Reference 8. If during a SHAKE iteration it happens that
a singularity occurs in Eq. [69], because each dot product in the denominator vanishes and the denominator is then zero. In that case, a solution using SHAKE is impossible. A physical explanation of this situation can be given. From Eqs. [66]-[69], it is seen that at every SHAKE iteration the new molecular configuration $\{\mathbf{r}^{\mathrm{new}}(t_0+\delta t)\}$ is obtained from the old molecular configuration $\{\mathbf{r}^{\mathrm{old}}(t_0+\delta t)\}$ with displacements in the directions of $\{[\nabla_i\sigma_k](t_0)\}$ such that the new configuration satisfies the constraint, in the linear approximation. Each vector $\{[\nabla_i\sigma_k];\ i = 1, \ldots, n_k\}$ points in the direction in which a displacement of particle i produces the greatest change in the degree of freedom involved in the constraint $\sigma_k$,62 as described in detail shortly. Therefore, Eq. [70] states that the SHAKE displacements are orthogonal to the corresponding displacements that produce maximum change at the molecular configuration $\{\mathbf{r}^{\mathrm{old}}(t_0+\delta t)\}$. Clearly, to first order such SHAKE displacements have no effect on the constrained degree of freedom at $\{\mathbf{r}^{\mathrm{old}}(t_0+\delta t)\}$ and in particular cannot lead to the new configuration $\{\mathbf{r}^{\mathrm{new}}(t_0+\delta t)\}$. In this case, the SHAKE algorithm fails. As discussed before, the matrix method is computationally expensive for large numbers of constraints, leaving the SHAKE scheme as the only practical technique in such common situations. The complications due to the possible occurrence of a singularity in the application of SHAKE, as just described, should be taken into account when the practical aspects of the analytical method and the method of undetermined parameters are compared. The general physical picture of the orthogonality situation described here is illustrated for bond-stretch, angle-bend, and torsional constraints in the following section.
As mentioned above, the position coordinates can be updated either by "block replacement" between successive iterations using a Jacobi-type60 scheme or by "continuous replacement" during iterations using a Gauss-Seidel-type60 scheme. For block-updated coordinates (that is, when $\{\mathbf{r}^{\mathrm{old}}(t_0+\delta t)\}$ in Eq. [66] has values obtained from the preceding iteration), the coordinates, after use of Eqs. [66] and [69] for the kth constraint and during the nth iteration, are
where $\gamma_{k'}^{n'}$ is the value of $\gamma$ for the k'th constraint and the n'th iteration. For continuously updated coordinates (that is, when $\{\mathbf{r}^{\mathrm{old}}(t_0+\delta t)\}$ in Eq. [66] has values obtained from the current iteration), the coordinates, after use of Eqs. [66] and [69] for the kth constraint and during the nth iteration, are
with the definitions
n‘=l
n’=l
The latter procedure of updating the coordinates is the more efficient and more commonly used one, although the former approach has obvious advantages in vector implementations of SHAKE. Hammonds and Ryckaert discussed the use of equivalent alternative constraints, redundant constraints, and reordering of constraints in the SHAKE iteration, to speed up the convergence rate of SHAKE. A dramatic illustration of the improvement achieved by using equivalent alternative constraints is given in a following section, where the angle-bend constraint replaces the traditional triangulation procedure to constrain a rigid water model. With regard to ordering strategies, the most natural sequence in the SHAKE scheme seems to be to apply internal coordinate constraints in the order bond-stretch, angle-bend, and torsional, because the typically faster converging constraints are then performed first, as recommended by an empirical rule.42
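The continuous-replacement (Gauss-Seidel-type) iteration can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: it assumes the linearized update described above, namely $\mathbf{r}_i^{\mathrm{new}} = \mathbf{r}_i^{\mathrm{old}} - ([\delta t]^2/m_i)\,\gamma_k^{\mathrm{new}}[\nabla_i\sigma_k](t_0)$ with $\gamma_k^{\mathrm{new}} = \sigma_k(\{\mathbf{r}^{\mathrm{old}}\})/([\delta t]^2\sum_i m_i^{-1}[\nabla_i\sigma_k](\mathbf{r}^{\mathrm{old}})\cdot[\nabla_i\sigma_k](t_0))$. The names `bond_constraint` and `shake`, the water-like geometry, the masses, and the tolerance are hypothetical choices made for the example.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def bond_constraint(i, j, d):
    """Hypothetical helper: sigma = |r_j - r_i|^2 - d^2 and its per-atom gradients."""
    def value(pos):
        rij = [b - a for a, b in zip(pos[i], pos[j])]
        return dot(rij, rij) - d * d
    def grad(pos):
        rij = [b - a for a, b in zip(pos[i], pos[j])]
        return {i: [-2.0 * x for x in rij], j: [2.0 * x for x in rij]}
    return value, grad

def shake(r0, r_unc, masses, dt, constraints, tol=1e-10, max_iter=500):
    """Sweep over decoupled constraints, reusing positions already corrected in the sweep."""
    pos = [list(p) for p in r_unc]
    grads_t0 = [grad(r0) for _, grad in constraints]  # displacement directions, fixed at t0
    for sweep in range(1, max_iter + 1):
        worst = 0.0
        for (value, grad), g0 in zip(constraints, grads_t0):
            sigma = value(pos)
            worst = max(worst, abs(sigma))
            g_old = grad(pos)
            denom = dt * dt * sum(dot(g_old[a], g0[a]) / masses[a] for a in g0)
            if abs(denom) < 1e-14:
                raise ZeroDivisionError("orthogonal gradients: SHAKE singularity")
            gamma = sigma / denom
            for a in g0:
                for c in range(3):
                    pos[a][c] -= dt * dt / masses[a] * gamma * g0[a][c]
        if worst < tol:  # every constraint was already satisfied when visited
            return pos, sweep
    raise RuntimeError("SHAKE did not converge")

# Water-like triangle (assumed geometry): O, H1, H2 with three distance constraints.
theta = math.radians(109.47)
r0 = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [math.cos(theta), math.sin(theta), 0.0]]
d_hh = math.sqrt(2.0 - 2.0 * math.cos(theta))
constraints = [bond_constraint(0, 1, 1.0), bond_constraint(0, 2, 1.0),
               bond_constraint(1, 2, d_hh)]
masses = [16.0, 1.0, 1.0]
r_unc = [[0.02, -0.01, 0.0], [1.05, 0.03, -0.02],
         [math.cos(theta) - 0.04, math.sin(theta) + 0.02, 0.03]]
pos, sweeps = shake(r0, r_unc, masses, 1.0, constraints)
print("sweeps:", sweeps)
```

A block-replacement (Jacobi-type) variant would instead evaluate every constraint from a frozen copy of the positions and apply all corrections at the end of the sweep. Because the gradients of each constraint sum to zero over its atoms, the corrections in this sketch leave the center of mass unchanged.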
Physical Picture of SHAKE for Internal Coordinate Constraints
Typically, holonomic constraints in MD simulations are formulated in terms of internal coordinates, such as bond-stretch, angle-bend, and torsion. The general constraint equation, Eq. [1], can then be written as follows:
$$ \sigma_k = S_k - S_k^0 = 0 \quad (k = 1, \ldots, l) \qquad [74] $$
where $S_k$ is the internal coordinate associated with constraint $\sigma_k$, and $S_k^0$ is its constant constraint value. Upon application of Eq. [74], the SHAKE displacement in Eq. [66] becomes
From the theory of molecular vibrations, any internal coordinate can be Taylor-expanded about equilibrium in terms of its Cartesian coordinates
where $[S_k]_e$ is the value of $S_k$ at vibrational equilibrium, and terms beyond linear are not shown explicitly. The vectors $[\nabla_i S_k]_e$ are the familiar Wilson $\mathbf{s}_{ki}$ vectors.62 In the present constraint dynamics context, they are referred to the molecular configuration at the preceding time step $t_0$, rather than to the equilibrium molecular configuration of the vibrational dynamics context. To emphasize this fact, we adopt the notation $\mathbf{s}_{ki}(t_0) \equiv [\nabla_i S_k](t_0)$. With this definition, the SHAKE displacement Eq. [75] can be written in the more instructive form
The expressions for the $\mathbf{s}_{ki}$ are available from molecular vibration theory, for the common internal coordinates of bond-stretch, angle-bend, and torsion.62 As described earlier, $\mathbf{s}_{ki}$ points in the direction in which a displacement of particle i produces the greatest change in $S_k$.62 Equation [77] provides a physical picture of the SHAKE displacements for internal coordinate constraints. In the language of Wilson vectors, the singularity condition for internal coordinate constraints takes the form
Hence, when internal coordinate constraints are used, a singularity occurs in SHAKE if, at some stage of an iteration, the Wilson vectors of the preceding time step molecular configuration $\{\mathbf{r}(t_0)\}$ become orthogonal to the corresponding Wilson vectors of the current time step old molecular configuration $\{\mathbf{r}^{\mathrm{old}}(t_0+\delta t)\}$. Equations [77] and [78] are applied to the internal coordinate constraints of bond-stretch, angle-bend, and torsion in the following section, to provide a physical picture of SHAKE and the orthogonality situation.
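The orthogonality condition can be made concrete with a small numerical sketch for a single bond-stretch coordinate. The code below is an illustration under stated assumptions rather than part of the original treatment: `bond_s_vectors` and `shake_denominator` are hypothetical names, the common factor $[\delta t]^2$ is omitted from the denominator, and the "old" configuration is simply the $t_0$ bond rotated by a chosen angle.

```python
import math

def bond_s_vectors(ri, rj):
    """Wilson s-vectors of the bond-stretch coordinate S = |r_j - r_i| (unit vectors)."""
    rij = [b - a for a, b in zip(ri, rj)]
    length = math.sqrt(sum(x * x for x in rij))
    e = [x / length for x in rij]
    return [[-x for x in e], e]  # [s_i, s_j]

def shake_denominator(s_old, s_t0, masses):
    """Sum over atoms of s_ki(old) . s_ki(t0) / m_i (the dt^2 prefactor is left out)."""
    return sum(
        sum(a * b for a, b in zip(so, s0)) / m
        for so, s0, m in zip(s_old, s_t0, masses)
    )

def rotated_bond(angle):
    """Unit-length bond in the xy plane, rotated by `angle` from the x axis."""
    return [0.0, 0.0, 0.0], [math.cos(angle), math.sin(angle), 0.0]

masses = [12.0, 1.0]
s_t0 = bond_s_vectors(*rotated_bond(0.0))
denominators = {}
for degrees in (0, 45, 90):
    s_old = bond_s_vectors(*rotated_bond(math.radians(degrees)))
    denominators[degrees] = shake_denominator(s_old, s_t0, masses)
    print(degrees, denominators[degrees])
```

For this two-atom case the denominator scales with the cosine of the rotation angle, so it shrinks toward zero as the old bond direction approaches a right angle with the $t_0$ direction. In practice such a large rotation within one time step would indicate a time step that is far too long.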
Method of Tobias and Brooks in Context
So far we have described the method of undetermined parameters with the basic Verlet integration algorithm, for general holonomic coupled constraints, and two techniques of solution: the matrix method and the SHAKE algorithm. Tobias and Brooks54 proposed a method of constraint dynamics, which they described as an extension of the method of undetermined parameters with the basic Verlet algorithm to general holonomic constraints, and in particular to internal coordinate constraints. As with the earlier treatment of the method of Edberg et al.49 in the context of the analytical method, the method of Tobias and Brooks (TB)54 is examined in the context of the method of undetermined parameters with the basic Verlet algorithm and general holonomic constraints. It is shown that the TB method is not an extension of the method of undetermined parameters with the basic Verlet algorithm to general holonomic constraints, but rather an approximation of the true extension described earlier in this section. To this end, we first derive the fundamental equations of the TB method in the terminologies of the matrix method and the SHAKE algorithm, but in a manner that parallels the derivation of Reference 54. When the relevant equations of the TB method have been obtained in this manner, they are rederived in an alternative, more straightforward way which, more importantly, identifies explicitly the additional approximations introduced by the TB method into the method of undetermined parameters with the basic Verlet algorithm and for general holonomic constraints.
Again, beginning with the starting point of the matrix method, Eq. [56], taking $n_k$ to be the subset of particles involved in $\sigma_k$, and using Eq. [40] to Taylor-expand $\sigma_k(\{\mathbf{r}(t_0+\delta t,\{\gamma\})\})$ about the preceding time step particle positions $\{\mathbf{r}(t_0)\}$ (rather than about the current time step unconstrained particle positions $\{\mathbf{r}'(t_0+\delta t)\}$), Eq. [56] becomes
$$ \sigma_k(\{\mathbf{r}(t_0+\delta t,\{\gamma\})\}) = \sigma_k(\{\mathbf{r}(t_0) + [\mathbf{r}(t_0+\delta t,\{\gamma\}) - \mathbf{r}(t_0)]\}) = \sigma_k(\{\mathbf{r}(t_0)\}) + \sum_{i=1}^{n_k} [\mathbf{r}_i(t_0+\delta t,\{\gamma\}) - \mathbf{r}_i(t_0)] \cdot [\nabla_i\sigma_k](t_0) + \cdots = 0 \qquad [79] $$
In Eq. [79], $\sigma_k(\{\mathbf{r}(t_0)\}) = 0$ because the constraints are satisfied (within the desired tolerance) at every time step and in particular at the preceding time step $t_0$. Replacing $[\nabla_i\sigma_k](t_0)$ by the Wilson vectors $\mathbf{s}_{ki}(t_0)$, and neglecting all nonlinear terms in Eq. [79], one gets
$$ \sigma_k(\{\mathbf{r}(t_0) + [\mathbf{r}(t_0+\delta t,\{\gamma\}) - \mathbf{r}(t_0)]\}) = \sum_{i=1}^{n_k} [\mathbf{r}_i(t_0+\delta t,\{\gamma\}) - \mathbf{r}_i(t_0)] \cdot \mathbf{s}_{ki}(t_0) = 0 \quad (k = 1, \ldots, l) \qquad [80] $$
Equation [80] is equivalent to Eq. [15] of the TB method.54 In the Wilson vector notation, the total displacement Eq. [55] can be written as follows:
Inserting Eq. [81] into Eq. [80] gives
Taylor-expanding $\sigma_k(\{\mathbf{r}'(t_0+\delta t)\})$ about the preceding time step particle positions $\{\mathbf{r}(t_0)\}$, we have
Using Eq. [84] in Eq. [82] affords
Equation [85] is equivalent to Eq. [16] of the TB method,54 with a difference in sign in the definitions of the Lagrange multipliers. A definition analogous to that of the Wilson G matrix62 of vibrational dynamics
allows us to write Eq. [85] more compactly:
Equation [87] is equivalent to Eq. [19] of the TB method,54 again with a difference in sign in the definitions of the Lagrange multipliers. In matrix notation, Eq. [87] takes the form
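where $\mathbf{G}$ is the $l \times l$ matrix with elements $G_{kk'}$, $\boldsymbol{\Lambda}$ is the $l \times 1$ matrix with elements $\Lambda_k = -\gamma_k$, and $\mathbf{S}$ is the $l \times 1$ matrix with elements $\sigma_k(\{\mathbf{r}'(t_0+\delta t)\})$. From Eq. [88] we have
$$ \boldsymbol{\Lambda} = -\frac{1}{[\delta t]^2}\,\mathbf{G}^{-1}\mathbf{S} \qquad [89] $$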
Equations [88] and [89] are identical to Eqs. [21] and [22] of the TB method,54 respectively. Equation [89], which follows from Eqs. [85]-[88], is the fundamental equation of the TB method. We shall return to this equation shortly. Similarly, the constraint displacement in Eq. [81] can be written in the matrix notation
$$ \boldsymbol{\Delta} = [\delta t]^2\,\mathbf{M}^{-1}\mathbf{B}\boldsymbol{\Lambda} \qquad [90] $$
where $\boldsymbol{\Delta}$ is the $3N \times 1$ matrix of constraint displacements, $\mathbf{M}$ is the $3N \times 3N$ diagonal matrix of particle masses, and $\mathbf{B}$ is the $3N \times l$ matrix of $\mathbf{s}_{ki}(t_0)$. Inserting Eq. [89] into Eq. [90] gives
Equations [90] and [91] are identical to Eqs. [23] and [24] of the TB method,54 respectively. With all needed equations available, the TB method proceeds as follows:
1. The unconstrained positions $\{\mathbf{r}'(t_0+\delta t)\}$ are computed using Eq. [48].
2. Equations [89] and [90] are solved for the Lagrangian multipliers and the constraint displacements, respectively (i.e., Eq. [91] is solved directly).
3. The constrained coordinates are obtained by means of Eq. [81].
The process is then repeated at the next time step. As mentioned before, Eq. [89] for the Lagrangian multipliers is crucial to the TB method, and because it follows directly from Eq. [85], we concentrate on the latter as the more fundamental, and we show that it is derivable alternatively from Eq. [57] of the matrix method, by means of two approximations. Beginning again with the starting point of the matrix method, Eq. [56], Taylor-expanding $\sigma_k(\{\mathbf{r}(t_0+\delta t,\{\gamma\})\})$ about $\{\mathbf{r}'(t_0+\delta t)\}$ as in Eq. [57], neglecting all the nonlinear terms, and replacing $[\nabla_i\sigma_k]$ with $\mathbf{s}_{ki}$, we have
Applying the additional approximation
to Eq. [92] leads identically to Eq. [85]. The approximation Eq. [93] states that the Wilson vectors evaluated at the current time step's unconstrained particle positions are approximately equal to the Wilson vectors evaluated at the preceding time step's particle positions. Since the Wilson vectors have unique magnitudes and orientations relative to the molecular configuration at which they are evaluated,62 the foregoing approximation implies that the current time step unconstrained molecular configuration is identical (to the level of the approximation) to the preceding time step molecular configuration. In its first derivation, Eq. [85] was obtained rather circuitously54 by a combination of Taylor expansions.
1'. A Taylor expansion of $\sigma_k(\{\mathbf{r}(t_0+\delta t,\{\gamma\})\})$ about the preceding time step particle positions $\{\mathbf{r}(t_0)\}$, with a linear approximation leading to Eq. [82].
2'. A Taylor expansion of $\sigma_k(\{\mathbf{r}'(t_0+\delta t)\})$ about $\{\mathbf{r}(t_0)\}$, with a linear approximation leading to Eq. [84].
However, as shown above by the more direct alternative derivation of Eq. [85], steps 1' and 2' are entirely equivalent to (a) a Taylor expansion of $\sigma_k(\{\mathbf{r}(t_0+\delta t,\{\gamma\})\})$ in Eq. [56] about $\{\mathbf{r}'(t_0+\delta t)\}$, as in Eq. [57], and linearization, and (b) use of the approximation in Eq. [93]. Therefore, the TB method54 involves two
additional approximations to the method of undetermined parameters with the Verlet integration algorithm, general holonomic constraints, and solution by the matrix method:
1. The Taylor expansion, Eq. [57], is linearized, and no further iterations are carried out using the lowest nonlinear term, as in the matrix method.
2. The current time step unconstrained molecular configuration is assumed to be (approximately) identical to the preceding time step molecular configuration, as described in connection with Eq. [93].
Clearly, approximation 1 leads to an Eq. [85] that is linear in the Lagrangian multipliers. Not surprisingly, its solution by the TB method is found to be inaccurate.54 The reason for the inaccuracy is obvious in light of the steps of the matrix method: the solution in Eq. [89] is just a linearized first estimate of the true solution, and no further iterations, using at least the lowest nonlinear term in the expansion, are carried out to refine this first estimate, unlike the procedure followed in the matrix method. To deal with this problem, Tobias and Brooks decouple the constraints and iterate over them until convergence is reached to within a certain tolerance.54 Therefore, in the final analysis the TB method is equivalent to SHAKE with the additional approximation of Eq. [93]. Because the equations being decoupled and iterated over are Eq. [92] with the approximation of Eq. [93], or equivalently Eq. [85], the SHAKE parameters $\gamma_k^{\mathrm{new}}$ are given in the TB method by
In contrast to the correct expression for $\gamma_k^{\mathrm{new}}$ of the true SHAKE algorithm, given by Eq. [69], the denominator in the expression for $[\gamma_k^{\mathrm{new}}]_{\mathrm{TB}}$ of Eq. [94] is constant during successive iterations, and its value is determined using only the preceding time step $t_0$ particle positions. Hence the only "feedback" in this iterative process is through the numerator $\sigma_k(\{\mathbf{r}^{\mathrm{old}}(t_0+\delta t)\})$. One side effect of the form of the denominator in Eq. [94] is the absence of orthogonality problems, discussed earlier in connection with SHAKE, from MD simulations using the TB algorithm.
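The practical difference between the two denominators can be illustrated with a short Python sketch for a single bond constraint. This is a hedged illustration, not the TB or SHAKE implementation: it assumes a squared-distance constraint $\sigma = |\mathbf{r}_j-\mathbf{r}_i|^2 - d^2$ in place of Wilson vectors, and the function name `constrain_bond`, the flag `fixed_denominator`, the geometry, and the tolerance are all hypothetical. With the flag off, the denominator is re-evaluated at the old positions in every iteration (SHAKE-like); with the flag on, it is frozen at its $t_0$ value (TB-like).

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def constrain_bond(r0, r_unc, masses, d, dt, fixed_denominator, tol=1e-12, max_iter=200):
    """Iterate one bond constraint; return positions and the number of corrections."""
    pos = [list(p) for p in r_unc]
    b0 = [q - p for p, q in zip(r0[0], r0[1])]         # r_j(t0) - r_i(t0)
    g0 = [[-2.0 * x for x in b0], [2.0 * x for x in b0]]  # gradients at t0
    for n_corrections in range(max_iter):
        b = [q - p for p, q in zip(pos[0], pos[1])]
        sigma = dot(b, b) - d * d
        if abs(sigma) < tol:
            return pos, n_corrections
        if fixed_denominator:
            g_old = g0                                   # TB-like: t0 positions only
        else:
            g_old = [[-2.0 * x for x in b], [2.0 * x for x in b]]  # SHAKE-like
        denom = dt * dt * sum(dot(g_old[a], g0[a]) / masses[a] for a in (0, 1))
        gamma = sigma / denom
        for a in (0, 1):
            for c in range(3):
                pos[a][c] -= dt * dt / masses[a] * gamma * g0[a][c]
    raise RuntimeError("no convergence")

r0 = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]        # constraint satisfied at t0 (d = 1)
r_unc = [[-0.05, 0.1, 0.0], [1.1, -0.2, 0.0]]  # unconstrained positions at t0 + dt
masses = [12.0, 1.0]
pos_shake, n_shake = constrain_bond(r0, r_unc, masses, 1.0, 0.5, fixed_denominator=False)
pos_tb, n_tb = constrain_bond(r0, r_unc, masses, 1.0, 0.5, fixed_denominator=True)
print("corrections:", n_shake, "(updated denominator) vs", n_tb, "(fixed denominator)")
```

For a single constraint both variants move the atoms along the same fixed direction and reach the same constrained positions; the frozen denominator can never vanish, which mirrors the absence of orthogonality problems noted above, but the iteration then needs more corrections to reach the same tolerance.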
APPLICATION TO INTERNAL COORDINATE CONSTRAINTS
This section begins by applying the method of undetermined parameters, with the basic Verlet algorithm, to the treatment of bond-stretch constraints. Because bond-stretch constraints are the most common type of constraint in MD simulations and were the first to be treated with constraint dynamics, the matrix method and SHAKE have been extensively discussed in the literature for this case. Nevertheless, the matrix method and SHAKE are described here for bond-stretch constraints, for completeness and to permit use with other forms of holonomic constraints. For example, for a rigid water model, discussed later, one bond-angle constraint and two bond-stretch constraints are present. On the other hand, the material in this section on angle-bend and torsional constraints is new in the sense that these constraints have not been discussed in the literature in the contexts of the matrix method or SHAKE. Recall that the TB method54 is not a true extension of SHAKE to internal coordinate constraints, nor does it cover the extension of the matrix method to those constraints.
Bond-Stretch Constraints
Consider a system of N particles and $l$ general holonomic constraints, and assume that $l_s$ bond-stretch constraints ($l_s \le l$) are present. For bond-stretch constraints, the general holonomic constraint, Eq. [1], takes the special form
$$ \sigma_k(\{\mathbf{r}\}) = [\mathbf{r}_j(t) - \mathbf{r}_i(t)]^2 - d_{ij}^2 = 0 \quad (k = 1, \ldots, l_s) \qquad [95] $$
where i and j are the two particles involved in the particular constraint $\sigma_k$, and $d_{ij}$ is the constant distance between the (ij) pair of particles.
The Matrix Method Solution
Using the bond-stretch constraint $\sigma_k(\{\mathbf{r}\})$ of Eq. [95], Eq. [57] becomes
where the pair of particles (ij) in the bond-stretch constraint Eq. [95] has replaced the subset of particles $n_k$ in Eq. [57]. For every $\sigma_k(\{\mathbf{r}\})$ with ($k = 1, \ldots, l_s$), a corresponding equation analogous to Eq. [96] holds. With a slight difference in notation, Eq. [96] is consistent with Eq. [3.7] of Reference 5. The highest term in Eq. [57] for a bond-stretch constraint is quadratic, and Eq. [96] is exact. In general, the bond-stretch constraint in Eq. [96] is coupled to other forms of holonomic constraints sharing one or both of its (ij) pair of particles. In particular, $\sigma_{k'}$ and $\sigma_{k''}$ in Eq. [96] can be other bond-stretch, bond-angle, or torsional constraints. From Eq. [96], the matrix elements in Eq. [58] of the matrix method are
and Eq. [69] becomes
($k = 1, \ldots, l_s$) [99]
where the pair of particles (ij) in the bond-stretch constraint Eq. [95] has replaced the subset of particles $n_k$ in Eq. [69]. With a slight change in notation, Eq. [99] is consistent with Eq. [5.6] of Reference 5 (with the quadratic term neglected), and with Eq. [9a] of Reference 8. For the bond-stretch constraint there is just one nonlinear quadratic term in Eq. [68], and the approximation underlying Eq. [99] is reasonable. Now we use the treatment in the preceding section to give a physical picture of the SHAKE process of resetting the coordinates to satisfy bond-stretch constraints. If the bond-stretch constraint Eq. [95] had been formulated, instead, in terms of the bond-stretch internal coordinate62 (i.e., magnitude of the distance between the two atoms), the following bond-stretch SHAKE displacements would have been found from Eq. [77]:
where $\mathbf{s}_{ki}(t_0)$ and $\mathbf{s}_{kj}(t_0)$ are the Wilson vectors (unit vectors in this case) for the bond-stretch internal coordinate.62 From Eq. [100], it is seen that the SHAKE
displacements of the bond-stretch atoms are along the directions of the corresponding Wilson vectors. In particular, $\delta\mathbf{r}_j$ and $\delta\mathbf{r}_i$ are along the direction of the line connecting the two atoms at the preceding time step $t_0$ and pointing inward. However, for computational efficiency, the bond-stretch constraint is usually formulated in terms of the square of the interatomic distance, as in Eq. [95]. In this case, the bond-stretch SHAKE displacements are found from Eq. [98]:
$$ \delta\mathbf{r}_j(t_0+\delta t, \gamma_k^{\mathrm{new}}) = -\frac{2[\delta t]^2}{m_j}\,\gamma_k^{\mathrm{new}}\,[\mathbf{r}_j(t_0) - \mathbf{r}_i(t_0)] $$
$$ \delta\mathbf{r}_i(t_0+\delta t, \gamma_k^{\mathrm{new}}) = -\frac{2[\delta t]^2}{m_i}\,\gamma_k^{\mathrm{new}}\,[\mathbf{r}_i(t_0) - \mathbf{r}_j(t_0)] \qquad [101] $$
From Eq. [101] it is seen that $\delta\mathbf{r}_j$ and $\delta\mathbf{r}_i$ are again in the directions of the line connecting the two atoms at the preceding time step $t_0$ and pointing inward. Hence the current displacements are in the direction of the Wilson unit vectors, with the $\gamma_k^{\mathrm{new}}$ of Eq. [100] equal to $2\gamma_k^{\mathrm{new}}\,|\mathbf{r}_j(t_0) - \mathbf{r}_i(t_0)|$ in terms of the $\gamma_k^{\mathrm{new}}$ of Eq. [101]. A later section details the correspondence between the formulations of the bond-stretch constraint in terms of the bond-stretch internal coordinate and in terms of the square of the interatomic distance. When the bond-stretch constraint Eq. [95] is formulated instead in terms of the bond-stretch internal coordinate and the SHAKE bond-stretch displacements are given by Eq. [100], a singularity occurs in Eq. [99] if the Wilson vectors of the preceding time step configuration become orthogonal to the corresponding Wilson vectors of the current time step old configuration. In the present case, where the bond-stretch constraint has the form of Eq. [95] and the SHAKE displacements are given by Eq. [101], a singularity occurs in Eq. [99] if the vectors $[\mathbf{r}_j(t_0) - \mathbf{r}_i(t_0)]$ and $[\mathbf{r}_j^{\mathrm{old}}(t_0+\delta t) - \mathbf{r}_i^{\mathrm{old}}(t_0+\delta t)]$ are orthogonal.
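The correspondence between the two bond-stretch formulations can be checked numerically. The sketch below is an illustration only: `shake_bond`, its `squared` flag, the geometry, masses, and time step are hypothetical, and the linearized SHAKE update is assumed to have the form $\gamma = \sigma(\mathbf{r}^{\mathrm{old}})/([\delta t]^2\sum_i m_i^{-1}\nabla_i\sigma(\mathbf{r}^{\mathrm{old}})\cdot\nabla_i\sigma(t_0))$. It iterates one bond to convergence with the constraint written either as $|\mathbf{r}_{ij}|^2 - d^2$ or as $|\mathbf{r}_{ij}| - d$ and accumulates the multipliers.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def shake_bond(r0, r_unc, masses, d, dt, squared, tol=1e-13, max_iter=100):
    """Constrain one bond; `squared` selects sigma = r^2 - d^2 instead of sigma = r - d."""
    mi, mj = masses
    b0 = [q - p for p, q in zip(r0[0], r0[1])]          # r_j(t0) - r_i(t0)
    n0 = math.sqrt(dot(b0, b0))
    # Gradient of sigma with respect to r_j at t0 (the gradient for r_i is its negative).
    gj0 = [2.0 * x for x in b0] if squared else [x / n0 for x in b0]
    ri, rj = list(r_unc[0]), list(r_unc[1])
    gamma_total = 0.0
    for _ in range(max_iter):
        b = [q - p for p, q in zip(ri, rj)]
        nb = math.sqrt(dot(b, b))
        sigma = nb * nb - d * d if squared else nb - d
        if abs(sigma) < tol:
            return ri, rj, gamma_total
        gj_old = [2.0 * x for x in b] if squared else [x / nb for x in b]
        denom = dt * dt * (1.0 / mi + 1.0 / mj) * dot(gj_old, gj0)
        gamma = sigma / denom
        gamma_total += gamma
        for c in range(3):
            rj[c] -= dt * dt / mj * gamma * gj0[c]
            ri[c] += dt * dt / mi * gamma * gj0[c]
    raise RuntimeError("no convergence")

r0 = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]         # bond length d = 1 at t0
r_unc = [[-0.05, 0.1, 0.0], [1.1, -0.2, 0.05]]
masses = (12.0, 1.0)
ri_sq, rj_sq, g_sq = shake_bond(r0, r_unc, masses, 1.0, 0.5, squared=True)
ri_int, rj_int, g_int = shake_bond(r0, r_unc, masses, 1.0, 0.5, squared=False)
print("accumulated multipliers:", g_sq, g_int)
```

Both formulations displace the atoms along the $t_0$ bond direction and arrive at the same constrained positions, so the accumulated multipliers differ only by the factor $2|\mathbf{r}_j(t_0) - \mathbf{r}_i(t_0)|$ mentioned above.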
Angle-Bend Constraints
Again, consider the system of N particles and $l$ general holonomic constraints, and assume that $l_a$ bond-angle constraints ($l_a \le l$) are present. For bond-angle constraints, the general holonomic constraint, Eq. [1], takes the special form
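where a, b, and c are the three particles involved in the particular constraint $\sigma_k$, $\phi_{abc} \equiv \arccos(\hat{\mathbf{r}}_{ab}\cdot\hat{\mathbf{r}}_{cb})$ is the angle at b formed by the abc triplet of particles, $\hat{\mathbf{r}}_{ab} \equiv \mathbf{r}_{ab}/|\mathbf{r}_{ab}|$, $\mathbf{r}_{ab} \equiv \mathbf{r}_a - \mathbf{r}_b$, and $\alpha_{abc}$ is the constant constraint angle-bend value.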
The Matrix Method Solution
We insert the bond-angle constraint $\sigma_k(\{\mathbf{r}\})$ of Eq. [102] into Eq. [57] to obtain
where the subset of particles $n_k$ is the (abc) triplet of particles in the bond-angle constraint, Eq. [102]. For every $\sigma_k(\{\mathbf{r}\})$ with ($k = 1, \ldots, l_a$), a corresponding equation analogous to Eq. [103] holds. The left-hand side of Eq. [103] is an infinite series, and all nonlinear terms higher than quadratic are neglected in the context of the matrix method. In general, the angle-bend constraint in Eq. [103] is coupled to other forms of holonomic constraints sharing one or more of its (abc) triplet of particles. In particular, $\sigma_{k'}$ and $\sigma_{k''}$ in Eq. [103] can be bond-stretch, other bond-angle, or torsional constraints. The expressions for $\nabla_i\phi_{abc}$ are available from the Wilson vectors for the angle-bend internal coordinate.62 The expressions for $\nabla_i\nabla_j\phi_{abc}$ are also needed to evaluate the dyadic double-dot product in Eq. [103]. From Eq. [103], the matrix elements in Eq. [58] of the matrix method are
($k = 1, \ldots, l_a$) [104]
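The SHAKE Solution
The angle-bend constraint $\sigma_k(\{\mathbf{r}\})$ of Eq. [102] is inserted into Eq. [66] to obtain
$$ \mathbf{r}_i^{\mathrm{new}}(t_0+\delta t) = \mathbf{r}_i^{\mathrm{old}}(t_0+\delta t) - \frac{[\delta t]^2}{m_i}\,\gamma_k^{\mathrm{new}}\,[\nabla_i\phi_{abc}](t_0) \quad (i = a, b, c) \qquad [105] $$
and Eq. [69] becomes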
where the subset of particles $n_k$ is the abc triplet in the angle-bend constraint Eq. [102]. For the bond-angle constraint, the left-hand side of Eq. [68] is an infinite series, and the approximation underlying Eq. [106] neglects all the nonlinear terms. The expressions for $\nabla_i\phi_{abc}$ needed in Eqs. [105] and [106] are available from the Wilson vectors for the angle-bend internal coordinate.62 As for the case of bond-stretch constraints, a physical picture of SHAKE for satisfying bond-angle constraints is given here. From Eqs. [105] and [77], one finds
where $\mathbf{s}_{ki}(t_0)$ ($i = a, b, c$) are the Wilson vectors for the angle-bend internal coordinate.62 From Eq. [107] it is seen that the SHAKE displacements of the angle-bend atoms are in the directions of the corresponding Wilson vectors. In particular, $\delta\mathbf{r}_a$ and $\delta\mathbf{r}_c$ are perpendicular to the directions of the bonds ab and cb, at $t_0$, respectively. The displacement $\delta\mathbf{r}_b$ is in the direction of $-(\mathbf{s}_{ka} + \mathbf{s}_{kc})$. Figure 1 shows the orientations of the displacement vectors $\delta\mathbf{r}_a$, $\delta\mathbf{r}_b$, and $\delta\mathbf{r}_c$ relative to the molecular configuration at the preceding time step $t_0$, for the case of an angle-bend constraint. A singularity occurs in Eq. [106] if the bond-angle Wilson vectors of the preceding time step configuration become orthogonal to the corresponding bond-angle Wilson vectors of the current time step old configuration.
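The geometric statements about the angle-bend Wilson vectors can be verified with a short sketch. It uses the standard closed-form gradients of $\phi_{abc} = \arccos(\hat{\mathbf{r}}_{ab}\cdot\hat{\mathbf{r}}_{cb})$, namely $\mathbf{s}_a = (\cos\phi\,\hat{\mathbf{r}}_{ab} - \hat{\mathbf{r}}_{cb})/(|\mathbf{r}_{ab}|\sin\phi)$, the analogous expression for $\mathbf{s}_c$, and $\mathbf{s}_b = -(\mathbf{s}_a + \mathbf{s}_c)$, and compares them with central finite differences. The function names and the test geometry are hypothetical choices made for this illustration.

```python
import math

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def angle(ra, rb, rc):
    """Angle at b formed by the abc triplet."""
    u, v = sub(ra, rb), sub(rc, rb)
    return math.acos(dot(u, v) / (norm(u) * norm(v)))

def angle_s_vectors(ra, rb, rc):
    """Analytic gradients of the angle with respect to r_a, r_b, r_c."""
    u, v = sub(ra, rb), sub(rc, rb)
    ru, rv = norm(u), norm(v)
    eu, ev = [x / ru for x in u], [x / rv for x in v]
    cos_phi = dot(eu, ev)
    sin_phi = math.sqrt(1.0 - cos_phi * cos_phi)
    s_a = [(cos_phi * x - y) / (ru * sin_phi) for x, y in zip(eu, ev)]
    s_c = [(cos_phi * y - x) / (rv * sin_phi) for x, y in zip(eu, ev)]
    s_b = [-(x + y) for x, y in zip(s_a, s_c)]
    return [s_a, s_b, s_c]

def numeric_gradient(points, atom, h=1e-6):
    """Central-difference gradient of the angle with respect to one atom."""
    g = []
    for c in range(3):
        plus = [list(p) for p in points]
        minus = [list(p) for p in points]
        plus[atom][c] += h
        minus[atom][c] -= h
        g.append((angle(*plus) - angle(*minus)) / (2.0 * h))
    return g

points = [[0.9, 0.1, 0.2], [0.0, 0.0, 0.0], [-0.3, 0.95, -0.1]]  # a, b (apex), c
s_vectors = angle_s_vectors(*points)
max_error = max(
    abs(x - y)
    for atom in range(3)
    for x, y in zip(s_vectors[atom], numeric_gradient(points, atom))
)
print("largest deviation from finite differences:", max_error)
```

The end-atom vectors are perpendicular to their bonds by construction, and the apex vector is the negative sum of the other two, consistent with the displacement directions described for Figure 1.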
Torsional Constraints
Consider the system of N particles and $l$ general holonomic constraints, and assume that $l_t$ torsional constraints ($l_t \le l$) are present. For torsional constraints, the general holonomic constraint, Eq. [1], takes the special form
$$ \sigma_k(\{\mathbf{r}\}) = \tau_{abcd}(\{\mathbf{r}\}) - \beta_{abcd} = 0 \quad (k = 1, \ldots, l_t) \qquad [108] $$
where a, b, c, and d are the four particles involved in the particular constraint $\sigma_k$, and
is the dihedral angle formed by the abcd quadruplet of particles; $\beta_{abcd}$ is its constant constraint value.
The Matrix Method Solution
When the torsional constraint $\sigma_k(\{\mathbf{r}\})$ of Eq. [108] is inserted into Eq. [57], we find
Figure 1 SHAKE displacements for angle-bend constraint. The displacements are shown relative to the molecular configuration at the preceding time step $t_0$ and have the (negative) directions of the Wilson vectors for an angle-bend internal coordinate. The displacements of the end atoms are perpendicular to the bond directions at $t_0$, and the displacement of the apex atom points in the direction of the sum of their Wilson vectors.
where the subset of particles $n_k$ is the abcd quadruplet of particles in the torsional constraint, Eq. [108]. For every $\sigma_k(\{\mathbf{r}\})$ with ($k = 1, \ldots, l_t$), a corresponding equation analogous to Eq. [110] holds. The left-hand side of Eq. [110] is an infinite series, and all nonlinear terms higher than quadratic are neglected in the context of the matrix method. In general, the torsional constraint in Eq. [110] is coupled to other forms of holonomic constraints sharing one or more of its abcd quadruplet of particles. In particular, $\sigma_{k'}$ and $\sigma_{k''}$ in Eq. [110] can be bond-stretch, angle-bend, or other torsional constraints. The expressions for $\nabla_i\tau_{abcd}$ are available from the Wilson vectors for the torsional internal coordinate.62 The expressions for $\nabla_i\nabla_j\tau_{abcd}$ are also needed to evaluate the dyadic double-dot product in Eq. [110]. From Eq. [110], the matrix elements in Eq. [58] of the matrix method are
($k = 1, \ldots, l_t$) [111]
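The SHAKE Solution
When the torsional constraint $\sigma_k(\{\mathbf{r}\})$ of Eq. [108] is inserted into Eq. [66], we find
$$ \mathbf{r}_i^{\mathrm{new}}(t_0+\delta t) = \mathbf{r}_i^{\mathrm{old}}(t_0+\delta t) - \frac{[\delta t]^2}{m_i}\,\gamma_k^{\mathrm{new}}\,[\nabla_i\tau_{abcd}](t_0) \quad (i = a, b, c, d) \qquad [112] $$
and Eq. [69] becomes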
where the subset of particles $n_k$ is the abcd quadruplet in the torsional constraint equation (Eq. [108]). For the torsional constraint, the left-hand side of Eq. [68] is an infinite series, and the approximation underlying Eq. [113] neglects all the nonlinear terms. The expressions for $\nabla_i\tau_{abcd}$ needed in Eqs. [112] and [113] are available from the Wilson vectors for the torsional internal coordinate.62 Again, a physical picture of SHAKE for satisfying torsional constraints is given here. From Eqs. [112] and [77], we write
where $\mathbf{s}_{ki}(t_0)$ ($i = a, b, c, d$) are the Wilson vectors for the torsional internal coordinate.62 From Eq. [114] it is seen that the SHAKE displacements of the atoms involved in torsion are in the directions of the corresponding Wilson vectors.
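A single torsional constraint can be iterated in the same way. The following sketch is an illustration under explicit assumptions: the dihedral angle is computed with a common `atan2` convention whose sign may differ from the one used in the text, the analytic Wilson vectors are replaced by central finite-difference gradients purely to keep the example short, and the names `dihedral`, `gradients`, and `shake_torsion`, together with the geometry and masses, are hypothetical.

```python
import math

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1], u[2] * v[0] - u[0] * v[2], u[0] * v[1] - u[1] * v[0]]

def dihedral(pos):
    """Signed dihedral angle of the abcd quadruplet (atan2 convention, assumed)."""
    ra, rb, rc, rd = pos
    b1, b2, b3 = sub(rb, ra), sub(rc, rb), sub(rd, rc)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    b2_len = math.sqrt(dot(b2, b2))
    m1 = cross(n1, [x / b2_len for x in b2])
    return math.atan2(dot(m1, n2), dot(n1, n2))

def gradients(pos, h=1e-6):
    """Finite-difference stand-in for the torsional Wilson vectors."""
    grads = []
    for a in range(4):
        g = []
        for c in range(3):
            plus = [list(p) for p in pos]
            minus = [list(p) for p in pos]
            plus[a][c] += h
            minus[a][c] -= h
            g.append((dihedral(plus) - dihedral(minus)) / (2.0 * h))
        grads.append(g)
    return grads

def shake_torsion(r0, r_unc, masses, beta, dt, tol=1e-10, max_iter=50):
    """Iterate sigma = dihedral - beta with displacement directions fixed at t0."""
    g0 = gradients(r0)
    pos = [list(p) for p in r_unc]
    for n_corrections in range(max_iter):
        sigma = dihedral(pos) - beta
        if abs(sigma) < tol:
            return pos, n_corrections
        g_old = gradients(pos)
        denom = dt * dt * sum(dot(g_old[a], g0[a]) / masses[a] for a in range(4))
        gamma = sigma / denom
        for a in range(4):
            for c in range(3):
                pos[a][c] -= dt * dt / masses[a] * gamma * g0[a][c]
    raise RuntimeError("no convergence")

tau0 = math.radians(60.0)
r0 = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1.5],
      [math.cos(tau0), math.sin(tau0), 1.5]]
beta = dihedral(r0)  # the constraint is satisfied at t0 by construction
tau1 = math.radians(75.0)
r_unc = [[1.02, 0.05, -0.03], [0.0, -0.02, 0.01], [0.03, 0.0, 1.52],
         [math.cos(tau1), math.sin(tau1), 1.47]]
masses = [12.0, 12.0, 12.0, 12.0]
pos, n_corrections = shake_torsion(r0, r_unc, masses, beta, 1.0)
print("corrections:", n_corrections)
```

The sketch avoids dihedral values near $\pm\pi$, where the `atan2` branch cut would require wrapping the difference between the angle and its target.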
ANGLE CONSTRAINT VERSUS TRIANGULATION
In the preceding section, we discussed the use of angle constraints to impose bond-angle constraints with either the matrix method or SHAKE. To date, the standard approach of constraining bond angles has been to introduce a (fictitious) bond-stretch constraint on the side facing the angle to be constrained, in addition to bond-stretch constraints on both sides of the angle. This procedure, known as triangulation, is of course applicable with either the matrix method or SHAKE. Here we compare the triangulation and angle-constraint approaches in the context of the SHAKE algorithm. There are two fundamental differences between these two procedures:
1. Domain of applicability
As mentioned above, triangulation is feasible only if the bond stretches on the sides of the bond angle being constrained are also constrained. In contrast, the angle-constraint procedure can be applied independently of other constraints. Thus, triangulation entails total rigidity of the triangle formed by the two sides of the angle to be constrained and the side facing it, whereas an angle constraint imposes no restrictions on the sides of the constrained bond angle (e.g., the sides can remain flexible). Hence the angle constraint is a more general and widely applicable procedure than triangulation.
2. Convergence rate
It is well known1'9l5 that when triangulation is used to enforce angle-bend constraints, the iterations in the SHAKE algorithm converge very slowly (i.e., a large number of iterations is required for convergence). Convergence is slow in such cases because each correction for the constrained bond angle, through adjustment of the triangulation side, distorts the other two bond-stretch constraints, and a relatively large number of iterations is required for all constraints to converge. On the other hand, the angle-constraint SHAKE displacements have a smaller effect on the other two bond-stretch constraints. Therefore, angle-constraint SHAKE is expected to converge more rapidly than triangulation SHAKE. This conclusion is borne out dramatically for the SPC/E rigid water model64 in Table 1, where we compare the average numbers of SHAKE iterations to convergence in MD runs using triangulation and angle-constraint procedures.
Point 1 clearly holds for both the matrix method and SHAKE. Because the matrix method is also an iterative approach (albeit a coupled one), the essence of the remarks of point 2 also holds for the matrix method. As stated in point 1, constraining an angle bend with an angle constraint is still applicable when total rigidity is unacceptable, and triangulation is then not a feasible

Table 1. Comparison of Angle-Constraint and Triangulation Procedures for Bond-Angle Constraints Using the SHAKE Algorithm on 13,824 SPC/E Rigid Water Molecules(a)

Constraint Approach    Average Number of SHAKE Iterations    Average SHAKE CPU Time (s)
Triangulation          23.2                                  2.43
Angle constraint       4.5                                   1.05

(a) The SHAKE tolerance for the bond-stretch constraints is the same in both procedures [(sf) = ] and the bond-angle tolerance in the angle-constraint approach is obtained from Eq. [119] [(af) = ]. The accuracies of the two approaches are comparable for the adopted values of tolerances. SPC/E designates a modified simple point charge model.

option. Even when total rigidity is acceptable or desirable, however, the angleconstraint procedure is more efficient than triangulation. Hammonds and R y ~ k a e r suggested t~~ using a dot product constraint on the side bonds because it more rapidly converges and is more efficient than triangulation. Although a dot product constraint is an improvement in efficiency over triangulation (as is also the angle-constraint procedure), it suffers from the same drawback as triangulation, namely, a narrow domain of applicability. A dot product constraint entails the use of rigid side bonds. Only the angleconstraint procedure provides an efficient and generally applicable way of constraining bond angles. It was noted in point 2 above that angle-constraint SHAKE is more rapidly convergent than triangulation SHAKE. However, the SHAKE correction for an angle constraint is more complex, hence more computationally expensive, than the corresponding triangulation correction. Therefore, for the angle-constraint procedure to be more efficient than triangulation in situations permitting triangulation, the angle-constraint SHAKE routine must be more efficient than its triangulation counterpart. This is the case for the SPEK rigid water in Table 1.To obtain an accurate assessment of the efficiency of triangulation SHAKE versus that of angle-constraint SHAKE, their speeds must be compared for results of similar accuracy. This requires equivalent convergence tolerances on the triangulation side and the constrained angle. Therefore a relation is needed for translating a SHAKE tolerance for the triangulation side into the equivalent tolerance for the constrained angle. From the law of cosines, referring to Figure 1 and denoting the sides opposite the corners a, 6, and c, by A, B , and C, respectively, we have
$$B^2 = A^2 + C^2 - 2AC\cos\phi \qquad [115]$$
Differentiating both sides of Eq. [115] with respect to $\phi$, and recalling that both A and C are held constant, we have

$$(dB^2) = 2AC\sin\phi\,(d\phi) \qquad [116]$$
Consistent with the adopted form of the bond-stretch constraint, Eq. [95], convergence for the triangulation bond-stretch constraint is reached when $(dB^2) \le (st)$, where $(st)$ is the SHAKE bond-stretch tolerance. Similarly, consistent with the adopted form of the bond-angle constraint, Eq. [102], convergence for the angle constraint is reached when $(d\phi) \le (at)$, where $(at)$ is the SHAKE angle-bend tolerance. Using Eq. [116], the bond-stretch convergence criterion leads to

$$(d\phi) \le \frac{(st)}{2AC\sin\phi} \qquad [117]$$
Comparison of Eq. [117] with the angle-constraint convergence condition leads to the following relation between the triangulation bond-stretch tolerance and the corresponding angle tolerance:

$$(at) = \frac{(st)}{2AC\sin\phi} \qquad [118]$$
Equation [118] translates the tolerance imposed on the triangulation bond stretch into the equivalent tolerance on the corresponding angle and hence can be used for an accurate comparison of the triangulation and angle-constraint approaches. The tolerances for bond-stretch and angle-bend constraints are usually expressed as multiples of the constraint bond-stretch and bond-angle values, respectively: $(st) = (sf) \times B^2$ and $(at) = (af) \times \phi$, where $(sf)$ is the stretch factor or relative bond-stretch tolerance and $(af)$ is the angle-bend factor or relative angle-bend tolerance. Substitution of the tolerance expressions into Eq. [118] yields an analogous relation between the tolerance factors:

$$(af) = \frac{(sf)\,B^2}{2AC\,\phi\sin\phi} \qquad [119]$$
Assuming a commonly used42 bond-stretch tolerance factor $(sf) = 10^{-6}$ for a molecular model with specific geometrical parameters, Eq. [119] furnishes the angle tolerance factor needed for equal accuracy runs of triangulation and angle-constraint SHAKE.

If the form of the bond-stretch constraint is chosen to be in terms of the magnitude of the bond stretch rather than its square (the form adopted in Eq. [95]), the convergence for the triangulation bond stretch is reached when $(dB) \le (sf)' \times B$, where $(sf)'$ is the bond-stretch magnitude tolerance factor or relative bond-stretch magnitude tolerance. A relation between $(sf)$ and $(sf)'$ is easily obtained. The tolerance on the bond stretch

$$(dB^2) \le (sf) \times B^2 \qquad [120]$$

leads, because $(dB^2) = 2B\,(dB)$, to an equivalent tolerance on the bond-stretch magnitude

$$(dB) \le \frac{(sf)}{2} \times B \qquad [121]$$
Comparison of Eq. [121] with the tolerance condition on the magnitude of the bond stretch yields $(sf)' = (sf)/2$. When this relation is inserted in Eq. [119], the following relation between the tolerance factors of angle-bend and bond-stretch magnitude results:

$$(af) = \frac{(sf)'\,B^2}{AC\,\phi\sin\phi} \qquad [122]$$
To date, the traditional method for treating rigid water models using constraint dynamics (i.e., excluding methods of rigid body dynamics, such as quaternion approaches) has been through triangulation, with constraint of a fictitious H-H bond stretch. We show that the angle-constraint approach is substantially more efficient than triangulation for the SPC/E rigid water. When the geometrical parameters of SPC/E water (see Figure 1) are used, the parameters in Eq. [119] are as follows: $A = C = [\text{H-O}] = 1.0$ Å, $B = [\text{H-H}] = 1.63$ Å, and $\phi = [\text{HOH}] = 109.4^\circ$. With these values, and with $\phi$ expressed in radians, Eq. [119] yields $(af) \approx 0.74\,(sf)$, hence $(af) \approx 0.74 \times 10^{-6}$ for SPC/E water.

Table 1 compares the average number of SHAKE iterations and the average SHAKE CPU time for 13,824 SPC/E water molecules over 200 MD time steps with a time step $\delta t = 1$ fs. Calculations were carried out on an IBM RS6000 workstation. Consistent with the discussion in point 2 above, the middle column of Table 1 shows that the average number of angle-constraint SHAKE iterations needed for convergence is dramatically smaller (more than a factor of 5 fewer iterations) than the average number of triangulation SHAKE iterations. The last column in Table 1 shows the angle-constraint SHAKE to be roughly 2.5 times faster than the triangulation SHAKE for SPC/E water. Use of Eq. [119] made the tests of the two approaches of comparable accuracies.
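The tolerance translation can be checked numerically. The following is a minimal sketch, assuming that Eq. [119] has the form $(af) = (sf)\,B^2 / (2AC\,\phi\sin\phi)$ that follows from Eqs. [116] to [118], with $\phi$ in radians. The function name `angle_tolerance_factor` and its arguments are illustrative and do not come from the chapter.

```python
import math


def angle_tolerance_factor(sf, A, C, phi_deg):
    """Relative angle tolerance (af) matching a relative tolerance (sf) on B^2.

    Assumes (st) = (sf) * B^2 and (at) = (af) * phi, with phi in radians.
    """
    phi = math.radians(phi_deg)
    # Law of cosines, Eq. [115]: squared length of the triangulation side.
    B2 = A * A + C * C - 2.0 * A * C * math.cos(phi)
    # Assumed form of Eq. [119].
    return sf * B2 / (2.0 * A * C * phi * math.sin(phi))


# SPC/E geometry quoted in the text: A = C = 1.0 angstrom, phi = 109.4 degrees.
sf = 1.0e-6
af = angle_tolerance_factor(sf, A=1.0, C=1.0, phi_deg=109.4)
B = math.sqrt(2.0 - 2.0 * math.cos(math.radians(109.4)))

print(f"B = {B:.3f} angstrom, (af)/(sf) = {af / sf:.3f}")
```

For this geometry the ratio $(af)/(sf)$ comes out near 0.74, and the law of cosines reproduces the quoted H-H distance of about 1.63 Å.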
USING THE METHOD OF UNDETERMINED PARAMETERS WITH THE VELOCITY VERLET INTEGRATION ALGORITHM

Whereas the basic Verlet algorithm has many advantages, it suffers from a number of drawbacks:

1. A term of $O(\delta t^2)$ is added to terms of $O(\delta t^0)$ in Eq. [48] in the computation of the unconstrained coordinates $\{\mathbf{r}'(t_0 + \delta t)\}$, leading eventually to numerical precision problems.

2. The velocities of the atoms are not computed by the algorithm; rather, they are obtained from

$$\dot{\mathbf{r}}_i(t_0) = \frac{\mathbf{r}_i(t_0 + \delta t) - \mathbf{r}_i(t_0 - \delta t)}{2\,\delta t} \qquad [123]$$

The velocities at time $t_0$ can be obtained only after the coordinates at time $(t_0 + \delta t)$ are available, resulting in difficulties in implementing the basic Verlet algorithm in constant pressure, constant temperature, and nonequilibrium MD simulations. Whereas the error in Eq. [48] is of $O(\delta t^4)$ (the local error), the error in Eq. [123] is $O(\delta t^2)$, and more accurate velocities become expensive to obtain.

3. Finally, the algorithm is not self-starting, and two successive initial configurations of the system must be provided to start the integration, as seen from Eq. [48].

To avoid these shortcomings of the basic Verlet algorithm, Andersen53 incorporated the velocity Verlet algorithm52 into the method of undetermined parameters. Unlike the basic Verlet scheme, the velocity Verlet algorithm involves two stages:

1. Determine the positions by:

$$\mathbf{r}_i(t_0 + \delta t) = \mathbf{r}_i(t_0) + \delta t\,\dot{\mathbf{r}}_i(t_0) + \frac{\delta t^2}{2m_i}\,\mathbf{F}_i(t_0) \qquad [124]$$
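2. Determine the velocities by:

$$\dot{\mathbf{r}}_i(t_0 + \delta t) = \dot{\mathbf{r}}_i(t_0) + \frac{\delta t}{2m_i}\left[\mathbf{F}_i(t_0) + \mathbf{F}_i(t_0 + \delta t)\right] \qquad [125]$$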
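The two stages can be sketched in Python as follows. This is an illustrative, unconstrained example that assumes the standard velocity Verlet update (positions from the positions, velocities, and forces at $t_0$; velocities from the average of the old and new forces). The one-dimensional harmonic oscillator, the function names, and the parameter values are assumptions chosen only to make the sketch runnable.

```python
import math


def velocity_verlet_step(r, v, f, mass, dt, force):
    """Advance one particle coordinate by one time step dt."""
    # Stage 1: new position from position, velocity, and force at t0.
    r_new = r + dt * v + 0.5 * dt * dt * f / mass
    # The force at t0 + dt needs only the new position.
    f_new = force(r_new)
    # Stage 2: new velocity from the average of old and new forces.
    v_new = v + 0.5 * dt * (f + f_new) / mass
    return r_new, v_new, f_new


def harmonic_force(r, k=1.0):
    """Force of a unit-stiffness spring (hypothetical test system)."""
    return -k * r


def run(n_steps=1000, dt=0.01):
    """Integrate from r = 1, v = 0 and report start and end energies."""
    r, v, mass = 1.0, 0.0, 1.0
    f = harmonic_force(r)  # self-starting: only r(t0) and v(t0) are needed
    e_start = 0.5 * mass * v * v + 0.5 * r * r
    for _ in range(n_steps):
        r, v, f = velocity_verlet_step(r, v, f, mass, dt, harmonic_force)
    e_end = 0.5 * mass * v * v + 0.5 * r * r
    return r, v, e_start, e_end


r_end, v_end, e_start, e_end = run()
print(f"r(10) = {r_end:.5f}  exact = {math.cos(10.0):.5f}")
print(f"energy drift = {e_end - e_start:.2e}")
```

The sketch starts from a single configuration and its velocities, and the velocity at each step is available as soon as the new force has been evaluated, which are the two practical differences from the basic Verlet scheme noted in the text.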
In the first stage, the positions at time $(t_0 + \delta t)$ are calculated from the positions and velocities at time $t_0$, as given by Eq. [124]. With the positions at time $(t_0 + \delta t)$ available, the forces at time $(t_0 + \delta t)$ can be computed, for use in the second stage, to evaluate the velocities at time $(t_0 + \delta t)$ by means of Eq. [125]. The velocity Verlet algorithm, Eqs. [124] and [125], is algebraically equivalent to the basic Verlet algorithm, Eqs. [48] and [123]. Both schemes have a global error of $O(\delta t^2)$. However, the local error in Eqs. [124] and [125] is of $O(\delta t^3)$, in contrast to the local error of $O(\delta t^4)$ in Eq. [48] and of $O(\delta t^2)$ in Eq. [123]. The velocity Verlet algorithm evidently eliminates the three drawbacks of the basic Verlet scheme. In addition, both algorithms require $9 \times N$ words of memory storage for N particles.

As in the basic Verlet scheme, the highest derivative with respect to time of the coordinates in the velocity Verlet algorithm is of second order. Therefore, the coordinate equations of the method of undetermined parameters with the velocity Verlet algorithm are
which follows from Eq. [38] with $s_{\max} = 0$, and where, using Eq. [124], we have
The parameters $\{\gamma\}$ are again chosen such that the coordinates at time $t_0 + \delta t$ satisfy the constraint equations, and either the matrix method or SHAKE can be used to obtain the $\{\gamma\}$. The velocity equations are
where, using Eq. [125], we have
The parameters $\{\eta\}$ are chosen such that the velocities at time $t_0 + \delta t$ satisfy the constraint equations. Taking the derivative with respect to time of the constraint equation for $\sigma_k$, and using Eq. [4], we get
Either matrix inversion or SHAKE can be used to solve the set of $l$ linear equations Eq. [130] for the $\{\eta\}$. As discussed before, solution for the $\{\gamma\}$ and $\{\eta\}$ by matrix techniques becomes computationally expensive for systems with large numbers of coupled constraints, so the following presentation is confined to the solution by the SHAKE procedure. Andersen53 discusses why the velocity Verlet algorithm cannot be incorporated into the method of undetermined parameters in as straightforward a manner as can the basic Verlet algorithm.
RATTLE for General Holonomic Constraints

The solution for the $\{\gamma\}$ and $\{\eta\}$ using SHAKE was termed RATTLE by Andersen.53 The material here is new in the sense that the original formulation of RATTLE was for bond-stretch constraints only.53 The treatment is generalized below to cover general holonomic constraints. Following this, our general formulation is specialized to the internal coordinate constraints important for MD simulations. In particular, RATTLE for bond-stretch constraints is reviewed, and treatment of bond-angle and torsional constraints is introduced.

The first stage of RATTLE is analogous to SHAKE. The new positions of the particles $\{\mathbf{r}_i^{\text{new}}(t_0 + \delta t)\}$ obtained in the current iteration are
where the starting value of $\mathbf{r}_i^{\text{old}}(t_0 + \delta t)$ is given by Eq. [127]. The new positions should satisfy the constraint equation for $\sigma_k$, Eq. [67], and following the same arguments we get
As with SHAKE, iterations over the constraints continue until all are satisfied, within a certain tolerance. When all constraints have been satisfied and the coordinates at $(t_0 + \delta t)$ are available, the forces $\{\mathbf{F}(t_0 + \delta t)\}$ can be computed for use in the second stage of RATTLE. In this second stage, the new velocities of the particles $\{\dot{\mathbf{r}}_i^{\text{new}}(t_0 + \delta t)\}$ obtained in the current iteration are

$$\dot{\mathbf{r}}_i^{\text{new}}(t_0 + \delta t) = \dot{\mathbf{r}}_i^{\text{old}}(t_0 + \delta t) - \frac{[\delta t]}{2m_i}\,\eta_k^{\text{new}}\,[\nabla_i \sigma_k](t_0 + \delta t) \qquad (i = 1, \ldots, n_k) \qquad [133]$$

where the starting value of $\dot{\mathbf{r}}_i^{\text{old}}(t_0 + \delta t)$ is given by Eq. [129]. The new velocities should satisfy the derivative with respect to time of the constraint equation. Inserting Eq. [133] into Eq. [130] gives
Again, iteration over all constraints continues until the constraints on the velocities have been satisfied within a selected tolerance. The entire procedure is then repeated at the next time step.
Application to Bond-Stretch, Angle-Bend, and Torsional Constraints

As with SHAKE, RATTLE is illustrated in this chapter for the internal coordinate constraints of bond stretch, angle bend, and torsion. For bond-stretch constraints and the first stage of RATTLE, inserting the constraint $\sigma_k(\{\mathbf{r}\})$ of Eq. [95] into Eq. [131] yields
and Eq. [132] reduces to
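$(k = 1, \ldots, l)$ [137]

Equations [136] and [137] are consistent with the corresponding equations in Appendix C of Reference 53. For the second stage of RATTLE, Eq. [133] becomes

$$\dot{\mathbf{r}}_i^{\text{new}}(t_0 + \delta t) = \dot{\mathbf{r}}_i^{\text{old}}(t_0 + \delta t) - \frac{1}{m_i}[\delta t]\,\eta_k^{\text{new}}\,[\mathbf{r}_i(t_0 + \delta t) - \mathbf{r}_j(t_0 + \delta t)]$$

$$\dot{\mathbf{r}}_j^{\text{new}}(t_0 + \delta t) = \dot{\mathbf{r}}_j^{\text{old}}(t_0 + \delta t) - \frac{1}{m_j}[\delta t]\,\eta_k^{\text{new}}\,[\mathbf{r}_j(t_0 + \delta t) - \mathbf{r}_i(t_0 + \delta t)] \qquad [138]$$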
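and Eq. [135] reduces to

$$\eta_k^{\text{new}} = \frac{[\delta t]^{-1}\,[\mathbf{r}_i(t_0 + \delta t) - \mathbf{r}_j(t_0 + \delta t)] \cdot [\dot{\mathbf{r}}_i^{\text{old}}(t_0 + \delta t) - \dot{\mathbf{r}}_j^{\text{old}}(t_0 + \delta t)]}{[(1/m_i) + (1/m_j)]\,[\mathbf{r}_i(t_0 + \delta t) - \mathbf{r}_j(t_0 + \delta t)]^2} \qquad (k = 1, \ldots, l) \qquad [139]$$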
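The second (velocity) stage for bond-stretch constraints can be sketched as below. This is a minimal illustration rather than the chapter's implementation: it assumes that the correction to each pair of velocities is $\mp(\delta t/m)\,\eta_k\,\mathbf{r}_{ij}$ with $\eta_k = \mathbf{r}_{ij}\cdot\dot{\mathbf{r}}_{ij} / \{\delta t\,[(1/m_i) + (1/m_j)]\,r_{ij}^2\}$, as in Eqs. [138] and [139]. The function name, the absolute convergence test on $\mathbf{r}_{ij}\cdot\dot{\mathbf{r}}_{ij}$, the masses, and the trial velocities are all assumptions made for the example.

```python
import math


def rattle_velocities(r, v, m, bonds, dt, tol=1e-10, max_sweeps=500):
    """Velocity stage of RATTLE for bond-stretch constraints (sketch).

    r, v  : lists of [x, y, z] positions and velocities at t0 + dt
    m     : list of masses
    bonds : list of (i, j) index pairs, one per constraint
    Velocities are corrected in place; returns the number of sweeps used.
    """
    for sweep in range(1, max_sweeps + 1):
        converged = True
        for i, j in bonds:
            rij = [a - b for a, b in zip(r[i], r[j])]
            vij = [a - b for a, b in zip(v[i], v[j])]
            r_dot_v = sum(a * b for a, b in zip(rij, vij))
            if abs(r_dot_v) <= tol:  # assumed convergence criterion
                continue
            converged = False
            r2 = sum(a * a for a in rij)
            # Undetermined parameter for this constraint (form of Eq. [139]).
            eta = r_dot_v / (dt * (1.0 / m[i] + 1.0 / m[j]) * r2)
            # Velocity corrections along the bond (form of Eq. [138]).
            for x in range(3):
                v[i][x] -= dt * eta * rij[x] / m[i]
                v[j][x] += dt * eta * rij[x] / m[j]
        if converged:
            return sweep
    raise RuntimeError("velocity constraints did not converge")


# Hypothetical triangulated water-like molecule: O at the origin, two H atoms.
half = math.radians(109.47) / 2.0
r = [[0.0, 0.0, 0.0],
     [math.sin(half), math.cos(half), 0.0],
     [-math.sin(half), math.cos(half), 0.0]]
v = [[0.1, -0.2, 0.05], [0.3, 0.1, -0.4], [-0.2, 0.25, 0.15]]
m = [16.0, 1.0, 1.0]
bonds = [(0, 1), (0, 2), (1, 2)]

sweeps = rattle_velocities(r, v, m, bonds, dt=0.001)
print("sweeps used:", sweeps)
```

Because the constraints share atoms, correcting one bond disturbs the others, so the sweep over constraints is repeated until every relative velocity is perpendicular to its bond within the tolerance. Each correction is equal and opposite in momentum, so the total linear momentum is unchanged.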
Because the coordinates at $(t_0 + \delta t)$ from the first stage satisfy (to within a given tolerance) the constraint Eq. [95], Eq. [139] can be rewritten as follows:

$$\eta_k^{\text{new}} = [\delta t]^{-1}\,[\mathbf{r}_i(t_0 + \delta t) - \mathbf{r}_j(t_0 + \delta t)] \cdot [\dot{\mathbf{r}}_i^{\text{old}}(t_0 + \delta t) - \ldots$$

... 0 are also worth investigating. The method of undetermined parameters then requires computation of $\{\lambda^{(p)}(t_0);\ p = 0, 1, \ldots, (s_{\max} - 1)\}$ and the undetermined parameters $\{\gamma\}$, and includes computational elements of the analytical method.

A major motivation for using constraint dynamics is to allow for larger time steps in MD simulations. A larger time step is justifiable for bond-stretch constraints and in some cases bond-angle constraints. The ability to implement general holonomic constraints allows greater flexibility in selecting the constraints for a molecular model. Of particular interest is the use of torsional constraints in free energy calculations. The use of equivalent alternative holonomic constraints can also improve the efficiency of SHAKE itself42 (on top of the reduction in number of MD time steps resulting from applying the constraints in the first place). The advantages53 that RATTLE offers over SHAKE should motivate RATTLE implementations for general holonomic constraints, in particular internal coordinate constraints.

To make MD simulations of large systems feasible, it will be necessary to augment constraint dynamics (which reduces the number of MD time steps required) with combined strategies for improving the efficiency of computing the constraint forces42 and the forces from the system potential energy72 (to reduce the CPU time of an MD time step).
ACKNOWLEDGMENTS

This work was done in part to support the Environmental Molecular Sciences Laboratory (EMSL) project at Pacific Northwest National Laboratory, a multiprogram national laboratory operated by Battelle Memorial Institute for the U.S. Department of Energy under contract DE-AC06-76RLO 1830. Funding for this aspect of the work was provided by the EMSL Construction Project, Office of Health and Environmental Research, Office of Energy Research. Computer resources were provided by the Scientific Computing Staff, Office of Energy Research, at the National Energy Research Supercomputer Center (NERSC), Livermore, California. Ramzi Kutteh was supported by a fellowship administered by the Associated Western Universities-Northwest Division under grant DE-FG06-89ER-75522 with the U.S. Department of Energy.
REFERENCES

1. G. D. Harp and B. J. Berne, J. Chem. Phys., 49, 1249 (1968). Linear and Angular Momentum Autocorrelation Functions in Diatomic Liquids.
2. G. D. Harp and B. J. Berne, Phys. Rev. A, 2, 975 (1970). Time Correlation Functions, Memory Functions, and Molecular Dynamics.
3. A. Rahman and F. H. Stillinger, J. Chem. Phys., 55, 3336 (1971). Molecular Dynamics Study of Liquid Water.
4. J. P. Ryckaert and A. Bellemans, Chem. Phys. Lett., 30, 123 (1975). Molecular Dynamics of Liquid n-Butane Near Its Boiling Point.
5. J. P. Ryckaert, G. Ciccotti, and H. J. C. Berendsen, J. Comput. Phys., 23, 327 (1977). Numerical Integration of the Cartesian Equations of Motion of a System with Constraints: Molecular Dynamics of n-Alkanes.
6. J. P. Ryckaert and A. Bellemans, Chem. Soc. Faraday Discuss., 66, 95 (1978). Molecular Dynamics of Liquid Alkanes.
7. G. Maréchal and J. P. Ryckaert, Chem. Phys. Lett., 101, 548 (1983). Atomic vs. Molecular Description of Transport Properties in Polyatomic Fluids: n-Butane as an Example.
8. J. P. Ryckaert, Mol. Phys., 55, 549 (1985). Special Geometrical Constraints in the Molecular Dynamics of Chain Molecules.
9. T. A. Weber, J. Chem. Phys., 69, 2347 (1978). Simulation of n-Butane Using a Skeletal Alkane Model.
10. T. A. Weber, J. Chem. Phys., 70, 4277 (1979). Relaxation of a n-Octane Fluid.
11. W. F. van Gunsteren, Mol. Phys., 40, 1015 (1980). Constrained Dynamics of Flexible Molecules.
12. D. Chandler and B. J. Berne, J. Chem. Phys., 71, 5386 (1979). Comment on the Role of Constraints on the Conformational Structure of n-Butane in Liquid Solvents.
13. D. W. Rebertus, B. J. Berne, and D. Chandler, J. Chem. Phys., 70, 3395 (1979). A Molecular Dynamics and Monte Carlo Study of Solvent Effects on the Conformational Equilibrium of n-Butane in CCl4.
14. J. A. McCammon, B. R. Gelin, and M. Karplus, Nature, 267, 585 (1977). Dynamics of Folded Proteins.
15. W. F. van Gunsteren and H. J. C. Berendsen, Mol. Phys., 34, 1311 (1977). Algorithms for Macromolecular Dynamics and Constraint Dynamics.
16. T. P. Lybrand, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1990, Vol. 1, pp. 295-320. Computer Simulation of Biomolecular Systems Using Molecular Dynamics and Free Energy Perturbation Methods.
17. M. Bixon, Annu. Rev. Phys. Chem., 27, 65 (1976). Polymer Dynamics in Solution.
18. M. Fixman, J. Chem. Phys., 69, 1527 (1978). Simulation of Polymer Dynamics. I. General Theory.
19. M. Fixman, J. Chem. Phys., 69, 1538 (1978). Simulation of Polymer Dynamics. II. Relaxation Rates and Dynamic Viscosity.
20. M. Fixman, Proc. Natl. Acad. Sci. U.S.A., 71, 3050 (1974). Classical Statistical Mechanics of Constraints: A Theorem and Application to Polymers.
21. H. J. C. Berendsen and W. F. van Gunsteren, in Molecular Liquids, Dynamics and Interaction, A. J. Barnes, W. J. Orville-Thomas, and J. Yarwood, Eds., NATO ASI Series C135, Reidel, New York, 1984.
22. M. R. Pear and J. H. Weiner, J. Chem. Phys., 71, 212 (1979). Brownian Dynamics Study of a Polymer Chain of Linked Rigid Bodies.
23. W. F. van Gunsteren and M. Karplus, Macromolecules, 15, 1528 (1982). Effect of Constraints on the Dynamics of Macromolecules.
24. N. Go and H. A. Scheraga, Macromolecules, 9, 535 (1976). On the Use of Classical Statistical Mechanics in the Treatment of Polymer Chain Conformation.
25. E. Helfand, J. Chem. Phys., 71, 5000 (1979). Flexible vs. Rigid Constraints in Statistical Mechanics.
26. T. P. Straatsma, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1996, Vol. 9, pp. 81-127. Free Energy by Molecular Simulation.
27. R. Elber, J. Chem. Phys., 93, 4312 (1990). Calculation of the Potential of Mean Force Using Molecular Dynamics with Linear Constraints: An Application to a Conformational Transition in a Solvated Dipeptide.
28. R. Czerminski and R. Elber, J. Chem. Phys., 92, 5580 (1990). Reaction Path Study of Conformational Transitions in Flexible Systems: Applications to Peptides.
29. C. Choi and R. Elber, J. Chem. Phys., 94, 751 (1991). Reaction Path Study of Helix Formation in Tetrapeptides: Effect of Side Chains.
30. C. A. Gough, D. A. Pearlman, and P. Kollman, J. Chem. Phys., 99, 9103 (1993). Calculations of the Relative Free Energies of Aqueous Solvation of Several Fluorocarbons: A Test of the Bond Potential of Mean Force Corrections.
31. D. A. Pearlman and P. A. Kollman, J. Chem. Phys., 94, 4532 (1991). The Overlooked Bond-Stretching Contribution in Free Energy Perturbation Calculations.
32. T. P. Straatsma, M. Zacharias, and J. A. McCammon, Chem. Phys. Lett., 196, 297 (1992). Holonomic Constraint Contributions to Free Energy Differences from Thermodynamic Integration Molecular Dynamics Simulations.
33. H. Goldstein, Classical Mechanics, 2nd ed., Addison-Wesley, Reading, MA, 1981.
34. C. Lanczos, The Variational Principles of Mechanics, 4th ed., Dover, New York, 1986.
35. J. Barojas, D. Levesque, and B. Quentrec, Phys. Rev. A, 7, 1092 (1973). Simulation of Diatomic Homonuclear Liquids.
36. M. P. Allen and D. J. Tildesley, Computer Simulation of Liquids, Oxford University Press, New York, 1992.
37. D. J. Evans, Mol. Phys., 34, 317 (1977). On the Representation of Orientation Space.
38. D. J. Evans and S. Murad, Mol. Phys., 34, 327 (1977). Singularity-Free Algorithm for Molecular Dynamics Simulation of Rigid Polyatomics.
39. J. G. Powles, W. A. B. Evans, E. McGrath, K. E. Gubbins, and S. Murad, Mol. Phys., 38, 893 (1979). A Computer Simulation for a Simple Model of Liquid Hydrogen Chloride.
40. M. P. Allen, Mol. Phys., 52, 717 (1984). A Molecular Dynamics Simulation Study of Octopoles in the Field of a Planar Surface.
41. D. Fincham, CCP5 Q., 2, 6 (1981). An Algorithm for the Rotational Motion of Rigid Molecules.
42. K. D. Hammonds and J. P. Ryckaert, Comput. Phys. Commun., 62, 336 (1991). On the Convergence of the SHAKE Algorithm.
43. G. Ciccotti, M. Ferrario, and J. P. Ryckaert, Mol. Phys., 47, 1253 (1982). Molecular Dynamics of Rigid Systems in Cartesian Coordinates.
44. H. Katz, R. Walter, and R. L. Somorjai, Comput. Chem., 3, 25 (1979). Rotational Dynamics of Large Molecules.
45. S. Miyamoto and P. A. Kollman, J. Comput. Chem., 13, 952 (1992). SETTLE: An Analytical Version of the SHAKE and RATTLE Algorithm for Rigid Water Models.
46. J. E. Mertz, D. J. Tobias, C. L. Brooks III, and U. C. Singh, J. Comput. Chem., 12, 1270 (1991). Vector and Parallel Algorithms for the Molecular Dynamics Simulation of Macromolecules on Shared-Memory Computers.
47. J. P. Ryckaert, I. R. McDonald, and M. L. Klein, Mol. Phys., 67, 957 (1989). Disorder in the Pseudohexagonal Rotator Phase of n-Alkanes: Molecular-Dynamics Calculation for Tricosane.
48. J. Orban and J. P. Ryckaert, Report on CECAM Workshop, unpublished, 1974. Methods in Molecular Dynamics.
49. R. Edberg, D. J. Evans, and G. P. Morriss, J. Chem. Phys., 84, 6933 (1986). Constrained Molecular Dynamics: Simulations of Liquid Alkanes with a New Algorithm.
50. A. Baranyai and D. J. Evans, Mol. Phys., 70, 53 (1990). New Algorithm for Constrained Molecular-Dynamics Simulation of Liquid Benzene and Naphthalene.
51. L. Verlet, Phys. Rev., 159, 98 (1967). Computer Experiments on Classical Fluids. I. Thermodynamical Properties of Lennard-Jones Molecules.
52. W. C. Swope, H. C. Andersen, P. H. Berens, and K. R. Wilson, J. Chem. Phys., 76, 637 (1982). A Computer Simulation Method for the Calculation of Equilibrium Constants for the Formation of Physical Clusters of Molecules: Application to Small Water Clusters.
53. H. C. Andersen, J. Comput. Phys., 52, 24 (1983). RATTLE: A "Velocity" Version of the SHAKE Algorithm for Molecular Dynamics Calculations.
54. D. J. Tobias and C. L. Brooks III, J. Chem. Phys., 89, 5115 (1988). Molecular Dynamics with Internal Coordinate Constraints.
55. D. V. Widder, Advanced Calculus, 2nd ed., Dover, New York, 1989.
56. W. G. Hoover, Computational Statistical Mechanics, Vol. 11 of Studies in Modern Thermodynamics, Elsevier, Amsterdam, 1991.
57. D. J. Evans, W. G. Hoover, B. H. Failor, B. Moran, and A. J. C. Ladd, Phys. Rev. A, 28, 1016 (1983). Nonequilibrium Molecular Dynamics via Gauss's Principle of Least Constraint.
58. R. Kutteh, preprint (1997). Analytical Dynamics with General Nonholonomic Constraints.
59. G. Arfken, Mathematical Methods for Physicists, 3rd ed., Academic Press, Orlando, FL, 1985.
60. F. J. Vesely, Computational Physics: An Introduction, Plenum Press, New York, 1994.
61. C. W. Gear, Numerical Initial Value Problems in Ordinary Differential Equations, Prentice-Hall, Englewood Cliffs, NJ, 1971.
62. E. B. Wilson Jr., J. C. Decius, and P. C. Cross, Molecular Vibrations, Dover, New York, 1955.
63. R. Kutteh and L. L. Van Zandt, Phys. Rev. A, 47, 4046 (1993). Anharmonic-Potential-Effective-Charge Approach for Computing Raman Cross Sections of a Gas.
64. H. J. C. Berendsen, J. R. Grigera, and T. P. Straatsma, J. Phys. Chem., 91, 6269 (1987). The Missing Term in Effective Pair Potentials.
65. G. Dahlquist and A. Bjork, Numerical Methods, Prentice-Hall, Englewood Cliffs, NJ, 1974.
66. M. K. Memon, R. W. Hockney, and S. K. Mitra, J. Comput. Phys., 43, 345 (1981). Molecular Dynamics with Constraints.
67. R. W. Hockney, Methods Comput. Phys., 9, 136 (1970). The Potential Calculation and Some Applications.
68. R. W. Hockney and J. W. Eastwood, Computer Simulation Using Particles, McGraw-Hill, New York, 1981.
69. S. G. Lambrakos, J. P. Boris, E. S. Oran, I. Chandrasekhar, and M. Nagumo, J. Comput. Phys., 85, 473 (1989). A Modified SHAKE Algorithm for Maintaining Rigid Bonds in Molecular Dynamics Simulations of Large Molecules.
70. W. Smith and T. R. Forester, Comput. Phys. Commun., 79, 63 (1994). Parallel Macromolecular Simulations and the Replicated Data Strategy. II. The RD-SHAKE Algorithm.
71. S. E. DeBolt and P. A. Kollman, J. Comput. Chem., 14, 312 (1993). AMBERCUBE MD, Parallelization of AMBER's Molecular Dynamics Module for Distributed-Memory Hypercube Computers.
72. R. Kutteh and J. B. Nicholas, Comput. Phys. Commun., 86, 236 (1995). Implementing the Cell Multipole Method for Dipolar and Charged Dipolar Systems.
CHAPTER 3
Computer Simulation of Water Physisorption at Metal-Water Interfaces

John C. Shelley* and Daniel R. Bérard†

Department of Chemistry, University of British Columbia, Vancouver, British Columbia, V6T 1Z1, Canada, (present address): *The Procter & Gamble Company, Miami Valley Laboratories, P.O. Box 538707, Cincinnati, Ohio 45253-8707, and (present address): †Molecular Simulations Incorporated, San Diego, California 92121
INTRODUCTION

The interface between aqueous solutions and metal surfaces plays an important role in many fields of science and technology including, most notably, the disciplines focused on corrosion, electrochemistry, and catalysis. Although much has been learned from many experimental and theoretical studies of these systems, a more detailed and deeper understanding of both the microscopic structure and dynamics of the constituents at the interface is still emerging.

Studies of interfacial structure require the extraction of surface data. This is difficult because most systems are composed largely of bulk and bulklike particles, and most researchers use experimental methods designed to probe the microscopic structure of bulk solids and liquids. This limitation has been overcome partly with the development of surface-sensitive techniques such as the scanning probe microscopes (SPM), surface-enhanced Raman spectroscopy, and nonlinear optical spectroscopy, to mention a few. However, the interpretation of data from such techniques, particularly for bulk water-metal interfaces, is complicated by significant perturbations of the solution or by the complex relationship between the experimental observables and the structure of the interface. Surfaces free of defects are often essential to simplify the interpretation of the experimental data, but such surfaces are difficult to achieve. Consequently, theoretical studies and computer simulations have played a unique and critical role in developing a physical picture and understanding of these interfaces at the microscopic level. A number of reviews and collections of papers have been published recently in this very active field.1-4

Studies of the interfacial structure often quote early models from the first half of the twentieth century, and although these models have led to useful insights about the nature of the interface, it is important to note that they are based exclusively on phenomenological characteristics and macroscopic observations. To establish a rigorous understanding of the interfacial structure, microscopic data from experimental and theoretical studies (e.g., statistical mechanical and quantum mechanical) of the metal-water interface are needed.

The interface may be characterized as a narrow region bounded by an inhomogeneous distribution of electrons and nuclei in the metal, and by solvent and solute particles in the fluid. At typical aqueous interfaces, the surface-induced structure within the water decays rapidly to bulklike behavior within a few solvent diameters. On the metal side, the electron density and surface structure typically decay to bulklike arrangements at even shorter distances.
Elucidating the metal-liquid interface is a particularly challenging problem as a result of the physical, as opposed to geometrical, inhomogeneity; the distinctly different natures of the two phases create unique computational problems. The nuclear skeleton of the metal is relatively rigid, with a highly diffuse electronic structure and coherent nuclear motion. The nature of the metal necessitates a quantum mechanical treatment when the metal's properties are the focus of interest. To reduce the computational demands of the metal calculation, finite-sized metal clusters are often used to represent the metal surface, although efforts are now being made to treat infinitely extended metal surfaces. The fluid, on the other hand, exhibits a distinctly different behavior, where particle motion is relatively incoherent and the electronic structure is often localized, essentially, within the molecules. These properties indicate that in many cases a model potential may be used to represent the intermolecular interactions for the liquid side of the interface instead of an expensive quantum mechanical treatment. Model potentials needed for this are designed so that many intermolecular interactions in a fluid can be calculated rapidly and a large number of solvent configurations can be included in a statistical mechanical study.

The interaction of water molecules with the surface clearly plays an important role in determining the solvent distribution. Many of the features of the interaction between these phases can be deduced from quantum mechanical calculations of isolated molecules interacting with a metal cluster. Potential energy functions derived in this manner provide an indication of the overall strength of the interaction and how it varies with the position and orientation of the water molecules with respect to the metal surface. As such, the functions play a significant role in theoretical studies of the interface. However, it should be noted that the true interaction of an isolated water molecule with an infinitely extended surface will differ from finite-sized cluster calculations, which typically display an oscillatory dependence on the size of the cluster. Further, the presence of the other water molecules in the liquid phase influences the metal-water interaction. Because adequately treating the metal remains problematic, yet is one of the most significant aspects of modeling metal-solvent interfaces, this chapter examines the influence of various properties of the metal on the structure of water. We consider the motion of the metal atoms themselves (including phonons) only in passing.

A water molecule adsorbed on a Hg surface remains a recognizable entity, even though there are significant electronic interactions between the molecule and the surface. This interaction can be referred to as physisorption, because the chemical natures of the species involved are not fundamentally changed. Many metals interact with water more strongly than Hg. If the surface interactions become strong enough, the structure of the adsorbed molecule may be changed, and it can be said a chemical reaction has occurred. When this happens, the adsorbed molecule is said to be chemisorbed. Such chemisorption is inherently quantum mechanical.
The distinction between chemisorption and physisorption is blurred and is somewhat arbitrary in many cases. Even for a weakly physisorbed molecule, the electronic structure will be perturbed as a result of surface polarization effects. The ambiguity arises mainly when we attempt to distinguish strongly physisorbed and weakly chemisorbed species. In these cases, we base our classification on pragmatic considerations; systems in which the overall nature of the water molecule is not fundamentally modified are considered to be physisorbed, and model potentials for describing the water remain tractable. Describing chemisorption is more complex because the electronic structure of chemisorbed water is no longer characterized by the model potential and also because of the difficulties in treating the metal correctly. These characteristics impose severe limitations on the kinds of study currently possible. Accordingly, in this chapter we restrict discussions to physisorption of water to a limited set of metals (such as Hg, Au, Ag, Cu, Zn, and Ni) for which model potentials are appropriate.

A single water molecule physisorbs on a metal surface with an energy of roughly 30-70 kJ/mol. This physisorption energy is larger than that of a single hydrogen bond (roughly 20 kJ/mol) and almost as large as the average interaction energy of a water molecule with its neighbors in bulk water under ambient conditions (83 kJ/mol).20 Therefore, an interfacial arrangement involving a compromise between the forces at the interface is expected. Whereas individual hydrogen bonds may be broken to facilitate water-surface interactions, the strong interaction between water molecules plays a crucial role in determining the interfacial structure. In aqueous solutions, the preference of many solutes, especially ions, to adsorb on the surface or to remain solvated near the surface is still uncertain. Similar, often subtle, compromises among the competing interactions are involved.

The complexity of metal-water interfaces and the limited experimental information available for determining precisely what the structure and dynamics of those interfaces are like provide the impetus for careful theoretical studies. Computer simulations of water-metal interfaces provide a means of accurately studying the system of interest for a given model Hamiltonian and have thus played an important role in developing a microscopic understanding of these common interfaces. However, most simulations amenable to these problems are computationally expensive, limiting the number of simulations that can be performed and restricting the range of systems that can be meaningfully treated. Analytical treatments of these interfaces are difficult to implement and often involve approximations in addition to those used in computer simulations. The ability of analytical treatments to systematically and rapidly treat these systems is unmatched, however. Clearly, analytical treatments and computer simulations play complementary roles in efforts to better understand these systems.

In this chapter, we begin with a description of the nature of the models generally used in simulations of the physisorption of water on metal surfaces, including a section focusing on a relatively simple and currently tractable quantum model for the metal, the jellium model. Following this is a section describing the most common classical simulation methods: molecular dynamics and Monte Carlo techniques.
Some representative results obtained from simulations using these methods are then presented. The final two sections of this chapter summarize results from such studies, compare the strengths and weaknesses of the current methodologies available, and provide some perspectives on future developments.
MODELING

Modern computer simulation studies, along with other theoretical methods applied to the metal-liquid interface, typically require models for the interaction between water molecules, for the metal, and for the metal surface interacting with the water. Designing realistic models of the interface that can be solved in a reasonable time frame is a primary challenge in the area of theory and computer simulation.
Treatment of Water

Classical model potentials for water are typically designed with the goal of obtaining the best computed results for the properties of interest. Such models are usually called effective models because their justification lies not so much in a rigorous basis as in their ability to mimic liquid water effectively under a given set of conditions. Most of these models contain relatively few parameters, so that the intuitive physical picture of the water is not hidden or lost by an overly complex mathematical expression. Simple as these models are, their development and testing entail time-consuming efforts in which experience can play a central role. The existence of well-characterized, efficient, and effective models of adequate quality suggests that a good route is to use an established model from the literature unless there is a compelling reason to do otherwise. Some commonly employed effective potentials for water are characterized in Table 1. The general geometry for those water models is illustrated in Figure 1. Some properties of these models, along with the corresponding experimental values, are also presented in Table 1. Figure 2 compares an experimentally derived oxygen-oxygen radial distribution function (see below) for bulk water (Ref. 28) to that generated from a simulation of water using the SPC/E model. One sees that these simple models can generally reproduce the properties of bulk water, but discrepancies remain. For instance, the precise location and height of the first peak in the oxygen-oxygen radial distribution functions usually differ.

The water models presented in Table 1 all assume rigid geometries. Flexible models for water do exist in the literature (see, e.g., Ref. 29), but their use requires substantially more computer time and/or more complex treatment. The internal vibrational modes of a water molecule correspond to energies significantly higher than the thermal energy at room temperature. Because of this, nonquantum treatments often inadequately imitate these intramolecular modes (Ref. 30). Since distortion of a water molecule from its average molecular geometry is generally small, we recommend the use of models with a fixed nuclear geometry to study metal-water interfaces under most circumstances.
The dipole moment of an isolated water molecule in the gas phase is 1.85 debyes (Ref. 31), increasing to roughly 2.6 debyes in the liquid state as a result of the mutual electronic polarization of water molecules. The effective models for water listed in Table 1 either employ permanent dipole moments that are enhanced relative to the gas phase (e.g., SPC, SPC/E, TIP4P) or include polarizability explicitly (TIP4P-FQ). For metal-water interfaces, studies have been carried out using similar polarizable and nonpolarizable models. It has been found that explicitly including the polarizability does not significantly affect the interfacial structure at this level of modeling (Ref. 35). One explanation for the overall similarity of results obtained with polarizable and nonpolarizable water models in studies of both bulk water and water physisorbed on metal surfaces is that the induced dipole moment of a water molecule in the polarizable models tends
Table 1. Effective Potential Energy Functions for Water^a

Property                  SPC^b     SPC/E^c   TIP4P^d   TIP4P-FQ^e   Experiment
r(OH) (Å)                 1.0       1.0       0.9572    0.9572       0.9572
H-O-H angle (deg)         109.47    109.47    104.52    104.52       104.52
r(OM) (Å)^f                                   0.15      0.15
σ_LJ (Å)^g                3.166     3.166     3.154     3.159
ε_LJ (kJ/mol)^g           0.6502    0.6502    0.6485    1.197
q_O (e)                   -0.82     -0.8476   0.0
q_H (e)                   0.41      0.4238    0.52
q_M (e)^f                                     -1.04
μ_g (debye)^h             2.274     2.351     2.177     1.85         1.85^i
μ_l (debye)^h             2.274     2.351     2.177     2.62
U (kJ/mol)^j              -41.8     -41.5     -41.6     -41.4        -41.5^k
ε_0                       72 ± 7    70 ± 0    61 ± 7    79 ± 8       78.3^l
D (10^-9 m^2 s^-1)^m      3.6       2.4       3.3       1.9          2.3^n

^a 298 K and 1.0 g/cm^3.
^b Simple point charge (SPC) water model, Ref. 29. U, ε_0, and D are from Ref. 22.
^c A modification of the SPC model (SPC/E) to take into account the energy cost for polarizing the water molecules, Ref. 21. U, ε_0, and D are from Ref. 22.
^d Ref. 20. U, ε_0, and D are from Ref. 22.
^e Ref. 23.
^f See Figure 1 for the location of the M site.
^g These water models all have a single Lennard-Jones 6-12 potential on the oxygen atom.
^h The gas and liquid phase dipole moments; while the experimental value of μ_l is not listed, it is roughly 2.6 debyes (Refs. 32-34).
^i Ref. 31.
^j The potential energy of the liquid.
^k Ref. 20.
^l Refs. 25 and 26.
^m Diffusion constant.
^n Ref. 27.
Figure 1 Geometries of water models listed in Table 1. These rigid models carry three charges: two located on the hydrogen sites (δ+) and a balancing charge on the M site (-2δ+) lying along the bisector of the H-O-H angle. The single Lennard-Jones potential has its origin on the O site. For the three-site SPC and SPC/E models, the M and O sites coincide.
Figure 2 The oxygen-oxygen radial distribution functions for bulk water: solid line derived from neutron scattering studies (Ref. 28); dashed line obtained from a Monte Carlo simulation employing the SPC/E water model (Ref. 115).
to point in the same direction as the permanent moment, remaining relatively invariant from molecule to molecule: a situation similar to having an enhanced permanent dipole moment. Nonpolarizable models should give an acceptable representation of the water-water interactions for systems containing at least a monolayer of water. For coverages less than a complete monolayer, it may be desirable to employ polarizable models.
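The permanent dipole moments listed in Table 1 follow directly from the site charges and the rigid geometry of Figure 1. The short sketch below recomputes them as a consistency check; the function name and the e·Å to debye conversion constant are illustrative choices, not part of the original models' definitions.

```python
import math

E_ANGSTROM_TO_DEBYE = 4.803204  # 1 e*Angstrom expressed in debye

def rigid_water_dipole(r_oh, hoh_deg, q_h, r_om=0.0):
    """Permanent dipole (debye) of a rigid water model with a charge +q_h on
    each H and -2*q_h on the M site, which lies r_om from O on the bisector."""
    half_angle = math.radians(hoh_deg) / 2.0
    # Projection of each O-H bond onto the bisector, measured from the M site.
    arm = r_oh * math.cos(half_angle) - r_om
    return 2.0 * q_h * arm * E_ANGSTROM_TO_DEBYE

# Geometries and charges taken from Table 1 (lengths in Angstrom, charges in e).
models = {
    "SPC": dict(r_oh=1.0, hoh_deg=109.47, q_h=0.41),
    "SPC/E": dict(r_oh=1.0, hoh_deg=109.47, q_h=0.4238),
    "TIP4P": dict(r_oh=0.9572, hoh_deg=104.52, q_h=0.52, r_om=0.15),
}

for name, params in models.items():
    print(f"{name:6s} {rigid_water_dipole(**params):.3f} D")
```

The computed values agree with the μ_g entries of Table 1 to within rounding, which illustrates that the enhanced dipole moments of these nonpolarizable models are built in through the choice of charges and geometry.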
Treatment of Metal-Water Interactions

The dynamics of any metal-liquid interface involves interactions both between and among particles in the metal and fluid. For the physisorption of water on metals, where the interaction between water molecules is comparable to the metal-water interaction, it is normally assumed that the metal-water interactions can be treated with model potentials and that a detailed quantum mechanical treatment of the interaction between the two phases is not necessary, provided an adequate model of the interaction is used. However, a simple quantum mechanical treatment for the metal, the jellium model, exists, and its role in the simulation of metal-water interactions also is considered below.

For bulk water-metal interfaces, there have been a few studies in which the quantum nature of the metal is treated explicitly. Such studies require large amounts of computer time for a single configuration, making statistical mechanical treatments of the metal-liquid interface prohibitively expensive. For example, in the work of Price and Halley (Ref. 16), a short Car-Parrinello molecular dynamics simulation of the water-metal interface was performed. Ursenbach, Calhoun, and Voth (Ref. 14) examined the bulk water-semiconductor interface, also using the Car-Parrinello method. In the latter study, the problem of sampling configuration space with short Car-Parrinello simulations was partially overcome by using several statistically independent starting geometries derived from classical molecular dynamics simulations. Lengthy simulation times are, at present, generally infeasible.
Review of Interfacial Models Used in the Literature

Quantum mechanical calculations of the interaction of a water molecule physisorbing on a metal cluster indicate a typical interaction strength of between -30 and -70 kJ/mol (Refs. 6-13). These values are consistent with experimental estimates of the physisorption energy for monolayers of water on metal surfaces. Classical simulations of such systems show that the arrangement of the interfacial water varies with the nature of the metal-water interactions. The extent of structuring, as well as the density of the water adjacent to the metal surface, varies monotonically with the overall binding energy of the water with the metal. If the potential is too weak (e.g., 2 kJ/mol), almost no structuring is observed (Ref. 39). Conversely, the results are generally insensitive to the overall binding energy, provided it is sufficiently high (roughly >15 kJ/mol). It is well established that monolayers of water readily form on metal surfaces. Nonattractive and likely very weak water-metal potentials will not produce stable monolayers of water (Ref. 39).

The orientational portion of the potential may be characterized by the energy required to rotate a water molecule adjacent to the metal surface from an orientation in which its dipole moment is pointing directly away from the metal surface to one in which the molecule lies flat on the surface. Generally, structural characteristics of the water are not strongly affected by this portion of the potential, provided its strength is not excessively large. However, very strong orientational potentials can force liquid water away from the metal surface.

Many potentials have been proposed and used that are sufficiently strong with an appropriate orientational dependence. Comparing the utility of these potentials is not easy because data of sufficient accuracy are not currently available from either experiment or quantum calculations. Generally speaking, the existing models treat the metal atoms explicitly or represent the interactions of a water molecule with the metal collectively as a single interaction, often by including a Fourier expansion within the surface to represent the surface corrugation. For instance, the overall potential between a water molecule and a metal surface, $u_{WM}$, excluding the orientational dependence, has been represented by a Lennard-Jones 9-3 potential (Ref. 39),
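where z is the distance of the water molecule from the plane of the topmost metal atoms. In a manner based on that used by Berkowitz and co-workers, the corrugation for the (111) surface was mimicked by making $\varepsilon_{9\text{-}3}$ and $\sigma_{9\text{-}3}$ functions of the x and y coordinates of the water molecule, as follows:

$$\varepsilon_{9\text{-}3} = \varepsilon_0 + \varepsilon_c Q \qquad [2]$$

and

$$\sigma_{9\text{-}3} = \sigma_0 + \sigma_c Q \qquad [3]$$

where

$$Q = \cos s_1 + \cos s_2 + \cos s_3 \qquad [4]$$

the phases $s_1$ and $s_2$ (Eqs. [5] and [6]) are functions of the x and y coordinates and of a,

$$s_3 = s_1 + s_2 \qquad [7]$$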
and a is the lattice constant for the solid; $\varepsilon_0$, $\varepsilon_c$, $\sigma_0$, and $\sigma_c$ are the constants for the particular metal surface.

Explicitly modeling the metal atoms permits a limited treatment of the phonons in the metal as well as the coupling of water-phonon motions. Models of this nature have been employed to describe less regular surfaces like that of an SPM tip (Refs. 43, 48). Heinzinger and co-workers have developed several potentials for describing the interactions of water with a number of metals in which the metal atoms are described explicitly, based on quantum mechanical calculations. Their interaction potentials assume a pairwise sum of metal-water oxygen and metal-water hydrogen potentials. Adding Hg-Hg potentials, Heinzinger and co-workers performed detailed studies of the liquid Hg-liquid water interface, making a significant contribution to the field.

Siepmann and Sprik (Refs. 43, 48) developed an approach based on an angle-dependent metal-oxygen atom potential in combination with a repulsive, three-body, metal-metal-water oxygen potential to ensure that the water molecules lie on top of the metal atoms, as found in both experiment and ab initio quantum mechanical calculations. This model has the interesting feature that the charge associated with individual atoms within the metal varies in response to the potential applied to the electrode as a whole, as well as to interactions with other charged species within the system.

One of the main disadvantages of explicitly representing atoms in the metal surface is the expense of calculating all atom-water and, in some cases, atom-atom interactions. Berkowitz and co-workers formulated a Pt-water potential to represent the metal surface as a single unit and employed a short Fourier expansion to describe the variation of the potential arising from surface corrugation. They determined the parameters for Pt-100 and for Pt-111 (Ref. 44) by fitting their potential function to the explicit atom potentials developed by Heinzinger and co-workers. Simulations employing such potentials are considerably more efficient than an all-atom representation for examining the structure of the solution near the surface. Zhu and Philpott (Ref. 46) employed a combination of relatively simple anisotropic and isotropic pairwise additive potential energy functions for a number of metal surfaces. They also provide a collective version of their potentials.

Most of the collective potentials mentioned above are relatively complex because they are directly related to atom-atom potentials. Rather than constructing collective potentials in this manner, it is possible to fit them directly to the overall interactions between a water molecule and the metal, providing a simpler model that is easy to implement and interpret. For example, Spohr (Ref. 51) and Shelley et al. (Ref. 39) have employed collective potentials in which the interaction is represented as a Fourier expansion in the metal surface, but the potentials are kept simple and terms due to different physical effects are separated to facilitate systematic optimization and variation of the metal-water interactions.
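To make the idea of a collective, corrugated potential concrete, the following is a minimal sketch of a 9-3 potential whose depth and range are modulated by a three-term Fourier factor Q. Two things here are assumptions rather than the published forms: the 9-3 expression `eps*[(sig/z)**9 - (sig/z)**3]` (any numerical prefactor is omitted), and the phases `s1` and `s2`, which are taken as the shortest reciprocal-lattice phases of a triangular lattice with nearest-neighbor spacing `d`. All function and parameter names are hypothetical.

```python
import math

def corrugation_q(x, y, d):
    """Fourier factor Q = cos s1 + cos s2 + cos s3 for a hexagonal surface.
    ASSUMPTION: s1 and s2 are the two shortest reciprocal-lattice phases of a
    triangular lattice with nearest-neighbour spacing d, and s3 = s1 + s2."""
    k = 2.0 * math.pi / d
    s1 = k * (x - y / math.sqrt(3.0))
    s2 = k * (2.0 * y / math.sqrt(3.0))
    s3 = s1 + s2
    return math.cos(s1) + math.cos(s2) + math.cos(s3)

def u_wm(x, y, z, d, eps0, eps_c, sig0, sig_c):
    """Water-metal energy from an ASSUMED 9-3 form, with the well-depth and
    range parameters each modulated linearly by Q."""
    q = corrugation_q(x, y, d)
    eps = eps0 + eps_c * q   # depth parameter varies across the surface
    sig = sig0 + sig_c * q   # range parameter varies across the surface
    ratio = sig / z
    return eps * (ratio**9 - ratio**3)
```

With `eps_c = sig_c = 0` the potential is independent of x and y, so the corrugation terms can be switched on gradually when exploring how sensitive the interfacial structure is to surface roughness. With this assumed choice of phases, Q equals 3 at lattice points and -1.5 at three-fold hollow sites.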
Consideration of Image Potentials When Fitting Potentials

Fitting potentials to results of quantum calculations is a straightforward process. However, special consideration of the role of image potentials is needed. Charges, including those within water molecules, polarize the metal. If the charges are distant from the metal surface, the electric field arising from metal polarization may be mimicked by placing a charge of equal size but opposite sign at a location within the metal. That site is obtained by reflecting the charge through a plane situated close to the metal surface (see Figure 3). Coulomb's law can then be used to calculate the interactions between the real and image charges, with the stipulation that the magnitude of the interaction be reduced by a factor of 2 to account for the energy penalty for polarizing the metal. In this approach, it is important to include the interactions of each real charge with all image charges in the system, not just the image charges corresponding to the particular site or molecule in question. Calculating only the interaction of a site with its own image charge is not a good first approximation because of the many-body nature of these interactions. This simple image charge approach has significant failings near the metal surface. The precise location of the image plane is very important, yet where
Figure 3 Illustration of the classical image charge potential. Point charges in solution distant from the metal surface induce charge redistribution in the metal that in turn produces an electrostatic potential acting on charges in the solution. That potential is mimicked using images by creating an imaginary plane (the image plane) located near the metal surface. For each charge in solution, q, a fictitious charge of equal magnitude but opposite sign is placed in the metal at an equivalent distance, d, on the opposite side of the image plane. For a collection of point charges in solution, all interactions between all charges in solution with all image charges, including cross interactions, should be included in the calculation of the image potential.
exactly to place it is uncertain. Also, for molecules close enough to penetrate the tail of the metal's electron density distribution (those molecules in the first monolayer), the interaction is damped significantly. On a conceptual basis, the location of the image plane should not be deeper in the metal than the nuclei of the topmost layer of metal atoms nor shallower than the region where the metal's electron distribution drops abruptly. Most models place it near the former, in part because the self-image interactions can become unphysical when a charged atom approaches the surface if the image plane is located close to the metal surface. However, by placing the image plane relatively deep, near the nuclei, one indirectly builds in some of the damping near the surface. In the approach developed by Siepmann and Sprik, charging of the metal atoms produces interactions similar to inclusion of explicit image potentials.

The electrostatic behavior of solutions (particularly polarization, which is many-body in nature) is collective, partly owing to the strength and range of electrostatic interactions. The observation that the sum of charge-image charge interactions for water near a metal surface is almost zero (Refs. 39, 40, 52) arises from a large-scale cancellation of the terms involved and is one surprising consequence of this collectivity. Such cancellation has been observed both in wide slabs of water and in films of water as narrow as a monolayer. This cancellation allows one to simply exclude these interactions in the Hamiltonian for metal surfaces that are fairly smooth, resulting, in turn, in a significant reduction in the simulation cost because the number of interacting sites is reduced by a factor of 2. Indeed, image interactions are commonly left out of simulations of pure water near metal surfaces. However, this cancellation may not be sufficiently complete for metal surfaces with large corrugations or for irregular metal surfaces, so excluding them in these cases might cause problems. It would be interesting to see an investigation of this cancellation, particularly with rough or irregular metal surfaces, using either ab initio calculations or perhaps the Siepmann-Sprik approach to learn how these cases should be addressed.

When ab initio quantum theory is used to calculate the interaction of a single water molecule with a large cluster of metal atoms, polarization of the metal by the water molecule will occur, affecting the overall water-metal potential. Because only one water molecule is present, these effects will not cancel. To account for this polarization, the image potential should appear explicitly in the potential function being fit to these data; otherwise the image interactions are implicitly included in the effective two-body potentials for the water molecule interacting with the metal surface. Including these interactions in effective two-body potentials prevents collective cancellation from occurring when more than one water molecule interacts with the metal. Explicitly including the image potential in the potential function permits one either to include this term in the interactions of a group of water molecules with the surface or to drop the term entirely for cases where it is expected to cancel. The potentials of Zhu and Philpott (Ref. 46), for example, have been parameterized in this manner. It should be noted that the net cancellation of the image terms may not apply to systems with solutes, particularly ions, lying close to the metal surface.
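The classical image construction described in this section can be sketched in a few lines. The example below is a minimal illustration for a flat image plane, with the factor of 1/2 for polarizing the metal and with every real charge interacting with the images of all charges, its own included. The function name, the choice of units (charges in e, lengths in Å, energies in kJ/mol), and the value used for the Coulomb constant are assumptions made for the sake of the example.

```python
import math

COULOMB = 1389.35458  # e^2/(4*pi*eps0) in kJ mol^-1 Angstrom (assumed units)

def image_energy(charges, positions, z_image=0.0):
    """Image-charge energy (kJ/mol) of point charges near a flat metal.

    charges   : charges in units of e
    positions : (x, y, z) tuples in Angstrom, with z > z_image in the solution
    Each real charge interacts with the images of ALL charges (its own and
    everyone else's); the factor 1/2 is the cost of polarizing the metal.
    """
    energy = 0.0
    for qi, (xi, yi, zi) in zip(charges, positions):
        for qj, (xj, yj, zj) in zip(charges, positions):
            zj_img = 2.0 * z_image - zj  # reflect site j through the image plane
            r = math.sqrt((xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj_img) ** 2)
            energy += 0.5 * COULOMB * qi * (-qj) / r
    return energy
```

For a pair of opposite charges, the cross terms (each charge with the other's image) are repulsive and offset much of the attractive self-image energy. This is the two-charge analogue of the large-scale cancellation discussed above, and it shows why keeping only self-image terms is a poor approximation.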
Detailed Analysis of the Jellium Model

The jellium model is a very simple representation of the metal in which the valence electrons are treated as an interacting electron gas in the spatially averaged field of the atomic cores. That is, the atomic cores are represented as a uniform and positively charged background potential. The only adjustable parameter in the ground state jellium model is the average charge density of the background, $en_+$, where e is the elementary charge and $n_+$ is the number density. Traditionally, this parameter is represented by the so-called Wigner-Seitz radius (the radius of the average spherical volume occupied by an electron) and is given by

$$r_s = \left[\frac{3}{4\pi n_+}\right]^{1/3} \qquad [8]$$
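As a small worked illustration of Eq. [8], the sketch below converts between the background number density and the Wigner-Seitz radius. The function names are hypothetical, and the units are whatever consistent length units the caller uses for the density.

```python
import math

def wigner_seitz_radius(n_plus):
    """r_s = [3 / (4*pi*n_plus)]**(1/3); n_plus is a number density
    (per unit volume), and r_s is returned in the matching length unit."""
    return (3.0 / (4.0 * math.pi * n_plus)) ** (1.0 / 3.0)

def background_density(r_s):
    """Inverse relation: the number density for which a sphere of radius r_s
    contains exactly one electron."""
    return 3.0 / (4.0 * math.pi * r_s ** 3)
```

Because $r_s$ scales as $n_+^{-1/3}$, a low valence-electron density corresponds to a large $r_s$, which is the regime where the jellium model is reported below to perform best.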
As a result of the crude approximation used to represent the nuclear and core structure of the metal, applications of the jellium model are limited. Clearly, this model by itself will not yield accurate results for all properties of the metal or metal surface, nor will it provide an accurate model for the complete metal-water interaction potential. However, the jellium model can be used to represent a component of the metal-water interaction potential, especially the short-ranged, inhomogeneous electrostatic contributions that are expected to influence the average solvent orientation near the metal surface. The jellium model enables one to add this contribution to the potential in an approximate but self-consistent manner, providing a simple dependence on the charge density of the metal.

The jellium model of the metal has been used extensively in recent studies of metal-liquid interfaces. Early studies of surface properties using the jellium model have shown that it can work surprisingly well. Generally, it provides a reasonable approximation of the work function and surface energies, particularly for metals with a low valence-electron density (such as the alkali metals), but qualitative results can be obtained for other metals such as Hg and Pb. For metals with high valence-electron densities, estimates for the surface tension given by the jellium model imply that the metal would cleave spontaneously. Those results can be significantly improved by using first-order perturbation theory to account for the lattice structure, but this has not yet been implemented for metal-water interfaces. Non-self-consistent treatments of the metal-water interface have also been employed in similar contexts.

The electrostatic potential of a metal follows a pattern generally consistent with the nuclear structure. Near the surface, the normal component of the potential has an additional contribution arising from the kinetic energy of the electrons, causing the electron density to effectively "spill" several angstroms beyond the surface. This gives rise to an effective surface dipole moment (and higher moments) normal to the surface. The resulting electric field is strong enough to polarize the liquid near the interface, which in turn polarizes the metal in a self-consistent manner.

Although the isolated jellium model provides a homogeneous potential normal to the surface, the instantaneous configurations of the fluid induce an inhomogeneous polarization of the metal, hence an inhomogeneous potential, $E_{\mathrm{jel}}(\mathbf{r})$. To simplify the many-body nature of this interaction, a mean field approximation is applied, reducing the inherently many-body potential to a statistically averaged one-dimensional interaction perpendicular to the metal surface (i.e., the potential is averaged over the x and y dimensions). Treating the complete many-body interaction between the phases would provide more detail but involves more work than is warranted in this model. In the mean field of N liquid molecules, the electron density, $n(\mathbf{r})$, of the jellium metal is obtained from Kohn-Sham density functional theory (Refs. 54, 55, 57, 65-67), consisting of the Schrödinger-type equation, Eq. [9].
Here $\varepsilon_k$ and $\psi_k(z)$ denote the eigenvalues and normalized eigenfunctions, $m_e$ is the mass of an electron, $u_{xc}$ is the exchange and correlation potential, and the sum $u_{\mathrm{jel}}(z) + u_{\mathrm{wat}}(z)$ represents the average interaction of an electron in the uniform potentials of the jellium background, $u_{\mathrm{jel}}(z)$, and the liquid, $u_{\mathrm{wat}}(z)$. In this approach, the jellium potential $u_{\mathrm{jel}}(z)$ is simply obtained from Poisson's equation.

In the original work of Lang and Kohn, the jellium model is treated as a semi-infinite metal extending to $z = -\infty$ (infinite width and infinite in the other two dimensions, see Figure 4a). In these calculations, the charge density and wavefunctions at the surface must be matched accurately with the bulk quantities. Another approach is to consider a metal slab of finite width along the z axis (Ref. 54; see Figure 4b). This latter approach has the advantage of permitting both the wavefunctions and charge density to be obtained explicitly from the model. The finite width model does not require matching with bulk quantities, and it yields results that are indistinguishable from the infinite width case if the slab is sufficiently wide. The solution to Eq. [9] is straightforward for the slab geometry.

Another important issue in density functional theory is the form of the exchange and correlation potential. In most investigations of metal surfaces, the simple local density approximation is used, again, with surprisingly successful results. In the case of a homogeneous electron gas, the effective exchange-correlation potential is given exactly by the local density approximation and is accurately known from quantum Monte Carlo calculations (Ref. 68). Outside the metal, the local density approximation to $u_{xc}$ decays exponentially to zero and fails to reproduce the classical image potential for electrons far from the surface, and hence fails to faithfully represent the tail of the electron density distribution outside the metal. Nonlocal approximations to $u_{xc}$ have been shown to reproduce the electron image limit outside the metal.

Figure 4 Illustration of the geometries used in jellium calculations: (a) a semi-infinite metal, where the metal is infinite in the z direction (perpendicular to the surface) and (b) a metal slab, where the metal has a finite width in the z direction. In both cases the metal is infinite in the x and y directions (parallel to the surface).
The average potential resulting from the liquid at a distance z from the surface is given by Eq. [10]. In this equation, $\hat{z}$ is the unit vector normal to the surface, $\rho(1)$ is the solvent distribution function, $\Omega_1$ denotes the Euler angles for orientation of the water molecule at $\mathbf{r}_1$, and $\phi(\mathbf{r}, \Omega)$ is the potential at $z\hat{z}$ due to a single water molecule. The integral is over all positions and orientations of a water molecule labeled 1. At a planar interface, this expression reduces to Eq. [11], where $\mu$ is the dipole moment of the water (see Table 1), and $\theta_\mu$ is the angle between the dipole moment and the surface normal pointing into the liquid. Note that this final expression includes only the contribution from the solvent dipole moments. This is a consequence of applying the average solvent distribution for the metal-liquid interface where the electron density interacts with the average planar solvent distribution. Although the electron density can penetrate the water molecule, the contribution from within a molecule is highly dependent on the model potential used and is not representative of the average interaction with the liquid phase. Hence it is assumed that the average interaction is adequately represented using the polarization (dipole) density. This issue is discussed further below.

In a computer simulation, the average solvent distribution $\rho(1)$ is not known a priori, so Eqs. [10] and [11] cannot be employed to evaluate the potential. Instead, the solvent potential may be obtained from an average over solvent configurations and expressed in terms of $\cos\theta_\mu(z_1)$, the cosine of $\theta_\mu$ for water molecules between $z_1$ and $z_1 + dz$ from the surface, and $\rho(z_1)$, the density of water within this interval (independent of orientation). The averages are taken over all molecules in a given configuration and over all configurations.

In principle, the jellium model for the metal in the field of the water can be solved on the fly during the simulation. This calculation provides the corresponding potential due to the jellium that the water would experience. The approach usually reported in the literature is to first determine the solvent potential from a simulation of moderate length and then use that to determine a new jellium-water interaction. This process is repeated several times until self-consistency is attained. The electrostatic potential due to the jellium, $U_{\mathrm{jel}}$, is calculated using Poisson's equation, in which the position of the jellium edge appears. This potential acts on water molecules that lie sufficiently close to the surface of the metal. For a point charge model, this interaction is given by a sum over sites, where all the charged sites (each carrying a charge of $q_{\mathrm{site}}$) within the water molecule are included in the sum. For a single water molecule modeled with the SPC/E potential, the jellium-water interaction gives rise to an orientation potential similar to that obtained from quantum calculations and those commonly used in classical molecular simulations (Refs. 6, 7, 35, 40, 42-45, 47-50, 70-79) for the range of angles that are most important for neutral metal surfaces (Ref. 39). Note that if jellium interactions are to be included in the model for the metal surface, and if one is using a potential fit to ab initio data, the interaction of a single water molecule with the jellium should be removed from the potential function prior to fitting an effective two-body metal-water potential.
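The sum over charged sites described above can be sketched as follows. The jellium potential itself would come from the self-consistent Kohn-Sham and Poisson calculation; here `model_potential` is a purely hypothetical exponential stand-in used only to show how an orientation dependence emerges for a three-site SPC/E molecule. The function names, the decay length, and the orientation convention (H-H vector held parallel to the surface) are all assumptions of this sketch.

```python
import math

def jellium_water_energy(site_charges, site_z, potential):
    """Sum q_site * potential(z_site) over the charged sites of one molecule."""
    return sum(q * potential(z) for q, z in zip(site_charges, site_z))

def model_potential(z, u0=1.0, decay=1.0):
    """HYPOTHETICAL stand-in for the jellium electrostatic potential: decays
    exponentially with distance z (Angstrom) from the jellium edge."""
    return u0 * math.exp(-z / decay)

def spce_site_heights(z_oxygen, cos_theta, r_oh=1.0, hoh_deg=109.47):
    """Heights of the (O, H, H) sites for a molecule whose dipole makes an
    angle theta with the surface normal, with the H-H vector kept parallel
    to the surface (an assumed convention for this sketch)."""
    dz = r_oh * math.cos(math.radians(hoh_deg) / 2.0) * cos_theta
    return [z_oxygen, z_oxygen + dz, z_oxygen + dz]

SPCE_CHARGES = [-0.8476, 0.4238, 0.4238]  # O, H, H charges from Table 1 (e)

def orientation_energy(z_oxygen, cos_theta, potential=model_potential):
    """Jellium-water energy of one SPC/E molecule, in units of e times the
    unit of the supplied potential."""
    heights = spce_site_heights(z_oxygen, cos_theta)
    return jellium_water_energy(SPCE_CHARGES, heights, potential)
```

Because the molecule is neutral, a spatially uniform potential contributes nothing. Only the variation of the potential across the molecule produces an orientational preference, which is why this term matters mainly for molecules in contact with the surface.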
SIMULATION METHODS

Two major classical simulation techniques, molecular dynamics and Monte Carlo, have been applied to simulation of water-metal interfaces. We first discuss features common to both methodologies and then describe aspects unique to each. The field of computer simulations is an actively evolving one, despite being more than 40 years old. Even for the particular case of water-metal interfaces, many variations exist on the central theme of how best to carry out these calculations. In this chapter, we limit our discussion to the most significant (in our opinion) techniques in use for metal-water interfaces,
with additional comments on related methods when warranted. Related materials can be found in other chapters of the Reviews in Computational Chemistry series. General and comprehensive presentations of computer simulation methods for molecular systems may be found in References 84 and 85.
Common Aspects of the Methods

Because of the limited resources on any computer system, simulations are usually restricted to a small, finite set of atoms or molecules. Consequently, a common approach is to use a cluster of atoms in an attempt to represent the entire macroscopic state. However, even for relatively large clusters a significant fraction of the molecules remain close to the surface. For instance, the structuring in liquid water at the liquid-vapor interface under ambient conditions extends at least 5 Å, or roughly 2 water diameters, into the liquid (Ref. 86). So, for a spherical cluster of a thousand water molecules, having a radius of ~20 Å, only 40% of the water molecules are more than 5 Å from the surface. The most common way to eliminate such surface effects is to surround the system with periodic images of itself (i.e., study a lattice composed of replicas of the same system).

Periodic boundary conditions can be implemented in several ways for two- and three-dimensionally periodic systems containing interfaces. Some of these are illustrated in Figure 5. A consistent way to think of this situation is to consider the molecules as belonging to a central three-dimensional simulation cell that is surrounded by an infinite number of images of itself in either two or three directions. The simulation cells are typically cubic or parallelepiped but may be represented by any of the Bravais lattices. However, for simulations of interfaces, parallelepipeds are nearly always used. Whereas periodic boundary conditions are quite effective at eliminating surface effects, they do so at the cost of restricting the simulation to a periodic system. If the distances over which significant particle-particle correlations in the system occur are smaller than half the size of the simulation box, periodicity usually presents little or no problem, provided the interactions between the particles are handled adequately (considered below).
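Two points from the discussion above lend themselves to a short sketch: the estimate of how much of a spherical cluster lies away from its surface, and the minimum image convention for a cell that is periodic only in the directions parallel to the metal surface (the geometries of Figure 5a and 5b). The function names are illustrative, and a rectangular cell with the surface normal along z is assumed.

```python
def interior_fraction(radius, depth):
    """Fraction of a uniform sphere lying deeper than `depth` below its surface."""
    return ((radius - depth) / radius) ** 3

def minimum_image_xy(ri, rj, box_x, box_y):
    """Displacement rj - ri using the nearest periodic image in x and y only;
    z (normal to the metal surface) is treated as non-periodic.
    ASSUMPTION: rectangular cell with edge lengths box_x and box_y."""
    dx = rj[0] - ri[0]
    dy = rj[1] - ri[1]
    dz = rj[2] - ri[2]
    dx -= box_x * round(dx / box_x)  # wrap into [-box_x/2, box_x/2]
    dy -= box_y * round(dy / box_y)  # wrap into [-box_y/2, box_y/2]
    return dx, dy, dz

# Cluster estimate from the text: radius ~20 Angstrom, 5 Angstrom surface layer.
print(f"{interior_fraction(20.0, 5.0):.2f}")
```

For a 20 Å sphere with a 5 Å surface layer the interior fraction is (15/20)^3, consistent with the figure of roughly 40% quoted above. For a fully three-dimensional periodic geometry (Figure 5c), the same wrapping would simply be applied to the z component as well.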
A simple method for testing finite-size effects is to perform trial simulations using simulation cells of different sizes. Increasing the size of the simulation cell will ultimately eliminate problems associated with the system’s periodicity. How the periodic boundary conditions are actually implemented depends on the manner in which the forces and potential energies of the interaction sites are calculated. Two major ways exist for calculating interactions between atoms in such periodic systems. The first involves truncating the interactions (for example, beyond a certain distance), and the second is to explicitly or effectively calculate the full set of interactions within the periodic system. Truncation effectively means modifying the Hamiltonian for the periodic system to exclude some, typically more distant, interactions. Some adverse effects of truncation within a region can be reduced by shifting and/or smoothly truncating the potentials. This
Figure 5 Typical periodic boundary conditions used for computer simulations of metal-water interfaces: (a) and (b) geometries that are periodic in the directions parallel to the metal surface; (c) geometry that is periodic in all three dimensions. The geometries illustrated correspond to (a) a single slab of water sandwiched between metal surfaces, (b) a slab of water that is on top of a metal slab and has a free surface, and (c) a system with an infinite number of parallel, alternating metal and water slabs.
provides acceptable results but uses relatively little CPU time for short-range interactions. For example, to treat van der Waals interactions, Lennard-Jones 6-12 potentials having the form

$$U_{LJ}(r_{ij}) = 4\epsilon_{LJ}\left[\left(\frac{\sigma_{LJ}}{r_{ij}}\right)^{12} - \left(\frac{\sigma_{LJ}}{r_{ij}}\right)^{6}\right] \qquad [16]$$

are often used. Here $\epsilon_{LJ}$ and $\sigma_{LJ}$ are parameters defining the particular Lennard-Jones potential, and $r_{ij}$ is the distance separating sites i and j. The Lennard-Jones potential is usually truncated at a cutoff distance, $r_c$, of at least $2.5\sigma_{LJ}$, at which distance the potential is about 1.5% of $\epsilon_{LJ}$. In many cases, it is desirable to incorporate corrections for the eliminated portions of the potential. For a single slab of water with Lennard-Jones interactions with a cylindrical cutoff, this correction is approximated by Eq. [17], where $\rho_o$ is the density of Lennard-Jones sites as a function of z.

Simulation cells containing at least a few hundred Lennard-Jones particles with periodic boundary conditions are sufficiently large that $r_c = 2.5\sigma_{LJ}$ is generally smaller than half the shortest distance through the simulation cell. All directly calculated Lennard-Jones interactions occur between “nearest images” of any two interaction sites. There exist a number of ways to compute the nearest image vectors and associated distances. Here we describe only an implementation for three-dimensional periodic boundary conditions because it is simple and generally applicable. Periodic boundary conditions in two dimensions may be implemented similarly.

The basic simulation box, a parallelepiped, may be defined using three lattice vectors: a, b, and c. By convention, a is restricted to lie along the x axis, b to lie in the x-y plane, and c to lie outside the x-y plane. The transformation matrix T between lattice coordinates (in terms of a, b, and c, denoted by r′), and Cartesian coordinates, r, is given by

$$\mathbf{r} = \mathbf{T}\mathbf{r}' \qquad [18]$$

with T having a, b, and c as column vectors. The inverse transformation is simply given by the inverse matrix of T. Hence

$$\mathbf{r}' = \mathbf{T}^{-1}\mathbf{r} \qquad [19]$$
During a simulation, one maintains both sets of positions, r and r′, for the molecules. The potentials and forces between two particles are calculated using the nearest image vector, $\mathbf{r}^{n}_{ij}$, pointing from particle j to particle i. The corresponding vector in lattice coordinates, $\mathbf{r}'^{\,n}_{ij}$, is calculated using

$$\mathbf{r}'_{ij} = \mathbf{r}'_i - \mathbf{r}'_j \qquad [20]$$

and

$$r'^{\,n}_{ij,w} = r'_{ij,w} - \mathrm{NINT}(r'_{ij,w}) \qquad [21]$$

where NINT is a function that determines the integer nearest to its argument and Eq. [21] is applied for each component, w, of the vector. Equation [18] is then used to give $\mathbf{r}^{n}_{ij}$.

Coulombic interactions are inherently long-ranged, and their treatment in simulations employing periodic boundary conditions is still somewhat controversial. For a periodic system, these interactions can be described collectively by Eq. [22],
where the sum over the lattice vectors, $\mathbf{n}$, given by $(n_a\mathbf{a} + n_b\mathbf{b} + n_c\mathbf{c})$, involves all combinations of $(n_a, n_b, n_c)$ and N is the number of sites in the system. The prime symbol on the outermost sum indicates that care needs to be exercised for the cell $(n_a, n_b, n_c) = (0, 0, 0)$ to avoid calculating some unwanted combinations of i and j. These include the interaction of a site with itself (i = j) as well as the interactions between sites within the same molecule that should not interact. Most water models do not include intramolecular Coulombic interactions.

Implementations with smooth truncation schemes have been widely used, in part because of their ease of implementation, their efficiency, and worries that long-range interactions within a periodic representation may introduce artificial correlations into the system. Problems can indeed arise when the characteristic length of the correlations approaches or exceeds half of the smallest distance through the simulation box. This problem is one that has a physical origin and thus is part of the behavior of the periodic system being simulated. These difficulties should be handled by employing larger simulation cells, if possible, rather than by modifying the Hamiltonian to limit these correlations. Using a smooth truncation for the electrostatic potentials, which effectively modifies the Hamiltonian of the system to reduce the correlations induced by the periodicity, is not the correct solution to this problem. Clear examples exist in the literature for standard, typical systems where even smooth truncation of electrostatic potentials with a reasonably large truncation radius in the presence of an aqueous interface leads to significant and artificial changes in computed properties and structure of the system of interest.
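The nearest-image construction of Eqs. [18] to [21] and the plain truncation of the Lennard-Jones potential described earlier in this section can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the example cell dimensions, and the use of Python's built-in `round` in place of NINT are all assumptions made here, and the lattice vectors are assumed to follow the stated convention (a along x, b in the x-y plane).

```python
import math


def to_lattice(r, a, b, c):
    """Invert r = T r' (Eq. [18]); T has columns a, b, c and is upper triangular."""
    sz = r[2] / c[2]
    sy = (r[1] - c[1] * sz) / b[1]
    sx = (r[0] - b[0] * sy - c[0] * sz) / a[0]
    return (sx, sy, sz)


def to_cartesian(s, a, b, c):
    """Apply r = T r' (Eq. [18])."""
    return tuple(s[0] * a[k] + s[1] * b[k] + s[2] * c[k] for k in range(3))


def nearest_image(ri, rj, a, b, c):
    """Nearest-image vector pointing from site j to site i (Eqs. [20] and [21])."""
    si = to_lattice(ri, a, b, c)
    sj = to_lattice(rj, a, b, c)
    sij = [si[k] - sj[k] for k in range(3)]      # Eq. [20]
    sij = [x - round(x) for x in sij]            # Eq. [21], NINT taken as round()
    return to_cartesian(sij, a, b, c)


def lj_truncated(r, eps, sigma, r_cut):
    """Lennard-Jones 6-12 energy with plain truncation (no shift) at r_cut."""
    if r >= r_cut:
        return 0.0
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6)


# Hypothetical orthorhombic cell, lengths in units of sigma
a, b, c = (10.0, 0.0, 0.0), (0.0, 10.0, 0.0), (0.0, 0.0, 12.0)
d = nearest_image((9.5, 0.2, 11.8), (0.5, 9.9, 0.3), a, b, c)
dist = math.sqrt(sum(x * x for x in d))
print(d, dist, lj_truncated(dist, 1.0, 1.0, 2.5))
```

Rounding in lattice coordinates is only guaranteed to return the true nearest image when the cutoff is smaller than half the shortest distance through the cell, which is the condition the text already imposes.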
If truncation of the potential is not to be attempted, one must then calculate the potential of the infinite periodic system, and a commonly used technique for doing so is the Ewald method.24,84,85,92 The Ewald method formulates the slowly converging sum in Eq. [22] as two rapidly converging sums: one in real space similar to Eq. [22] and one in reciprocal space (see below). Implementations of the Ewald method for a system that is finite in one dimension but infinite and periodic in the other two have been published (see, e.g., Refs. 93 and 94). These methods are not in widespread use, however. More typically, Ewald treatments for systems periodic in all three dimensions are employed; that is, the system is also periodic in the z direction, as depicted in Figure 5c. Consequently, the system actually studied consists of an infinite number of parallel slabs of water extending in the remaining dimensions. This geometry introduces the possibility that interactions between molecules of different slabs will affect the results. The extent of this interaction can be monitored as a function of the space separating the slabs. For slabs containing only water, it has been found that the interaction between the slabs is easily controlled with relatively small slab separations (10 Å).91

Using the three-dimensional Ewald method, Eq. [22] is written as the sum of four terms:

$$U_e = U_{real} + U_{recip} + U_{corr} + U_{polar} \qquad [23]$$
The real space sum, $U_{real}$, is given by Eq. [24], where $\beta_{EW}$ is a constant controlling the convergence of the real and reciprocal space sums, and erfc is the complementary error function96

$$\mathrm{erfc}(r) = \frac{2}{\sqrt{\pi}}\int_r^{\infty} e^{-t^2}\,dt \qquad [25]$$
This function rapidly approaches 0 as r increases. Here t is an integration variable representing distance. The reciprocal space term, $U_{recip}$, may be written as a sum over $\mathbf{m} \neq (0, 0, 0)$, where $\mathbf{m} = m_a\mathbf{a}^* + m_b\mathbf{b}^* + m_c\mathbf{c}^*$; it is given by Eqs. [26] and [27], in which V is the volume of the simulation cell. $U_{corr}$, the correction term, is given by Eq. [28], where erf is the error function [erf(r) = 1 − erfc(r)]. This term corrects the Ewald sum so that sites do not interact with themselves (the first sum) and so that the interactions between pairs of sites (i, j), i ≠ j, within the same molecule, which should not interact (referred to as the set Ex), are completely excluded.

The Ewald sum represents an evaluation of the Coulombic energy for a system consisting of a central cell surrounded by all the image cells contained in a large sphere, in the limit that the radius of the sphere increases to infinity. The practical consequence of taking this limit is that the Ewald system is always surrounded by a dielectric medium. The last term, $U_{polar}$, is related to the external medium and is given by Eq. [29].
This term accounts for how the dielectric constant of the surrounding medium, $\epsilon_s$, influences the potential energy of the system. Often “tinfoil” boundary conditions, in which the surrounding medium is a conductor ($\epsilon_s = \infty$) and $U_{polar} = 0$, are used.

A sufficient number of terms can be included in $U_{real}$ and $U_{recip}$ to ensure an accurate description of the electrostatics of the periodic system. Calculating these terms requires most of the CPU time. Changing $\beta_{EW}$ reduces the computational effort for one of these terms to converge but increases the effort required for the other term. Because erfc is short-ranged, the sum in $U_{real}$ may be truncated beyond a selected distance $r_c$. The Fourier space sum can be restricted to $|\mathbf{m}| < m_{max}$. Studies have been conducted concerning the optimal choices for the parameters $r_c$, $\beta_{EW}$, and $m_{max}$ to ensure accuracy while minimizing the computational expense of the Ewald treatment.97 It is usually possible to choose $r_c$ to be less than one-half the minimum distance through the simulation cell, so that only $\mathbf{n} = (0, 0, 0)$ will be considered in the sum $U_{real}$ (minimum image boundary conditions). The degree of accuracy needed to perform these calculations seems to be a matter of opinion. One set of conditions that seems adequate for systems containing a few hundred water molecules is to set $\beta_{EW} = 2.8/r_c$ and $m_{max} \geq 2.5/r_c$ with $r_c$ = 12 Å, provided $r_c$ is less than half of the shortest distance through the simulation box.

Typically, systems containing a few hundred water molecules are sufficiently large for studying water physisorption on metal surfaces, and the version of the Ewald method just described works well for these small systems. For larger systems, other methods that also take into account the long-range nature of the electrostatic interactions, such as the particle mesh Ewald and the Greengard and Rokhlin algorithms, become computationally more efficient than the method outlined above.
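To make the structure of the Ewald decomposition concrete, the sketch below evaluates the real-space, reciprocal-space, and self-interaction contributions for a cubic cell and checks them against the Madelung constant of rock salt. Several assumptions are made here and are not taken from the chapter: the formulas are the common textbook form in Gaussian units, with reciprocal vectors m = (n_a, n_b, n_c)/L and a Gaussian factor exp(-π²|m|²/β²); tinfoil boundary conditions are used so the surrounding-medium term vanishes; no intramolecular pairs are excluded; and all function and variable names are illustrative.

```python
import cmath
import math


def ewald_energy(charges, positions, box, beta, n_max):
    """Ewald Coulomb energy of a cubic periodic cell (Gaussian units, tinfoil boundary)."""
    n_sites = len(charges)
    volume = box ** 3

    # Real-space term: nearest images only, i.e. n = (0, 0, 0) with minimum image
    u_real = 0.0
    for i in range(n_sites - 1):
        for j in range(i + 1, n_sites):
            d = [positions[i][k] - positions[j][k] for k in range(3)]
            d = [x - box * round(x / box) for x in d]
            r = math.sqrt(sum(x * x for x in d))
            u_real += charges[i] * charges[j] * math.erfc(beta * r) / r

    # Reciprocal-space term: sum over m = (na, nb, nc) / box with m != 0
    u_recip = 0.0
    rng = range(-n_max, n_max + 1)
    for na in rng:
        for nb in rng:
            for nc in rng:
                n2 = na * na + nb * nb + nc * nc
                if n2 == 0:
                    continue
                m2 = n2 / box ** 2
                s = sum(q * cmath.exp(2j * math.pi * (na * p[0] + nb * p[1] + nc * p[2]) / box)
                        for q, p in zip(charges, positions))
                u_recip += math.exp(-math.pi ** 2 * m2 / beta ** 2) / m2 * abs(s) ** 2
    u_recip /= 2.0 * math.pi * volume

    # Self-interaction correction (first part of the correction term)
    u_self = -beta / math.sqrt(math.pi) * sum(q * q for q in charges)
    return u_real + u_recip + u_self


# Rock-salt test cell: 8 unit charges on a cube of side 2, nearest-neighbour distance 1
ions = [(i, j, k) for i in (0, 1) for j in (0, 1) for k in (0, 1)]
charges = [(-1.0) ** (i + j + k) for i, j, k in ions]
positions = [(float(i), float(j), float(k)) for i, j, k in ions]
u_total = ewald_energy(charges, positions, box=2.0, beta=4.5, n_max=10)
madelung = -u_total / 4.0  # four ion pairs in the cell
print(madelung)
```

The convergence parameter in this test is deliberately larger than the 2.8/r_c suggested above, because the test cell is so small that neighbours sit exactly at half the box length; with a realistic cell the text's parameter choices shift more of the work into the real-space sum.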
Simulations of water in contact with metal surfaces have been performed in a number of ensembles, including the microcanonical (NVE), canonical (NVT), and grand canonical (μVT) ensembles. The implementation of these ensembles differs for molecular dynamics and for Monte Carlo techniques. The NVT ensemble is convenient because the temperature of the system is maintained along with the number of particles and the volume. However, with the NVE and NVT ensembles care must be exercised to ensure that the density of the water in the system is consistent with the desired equilibrium state. For metal-water interfaces, this means that water molecules near the surface should be in equilibrium with bulk water. For simulations between two walls, the overall density of the water is fixed in the NVE and NVT ensembles, but the water density varies rapidly near metal walls (see below). Hence, it is difficult to determine a priori exactly how much water should be used in the simulation to mimic equilibrium with the bulk solvent. In practice, one can circumvent this problem using experience and trial and error. One simple way around this problem is to carry out a simulation of a film of water on a metal surface35 as shown in Figure 5b. In principle, a constant pressure (NPT) simulation in which the simulation cell changes size and shape might also be used, but we are unaware of any applications of this technique to metal-water interfaces. A μVT simulation, in which the number of water molecules varies under a specified chemical potential, enables a system of constant volume to attain the appropriate density. Using the chemical potential, μ, that yields the correct density for bulk water under the same conditions (P, T) has the following advantage: the water in the simulation is effectively in equilibrium with a reservoir of bulk liquid water.

These kinds of simulation can run for a few CPU hours on current workstations. More typically, however, they require CPU weeks, depending on the size of the system, the nature of the interactions, and the quality of the data required from the simulation. In addition, configurations are stored at frequent intervals during the simulation for analysis later. These trajectories typically occupy as much as a few tens of megabytes of storage space each.
Molecular Dynamics

In its simplest sense, classical molecular dynamics involves integrating the classical equations of motion for a set of molecules. In this regard, the most salient issues concern initial simulation conditions, numerical integration of the equations of motion, ensemble selection, equilibration, and checks on the simulation.
Initial Conditions

To initiate a molecular dynamics simulation, the starting positions and velocities of all particles must be specified. For liquids that are not too viscous, such as water under ambient conditions, placing the molecules on a lattice is an acceptable way to construct an initial configuration. To generate initial velocities consistent with the Maxwell-Boltzmann distribution, one can employ the Box-Muller method. However, the velocities equilibrate very rapidly, so it is acceptable to generate them uniformly, that is, from $v_{wi} = (2\xi - 1)v_{max}$, where ξ is a random number in the interval [0, 1]. In either case, the overall momentum of the system, $\mathbf{P} = \sum_{i=1}^{N} m_i\dot{\mathbf{r}}_i$, will be nonzero. The net momentum can be removed by shifting the molecular velocities by −P/M, where M is the total mass of the molecules. Note that for systems in which the metal is fixed in space, the total momentum need not be conserved. On the time scale of a few hydrogen bond lifetimes (50 ps or less), pure water will typically evolve from such a starting configuration to an equilibrium configuration. Because the velocities are selected in a random manner, the system’s total kinetic energy, K,

$$K = \sum_{i=1}^{N} \frac{1}{2} m_i \dot{\mathbf{r}}_i^2 \qquad [30]$$

will not be consistent with the desired temperature. This can be corrected by scaling the velocities by a factor of $(N_f k_B T/2K)^{1/2}$, where $N_f$ is the number of degrees of freedom in the system (see below), $k_B$ is the Boltzmann constant, and T is the temperature. In Eq. [30] $\dot{\mathbf{r}}_i$ is the first derivative of the position of the particle with respect to time, i.e., the velocity.

Numerical Integration

Finite difference methods are usually used to integrate the equations of motion in molecular dynamics simulations. Using the state of the system at time t, one integrates the equations of motion over a short time interval, δt, to obtain a new state for the system at t + δt. In a simulation, this process must be repeated many times to map the trajectory of the system over a time interval long enough to permit the sampling of the configurational distribution of the system. Many finite difference methods have been employed in molecular dynamics simulations. The Verlet algorithm, one of the most commonly employed, is outlined below. Expanding the particle coordinates $\mathbf{r}_i(t)$ in a Taylor series, at times t − δt and t + δt, gives
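$$\mathbf{r}_i(t - \delta t) = \mathbf{r}_i(t) - \delta t\,\dot{\mathbf{r}}_i(t) + \frac{\delta t^2}{2}\ddot{\mathbf{r}}_i(t) - \cdots \qquad [31]$$

and

$$\mathbf{r}_i(t + \delta t) = \mathbf{r}_i(t) + \delta t\,\dot{\mathbf{r}}_i(t) + \frac{\delta t^2}{2}\ddot{\mathbf{r}}_i(t) + \cdots \qquad [32]$$

respectively, for each atom, i. (Double dot means second derivative, i.e., acceleration.) Adding Eqs. [31] and [32] and solving for $\mathbf{r}_i(t + \delta t)$ gives the Verlet integrator

$$\mathbf{r}_i(t + \delta t) = 2\mathbf{r}_i(t) - \mathbf{r}_i(t - \delta t) + \delta t^2\,\ddot{\mathbf{r}}_i(t) + O(\delta t^4) \qquad [33]$$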
where $\ddot{\mathbf{r}}_i = \mathbf{f}_i/m_i = -(1/m_i)\nabla_{\mathbf{r}_i}U_{total}$, and $U_{total}$ is the total potential energy of the system. The Verlet algorithm approximates the equations of motion using Eq. [33] by ignoring terms involving powers of δt higher than 2. This algorithm has the advantage of being simple to adapt to other techniques and ensembles, as illustrated below. In addition, even though more elaborate algorithms conserve the total energy more accurately over short intervals, the overall drift in the total energy over longer time scales is relatively small for the Verlet algorithm.85

Integration of the equations of motion for flexible molecules can be implemented in a simple manner. One disadvantage of using flexible models in molecular dynamics is that fast internal modes of vibration require one to use significantly smaller time steps, δt, thus increasing the computational costs. The integration of the equations of motion must be handled differently for rigid models such as those commonly used for water (see Table 1). The motion of a molecule can be treated as a rigid body in terms of the center-of-mass positions and orientational coordinates. Another common approach for treating rigid molecules is to use the method of constraints described below. In this method, some components of the molecular geometry are individually held fixed, and a collection of such constraints can be used to force a molecule to be rigid. Although the method outlined below is for completely rigid molecules, the method is generalizable to treat molecules in which some degrees of freedom are constrained and others are not.

For each constraint, α, imposed on the system, there is a relation, $\chi_\alpha$, which is a measure of the deviation of the molecular geometry from that imposed by the constraint. When the molecular geometry is perfectly consistent with the constraint

$$\chi_\alpha = 0 \qquad [34]$$
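For instance, for a bond length constraint between the atoms i and j,

$$\chi_\alpha = |\mathbf{r}_i - \mathbf{r}_j|^2 - d_\alpha^2 \qquad [35]$$

where $d_\alpha$ is the desired bond length. In addition to the forces arising from the potentials, each atom experiences a force, $\mathbf{g}_i$,

$$m_i\ddot{\mathbf{r}}_i = \mathbf{f}_i + \mathbf{g}_i \qquad [36]$$

that is due to the imposition of constraints. The constraint force is given in terms of Lagrange's undetermined multipliers, $\lambda_\alpha$,83-85,104,106,107

$$\mathbf{g}_i = -\sum_\alpha \lambda_\alpha \nabla_{\mathbf{r}_i}\chi_\alpha \qquad [37]$$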
where the sum runs over all the constraints and $\nabla_{\mathbf{r}_i}$ is the gradient with respect to the position of atom i. In principle, by using Eqs. [34], [36], and [37], the equations of motion can be derived to satisfy the imposed constraints. In practice, however, reliance on the approximate solutions of the equations of motion inherent in numerical integration causes the atoms to follow trajectories that are not perfectly aligned with their formal trajectories. The error associated with integrating the equations of motion accumulates, and over a sufficiently long simulation the constraints are no longer satisfied well enough to be useful. A way around this problem is to enforce the constraints at each time step,84,85,104,106,107 using

$$m_i\ddot{\mathbf{r}}_i = \mathbf{f}_i + \mathbf{g}_i^{r} \qquad [38]$$
where $\mathbf{g}_i^{r}$ is the force applied to ensure that the constraints in Eq. [34] are actually met for the current time step. This is implemented by predicting the position of the atoms without the constraint forces and then determining the forces needed to bring the new atom positions into agreement with the constraints. The exact implementation of this approach depends on the method being used to integrate the equations of motion. For the Verlet method, substituting Eq. [38] into Eq. [33] yields

$$\mathbf{r}_i(t + \delta t) = 2\mathbf{r}_i(t) - \mathbf{r}_i(t - \delta t) + \frac{\delta t^2}{m_i}\left[\mathbf{f}_i(t) + \mathbf{g}_i^{r}(t)\right] \qquad [39]$$

Note that the predicted position in the absence of constraints, $\mathbf{r}_i^{0}(t + \delta t)$, is simply

$$\mathbf{r}_i^{0}(t + \delta t) = 2\mathbf{r}_i(t) - \mathbf{r}_i(t - \delta t) + \frac{\delta t^2}{m_i}\mathbf{f}_i(t) \qquad [40]$$
where the 0 superscript indicates that this is the initial prediction for the position. Equation [39] may thus be written in the more general form:

$$\mathbf{r}_i^{k}(t + \delta t) = \mathbf{r}_i^{k-1}(t + \delta t) + \frac{\delta t^2}{m_i}\mathbf{g}_i^{r}(t) \qquad [41]$$

where k = 1. Substituting Eq. [37] for $\mathbf{g}_i^{r}(t)$ and Eq. [41] into Eq. [35] yields the set of equations needed to satisfy the constraints. For fixed bond lengths, the constraint equations involve terms that are first and second order in the $\lambda_\alpha$'s. Dropping the second-order terms yields a system of linear equations that can easily be solved for the $\lambda_\alpha$'s. These in turn are used to give the $\mathbf{g}_i^{r}(t)$ needed in Eq. [41] to provide the first estimate of the constrained positions of the atoms, $\mathbf{r}_i^{1}(t + \delta t)$. Because an approximation is used, the constraints are not met exactly after this first pass. However, the estimate can be improved by substituting $\mathbf{r}_i^{1}(t + \delta t)$ for $\mathbf{r}_i^{0}(t + \delta t)$ in Eq. [41] and repeating the process described above to obtain a second estimate, $\mathbf{r}_i^{2}(t + \delta t)$, where k = 2. This process is iterated until the constraints are satisfied to within a specified tolerance (typically, one part in $10^9$ or better). The constraint method permits the use of time steps as large as 2.5 fs for pure water and for water interacting with rigid metals.
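The predict-then-correct cycle described above can be sketched for distance constraints as follows. Note the substitution: instead of solving the linearized equations for all multipliers simultaneously, this sketch corrects one constraint at a time and sweeps repeatedly (the SHAKE-style variant), which applies the same first-order linearization. The function name, the reduced units, and the SPC-like test geometry are assumptions made for illustration only.

```python
import math


def verlet_shake_step(r, r_prev, forces, masses, constraints, dt,
                      tol=1e-10, max_iter=500):
    """One Verlet step (Eq. [40]) followed by iterative enforcement of fixed distances.

    r, r_prev   : positions at t and t - dt, as lists of [x, y, z]
    constraints : list of (i, j, d) with d the required i-j distance
    """
    n = len(r)
    # Unconstrained prediction of the new positions
    new = [[2.0 * r[i][k] - r_prev[i][k] + dt * dt * forces[i][k] / masses[i]
            for k in range(3)] for i in range(n)]
    for _ in range(max_iter):
        converged = True
        for i, j, d in constraints:
            s = [new[i][k] - new[j][k] for k in range(3)]
            diff = sum(x * x for x in s) - d * d            # constraint deviation
            if abs(diff) > tol * d * d:
                converged = False
                ref = [r[i][k] - r[j][k] for k in range(3)]  # bond vector at time t
                dot = sum(s[k] * ref[k] for k in range(3))
                # First-order estimate of the multiplier (second-order term dropped)
                g = diff / (2.0 * dot * (1.0 / masses[i] + 1.0 / masses[j]))
                for k in range(3):
                    new[i][k] -= g * ref[k] / masses[i]
                    new[j][k] += g * ref[k] / masses[j]
        if converged:
            return new
    raise RuntimeError("constraints did not converge")


def distance(p, q):
    return math.sqrt(sum((p[k] - q[k]) ** 2 for k in range(3)))


# Hypothetical rigid three-site molecule (O, H, H), reduced units, no forces
theta = math.radians(109.47)
d_oh = 1.0
d_hh = math.sqrt(2.0 - 2.0 * math.cos(theta))
pos = [[0.0, 0.0, 0.0], [d_oh, 0.0, 0.0],
       [d_oh * math.cos(theta), d_oh * math.sin(theta), 0.0]]
vel = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.5], [0.0, 0.0, -0.5]]
masses = [16.0, 1.0, 1.0]
bonds = [(0, 1, d_oh), (0, 2, d_oh), (1, 2, d_hh)]
dt = 0.01
prev = [[p[k] - dt * v[k] for k in range(3)] for p, v in zip(pos, vel)]
zero = [[0.0, 0.0, 0.0] for _ in pos]
for _ in range(200):
    pos, prev = verlet_shake_step(pos, prev, zero, masses, bonds, dt), pos
print([distance(pos[i], pos[j]) for i, j, _ in bonds])
```

Using the bond vector at time t as the correction direction is what makes the displacement correspond to a constraint force acting over the step, and the three constraints together keep the molecule rigid as it tumbles.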
For a three-site water model such as SPC or SPC/E, the rigidity of the molecule can be maintained by using three distance constraints, one along each O-H bond and one between the H atoms. Other commonly used water models such as TIP4P and TIP4P-FP contain four coplanar sites, but it is difficult to maintain planarity of the molecule using the recipe given above. The reason is that the constraint forces between the fourth site and the other three all lie in the plane of the molecule and thus are ineffective at enforcing a coplanar geometry. In these models the additional site, taken here to be the M site (Figure 1), is massless. A solution to this particular situation106 is to make the constraint for the additional site (M) a function of the three primary sites (l = 1, 2, 3)

$$\mathbf{r}_M = \sum_{l=1}^{3} c_l\mathbf{r}_l \qquad [42]$$

where the $c_l$'s are constants for a particular water model. Integrating the equations of motion for the primary sites can be carried out as described above, while Eq. [42] is used to define the position of the fourth site (a secondary site). The catch is that the fourth site may interact with other sites in the system (the M site carries a charge), and thus forces acting on this site, $\mathbf{f}_M$, must be transferred onto the primary sites. This is done by adding a term, $c_l\mathbf{f}_M$, to Eq. [37] for each primary atom, l, that is:

$$\mathbf{g}_l = -\sum_\alpha \lambda_\alpha \nabla_{\mathbf{r}_l}\chi_\alpha + c_l\mathbf{f}_M \qquad [43]$$
The constraint method has extensions to more complex models than those described here. Interested readers should consult Reference 106.
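A small sketch of the secondary-site bookkeeping in Eqs. [42] and [43] is given below. The coefficient formula assumes the M site lies on the H-O-H bisector, and the numbers used are the commonly quoted TIP4P geometry (O-H 0.9572 Å, H-O-H 104.52°, O-M 0.15 Å); both the function names and the choice of that geometry are assumptions for illustration rather than values stated in this section.

```python
import math


def m_site_coefficients(r_oh, hoh_deg, r_om):
    """Coefficients (c_O, c_H1, c_H2) that place M on the HOH bisector, r_om from O."""
    half_angle = math.radians(hoh_deg) / 2.0
    c_h = r_om / (2.0 * r_oh * math.cos(half_angle))
    return (1.0 - 2.0 * c_h, c_h, c_h)


def m_site_position(coeffs, primary):
    """Eq. [42]: position of M as a weighted sum of the three primary sites."""
    return [sum(c * p[k] for c, p in zip(coeffs, primary)) for k in range(3)]


def distribute_m_force(coeffs, f_m):
    """Extra force c_l * f_M added to each primary site (the added term in Eq. [43])."""
    return [[c * f_m[k] for k in range(3)] for c in coeffs]


# Assumed TIP4P-like geometry with O at the origin, molecule in the x-y plane
r_oh, hoh, r_om = 0.9572, 104.52, 0.15
half = math.radians(hoh) / 2.0
sites = [[0.0, 0.0, 0.0],
         [r_oh * math.sin(half), r_oh * math.cos(half), 0.0],
         [-r_oh * math.sin(half), r_oh * math.cos(half), 0.0]]
coeffs = m_site_coefficients(r_oh, hoh, r_om)
r_m = m_site_position(coeffs, sites)
extra = distribute_m_force(coeffs, [0.3, -1.2, 0.7])
print(coeffs, r_m)
```

Because the coefficients sum to one, the redistributed forces add up to the original force on M, so the total force on the molecule is unchanged by the transfer.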
Molecular Dynamics in the NVT Ensemble

The natural ensemble for molecular dynamics simulations is the NVE ensemble. When implementing molecular dynamics as described above, the total energy of the system is well conserved. It is often desirable and somewhat more convenient, however, to carry out simulations in the NVT (canonical) ensemble, because one is often interested in performing simulations at or near a particular temperature. A number of methods exist for carrying out molecular dynamics simulations in this ensemble.84,85 We restrict ourselves to describing the widely used approach of employing Nosé-Hoover thermostats. The idea behind this method is that the system of interest can exchange heat with a heat bath to obtain the desired average temperature during a simulation. In practice, this can be implemented by adding one degree of freedom to represent the bath of the system of interest. Heat is transferred dynamically to this degree of freedom when the system is too hot, and heat is transferred out when the system is too cool. The equations of motion for a system with a Nosé-Hoover thermostat are

$$\ddot{\mathbf{r}}_i = \frac{\mathbf{f}_i}{m_i} - \xi\dot{\mathbf{r}}_i, \qquad \dot{\xi} = \frac{1}{M_S}\left(\sum_{i=1}^{N} m_i\dot{\mathbf{r}}_i^2 - g k_B T\right), \qquad \frac{\dot{s}}{s} = \frac{d\ln s}{dt} = \xi \qquad [44]$$

where s is the dynamical variable for the thermostat, and g is the number of degrees of freedom in the system; g = 3N − C − q, where N is the number of atoms in the system, C is the number of constrained degrees of freedom, and q is 3 if the net momentum of the system is conserved and 0 if it is not (e.g., for a corrugated metal surface that is held at a fixed position). The equations of motion for the atoms in the system are integrated as before but with the inclusion of the term $\xi\dot{\mathbf{r}}_i$ in the acceleration and with two other modifications. First, the equations of motion for ξ and for ln s are integrated using Eqs. [45] and [46], respectively.

Second, the current velocities that appear in the accelerations of the particles are problematic because they are not calculated in the normal Verlet method. Ferrario109 outlined a way to obtain estimates for the current velocities, whereby an initial estimate of the current velocities of the atoms is obtained from the preceding time step t − δt using

$$\dot{\mathbf{r}}_i(t) = \dot{\mathbf{r}}_i(t - 2\delta t) + 2\ddot{\mathbf{r}}_i(t - \delta t)\,\delta t \qquad [47]$$
These velocities can be used to estimate the damping term $\xi\dot{\mathbf{r}}_i$ (and ln s), which is used in the normal Verlet integration scheme to estimate the positions $\mathbf{r}_i(t + \delta t)$. A better estimate for the current velocities can be obtained using

$$\dot{\mathbf{r}}_i(t) = \frac{\mathbf{r}_i(t + \delta t) - \mathbf{r}_i(t - \delta t)}{2\delta t} \qquad [48]$$
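The velocity estimate of Eq. [47] and its refinement through Eq. [48] can be sketched for a single coordinate as follows. To keep the example short, the thermostat variable ξ is held fixed during the step, which is a simplification made here: in a full Nosé-Hoover integration ξ evolves as well. The function and argument names are illustrative.

```python
def damped_verlet_step(x, x_prev, v_prev2, a_prev, force, mass, xi, dt,
                       n_iter=20, tol=1e-14):
    """Verlet step with a velocity-dependent term -xi*v in the acceleration.

    x, x_prev : positions at t and t - dt
    v_prev2   : velocity at t - 2*dt
    a_prev    : acceleration at t - dt
    """
    v = v_prev2 + 2.0 * a_prev * dt                 # first guess, Eq. [47]
    for _ in range(n_iter):
        a = force(x) / mass - xi * v                # acceleration with damping term
        x_next = 2.0 * x - x_prev + dt * dt * a     # Verlet update, Eq. [33]
        v_new = (x_next - x_prev) / (2.0 * dt)      # refined velocity, Eq. [48]
        done = abs(v_new - v) < tol
        v = v_new
        if done:
            break
    return x_next, v, a


# Harmonic force with hypothetical parameters
x_next, v_now, a_now = damped_verlet_step(
    x=1.0, x_prev=0.99, v_prev2=1.0, a_prev=-1.0,
    force=lambda q: -q, mass=1.0, xi=0.5, dt=0.01)
print(x_next, v_now)
```

For fixed ξ each pass shrinks the velocity error by a factor of about ξδt/2, which is consistent with convergence in a few cycles. Each pass of the loop ends by re-evaluating the velocities from Eq. [48].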
These velocities in turn are used to obtain a better estimate of $\mathbf{r}_i(t + \delta t)$ and thus another refinement of the current velocities. This process can be iterated until the estimates converge, typically in only a few cycles. The term $\xi\dot{\mathbf{r}}_i$ plays a role similar to a friction term in Brownian dynamics except that it can also serve to speed up the molecules if ξ takes on negative values. The parameter $M_S$ is a generalized mass. Its precise value is not crucial, but for efficient coupling between the system and the heat bath, it should be selected so that the thermostat relaxation time corresponds roughly to a typical oscillation period for the molecules in the system.110 The integration of ln s in Eq. [46] is redundant in the equations of motion. However, it appears in the expression,

$$H_{\text{Nosé}} = \sum_{i=1}^{N}\frac{\mathbf{p}_i^2}{2m_i} + U_{total} + \frac{M_S\xi^2}{2} + g k_B T\ln s \qquad [49]$$

A virtue of this simple method is that it deterministically samples phase space in a manner consistent with the canonical ensemble, while providing a quantity, $H_{\text{Nosé}}$, similar to a total energy, which is conserved. The latter property is of value because conservation is a useful check on the validity of the simulation.

Both thermostats and constraints modify the positions of the atoms in the system, so the effects of applying each must be accounted for. It is straightforward to do this by first applying the thermostats and then enforcing the constraints, which in turn will affect the temperature. By iterating the application of both the thermostats and the constraints, a consistent solution can typically be obtained in a few cycles.

Equilibration

The initial choices of configurations and velocities are usually not typical of the system when it is at equilibrium. As a result, it is necessary to simulate the system for a time period prior to the actual production simulation, allowing the system to relax from its initial artificial state into one that is more characteristic of the system at equilibrium. During the equilibration period, it is often desirable to scale (or even reset) the velocities several times to attain the target temperature and to encourage equilibration. It is sometimes advantageous to overadjust the velocities to allow for the repartitioning of energy between the potential and kinetic terms.

A question typically posed by computational chemists is: When is equilibration complete? There are no hard and fast rules applicable to all cases; rather, the answer depends on what properties are of interest. Typically, one proceeds with the simulation and monitors characteristic properties of the system until they are constant or attain anticipated values. At that point, one begins collecting the data of interest (the actual dividing line between the equilibration and simulation stages can be established after the calculation is complete). In most systems that have not been preequilibrated, the initial potential energy is quite different from the average equilibrium potential energy, and this value can be monitored until it comes to fluctuate about a consistent level. While the potential energy is evolving, so does the temperature. Stabilization of these energetic factors is a necessary but not sufficient condition for considering that the system has been equilibrated. In aqueous systems, energetic properties can settle down remarkably quickly. Since, however, there are compensating cancellations between interactions within these systems, configurations that are not characteristic of equilibrium states can have energies that are similar to typical equilibrium states. In water, for example, the hydrogen bond lifetime is in the neighborhood of 5 ps. For pure water, the equilibration stage should thus be at least several times longer than this period. To further evaluate the system’s equilibration, it is appropriate to monitor some other properties of interest, such as radial distribution functions or density profiles of the molecules with respect to the interface.
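The velocity set-up described under Initial Conditions, which is also what is reapplied when velocities are rescaled during equilibration, can be sketched as below: uniform random velocities, removal of the net momentum, and scaling by the factor (N_f k_B T/2K)^{1/2}. The function name, the SI units, the value of v_max, and the choice of 64 point masses of water are assumptions made for this illustration.

```python
import random

K_B = 1.380649e-23  # Boltzmann constant in J/K


def initial_velocities(masses, v_max, temperature, n_dof, rng):
    """Uniform random velocities with zero net momentum, rescaled to `temperature`."""
    vel = [[(2.0 * rng.random() - 1.0) * v_max for _ in range(3)] for _ in masses]
    total_mass = sum(masses)
    for k in range(3):
        p_k = sum(m * v[k] for m, v in zip(masses, vel))
        for v in vel:
            v[k] -= p_k / total_mass          # shift every velocity by -P/M
    kinetic = 0.5 * sum(m * sum(c * c for c in v) for m, v in zip(masses, vel))
    scale = (n_dof * K_B * temperature / (2.0 * kinetic)) ** 0.5
    return [[c * scale for c in v] for v in vel]


def kinetic_energy(masses, vel):
    """Total kinetic energy, Eq. [30]."""
    return 0.5 * sum(m * sum(c * c for c in v) for m, v in zip(masses, vel))


# 64 point masses with the mass of a water molecule (kg), target 298 K
masses = [2.9915e-26] * 64
n_dof = 3 * len(masses) - 3                   # net momentum is conserved here
vel = initial_velocities(masses, 500.0, 298.0, n_dof, random.Random(0))
print(2.0 * kinetic_energy(masses, vel) / (n_dof * K_B))
```

Removing the momentum before rescaling matters, because the scaling step preserves a zero net momentum but the shift would change the kinetic energy if applied afterward.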
Performing the Simulations

To obtain meaningful averages, a simulation should be conducted for a sufficiently long time period to permit averaging over the inherent fluctuations of the properties of interest. Given the hydrogen bond lifetime for pure water, a minimum of 50-100 ps is needed, but some properties may require much longer. For instance, the diffusion rate of water along a metal surface has not yet been computed accurately because this process is very slow. The conservation of energy or, in the case of an NVT simulation, $H_{\text{Nosé}}$, serves as a basic check on the correctness and quality of the simulation. Where possible, the results of a study, particularly for new programs, should be checked carefully against other existing results. Our take-home message is that great care is necessary to ensure high quality results, and one must assess those results with the understanding that they are prone to misinterpretation.
Monte Carlo Methods

The Metropolis Monte Carlo method attempts to sample a representative set of equilibrium states in a manner that facilitates the calculation of meaningful averages for properties of the system. This method is discussed in the following subsections: Basic Aspects of the Metropolis Monte Carlo Method, Monte Carlo Moves, and General Pointers for Conducting Monte Carlo Simulations.
Basic Aspects of the Metropolis Monte Carlo Method

The probability of an arrangement (labeled a) of N indistinguishable water molecules, in the canonical ensemble, is given by

$$p_a = \frac{(\Omega\Lambda^3)^{-N}\, e^{-\beta U_a}\, d\mathbf{r}\, d\boldsymbol{\Omega}}{Q} \qquad [50]$$

where Λ is the thermal de Broglie wavelength ($\Lambda = \sqrt{\beta h^2/2\pi m_w}$, h is Planck's constant, and $m_w$ the mass of a water molecule), Ω is $8\pi^2$ for a nonlinear species like water, $\beta = (k_BT)^{-1}$, and $U_a$ is the total potential energy of the ath state of the system. The term $d\boldsymbol{\Omega}$ is the product of all solid angle elements for all the water molecules in the system, and the corresponding term $d\mathbf{r}$ is the product of all volume elements for all water molecules in the system; as usual, Q is the canonical partition function. The momenta do not explicitly appear in this equation. Instead, they have been averaged, under the assumption that the system is at equilibrium. Averages of a property A can be calculated using

$$\langle A\rangle = \frac{\int\!\!\int A\, e^{-\beta U}\, d\mathbf{r}\, d\boldsymbol{\Omega}}{\int\!\!\int e^{-\beta U}\, d\mathbf{r}\, d\boldsymbol{\Omega}} \qquad [51]$$
where the integrals run over all positions and orientations available to the molecules. Direct numerical integration of the 6N-dimensional integrals is not practical even for a small collection of water molecules. Monte Carlo simulations attempt to overcome this problem by randomly sampling characteristic configurations of the system, enabling one to obtain meaningful average properties. The averages calculated with this equation should agree with those calculated in an MD simulation for the same ensemble, provided adequate sampling is possible and has been achieved.

The probability $p_a$ varies rapidly with the configuration of water molecules in the liquid state. In fact, for liquid water (and most liquids) under ambient conditions, the overwhelming majority of the possible microstates have minuscule probabilities. Consequently, a completely random sampling of configurations is impractical because only exceedingly rare configurations provide significant contributions, and it is usually necessary to average over many such contributions for many properties of interest (e.g., the average potential energy) to make a reasonable estimate for any such property. To overcome this problem, Metropolis et al.111 devised a method that biases the configurations sampled so that those making significant contributions are visited most often (importance sampling). The Metropolis Monte Carlo method is still the most commonly employed Monte Carlo simulation method for molecular systems.

A Metropolis Monte Carlo simulation starts with a collection of molecules in a known configuration. The simulation consists of a large number of steps, each of which is an attempt to introduce an acceptable change in the collection of molecules. This change is either accepted or rejected based on a simple set of rules that ensure consistency of the results with the desired ensemble.
If a change is accepted, the new state is used to generate the next step of the simulation; otherwise, the unmodified configuration is used. Translational and rotational moves in the canonical ensemble are described below, followed by
an explanation of insertion and deletion moves within the grand canonical ensemble.
Monte Carlo Moves

Figure 6 depicts an attempt to use a Monte Carlo move to translate a molecule in a system that is currently in a state labeled “a” to a new position, thereby creating a new state for the system as a whole, referred to as “b.” The specific translation being attempted is based on random numbers, and there exist many ways to do this. A very simple and commonly used method is to choose the new position for the molecule from within a small cube with sides of length $2\delta r_{max}$ centered on the current position of the molecule, that is,
where a different random number, $\xi$, is used for each dimension $w$. The probability of choosing the state $b$, called $\alpha_{ab}$, from within this cube is
where $dr_i$ is the volume element of the position of $i$, the molecule being translated. The condition most commonly used to reject or accept the new state $b$ is based on the concept of “detailed balance,” requiring that for a sufficiently long
Figure 6 Monte Carlo displacement moves. An attempt is made to change state $a$ (the current state of the system) into state $b$ with a probability $\alpha_{ab}$. The reverse process involves changing state $b$ into $a$ with a probability $\alpha_{ba}$. Equations [56] and [57] are used to decide whether to accept or reject the change from state $a$ to state $b$.
simulation of a system at equilibrium, the probability of making a successful Monte Carlo move from any state $a$ to any other state $b$ is equal to the probability of successfully making the reverse move from state $b$ to state $a$. Detailed balance can be expressed mathematically as

$$\pi_{ab}\, p_a = \pi_{ba}\, p_b \qquad [54]$$

where $\pi_{ab}$ is the probability that a system in state $a$ will change into one in state $b$, and $\pi_{ba}$ is the corresponding probability for the system changing from state $b$ to state $a$. $\pi_{ab}$ may be written

$$\pi_{ab} = \alpha_{ab}\, f_{ab} \qquad [55]$$

where $f_{ab}$ is the probability that an attempted move from state $a$ to state $b$ is accepted. Equations [54] and [55] can be used to give

$$\frac{f_{ab}}{f_{ba}} = \frac{\alpha_{ba}\, p_b}{\alpha_{ab}\, p_a} \qquad [56]$$

Metropolis et al.111 ensured that this equation, and thus detailed balance, are satisfied by requiring

$$f_{ab} = \mathrm{MIN}\left(1, \frac{\alpha_{ba}\, p_b}{\alpha_{ab}\, p_a}\right) \qquad [57]$$
where the function MIN takes on the value of the smaller of its two arguments. This equation indicates that the probability of accepting a move is determined by the inherent probabilities of the states $a$ and $b$, and by the probabilities that attempts will be made to change the system from $a$ to $b$ and vice versa. Once $f_{ab}$ has been calculated, the move is accepted if $f_{ab}$ is greater than or equal to a uniform random number on the interval [0,1]. For the translational move, substituting Eqs. [50] and [53] into Eq. [56] gives
Further, substituting Eq. [58] into Eq. [57] and comparing the resulting $f_{ab}$ to a random number on [0,1] determines whether the move is accepted or rejected. Equation [58] involves the ratio of $p_b$ over $p_a$, so the partition function, $Q$, appearing in both $p_a$ and $p_b$, cancels. Similarly, for this translational move, $\alpha_{ab}$ and $\alpha_{ba}$ are equal and thus cancel.
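The translational move and its acceptance test can be illustrated with a short Python sketch. It is not taken from the authors' program: the single particle in a harmonic well, the reduced units ($k_B T = 1$), and the names `harmonic_energy`, `metropolis_translate`, `run`, and `dr_max` are all assumptions made for the example. Because the attempt probabilities cancel for this move and $Q$ cancels in the ratio of state probabilities, the sketch assumes the acceptance probability reduces to MIN(1, exp[$-\beta(U_b - U_a)$]).

```python
import math
import random


def harmonic_energy(pos, k=1.0):
    """Hypothetical test potential U = (k/2)|r|^2 in reduced units."""
    return 0.5 * k * sum(x * x for x in pos)


def metropolis_translate(pos, energy_fn, beta, dr_max, rng):
    """Attempt one translational move; return (position, accepted)."""
    # Trial position chosen uniformly inside a cube of side 2*dr_max centred
    # on the current position, with a different random number per dimension.
    trial = [x + (2.0 * rng.random() - 1.0) * dr_max for x in pos]
    delta_u = energy_fn(trial) - energy_fn(pos)
    # Attempt probabilities are equal for this move, so only the Boltzmann
    # factor of the energy change remains in the acceptance probability.
    f_ab = 1.0 if delta_u <= 0.0 else math.exp(-beta * delta_u)
    if f_ab >= rng.random():
        return trial, True
    return pos, False  # rejected: the unmodified configuration is reused


def run(n_steps=200_000, beta=1.0, dr_max=1.5, seed=12345):
    """Return (average potential energy, acceptance rate)."""
    rng = random.Random(seed)
    pos = [0.0, 0.0, 0.0]
    u_sum = 0.0
    n_accepted = 0
    for _ in range(n_steps):
        pos, accepted = metropolis_translate(pos, harmonic_energy, beta, dr_max, rng)
        n_accepted += accepted
        u_sum += harmonic_energy(pos)  # accumulate after every attempt
    return u_sum / n_steps, n_accepted / n_steps
```

For this test potential the exact canonical average is $\langle U \rangle = 3/(2\beta)$, which gives a simple check that the sampling is behaving sensibly.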
Clearly, the nature of the moves employed in a Monte Carlo simulation is relatively unrestricted and is based on considerations relating to the efficiency of the sampling. This freedom has led to a wide variety of valid (and perhaps in some cases invalid) schemes for performing these simulations. Because water molecules are not spherically symmetric, a Monte Carlo simulation should also sample their orientations. A common approach for describing the orientation of the molecules is to use the Euler angles ($\phi$, $\theta$, $\psi$). Although orientation moves are similar in spirit to translational moves, care must be taken to avoid violations of detailed balance or numeric difficulties. For example, using Euler angles, the $d\Omega$ term in Eq. [50] becomes

$$d\Omega = \sin\theta \, d\phi \, d\theta \, d\psi \qquad [59]$$
and the ratio of the probabilities $p_b$ and $p_a$ is proportional to $\sin\theta_b/\sin\theta_a$. This ratio can be unreasonably large when $\theta_a$ is small or zero. However, as we now show, this problem can be avoided by uniformly generating changes in $\cos\theta$ rather than $\theta$ itself. Presented here is a method we have employed, but there exist a number of equally appropriate methods. The reorientation of the molecule is described within the molecular frames of reference for both the $a$ and $b$ states for the forward and reverse moves, respectively, as illustrated in Figure 7. The rotation matrix $R_a$, corresponding to the Euler angles ($\phi_a$, $\theta_a$, $\psi_a$), is

$$R_a = \begin{pmatrix}
\cos\phi_a\cos\psi_a - \cos\theta_a\sin\phi_a\sin\psi_a & \cos\psi_a\sin\phi_a + \cos\theta_a\cos\phi_a\sin\psi_a & \sin\theta_a\sin\psi_a \\
-\sin\psi_a\cos\phi_a - \cos\theta_a\sin\phi_a\cos\psi_a & -\sin\psi_a\sin\phi_a + \cos\theta_a\cos\phi_a\cos\psi_a & \sin\theta_a\cos\psi_a \\
\sin\theta_a\sin\phi_a & -\sin\theta_a\cos\phi_a & \cos\theta_a
\end{pmatrix} \qquad [60]$$

such that

$$\mathbf{r}_m = R_a \mathbf{r} \qquad [61]$$
where $\mathbf{r}$ is a vector measured from the molecule’s center of mass in the space-fixed frame, and $\mathbf{r}_m$ is the same vector described in the molecular frame of reference. One can generate a rotation matrix, $R_b$, for state $b$ by performing a series of rotations. It is advantageous to perform the rotations in the molecular frame, because the move will then conform to a recognizable change in the molecular orientation. Within the molecular frame, a rotation by an angle $\delta\phi$, selected uniformly up to a specified maximum value, is applied first. The corresponding rotation matrix is

$$R_{\delta\phi} = \begin{pmatrix}
\cos\delta\phi & \sin\delta\phi & 0 \\
-\sin\delta\phi & \cos\delta\phi & 0 \\
0 & 0 & 1
\end{pmatrix} \qquad [62]$$
Figure 7 Monte Carlo rotational moves. States $a$ and $b$ are represented by the molecule-fixed coordinate axes. $R_\delta^{(a)}$ is the rotational matrix for the move involving rotating the molecule from state $a$ to state $b$, expressed in the coordinate system of state $a$. $R_\delta^{(b)}$ is the rotational matrix for the reverse move involving rotating the molecule from state $b$ to state $a$, expressed in the coordinate system of state $b$.
Rotation by an angle $\delta\theta$ is then applied to the molecule. Instead of sampling $\delta\theta$ uniformly, it is better to sample $\delta(\cos\theta)$ uniformly, up to its specified maximum value $\delta(\cos\theta_{max})$,84 as follows:
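The corresponding rotation matrix is now

$$R_{\delta\theta} = \begin{pmatrix}
1 & 0 & 0 \\
0 & \delta(\cos\theta) & \delta(\sin\theta) \\
0 & -\delta(\sin\theta) & \delta(\cos\theta)
\end{pmatrix} \qquad [64]$$

where $\delta(\sin\theta) = \sqrt{1 - (\delta\cos\theta)^2}$. Finally, the molecule is rotated by an angle $\delta\psi$, chosen uniformly up to a maximum value (usually picked to be equal to $\delta\phi_{max}$). The corresponding rotation matrix is then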
The new rotation matrix, $R_b$, can be calculated using

$$R_b = R_\delta^{(a)} R_a \qquad [66]$$

and taking the product of three matrices
Euler angles for state $b$, ($\phi_b$, $\theta_b$, $\psi_b$), can be calculated from $R_b$. Expressions for $\alpha_{ab}$ and $\alpha_{ba}$ for the rotational move can now be written. The contributions for $\phi$ and $\psi$ are simply $d\phi/\delta\phi_{max}$ and $d\psi/\delta\psi_{max}$, respectively. Because a random uniform change was made in $\cos\theta$, the probability for a particular change in $\theta$ is given by $\sin\theta/(1 - \delta\cos\theta_{max})$.101 Hence,

Similarly,

[69]

where the (a) and (b) superscripts indicate that the rotations were performed in the molecular fixed frames for the $a$ and $b$ states, respectively. The ratio of the probability of state $b$ (in the reference frame of state $a$) to the probability of state $a$ (in the reference frame of state $b$) is given by
Substituting Eqs. [68],[69], and [70] into Eq. [56] yields
Note that by choosing to change $\delta(\cos\theta)$ with a uniform probability, we cancel the potentially troublesome $\sin\theta$ terms. An alternative method for specifying and tracking molecular orientations
is to employ unit quaternions (unit vectors in four-dimensional space).84,85,113 Quaternions do not have the problems due to the $\sin\theta$ terms that arise when Euler angles are used, but the description of these vectors is somewhat less intuitive than that provided by Euler angles. It is straightforward to construct compound moves in which both the molecular center of mass and the orientation of the molecule are simultaneously modified.

The rate of exploring configurational space depends on the rate at which moves are accepted and the magnitudes of the changes in the configuration for the accepted moves. To make the most efficient use of computer time, then, it is important to consider the maximum change that should be used. For instance, using a small $\delta r_{max}$ for the displacements of the water molecules will lead to small changes in the configuration for moves that are accepted. However, since the energy changes corresponding to these moves will be small, they will have high acceptance probabilities. Conversely, large values for $\delta r_{max}$ will correspond to low acceptance rates for larger changes in the configuration. The choice of $\delta r_{max}$ thus controls how rapidly the system explores available configuration space and thus how efficiently meaningful averages can be calculated. Usually the value of $\delta r_{max}$ is defined in terms of the corresponding acceptance rates for these moves. A common practice is to choose the upper limit for the size of a change so that the acceptance probability is near 50%.84,85 Evidence has recently emerged suggesting that acceptance rates of ~25% can lead to more efficient exploration of configurational space for dense Lennard-Jones fluids and for liquid water.115 Values for the upper limits on the size of a change for a given target acceptance rate usually can be established by trial and error using short simulations. Another approach is to automatically adjust the maximum size of the change during the simulation. Strictly speaking, such adjustments lead to violations of detailed balance. For instance, if a move close to $\delta r_{max}$ is accepted but then a decision is made to reduce the size of $\delta r_{max}$, the simulation cannot attempt to get back to the old state.
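Automatic adjustment of the maximum change toward a target acceptance rate can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the function name `adjust_max_step`, the scaling factor, the bounds `dr_min` and `dr_cap`, and the default target are all hypothetical choices.

```python
def adjust_max_step(dr_max, n_accepted, n_attempted,
                    target=0.5, scale=1.05, dr_min=1e-3, dr_cap=10.0):
    """Return a maximum step size nudged toward the target acceptance rate.

    All default values are illustrative assumptions.
    """
    if n_attempted == 0:
        return dr_max  # no statistics yet, leave the limit unchanged
    rate = n_accepted / n_attempted
    if rate > target:
        dr_max *= scale   # too many acceptances: the moves are too small
    elif rate < target:
        dr_max /= scale   # too many rejections: the moves are too large
    # Keep the limit inside sensible bounds.
    return min(max(dr_max, dr_min), dr_cap)
```

In such a scheme the acceptance counters would be reset after each adjustment, and the function would be called only occasionally so that the changes remain small and infrequent.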
If the adjustments are small (we typically change the limit by 5%) and are made infrequently (once every 50 moves per particle appears safe), the simulation results are not noticeably affected by this process.116

A grand canonical Monte Carlo simulation is somewhat similar to a canonical ensemble simulation. The same translational and rotational moves for the molecules are used, with identical acceptance criteria. Also, moves attempting to introduce an additional water molecule into the system (insertions) or to remove a water molecule from the system (deletions) are included. For an insertion, the position of the new water molecule is selected randomly within the simulation cell, whereas its orientation is chosen in a manner analogous to that used for reorienting a water molecule, that is, using $\phi = \theta = \psi = 0$ and $\delta\phi_{max} = \delta\psi_{max} = 2\pi$ and $\delta(\cos\theta)_{max} = 2$. The acceptance criterion for water insertions done this way is given by
where $\mu$ is the chemical potential of the water specified at the beginning of the simulation. For a deletion attempt, the molecule to be removed is selected randomly. The acceptance criterion for this move is given by
The acceptance rates for insertions and deletions can be exceedingly low, making it impractical to employ this technique in many cases. However, for water under ambient conditions this technique can be successfully utilized at the cost of increased computational time. Where applicable, grand canonical Monte Carlo simulations enable constant volume simulations to attain the appropriate densities. This is particularly useful when one is simulating a system in which the appropriate density is not known a priori, as is often the case for inhomogeneous systems.
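As a rough illustration of insertion and deletion tests, the sketch below uses the standard textbook acceptance rules for structureless particles in the grand canonical ensemble. This is an assumption made for the example and is not a reconstruction of the criteria used in this chapter: for rigid water the orientational factors are assumed to be absorbed into the chemical potential, and the function and parameter names (`insertion_acceptance`, `deletion_acceptance`, `wavelength` for the thermal de Broglie wavelength) are hypothetical.

```python
import math


def insertion_acceptance(n, volume, beta, mu, delta_u, wavelength):
    """Acceptance probability for N -> N+1, with delta_u = U(N+1) - U(N).

    Textbook form for structureless particles (an assumption here).
    """
    arg = math.log(volume / (wavelength ** 3 * (n + 1))) + beta * (mu - delta_u)
    return 1.0 if arg >= 0.0 else math.exp(arg)


def deletion_acceptance(n, volume, beta, mu, delta_u, wavelength):
    """Acceptance probability for N -> N-1, with delta_u = U(N-1) - U(N)."""
    if n == 0:
        return 0.0  # there is no molecule to remove
    arg = math.log(wavelength ** 3 * n / volume) - beta * (mu + delta_u)
    return 1.0 if arg >= 0.0 else math.exp(arg)
```

A useful consistency check for any such pair of rules is that the ratio of the insertion acceptance to the acceptance of the corresponding reverse deletion equals the ratio of the grand canonical weights of the two states.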
General Pointers for Conducting Monte Carlo Simulations

When moves of different types, such as displacements, rotations, insertions, and deletions, are used in the same simulation, the order in which moves are made becomes important. In particular, imposing a fixed order for each type of move can lead to situations in which the reverse move is not possible, thus violating detailed balance. This problem can be avoided by including randomness in the choice of what type of move to attempt at any stage of the simulation. When carrying out Monte Carlo simulations, it is sometimes worthwhile to use the concept of a “pass,” where a pass is specified as roughly one Monte Carlo displacement (translation and/or rotation) attempt per water molecule in the system. In our experience, one Monte Carlo pass results in approximately the same magnitude of change as found in one molecular dynamics time step of 2.5 fs for water under ambient conditions. This comparison can help one estimate the number of Monte Carlo steps needed to equilibrate the system and sample different configurations adequately. The approach we use for our Monte Carlo simulations is to specify the desired number of moves for each type of change per pass, $n_{type}$, in the input file. At each stage of the simulation, the type of move attempted is chosen with a probability $n_{type}/N_{pass}$, where $N_{pass}$ is the total number of moves per pass.

Unlike molecular dynamics methods, Monte Carlo simulations lack conserved quantities such as energy or linear momentum that can serve as a convenient flag to problems in the simulation. Consequently, careful testing of the program and comparison of results with those from other programs or with well-established literature values is essential to ensure that the simulations are meaningful.
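The random choice of move type discussed earlier in this section can be sketched in a few lines of Python. The dictionary of moves per pass, its values, and the function name `choose_move_type` are hypothetical; only the idea of selecting each type with probability equal to its share of the moves in a pass comes from the text.

```python
import random


def choose_move_type(moves_per_pass, rng):
    """Pick a move type with probability n_type / N_pass."""
    types = list(moves_per_pass)
    weights = [moves_per_pass[t] for t in types]
    # random.Random.choices normalizes the weights internally.
    return rng.choices(types, weights=weights, k=1)[0]


# Hypothetical input: desired number of attempts of each type per pass.
moves_per_pass = {"translate": 200, "rotate": 200, "insert": 25, "delete": 25}

rng = random.Random(2024)
counts = {t: 0 for t in moves_per_pass}
n_pass_total = sum(moves_per_pass.values())
for _ in range(n_pass_total * 100):  # roughly 100 passes
    counts[choose_move_type(moves_per_pass, rng)] += 1
```

Because the type is drawn afresh at every stage, no fixed ordering of move types is imposed, while the long-run mix of attempts still matches the requested numbers per pass.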
The random number generator used in Monte Carlo simulations should also be chosen carefully, because random number generators having undesirable properties can lead to incorrect results.85 For most simulations of liquid water, the random number generator does not take up a significant fraction of the total CPU time, so one can afford to use a slow but effective random number generator.84 Further information on random number generators can be found elsewhere.101

Initial coordinates for the molecules present in the system being studied are required. As with molecular dynamics, the molecules can be placed initially on a lattice. An effective way to generate an initial configuration is to use a grand canonical Monte Carlo simulation to randomize the placement of molecules in the system. To accomplish this effectively, it is usually necessary to use a high chemical potential during the initial stages of the simulation, forcing enough molecules into the original simulation cell to cause the interaction energies to approach those normally experienced in the liquid state.

Concerns regarding equilibration of the system in Monte Carlo simulations are similar to those in molecular dynamics simulations. One often has some a priori knowledge of the time scales for events in the system being studied. In molecular dynamics studies, these time scales can be used as a guide for estimating the numerical integration time step size, the equilibration times, and the simulation times required. But, for Monte Carlo simulations of the nature described in this chapter, time is irrelevant, making it more difficult to estimate the length of the simulation required for both equilibration and sampling. Fortunately, the rough correspondence between the system evolution, based on attempted translational or rotational moves per molecule, and how much the system evolves during a typical molecular dynamics time step (2.5 fs for water) provides the means for making such estimates. For grand canonical simulations, the equilibration period should be extended to ensure that no net drift in the number of water molecules is present. One common mistake made by novices using Monte Carlo simulations is to update averages only when a move is accepted.
This is incorrect. Consider, for example, a two-state system in which the states have different probabilities. Updating averages only when the system changes state will always result in an arithmetic average of the property for the two states, regardless of how differently the states are populated at equilibrium. One of the great advantages of the Metropolis Monte Carlo approach as opposed to some other Monte Carlo methods is that the configurations are sampled using the appropriate probability for the given ensemble. As a result, averages for properties can be calculated by simply averaging the value of the property at the end of each stage of the simulation without any weighting factors.
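The two-state example can be checked numerically. The sketch below is only an illustration: the energy gap, the property values (0 and 1), and the function name `two_state_averages` are assumptions chosen for the demonstration.

```python
import math
import random


def two_state_averages(beta_delta_u=2.0, n_steps=100_000, seed=7):
    """Compare correct and incorrect averaging for a two-state system.

    State 0 has energy 0 and property value 0; state 1 lies higher in
    energy by delta_u and has property value 1 (hypothetical values).
    Returns (average over all attempts, average over accepted moves, exact).
    """
    rng = random.Random(seed)
    boltz = [1.0, math.exp(-beta_delta_u)]  # unnormalized state probabilities
    prop = [0.0, 1.0]
    state = 0
    sum_all = 0.0                # updated after every attempted move (correct)
    sum_acc, n_acc = 0.0, 0      # updated only after accepted moves (incorrect)
    for _ in range(n_steps):
        trial = 1 - state
        f_ab = min(1.0, boltz[trial] / boltz[state])
        if f_ab >= rng.random():
            state = trial
            sum_acc += prop[state]
            n_acc += 1
        sum_all += prop[state]
    exact = boltz[1] / (boltz[0] + boltz[1])
    return sum_all / n_steps, sum_acc / n_acc, exact
```

Accepted moves necessarily alternate between the two states, so the accepted-only average sits at the arithmetic mean of the two property values, whereas the average accumulated after every attempt approaches the Boltzmann-weighted value.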
ANALYSIS AND RESULTS FOR METAL-WATER INTERFACES

In this section, we discuss the types of analysis used for metal-water interfaces and provide examples of results from the literature. Generally, analyses of some types are conducted during the simulation; however, typically most of the more complex kinds of analyses are carried out using stored trajectories following the simulation. The analysis and results of simulations to be discussed are categorized under visualization, electron density from a jellium model, structure, dynamics, and miscellaneous properties.
Visualization

Interactive visualization of the molecular configurations during the iterative course of code development, system setup, simulation, and analysis stages of research is an indispensable and often undervalued tool. For researchers debugging code, visualization of the system can provide a quick check and illuminate the cause of a problem. Visualization is an important first check on appropriateness of both the initialization and execution of a simulation; it should be one of the first resorts employed in troubleshooting. For problem-free simulations, visualization can also provide the researcher with an impression of the nature of the system’s behavior and the time scales of events. Visualization obviously overlaps with the numerical analysis of the structures and dynamics of the molecules in the system, but visualization warrants special attention because it provides a means to convey information more succinctly, enabling us to bring into play our abilities to conceptualize that information. In addition, some system behaviors and structures, particularly collective ones that may be difficult to anticipate and analyze, become obvious in graphical images. For example, the improper ordering of the molecules arising from boundary condition effects as described in Reference 91 was detected and quickly understood by interactively viewing a configuration on a graphics workstation. Such problems occurred in earlier simulations, but their detection was likely precluded by the inferior level of graphics readily available at the time.

Figure 8 presents side views from Monte Carlo simulations of two monolayers and a single slab of water contained between flat rigid Hg surfaces.39 For the slab, one clearly sees a first layer and a somewhat noticeable second layer. The monolayer is distinct, narrow, and somewhat comparable to the first layer in the slab simulation. Top views of the first water layer on smooth and corrugated surfaces are given in Figure 9.
For the monolayer on a smooth surface, many of the water molecules arrange themselves in a square pattern, making it possible for those water molecules to be involved in four in-plane hydrogen bonds. For slabs on a smooth surface (Figure 9, top), the situation is different. Here there are occasionally squares of water molecules, but more typically the water molecules arrange themselves in pentagons. One way to understand the change in behavior between monolayer and slab is visible in Figure 8. Hydrogen bonding occurs between the first and second layers in the slab, resulting in each water molecule being involved in roughly four hydrogen bonds without needing to be arranged in a square pattern. The pentagons might arise from a “frustration” of the water structure. However, another possibility exists to explain this observation. Given that the H-O-H bond angle is close to that needed for a regular pentagon (108°), water in planar configurations might prefer to adopt such shapes. Either way, molecular visualization provides clues and generates new directions of pursuit which may not be evident from numerical analysis.
Figure 8 Configurations of water near two rigid metal (Hg) surfaces produced by a simulation (Ref. 39). Covalent radii are used to represent O (gray) and H (black) atoms in the water molecules. Lines between the water molecules represent hydrogen bonds. Top: depiction of a slab of water wide enough for the water structure to be bulklike in the center. Bottom: monolayers of water on the two surfaces.
For corrugated surfaces (Figure 9, bottom) the water tends to adopt configurations in which molecules lie close to the potential minima in the surface.7,39,434s,47-7s>79 This behavior forces a shift from pentagonal arrangements to squares, trapezoids, and occasionally hexagons.
Figure 9 Representations of the first water layer on a rigid metal surface, as in Figure 8. Covalent radii are used to represent O (gray) and H (black) atoms in the water molecules. Lines between the water molecules represent hydrogen bonds. Top: water adjacent to a smooth (noncorrugated) metal surface for monolayers and a slab of water. Bottom: water configurations adjacent to corrugated (111) metal surfaces. Both views look down on the water, with the metal surface (not shown) below.
Figure 10 The electron density presented as $n/n_+$ for jellium as a function of distance, $z$, from the jellium edge: solid curve, charge density for the positive, uniform charge background; dashed curve, charge density of the electrons.
Electron Density for Jellium

The electron density of the model Hg-water interface obtained from jellium calculations is shown in Figure 10. Within the metal, the density oscillates with a wavelength $\pi/k_F$, where $k_F = (3\pi^2 n_+)^{1/3}$, and quickly decays to the bulk density within the metallic slab. These "Friedel oscillations" originate from the step discontinuity in the background charge density. Outside the metal, the electron density distribution decays rapidly to an asymptotic zero density. The strength of the metal-water interaction is determined by the magnitude of the electron density at the charge sites of water molecules. Thus it is primarily the tail of the distribution (and the density of water molecules in the surface monolayer) that is of interest in this respect. In the vicinity of the first maximum in $\rho(z)$ for water molecules, the electron density decays to about 2% of the bulk value. Hydrogen atoms can approach to within 0.9 Å of the jellium edge, where the electron density is 9.4% of the bulk value. The electron density distribution at the model Hg-water interface is similar to the Hg-vacuum interface. Variations in the density distribution due to the solvent are generally of the order 0.1% of the bulk value. Nevertheless, these small variations in the charge density distribution can give rise to a change in the potential drop across the interface of about 0.1 volts (see Figure 11). Furthermore, in the presence of solvent, the electron density spills farther into the solvent.
Figure 11 The total potential drop $\Phi$ through the metal-water interface and its components as a function of distance $z$ (Ref. 39). The long dashed curve is the potential drop through the metal-vacuum interface. The solid curve is the total potential drop through the metal-water interface. The total potential drop is the sum of the contributions from the solvated metal (dot-dash curve) and water potentials (dashed curve).
The electron density distribution, along with other properties of the metal, shows an oscillatory dependence on the width of the metal slab. These oscillations correspond to the inclusion of one additional occupied eigenstate. For narrow slabs (about 20 Å wide), these oscillations can be significant. For example, the potential drop across the Hg-vacuum interface oscillates by about 0.1 volt. For the results considered here, a slab width of 64 Å was used, giving rise to oscillations of only 0.01 volt. As the width of the slab increases further, the semi-infinite limit of the potential drop is approached.
Structure

Several structural properties are typically monitored to characterize water in these systems. Some of these are the density distribution of water molecules with respect to the surface, surface area per water molecule, the root-mean-square displacements from the optimal surface positions for corrugated surfaces, angular distributions of the dipolar and O-H bond vectors with respect to the surface normal, and moments of the angular distributions. Most of these characterize the water structure with reference to the metal surface. In contrast, the radial distribution function, number of nearest neighbors per molecule, and number of hydrogen bonds per molecule are used to characterize water-water interactions.
Figure 12 The density of the water atoms, O (solid) and H (dashed, divided by 2), as a function of distance $z$ from a rigid metal surface for a smooth surface (heavy lines) and a 111 surface (thin lines) from simulations of water slabs (Ref. 39). Inset: corresponding distributions for a water monolayer on the same metal surfaces. Note that the smooth and 111 curves are very similar in the slab and indistinguishable in the inset.
The density of atoms or molecules with respect to the interface, $\rho(z)$, can be calculated by averaging the number of water molecules present in narrow planar slices of $z$ (binning) and calculating the average density within each slice. Figure 12 portrays $\rho(z)$ for both the oxygen and hydrogen atoms of water with respect to a flat surface and a 111 rigid Hg surface for both slab and monolayer cases. There are two and perhaps even three distinct layers of water extending 7-10 Å out from the metal surface. These results are representative for solid or rigid metal models of Hg,39,42,118 Pt,7,35,40,44,118,119 and Ag.78 Studies of a liquid Hg surface show that the peaks in the density of the water molecules are broader and shorter in comparison. However, a convolution of the distribution of Hg atoms in the first layer of the liquid-metal surface compared with the distribution of water from a corresponding rigid Hg surface suggests that the broadening is due to the roughness of the metal surface rather than to a different ordering of the water with respect to the metal.35 It is tempting to try to estimate the characteristic density for water within the first peak of the water density profile. Such estimations are difficult, however, because this layer is significantly narrower than the diameter of a water molecule (roughly 2.8 Å) and the density varies rapidly over this range, making the results somewhat ill-defined. Instead, it is more appropriate to estimate the
Table 2. Characterization of the Physisorption of Water onto a Hg Surface(a)

                      Area per Molecule (Å^2)(b)                    Adsorption Energy (J/m^2)(b)
Simulation   Layer 1: Monolayer   Layer 1: Slab   Layer 2: Slab   Monolayer       Slab
Flat         8.4 (0.1)            8.35 (0.1)      10.4 (0.3)      -1.28 (0.01)    -0.619
100          8.8 (0.1)            8.6 (0.1)       11.1 (0.3)      -1.17 (0.01)    -0.539
111          8.2 (0.1)            7.9 (0.1)       11.0 (0.5)      -1.29 (0.01)    -0.541

(a) Values are for a rigid Hg surface (Ref. 39).
(b) Values in parentheses are the standard deviations in same units.
surface area per water molecule in this layer by counting the number of molecules within the peak and dividing the planar surface area (calculated assuming that the surface is perfectly smooth) by that number. Table 2 lists selected surface areas for water adjacent to rigid Hg surfaces. The average root-mean-square displacements of first layer water molecules from potential minima for the water-metal potential functions along the metal surface for the 100 and 111 rigid Hg-water interfaces are roughly 0.5 Å. This small value indicates that water molecules spend much of their time in the vicinity of the potential minima, as is evident from the images of the interface in Figure 9.

Angular distributions for the water molecules can be quite useful for understanding the average water structure near an interface. In particular, the distributions of $\theta$, the angle between the surface normal pointing into the water and the dipole moment or the O-H bond vectors, are often considered. These distributions are calculated by binning, as was done for $\rho(z)$ above. It is usually better to bin the cosine of $\theta$, which will be flat when the molecules are oriented randomly. A distribution plotted as a function of $\theta$ itself would follow $\sin\theta$ for a random distribution. In addition to binning these functions in the angle, it is quite useful to also plot (bin) them as a function of the distance from the surface, $z$. Such distributions are depicted in Figure 13. In the distributions for the O-H bond vectors, the distinct peaks present in the water layer shift away from the surface normal at a nearly constant rate of 90° per layer of water and weaken with increasing $z$. A similar but much less consistent pattern is evident in the distributions for the dipole moment. To condense the information contained in the angular distributions of the dipoles to a more manageable form, it is common to calculate order parameters such as and
Figure 13 Probability densities $P(\cos\theta_\mu)$ and $P(\cos\theta_{OH})$ of the dipole and OH bonds of the water molecule with respect to the surface normal directed into the water and plotted as a function of distance $z$ from the metal surface. The center-of-mass density of water $\rho$ is given to indicate the water layer positions. These results are for a smooth metal surface (Ref. 39).
where the averaging is conducted over all molecules within a certain range of $z$ values, and $\theta_\mu$ is the angle between the surface normal and the molecular dipole. Figure 14 plots $S_1(z)$ and $S_2(z)$ for a Hg-water interface: $S_1(z)$ provides a measure of the net polarization of the water as a function of distance from the surface; $S_2(z)$ provides a measure of the extent to which the dipole moments of the water molecules are directed along or perpendicular to the metal surface. If the dipoles of the water molecules tend to align parallel or antiparallel to the surface normal, $S_2(z)$ is positive. If they tend to lie parallel to the surface itself, then $S_2(z)$ is negative; and if they have an isotropic distribution, $S_2(z)$ is 0. The information presented in Figures 13 and 14 is representative of a number of studies on Hg, Pt, and Ag.35,44,45,49,78 The water molecules that lie closest to the metal are found to lie flat on the metal surface. In the center of the first layer this population splits in two, with some water molecules lying flat on the surface having a slight tilt of both OH bonds and the dipole pointing toward the surface, whereas other water molecules adopt an orientation such that the plane of the molecule is perpendicular to the surface and one OH bond points away from the surface (see Figure 15). The latter population becomes dominant in the outer half of the first layer. The structuring in the orientational distributions of the water molecules is significantly weaker in the second layer. Among the wide range of orientations exhibited in this region, there exist two distinguishable
Figure 14 $S_1$ (solid) and $S_2$ (dashed) as a function of distance from a smooth metal-water interface (Ref. 39). Beyond roughly 7 Å, the results become indistinguishable from the noise.
Figure 15 Two important orientations for the water molecule in the central portion of the water layer adjacent to the metal surface. (a) Some water molecules lie nearly flat on the metal surface, with their dipole moments and OH bond vectors pointing slightly into the surface. (b) Other molecules have one OH bond vector pointing away from the metal surface.
groups of water molecules. Both have the molecular plane parallel to the surface normal. One group, having an OH bond pointing toward the surface, is strongest on the metal side of this peak, whereas the other group, with an OH bond pointing away from the surface, is noticeable on the other side. There is some evidence that this orientational pattern is repeated vaguely in the region corresponding to a third possible peak in $\rho(z)$, which is not shown in Figure 13.

The radial distribution function, g(r), is the ratio of the density of sites of type j at a given distance from a site of type i to the average density of j sites in the system. A more technical definition, which allows the binning of this function into discrete intervals of r, is84

$$ g(r_{bin}) = \frac{C(bin)}{C_{id}(bin)} \qquad [76] $$
where bin is an integer label for a discrete interval of length $\Delta r$ corresponding to the range of r values $r_l < r \le r_u$, where $r_l = \Delta r\,(bin - 1)$, $r_u = r_l + \Delta r$, and $r_{bin} = (r_l + r_u)/2$. In Eq. [76], $C(bin)$ is the number of j sites per i site found in the range of distances corresponding to bin, averaged over the configurations sampled, and $C_{id}(bin)$ is the average number of j sites per i site that would be found in the same range of distances if the j sites behaved like an ideal gas. $C_{id}(bin)$ can be calculated using

$$ C_{id}(bin) = \frac{V_{bin}\, n}{V} \qquad [77] $$

where

$$ V_{bin} = \frac{4\pi}{3}\left(r_u^3 - r_l^3\right) \qquad [78] $$

V is the volume of the simulation box, and n is the number of j sites if i ≠ j, or the number of i sites minus 1 if i = j. Note that a literal interpretation of this approach for i = j would lead one to accumulate contributions to $C(bin)$ for each pair of sites twice. A more efficient approach for this case is to accumulate the contribution for each pair of sites only once and then multiply the result by 2. To facilitate comparison with bulk water, it is usually better to use the average density of the sites from bulk water rather than their densities for inhomogeneous systems like those being described in this chapter.

The radial distribution function can be calculated for water molecules at different distances from the metal surface and compared with that for bulk water. The range of distances from the metal surface usually is selected to correspond to the first and second water layers. Such results have been published for Pt-water, rigid Hg-water,42 and Ag-water78 interfaces. For the first layer of water adjacent to the metal surface, the first peak in the oxygen-oxygen radial distributions is enhanced, and the structure beyond this peak differs completely
from that of bulk water. In particular, the peak for the tetrahedrally hydrogen-bonded water molecules (at roughly 4.5 Å in bulk water) is absent. For these rigid surfaces, there exists a regular structure in this distribution extending out to large distances owing to the positioning of the water molecules at particular sites on these corrugated metal surfaces. The g(r) for water in the second layer, in contrast, resembles that of bulk water. The structure of water near a liquid Hg-water interface49 is similar, but the oscillations in the radial distribution function for water in the first layer are damped compared to the rigid surface.

From the analysis above and arguments offered earlier concerning the energetics of the hydrogen bonding compared to that for metal-water interactions, it would be useful to measure the amount of hydrogen bonding in the system as a function of distance from the metal surface. When considering the number of hydrogen bonds, it is especially important to consider the number of nearest-neighbor water molecules. The number of nearest-neighbor water molecules can be defined as the number of water molecules lying within a set distance of the molecule of interest. This distance is usually chosen to be close to the position of the first minimum in the oxygen-oxygen radial distribution function (roughly 3.5 Å).49,120 In the literature, a hydrogen bond is sometimes defined to exist if nearest-neighbor water molecules interact with a potential energy stronger than 16.75 kJ/mol, a pragmatic definition at best. The number of nearest neighbors and the number of hydrogen bonds per water molecule have been studied for water beside Hg surfaces. In contrast to $\rho(z)$, these functions are surprisingly simple; they tend to remain close to the bulk values for distances greater than the first minimum in $\rho(z)$ (approximately 5.5 Å). The number of hydrogen bonds per water molecule gradually decreases by roughly 25% near the surface.
Water molecules in the region between the first two layers of water have a somewhat larger number of nearest neighbors as a result of their proximity to the high-density first and second layers.49 Within the first layer, the number of nearest neighbors either rises somewhat or drops somewhat based on published studies of rigid42 and liquid49 Hg-water interfaces, respectively. More than 60% of the potential energy of a water molecule adsorbed on a liquid Hg surface can be attributed to water-water interactions,49 supporting the idea that water-water hydrogen bonding is an important factor in the interfacial region.39,42,44,45,49,78,121
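The binned radial distribution function of Eqs. [76] and [77], together with the nearest-neighbor and hydrogen-bond counts described above, can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch, not the procedure used in the cited studies. It assumes a single configuration in a fully periodic orthorhombic box (a slab geometry would need the lateral periodicity handled separately), and the function names, the precomputed pair-energy matrix, and the reading of "stronger than 16.75 kJ/mol" as "more negative than -16.75 kJ/mol" are all assumptions introduced here.

```python
import numpy as np

def pair_distances(pos_i, pos_j, box):
    """Minimum-image distances between every i site and every j site."""
    d = pos_j[None, :, :] - pos_i[:, None, :]
    d = d - box * np.round(d / box)      # assumes an orthorhombic periodic box
    return np.linalg.norm(d, axis=2)

def radial_distribution(pos_i, pos_j, box, dr, n_bins, same_sites=False):
    """Binned g(r) for one configuration: C(bin) / C_id(bin)."""
    edges = dr * np.arange(n_bins + 1)   # r_l and r_u for every bin
    r = pair_distances(pos_i, pos_j, box)
    if same_sites:
        # drop the i = j self-pairs on the diagonal
        r = r[~np.eye(len(pos_i), dtype=bool)]
    counts, _ = np.histogram(r, bins=edges)
    c_bin = counts / len(pos_i)          # j sites per i site in each bin
    v_bin = 4.0 * np.pi / 3.0 * (edges[1:] ** 3 - edges[:-1] ** 3)
    n = len(pos_j) - 1 if same_sites else len(pos_j)
    c_id = v_bin * n / np.prod(box)      # ideal-gas count
    r_mid = 0.5 * (edges[1:] + edges[:-1])
    return r_mid, c_bin / c_id

def neighbor_counts(pos, box, r_cut=3.5):
    """Number of sites within r_cut (same length unit as pos) of each site."""
    r = pair_distances(pos, pos, box)
    np.fill_diagonal(r, np.inf)          # a site is not its own neighbor
    return (r < r_cut).sum(axis=1)

def hydrogen_bond_counts(r_oo, pair_energy, r_cut=3.5, e_cut=-16.75):
    """Count neighbors whose pair energy (kJ/mol) is below e_cut.

    r_oo and pair_energy are (N, N) matrices supplied by the caller.
    """
    bonded = (r_oo < r_cut) & (pair_energy < e_cut)
    np.fill_diagonal(bonded, False)
    return bonded.sum(axis=1)
```

In practice the counts would be accumulated over many stored configurations before dividing, and the molecules would first be sorted into layers by their z coordinate so that separate distributions are obtained for the first layer, the second layer, and the bulk.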
Dynamics

From a conceptual standpoint, it is useful to understand the time scales for motions of particles near metal-water interfaces, both to better understand the nature of these interfaces and to see how molecules and atoms near them differ from those of the bulk. The two most commonly calculated dynamic properties for metal surfaces are the mean-square displacement and the velocity autocorrelation function, because these can be used to calculate diffusion constants and spectra.
The mean-squared displacement, $\langle R^2(\delta c)\rangle$, provides a measure of water mobility by quantifying the change in the positions of the water molecules $\delta c$ configurations apart in a trajectory. $\langle R^2(\delta c)\rangle$ can be calculated using

$$ \langle R^2(\delta c)\rangle = \left\langle \left| \mathbf{r}_i(c + \delta c) - \mathbf{r}_i(c) \right|^2 \right\rangle_{i,c} \qquad [79] $$
where the average is performed over all of the molecules i and all configurations c in the trajectory. This property can be derived from either Monte Carlo or molecular dynamics simulations. Figure 16 provides a representative example of this function as obtained from a Monte Carlo simulation.39 The slope at large values of $\delta c$, the linear portion of $\langle R^2(\delta c)\rangle$, is a measure of the rate of change of particle positions. Using Monte Carlo calculations, this slope provides an estimate of the relative mobilities of molecules under similar conditions and similar simulations. For molecular dynamics sampling, $\delta t = \Delta t\,\delta c$, where $\Delta t$ is the time interval between consecutive configurations in the trajectory. If the mean-square displacement is a function of time, $\langle R^2(\delta t)\rangle$, then the diffusion coefficient, D, may be calculated from the slope of $\langle R^2(\delta t)\rangle$ using

$$ D = \frac{1}{2 n_d} \lim_{\delta t \to \infty} \frac{d\langle R^2(\delta t)\rangle}{d(\delta t)} \qquad [80] $$

where $n_d$ is the number of dimensions used in calculating the mean-square displacement.84,85

Figure 16 The mean-squared displacement $\langle R^2(\delta c)\rangle$ for a slab of water between two rigid (100) metal surfaces,39 plotted against the number of MC passes. The heavy curves are calculated using all three dimensions, whereas the narrower curves are calculated using only the x and y dimensions (parallel to the surface). The solid and dashed curves are for water in the first and second layers adjacent to the metal surface, respectively. The dot-dash curve is for water in the rest of the simulation cell.

The velocity autocorrelation function relates the velocities of the molecules at different times of a molecular dynamics simulation. The velocity autocorrelation function is given by

$$ \langle v_{corr}(\delta t)\rangle = \left\langle \mathbf{v}_i(t) \cdot \mathbf{v}_i(t + \delta t) \right\rangle_{i,t} \qquad [81] $$
where the average is conducted over all times within the stored trajectory, t, and all molecules i. Figure 17 gives normalized velocity autocorrelation functions [i.e., divided by $\langle v_{corr}(0)\rangle$] for the center of mass of the water molecules, and for Hg atoms near a liquid Hg-water interface.49 The Fourier transform of $\langle v_{corr}(\delta t)\rangle$, $\langle \tilde{v}_{corr}(\omega)\rangle$, can be used to assess how the interface influences the motions of atoms or molecules in the interfacial region. The $\langle \tilde{v}_{corr}(\omega)\rangle$ functions corresponding to those in Figure 17 are presented in Figure 18. For water, the maximum near 25 cm$^{-1}$ has been assigned to motions involving bending of the angle formed by three O atoms that are close to each other, whereas the maximum near 200 cm$^{-1}$ is associated with O-O stretch motions.49 These results indicate that water in the second layer behaves somewhat like bulk water, but water in the first layer is noticeably different. Figure 18 also indicates that the motions of the Hg atoms are affected by the presence of the interface. The diffusion constant can also be calculated from $\langle v_{corr}(\delta t)\rangle$84,85 by means of

$$ D = \frac{1}{3} \lim_{t \to \infty} \int_0^t \langle v_{corr}(\delta t)\rangle \, d(\delta t) \qquad [82] $$

Figure 17 Velocity autocorrelation functions (VACF), $\langle v_{corr}(\delta t)\rangle$, for water molecules and Hg atoms, perpendicular (z) and parallel (xy) to the liquid Hg-liquid water interface, broken down by distance from the interface (first layer, second layer, and bulk) and plotted against t/ps. (Reprinted with permission from Ref. 49. Copyright 1996 American Chemical Society.)
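As a rough illustration of Eqs. [81] and [82], the following Python/NumPy sketch computes a velocity autocorrelation function from stored velocities, integrates it to estimate D, and takes a discrete Fourier transform to obtain a spectral density. It is a minimal sketch under stated assumptions: the velocities are assumed to be available as an array of shape (frames, molecules, 3) sampled at a fixed interval `dt`, the integral is truncated at the largest stored lag rather than taken to infinity, and all function names are hypothetical.

```python
import numpy as np

def velocity_autocorrelation(vel, max_lag):
    """<v_i(t) . v_i(t + lag)> averaged over time origins t and molecules i.

    vel has shape (n_frames, n_molecules, 3); lags run from 0 to max_lag - 1.
    """
    n_frames = vel.shape[0]
    vacf = np.empty(max_lag)
    for lag in range(max_lag):
        dots = np.sum(vel[: n_frames - lag] * vel[lag:], axis=2)
        vacf[lag] = dots.mean()
    return vacf

def diffusion_from_vacf(vacf, dt):
    """Green-Kubo estimate D = (1/3) * integral of the VACF (trapezoid rule)."""
    integral = dt * (vacf.sum() - 0.5 * (vacf[0] + vacf[-1]))
    return integral / 3.0

def spectral_density(vacf, dt):
    """Magnitude of the Fourier transform of the normalized VACF.

    Frequencies are returned in cycles per unit of dt; dividing by the
    speed of light in matching units converts them to wavenumbers.
    """
    normalized = vacf / vacf[0]
    freq = np.fft.rfftfreq(len(normalized), d=dt)
    return freq, np.abs(np.fft.rfft(normalized))
```

To resolve the behavior by layer, as in Figures 17 and 18, the average would be restricted to molecules that lie in the layer of interest, and the z and xy components of the velocity would be correlated separately.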
The presence of even a smooth metal surface reduces the diffusivity of water adjacent to a rigid Hg surface by a factor of about 3, compared to bulk water.39 Corrugation of a rigid metal surface leads to a reduction of the diffusivity by roughly a factor of 25 or more, depending on the particular metal surface under consideration. For water in the second layer, the diffusivity is smaller than bulk, but diffusion parallel to the surface is close to bulklike (i.e., roughly two-thirds the bulk three-dimensional value). For a liquid Hg surface, the reductions in diffusivity are much smaller.49

Figure 18 The normalized spectral density, $f(\tilde{\nu})$ ($\langle \tilde{v}_{corr}(\omega)\rangle$ in the text), for water molecules and Hg atoms, perpendicular (z) and parallel (xy) to the liquid Hg-liquid water interface, as in Figure 17, plotted against $\tilde{\nu}$/cm$^{-1}$. (Reprinted with permission from Ref. 49. Copyright 1996 American Chemical Society.)
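Diffusivities such as those compared above can be estimated from the long-time slope of the mean-square displacement. The sketch below, in Python with NumPy, is an illustration under stated assumptions rather than the procedure of the cited work: it assumes unwrapped coordinates stored as an array of shape (frames, molecules, 3), a fixed sampling interval `dt`, and a user-chosen lag `fit_start` beyond which the curve is treated as linear. The function names are hypothetical.

```python
import numpy as np

def mean_square_displacement(pos, max_lag, dims=(0, 1, 2)):
    """Mean-square displacement for lags 0 .. max_lag - 1.

    pos has shape (n_frames, n_molecules, 3) and must be unwrapped
    (no periodic jumps). dims selects the Cartesian components used,
    e.g. (0, 1) for motion parallel to the surface.
    """
    p = pos[:, :, list(dims)]
    n_frames = p.shape[0]
    msd = np.zeros(max_lag)
    for lag in range(1, max_lag):
        disp = p[lag:] - p[: n_frames - lag]
        msd[lag] = np.mean(np.sum(disp ** 2, axis=2))
    return msd

def diffusion_from_msd(msd, dt, n_dims, fit_start):
    """Slope of the linear part of the MSD divided by 2 * n_dims."""
    lag_times = np.arange(len(msd)) * dt
    slope = np.polyfit(lag_times[fit_start:], msd[fit_start:], 1)[0]
    return slope / (2.0 * n_dims)
```

For a Monte Carlo trajectory there is no physical time step, so the same slope (with `dt` set to one pass) gives only a relative measure of mobility, as noted in the text.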
Miscellaneous Properties

The bulk adsorption energy per unit area of a metal surface can be defined as

$$ U_{ads} = \frac{U_{tot} - N_w U_{wb}}{S} \qquad [83] $$

where $U_{tot}$ is the total energy of the system, $N_w$ is the number of water molecules in the system, $U_{wb}$ is the average potential energy per molecule of bulk water for the particular water model being used, and S is the area of the metal exposed to the water neglecting the corrugation (i.e., calculated assuming a completely planar interface). Adsorption energies for water on planar, 100, and 111 rigid Hg surfaces are given for both monolayers and slabs of water in Table 2. The adsorption enthalpies are all relatively large and negative, indicating that water will favorably adsorb on such surfaces. Reflecting the compromises made between water-water and water-metal interactions, the results for slabs of water are significantly smaller in magnitude than those for monolayers.

One of the key properties calculated in the simulations of metal-water interfaces is the shift in the potential drop across the surface of the metal when it is covered by water. Part of this shift is due to the shift in the Fermi energy level of the metal upon exposure to water. The rest is due to direct electrostatic contributions from the water. The first term can be estimated in simulations in which the electronic structure of the metal is represented. The second term is calculable from the water configurations sampled during the simulation of the interface. The second term is also useful for interpreting and simplifying interfacial features in general because it separates the average contributions of the electrostatic portion of the potential from that of the average total potential near an interface. The manner in which this term is calculated is the subject of ongoing debate in the literature. One approach is to assume that the potential drop can be calculated from the net charge density as a function of z
and that for molecules like water, the charge density can then be calculated directly from the partial charges within the water molecule. This potential drop
can be expressed as a sum of contributions to the effective charge density from the molecular dipole and quadrupole moments.125 The other predominant view is to assume that the potential drop can be calculated from the dipolar density alone (see, e.g., Ref. 126). To do this, one collapses the charge density within the water molecule down into a point dipole located at the center of the molecule and then calculates the average dipolar density as a function of distance from the interface. In practice, the dipolar density in the z direction can be estimated by $\langle \langle \cos\theta_\mu \rangle_i\, \rho_w(z') \rangle_c$, where $\langle\ \rangle_i$ and $\langle\ \rangle_c$ denote averages over all molecules in a given configuration and over all configurations, respectively. It is then possible to calculate the water's contribution to the electrostatic potential using
where $\mu$ is the dipole moment of a water molecule for the model used, and $z_c$ is the distance to a location in the water slab that is sufficiently far from the surface (e.g., the center of the water slab). At the dipolar level, these approaches are similar.125 However, contributions from quadrupolar and higher moments present in the first method are absent in the second. The calculated contributions from the higher multipolar moments can be significant (0.9 V for the water-hydrocarbon interface125). The difference in results from the two approaches is puzzling in light of the published proofs that the potential drop for a uniform planar interface should depend only on the polarization density126 and not on higher order terms like the quadrupolar density discussed above. This issue needs to be examined carefully because the electrostatic potential drop is a key property of interest for metal-water interfaces.

Ideally the potential drop could be calculated simply by inserting a test charge into the system and averaging the potential obtained as a function of distance relative to the interface. This method is formally equivalent to finding the potential using Eq. [84], but it is not the method normally used for most simulations because it is time-consuming and has technical difficulties. Some of these difficulties stem from problems associated with the truncation of the electrostatic potential (if the long-range components are not included) and from the fact that the test charge can sample points within the water molecules, in particular near the point charges assigned to the sites (e.g., hydrogen and oxygen). The contributions from these sites give enormous energies, which are difficult to average out. The latter problem is related to an important issue that has been raised in the literature.126 The real issue is: What is the physical meaning of the potential drop?
In the current context of metal-water interfaces, there exist at least two answers, both of which can be considered in the context of an ideal electrode that probes the electrostatic potential of the system without perturbing it. The two interpretations of the ideal electrode concept are distinguished by
the regions within the solution that the hypothetical electrode is permitted to explore. In one, the electrode can explore anywhere in the system, producing an estimate of the electrostatic potential averaged over all positions within the solution. This is essentially how the average electrostatic potential drop calculated from the average charge density using Eq. [84] is evaluated. Letting the electrode explore everywhere is valid in its own right but has the significant problem of averaging over regions inside the water molecules. Empirical water models such as those in Table 1 are parameterized to yield a model that works well for intermolecular interactions; they are not parameterized to yield accurate estimates of electrostatic potentials within the water molecule. Another problem with letting the electrode probe all over is that it is not clear what experimental observable is related to this quantity. This problem is also present to some extent in the integration of the polarization density in Eq. [85], but because of symmetry arguments the average of the electrostatic potential over a spherical region centered on a point dipole (water molecules have excluded volumes that are roughly spherical) is zero. More typically, the potential drop as measured experimentally and probed by an ideal electrode is related to the electrostatic potential that a molecule (or ion) experiences. If such an ideal electrode is to probe the potential relevant for a molecule, it certainly should not be probing positions inside the water molecules. In this respect, the molecular models currently used, as well as both the charge density and the dipolar density approaches described above, include assumptions and untested approximations.

We have employed a second method to examine the electrostatic potential using a test charge that is present only in the primary simulation cell but interacts with the entire periodic array of water molecules at the interface.
This approach eliminates the difficulties encountered with the truncation of the electrostatic potential. The test charge also interacts with water molecules via a Lennard-Jones potential having parameters $\sigma_{LJ}$ = 2 Å and $\varepsilon_{LJ}$ = 1 K. The test charge was inserted at many positions on a grid for each of the configurations stored in a large trajectory file. At each point for each configuration, the electrostatic and Lennard-Jones potentials were calculated and recorded separately. The average electrostatic potential was then calculated excluding points at which the net Lennard-Jones potential was more positive than a specified value. In this way we probe electrostatic terms for "sterically accessible regions" only, avoiding the problem of having the hypothetical electrode make measurements in physically unrealistic places. There is no clear criterion for choosing the cutoff potential, and so the information obtained from the calculation is only qualitative. However, this scheme makes it possible to determine the electrostatic potential for the regions between the water molecules. This method seems promising, but in its current form it is too CPU intensive to be practical for routine use in large simulations. Figure 19 shows the electrostatic potential due to the water near a Hg-water interface as determined using the charge density,
Figure 19 The electrostatic potential from water as a function of distance, z, from a point well within the metal. The solid line is the result obtained from the polarization density, Eq. [85]. The results for the test charge calculation are the dashed curve, which was calculated by averaging the electrostatic potential experienced by a test charge, excluding points for which the fictitious Lennard-Jones potential between the test charge and the water molecules exceeded 12.36 $\varepsilon_{LJ}$, where this is the parameter for the test charge c interacting with a water oxygen O. This potential cutoff coincides with the value of the potential at one-half of $\sigma_{LJ}$ for the interaction of two water oxygens O$_w$, or 1.584 Å for the SPC/E water model (Ref. 115). Inset: results obtained by integrating the charge density given by Eq. [84].
dipolar densities, and the test charge methods. The test charge and polarization density methods give similar results, whereas the charge density method produces results distinctly different from these. At present, the polarization density method seems to be the most appropriate one for calculating the potential drop due to water in these systems.
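The three routes to the potential drop compared above can be illustrated with a short Python/NumPy sketch. It is not a reproduction of Eqs. [84] and [85], whose unit conventions and integration limits are not assumed here. Instead it uses the textbook one-dimensional electrostatics in SI units, with the field and potential taken to be zero at the first grid point: the charge density is integrated twice through Poisson's equation, the dipolar density $P_z(z) = \mu \langle\cos\theta_\mu\rangle \rho_w(z)$ is integrated once, and the test-charge average simply discards grid points whose Lennard-Jones energy exceeds a chosen cutoff. All function names are hypothetical.

```python
import numpy as np

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m (SI units assumed)

def cumulative_trapezoid(y, z):
    """Running trapezoid-rule integral of y(z), starting from zero at z[0]."""
    out = np.zeros_like(y, dtype=float)
    out[1:] = np.cumsum(0.5 * (y[1:] + y[:-1]) * np.diff(z))
    return out

def potential_from_charge_density(z, rho_q):
    """phi(z) - phi(z[0]) from a charge density profile rho_q(z) in C/m^3.

    Assumes the electric field vanishes at z[0].
    """
    e_field = cumulative_trapezoid(rho_q, z) / EPS0
    return -cumulative_trapezoid(e_field, z)

def potential_from_dipole_density(z, p_z):
    """phi(z) - phi(z[0]) from a dipolar density profile P_z(z) in C/m^2."""
    return cumulative_trapezoid(p_z, z) / EPS0

def accessible_average(phi_elec, u_lj, u_cut):
    """Mean electrostatic potential over grid points with u_lj <= u_cut."""
    mask = u_lj <= u_cut
    return phi_elec[mask].mean()
```

When the charge density is exactly the bound charge $-dP_z/dz$ of a pure dipolar density, the first two functions agree to within the integration error; differences between them in a simulation therefore reflect the higher multipole contributions discussed in the text.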
General Discussion of the Properties of Metal-Water Interfaces

Nearly all the simulation studies conducted so far have focused on flat or relatively gently corrugated metal surfaces. Accordingly, our discussion of general properties is confined to such surfaces. It is first worth noting that there is general agreement among the conclusions about water physisorption from most studies for a wide range of metal surfaces. A
prominent property of water-metal interfaces is the presence of two to three layers of water molecules adjacent to the metal surface that arise from the strong isotropic interaction between those water molecules and the metal surface.39,40 For a given surface, if the isotropic portion of the potential is sufficiently strong overall (i.e., within the range ascribed to the physisorption of water molecules on metal surfaces), additional increases in the potential have a relatively minor effect on the number of water molecules present in the first layer. The number of water molecules in the second layer, however, is somewhat more sensitive to these variations in potential.39 Other changes of the potential, at least for the range of variations expected for relatively smooth metal surfaces on which water physisorbs, tend to have modest effects on the water density.

Using X-ray scattering measurements, Toney and co-workers127-129 obtained experimental information about the density of water molecules in a weak electrolyte solution near charged Ag(111) surfaces. These results were somewhat controversial, because they indicate the presence of three to four dense layers of water as well as distinct solution structures for negatively and positively charged surfaces. Nonetheless, the layering proposed by those authors is at least superficially similar to that derived by simulations. (We return to the discussion of these results later when simulations under electric fields are considered.) It would be useful to have additional experimental results for uncharged surfaces.
Even for gently corrugated metal surfaces (111 and 100 faces) studied by computer simulations, the corrugations are strong enough to force most of the water molecules in the first layer to reside in potential energy minima on top of the surface metal atoms, while those in the second layer are much less influenced by these minima.45 The strong preference for water to lie above the surface metal atoms is consistent with experiments on monolayers of water on metal surfaces.36 Some evidence exists suggesting that very strong corrugation can lead to different behavior.35 It is not yet clear how important geometric effects (such as the variation in height of the potential energy minimum above the surface) are, as opposed to the energetics of the corrugation.

Even though the lowest energy orientation for a single water molecule adjacent to a metal surface has its dipole pointing straight out from the surface, water molecules in slabs and monolayers near the surface tend to lie flat on the metal surface. This is indicated by the order parameters and orientational distributions and can be clearly seen in molecular graphics images. The reorientation of the water molecules near the metal surface reflects the importance of water-water interactions, particularly hydrogen bonding, in determining much of the structure of the solution adjacent to these surfaces. Experimental evidence also suggests that water-water interactions can be construed as being the dominant factor for determining the structure of water near a metal surface.121 Variations in the surface corrugation, the overall strength of the potential, and
the strength of the orientational component of the potential tend not to change this picture of metal-water interfaces. Water molecules adjacent to rigid metal surfaces diffuse very slowly, sometimes remaining in the first layer for longer than 50 ps. For many computed properties, the behavior of water molecules in the second layer is similar to that of bulk water molecules. The drop in electrostatic potential has been demonstrated to be relatively insensitive to most aspects of the potential, with the exception of the strength of the orientational portion of the metal-water potential, which can significantly alter this property.39 At present little agreement exists as to how one should calculate the electrostatic potential drop in these systems. Additional theoretical studies elucidating how to do this are clearly needed.

Most studies have focused on the structure of water near uncharged metal surfaces, but many systems of interest involve charged metal surfaces. Some simulations mimicking charged metal surfaces show that as the electric field increases from low levels up to high levels of 1 V/Å, the hydrogen bonding network of the water molecules gradually changes from the normal behavior described above to one in which the water molecules tend to align their dipoles with the electric field. At 3 V/Å it is possible to align nearly all the water molecules. At these high fields, the overall density of the water in the system increases, but the number of water molecules adjacent to the surface actually decreases.39 To comprehend this behavior, consider what happens: overall, the water molecules in these systems organize in layers that are perpendicular to the direction of the electric field (and thus parallel to the metal surface); but the molecules in these layers are oriented with their dipoles parallel to the electric field.
As a result, hydrogen bonding between the water molecules in the same layer is impossible. Because hydrogen bonds between water molecules tend to draw the molecules closer together than the water's van der Waals potential would, the lack of hydrogen bonding within the layers of water molecules gives rise to a net lateral expansion. Thus fewer water molecules are in contact with the metal surface. Simultaneously, hydrogen bonding does occur between molecules in different layers. This behavior, combined with the significantly lower energy of water molecules existing in such strong electric fields, induces a small spacing between the layers of water molecules and a net increase in the density of the water in the system. The resulting structures bear some resemblance to polarized ice lattices.

Simulations using electric fields39,78,79 give rise to solution structures similar to those for the negatively charged surface found in the experiments of Toney and co-workers.127-129 For positively charged surfaces, however, Toney's experimental results show density variations more pronounced than those found in the simulations. For both charged surfaces, the enhancement of water density adjacent to the metal surfaces found in simulations is generally smaller than that derived from experiment. These density differences
could be reconciled by citing the limitations of the experiments, which are difficult to perform and analyze. Keep in mind, however, that these charged metal surfaces carry high surface charge densities and thus produce very strong electric fields. Under such conditions the interaction between the water molecules and the metal surface may take on a different character that is not described by the models typically used in simulations. Still, distinct layering is present in both experiment and theory.

The layering of water molecules, the alternating pattern of hydrogen bonding between these layers, and the arrangements of water molecules adjacent to metal surfaces have prompted comparisons of water structure adjacent to metal surfaces with some crystalline phases of ice.39,44,78 For highly charged surfaces, this is a beneficial way to address and understand the water ordering. Near uncharged surfaces, in contrast, the description is best thought of as an analogy between the orientational structure of ice and the average orientational structure of water in the first few water layers adjacent to the metal surface. The analogy does not extend to the positional ordering. Water in the first layer adjacent to the metal surface can adopt structures reminiscent of ice. However, when the corrugation in the metal surface is removed, the water in the first layer adopts structures in which pentagons are quite common, while maintaining the same overall orientational structure as for the corrugated surface. Because pentagons are incompatible with any crystalline versions of ice, the resemblance to icelike structures near corrugated surfaces can be attributed to the corrugation of the metal surface but not to the water itself.
SUMMARY AND PERSPECTIVE

State-of-the-art simulations appear to provide consistent and reasonable results for simple model systems. For pure water situated near flat or relatively flat (111 and 100) metal surfaces, the predicted water structure is consistent irrespective of which water potential is used, and the simulation results are generally similar for different metal surfaces on which water physisorbs. Moreover, the simulation results are on the whole consistent with the limited information available from experiment.

Unfortunately, modeling physisorption is currently limited by a wide spectrum of factors. A significant limitation in obtaining an accurate representation of the interface concerns our knowledge of the metal-water interactions. Quantum mechanical calculations are the best source of this type of information, especially for obtaining interaction potentials. When adequately parameterized, these potential functions can be used effectively within traditional Monte Carlo and molecular dynamics simulations. Because of the extensive amounts of computer
time required for accurate quantum calculations, such calculations are usually performed on isolated water molecules interacting with a small cluster of metal atoms (tens of atoms). Although the interaction of a water molecule with clusters is of inherent interest, it is not clear that small to moderate-sized clusters are large enough to mimic the interactions of water with an extended metal surface. In this regard, it has been found that the interaction potentials can show significant oscillations as the size of the metal cluster increases. This behavior in small metal clusters can be attributed to edge effects, the finite size of the cluster, and the geometric dependence of the valence (band) electron distribution. Hence, the potentials currently used in traditional simulations, particularly those including orientational and surface corrugation contributions, are suspect. Solutions to the problem of treating extended metal surfaces are being sought.17,131 A related issue is the dependence of the quantum mechanically calculated metal-water interaction on the inclusion of other surface-bound and solvating water molecules.

Although current model potentials are suspect, it is clear they can provide a qualitative approximation of the true interactions in most cases. In many situations, however, the potentials in use do not provide an adequate approximation of the true interactions. A simple example of this occurs when water chemisorbs to the surface. At charged metal surfaces both the electronic structure of water and the charge distribution of the metal may differ significantly from those situations where the model potentials were constructed, parameterized, and intended to be employed. At neutral surfaces, image potentials appear to have little effect on the structure of water.
This nearly complete cancellation of interactions of water molecules with their images may not occur at charged metal surfaces, where long-range ordering of the solvent occurs. Given the uncertainties in locating the image plane and in treating image interactions for water molecules close to the metal surface, these problems are not minor. Additionally, it is not yet clear whether the net cancellation of image terms will hold for solutes (particularly ions) as they approach, and possibly adsorb, onto the metal surface. More extensive quantum mechanical calculations for charged metal surfaces could in principle help us to better understand these issues and provide a firmer theoretical foundation as we develop appropriate model potentials for these situations. On the whole, current studies of metal-water interfaces are restricted to fairly flat surfaces, free of irregularities. Study of an SPM tip near a metal surface in the presence of water is a notable exception.43 The treatment of irregular surfaces presents a significant challenge in this area of computational chemistry, partly because of the lack of information concerning the nature of the interaction of water molecules with defects in metal surfaces. Similarly, an understanding of the dynamics of metal atoms in these systems is required. Here again, appropriate quantum mechanical calculations will likely be important for further development.
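The cancellation of image interactions for neutral molecules, and its possible failure for ions, can be illustrated with a minimal classical sketch. The code below assumes an ideal conductor with a sharply defined image plane at z = 0 and works in atomic units; the function name, the three-site geometry, and the partial charges are hypothetical choices made only for illustration and do not correspond to any published water model.

```python
import math

def image_energy(charges, positions, z_plane=0.0):
    """Energy (atomic units) of point charges interacting with their images.

    Assumption: the metal is an ideal conductor whose image plane lies at
    z = z_plane, so a charge q at height z induces an image -q at 2*z_plane - z.
    """
    energy = 0.0
    for q_i, (x_i, y_i, z_i) in zip(charges, positions):
        for q_j, (x_j, y_j, z_j) in zip(charges, positions):
            z_img = 2.0 * z_plane - z_j  # mirror site j through the image plane
            dist = math.sqrt((x_i - x_j) ** 2 + (y_i - y_j) ** 2 + (z_i - z_img) ** 2)
            # The factor 1/2 appears because the image charges are induced, not fixed.
            energy += 0.5 * q_i * (-q_j) / dist
    return energy

# Hypothetical monovalent ion 5 bohr from the image plane.
ion = image_energy([1.0], [(0.0, 0.0, 5.0)])

# Hypothetical neutral three-site molecule at a similar height (illustrative geometry).
water_like = image_energy(
    [-0.8, 0.4, 0.4],
    [(0.0, 0.0, 5.0), (1.5, 0.0, 6.1), (-1.5, 0.0, 6.1)],
)

print(f"ion: {ion:.5f} hartree, neutral molecule: {water_like:.5f} hartree")
```

For the single ion the sketch reduces to the familiar -q^2/(4z) result, whereas the contributions for the neutral molecule largely cancel. Because the result depends directly on where the image plane is placed, the sketch also makes plain why uncertainty in that location matters most for charged species close to the surface.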
FUTURE DEVELOPMENTS

Most of the current limitations on classical simulations of water interacting with metal surfaces appear to be surmountable once quantum mechanical calculations have addressed the issue of how water interacts with metal surfaces under a wider range of conditions. Beyond water-metal physisorption, however, problems in the classical realm include the behavior of ions in aqueous solutions near metal surfaces. A number of simulations have been done on such systems, usually in the presence of strong electric fields. For weak electric fields or in the absence of electric fields, the simulation times required for the ions to reach equilibrium are significantly longer than what is currently practical.40 Nonetheless, potential of mean force calculations for dilute ionic solutions can be carried out, and results from such simulations can be used to learn about the distributions of ions with respect to the metal surface. A significant number of simulations of this kind have already been published, but the inherent strengths of this approach have not yet been exploited to the fullest.

Another approach to learning about such ionic solutions, employed by Philpott and Glosli,122,134-136 is to deliberately use metal-water potentials that are somewhat weaker than appropriate in order to limit the ordering of the water molecules. In this way, barriers to ion diffusion are reduced near the metal surfaces, facilitating studies of aqueous ionic solutions near electrodes. This approach might facilitate the study of some aspects of processes such as the response of the system to discharging an ion near the surface. Systematic variation of the water-metal potential might make it possible to extrapolate to systems with strong water-metal interactions.
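The step from a potential of mean force to an ion distribution mentioned above can be sketched in a few lines. The example assumes the dilute limit, in which ion-ion correlations are neglected, and a potential of mean force W(z) referenced to zero in the bulk solution, so that the local ion density relative to the bulk is exp(-W/kT). The function name and the numerical values of W(z) are hypothetical.

```python
import math

K_B = 0.0083144626  # Boltzmann constant in kJ mol^-1 K^-1

def relative_density(pmf_kj_per_mol, temperature=298.15):
    """Map a potential of mean force W(z) to rho(z)/rho_bulk = exp(-W/kT).

    Assumes W(z) is given in kJ/mol and is zero in the bulk solution.
    """
    beta = 1.0 / (K_B * temperature)
    return [math.exp(-beta * w) for w in pmf_kj_per_mol]

# Hypothetical PMF values (kJ/mol) at increasing distance from the surface.
pmf = [12.0, -3.0, 4.0, -0.5, 0.0]
profile = relative_density(pmf)

for w, rho in zip(pmf, profile):
    print(f"W = {w:6.2f} kJ/mol  ->  rho/rho_bulk = {rho:.3f}")
```

Barriers of even a few kT strongly deplete the ion density, which is consistent with the long equilibration times noted above for unbiased simulations.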
Current developments in simulation techniques include multiple time step methods137 and more efficient polarizable models.23 Hardware developments that increase the power of affordable computers will continue to expand the size of the systems treated and the extent of exploring the configurations and dynamics, making the simulations more realistic.

Beyond the purely classical treatments of these interfaces, where structural and dynamical events are probed, lie the most interesting systems to study: systems in which reactions occur. Chemical reactions by their very nature usually require quantum mechanical treatments. Such studies may be purely quantum mechanical or mixed quantum/classical mechanical138 in nature. Because reactions occur in localized environments, classical treatments of much of the system can dramatically reduce simulation times. Some exploratory studies in this direction have been conducted, and the field is full of possibilities.14,16,77,138-143
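The idea behind the multiple time step methods mentioned above can be sketched for a one-dimensional toy problem. The integrator below follows the general structure of a reversible two-level splitting, in which a slowly varying force is applied with a long time step and a rapidly varying force is integrated with several short velocity Verlet substeps. The harmonic forces, force constants, and step sizes are hypothetical choices for illustration and are not taken from any simulation discussed in this chapter.

```python
def respa_step(x, v, mass, fast_force, slow_force, dt, n_inner):
    """Advance one outer step of a reversible two-level multiple time step scheme."""
    v += 0.5 * dt * slow_force(x) / mass      # half kick from the slow force
    h = dt / n_inner
    for _ in range(n_inner):                  # velocity Verlet on the fast force
        v += 0.5 * h * fast_force(x) / mass
        x += h * v
        v += 0.5 * h * fast_force(x) / mass
    v += 0.5 * dt * slow_force(x) / mass      # closing half kick from the slow force
    return x, v

# Hypothetical model: a stiff spring (fast) plus a soft spring (slow).
K_FAST, K_SLOW, MASS = 100.0, 1.0, 1.0

def fast_force(x):
    return -K_FAST * x

def slow_force(x):
    return -K_SLOW * x

def total_energy(x, v):
    return 0.5 * MASS * v * v + 0.5 * (K_FAST + K_SLOW) * x * x

def run(x, v, n_steps=1000, dt=0.1, n_inner=10):
    """Integrate and return the final state and the largest relative energy error."""
    e0 = total_energy(x, v)
    max_dev = 0.0
    for _ in range(n_steps):
        x, v = respa_step(x, v, MASS, fast_force, slow_force, dt, n_inner)
        max_dev = max(max_dev, abs(total_energy(x, v) - e0) / e0)
    return x, v, max_dev

x_end, v_end, deviation = run(1.0, 0.0)
print(f"largest relative energy deviation: {deviation:.2e}")
```

In this sketch the slow force is evaluated only once per outer step, which is where the savings arise when the slow force is the expensive long-range part of the interaction. The outer step must still be kept away from resonances with the fast motion.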
ACKNOWLEDGMENTS

It is our pleasure to acknowledge the kindness of Professor I. Benjamin, University of California, Santa Cruz; Dr. M. Philpott, IBM Almaden; Professor E. Spohr, University of Ulm; and Professor K. Heinzinger, Max Planck Institut für Chemie, Mainz, for providing reprints and preprints of their work. We thank Professor K. Heinzinger for providing Figures 17 and 18 and Professor A. K. Soper of the Rutherford Appleton Laboratory for providing the experimentally derived g(r)'s for water. Discussions related to the material presented here with Professor G. N. Patey, University of British Columbia, Professor G. M. Torrie, Royal Military College, Canada, and Professor M. Berkowitz, University of North Carolina, Chapel Hill, have been enlightening. Dr. M. Shelley and the editors have provided many suggestions for improving this chapter.
REFERENCES

1. S. Trasatti, Ed., Electrochimica Acta, Vol. 41, No. 14 (1996).
2. W. Schmickler, Interfacial Electrochemistry, Oxford University Press, Oxford, 1996.
3. I. Benjamin, Chem. Rev., 96, 1449 (1996). Chemical Reactions and Solvation at Liquid Interfaces: A Microscopic Perspective.
4. I. Benjamin, Modern Aspects of Electrochemistry, Plenum Press, New York, 1997.
5. D. C. Grahame, Chem. Rev., 41, 441 (1947). The Electric Double Layer and the Theory of Electrocapillarity.
6. G. Tóth, E. Spohr, and K. Heinzinger, Chem. Phys., 200, 347 (1995). SCF Calculations of the Interactions of Alkali and Halide-Ions with the Mercury Surface.
7. E. Spohr and K. Heinzinger, Ber. Bunsenges. Phys. Chem., 92, 1358 (1988). A Molecular Dynamics Study on the Water/Metal Interfacial Potential.
8. E. Spohr, J. Phys. Chem., 93, 6171 (1989). Computer Simulations of the Water/Platinum Interface.
9. H. Sellers and P. V. Sudhakar, J. Chem. Phys., 97, 6644 (1992). The Interaction Between Water and the Liquid-Mercury Surface.
10. R. R. Nazmutdinov, M. Probst, and K. Heinzinger, J. Electroanal. Chem., 369, 227 (1994). Quantum Chemical Study of the Adsorption of an H2O Molecule on an Uncharged Mercury Surface.
11. S. Jin and J. D. Head, Surf. Sci., 318, 204 (1995). Theoretical Investigation of the Molecular Water Adsorption on the Al(111) Surface.
12. M. D. Calvin, J. D. Head, and S. Jin, Surf. Sci., 345, 161 (1996). Theoretically Modelling the Water Bilayer on the Al(111) Surface Using Cluster Calculations.
13. I. I. Zakharov, V. I. Avdeev, and G. M. Zhidomirov, Surf. Sci., 277, 407 (1992). Nonempirical Cluster Model Calculations of the Adsorption of H2O on Ni(111).
14. C. P. Ursenbach, A. Calhoun, and G. A. Voth, J. Chem. Phys., 106, 2811 (1997). A First-Principles Simulation of the Semiconductor/Water Interface.
15. C. P. Ursenbach and G. A. Voth, J. Chem. Phys., 103, 7569 (1995). Effect of Solvent on Semiconductor Surface Electronic States: A First Principles Study.
16. D. L. Price and J. W. Halley, J. Chem. Phys., 102, 6603 (1995). Molecular Dynamics, Density Functional Theory of the Metal-Electrolyte Interface.
17. J. D. Head and S. J. Silva, J. Chem. Phys., 104, 3244 (1996). A Localized Orbitals Based Embedded Cluster Procedure for Modeling Chemisorption on Large Finite Clusters and Infinitely Extended Surfaces.
18. J. L. Whitten, Chem. Phys., 177, 387 (1993). Theoretical Studies of Surface Reactions: Embedded Cluster Theory.
19. B. Kirtman, J. Chem. Phys., 79, 835 (1983). Density Matrix Treatment of Localized Electronic Interactions. Separated Electron Pairs.
20. W. L. Jorgensen, J. Chandrasekhar, J. D. Madura, R. W. Impey, and M. L. Klein, J. Chem. Phys., 79, 926 (1983). Comparison of Simple Potential Functions for Simulating Liquid Water.
21. H. J. C. Berendsen, J. R. Grigera, and T. P. Straatsma, J. Phys. Chem., 91, 6269 (1987). The Missing Term in Effective Pair Potentials.
22. K. Watanabe and M. L. Klein, Chem. Phys., 131, 157 (1989). Effective Pair Potentials and the Properties of Water.
23. S. W. Rick, S. J. Stuart, and B. J. Berne, J. Chem. Phys., 101, 6141 (1994). Dynamical Fluctuating Charge Force Fields: Application to Liquid Water.
24. E. R. Smith, CCP5 Q. (U.K.), 4, 13 (1982). Point Multipoles in the Ewald Summation.
25. F. H. Stillinger, in The Liquid State of Matter: Fluids, Simple and Complex, E. W. Montroll and J. L. Lebowitz, Eds., North-Holland, Amsterdam, 1982, p. 341.
26. D. Bertolini, M. Cassettari, and G. Salvetti, J. Chem. Phys., 76, 3285 (1982). The Dielectric Relaxation Time of Supercooled Water.
27. K. Krynicki, C. D. Green, and D. W. Sawyer, Discuss. Faraday Soc., 66, 199 (1978). Pressure and Temperature Dependence of Self-Diffusion in Water.
28. A. K. Soper, F. Bruni, and M. A. Ricci, J. Chem. Phys., 106, 247 (1997). Site-Site Pair Correlation Functions of Water from 25 to 400 °C: Revised Analysis of New and Old Diffraction Data.
29. H. J. C. Berendsen, J. P. M. Postma, W. F. van Gunsteren, and J. Hermans, Intermolecular Forces, Reidel, Dordrecht, 1981.
30. J. Lobaugh and G. A. Voth, J. Chem. Phys., 106, 2400 (1997). A Quantum Model for Water: Equilibrium and Dynamical Properties.
31. S. A. Clough, Y. Beers, G. P. Klein, and L. S. Rothman, J. Chem. Phys., 59, 2254 (1973). Dipole Moment of Water from Stark Measurements of H2O, HDO, D2O.
32. S. L. Carnie and G. N. Patey, Mol. Phys., 47, 1129 (1982). Fluids of Polarizable Hard Spheres with Dipoles and Tetrahedral Quadrupoles. Integral Equation Results with Application to Liquid Water.
33. P. Barnes, J. L. Finney, J. D. Nicholas, and J. E. Quinn, Nature, 282, 459 (1979). Cooperative Effects in Simulated Water.
34. C. A. Coulson and D. Eisenberg, Proc. R. Soc. London A, 291, 445 (1966). Interactions of H2O Molecules in Ice. I. The Dipole Moment of an H2O Molecule in Ice.
35. E. Spohr, in Solid-Liquid Electrochemical Interfaces, G. Jerkiewicz, M. P. Soriaga, K. Uosaki, and A. Wieckowski, Eds., ACS Symp. Series 656, American Chemical Society, Washington, DC, 1997, pp. 31-44. Computer Simulation of the Structure and Dynamics of Water Near Metal Surfaces.
36. P. A. Thiel and T. E. Madey, Surf. Sci. Rep., 7, 385 (1987). The Interaction of Water with Solid Surfaces: Fundamental Aspects.
37. G. B. Fisher and J. L. Gland, Surf. Sci., 94, 446 (1980). The Interaction of Water with the Platinum (111) Surface.
38. C. Kemball, Proc. R. Soc. London, Ser. A, 190, 117 (1947). The Adsorption of Vapours on Mercury: Polar Substances.
39. J. C. Shelley, G. N. Patey, D. R. Bérard, and G. M. Torrie, J. Chem. Phys., 107, 2122 (1997). Modeling and Structure of Mercury-Water Interfaces.
40. E. Spohr, Acta Chem. Scand., 49, 189 (1995). Computer Modeling of Interfaces Between Aqueous and Metallic Phases.
41. B. Eck and E. Spohr, Electrochim. Acta, 42, 2779 (1997). Computer Simulations of Hydrated Ions Near a Mercury Electrode.
42. J. Bocker, R. R. Nazmutdinov, E. Spohr, and K. Heinzinger, Surf. Sci., 335, 372 (1995). Molecular Dynamics Simulation Studies of the Mercury-Water Interface.
43. J. I. Siepmann and M. Sprik, J. Chem. Phys., 102, 511 (1995). Influence of Surface Topology and Electrostatic Potential on Water/Electrode Systems.
44. K. Raghavan, K. Foster, K. Motakabbir, and M. Berkowitz, J. Chem. Phys., 94, 2110 (1991). Structure and Dynamics of Water at the Pt(111) Interface: Molecular Dynamics Study.
45. K. Raghavan, K. Foster, and M. Berkowitz, Chem. Phys. Lett., 177, 426 (1991). Comparison of the Structure and Dynamics of Water at the Pt(111) and Pt(100) Interfaces: Molecular Dynamics Study.
46. S.-B. Zhu and M. R. Philpott, J. Chem. Phys., 100, 6961 (1994). Interaction of Water with Metal Surfaces.
47. K. Foster, K. Raghavan, and M. Berkowitz, Chem. Phys. Lett., 162, 32 (1989). A Molecular Dynamics Study of the Effect of Temperature on the Structure and Dynamics of Water Between Pt Walls.
48. J. I. Siepmann and M. Sprik, Surf. Sci. Lett., 279, L185 (1992). Ordering of Fractional Monolayers of H2O on Ni(110).
49. J. Bocker, Z. Gurskii, and K. Heinzinger, J. Phys. Chem., 100, 14969 (1996). Structure and Dynamics at the Liquid Mercury-Water Interface.
50. J. Bocker, E. Spohr, and K. Heinzinger, Z. Naturforsch., 50a, 611 (1995). Density Profiles at a Water/Liquid Mercury Interface.
51. E. Spohr, J. Mol. Liq., 64, 91 (1995). Ion Adsorption on Metal Surfaces. The Role of Water-Metal Interactions.
52. G. Barabino, C. Gavotti, and M. Marchesi, Chem. Phys. Lett., 104, 478 (1984). Molecular Dynamics Simulation of Water Near Walls Using an Improved Wall-Water Interaction Potential.
53. N. D. Lang and W. Kohn, Phys. Rev. B, 1, 4555 (1970). Theory of Metal Surfaces: Charge Density and Surface Energy.
54. P. Gies and R. R. Gerhardts, Phys. Rev. B, 33, 982 (1986). Self-Consistent Calculation of Electron-Density Profiles at Strongly Charged Jellium Surfaces.
55. D. R. Bérard, M. Kinoshita, X. Ye, and G. N. Patey, J. Chem. Phys., 101, 6271 (1994). Structure and Properties of the Metal-Liquid Interface.
56. D. R. Bérard, M. Kinoshita, X. Ye, and G. N. Patey, J. Chem. Phys., 102, 1024 (1995). Structure and Properties of the Metal-Electrolyte Interface.
57. D. R. Bérard, M. Kinoshita, and G. N. Patey, J. Chem. Phys., 107, 4719 (1997). Structure of the Metal-Aqueous Electrolyte Solution Interface.
58. A. Zangwill, Physics at Surfaces, Cambridge University Press, New York, 1988.
59. J. A. Alonso and N. H. March, Electrons in Metals and Alloys, Academic Press, New York, 1989.
60. W. Schmickler and D. Henderson, J. Chem. Phys., 80, 3381 (1984). The Interphase Between Jellium and a Hard Sphere Electrolyte. A Model for the Electric Double Layer.
61. D. L. Price and J. W. Halley, J. Electroanal. Chem., 150, 347 (1983). A New Model of the Differential Capacitance of the Double Layer.
62. J. W. Halley, B. Johnson, D. L. Price, and M. Schwalm, Phys. Rev. B, 31, 7695 (1985). Quantum Theory of the Double Layer: A Model of the Electrode-Electrolyte Interface.
63. J. W. Halley and D. L. Price, Phys. Rev. B, 35, 9095 (1987). Quantum Theory of the Double Layer: Model Including Solvent Structure.
64. D. L. Price and J. W. Halley, Phys. Rev. B, 38, 9357 (1988). Electronic Structure of the Metal-Electrolyte Surfaces: Three-Dimensional Calculation.
65. P. Hohenberg and W. Kohn, Phys. Rev., 136B, 864 (1964). Inhomogeneous Electron Gas.
66. W. Kohn and L. J. Sham, Phys. Rev., 140A, 1133 (1965). Self-Consistent Equations Including Exchange and Correlation Effects.
67. L. J. Bartolotti and K. Flurchick, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1995, Vol. 7, pp. 187-216. An Introduction to Density Functional Theory.
68. D. M. Ceperley and B. J. Alder, Phys. Rev. Lett., 45, 566 (1980). Ground State of Electron Gas by a Stochastic Method.
69. S. Ossicini, C. M. Bertoni, and P. Gies, Europhys. Lett., 12, 661 (1986). Image Plane for Surface Potential.
70. K. Heinzinger, Fluid Phase Equilib., 104, 277 (1995). Computer Simulations of Aqueous Electrolyte Solution/Metal Interfaces.
71. E. Spohr and K. Heinzinger, Chem. Phys. Lett., 123, 218 (1986). Molecular Dynamics Simulation of a Water/Metal Interface.
72. E. Spohr, Chem. Phys. Lett., 207, 214 (1993). A Computer Simulation Study of Iodine Ion Solvation in the Vicinity of a Liquid Water/Metal Surface.
73. O. Pecina, W. Schmickler, and E. Spohr, J. Electroanal. Chem., 394, 29 (1995). On the Mechanism of Electrochemical Ion Transfer Reactions.
74. G. Tóth and K. Heinzinger, Chem. Phys. Lett., 245, 48 (1995). Molecular Dynamics Study of an Iodide and a Lithium Ion at the Water-Liquid Mercury Interface.
75. A. Kohlmeyer, W. Witschel, and E. Spohr, Chem. Phys., 213, 211 (1997). Molecular Dynamics Simulations of Water/Metal and Water/Vacuum Interfaces with a Polarizable Water Model.
76. D. A. Rose and I. Benjamin, J. Chem. Phys., 95, 6856 (1991). Solvation of Na+ and Cl- at the Water-Platinum (100) Interface.
77. D. A. Rose and I. Benjamin, J. Chem. Phys., 98, 2283 (1993). Adsorption of Na+ and Cl- at the Charged Water-Platinum Interface.
78. K. J. Schweighofer, X. Xia, and M. L. Berkowitz, Langmuir, 12, 3747 (1996). Molecular Dynamics Study of Water Next to Electrified Ag(111) Surfaces.
79. X. Xia and M. L. Berkowitz, Phys. Rev. Lett., 74, 3193 (1995). Electric-Field Induced Restructuring of Water at a Platinum-Water Interface: A Molecular Dynamics Computer Simulation.
80. T. P. Lybrand, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1990, Vol. 1, pp. 295-320. Computer Simulation of Biomolecular Systems Using Molecular Dynamics and Free Energy Perturbation Methods.
81. T. P. Straatsma, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1996, Vol. 9, pp. 81-127. Free Energy by Molecular Simulation.
82. G. Ravishanker, P. Auffinger, D. R. Langley, B. Jayaram, M. A. Young, and D. L. Beveridge, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., Wiley-VCH, New York, 1997, Vol. 11, pp. 317-372. Treatment of Counterions in Computer Simulations of DNA.
83. R. Kutteh and T. P. Straatsma, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., Wiley-VCH, New York, 1998, Vol. 12, pp. 75-136. Molecular Dynamics with General Holonomic Constraints and Application to Internal Coordinate Constraints.
84. M. P. Allen and D. J. Tildesley, Computer Simulation of Liquids, Oxford University Press, Oxford, 1987.
85. D. Frenkel and B. Smit, Understanding Molecular Simulation, Academic Press, London, 1996.
86. K. A. Motakabbir and M. Berkowitz, Chem. Phys. Lett., 176, 61 (1991). Liquid-Vapor Interface of TIP4P Water: Comparison Between a Polarizable and a Nonpolarizable Model.
87. N. W. Ashcroft and N. D. Mermin, Solid State Physics, Holt, Rinehart & Winston, New York, 1976.
88. M. Schoen, D. J. Diestler, and J. H. Cushman, J. Chem. Phys., 87, 5464 (1987). Fluids in Micropores. I. Structure of a Simple Classical Fluid in a Slit Pore.
89. D. R. Bérard, P. Attard, and G. N. Patey, J. Chem. Phys., 98, 7236 (1993). Cavitation of a Lennard-Jones Fluid Between Hard Walls, and the Possible Relevance to the Attraction Measured Between Hydrophobic Surfaces.
90. S. E. Feller, R. W. Pastor, A. Rojnuckarin, S. Bogusz, and B. R. Brooks, J. Phys. Chem., 100, 17011 (1996). Effect of Electrostatic Force Truncation on Interfacial and Transport Properties of Water.
91. J. C. Shelley and G. N. Patey, Mol. Phys., 88, 385 (1996). Boundary Condition Effects in Simulations of Water Confined Between Planar Walls.
92. D. M. Heyes, CCP5 Q. (U.K.), 8, 29 (1983). MD Incorporating Ewald Summations on Partial Charge Polyatomic Systems.
93. J. Hautman and M. Klein, Mol. Phys., 75, 379 (1992). An Ewald Summation Method for Planar Surfaces and Interfaces.
94. D. M. Heyes, M. Barber, and J. H. R. Clarke, J. Chem. Soc., Faraday Trans. II, 73, 1485 (1977). Molecular Dynamics Simulation of Surface Properties of Crystalline Potassium Chloride.
95. U. Essmann, L. Perera, M. L. Berkowitz, T. Darden, H. Lee, and L. G. Pedersen, J. Chem. Phys., 103, 8577 (1995). A Smooth Particle Mesh Ewald Method.
96. M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions, Dover, New York, 1970.
97. G. Hummer, Chem. Phys. Lett., 235, 297 (1995). The Numerical Accuracy of Truncated Ewald Sums for Periodic Systems with Long-Range Coulombic Interactions.
98. L. F. Greengard, The Rapid Evaluation of Potential Fields in Particle Systems, MIT Press, Cambridge, MA, 1987.
99. L. F. Greengard and V. Rokhlin, J. Comput. Phys., 73, 325 (1987). A Fast Algorithm for Particle Simulations.
100. L. Greengard and V. Rokhlin, Chem. Scripta, 29A, 139 (1989). On the Evaluation of Electrostatic Interactions in Molecular Modeling.
101. W. H. Press, B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling, Numerical Recipes: The Art of Scientific Computing, Cambridge University Press, New York, 1986.
102. S. Jang and G. A. Voth, J. Chem. Phys., 107, 9514 (1997). Simple Reversible Molecular Dynamics Algorithms for Nosé-Hoover Chain Dynamics.
103. G. J. Martyna, M. E. Tuckerman, and M. L. Klein, Mol. Phys., 87, 1117 (1996). Explicit Reversible Integrators for Extended Systems.
104. M. Ferrario and J. P. Ryckaert, Mol. Phys., 54, 587 (1985). Constant Pressure-Constant Temperature Molecular Dynamics Simulations for Rigid and Partially Rigid Molecular Systems.
105. J. P. Ryckaert, Mol. Phys., 47, 1253 (1982). The Method of Constraints in Molecular Dynamics. General Aspects and Application to Chain Molecules.
106. G. Ciccotti, M. Ferrario, and J. P. Ryckaert, Mol. Phys., 47, 1253 (1982). Molecular Dynamics of Rigid Systems in Cartesian Coordinates.
107. G. Ciccotti and J. P. Ryckaert, Comput. Phys. Rep., 4, 345 (1986). Molecular Dynamics Simulation of Rigid Molecules.
108. W. G. Hoover, Phys. Rev. A, 31, 1695 (1985). Canonical Dynamics: Equilibrium Phase-Space Distributions.
109. M. Ferrario, in Computer Simulations in Chemical Physics, M. P. Allen and D. Tildesley, Eds., NATO ASI Series C, Kluwer Academic Publishers, Dordrecht, 1992, Vol. 397, pp. 153-172. Thermodynamic Constraints.
110. S. Nosé, in Computer Simulation in Materials Science, M. Meyer and V. Pontikis, Eds., NATO ASI Series, Kluwer Academic Publishers, Dordrecht, 1991, Vol. 205, pp. 21-41. Molecular Dynamics Simulations at Constant Temperature and Pressure.
111. N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller, J. Chem. Phys., 21, 1087 (1953). Equation of State Calculations by Fast Computing Machines.
112. H. Goldstein, Classical Mechanics, Addison-Wesley, Reading, MA, 1980.
113. D. J. Evans and S. Murad, Mol. Phys., 34, 327 (1977). Singularity-Free Algorithm for Molecular Dynamics Simulation of Rigid Polyatomics.
114. R. D. Mountain and D. Thirumalai, Physica A, 210, 453 (1994). Quantitative Measure of Efficiency of Monte Carlo Simulations.
115. J. C. Shelley, D. R. Bérard, and G. N. Patey, unpublished results (1995). Calculations Related to Simulating Aqueous Interfaces.
116. D. Bouzida, S. Kumar, and R. H. Swendsen, Phys. Rev. A, 45, 8894 (1992). Efficient Monte Carlo Methods for the Computer Simulation of Biological Molecules.
117. J. P. Valleau and A. A. Gardner, J. Chem. Phys., 86, 4162 (1987). Water-like Particles at Surfaces. I. The Uncharged, Unpolarized Surface.
118. E. Spohr, G. Tóth, and K. Heinzinger, Electrochim. Acta, 41, 2131 (1996). Structure and Dynamics of Water and Hydrated Ions Near Platinum and Mercury Surfaces as Studied by MD Simulations.
119. W. Schmickler and D. Henderson, J. Chem. Phys., 85, 1650 (1986). The Interphase Between Jellium and a Hard Sphere Electrolyte: Capacity-Charge Characteristics and Dipole Potentials.
120. C. Y. Lee, J. A. McCammon, and P. J. Rossky, J. Chem. Phys., 80, 4448 (1984). The Structure of Liquid Water at an Extended Hydrophobic Surface.
121. J. D. Porter and A. S. Zinn, J. Phys. Chem., 97, 1190 (1993). Ordering of Liquid Water at Metal Surfaces in Tunnel Junction Devices.
122. J. N. Glosli and M. R. Philpott, Electrochim. Acta, 41, 2145 (1996). Molecular Dynamics Study of Interfacial Electric Fields.
123. M. R. Philpott and J. N. Glosli, J. Electroanal. Chem., 409, 65 (1996). Electric Potential Near a Charged Metal Surface in Contact with Aqueous Electrolyte.
124. M. A. Wilson, A. Pohorille, and L. R. Pratt, J. Chem. Phys., 88, 3281 (1988). Surface Potential of the Water Liquid-Vapor Interface.
125. M. A. Wilson, A. Pohorille, and L. R. Pratt, J. Chem. Phys., 90, 5211 (1989). Comment on "Study on the Liquid-Vapor Interface of Water. I. Simulation Results of Thermodynamic Properties and Orientational Structure."
126. G. M. Torrie and G. N. Patey, J. Phys. Chem., 97, 12909 (1993). Molecular Solvent Model for an Electrical Double Layer: Asymmetric Solvent Effects.
127. M. F. Toney, J. N. Howard, J. Richer, G. L. Borges, J. G. Gordon, O. R. Melroy, D. G. Wiesler, D. Yee, and L. B. Sorensen, Nature, 368, 444 (1994). Voltage-Dependent Ordering of Water Molecules at an Electrode-Electrolyte Interface.
128. M. F. Toney, J. N. Howard, J. Richer, G. L. Borges, J. G. Gordon, O. R. Melroy, D. G. Wiesler, D. Yee, and L. B. Sorensen, Surf. Sci. Lett., 335, 326 (1995). Distribution of Water Molecules at Ag(111)/Electrolyte Interfaces Studied with Surface X-Ray Scattering.
129. J. G. Gordon, O. R. Melroy, and M. F. Toney, Electrochim. Acta, 40, 3 (1994). Structure of Metal-Electrolyte Interfaces: Copper on Gold(111), Water on Silver(111).
130. M. Watanabe, A. M. Brodsky, and W. P. Reinhardt, J. Phys. Chem., 95, 4593 (1991). Dielectric Properties and Phase Transitions of Water Between Conducting Plates.
131. T. Ito, H. Kobayashi, and D. R. Salahub, Catal. Today, 23, 357 (1995). Density Functional Study on the Reaction of CO Molecules with MgO Surfaces.
132. R. R. Nazmutdinov and E. Spohr, J. Phys. Chem., 98, 5956 (1994). Partial Charge Transfer of the Iodide Ion Near a Water/Metal Interface.
133. O. Pecina, W. Schmickler, and E. Spohr, J. Electroanal. Chem., 405, 239 (1996). The Temperature Dependence of the Transfer of an Iodide Ion.
134. J. N. Glosli and M. R. Philpott, J. Chem. Phys., 98, 9995 (1993). Adsorption of Hydrated Halide Ions on Charged Electrodes. Molecular Dynamics Simulation.
135. M. R. Philpott and J. N. Glosli, in Theoretical and Computational Approaches to Interface Phenomena, H. L. Sellers and J. T. Golab, Eds., Plenum Press, New York, 1994, pp. 75-100. Molecular Dynamics Computer-Simulations of Charged Metal-Electrolyte Interfaces.
136. M. R. Philpott and J. N. Glosli, J. Electrochem. Soc., 142, L25 (1995). Screening of Charged Electrodes in Aqueous Electrolytes.
137. M. E. Tuckerman, B. J. Berne, and G. J. Martyna, J. Chem. Phys., 97, 1990 (1992). Reversible Multiple Time Scale Molecular Dynamics.
138. J. Gao, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1995, Vol. 7, pp. 119-185. Methods and Applications of Combined Quantum Mechanical and Molecular Mechanical Potentials.
139. P. N. Day, J. H. Jensen, M. S. Gordon, S. P. Webb, W. J. Stevens, M. Krauss, D. Garmer, H. Basch, and D. Cohen, J. Chem. Phys., 105, 1968 (1996). An Effective Fragment Method for Modeling Solvent Effects in Quantum Mechanical Calculations.
140. M. J. Field, P. A. Bash, and M. Karplus, J. Comput. Chem., 11, 700 (1990). A Combined Quantum Mechanical and Molecular Mechanical Potential for Molecular Dynamics Simulations.
141. N. Vaidehi, T. A. Wesolowski, and A. Warshel, J. Chem. Phys., 97, 4264 (1992). Quantum-Mechanical Calculations of Solvation Free Energies. A Combined ab initio Pseudopotential Free-Energy Perturbation Approach.
142. T. A. Wesolowski and A. Warshel, J. Phys. Chem., 97, 8050 (1993). Frozen Density Functional Approach for ab initio Calculations of Solvated Molecules.
143. D. A. Rose and I. Benjamin, Chem. Phys. Lett., 234, 209 (1994). Solvent Free Energies for Electron Transfer at a Solution/Metal Interface. Effect of Ion Charge and External Field.
CHAPTER 4
Quantum-Based Analytic Interatomic Forces and Materials Simulation

Donald W. Brenner, Olga A. Shenderova, and Denis A. Areshkin
Department of Materials Science and Engineering, North Carolina State University, Raleigh, North Carolina 27695-7907
INTRODUCTION AND BACKGROUND

From traditional areas such as the study of interfaces, fracture, and point defects to somewhat less traditional areas such as nanometer-scale engineering and device fabrication, atomistic simulations are providing exciting new data and insights into the properties and engineering of materials that often cannot be obtained in any other way. Central to the success of an atomistic simulation is the use of an appropriate interatomic force model. Although approaches based on first-principles, total energy quantum mechanical calculations are being actively developed and used, they remain too compute-intensive for simulations involving large systems and/or long times.1,2 For these cases, the computational efficiency offered by analytic potential energy functions is necessary. Analytic potential energy functions are mathematical expressions that give the potential energy of a system of atoms as a function of relative atomic positions. Interatomic forces, which govern the motion of atoms in a typical dynamic simulation, are obtained by taking derivatives of the potential energy function with respect to nuclear coordinates.

Reviews in Computational Chemistry, Volume 12. Kenny B. Lipkowitz and Donald B. Boyd, Editors. Wiley-VCH, John Wiley and Sons, Inc., New York, © 1998
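The relationship between an analytic potential energy function and the interatomic forces can be made concrete with a minimal sketch. The example assumes a Lennard-Jones pair potential in reduced units, chosen only because its derivative is simple; it is not one of the quantum-based forms discussed in this chapter, and the function names and test geometry are hypothetical. The analytic forces are compared with central finite differences of the energy.

```python
import math

def lj_energy(positions, epsilon=1.0, sigma=1.0):
    """Total Lennard-Jones energy as a function of atomic positions only."""
    energy = 0.0
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            r = math.dist(positions[i], positions[j])
            sr6 = (sigma / r) ** 6
            energy += 4.0 * epsilon * (sr6 * sr6 - sr6)
    return energy

def lj_forces(positions, epsilon=1.0, sigma=1.0):
    """Analytic forces: minus the gradient of lj_energy with respect to positions."""
    n = len(positions)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = [positions[i][k] - positions[j][k] for k in range(3)]
            r2 = sum(c * c for c in d)
            sr6 = (sigma * sigma / r2) ** 3
            # -(dE/dr)/r, so that multiplying by the vector d gives the force on atom i
            f_over_r = 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r2
            for k in range(3):
                forces[i][k] += f_over_r * d[k]
                forces[j][k] -= f_over_r * d[k]
    return forces

def numerical_forces(positions, h=1.0e-5):
    """Central finite-difference estimate of -dE/dx for every coordinate."""
    forces = []
    for i in range(len(positions)):
        row = []
        for k in range(3):
            plus = [list(p) for p in positions]
            minus = [list(p) for p in positions]
            plus[i][k] += h
            minus[i][k] -= h
            row.append(-(lj_energy(plus) - lj_energy(minus)) / (2.0 * h))
        forces.append(row)
    return forces

# Hypothetical three-atom geometry in reduced units.
atoms = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0), (0.3, 1.0, 0.2)]
analytic = lj_forces(atoms)
numeric = numerical_forces(atoms)
```

A finite-difference comparison of this kind is a useful check whenever a new analytic form is coded, since an error in the derivative leads to a simulation that does not conserve energy.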
The typical approach to developing analytic potential energy functions is to assume a mathematical expression containing a set of parameters that are subsequently fit to a database of physical properties. An effective potential function requires a mathematical expression that both accurately reproduces this database and is transferable to structures and dynamics beyond those to which it is fit. The latter property is especially critical if an atomistic simulation is to have useful predictive capabilities. Whereas an extensive and well-chosen database from which parameters are determined is important, transferability ultimately depends on the chosen mathematical expression. The definitive expression, however, has yet to be developed. Indeed, many different forms are used, ranging from those derived from quantum mechanical bonding ideas to others based on ad hoc assumptions.

This chapter discusses analytic potential energy functions that have been developed for materials simulation. The emphasis is not on an exhaustive literature survey of interatomic potentials and their application; rather, we provide an overview of how some of the more successful analytic functions summarized in Table 1 are related to quantum mechanical bonding. Concepts are emphasized over mathematical rigor, and equations are used primarily to illustrate derivations or to show relationships between different approaches. Atomic units are used for simplicity where appropriate. The discussion is restricted to metallic and covalent bonding.

To put this field into perspective, the next section is a brief review of the historical development of atomistic simulation. This is followed by a general discussion of potential energy functions and materials simulation, including some uses and desirable properties of potential energy functions.
The roots of most of the more successful expressions used for materials simulation (with the notable exception of the Stillinger-Weber silicon potential3) can be traced to density functional theory.4,5 Therefore a brief review of the foundations of DFT is presented. The sections that follow discuss two connections between the density functional equations and analytic potentials, as well as specific approaches that have been derived from these analyses. The first connection is the Harris functional, from which tight binding, Finnis-Sinclair, and empirical bond-order potential functions can be derived. The second connection is effective medium theory, which leads to the embedded-atom method of Baskes, Daw, and Foiles. The chapter concludes with a brief discussion of databases from which function parameters can (and should) be obtained.

Table 1. Interatomic Bonding Expressions
[The Representative Energy Expression column is not recoverable from the source text.]

Formalism                  Origin/Justification
Harris functional          Density functional theory
Tight binding              Harris functional
Finnis-Sinclair            Tight binding
Empirical bond order       Tight binding
Effective medium theory    Jellium model
Embedded-atom method       Effective medium theory
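The many-body character shared by the Finnis-Sinclair and embedded-atom forms named above can be illustrated with a minimal sketch. The code assumes a Finnis-Sinclair-type energy in which each atom contributes a pairwise exponential repulsion and an attractive term proportional to the square root of a sum of exponential contributions from its neighbors. The functional form is written in the general spirit of these methods rather than copied from a particular parameterization, and the parameter values A, alpha, B, and beta are hypothetical and unfitted.

```python
import math

def fs_energy(positions, A=1.0, alpha=2.0, B=1.0, beta=1.0):
    """E = sum_i [ 1/2 sum_j A exp(-alpha r_ij) - B sqrt(sum_j exp(-beta r_ij)) ]."""
    total = 0.0
    n = len(positions)
    for i in range(n):
        pair = 0.0      # pairwise repulsion felt by atom i (half of each pair)
        density = 0.0   # sum over neighbors entering the square-root term
        for j in range(n):
            if i == j:
                continue
            r = math.dist(positions[i], positions[j])
            pair += 0.5 * A * math.exp(-alpha * r)
            density += math.exp(-beta * r)
        total += pair - B * math.sqrt(density)
    return total

# Hypothetical clusters with all bond lengths equal to 1 (reduced units).
dimer = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
triangle = dimer + [(0.5, math.sqrt(3.0) / 2.0, 0.0)]
tetrahedron = triangle + [(0.5, math.sqrt(3.0) / 6.0, math.sqrt(2.0 / 3.0))]

per_bond = []
for cluster in (dimer, triangle, tetrahedron):
    n_atoms = len(cluster)
    n_bonds = n_atoms * (n_atoms - 1) // 2
    per_bond.append(fs_energy(cluster) / n_bonds)
    print(f"{n_atoms} atoms: energy per bond = {per_bond[-1]:.4f}")
```

With the bond length held fixed, the energy per bond becomes less negative as the coordination increases from one to three. A sum of pair terms alone cannot reproduce this weakening of individual bonds with increasing coordination, which is one reason the square-root form is useful for metals.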
Historical Perspective

Atomistic simulation was initially developed and used in two essentially parallel fields. The first was chemical dynamics, where classical atomic trajectories have been used for many years to understand topics such as the dynamics of reactive collisions, molecular energy flow, and scattering probabilities. The earliest example is the work presented in the seminal 1936 paper by Hirschfelder, Eyring, and Topley,6 who used a classical trajectory to model the reaction H + H2 → H2 + H. Although the potential energy surface used is crude by current standards (it has a nonphysical well that yields a stable H3 molecule), this work established some of the critical ideas still used today to understand and model chemical dynamics.

The second field in which atomistic simulation has provided an important tool is statistical mechanics. In this case classical trajectories can be used to generate quasi-experimental many-body dynamics from which statistical mechanical theories can be derived or tested. The first use of continuous potentials in this field was the pioneering work of Rahman, who used classical dynamics simulations to probe the structure and many-body dynamics of liquid argon.7

In both these fields, the forces needed to calculate atomic trajectories can be obtained from the gradient of an analytic potential energy function. The importance of the potential function itself, however, has traditionally been quite different. For many statistical mechanical applications, the important quantity is correlated many-body motion; this can result from even very simple interatomic forces. Therefore the potential function often plays a secondary role in the interpretation of the results of the simulation. On the other hand, chemists are often concerned with understanding specific chemical reactions and with comparing simulation results to very detailed experimental data.
In this case an accurate potential energy function is crucial because it governs the details of each atomic trajectory. Therefore tremendous effort has gone into developing and refining potential energy surfaces for a wide range of specific reactions.8

More recently a new field has emerged where atomistic simulations are being used in biological applications. In drug design, for example, simulations are being used to model structure-binding relations.9 Potential functions used
for this application typically model short-range covalent bonding with valence force fields composed of bond-stretching and -bending terms, and a combination of Lennard-Jones (or equivalent) pair-additive potentials and Coulombic terms to model nonbonded interactions.10

The first example of the application of atomistic simulation to a materials-related area is probably the work of Vineyard and co-workers.11 They used classical trajectories to model damage induced in a solid by bombardment with ions having hyperthermal kinetic energies. These calculations, which were done at about the same time as Rahman’s initial studies on liquids, provided important data related to damage depth as well as new insights into many-body collisions in solids. The potentials used were continuous pair-additive interactions similar to those employed in Rahman’s simulations. Pair-additive interactions continued to be used in most materials-related simulations for over 20 years after Vineyard’s work despite well-known deficiencies in their ability to model surface and bulk properties of most materials. Quantitative simulation of materials properties was therefore very limited. A breakthrough in materials-related atomistic simulation occurred in the 1980s, however, with the development of several many-body analytic potential energy functions that allow accurate quantitative predictions of structures and dynamics of materials.12-14 These methods demonstrated that even relatively simple analytic interatomic potential functions can capture many of the details of chemical bonding, provided the functional form is carefully derived from sound physical principles.

With the advent of more powerful computers and increasingly clever algorithms, interatomic forces can now be routinely obtained from first-principles total-energy calculations for medium-sized systems (currently up to about a thousand atoms).
The Car-Parrinello method in particular has found tremendous use in materials simulation.15 This approach has many advantages over analytic potential energy functions: it is not necessary to guess appropriate mathematical expressions; there are no parameters to fit (although an appropriate basis set and pseudopotential must still be chosen); and electronic properties can be calculated simultaneously. However, at present first-principles calculations remain too compute-intensive to be practical for simulations requiring large systems and relatively long times. Therefore carefully developed analytic potential energy functions still play an important role in materials simulation.
Analytic Potentials and Materials Simulation

Analytic interatomic potentials can be used for a variety of purposes in studying materials. For example, minimum energy structures for surface reconstructions, grain boundaries, and related defects can be estimated. From calculations of this type, one can often suggest a probable structure among several
Figure 1 A graphitic phase at a grain boundary in diamond predicted by a simulation using a many-body analytic potential function. The simulation conditions are described in Ref. 16.
possible choices or predict new structures that had not been previously explored (see Figure 1).16 It is also often possible to “prescreen” candidate structures before more compute-intensive first-principles calculations are attempted. To carry out studies of these types effectively by means of an analytic potential energy function, the function must be constructed and parameterized so that it possesses certain crucial characteristics. These include the following:

1. Flexibility. The function should be flexible enough to accommodate inclusion of a relatively wide range of structures in a fitting database.
2. Accuracy. The potential function must be able to accurately reproduce quantities such as energies, bond lengths, elastic constants, and related properties that enter a fitting database.
3. Transferability. The function should be able to reproduce related properties that are not included in the fitting database.
4. Computational Efficiency. The function should be of a form such that it is tractable for a desired calculation given available computing resources.

Forms that emphasize points 1 and 2 above are often tried during the development of analytic potential energy functions, on the assumption that transferability will follow. However, as potential energy functions have continued to be developed, it has become clear that this does not necessarily happen. In fact, in many cases the opposite is true. Anyone who has tried to fit many data points with a high order polynomial has probably had the experience of having the function go through each data point while deviating from any meaningful trends beyond, and even between, the data points. This can also happen (although perhaps more subtly) for potential energy functions that are developed to fit a range of physical properties but have little in the way of predictive capabilities. The forms best suited for predicting materials properties are not generally those
with the most parameters, but those whose functional forms best reflect quantum mechanical bonding.
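The polynomial analogy above can be tried numerically. The following is a minimal sketch, assuming a Morse curve as a stand-in for "true" bonding data; the Morse parameters, the eight sample distances, and the test distances are arbitrary illustrative choices, not values from this chapter.

```python
import numpy as np

def morse(r, depth=1.0, a=1.5, r0=1.0):
    """Morse pair potential; the parameters are arbitrary illustrative values."""
    return depth * (1.0 - np.exp(-a * (r - r0))) ** 2 - depth

# Hypothetical "fitting database": eight energies sampled near the minimum.
r_fit = np.linspace(0.8, 1.5, 8)
e_fit = morse(r_fit)

# A degree-7 polynomial has eight coefficients, so it can pass through
# all eight data points.
coeffs = np.polyfit(r_fit, e_fit, deg=7)
fit_error = np.max(np.abs(np.polyval(coeffs, r_fit) - e_fit))

# Transferability check: evaluate the fit well outside the fitted range.
r_test = np.array([2.0, 2.5, 3.0])
extrapolation_error = np.max(np.abs(np.polyval(coeffs, r_test) - morse(r_test)))

print(f"max error at fitted points: {fit_error:.2e}")
print(f"max error beyond the fit:   {extrapolation_error:.2e}")
```

The fit reproduces its database essentially exactly, yet its error outside the fitted range is many orders of magnitude larger. This is the behavior described above for forms chosen for flexibility rather than for physical content.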
FRAMEWORK FOR BONDING: DENSITY FUNCTIONAL THEORY

Quantum mechanics tells us that all the properties of a system of interacting particles can be obtained from the system’s wavefunction. To determine the time-independent wavefunction $\psi$, Schrödinger’s equation
$$H\psi = E\psi \qquad [1]$$
must be solved, where $H$ is the time-independent quantum mechanical Hamiltonian operator for the system, and $E$ is the energy. For systems composed of electrons and nuclei, one can often invoke the Born-Oppenheimer approximation and assume that the wavefunction is separable into individual wavefunctions for the electrons and nuclei. The wavefunction for the electrons can then in principle be determined from Eq. [1] by assuming that the electrons move in the positive field arising from the stationary nuclei. In practice, however, Eq. [1] cannot be solved analytically for many-electron systems, so some form of approximation is required.

One of the earliest approximations for calculating an electronic wavefunction is due to Hartree.17 Rather than treat the Coulomb interaction between each pair of electrons explicitly, this method assumes that each electron interacts with the positive Coulomb field due to the nuclei and the negative Coulomb field from the “smeared” average total charge density of the electrons. The latter electrostatic field, called the Hartree potential $V_H(r)$, is given by the classical formula

$$V_H(r) = \int \frac{\rho(r')}{|r - r'|}\,dr' \qquad [2]$$
where $\rho(r')$ is the electron density, and $r$ is position. This yields the “one-electron approximation,” where each electron moves independently of the others, and leads to a series of single-electron equations of the form

$$[T + V_H(r) + V_N(r)]\,\psi_k = \epsilon_k \psi_k \qquad [3]$$

where $T$ is the kinetic energy operator, $V_N(r)$ is the potential energy due to the positive nuclei, and $\psi_k$ and $\epsilon_k$ are the one-electron orbitals and energies, respectively. Hartree’s equations can be solved self-consistently by guessing an initial
charge density, solving for the one-electron orbitals, and then readjusting the Hartree potential iteratively until the charge density no longer changes. The orbitals are usually expanded in a linear combination of functions (typically atom-centered atomic orbitals) with coefficients for each of the orbitals determined from the solution of a secular equation. To account for the antisymmetric nature of wavefunctions, the total electronic energy is obtained by applying the Hamiltonian to a Slater determinant constructed from the single-electron wavefunctions obtained from Eq. [3]. This leads to the total electronic energy expression

$$E_{\mathrm{HF}}[\rho(r)] = T[\rho(r)] + \int \rho(r)\,[V_N(r) + \tfrac{1}{2}V_H(r)]\,dr \qquad [4]$$
where $T[\rho(r)]$ is the kinetic energy of a noninteracting electron gas with a density $\rho(r)$. Obtaining the self-consistent solution to the Hartree-Fock equation for the electrons is equivalent to minimizing the energy $E_{\mathrm{HF}}[\rho(r)]$. There are well-known limitations to Hartree’s approach, most of which are related to approximating a many-body wavefunction by a single Slater determinant and the neglect of electron correlation. This method does, however, provide a good starting point from which more sophisticated approaches for calculating wavefunctions can be applied. For example, configuration-interaction methods can be used to calculate correlation energies.17 For materials, however, most structures of interest are composed of more than a few atoms, hence many electrons. Directly calculating a many-body wavefunction for these systems, even within the independent-electron approximation, is an extremely difficult task.

In 1964 Hohenberg and Kohn proved a fundamental theorem in quantum mechanics that led to a breakthrough in calculating properties of materials.18 They showed that all properties of an interacting system of electrons with a nondegenerate energy in an external potential can be uniquely determined from the electron density. Furthermore, they showed that the ground state electronic energy is a unique functional of the electron density (a functional is a function of a function) for a given external potential, and this energy is minimized by the correct electron density. The latter is analogous to the variational principle used to construct the best approximation to an exact wavefunction for a given basis set. Because the form of the exchange-correlation functional is not known, the Hohenberg and Kohn theorem does not lead to a direct solution of the many-body electron problem. However, the power of density functional theory is that one deals with the electron density $\rho(r)$, which depends on only three coordinates, rather than a full many-body wavefunction.
Therefore density functional theory casts the calculation of electronic energies into a slightly more tractable form than is obtainable from methods that attempt to calculate a full wavefunction. For most cases of interest, the external potential is simply the Coulombic field due to the nuclei, and the electron energy $E_{\mathrm{DF}}$ in density functional theory can be written exactly as

$$E_{\mathrm{DF}}[\rho(r)] = E_{\mathrm{HF}}[\rho(r)] + E_{xc}[\rho(r)] \qquad [5]$$
where $E_{\mathrm{HF}}$ is the Hartree expression given above, and $E_{xc}$ is called the exchange-correlation functional. The difference between Eqs. [4] and [5] is therefore an (unknown) exchange-correlation functional $E_{xc}$.

In 1965 Kohn and Sham used the variational principle of Hohenberg and Kohn to derive a system of one-electron equations which, like the Hartree approach, can be self-consistently solved.19 For this case, however, the electron densities obtained from the orbitals (called Kohn-Sham orbitals) are an exact solution to the many-body problem (for a complete basis) given the density functional. Hence the task of determining the electronic energy is changed from calculating the full many-body wavefunction to determining the best approximation to the density functional. The analogy between the Hartree approach of Eq. [3] and density functional equations is straightforward. The Kohn-Sham one-electron equations can be written as follows:
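$$[T + V_H(r) + V_N(r) + V_{xc}(r)]\,\psi_i^{\mathrm{KS}} = \epsilon_i \psi_i^{\mathrm{KS}} \qquad [6]$$

where $V_{xc}(r)$, called the exchange-correlation potential, is the functional derivative of the exchange-correlation energy

$$V_{xc}(r) = \frac{\delta E_{xc}[\rho(r)]}{\delta \rho(r)} \qquad [7]$$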
When solved self-consistently, the electron densities obtained from Eq. [6] can be used in Eq. [5] to calculate the total electronic energy. This is equivalent to the relationship between Eqs. [3] and [4] for the Hartree approach. Unlike the Hartree approximation, however, this expression takes into account exchange and correlation interactions between electrons directly, and requires no approximations other than the form of the density functional. Rather than using Eq. [5] to obtain the electronic energy, the expression
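$$E_{\mathrm{KS}}[\rho^{\mathrm{sc}}(r)] = \sum_k \epsilon_k - \int \rho^{\mathrm{sc}}(r)\,[\tfrac{1}{2}V_H(r) + V_{xc}(r)]\,dr + E_{xc}[\rho^{\mathrm{sc}}(r)] \qquad [8]$$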
is usually used, where $\rho^{\mathrm{sc}}$ is the self-consistent electron density and $\epsilon_k$ are the eigenvalues of the Kohn-Sham orbitals. The integral on the right-hand side of Eq. [8] corrects for the fact that the eigenvalue sum includes the exchange-correlation energy and double counts the electron-electron Coulomb interactions. These are usually referred to as the double-counting terms.
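The Coulomb double counting described above can be checked numerically for a simple case. The following is a minimal sketch, assuming the hydrogen 1s density as a test density in atomic units; the radial grid, the helper name `cumulative_trapezoid`, and the choice of density are illustrative assumptions, not part of the original treatment.

```python
import numpy as np

def cumulative_trapezoid(y, x):
    """Running trapezoid-rule integral of y(x), starting from zero at x[0]."""
    steps = 0.5 * (y[1:] + y[:-1]) * np.diff(x)
    return np.concatenate(([0.0], np.cumsum(steps)))

# Radial grid in bohr; the limits and spacing are arbitrary choices.
r = np.linspace(1e-6, 30.0, 60001)

# Test density: hydrogen 1s, normalized to one electron.
rho = np.exp(-2.0 * r) / np.pi

# For a spherical density the Hartree integral splits into the charge inside
# radius r (which acts like a point charge) and the shells outside r.
charge_inside = cumulative_trapezoid(4.0 * np.pi * r**2 * rho, r)
outer = cumulative_trapezoid(4.0 * np.pi * r * rho, r)
v_hartree = charge_inside / r + (outer[-1] - outer)

# Closed-form Hartree potential of the 1s density, used only as a check.
v_exact = 1.0 / r - np.exp(-2.0 * r) * (1.0 + 1.0 / r)
max_potential_error = np.max(np.abs(v_hartree - v_exact))

# An eigenvalue sum contains the full integral of rho * V_H ...
rho_vh = cumulative_trapezoid(4.0 * np.pi * r**2 * rho * v_hartree, r)[-1]
# ... which is twice the classical electrostatic energy, so half of it
# has to be subtracted again.
hartree_energy = 0.5 * rho_vh

print(f"integral of rho*V_H: {rho_vh:.4f} hartree")
print(f"Hartree energy:      {hartree_energy:.4f} hartree")
```

For this density the classical electrostatic energy is 5/16 hartree, while the integral of the density times the Hartree potential is 5/8 hartree. That factor of two is the reason for the one-half in the Coulomb part of the correction.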
It should be understood that the Kohn-Sham orbitals are strictly good only for generating the electron densities that enter Eqs. [5] or [8], and the eigenvalues for the one-electron orbitals are not necessarily the same values that would be obtained from an exact solution for the many-body wavefunction. Nonetheless, the orbital energies are often used to estimate reasonable values for quantities such as electron affinities and electronegativities.

Whereas the Kohn-Sham orbital approach is in principle exact, it requires knowing the exchange-correlation energy and potential. A widely used approximation is called the local density approximation (LDA) in which the exchange-correlation functional is given by

$$E_{xc}[\rho(r)] = \int \rho(r)\,\epsilon_{xc}[\rho(r)]\,dr \qquad [9]$$
where $\epsilon_{xc}[\rho]$ is the exchange-correlation energy per electron of a uniform electron gas with density $\rho$. This functional has been calculated by Monte Carlo and diagrammatic methods,20 and good parameterizations exist based on these calculations.21 Mathematically, the local density approximation should work well only for slowly varying electron densities, although in practice it gives accurate atomic configurations and energies for a relatively wide range of systems.

Density functional theory has shaped the development of analytic potential energy functions in several very important ways. First, it provides a formalism from which effective non-self-consistent, total-energy schemes can be derived. Second, it obviates the need to resort to ad hoc approximations and parameter fitting by giving a systematic way of deriving functions entering more approximate schemes. Finally, it introduces the concept of electron density as the central quantity for calculating electronic energies. Approximating electron densities analytically is much easier than trying to approximate a full wavefunction.
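To make the local density approximation concrete, the sketch below evaluates a local functional of the form "density times an energy per electron, integrated over space" for the hydrogen 1s density. It is a minimal illustration under stated assumptions: only the exchange part is included, using the standard spin-unpolarized uniform-gas exchange formula in atomic units, and the correlation part (which comes from the parameterized Monte Carlo results) is left out. The grid and function names are illustrative.

```python
import numpy as np

def lda_exchange_energy_density(rho):
    """Exchange energy per electron of a uniform electron gas (hartree)."""
    return -0.75 * (3.0 / np.pi) ** (1.0 / 3.0) * rho ** (1.0 / 3.0)

def integrate_radial(f, r):
    """Trapezoid-rule integral of a spherically symmetric function over all space."""
    integrand = 4.0 * np.pi * r**2 * f
    return np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(r))

# Radial grid in bohr and the hydrogen 1s density (one electron).
r = np.linspace(1e-6, 30.0, 60001)
rho = np.exp(-2.0 * r) / np.pi

n_electrons = integrate_radial(rho, r)

# Local approximation: at each point use the uniform-gas value for the local density.
e_x_lda = integrate_radial(rho * lda_exchange_energy_density(rho), r)

print(f"electrons:           {n_electrons:.5f}")
print(f"LDA exchange energy: {e_x_lda:.4f} hartree")
```

The whole evaluation needs only the density on a grid, which is the practical appeal of working with the density instead of a many-body wavefunction.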
BRIDGE TO ANALYTIC FORMS: THE HARRIS FUNCTIONAL

One of the strengths of density functional theory is that the error in electronic energy is second order in the difference between a given electron density and the true ground state density. This means, for example, that a 10% error in the electron density will lead to an error in energy of the order of 1%. Based on this property, it may be possible to obtain reasonably good estimates for electronic energies without having to iterate to a fully self-consistent solution. This would, of course, save considerable computer time for calculating total energies.

The obvious procedure would be to construct the Kohn-Sham orbitals from a given input charge density, and then sum the occupied Kohn-Sham orbital energies and calculate the double-counting terms in Eq. [8] from the output density given by these orbitals. However, although summing the orbital energies is straightforward, calculating the double-counting terms can still be very compute-intensive, depending on the basis set and system size. It is therefore desirable to find a way in which calculating the double-counting terms can be simplified while still maintaining the second-order error in energy.

Working independently, Harris22 as well as Foulkes and Haydock23 showed that the electronic energy calculated from a single iteration of the energy functional

$$E_{\mathrm{Harris}}[\rho^{\mathrm{in}}(r)] = \sum_k \epsilon_k^{\mathrm{out}} - \int \rho^{\mathrm{in}}(r)\,[\tfrac{1}{2}V_H^{\mathrm{in}}(r) + V_{xc}^{\mathrm{in}}(r)]\,dr + E_{xc}[\rho^{\mathrm{in}}(r)] \qquad [10]$$
is also second-order in the error in charge density. This differs from the usual density functional approach in that while the Kohn-Sham orbital energies are still calculated, the double-counting terms involve only the input charge density, not the density given by these orbitals. Therefore this functional, generally referred to as the Harris (or sometimes Harris-Foulkes) functional, yields two significant computational benefits over a full density functional calculation: the input electron density can be chosen to simplify the calculation of the double-counting terms, and it does not require self-consistency. The correct input electron density used in the Harris functional will yield the correct ground state energy. However, this expression is not variational; hence the Harris functional can give an energy either higher or lower than the true ground state energy. In practice it has been found that with a judicious choice of input density, this expression can produce results that match fully self-consistent calculations reasonably well, or at least better than the variational upper bound produced by a full single step of the Kohn-Sham procedure. Furthermore, this expression provides theoretical justification for more approximate approaches to calculating total energies, as discussed below.

A number of tests on specific systems and detailed analyses of the Harris functional have been performed. Harris, for example, calculated binding energies, minimum energy distances, and force constants for the diatomic molecules Be2, C2, N2, F2, and Cu2 using the Harris functional and a sum of overlapped atomic densities.22 Foulkes and Haydock performed similar calculations for the diatomic molecules H2, He2, and Ge2.23 In each case, the Harris functional produces values comparable to those obtained from fully self-consistent density functional calculations.
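The second-order error property can be illustrated with a simpler stationary quantity. The sketch below does not implement the Harris functional; it uses the Rayleigh quotient of a small random symmetric matrix, which is a hypothetical stand-in chosen only because its energy is also stationary about the exact solution. Unlike the Harris functional, the Rayleigh quotient is variational, so its error is always positive.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical model Hamiltonian: a random real symmetric matrix.
m = rng.normal(size=(6, 6))
h = 0.5 * (m + m.T)

eigenvalues, eigenvectors = np.linalg.eigh(h)
e_ground = eigenvalues[0]
v_ground = eigenvectors[:, 0]
v_excited = eigenvectors[:, 1]   # a direction orthogonal to the ground state

def rayleigh_quotient(v):
    """Energy expectation value of the trial vector v."""
    return (v @ h @ v) / (v @ v)

def energy_error(delta):
    """Energy error when the trial vector is wrong by a fraction delta."""
    trial = v_ground + delta * v_excited
    return rayleigh_quotient(trial) - e_ground

error_10_percent = energy_error(0.10)
error_1_percent = energy_error(0.01)
ratio = error_10_percent / error_1_percent

print(f"error for a 10% input error: {error_10_percent:.3e}")
print(f"error for a  1% input error: {error_1_percent:.3e}")
print(f"ratio: {ratio:.1f}")
```

Reducing the input error by a factor of 10 reduces the energy error by roughly a factor of 100, which is the scaling behind the statement that a 10% error in the input leads to an energy error of the order of 1%.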
The Harris functional has shown similar success in calculating bulk cohesive energies, lattice constants, and compressibilities for a range of crystalline solids, including aluminum, silicon, carbon, germanium, and sodium chloride.24,25 The results of these calculations are summarized in Tables 2 and 3.

Table 2. Properties of Diatomic Molecules Given by the Harris Functional and Fully Self-Consistent Density Functional/Local Density Approximation (SC-DF/LDA) Calculations

| Molecule(a) | Binding Energy (eV), Harris | Binding Energy (eV), SC-DF/LDA | Bond Distance (Å), Harris | Bond Distance (Å), SC-DF/LDA | Vibrational Frequency (meV), Harris | Vibrational Frequency (meV), SC-DF/LDA |
|---|---|---|---|---|---|---|
| Be2 | 0.5 | 0.5 | 4.50 | 4.63 | 45 | 45 |
| C2 | 8.7 | 7.2 | 2.20 | 2.36 | 246 | 232 |
| N2 | 10.7 | 11.3 | 2.03 | 2.08 | 346 | 296 |
| F2 | 3.7 | 3.3 | 2.71 | 2.62 | 120 | 133 |
| Cu2 | 2.9 | 2.7 | 4.10 | 4.10 | 35 | 41 |
| H2 | 4.9-5.2 | 4.8-4.9 | 0.73-0.79 | 0.76 | 496 | 475-523 |

(a) All from Ref. 22, except H2, which is from Ref. 23.

Comparisons of the aluminum (111) surface energy given by Harris functional and self-consistent density functional calculations have been carried out by Read and Needs24 as well as Finnis.26 Both papers conclude that the Harris functional is not particularly accurate when a superposition of atomic charge densities is used, although the results are better than those obtained with a single Kohn-Sham step. The energy and relaxation produced by the Harris functional can apparently be improved with a better choice of input charge density. Finnis, for example, found that accuracy improves considerably when contracted atomic orbitals are used to construct the input charge density, although the reason for this improvement is somewhat controversial.27

The Harris functional, together with other related simplifying approximations, has also been used to model a wide range of cluster and surface structures for silicon and carbon.28 With careful choice of nonspherical atomic orbitals used to construct the input density, these studies have demonstrated that the Harris functional can yield energies and structures that match self-consistent results relatively well. Furthermore, because these calculations are relatively efficient, this approach has been used to model full atomistic dynamics.
Table 3. Properties of Solids Given by the Harris Functional and Fully Self-Consistent First-Principles (SC-FP) Calculations(a)

| Crystal | Cohesive Energy (eV/atom), Harris | Cohesive Energy (eV/atom), SC-FP | Lattice Constant (Å), Harris | Lattice Constant (Å), SC-FP | Bulk Modulus (Mbar), Harris | Bulk Modulus (Mbar), SC-FP |
|---|---|---|---|---|---|---|
| Be | 4.47 | 4.28 | 2.21 | 2.24 | 1.40 | 1.32 |
| Al | 4.30 | 4.07 | 3.95 | 4.02 | 0.96 | 0.88 |
| V | 9.19 | 7.60 | 3.08 | 2.97 | 1.70 | 1.99 |
| Fe | 7.84 | 6.27 | 2.78 | 2.74 | 3.15 | 3.04 |
| Si | 4.74 | 4.79 | 5.44 | 5.48 | 0.96 | 0.90 |
| NaCl | 3.56 | 3.25 | 5.76 | 5.63 | 0.25 | 0.28 |

(a) From Ref. 25.
The Harris functional not only provides relatively accurate non-self-consistent estimates of energies and structures, it also provides a basis from which the success of other non-self-consistent approximate methods can be understood. One of the more widely used of these, the tight binding method, is discussed in the next section.
Tight Binding Method

The tight binding method refers to a class of semiempirical molecular orbital approximations that predate the introduction of density functional theory.29 This method has proven to be capable of producing energies, bond lengths, and vibrational properties that are relatively accurate and transferable for a wide range of structures, including bulk solids, surfaces, and clusters. Furthermore, with the recent development of linear scaling algorithms for solving secular equations,30 tight binding methods have been used to simulate systems of up to several tens of thousands of atoms.

The basic concept of tight binding is similar to extended Hückel theory. Tight binding expressions give the total energy $E_{\mathrm{tot}}$ for a system of atoms as a sum of eigenvalues $\epsilon_k$ of a set of occupied non-self-consistent, one-electron molecular orbitals plus some additional analytic function $A$ of relative atomic distances

$$E_{\mathrm{tot}} = A + \sum_k \epsilon_k \qquad [11]$$
The general idea behind this expression is to introduce approximate quantum mechanics through the eigenvalue calculation, and then apply corrections for these approximations needed to obtain reasonable total energies through the analytic function. The simplest and most widespread tight binding expression obtains the eigenvalues from a quantum mechanical wavefunction expanded in an orthonormal minimal basis of short-range atom-centered orbitals. One-electron molecular orbital coefficients and energies are calculated using the standard secular equation as is traditionally done for molecular orbital calculations. Rather than calculating many-body Hamiltonian matrix elements, however, these are usually taken as two-center terms that are parameterized to reproduce electronic properties such as band gaps, or they are sometimes adjusted along with the analytic function $A$ to enhance the transferability of total energies. For calculations involving defects, disordered structures, and related configurations, a distance dependence of the two-center terms must also be specified. A pair-additive sum over atomic distances is often assumed for the analytic function

$$A = \sum_i \sum_{j>i} \theta(r_{ij}) \qquad [12]$$
where $r_{ij}$ is the scalar distance between atoms $i$ and $j$. The function $\theta(r_{ij})$ models Coulomb repulsions between positive nuclei that are screened by core electrons plus corrections to the approximate quantum mechanics. These corrections include repulsions between core electrons neglected in a minimal basis set, non-self-consistency, the one-electron approximation, assumption of orthogonal atomic orbitals, and the neglect of multicenter integrals. While a pair sum may be justified for the core electrons, there is little a priori reason to assume that it takes into account each of the other approximations. Nonetheless it appears to work well for a range of materials.

Several approaches have been used for determining functional forms for the pair sum Eq. [12]. Once the Hamiltonian matrix elements had been specified, Chadi, for example, used a near-neighbor harmonic interaction for covalent materials where the force constants and minimum energy distances were fit to bulk moduli and lattice constants, respectively.31 This expression was then used to predict energies and bond lengths for surfaces and related structures. More recently, Ho and coworkers have fit the pair sum to the universal binding energy relation.32 This not only reproduces the lattice constant and bulk modulus, but also ensures reasonable nonlinear interatomic interactions that account for properties like thermal expansion.

A number of recent studies have attempted to improve on the “standard” tight binding approach. Rather than use a simple pair sum for $A$, for example, Xu et al.33 used the many-body expression

$$A = \sum_i P\Big[\sum_j \theta(r_{ij})\Big] \qquad [13]$$

for carbon, where $P$ is a polynomial function and $\theta(r)$ is an exponential splined to zero at a distance between the second and third nearest neighbors in a diamond lattice. The polynomial and pair terms were fit to several different solid state and molecular structures. The resulting potential produces binding energies that are transferable to a wide range of systems, including small clusters, fullerenes, diamond surfaces, and disordered carbon.

A number of researchers have suggested that a truly transferable tight binding expression will not allow the assumption that the basis functions are orthogonal. This necessitates the introduction of additional parameters describing the overlap integrals. Several methods have been introduced for determining these. Menon and Subbaswamy, for example, have relied on extended Hückel theory, which gives proportionality expressions between the overlap and Hamiltonian matrix elements. In a different approach, Frauenheim and co-workers have calculated Hamiltonian and overlap matrix elements directly from density functional calculations within the local density approximation. This approach is powerful because complications associated with an empirical fit are eliminated, yet the relative computational simplicity of a tight binding expression is retained.
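The structure of a tight binding total energy (occupied eigenvalues plus a pair sum) can be sketched in a few lines. This is a minimal illustration under stated assumptions: one orbital and one electron per atom, an orthogonal basis, two electrons per occupied orbital, and hypothetical exponential forms for both the two-center matrix element and the pair term. None of the parameters are fitted to any material.

```python
import numpy as np

def hopping(r, h0=-1.0, decay=1.0):
    """Two-center Hamiltonian element vs. distance (hypothetical exponential form)."""
    return h0 * np.exp(-decay * (r - 1.0))

def pair_repulsion(r, a=1.0, decay=2.0):
    """Pair term theta(r) (hypothetical exponential form)."""
    return a * np.exp(-decay * (r - 1.0))

def tight_binding_energy(positions, cutoff=1.5):
    """Sum of occupied eigenvalues plus pair sum; assumes an even number of atoms."""
    n = len(positions)
    hamiltonian = np.zeros((n, n))      # on-site energies taken as the energy zero
    repulsion = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            r = np.linalg.norm(positions[i] - positions[j])
            if r < cutoff:              # abrupt cutoff, a simplification
                hamiltonian[i, j] = hamiltonian[j, i] = hopping(r)
                repulsion += pair_repulsion(r)
    eigenvalues = np.linalg.eigvalsh(hamiltonian)   # returned in ascending order
    n_occupied = n // 2                             # two electrons per orbital
    band_energy = 2.0 * np.sum(eigenvalues[:n_occupied])
    return band_energy + repulsion

def dimer(r):
    """Two atoms separated by r along the x axis."""
    return np.array([[0.0, 0.0, 0.0], [r, 0.0, 0.0]])

distances = np.linspace(0.7, 1.4, 71)
energies = np.array([tight_binding_energy(dimer(r)) for r in distances])
r_min = distances[np.argmin(energies)]

print(f"dimer energy at r = 1: {tight_binding_energy(dimer(1.0)):.3f}")
print(f"minimum-energy distance on the grid: {r_min:.2f}")
```

With these parameters the attractive eigenvalue term and the repulsive pair term balance at r = 1 in the model's units. In a real parameterization the two functions would be fit to electronic properties and to structural data, as described above.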
Taken separately, the approximations used in the tight binding approach at first appear too severe to be useful, yet they apparently work well for many systems when taken together. Justification for this success has been given by Foulkes and Haydock based on the Harris functional.23 First, the use of non-self-consistent, one-electron molecular orbitals assumed in tight binding expressions is justified if they correspond to the Kohn-Sham orbitals constructed from the input charge density in the Harris functional. Second, if the input electron density is approximated by a sum of overlapping, atom-centered spherical orbitals, then it can be shown that the double-counting terms in the Harris functional are given by

$$\sum_a C_a + \sum_i \sum_{j>i} U_{ij}(r_{ij}) + U_{np} \qquad [14]$$
where $C_a$ is a constant intra-atomic energy, $U_{ij}(r_{ij})$ is a short-range pair-additive energy that depends on the scalar distance $r_{ij}$ between atoms $i$ and $j$, and $U_{np}$ is a non-pair-additive contribution that comes from the exchange-correlation functional. Haydock and Foulkes have shown that if the regions of overlap of electron densities from three or more atoms are small, $U_{np}$ is well approximated by a pair sum that can be added to $U_{ij}$. Hence the assumption of pair additivity for the function $A$ in Eq. [12] is justified. Finally, spherical atomic orbitals lead to the simple form

$$V(r) = \sum_i V_i(r) + U(r) \qquad [15]$$
for the one-electron potential needed to calculate the orbital energies in the Harris functional. The function $V_i(r)$ is an additive atomic term that includes core electrons as well as Hartree and exchange-correlation potentials, and $U(r)$ arises from nonlinearities in the exchange-correlation functional. Although not two-centered, the contribution of the latter term is relatively small. Thus the use of strictly two-center matrix elements in the tight binding Hamiltonian is also justified.
Second Moment Approximation and Finnis-Sinclair Potential

Whereas the combinations of approximations involved in tight binding methods make them significantly less compute-intensive than density functional calculations, having to calculate molecular orbitals that may span many atoms can still be very compute-intensive for large systems. Therefore more efficient methods that further approximate quantum mechanical bonding and do not require solving a secular equation are desirable for large-scale materials simulation.
of chemical bonds is due to the broadening of electronic orbital energies as atoms are brought together. An example of this broadening is illustrated in Figure 2, where molecular orbital energies arising from valence p orbitals on a collection of carbon atoms are plotted against the number of orbitals with each energy as the atoms form a diamond lattice. The orbital energies were calculated from the minimum basis set, tight binding Hamiltonian of Xu et al.33 for a supercell of 64 atoms with periodic boundaries. The lattice is greatly expanded so that there are no interactions between valence s and p bands. As the atoms come together, the degeneracy of the initial three valence p orbitals on each atom is lost, and molecular orbitals form with energies above and below that of the atomic orbitals. Following Hund’s rule, the molecular orbitals are
[Figure 2 plot: orbital energy (vertical axis) versus relative number of states (horizontal axis).]

Figure 2 Molecular orbital energies arising from a linear combination of atomic p orbitals plotted against the relative number of electronic states with each energy. Data are for 64 carbon atoms arranged in a diamond lattice with periodic boundaries in each direction. Energies (in eV) are calculated from the tight binding Hamiltonian of Xu et al. (Ref. 33). $a_0$ is the experimental lattice constant for diamond, $a$ is the lattice constant used in the calculations, and $\mu_2$ is the second moment of the density of states.
filled by pairs of electrons beginning with the lowest energy. The total electronic energy is the sum of occupied orbital energies. Because the valence p orbitals are not completely filled, more lower energy orbitals are occupied than higher energy orbitals. This arrangement results in a lower overall electronic energy relative to the free atoms and the formation of chemical bonds. The electronic energy associated with these bonds is the sum of the occupied orbital energies minus the sum of the occupied atomic orbital energies of the individual atoms.

Electronic molecular orbital energies can be conveniently described using the density of states $D(e)$. This is a function that gives the number of electronic states in the interval between energy $e$ and $e + de$

$$D(e) = \sum_k \delta(e - \epsilon_k) \qquad [16]$$

where $\delta$ is a delta function, $\epsilon_k$ is the energy of molecular orbital k, and the sum is over both occupied and unoccupied orbitals. Using this function, the total energy $E_{tot}$ arising from the electrons can be written as

$$E_{tot} = \sum_k^{occ} g_k \epsilon_k \qquad [17]$$

where $g_k$ is the degeneracy of level k, and the sum is now over all k occupied orbitals. For an infinite solid, the energy difference between the delta functions in Eq. [16] becomes infinitesimally small, and the density of states becomes a continuous function. The sum in Eq. [17] can then be replaced by the integral

$$E_{tot} = \int^{E_F} e\, D(e)\, de \qquad [18]$$

whose upper limit is the Fermi energy $E_F$. As with any distribution, the shape of the density of states can be described by its moments about some energy value. We can conveniently choose moments about the energy of the atomic orbital $\epsilon_{atomic}$ from which the molecular orbitals are formed, so that the nth moment of the density of states is given by

$$\mu_n = \int (e - \epsilon_{atomic})^n D(e)\, de = \sum_k (\epsilon_k - \epsilon_{atomic})^n \qquad [19]$$

With this definition (and neglecting charge transfer), the first moment of the distribution $\mu_1$ is zero. The second moment $\mu_2$ describes the width of the density of states. The third and fourth moments describe the degree of skewness about the center of the distribution and the tendency to form a gap in the middle of the distribution, respectively.
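As a minimal numerical sketch of these moments, the Python fragment below builds a toy nearest-neighbor tight binding Hamiltonian and evaluates the first four moments of its orbital energies about the atomic level. The ring geometry, the hopping value, and the function names `ring_hamiltonian` and `dos_moments` are illustrative assumptions, not part of the original treatment.

```python
import numpy as np

def ring_hamiltonian(n_sites, hop=-1.0, e_atomic=0.0):
    """Toy model (assumed): one orbital per atom on a ring, nearest-neighbor hopping."""
    H = e_atomic * np.eye(n_sites)
    for i in range(n_sites):
        j = (i + 1) % n_sites          # periodic neighbor
        H[i, j] = H[j, i] = hop
    return H

def dos_moments(levels, e_atomic, n_max=4):
    """Unnormalized moments mu_1..mu_n_max of a discrete set of levels about e_atomic."""
    shifted = np.asarray(levels, dtype=float) - e_atomic
    return [float(np.sum(shifted ** n)) for n in range(1, n_max + 1)]

e_atomic = -2.0                         # hypothetical atomic orbital energy
levels = np.linalg.eigvalsh(ring_hamiltonian(8, hop=-1.0, e_atomic=e_atomic))
mu1, mu2, mu3, mu4 = dos_moments(levels, e_atomic)
print(f"mu1={mu1:.3f}  mu2={mu2:.3f}  mu3={mu3:.3f}  mu4={mu4:.3f}")
```

For this symmetric toy spectrum the first and third moments vanish to rounding error, while the second moment measures the spread of the levels about the atomic energy.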
If all moments of the density of states were known, the electronic energy could be calculated exactly through Eq. [18] without having to explicitly calculate all the molecular orbitals. Therefore the problem of calculating orbital energies is transformed to estimating the moments of the density of states. Because the binding energy relative to the free atoms comes primarily from the spread in orbital energies, it is reasonable to assume that the energy should be most closely related to the second moment of the density of states. Figure 3 shows the sum of the energies of the occupied orbitals for the distributions from Figure 2 versus the square root of their second moments. There is a strong linear dependence between these two quantities. Therefore it is possible, at least with this relatively simple example, to calculate the electronic bond energy without having to calculate either the orbitals or the higher order moments of the density of states. Instead, one needs to know only the second moment of the density of states and the equation of the line relating the square root of the second moment to the sum of the occupied orbital energies. This is a considerable simplification of the electronic orbital energy problem.

Figure 3 Sum of the occupied orbital energies (in eV) versus the square root of the second moment of the distributions illustrated in Figure 2. (Vertical axis: bond energy.)

The discussion thus far is for periodic lattices in which all atoms have the same environment. For many materials simulations, we are interested in the properties of things like surfaces and defects around which atomic bonding environments can differ from atom to atom. It is therefore desirable to apply the same ideas locally to each atom. To do this, a local density of states $d_i(e)$ can be defined for each atom i in which the contribution of each molecular orbital is weighted by the “amount” of the orbital on the atom. If an orbital has a node on a particular atom, for example, it does not contribute to the atom’s local density of states. For a molecular orbital expanded in an orthonormal linear basis of atomic orbitals on each atom, this weight is simply the sum of the squares of the linear expansion coefficients for the atomic orbitals centered on the atom of interest. The electronic bond energy $E_i^{el}$ for each individual atom i can be defined through the local density of states by

$$E_i^{el} = \int^{E_F} e\, d_i(e)\, de \qquad [20]$$
where the upper limit of the integral is the Fermi energy. With this definition, the global density of states is recovered by summing all the local densities of states

$$D(e) = \sum_i d_i(e) \qquad [21]$$
and the total electronic energy is the sum of the electronic energies associated with the individual atoms. Following the discussion above, an electronic bond energy for each atom can be determined if the second moment of the local density of states and the equation of the line giving the relationship between the square root of this value and the energy are known. This would not be of much use, however, if one had to calculate all the molecular orbitals to obtain the expansion coefficients needed to determine the local densities of states. Fortunately there is a very powerful theorem, called the moments theorem, that relates the moments of the local density of states to the bonding topology but does not require the explicit calculation of molecular orbitals.37,38 This theorem states that the nth moment of the local density of states on an atom i is determined by the sum of all paths of n hops between neighboring atoms that start and end at atom i. Some of these paths are illustrated in Figure 4. To obtain an exact local density of states for a given atom, one would have to know all the moments, and therefore all the possible paths starting and ending at that atom would have to be determined. This quickly becomes a nontrivial calculation as higher moments are determined. However, it was demonstrated above that a good estimate of the bond energy can be obtained if one knows only the second moment. In this case only loops requiring two “hops” are involved, and the number of such loops is simply the number of nearest neighbors z. Therefore we can conclude from this analysis that the local electronic bond energy for each atom arising from the molecular orbitals is approximately proportional to the square root of the number of neighbors

$$E_i^{el} \propto z^{1/2} \qquad [22]$$
This result is called the second-moment approximation.
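The moments theorem can be checked numerically on a small model. The sketch below, which assumes a single orbital per site on a periodic two-dimensional square lattice with one hypothetical hopping integral, computes a local moment two ways: from the molecular orbitals (eigenvectors weighted by their squared coefficients on the site) and from closed hopping paths (a diagonal element of a power of the Hamiltonian). All function names are illustrative.

```python
import numpy as np

def square_lattice_hamiltonian(L, hop=-1.0):
    """Assumed toy model: L x L periodic square lattice, nearest-neighbor hopping."""
    n = L * L
    H = np.zeros((n, n))
    for x in range(L):
        for y in range(L):
            i = x * L + y
            for dx, dy in ((1, 0), (0, 1)):
                j = ((x + dx) % L) * L + (y + dy) % L
                H[i, j] = H[j, i] = hop
    return H

def local_moment_from_orbitals(H, site, n):
    """n-th local moment from orbital energies weighted by |coefficient|^2 on the site."""
    energies, coeffs = np.linalg.eigh(H)   # columns of coeffs are the orbitals
    weights = coeffs[site, :] ** 2
    return float(np.sum(weights * energies ** n))

def local_moment_from_paths(H, site, n):
    """n-th local moment as the sum over closed n-hop paths: (H^n)_ii."""
    return float(np.linalg.matrix_power(H, n)[site, site])

H = square_lattice_hamiltonian(6)
m2_orbitals = local_moment_from_orbitals(H, 0, 2)
m2_paths = local_moment_from_paths(H, 0, 2)
m4_orbitals = local_moment_from_orbitals(H, 0, 4)
m4_paths = local_moment_from_paths(H, 0, 4)
print(m2_orbitals, m2_paths, m4_orbitals, m4_paths)
```

With unit hopping the second moment equals the coordination number of the lattice, so the path-counting route gives the same number as the full diagonalization without ever forming the orbitals.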
Figure 4 Illustration of the “hops” on a two-dimensional square lattice that contribute to the second and fourth moments of the local density of states in tight binding theory. (Panel titles: Two hops, 2nd moment; Four hops, 4th moment.)
To develop an analytic potential function from the second-moment approximation, the definition of neighboring atoms needs to be addressed. For regular solids, the nearest neighbors can be determined unambiguously. For disordered systems and many defects, the range of possible bond lengths makes it less clear which atoms are nearest neighbors. In the tight binding approximation, the magnitudes of the interactions between orbitals on different atoms are determined by the two-center, off-diagonal Hamiltonian matrix elements. Because atomic electron densities decay exponentially from the nucleus, these can be approximated by exponential functions. Similarly, neighbors can be defined in an analytic expression by exponential functions of the distance between atoms. This yields a bond energy in the second-moment approximation of

$$E_i^{el} \propto \Big( \sum_j e^{-\beta r_{ij}} \Big)^{1/2} \qquad [23]$$

where $r_{ij}$ is the distance between atoms i and j and $\beta$ is a parameter. The total density of states is the sum of the local densities of states, and so the total bond energy for a system of atoms is the sum of the bond energies of the individual atoms. Including a proportionality constant B in Eq. [23] and adding pair-additive interactions between atoms to balance the bond energy yields the Finnis-Sinclair analytic potential energy function13

$$E_{tot} = \sum_i \Big[ \sum_j A e^{-\alpha r_{ij}} - B \Big( \sum_j e^{-\beta r_{ij}} \Big)^{1/2} \Big] \qquad [24]$$

Here the first sum is the pair-additive repulsion, written with exponential parameters A and $\alpha$.
This is a particularly simple expression that still captures much of the essence of quantum mechanical bonding. Hence it can be used in large-scale simulations to introduce basic elements of quantum mechanics into the interatomic forces.

Finnis and Sinclair initially developed their expression for the body-centered elements vanadium, niobium, tantalum, chromium, molybdenum, tungsten, and iron, which all have unfilled d shells.13 Because the d orbitals are relatively high in energy compared to the s and p shells, there is comparatively little intershell mixing. Hence the second-moment approximation should be relatively valid. The parameters in the expression were systematically fit to cohesive energies, lattice constants, and elastic moduli. Predictions of surface and defect energies were found to be in reasonable agreement with experimental values, in contrast to results generally obtained with simple pair potentials.

The Finnis-Sinclair analytic functional form was introduced at about the same time as two other similar forms, the embedded-atom method12,39 and the “glue” model.40 However, the derivation of the Finnis-Sinclair form from the second-moment approximation is very different from the interpretation of the other empirical forms, which are based on effective medium theory as discussed later. This difference in interpretation is often ignored, and all three methods tend to be put into a single class of potential energy function. In practice, the main difference between the methods lies in the systems to which they have been traditionally applied. In developing the embedded-atom method, for example, Baskes, Daw, and Foiles emphasized close-packed lattices rather than body-centered-cubic lattices.12,39 Given that angular interactions are usually ignored in these approaches (with a few exceptions), this potential form appears to work better for describing properties of close-packed metals.
There has been considerable effort since the introduction of the Finnis-Sinclair potential to develop expressions that include angular interactions and higher moments of the local density of states.41,42 Carlson and coworkers, for example, have introduced a matrix form for the moments of the local density of states from which explicit environment-dependent angular interactions can be obtained.41 The role of the fourth moment, in particular, has been stressed for half-filled bands because it describes the tendency to introduce an energy gap. These investigations have led to improved models that describe local bonding in both covalent and body-centered-cubic materials.
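A second-moment potential of the kind described above is short to evaluate. The sketch below assumes exponential forms for both the pair repulsion and the neighbor-counting sum, with a square-root bonding term per atom; the parameter names `A`, `alpha`, `B`, and `beta` and the numerical values are placeholders rather than fitted Finnis-Sinclair parameters.

```python
import numpy as np

def second_moment_energy(positions, A, alpha, B, beta):
    """Total energy: sum_i [ sum_j A exp(-alpha r_ij) - B sqrt(sum_j exp(-beta r_ij)) ].

    Assumed illustrative form; no cutoff and no periodic boundaries.
    """
    pos = np.asarray(positions, dtype=float)
    diff = pos[:, None, :] - pos[None, :, :]
    r = np.linalg.norm(diff, axis=-1)
    not_self = ~np.eye(len(pos), dtype=bool)        # exclude i == j terms
    repulsion = np.where(not_self, A * np.exp(-alpha * r), 0.0).sum(axis=1)
    rho = np.where(not_self, np.exp(-beta * r), 0.0).sum(axis=1)
    return float(np.sum(repulsion - B * np.sqrt(rho)))

# Hypothetical parameters and a two-atom test geometry.
dimer = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
e_dimer = second_moment_energy(dimer, A=1.0, alpha=2.0, B=1.5, beta=1.0)
print(e_dimer)
```

The many-body character enters only through the square root: doubling the number of equivalent neighbors doubles the repulsive sum but increases the bonding term by a factor of only the square root of two.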
Empirical Bond-Order Model

Similar to the Finnis-Sinclair potential, the empirical bond-order approach mimics quantum mechanical atomic bonding based on the local chemical environment of each atom. It uses an analytic interatomic bond-order function $b_{ij}$ that modulates a two-center interaction $V(r_{ij})$ to model the local electronic bond energy

$$E_i^{el} = \sum_j b_{ij} V(r_{ij}) \qquad [25]$$
where the sum is over nearest neighbors j of atom i. The function $V(r_{ij})$, which represents bonding from valence electrons, is assumed to be transferable between different atomic configurations, with all many-body, quantum mechanical effects such as changes in the local density of states with varying local bonding topologies included in the bond-order function. As with the Finnis-Sinclair potential, this local orbital energy is balanced with a repulsive pair potential of exponential functions. Also assuming an exponential for $V(r_{ij})$ yields the expression

$$E_{coh} = \sum_i \sum_j \big[ A e^{-\alpha r_{ij}} - b_{ij} B e^{-\beta r_{ij}} \big] \qquad [26]$$

for the cohesive energy of a system of atoms. The underlying formalism leading to Eqs. [25] and [26] was originally derived by Abell,43 who was concerned with the question of how materials with vastly different structural characteristics (covalent solids vs. metals, for example) could all be described by a universal binding energy relation, as shown by Rose, Smith, and Ferrante.44 Analytic functional forms based on this work subsequently developed by Tersoff are suitable for detailed simulation of group IV elements.45 This formalism has also been used to model large-scale reactivity in molecular solids such as occurs during a detonation,46 as well as a wide range of different reactions of hydrocarbon molecules and diamond surfaces.47-50

Rather than present the original derivation, much of which is similar to the discussion above, Eq. [25] can be directly obtained from the Finnis-Sinclair bond energy Eq. [23], with the proportionality constant written as $-B$, by the following:

$$E_i^{el} = -B \Big( \sum_k e^{-\beta r_{ik}} \Big)^{1/2} = -B\, \frac{\sum_j e^{-\beta r_{ij}}}{\big( \sum_k e^{-\beta r_{ik}} \big)^{1/2}} = \sum_j \big( -B e^{-\beta r_{ij}/2} \big) \Big[ 1 + \sum_{k \neq i,j} e^{-\beta (r_{ik} - r_{ij})} \Big]^{-1/2} \qquad [27]$$
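With this derivation the analytic bond order $b_{ij}$ in Eq. [25] is

$$b_{ij} = \Big[ 1 + \sum_{k \neq i,j} e^{-\beta (r_{ik} - r_{ij})} \Big]^{-1/2} \qquad [28]$$

and the two-center pair potential is

$$V(r_{ij}) = -B e^{-\beta r_{ij}/2} \qquad [29]$$

which is an exponential whose decay constant is half that of the exponentials in Eq. [23].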
This direct relation between the second-moment approximation and the bond-order expression is not obvious from Abell’s original derivation.43 Rather than neglecting higher moments of a local density of states, Abell arrived at his expression using a Bethe lattice. This is a hypothetical infinite lattice resembling a “tree” in which the end of each “branch” splits into additional branches (Figure 5). While this lattice has a well-defined coordination determined by the number of splits at the end of each “branch,” there are no loops involving three or more different atoms through which hops can be made that start and end at the same atom. Therefore on this lattice the second moment defines the density of states. For the case of an ideal lattice with only nearest-neighbor interactions, all bond lengths are identical and the bond-order function reduces to the inverse square root of coordination z. The cohesive energy can then be written as

$$E_{coh} = N z \big[ A e^{-\alpha r} - z^{-1/2} B e^{-\beta r} \big] \qquad [30]$$

where N is the number of atoms and r is the common bond length.
Hence for uniform expansion of an ideal lattice, this expression reduces to an
Figure 5 Section of an infinite Bethe lattice with coordination of 3.
coordination increases, the bond order decreases. This results in decreased bond energies and increased bond distances (Figure 6). This general trend between bond properties and coordination is followed by most materials, a fact that can be verified by comparing bond distances in homonuclear diatomics with nearest-neighbor distances in their respective solids. Furthermore, with exponentials used for the two-center terms, bond lengths are proportional to the log of the bond order. This is the Pauling bond order-bond length relationship. Although the expression above is formally identical to the Finnis-Sinclair potential expression, its form helps to clarify some profound implications for understanding chemical bonding. The more bonds that are formed, the more terms are in the bonding sum involving the attractive pair potential; this tends
Figure 6 Left: the two-center terms in the bond-order potential for different local coordinations, and right: their influence on the effective pair potential. z is the local coordination. (Vertical axis: energy; horizontal axis: bond distance; labeled curve: repulsive pair term.)
to increase stability. At the same time, however, as the coordination increases the bond order decreases, resulting in a decrease in bond strength. The most stable coordination (and hence structure) for a particular material is therefore a balance between these two effects. In a close-packed metal, for example, increasing the number of bonds wins over weakening of each individual bond, and the structure attempts to maximize coordination. In molecular solids, on the other hand, the weakening of bonds dominates over increasing the number of neighbors, and relatively small atomic coordinations result. For covalent solids the two effects roughly balance, and intermediate coordinations occur. Hence with a single, well-derived expression, one can attempt to understand in simple terms the wide diversity in the structure of materials.

In developing his analytic potential energy function for group IV materials, Tersoff introduced three important features into Abell's original bond-order formalism.45 First, rather than try to fit minimum energy structures for different materials, he used this formalism to describe different structures of single group IV elements. Using data from density functional calculations, Tersoff showed that a single set of exponential functions for two-center attractive and repulsive terms can provide a quantitative fit to bond lengths and energies for silicon for coordinations ranging from 1 for the diatomic to 12 for the close-packed, face-centered-cubic structure. The transferability of pair interactions can be understood through the Harris functional as described above.

The second innovation by Tersoff was to introduce a functional form different from that given above for the bond order; his form incorporates angular interactions while still maintaining coordination as the dominant feature determining structure. With this functional form, Tersoff was able not only to model stabilization of the diamond lattice against shear but also to obtain a reasonable fit to elastic constants and phonon frequencies for silicon, germanium, and diamond with just a few parameters.

Tersoff’s third innovation in developing the bond-order formalism is a subtle but important one. The diamond lattice (which is the lowest energy structure for the group IV elements Si, C, and Ge) has all tetrahedral angles. More traditional analytic potential energy functions that include angular interactions use functions for which the tetrahedral angle is the minimum energy value.51 Tersoff, on the other hand, did not explicitly do this, but instead used his fit to the various structures to establish a bond-angle minimum that is independent of structure. This feature, together with a functional form motivated by quantum mechanical bonding, provides Tersoff’s analytic potential with a high degree of transferability.

Brenner, who analyzed the bond-order formalism in terms of the behavior of potential barriers for chemical reactions, concluded that the formalism satisfies the correct trends relating barrier position and height relative to reaction exothermicities.52,53 Hence this provides a powerful conceptual bridge between few-body chemical dynamics and many-body materials structure. This approach has been applied to the development of an analytic bond-order function that describes bond energies and lengths of both molecular hydrocarbons and solid state carbon.54 In the most recent form of this potential function, the bond-order function is divided into contributions from σ and π bonding, with the relative contribution from each depending on the local chemical environment.55 This potential is able to accurately describe a wide range of properties of carbon-based materials, including energies, bond lengths, and vibrational properties of hydrocarbons, bulk elastic properties of diamond and single graphene sheets, and various solid surface and defect structures.
Being able to model both molecular and solid state bonding on an equal footing has opened up a number of important materials-related processes to larger-scale computer simulation. These include chemical vapor deposition of diamond,47 tribochemistry and wear,48 reactive chemical sputtering,49 and nanometer-scale chemical patterning of surfaces.50
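The equivalence between the second-moment bond energy and a bond-order sum can be verified numerically for a single atom. The sketch below assumes the square-root form with exponential neighbor weights and a proportionality constant written as -B; the bond order and pair term follow from factoring that expression, and all function names and parameter values are illustrative.

```python
import numpy as np

def second_moment_bond_energy(r_neighbors, B, beta):
    """Bond energy of one atom: -B * sqrt(sum_j exp(-beta r_ij)) (assumed form)."""
    r = np.asarray(r_neighbors, dtype=float)
    return float(-B * np.sqrt(np.sum(np.exp(-beta * r))))

def bond_order(r_neighbors, j, beta):
    """b_ij = [1 + sum_{k != j} exp(-beta (r_ik - r_ij))]^(-1/2)."""
    r = np.asarray(r_neighbors, dtype=float)
    others = np.delete(r, j)
    return float((1.0 + np.sum(np.exp(-beta * (others - r[j])))) ** -0.5)

def bond_order_energy(r_neighbors, B, beta):
    """Same bond energy written as sum_j b_ij V(r_ij), V(r) = -B exp(-beta r / 2)."""
    r = np.asarray(r_neighbors, dtype=float)
    return float(sum(bond_order(r, j, beta) * (-B * np.exp(-0.5 * beta * rij))
                     for j, rij in enumerate(r)))

distances = [1.0, 1.2, 1.5]             # hypothetical neighbor distances
e_moment = second_moment_bond_energy(distances, B=2.0, beta=1.3)
e_bond_order = bond_order_energy(distances, B=2.0, beta=1.3)
b_ideal = bond_order([1.0, 1.0, 1.0, 1.0], 0, beta=1.3)   # four equal bonds
print(e_moment, e_bond_order, b_ideal)
```

For equal bond lengths the bond order depends only on the number of neighbors and equals the inverse square root of the coordination, which is the limit discussed in the text.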
EFFECTIVE MEDIUM THEORY

Effective medium theory was originally introduced in the early 1980s to describe chemisorption of gas atoms on metal surfaces.56 It has since been developed as a relatively efficient method for describing bonding in solids, particularly metals, and therefore has found considerable use in materials modeling.57 It also forms the quantum mechanical basis for the more empirical and widely used embedded-atom method discussed below.12 Specific implementations of effective medium theory for materials simulation have been developed by Norskov, Jacobsen, and co-workers57 and by DePristo and co-workers.58
The following discusses the former implementation; the reader is referred to other sources for a discussion of the latter.58

The basic concept of effective medium theory is to replace the relatively complicated environment of an atom in a solid with that of a simplified host. The electronic energy of the true solid is then systematically constructed from accurate energy calculations on the simplified (or effective) medium. In the usual implementation of effective medium theory, the simplified host is a homogeneous electron gas with a compensating positive background. Called a jellium, this construction models the electron density and its role in bonding reasonably well for most metals. This is because screening in metals tends to be so efficient that the local electron density plays a larger role in determining atomic binding energies than does any specific atomic environment.

Because of the change in electrostatic potential, an atom embedded in a jellium alters the initially homogeneous electron density. This difference in electron density with and without the embedded atom can be calculated within the local density approximation of density functional theory20 and expressed as a spherical function $\Delta\rho(r)$ about the atom

$$\Delta\rho(r) = \rho_{atom/jellium}(r) - \rho_{jellium} \qquad [31]$$
where $\rho_{jellium}$ and $\rho_{atom/jellium}$ are the densities of the jellium before and after the atom is introduced, respectively. This function depends on both the identity of the atom embedded in the jellium and the initial homogeneous electron density of the jellium. In effective medium theory, the overall electron density of a solid, $\rho(\mathbf{r})$, is constructed as a superposition of the perturbed electron densities57

$$\rho(\mathbf{r}) = \sum_i \Delta\rho_i(\mathbf{r}) \qquad [32]$$
where the sum is over atoms i. With this ansatz, however, the $\Delta\rho(r)$'s are not specified until the embedding electron density associated with each atom is known. For perfect solids this can be taken as the average electron density in the region surrounding each atom. For solids containing defects, however, this is impractical, so the embedding electron density at each atomic site must be estimated by some other method. Norskov, Jacobsen, and co-workers have suggested using spheres centered at each site with radii chosen so that the electronic charge within each sphere cancels the charge of each atomic nucleus. For most metals, Jacobsen has shown that to a good approximation the average embedding density is related exponentially to the sphere radius,57 and Puska has tabulated these relationships for 30 elements.59 Within the assumptions of effective medium theory, the average embedding electron density for an imperfect crystal is constructed from the total electron density, and the total electron density in turn depends on these local densities through the $\Delta\rho(r)$'s. Hence a self-consistent problem is at hand. Because the $\Delta\rho(r)$'s need be calculated only
once for different homogeneous electron densities, however, this method can be applied to much larger systems than full density functional calculations without any input required from experiment. It has been shown using the variational principle of density functional theory that the binding energy $E_B$ of a collection of atoms with the assumption Eq. [32] above is given by the expression

$$E_B = \sum_i E_i(\bar{\rho}_i) + \sum_i \sum_j E_{ov} + E_{1e} \qquad [33]$$

where $\bar{\rho}_i$ is the average electron density for atomic site i, $E_i(\bar{\rho}_i)$ is the energy of an atom embedded in jellium with density $\bar{\rho}_i$, $E_{ov}$ is the electrostatic repulsion between overlapping neutral spheres (which is summed over atoms i and j), and $E_{1e}$ is a one-electron energy not accounted for by the spherical approximations.60 For most close-packed metals, the overlap and one-electron terms are relatively small, and the first term in Eq. [33] dominates the energy. Puska has calculated these energies within the local density approximation for 30 elements and tabulated the results as a function of electron density within the jellium.59 Reasonable estimates for shear constants, however, require that the overlap term not be neglected,61 and for non-close-packed systems the one-electron term becomes important. How these terms can be included has been reviewed by Jacobsen.57

Effective medium theory has been used to estimate a wide variety of solid state energies and structures. Examples include relaxations and reconstructions of clean and adsorbate-covered metal surfaces, bulk and surface phonons, interstitial defects, and grain boundaries.57 Furthermore, because energies are relatively well defined in this approach, it can be used to understand the underlying driving forces leading to defect structures and energies.57
EMBEDDED-ATOM METHOD

The embedded-atom method was introduced by Daw and Baskes as a way of modeling hydrogen embrittlement of metals.12 These authors recognized that although simple pair terms are sufficient to capture, for example, much of the physics of helium in metals, the chemical effects introduced by atomic hydrogen require a more sophisticated treatment. The embedded-atom method gives the binding energy $E_B$ of a collection of atoms as

$$E_B = \sum_i F(\rho_i) + \sum_i \sum_{j > i} U(r_{ij}) \qquad [34]$$
where $\rho_i$ is the total electron density at site i, F is called the embedding function, and $U(r_{ij})$ is a pair-additive interaction that depends on the scalar distance $r_{ij}$ between atoms i and j. This expression is essentially an empirical implementation of effective medium theory for metals. The first term in Eq. [34] corresponds to the energy of the atoms embedded in jellium, the second term represents overlap of neutral spheres, and the one-electron term of Eq. [33] is ignored. In the embedded-atom method, however, the average electron density surrounding an atom within a neutral sphere used in effective medium theory is replaced with a sum of electron densities from neighboring atoms at the lattice site i

$$\rho_i = \sum_{j \neq i} \phi_j(r_{ij}) \qquad [35]$$
where $\phi_j(r_{ij})$ is the contribution to the electron density at site i from atom j. Furthermore, the embedding function and the pair terms are empirically fit to materials properties, and the pair-additive electron contributions $\phi_j$ are either taken from atomic electron densities or also fit as empirical functions.

The number of research groups using the embedded-atom method is large, as is the range of systems and properties being examined. This method is popular for several reasons. First, despite its apparent simplicity, it works very well for predicting energies and properties of a relatively wide range of metal structures. This is due primarily to the quantum mechanical origins of the method (as described in the preceding section) and the database used to fit parameters in the analytic potential function. Second, self-consistency is not incorporated into the method; with pair-additive electron densities, the time required to evaluate the function scales with the number of atoms in the same way as for a pair potential. Hence one can obtain relatively accurate and transferable energies and structures for large systems. A recent example of this is the simulation of crack tip blunting in copper using a multimillion-atom system.62 Third, because each term is well defined and relatively simple, the embedded-atom method can be used to qualitatively understand trends in materials properties as well as driving forces for specific defect structures and energetics. Finally, the computer code developed by Baskes, Daw,12 and Foiles39 was made widely available to interested researchers. This has led to extensive testing of the method, which has produced a broad understanding of its strengths and limitations. This in turn has given an overall confidence in the results within the scientific community. While not a strictly scientific issue, this availability has nonetheless made a major contribution to the widespread use of the embedded-atom method.
In the original implementation of the embedded-atom method, the electron densities contributed by each atom were assumed to be spherically symmetric.12 This led to generally reasonable results for close-packed metals. There have also been attempts to extend this method to other systems such as silicon without introducing one-electron terms by incorporating an angular dependence into the electron densities $\rho(r)$ contributed by the neighboring atoms. This extension to non-close-packed materials, however, has generally not produced the same degree of accuracy and transferability found for the initial implementation.

If the embedding function of the embedded-atom method is taken as the square root of the electron density, then its form is the same as the Finnis-Sinclair potential, Eq. [24]. Although the routes to each of the various expressions are very different, the final analytic function apparently retains the same underlying quantum mechanical features of bonding. Effective medium theory and the embedded-atom method also provide a different interpretation of the Xu et al.33 tight binding expression of Eq. [13]. The analytic term A in Eq. [11], which is taken as a polynomial function of a pair sum in the Xu et al. form, can be interpreted as corresponding to the energy of embedding an atom into jellium. The one-electron, tight binding orbitals are then the one-electron terms typically ignored in Eq. [33].
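The structure of an embedded-atom energy evaluation, and its reduction to a square-root form, can be sketched as follows. The function `eam_energy` and the callables passed to it are hypothetical stand-ins for fitted embedding, density, and pair functions; the loop over unique pairs is meant only to illustrate why the cost scales like that of a pair potential.

```python
import numpy as np

def eam_energy(positions, embed, density, pair):
    """Energy = sum_i embed(rho_i) + sum over unique pairs of pair(r_ij).

    rho_i is the sum of density(r_ij) over all other atoms j. Illustrative
    sketch: no cutoff, no periodic boundaries, one atom type.
    """
    pos = np.asarray(positions, dtype=float)
    r = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
    i_idx, j_idx = np.triu_indices(len(pos), k=1)    # each pair once
    r_pairs = r[i_idx, j_idx]
    rho = np.zeros(len(pos))
    np.add.at(rho, i_idx, density(r_pairs))          # contribution of j at site i
    np.add.at(rho, j_idx, density(r_pairs))          # contribution of i at site j
    return float(np.sum(embed(rho)) + np.sum(pair(r_pairs)))

# Hypothetical functions: a square-root embedding term gives a
# second-moment-like bonding energy.
triangle = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.5, np.sqrt(3.0) / 2.0, 0.0]]
e_triangle = eam_energy(
    triangle,
    embed=lambda rho: -np.sqrt(rho),
    density=lambda r: np.exp(-r),
    pair=lambda r: 0.5 * np.exp(-2.0 * r),
)
print(e_triangle)
```

Swapping in a different `embed` callable changes the many-body character of the model without touching the pair loop, which is one way to see how the same code path can represent either an empirically fit embedding function or a square-root form.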
FITTING DATABASES

This chapter has focused almost exclusively on analytic functions, so it is appropriate to end with a short discussion of a few issues related to fitting databases. Before accurate first-principles methods were available, most analytic potential functions were fit almost exclusively to experimental data, with ab initio calculations used to “fill in” important regions of the function for which no reliable experimental data were available. For a three-body chemical reaction, for example, a potential function might be fit to the bond energies, bond lengths, and vibrational constants of the three diatomic molecules, with the reaction barrier fit to an experimental activation energy. With the combination of better approximations to total energies, increasingly clever numerical algorithms, and more powerful computers, however, more use is being made of data generated from first-principles calculations. In fact, for many cases, the use of data generated strictly from first principles has important advantages over the mixing of calculated data with experimental values because there is a consistent and often well-understood source of error in the database. Furthermore, with current theoretical capabilities it is often easier to generate an appropriate select set of data from first principles than to attempt experimental measurements.

It was mentioned above that it is often (falsely) assumed that fitting many parameters to a large database guarantees a function that is transferable to a wide range of structures. Whereas an extensive database can sometimes be advantageous, experience suggests that a well-chosen set of properties, even if relatively small, is often desirable if the analytic function is carefully derived. For most solids, an obvious database would include lattice constant, cohesive energy, and bulk modulus. For a pair potential like a Morse function, these observables would be sufficient to determine the three parameters entering the function. With a many-body potential, however, there are usually additional parameters to be fit.

One of the reasons for the initial success of the embedded-atom method was the choice of solid state properties used to fit the function. In addition to the three listed above, the original parameterization for close-packed metals included elastic constants and the vacancy formation energy. The latter property reflects undercoordinated structures compared to the bulk, while the elastic constants help yield reasonable relaxation of the bulk material around defects. Both of these contribute to the considerable predictive power of the embedded-atom method in reproducing surface structures and energies because surfaces involve both undercoordinated atoms and relaxation. In later embedded-atom method parameterizations, Foiles, Daw, and Baskes added the universal bonding energy relation to the fitting database.39 This helps to assure accurate thermal expansion behavior. It likely also contributes to the ability of embedded-atom method potentials for close-packed metals to reproduce relatively accurate melting points.

Analytic potential energy functions for atomistic simulation are sometimes mistakenly viewed as ad hoc mathematical expressions whose development is more art than science. Although there is perhaps some level of art in developing these expressions, the most successful potential functions are almost always based on fundamental quantum mechanics, as we have attempted to demonstrate. With the growing interest in simulating relatively large systems and long-time dynamics, development of relatively accurate and transferable analytic potential expressions will undoubtedly continue to play a central role in materials theory.
ACKNOWLEDGMENTS

B. J. Garrison, J. A. Harrison, J. W. Mintmire, S. A. Sinnott, and C. T. White are thanked for helpful discussions leading to some of the ideas presented here. Financial support from the Office of Naval Research, the National Science Foundation, and NASA is gratefully acknowledged.
REFERENCES

1. J. Cioslowski, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1993, Vol. 4, pp. 1-33. Ab Initio Calculations on Large Molecules: Methodology and Applications.
2. D. Feller and E. R. Davidson, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1990, Vol. 1, pp. 1-43. Basis Sets for Ab Initio Molecular Orbital Calculations and Intermolecular Interactions.
3. F. H. Stillinger and T. A. Weber, Phys. Rev. B, 31, 5262 (1985). Computer Simulation of Local Order in Condensed Phases of Silicon.
4. L. J. Bartolotti and K. Flurchick, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1995, Vol. 7, pp. 187-216. An Introduction to Density Functional Theory.
5. A. St-Amant, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1995, Vol. 7, pp. 217-259. Density Functional Methods in Biomolecular Modeling.
6. J. Hirschfelder, H. Eyring, and B. Topley, J. Chem. Phys., 4, 170 (1936). Reactions Involving Hydrogen Molecules and Atoms.
7. A. Rahman, Phys. Rev., 136A, 405 (1964). Correlations in the Motion of Liquid Argon.
8. See, for example, W. H. Miller, Ed., Dynamics of Molecular Collisions, Plenum Press, New York, 1976.
9. See, for example, L. M. Balbes, S. W. Mascarella, and D. B. Boyd, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1994, Vol. 5, pp. 337-379. Perspectives of Modern Methods in Computer-Aided Drug Design.
10. See, for example, J. P. Bowen and N. L. Allinger, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1991, Vol. 2, pp. 81-97. Molecular Mechanics: The Art and Science of Parameterization.
11. J. B. Gibson, A. N. Goland, M. Milgram, and G. H. Vineyard, Phys. Rev., 120, 1229 (1960). Dynamics of Radiation Damage.
12. M. S. Daw and M. I. Baskes, Phys. Rev. Lett., 50, 285 (1983). Semiempirical Quantum Mechanical Calculation of Hydrogen Embrittlement in Metals.
13. M. W. Finnis and J. E. Sinclair, Philos. Mag., A50, 45 (1984). A Simple Empirical N-Body Potential for Transition Metals.
14. J. Tersoff, Phys. Rev. Lett., 56, 632 (1986). New Empirical Model for the Structural Properties of Silicon.
15. R. Car and M. Parrinello, Phys. Rev. Lett., 55, 2471 (1985). Unified Approach for Molecular Dynamics and Density-Functional Theory.
16. O. Shenderova and D. W. Brenner, Mater. Res. Soc. Symp. Proc., 442, 693 (1997). Coexistence of Two Carbon Phases at Grain Boundaries in Polycrystalline Diamond.
17. See, for example, R. J. Bartlett and J. F. Stanton, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1994, Vol. 5, pp. 65-169. Applications of Post-Hartree-Fock Methods: A Tutorial.
18. P. Hohenberg and W. Kohn, Phys. Rev., 136, B864 (1964). Inhomogeneous Electron Gas.
19. W. Kohn and L. J. Sham, Phys. Rev. A, 140, 1133 (1965). Self-Consistent Equations Including Exchange and Correlation Effects.
20. D. M. Ceperley and B. J. Alder, Phys. Rev. Lett., 45, 566 (1980). Ground State of an Electron Gas by a Stochastic Method.
21. J. P. Perdew and A. Zunger, Phys. Rev. B, 23, 5048 (1981). Self-Interaction Correction to Density-Functional Approximations for Many-Electron Systems.
22. J. Harris, Phys. Rev. B, 31, 1770 (1985). Simplified Method for Calculating the Energy of Weakly Interacting Fragments.
23. W. M. C. Foulkes and R. Haydock, Phys. Rev. B, 39, 12520 (1989). Tight-Binding Models and Density-Functional Theory.
24. A. J. Read and R. J. Needs, J. Phys. Condens. Matter, 1, 7565 (1989). Test of the Harris Energy Functional.
25. H. M. Polatoglu and M. Methfessel, Phys. Rev. B, 37, 10403 (1988). Cohesive Properties of Solids Calculated with the Simplified Total-Energy Functional of Harris.
26. M. W. Finnis, J. Phys. Condens. Matter, 2, 331 (1990). The Harris Functional Applied to Surface and Vacancy Formation Energies in Aluminium.
27. I. J. Robertson, M. C. Payne, and V. Heine, J. Phys. Condens. Matter, 3, 8351 (1991). Self-Consistency in Total Energy Calculations: Implications for Empirical and Semi-Empirical Schemes.
28. See, for example, D. A. Drabold, P. A. Fedders, and P. Stumm, Phys. Rev. B, 49, 16415 (1994), and references therein. Theory of Diamond-like Amorphous Carbon.
29. J. C. Slater and G. F. Koster, Phys. Rev., 94, 1498 (1954). Simplified LCAO Method for the Periodic Potential Problem.
30. See, for example, X. P. Li, R. W. Nunes, and D. Vanderbilt, Phys. Rev. B, 47, 10891 (1993). Density-Matrix Electronic-Structure Method with Linear System-Size Scaling.
31. D. J. Chadi, Phys. Rev. B, 19, 2074 (1979). (110) Surface Atomic Structures of Covalent and Ionic Semiconductors.
32. I. Kwon, R. Biswas, C. Z. Wang, K. M. Ho, and C. M. Soukoulis, Phys. Rev. B, 49, 7242 (1994). Transferable Tight-Binding Model for Silicon.
33. C. H. Xu, C. Z. Wang, C. T. Chan, and K. M. Ho, J. Phys. Condens. Matter, 4, 6047 (1992). A Transferable Tight-Binding Potential for Carbon.
34. See, for example, A. P. Sutton, P. D. Goodwin, and A. P. Horsfield, Mater. Res. Soc. Bull., 21, 42 (1996). Tight-Binding Theory and Computational Materials Synthesis.
35. M. Menon and K. R. Subbaswamy, Phys. Rev. B, 47, 12754 (1993). Nonorthogonal Tight-Binding Molecular-Dynamics Study of Silicon Clusters.
36. T. Frauenheim, F. Weich, T. Köhler, S. Uhlmann, D. Porezag, and G. Seifert, Phys. Rev. B, 52, 11492 (1995). Density-Functional-Based Construction of Transferable Nonorthogonal Tight-Binding Potentials for Si and SiH.
37. F. Cyrot-Lackmann, J. Phys. Chem. Solids, 29, 1235 (1968). Sur le calcul de la cohésion et de la tension superficielle des métaux de transition par une méthode de liaisons fortes.
38. A. P. Sutton, Electronic Structure of Materials, Clarendon Press, Oxford, 1993.
39. S. M. Foiles, Mater. Res. Soc. Bull., 21, 24 (1996), and references therein. Embedded-Atom and Related Methods for Modeling Metallic Systems.
40. F. Ercolessi, M. Parrinello, and E. Tosatti, Philos. Mag. A, 58, 213 (1988). Simulation of Gold in the Glue Model.
41. A. E. Carlsson, Phys. Rev. B, 44, 6590 (1991). Angular Forces in Group-VI Transition Metals: Application to W(100).
42. See, for example, S. M. Foiles, Phys. Rev. B, 48, 4287 (1993). Interatomic Interactions for Mo and W Based on the Low-Order Moments of the Density of States.
43. G. C. Abell, Phys. Rev. B, 31, 6184 (1985). Empirical Chemical Pseudopotential Theory of Molecular and Metallic Bonding.
44. J. H. Rose, J. R. Smith, and J. Ferrante, Phys. Rev. B, 28, 1935 (1983). Universal Features of Bonding in Metals.
45. J. Tersoff, Phys. Rev. B, 39, 5566 (1989). Modeling Solid-State Chemistry: Interatomic Potentials for Multicomponent Systems.
46. D. W. Brenner, D. H. Robertson, M. L. Elert, and C. T. White, Phys. Rev. Lett., 70, 2174 (1993). Detonations at Nanometer Resolution Using Molecular Dynamics.
47. B. J. Garrison, E. J. Dawnkaski, D. Srivastava, and D. W. Brenner, Science, 255, 835 (1992). Molecular-Dynamics Simulations of Dimer Opening on a Diamond {001}(2 × 1) Surface.
48. J. A. Harrison and D. W. Brenner, J. Am. Chem. Soc., 116, 10399 (1995). Simulated Tribochemistry: An Atomic-Scale View of the Wear of Diamond.
49. R. S. Taylor and B. J. Garrison, Int. J. Mass Spectrom., 143, 225 (1995). A Microscopic View of Particle Bombardment of Organic Films.
50. S. B. Sinnott, R. J. Colton, C. T. White, and D. W. Brenner, Surf. Sci., 316, L1055 (1994). Surface Patterning with Atomically-Controlled Chemical Forces: Molecular Dynamics Simulations.
51. See, for example, U. Burkert and N. L. Allinger, Molecular Mechanics, ACS Monograph 177, American Chemical Society, Washington, DC, 1982.
52. D. W. Brenner, Mater. Res. Soc. Bull., 21, 36 (1996). Chemical Dynamics and Bond-Order Potentials.
53. D. W. Brenner, Mater. Res. Soc. Symp. Proc., 141, 59 (1989). Tersoff-type Potentials for Carbon, Hydrogen and Oxygen.
54. D. W. Brenner, Phys. Rev. B, 42, 9458 (1990). Empirical Potential for Hydrocarbons for Use in Simulating the Chemical Vapor Deposition of Diamond Films.
55. D. W. Brenner, O. Shenderova, J. A. Harrison, and S. B. Sinnott, unpublished results, 1997.
56. J. K. Norskov and N. D. Lang, Phys. Rev. B, 21, 2131 (1980). Effective-Medium Theory of Chemical Bonding: Application to Chemisorption.
57. K. W. Jacobsen, Comments Condens. Matter Phys., 14, 129 (1988), and references therein. Bonding in Metallic Systems: An Effective Medium Approach.
58. T. J. Raecker and A. E. DePristo, Int. Rev. Phys. Chem., 93, (1991). Theory of Chemical Bonding Based on the Homogeneous Electron Gas System.
59. M. J. Puska, in Many-Body Interactions in Solids, R. M. Nieminen, M. J. Puska, and M. J. Manninen, Eds., Springer Proceedings in Physics, Springer-Verlag, New York, 1990, Vol. 48, pp. 134ff.
60. K. W. Jacobsen, J. K. Norskov, and M. J. Puska, Phys. Rev. B, 35, 7423 (1987). Interatomic Interactions in the Effective-Medium Theory.
61. A. P. Sutton and R. W. Balluffi, Interfaces in Crystalline Materials, Clarendon Press, Oxford, 1995, pp. 150-239. Models of Interatomic Forces at Interfaces.
62. S. J. Zhou, D. M. Beazley, P. S. Lomdahl, and B. L. Holian, Phys. Rev. Lett., 78, 479 (1997). Large-Scale Molecular Dynamics Simulations of Three-Dimensional Ductile Failure.
63. See, for example, M. I. Baskes, J. S. Nelson, and A. F. Wright, Phys. Rev. B, 40, 6085 (1989). Semiempirical Modified Embedded-Atom Potentials for Silicon and Germanium.
CHAPTER 5
Quantum Mechanical Methods for Predicting Nonlinear Optical Properties

Henry A. Kurtz* and Douglas S. Dudis†

*Department of Chemistry, University of Memphis, Memphis, Tennessee 38152, and †Materials Laboratory/Polymer Branch, Wright Laboratory (WL/MLBP), Wright Patterson Air Force Base (WPAFB), Ohio 45433
INTRODUCTION

In this tutorial on the basic ideas and modern methods of computational chemistry used for the prediction of nonlinear optical properties, the focus is on the most common computational techniques applicable to molecules. The chapter is not meant to be an exhaustive review of nonlinear optical theories, nor is it a compendium of results. Although much is omitted from this chapter, there exist several earlier reviews on the general subject of nonlinear optics that help form a broad foundation for this work.1-7 The material in this chapter should be of value to readers who want to learn enough about computational nonlinear optical methods to discern the differences between high and low quality results and the limitations of modern methodologies, and to readers who would like to join the effort to improve the calculations.

The initial part of this chapter discusses examples of current applications requiring a knowledge of nonlinear optical (NLO) properties, followed by a section defining the molecular properties of interest. Subsequent sections introduce the methods used to calculate NLO properties, along with several practical considerations in NLO property calculations. A short presentation on the extensions to the basic electronic properties of gas phase systems follows, to highlight related areas of active research.
NONLINEAR OPTICAL PROPERTY BASICS

A brief discussion of the definitions of the quantities of interest must precede our presentation of computational methods. More detailed developments can be found in many books such as the one by R. W. Boyd.8 The behavior of light is governed by Maxwell's equations and, after some reworking, the following important wave equation can be obtained:

$$\nabla \times \nabla \times \mathbf{E} + \frac{1}{c^2}\frac{\partial^2 \mathbf{E}}{\partial t^2} = -\frac{4\pi}{c^2}\frac{\partial^2 \mathbf{P}}{\partial t^2} \qquad [1]$$
This equation relates the propagation of the electric field $\mathbf{E}$ to the polarization of the medium, $\mathbf{P}$. In this equation both $\mathbf{E}$ and $\mathbf{P}$ are vector quantities and the operation $\nabla \times$ is the curl described in vector analysis.9 The constant $c$ is the speed of light. The polarization of bulk materials is a function of the field $E$ and can be expressed in a power series of the electric field as:

$$P = \chi^{(1)}E + \chi^{(2)}E^2 + \chi^{(3)}E^3 + \cdots \qquad [2]$$

where the coefficients $\chi^{(n)}$ in each order are referred to as the nth-order susceptibilities and are nth-rank tensors.9 It is the inclusion of the terms beyond the first in the description of the polarization that gives rise to the name nonlinear optics.

To see how nonlinear optical phenomena of different types arise, it is illustrative to work through a simple example. Assuming that the external electric field is from a monochromatic light source, represented as $E = E_0\cos(\omega t)$, substituting this into Eq. [2] truncated after the third-order term, and using trigonometric identities to expand the $[\cos(\omega t)]^n$ terms, we obtain the following expression:

$$P = \left[\chi^{(1)}E_0 + \tfrac{3}{4}\chi^{(3)}E_0^3\right]\cos(\omega t) + \tfrac{1}{2}\chi^{(2)}E_0^2\cos(2\omega t) + \tfrac{1}{4}\chi^{(3)}E_0^3\cos(3\omega t) + \tfrac{1}{2}\chi^{(2)}E_0^2 \qquad [3]$$
From this equation it can be seen that the polarization, $P$, has terms proportional to $\cos(\omega t)$, $\cos(2\omega t)$, and $\cos(3\omega t)$; the terms oscillate at a given frequency, causing the transmitted $E$ to oscillate at those same frequencies. The latter two terms of Eq. [3] give rise to second and third harmonic generation, respectively, and are proportional to $\chi^{(2)}$ and $\chi^{(3)}$. The response at $\cos(\omega t)$ has two contributions. One is related to $\chi^{(1)}$ and corresponds to the usual linear polarizability, and the other is a $\chi^{(3)}$ term. The $\chi^{(3)}$ term describes a change in the response at $\omega$ that is dependent on the intensity of the electric field $E$, and results in an intensity-dependent refractive index. Note also that the fourth term in Eq. [3] has no frequency dependence. This is a term that is related to loss of the input light. These common examples of different effects, and a few others, are discussed again later.

The bulk susceptibilities from Eq. [2] are to be related to molecular response properties denoted as $\alpha$, $\beta$, and $\gamma$. These molecular properties can be defined from the Taylor series of the response of the molecular dipole moment as follows:

$$\boldsymbol{\mu}(\mathbf{F}) = \boldsymbol{\mu}^0 + \boldsymbol{\alpha}\cdot\mathbf{F} + \frac{1}{2!}\boldsymbol{\beta}:\mathbf{F}\mathbf{F} + \frac{1}{3!}\boldsymbol{\gamma}\,\vdots\,\mathbf{F}\mathbf{F}\mathbf{F} + \cdots$$

$$\mu_i(\mathbf{F}) = \mu_i^0 + \sum_j \alpha_{ij}F_j + \frac{1}{2!}\sum_{jk}\beta_{ijk}F_jF_k + \frac{1}{3!}\sum_{jkl}\gamma_{ijkl}F_jF_kF_l + \cdots \qquad [4]$$
where the second form shows the explicit contributions in terms of the components of the electric field $F_i$ in direction $i$. Unless otherwise noted, from here on $F$ is used to symbolize the electric field, to avoid confusion with the energy $E$. To avoid potential confusion over nomenclature about orders, the molecular properties are referred to simply as polarizability ($\alpha$), first hyperpolarizability ($\beta$), and second hyperpolarizability ($\gamma$).

As mentioned earlier, the equations defining the NLO properties are usually complicated by the fact that the electric fields of interest are time-dependent (i.e., from lasers). This means that different combinations of types or frequencies of fields give rise to multiple molecular and bulk properties at each order, as illustrated in Eq. [3]. To differentiate these effects, the molecular quantities need to be written to reflect the nature of the experimental setup. This is done as $\alpha(-\omega;\omega)$, $\beta(-\omega_\sigma;\omega_1,\omega_2)$, and $\gamma(-\omega_\sigma;\omega_1,\omega_2,\omega_3)$, where the quantities to the right of the semicolon are the input radiation(s) and the quantity to the left is the resulting (outgoing) radiation. $\omega_\sigma$ is simply the sum of the frequencies on the right side of the semicolon (conservation of energy). Definitions of the most commonly computed quantities are shown in Table 1.
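The harmonic content described for a monochromatic field can be checked numerically. The following is a minimal sketch, assuming scalar susceptibilities and the truncated power series $P = \chi^{(1)}E + \chi^{(2)}E^2 + \chi^{(3)}E^3$ with $E = E_0\cos(\omega t)$; it projects $P(t)$ onto a constant and onto $\cos(k\omega t)$ for k = 1 to 3. The function name and the sample values are hypothetical.

```python
import math


def harmonic_amplitudes(chi1, chi2, chi3, e0, n_samples=64):
    """Return [DC, cos(wt), cos(2wt), cos(3wt)] amplitudes of P(t).

    P(t) is sampled uniformly over one optical period (w = 1) and projected
    onto each cosine; the discrete sums are exact for this polynomial
    response as long as n_samples exceeds twice the highest harmonic.
    """
    amps = [0.0, 0.0, 0.0, 0.0]
    for s in range(n_samples):
        t = 2.0 * math.pi * s / n_samples
        e = e0 * math.cos(t)
        p = chi1 * e + chi2 * e**2 + chi3 * e**3
        amps[0] += p / n_samples
        for k in (1, 2, 3):
            amps[k] += 2.0 * p * math.cos(k * t) / n_samples
    return amps


# Hypothetical susceptibilities and field amplitude (arbitrary units).
dc, first, second, third = harmonic_amplitudes(1.0, 0.2, 0.04, 2.0)
```

The trigonometric identities give a DC term and a $2\omega$ term of $\chi^{(2)}E_0^2/2$ each, a $3\omega$ term of $\chi^{(3)}E_0^3/4$, and a response at $\omega$ of $\chi^{(1)}E_0 + 3\chi^{(3)}E_0^3/4$. The numerical projection reproduces these values, including the intensity-dependent $\chi^{(3)}$ contribution at the fundamental.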
Table 1. Common NLO Properties

| Property | Name | Symbol |
|---|---|---|
| $\beta(-2\omega;\omega,\omega)$ | Second harmonic generation (SHG) | $\beta^{\mathrm{SHG}}$ |
| $\beta(-\omega;\omega,0)$ | Electrooptic Pockels effect | $\beta^{\mathrm{EOPE}}$ |
| $\beta(0;\omega,-\omega)$ | Optical rectification | $\beta^{\mathrm{OR}}$ |
| $\gamma(-3\omega;\omega,\omega,\omega)$ | Third harmonic generation | $\gamma^{\mathrm{THG}}$ |
| $\gamma(-2\omega;\omega,\omega,0)$ | DC electric-field-induced SHG | $\gamma^{\mathrm{DC\text{-}SHG}}$ or $\gamma^{\mathrm{EFISH}}$ |
| $\gamma(-\omega;\omega,\omega,-\omega)$ | Intensity-dependent refractive index | $\gamma^{\mathrm{IDRI}}$ or $\gamma^{\mathrm{DFWM}}$ |
| $\gamma(-\omega;\omega,0,0)$ | Optical Kerr effect | $\gamma^{\mathrm{OKE}}$ |
| $\gamma(-(2\omega_1-\omega_2);\omega_1,\omega_1,-\omega_2)$ | Coherent anti-Stokes Raman spectroscopy | $\gamma^{\mathrm{CARS}}$ |

EXAMPLES OF APPLICATIONS OF NONLINEAR OPTICS

The computational methods reviewed herein are applicable to organic, organometallic, and other molecular compounds. Such molecular and polymeric materials are under intense study for a variety of NLO applications.10-21 Traditionally these applications have used inorganic materials, principally lithium niobate, potassium dihydrogen phosphate, and potassium titanyl phosphate. A number of applications for the molecular and polymeric materials are described below. Most of these descriptions are oversimplified, and the reader is referred to more specialized sources22,23 for further details.
Second Harmonic Generation (SHG)

Frequency doubling, or SHG, as illustrated in Figure 1, is of keen interest as a means of generating new laser sources. Frequency doubling of low power diode lasers from the red or infrared into the blue is a prime commercial application. Device outlets include higher capacity optical storage devices, as well as more sensitive laser printers and copiers. The objective is to increase not only the molecular response $\beta$, but of course the material response $\chi^{(2)}$. Because the material used may be a polymer loaded with a molecular NLO chromophore, or a chromophore incorporated into the polymeric structure, a dilution effect is inevitable. Additionally, for $\chi^{(2)}$ an alignment of the molecules is necessary, although in practice significantly less than 100% can be realized. Hence, a maximal $\beta$ is desired to yield as high a $\chi^{(2)}$ as possible.
Figure 1 Second harmonic generation. An input light source of frequency $\omega$ produces an output beam of frequency $2\omega$. This frequency doubling originates from the molecular $\beta$ and occurs only in noncentrosymmetric materials.

Electrooptic Modulation

Modulators are needed to convert electrical signals into optical amplitudes or phase modulations. Polymeric waveguides may be more compatible with integrated optics owing to processing considerations, and will be much cheaper to manufacture. Routing switches, electrooptic shutters, and amplitude modulators are important components for optical communications networks. Directional couplers, mode sorters, and Mach-Zehnder interferometers (Figure 2) are representative of the specific device mechanisms. These devices can operate at much higher bandwidths for information processing than their electronic counterparts and are generally faster than devices made with inorganic materials because the dielectric constants of the organic materials are lower.
Figure 2 Mach-Zehnder interferometer. The input light source (from the left) is split into two beams, A and B. An external voltage is applied across the path for beam B, which changes the index of refraction of the medium. This alters the speed of the light in path B such that when the two beams are recombined, a phase mismatch is introduced. The voltage thus introduces a modulation signal into the light beam.

Optical Bistability (Optical Signal Processing)

Optical bistability is a third-order effect that occurs when a material with an intensity-dependent refractive index can yield two possible states for a single input intensity. This corresponds to the optical equivalent of the transistor. Although much attention has been given to all-optical computing, it is questionable whether such a device can economically compete with ever cost-decreasing, performance-increasing modern electronic computers, at least in the foreseeable future. Optical signal processing does present other important opportunities. Advanced imaging systems are conveying ever increasing quantities of information. If the signals are converted to electrical signals, transmitted to a processing unit for analysis, and then converted to some type of visual output, the modulation, transmission, and processing steps become substantial bottlenecks. Preprocessing the optical signals allows up-front filtering of unwanted information, with obvious benefits. A major limitation is the magnitude of the third-order responses. Introduction of devices based on these materials would greatly simplify image processing.
Degenerate Four-Wave Mixing/Phase Conjugation (Imaging Enhancements)

Degenerate four-wave mixing (also known as intensity-dependent refractive index) and phase conjugation are third-order effects in which two input beams can interfere to create a spatial variation in the refractive index of the medium. This variation acts as a diffraction grating for a third (probe) input beam. The diffracted (signal) beam is the phase complex conjugate of the probe beam. In addition to signal processing, applications include image recognition and reduction or correction of beam aberrations in reconstructed images. The latter is especially interesting for reconstructing images when a beam is transmitted through the atmosphere. Aircraft tracking, target recognition, and meteorological studies are recognizable applications. Materials having suitable third-order responses would be desirable, and computational approaches are needed both to understand the structure-property relationships and to design new materials.
Frequency Upconversion Lasing

Many of the applications for higher frequency light are the same as those under SHG, but frequency upconversion techniques (Figure 3) offer some distinct advantages over SHG. These include elimination of phase-matching requirements, use of semiconductor lasers as pump sources, and viability of waveguide and fiber configurations. Both two-photon and sequential, multiphoton pumping mechanisms have been employed to achieve frequency-upconverted lasing. Recent work has demonstrated two-photon pumping using organic dyes in solid matrices. Applications include optical limiting, optical power stabilization, three-dimensional optical data storage, and upconverted fluorescence three-dimensional microscopy of materials. New dyes developed recently can efficiently upconvert infrared light into visible light. This material advance is enabling revolutionary improvements in confocal microscopy applications. The infrared pump laser can penetrate a sample laced with the NLO chromophore far better than other light sources. The chromophore emits visible light which, when coupled with computer-controlled tomographic techniques, enables imaging to micrometer depths.
Figure 3 Two-photon upconverted emission. Two photons of frequency $\omega$ are absorbed by a medium, thus populating an excited state. A higher energy photon $\omega'$ is emitted. In some circumstances lasing can be achieved.
DEFINITIONS OF MOLECULAR PROPERTIES

There are many terms in the total response of materials to electric and magnetic fields, but the polarizability ($\alpha$) and hyperpolarizabilities ($\beta$ and $\gamma$) defined above are the NLO properties that are the focus of this chapter. For more general treatments of other properties, see the reviews by Buckingham24 and Bishop.1 As illustrated in Eq. [4], the NLO molecular properties are second-, third-, and fourth-rank tensor quantities, respectively.9 In the most general case, $\alpha$ has 9 components, $\beta$ has 27 components, and $\gamma$ has 81 components; because of symmetry properties, however, many of these components are often interrelated or equal to zero.

One complication in the literature concerning these properties is the multiple conventions for their definitions. In addition to the Taylor series convention used in this work (Eq. [4]), an alternate convention (designated with a tilde) is based on a perturbation series expansion given as follows:

$$\mu_i(\mathbf{F}) = \mu_i^0 + \sum_j \tilde{\alpha}_{ij}F_j + \sum_{jk}\tilde{\beta}_{ijk}F_jF_k + \sum_{jkl}\tilde{\gamma}_{ijkl}F_jF_kF_l + \cdots \qquad [5]$$
The only difference compared to the Taylor series of Eq. [4] is that the numerical factors have been moved into the coefficients of the electric field. Properties based on this convention (with a tilde) are related to those of Eq. [4] (without a tilde) by $\beta = 2!\,\tilde{\beta}$ and $\gamma = 3!\,\tilde{\gamma}$. A third common phenomenological convention is based on expanding a frequency-dependent field, $E_0\cos(\omega t)$, as in Eq. [3] and absorbing all constants into the coefficient of $E^n$ at each order. This convention is unsatisfactory because the different frequency-dependent quantities in each order do not converge to the same limit as $\omega \to 0$. A review of these and other conventions has been given by Willetts et al.25
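Because published values may follow either convention, it helps to convert explicitly before comparing numbers. The sketch below assumes only the relations stated above, $\beta = 2!\,\tilde{\beta}$ and $\gamma = 3!\,\tilde{\gamma}$, for a single scalar component; the function names and numerical values are hypothetical.

```python
import math


def to_taylor(beta_tilde, gamma_tilde):
    """Perturbation-series (tilde) values -> Taylor-series values."""
    return math.factorial(2) * beta_tilde, math.factorial(3) * gamma_tilde


def to_perturbation(beta, gamma):
    """Taylor-series values -> perturbation-series (tilde) values."""
    return beta / math.factorial(2), gamma / math.factorial(3)


def dipole_taylor(f, mu0, alpha, beta, gamma):
    """Induced dipole along one axis, Taylor convention."""
    return mu0 + alpha * f + beta * f**2 / 2 + gamma * f**3 / 6


def dipole_perturbation(f, mu0, alpha, beta_tilde, gamma_tilde):
    """Induced dipole along one axis, perturbation-series convention."""
    return mu0 + alpha * f + beta_tilde * f**2 + gamma_tilde * f**3


# Hypothetical Taylor-convention values in atomic units.
beta_t, gamma_t = to_perturbation(30.0, 600.0)
```

Both conventions describe the same physical dipole response, so the two dipole functions agree once the coefficients are converted. A factor-of-2 or factor-of-6 discrepancy between two reported values is often a sign of mixed conventions.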
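It is possible to define the nonlinear optical properties of a molecule in terms of the response of molecular properties other than the dipole, such as the energy or polarizabilities, to an external electric field. The appropriate expressions within the convention used in this chapter are given by Eqs. [6] and [7].

$$E(\mathbf{F}) = E(0) - \sum_i \mu_iF_i - \frac{1}{2!}\sum_{ij}\alpha_{ij}F_iF_j - \frac{1}{3!}\sum_{ijk}\beta_{ijk}F_iF_jF_k - \frac{1}{4!}\sum_{ijkl}\gamma_{ijkl}F_iF_jF_kF_l - \cdots \qquad [6]$$

$$\alpha_{ij}(\mathbf{F}) = \alpha_{ij} + \sum_k \beta_{ijk}F_k + \frac{1}{2!}\sum_{kl}\gamma_{ijkl}F_kF_l + \cdots \qquad [7]$$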
Care must be taken in using the expressions above for obtaining nonlinear optical properties, because the values obtained may not be the same as those obtained from Eq. [4]. The results will be equivalent only if the Hellmann-Feynman theorem26 is satisfied. For the case of the exact wavefunction or any fully variational approximation, the Hellmann-Feynman theorem equates derivatives of the energy to expectation values of derivatives of the Hamiltonian for a given parameter. If we consider the parameter to be the external electric field, $F$, then this gives $dE/dF = \langle \partial H/\partial F \rangle = -\langle \mu \rangle$. For nonvariational methods, such as perturbation theory or coupled cluster methods, additional terms must be considered.

As mentioned, the NLO properties are tensors with potentially many nonzero components. The goal of quantum chemistry herein is to develop methods to calculate these components in a molecular frame, after which they can be transformed (with other appropriate factors) into a laboratory frame to give susceptibilities. Fortunately for theoreticians, many experiments are done on isotropic systems (gases, neat liquids, and solutions) where the invariant vector and scalar components of $\alpha$, $\beta$, and $\gamma$ are measured. For example, the isotropic (or average) polarizability is given by

$$\langle\alpha\rangle = \frac{1}{3}\left(\alpha_{xx} + \alpha_{yy} + \alpha_{zz}\right) \qquad [8]$$
The other components of the polarizability ($\alpha_{xy}$, $\alpha_{xz}$, etc.) are not needed to obtain the isotropic quantity, saving considerable computing time. Similar simplifications are obtained for the second hyperpolarizabilities, where the isotropic quantity is2

$$\langle\gamma\rangle = \frac{1}{15}\sum_{i,j}\left(\gamma_{iijj} + \gamma_{ijij} + \gamma_{ijji}\right), \qquad i,j \in \{x,y,z\} \qquad [9]$$

To obtain this isotropic quantity, only 27 of the 81 possible components need to be calculated. For the all-static (DC) field case, the component indices can be permuted freely, and only 6 components are needed to obtain $\langle\gamma\rangle$ via the following expression:

$$\langle\gamma\rangle = \frac{1}{5}\left[\gamma_{xxxx} + \gamma_{yyyy} + \gamma_{zzzz} + 2\gamma_{xxyy} + 2\gamma_{xxzz} + 2\gamma_{yyzz}\right] \qquad [10]$$
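This free permutation, known as Kleinman symmetry,27 is often assumed to hold at low frequencies far from resonance. Strictly, at nonzero frequencies Kleinman symmetry no longer holds, and free permutation of the indices is not allowed. Fortunately, some simplification is present for the common types of measurement, because the component labels can be permuted as long as the corresponding frequency elements are equivalent. For example, for EFISH (see Table 1) the middle indices can be interchanged and $\gamma_{ijkl}(-2\omega;\omega,\omega,0) = \gamma_{ikjl}(-2\omega;\omega,\omega,0)$. Using these simplifications leads to the following expressions for a few common types of second hyperpolarizability:

$$\langle\gamma\rangle_{\mathrm{THG}} = \frac{1}{5}\left\{\gamma_{xxxx} + \gamma_{yyyy} + \gamma_{zzzz} + 2\left(\gamma_{xxyy} + \gamma_{xxzz} + \gamma_{yyzz}\right)\right\} \qquad [11]$$

$$\langle\gamma\rangle_{\mathrm{EFISH}} = \frac{1}{15}\left\{3\left(\gamma_{xxxx} + \gamma_{yyyy} + \gamma_{zzzz}\right) + 2\left(\gamma_{xxyy} + \gamma_{xxzz} + \gamma_{yyxx} + \gamma_{yyzz} + \gamma_{zzxx} + \gamma_{zzyy}\right) + \left(\gamma_{xyyx} + \gamma_{xzzx} + \gamma_{yxxy} + \gamma_{yzzy} + \gamma_{zxxz} + \gamma_{zyyz}\right)\right\} \qquad [12]$$

$$\langle\gamma\rangle_{\mathrm{IDRI}} = \frac{1}{15}\left\{3\left(\gamma_{xxxx} + \gamma_{yyyy} + \gamma_{zzzz}\right) + 2\left(\gamma_{xyxy} + \gamma_{xzxz} + \gamma_{yzyz} + \gamma_{yxyx} + \gamma_{zxzx} + \gamma_{zyzy}\right) + \left(\gamma_{xxyy} + \gamma_{xxzz} + \gamma_{yyzz} + \gamma_{yyxx} + \gamma_{zzxx} + \gamma_{zzyy}\right)\right\} \qquad [13]$$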
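These expressions also show the reduction in the number of components needed, with 6 for THG and 15 for both EFISH and IDRI (see Table 1 for acronyms). In some cases, other second-hyperpolarizability quantities may be required, such as for the Kerr effect, which measures the depolarization of light in the presence of a DC field. The corresponding second hyperpolarizability is given as

$$\gamma_K = \frac{3}{2}\left(\gamma_\parallel - \gamma_\perp\right) \qquad [14]$$

where $\gamma_\parallel$ is the $\langle\gamma\rangle$ in Eq. [10] and

$$\gamma_\perp = \frac{1}{15}\sum_{i,j}\left(2\gamma_{ijji} - \gamma_{iijj}\right) \qquad [15]$$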
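This leads to the following expression in terms of individual components:

$$\gamma_K = \frac{1}{5}\left(\gamma_{xxxx} + \gamma_{yyyy} + \gamma_{zzzz} + \gamma_{xxyy} + \gamma_{yyxx} + \gamma_{xxzz} + \gamma_{zzxx} + \gamma_{yyzz} + \gamma_{zzyy}\right) \qquad [16]$$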
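The isotropic averages are simple sums over selected tensor components, which the following sketch illustrates. It assumes the commonly used general definition $\langle\gamma\rangle = \frac{1}{15}\sum_{ij}(\gamma_{iijj} + \gamma_{ijij} + \gamma_{ijji})$ and the six-component static form of Eq. [10]. The tensor is stored as a plain dictionary keyed by index strings such as "xxyy"; the helper names and the numerical values are hypothetical.

```python
from itertools import combinations_with_replacement, product

AXES = "xyz"


def mean_alpha(alpha):
    """Isotropic polarizability: (a_xx + a_yy + a_zz) / 3."""
    return sum(alpha[i + i] for i in AXES) / 3.0


def mean_gamma(gamma):
    """General average: (1/15) * sum_ij (g_iijj + g_ijij + g_ijji)."""
    total = 0.0
    for i, j in product(AXES, repeat=2):
        total += gamma[i + i + j + j] + gamma[i + j + i + j] + gamma[i + j + j + i]
    return total / 15.0


def mean_gamma_static(gamma):
    """Six-component form, valid when all four indices permute freely."""
    diag = sum(gamma[i * 4] for i in AXES)
    cross = gamma["xxyy"] + gamma["xxzz"] + gamma["yyzz"]
    return (diag + 2.0 * cross) / 5.0


def fully_symmetric_gamma(unique_values):
    """Expand values keyed by sorted labels into all 81 components."""
    return {"".join(idx): unique_values["".join(sorted(idx))]
            for idx in product(AXES, repeat=4)}


# Hypothetical static tensor: one arbitrary value per unique component.
unique = {"".join(c): 10.0 * (n + 1)
          for n, c in enumerate(combinations_with_replacement(AXES, 4))}
gamma = fully_symmetric_gamma(unique)
alpha = {"xx": 8.0, "yy": 10.0, "zz": 12.0}
```

For a tensor with full index permutation symmetry the general sum and the six-component static form give the same number. For frequency-dependent tensors only the general sum, or a form that respects the allowed permutations of the process in question, should be used.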
For the first hyperpolarizability, $\beta$, it is harder to define a single simple quantity. In the SHG experiment, when all applied fields have parallel polarization and the molecule has a rotational symmetry axis (defined as the $z$ axis), the permanent dipole moment is oriented along that axis and the NLO quantities of interest are:

$$\beta_\parallel = \frac{1}{5}\sum_i\left(\beta_{zii} + \beta_{izi} + \beta_{iiz}\right) \qquad [17]$$
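When all fields are perpendicular in their polarization, the quantities of interest are:

$$\beta_\perp = \frac{1}{5}\sum_i\left(2\beta_{zii} - 3\beta_{izi} + 2\beta_{iiz}\right) \qquad [18]$$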
As with the case earlier, in the case of a static field a further simplification results, giving

$$\beta_\parallel = \frac{3}{5}\left(\beta_{zzz} + \beta_{yyz} + \beta_{xxz}\right) \qquad [19]$$
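In the case of nonstatic fields, some indices may again be permuted, and the SHG components obey the relation $\beta_{ijk}(-2\omega;\omega,\omega) = \beta_{ikj}(-2\omega;\omega,\omega)$. This leads to the following set of expressions for the $\beta$:

$$\beta_\parallel = \frac{1}{5}\sum_i\left(\beta_{zii} + 2\beta_{izi}\right) \qquad [20]$$

and

$$\beta_\perp = \frac{1}{5}\sum_i\left(2\beta_{zii} - \beta_{izi}\right) \qquad [21]$$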
Many calculations are performed on molecules that do not have a rotational symmetry axis or for which it is not convenient to orient the coordinate system along the dipole moment. In these cases, the expressions for $\beta$ become

$$\beta = \frac{1}{5}\left(\beta_x^2 + \beta_y^2 + \beta_z^2\right)^{1/2} \qquad [22]$$

where the “components” are given as

$$\beta_i = \sum_j\left(\beta_{ijj} + \beta_{jij} + \beta_{jji}\right) \qquad [23]$$
The Kerr experiment measures the depolarization of light in the presence of a DC field, and the corresponding first hyperpolarizability is

$$\beta_K = \beta_\parallel - \beta_\perp \qquad [24]$$
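The reduction of the 27 components of $\beta$ to a single reportable number can be sketched as follows. The definitions used here are assumptions rather than a transcription of the chapter's equations: vector "components" $\beta_i = \sum_j(\beta_{ijj} + \beta_{jij} + \beta_{jji})$, a magnitude $\frac{1}{5}(\beta_x^2 + \beta_y^2 + \beta_z^2)^{1/2}$, and the static expression of Eq. [19] for a molecule whose dipole lies along $z$. The tensor layout, helper names, and values are hypothetical.

```python
import math
from itertools import product

AXES = "xyz"


def beta_components(beta):
    """Vector part of beta: b_i = sum_j (b_ijj + b_jij + b_jji)."""
    return {i: sum(beta[i + j + j] + beta[j + i + j] + beta[j + j + i]
                   for j in AXES)
            for i in AXES}


def beta_magnitude(beta):
    """(1/5) * sqrt(b_x**2 + b_y**2 + b_z**2), independent of axis choice."""
    comps = beta_components(beta)
    return math.sqrt(sum(v * v for v in comps.values())) / 5.0


def beta_parallel_static_z(beta):
    """Static form for a dipole along z: (3/5)(b_zzz + b_yyz + b_xxz)."""
    return 0.6 * (beta["zzz"] + beta["yyz"] + beta["xxz"])


# Hypothetical static tensor for a planar molecule with its axis along z;
# all components not set below are zero.
beta = {"".join(idx): 0.0 for idx in product(AXES, repeat=3)}
beta["zzz"] = 10.0
for label in ("zxx", "xzx", "xxz"):
    beta[label] = 2.0
for label in ("zyy", "yzy", "yyz"):
    beta[label] = 1.0
```

For this symmetric, static example the orientation-independent magnitude and the $z$-axis expression coincide, which is the consistency one expects when the dipole direction is the only nonvanishing vector component.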
For nonisotropic systems, such as crystals and liquid crystals, the experimental laboratory-frame-based properties can still be related to the usual molecular frame properties. An example of this procedure for SHG28 is
where the $\cos\theta$ are the direction cosines relating the laboratory axis $X$ to the molecular axis $x$, the $f$ are local field factors to account for the screening of the external field by the other molecules, and the $s$ sum is over molecular sites.

In summary, what is important for the computational chemist is the ability to compute the tensor components of $\alpha_{ij}(-\omega;\omega)$, $\beta_{ijk}(-\omega_\sigma;\omega_1,\omega_2)$, and $\gamma_{ijkl}(-\omega_\sigma;\omega_1,\omega_2,\omega_3)$. Various methods used to determine these values are described in the next section. Because most experimental measurements are not reported in atomic units, Table 2 gives relationships with two other popular unit systems: SI and esu.
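Table 2. Conversion Factors for Atomic Units (au)

| Property | Atomic units | SI | esu |
|---|---|---|---|
| $\alpha$ | 1 au | $1.6488 \times 10^{-41}$ C$^2$ m$^2$ J$^{-1}$ | $1.4819 \times 10^{-25}$ esu |
| $\beta$ | 1 au | $3.20636 \times 10^{-53}$ C$^3$ m$^3$ J$^{-2}$ | $8.63922 \times 10^{-33}$ esu |
| $\gamma$ | 1 au | $6.23538 \times 10^{-65}$ C$^4$ m$^4$ J$^{-3}$ | $5.03670 \times 10^{-40}$ esu |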
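The conversions of Table 2 are easy to wrap in a small helper so that computed values in atomic units can be compared with experimental numbers. The sketch below uses the mantissas of Table 2 together with the standard powers of ten for these conversions; the exponents are stated here as an assumption and are worth checking against a current table of fundamental constants. The function and dictionary names are hypothetical.

```python
# Value of 1 atomic unit of each property in the target unit system.
AU_TO_SI = {"alpha": 1.6488e-41,   # C^2 m^2 J^-1
            "beta": 3.20636e-53,   # C^3 m^3 J^-2
            "gamma": 6.23538e-65}  # C^4 m^4 J^-3
AU_TO_ESU = {"alpha": 1.4819e-25,
             "beta": 8.63922e-33,
             "gamma": 5.03670e-40}


def from_atomic_units(value_au, prop, system="esu"):
    """Convert alpha, beta, or gamma from atomic units to SI or esu."""
    table = {"si": AU_TO_SI, "esu": AU_TO_ESU}[system.lower()]
    return value_au * table[prop]


# Hypothetical computed first hyperpolarizability of 1000 au.
beta_esu = from_atomic_units(1000.0, "beta", "esu")
```

Keeping the factors in one place avoids the common mistake of applying the $\alpha$ factor to $\beta$ or $\gamma$, whose units differ by additional powers of charge, length, and inverse energy.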
METHODS FOR MOLECULAR ELECTRONIC (HYPER)POLARIZABILITY CALCULATIONS

This section does not provide an exhaustive survey of all possible methods and approximations in use, but rather focuses on the most common methods currently implemented and then briefly describes a few others that are important. Three methods (the finite field, sum-over-states (SOS), and time-dependent Hartree-Fock methods) encompass the vast majority of NLO property calculations being performed today and are implemented in several readily available molecular orbital computer programs.
Finite Field Method

The finite field method is the simplest method for obtaining nonlinear optical properties of molecules. This method was first used by Cohen and Roothaan29 to calculate atomic polarizabilities at the Hartree-Fock level. The basic idea is to truncate the expansion of the energy (Eq. [6]) and solve for the desired coefficients by numerical differentiation. For example, if the expression is truncated after the quadratic term, the result is $E(\mathbf{F}) = E(0) - \mu_iF_i - \frac{1}{2}\alpha_{ij}F_iF_j$. If a uniform electric field is assumed to be aligned along the $x$ direction only, this reduces to $E(F_x) = E(0) - \mu_xF_x - \frac{1}{2}\alpha_{xx}F_x^2$. These expressions can be solved for $\mu_x$ and $\alpha_{xx}$ as follows:

$$\mu_x = -\frac{E(F_x) - E(-F_x)}{2F_x} \qquad [26]$$

$$\alpha_{xx} = -\frac{E(F_x) + E(-F_x) - 2E(0)}{F_x^2} \qquad [27]$$
If the energy is then calculated with a field of known strength oriented along the $+x$ direction and the $-x$ direction, the equations above give $\mu_x$ and $\alpha_{xx}$. A similar procedure can be repeated for other field orientations to get all the components of $\mu$ and $\alpha$. For the calculation of nonlinear optical properties, Eq. [6] is usually truncated beyond the $F^4$ term and more complicated expressions result. Bartlett and Purvis30 advocated this procedure in 1979 and Kurtz et al.31 implemented it in the MOPAC program. The resulting equations that give NLO properties for the fewest energy calculations using this approach are as follows:
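Expressions of this kind can be derived directly from Eq. [6]. The set below is a sketch obtained here by truncating Eq. [6] after the $F^4$ term, assuming free permutation of the tensor indices (static fields), and combining energies at $0$, $\pm F$, and $\pm 2F$ along one axis plus $(\pm F_i, \pm F_j)$ for pairs of axes. It is a derivation under those assumptions rather than a transcription of the original equations.

```latex
\begin{align*}
\mu_i F_i &= -\tfrac{2}{3}\left[E(F_i) - E(-F_i)\right]
             + \tfrac{1}{12}\left[E(2F_i) - E(-2F_i)\right] \\
\alpha_{ii} F_i^2 &= \tfrac{5}{2}E(0) - \tfrac{4}{3}\left[E(F_i) + E(-F_i)\right]
             + \tfrac{1}{12}\left[E(2F_i) + E(-2F_i)\right] \\
\beta_{iii} F_i^3 &= \left[E(F_i) - E(-F_i)\right]
             - \tfrac{1}{2}\left[E(2F_i) - E(-2F_i)\right] \\
\gamma_{iiii} F_i^4 &= -6E(0) + 4\left[E(F_i) + E(-F_i)\right]
             - \left[E(2F_i) + E(-2F_i)\right] \\
\beta_{ijj} F_i F_j^2 &= \tfrac{1}{2}\left[E(-F_i,-F_j) - E(F_i,F_j)
             + E(-F_i,F_j) - E(F_i,-F_j)\right] + \left[E(F_i) - E(-F_i)\right] \\
\gamma_{iijj} F_i^2 F_j^2 &= -4E(0) - \left[E(F_i,F_j) + E(-F_i,-F_j)
             + E(F_i,-F_j) + E(-F_i,F_j)\right] \\
          &\quad + 2\left[E(F_i) + E(-F_i)\right] + 2\left[E(F_j) + E(-F_j)\right]
\end{align*}
```

Each line can be verified by substituting the truncated expansion: the odd combinations isolate $\mu$ and $\beta$, the even combinations isolate $\alpha$ and $\gamma$, and the weights are chosen so that the unwanted order cancels. Counting the distinct field settings gives 1 zero-field energy, 12 single-axis energies, and 12 two-axis energies, or 25 in total.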
Formulas for the polarizability and hyperpolarizability components can likewise be derived from the dipole moment expression in Eq. [4]. The resulting equations are:
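\[ \mu_i(0) = \tfrac{2}{3}\{\mu_i(F_j) + \mu_i(-F_j)\} - \tfrac{1}{6}\{\mu_i(2F_j) + \mu_i(-2F_j)\} \qquad [29a] \]

\[ \alpha_{ij} = \left[\tfrac{2}{3}\{\mu_i(F_j) - \mu_i(-F_j)\} - \tfrac{1}{12}\{\mu_i(2F_j) - \mu_i(-2F_j)\}\right]/F_j \qquad [29b] \]

\[ \beta_{ijj} = \tfrac{1}{3}\{\mu_i(2F_j) + \mu_i(-2F_j) - \mu_i(F_j) - \mu_i(-F_j)\}/F_j^2 \qquad [29c] \]

\[ \gamma_{ijjj} = \left[\tfrac{1}{2}\{\mu_i(2F_j) - \mu_i(-2F_j)\} - \{\mu_i(F_j) - \mu_i(-F_j)\}\right]/F_j^3 \qquad [29d] \]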
An improved expression for $\beta_{iii}$ was proposed by Sim et al.32 that uses the directly computed value of $\mu(0)$ instead of the approximate value from Eq. [29a]:
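\[ \beta_{iii} = \left[-\tfrac{5}{2}\mu_i(0) + \tfrac{4}{3}\{\mu_i(F_i) + \mu_i(-F_i)\} - \tfrac{1}{12}\{\mu_i(2F_i) + \mu_i(-2F_i)\}\right]/F_i^2 \qquad [30a] \]

\[ \beta_{ijj} = \left[-\tfrac{5}{2}\mu_i(0) + \tfrac{4}{3}\{\mu_i(F_j) + \mu_i(-F_j)\} - \tfrac{1}{12}\{\mu_i(2F_j) + \mu_i(-2F_j)\}\right]/F_j^2 \qquad [30b] \]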
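The dipole-based route can be sketched the same way. The code below assumes the expansion $\mu(F) = \mu(0) + \alpha F + \tfrac{1}{2}\beta F^2 + \tfrac{1}{6}\gamma F^3$ along a single axis and compares the four-point estimate of $\beta$ with a five-point estimate that also uses the directly computed $\mu(0)$. The function names, the test model, and its fourth-order coefficient `delta` are hypothetical and serve only to show how the neglected higher-order term contaminates the four-point value.

```python
def finite_field_from_dipole(dipole, F):
    """alpha, beta, gamma from dipole moments at 0, +/-F and +/-2F.

    dipole: callable returning the dipole component at a signed field.
    Returns a dict; 'beta_4pt' omits mu(0), 'beta_5pt' uses it directly.
    """
    odd1 = dipole(F) - dipole(-F)
    odd2 = dipole(2 * F) - dipole(-2 * F)
    even1 = dipole(F) + dipole(-F)
    even2 = dipole(2 * F) + dipole(-2 * F)
    return {
        "mu0_est": (2.0 / 3.0) * even1 - even2 / 6.0,
        "alpha": ((2.0 / 3.0) * odd1 - odd2 / 12.0) / F,
        "beta_4pt": (even2 - even1) / (3.0 * F**2),
        "beta_5pt": (-2.5 * dipole(0.0) + (4.0 / 3.0) * even1
                     - even2 / 12.0) / F**2,
        "gamma": (0.5 * odd2 - odd1) / F**3,
    }


def model_dipole(F, mu0=1.2, alpha=10.0, beta=-30.0, gamma=2000.0, delta=5.0e5):
    """Hypothetical model including a fourth-order term (delta)."""
    return (mu0 + alpha * F + beta * F**2 / 2
            + gamma * F**3 / 6 + delta * F**4 / 24)


if __name__ == "__main__":
    for key, value in finite_field_from_dipole(model_dipole, 0.02).items():
        print(f"{key:9s} {value: .6f}")
```

For this model the five-point estimate recovers the input $\beta$ essentially exactly, whereas the four-point estimate carries an error proportional to `delta` times $F^2$.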
Using Eqs. [28] to calculate all the components of $\alpha$ and $\beta$ (except xyz) and most of $\gamma$ (xxxx, yyyy, zzzz, xxyy, xxzz, yyzz) requires 39 different energy
calculations, each with different values, or orientations, of the external electric field. If the $\alpha_{ij}$ values are not calculated, only 25 energy calculations are needed. Further, if only the “diagonal” tensor elements are needed ($\alpha_{ii}$, $\beta_{iii}$, $\gamma_{iiii}$), then only 13 calculations need to be performed to get all directions (x, y, and z). Just 5 calculations are needed to get the values along a single direction.

Another procedure for solving the truncated expressions is to calculate $E(F)$ at several arbitrary values of $F$ and fit the truncated equation to the data to extract the coefficients. A combination approach has been advocated by Sim, Chin, Dupuis, and Rice32 in which an $E(F) - E(0)$ expression through order $F^n$ is fit by results at $2m + 1$ values of the electric field chosen as $\pm kF_{\max}/m$, $k = 0$ to $m$. Their application was only for “diagonal” values of the properties $\alpha_{ii}$, $\beta_{iii}$, and $\gamma_{iiii}$, and for $n = 4$ and $m = 2$ it leads to the same equations as Eq. [28]. The procedure could be extended to compute all the components of $\alpha$, $\beta$, and $\gamma$ by extending the number of energy calculations to include fields aligned along other directions (e.g., 45° to the axes). Another example of this approach is the calculations of Woon and Dunning.33

The finite field procedure is the most often used procedure because of two main advantages: (1) it is very easy to implement, and (2) it can be applied to a wide range of quantum mechanical methods. To calculate the energy in the presence of a uniform electric field of strength $F$, a dipole-field interaction term, $\hat{\mu}\cdot\mathbf{F}$, needs to be added to the one-electron Hamiltonian. This interaction term can be constructed from just the dipole moment integrals over the basis set. Any ab initio or semiempirical method can then be used to solve the problem, with or without electron correlation. It is the ability to obtain properties from highly correlated methods that usually makes finite field calculations the most accurate available.
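The fitting variant described above can be sketched with an ordinary least-squares polynomial fit. This is only an illustration of the idea under stated assumptions: the field grid $\pm kF_{\max}/m$ follows the text, while the function names, the rescaling of the field to the interval $[-1, 1]$ (done here for numerical conditioning), and the quartic test model are choices made for this example.

```python
import math
import numpy as np


def field_grid(f_max, m):
    """The 2m + 1 field values +/- k*f_max/m for k = 0..m."""
    return np.array([k * f_max / m for k in range(-m, m + 1)])


def fit_properties(fields, energies, order=4):
    """Least-squares fit of E(F) to a polynomial of the given order.

    Coefficients c_n of F^n are converted with P_n = -n! * c_n, which
    assumes E(F) = E0 - mu*F - alpha*F^2/2 - beta*F^3/6 - gamma*F^4/24.
    """
    fields = np.asarray(fields, dtype=float)
    scale = np.max(np.abs(fields))
    # Fit in the scaled variable F/scale, then undo the scaling.
    c = np.polynomial.polynomial.polyfit(fields / scale, energies, order)
    c = c / scale ** np.arange(order + 1)
    names = ["E0", "mu", "alpha", "beta", "gamma"]
    result = {"E0": c[0]}
    for n in range(1, min(order, 4) + 1):
        result[names[n]] = -math.factorial(n) * c[n]
    return result


def model_energy(F, e0=-5.0, mu=1.2, alpha=10.0, beta=-30.0, gamma=2000.0):
    """Hypothetical quartic model standing in for computed energies."""
    return e0 - mu * F - alpha * F**2 / 2 - beta * F**3 / 6 - gamma * F**4 / 24


if __name__ == "__main__":
    grid = field_grid(0.02, 4)          # nine field values
    print(fit_properties(grid, model_energy(grid)))
```

Using more points than the minimum (here nine for a quartic) lets the fit average over numerical noise in the individual energies.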
An alternative method of including the electric field is through the use of point charges. This procedure is generally useful only when one is interested in properties that arise from nonuniform electric field contributions such as field gradients.34,35 Dewar and Stewart36 developed a procedure to arrange four charges to provide a fairly uniform field over the region of the molecule. However, this procedure is no simpler than including the $\hat{\mu}\cdot\mathbf{F}$ term directly and has no advantage over it.

There are, however, two major disadvantages to the finite field procedure. The first is that it is limited to static fields, and hence the method does not give values that can be directly related to most experiments. This is not necessarily a serious problem because as a rule only an estimate or information on relative properties is needed. The second disadvantage is that, like all numerical derivative schemes, the finite field procedure may exhibit severe numerical problems. In the example shown in Figure 4, the calculated $\gamma$ value for a biphenyl molecule obtained by a semiempirical method (MNDO) is plotted as a function of the arbitrary choice of the base field strength $F$ used in Eqs. [28] and [29]. The “correct” value within the model is the limit as $F \to 0$. Note the strong dependence on the choice of $F$ in the calculation. As the base field increases, the calculated properties based on $E(F)$ and $\mu(F)$ deviate more and more from the
“correct” value and from each other. Initially, the source of this error is the missing terms truncated from Eq. [4] or [6]. Eventually, the external field will be large enough to “break” the molecule of interest, that is, to change the ordering of the electronic states or ionize the system (though only the first option actually occurs with restricted Hartree-Fock (RHF) wavefunctions and finite basis sets). As the base field decreases, the computed values tend to converge, but at some point numerical breakdown occurs, and the results become meaningless. The breakdown is related to the convergence of whatever quantity is being used in the finite field procedure. For the energy-based expressions, Eq. [28], $\gamma$ is calculated as a function of energy differences divided by $F^4$. If a standard value of 0.001 au is chosen for $F$ and the energies are converged to a value of $10^{-6}$ au, as typically used in energy and structure calculations, the error in $\gamma$ is on the order of $10^{-6}/(0.001)^4$ or $10^{6}$ au, which clearly is unacceptable. Under these conditions, $\beta$ values would have errors on the order of $10^{3}$ au. To obtain $\gamma$ values with 1 au errors using a base field of 0.001 au, energies must be obtained that are converged to about $10^{-12}$ au. This procedure would also yield $\beta$ values with $10^{-3}$ au errors. Certainly, care must be taken when standard computer packages are used because the errors in the two-electron integral calculations are usually of this order. So that errors can be estimated when finite field calculations are reported, it is very important that the convergence criteria, and any other tolerances used, be reported as well, a practice often not followed, unfortunately.

The other way to reduce the error is to use a larger base field value. But, as shown in Figure 4, this may give results that deviate too far from the “correct” value. For small, closed-shell molecules, base field values as large as 0.01 au may be acceptable, requiring energies good only to $10^{-8}$ au, but care must always be taken. The approach of fitting more than the minimum number of $E(F)$ values also improves the stability.32

[Figure 4 plot: “Biphenyl Finite-Field Results”; horizontal axis: Base Field (F), 0.000 to 0.014; a “Dipole Based” curve is labeled.]

Figure 4 Results of finite field calculation on the second hyperpolarizability ($\gamma$) of biphenyl obtained with the MNDO semiempirical method.

The analysis above illustrates why dipole-based expressions like Eqs. [29] usually give better results. With base field strengths of 0.001 au and dipole moments converged to $10^{-6}$ au, the errors in $\beta$ are already down to 1 au, while the $\gamma$ error is $10^{3}$ au. Convergence of $10^{-9}$ au in the dipole moment is needed to lower the $\gamma$ error to 1 au.

Some types of frequency-dependent quantity can be calculated from finite field procedures, provided they are based on frequency-dependent properties obtained by other methods, such as those discussed later in the chapter. For example, a computer program that calculates the frequency-dependent polarizability can be modified to do so in the presence of a static electric field to yield $\beta^{EOPE}$ and $\gamma^{OKE}$ values via the expansion37

\[ \alpha_{ij}(-\omega;\omega)_F = \alpha_{ij}(-\omega;\omega) + \beta_{ijk}(-\omega;\omega,0)F_k + \tfrac{1}{2}\gamma_{ijkl}(-\omega;\omega,0,0)F_kF_l + \cdots \qquad [31] \]
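where $\alpha_{ij}(-\omega;\omega)_F$ is the frequency-dependent polarizability calculated in the presence of a uniform electric field $F$. Similarly, SHG first hyperpolarizabilities in the presence of a DC field can be used to produce the EFISH second hyperpolarizabilities:

\[ \beta_{ijk}(-2\omega;\omega,\omega)_F = \beta_{ijk}(-2\omega;\omega,\omega) + \gamma_{ijkl}(-2\omega;\omega,\omega,0)F_l + \cdots \qquad [32] \]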
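The error estimates quoted in the finite field discussion follow from one rule of thumb: a quantity converged to a tolerance $\delta$ and divided by $F^n$ carries an error of roughly $\delta/F^n$. The short sketch below simply encodes that arithmetic; the function names are illustrative, and the estimate ignores numerical prefactors from the difference formulas.

```python
def derivative_error(convergence, base_field, order):
    """Order-of-magnitude error of an n-th finite-field derivative."""
    return convergence / base_field**order


def required_convergence(target_error, base_field, order):
    """Convergence needed in E (or mu) to reach a target error."""
    return target_error * base_field**order


if __name__ == "__main__":
    F = 0.001  # base field in au
    # Energy-based: gamma is a 4th derivative of E, beta a 3rd derivative.
    print("gamma error from E converged to 1e-6:", derivative_error(1e-6, F, 4))
    print("beta  error from E converged to 1e-6:", derivative_error(1e-6, F, 3))
    print("E convergence for 1 au gamma error: ", required_convergence(1.0, F, 4))
    # Dipole-based: beta is a 2nd derivative of mu, gamma a 3rd derivative.
    print("beta  error from mu converged to 1e-6:", derivative_error(1e-6, F, 2))
    print("gamma error from mu converged to 1e-6:", derivative_error(1e-6, F, 3))
```

Because the dipole-based expressions divide by one fewer power of $F$, they tolerate a convergence threshold looser by a factor of $1/F$ for the same target error.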
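Sum-Over-States Methods

The most recognizable expressions for obtaining nonlinear optical properties are the expressions derived from time-dependent perturbation theory. This procedure is straightforward and has been described in the literature. The following equations show the expressions for $\alpha$, $\beta$, and $\gamma$:

\[ \alpha_{ij}(-\omega;\omega) = \sum_{e}\left[\frac{\langle 0|i|e\rangle\langle e|j|0\rangle}{(E_e - E_0 - \hbar\omega)} + \frac{\langle 0|j|e\rangle\langle e|i|0\rangle}{(E_e - E_0 + \hbar\omega)}\right] \qquad [33a] \]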
where $i$ refers to the $i$th component of the dipole moment, and $\bar{i} = i - \langle 0|i|0\rangle$. $E_u$ is the energy of state $u$, and the terms $\langle u|i|X\rangle$ are the transition moments between states $u$ and $X$; $u$ and $X$ are generic dummy variables used for convenience to represent the ground ($E_0$) and excited states. For electronic properties, the sums over $e$, $e'$, and $e''$ are over excited electronic states. The sum over $P$ is over permutations of the pairs $\{i,-\omega_\sigma\}$, $\{j,\omega_1\}$, $\{k,\omega_2\}$, and $\{l,\omega_3\}$ and has 6 terms for $\beta$ and 24 terms for $\gamma$.

An advantage of these equations is that they can be used to obtain frequency-dependent properties. Even more important, they directly illustrate the essential frequency-dependent behavior of the nonlinear optical properties by virtue of having identifiable poles at certain frequencies. It is easily seen that the polarizability $\alpha$ has poles when $\hbar\omega$ is equal to the energy difference $E_e - E_0$. Because the transition moments are only nonzero for dipole-allowed transitions, the poles of $\alpha$ correspond to the normal absorption frequencies of the molecule. For the first hyperpolarizabilities, if $\omega_0$ is the frequency corresponding to an excitation energy, then $\beta^{SHG}$ has poles at $\tfrac{1}{2}\omega_0$ and $\omega_0$, whereas $\beta^{OR}$ and $\beta^{EOPE}$ have poles at $\omega_0$. For the second hyperpolarizabilities, things are a little more complicated: $\gamma^{THG}$ has poles at $\tfrac{1}{3}\omega_0$, $\tfrac{1}{2}\omega_0$, and $\omega_0$; $\gamma^{EFISH}$ has poles at $\tfrac{1}{2}\omega_0$ and $\omega_0$; $\gamma^{IDRI}$ has poles at $\tfrac{1}{2}\omega_0$ and $\omega_0$; and $\gamma^{OKE}$ has poles at $\omega_0$.

These equations are difficult to use to accurately calculate the nonlinear optical properties of molecules because of the need for a complete set of excitation energies and transition moments. For $\beta$ and $\gamma$, the transition moments needed are not only ground to excited state (i.e., normal UV/vis-type data) but also excited state to excited state transition moments. In other words, what is needed is a complete description of the ground and excited states of the molecule.
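The pole structure of the polarizability is easy to see numerically. The sketch below evaluates a sum-over-states expression for a diagonal component of $\alpha(-\omega;\omega)$ in atomic units ($\hbar = 1$) for a hypothetical molecule with two dipole-allowed excited states; the excitation energies and transition moments are invented for the illustration, and the function name is not taken from any program.

```python
def sos_alpha(omega, exc_energies, trans_moments):
    """Diagonal SOS polarizability alpha(-omega; omega) in atomic units.

    exc_energies: excitation energies E_e - E_0 (hartree).
    trans_moments: ground-to-excited transition moments <0|x|e> (au).
    Each state contributes M^2/(E - omega) + M^2/(E + omega).
    """
    total = 0.0
    for energy, moment in zip(exc_energies, trans_moments):
        total += moment**2 / (energy - omega) + moment**2 / (energy + omega)
    return total


if __name__ == "__main__":
    energies = [0.20, 0.30]   # hypothetical excitation energies
    moments = [1.0, 0.5]      # hypothetical transition moments
    for w in (0.0, 0.10, 0.19, 0.21):
        print(f"omega = {w:4.2f}  alpha = {sos_alpha(w, energies, moments):10.3f}")
```

The value grows as $\omega$ approaches the first excitation energy (0.20 in this example) and changes sign just above it, which is the behavior expected at a pole.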
In practice, the summations over excited states used in the sum-over-states (SOS) expressions generated from quantum mechanical calculations must be truncated in some manner. This leads to two problems: (1) how to logically truncate summations and (2) how to ensure that the results have converged such that additional excited states will not change the value. SOS methods are most often used with semiempirical Hamiltonians, where the small basis sets help limit the number of states, precisely because of the problem of logically truncating summations. A common truncation is to use single excitation configuration interaction (SCI) results for $\beta$ and single and double excitation CI (SDCI) results for $\gamma$. The practice is justified by the observations that only single excitations are dipole allowed, and these are the only states that contribute in Eq. [33b], whereas Eq. [33c] requires double excitations. Examples of SOS-CI calculations for $\beta$ and $\gamma$ are the works of Kanis et al.5,43 and Meyers et al.44 It is not always necessary to diagonalize large CI matrices and use the explicit excited states in the SOS equations, however. For example, one can obtain frequency-dependent NLO properties directly from the CI matrices by use of moment methods.45

One very important simplification based on the SOS formulation is the simple two-state model for static first hyperpolarizabilities and polarizabilities given by:

\[ \alpha_{xx} = \frac{2M_{ge}^2}{E_{ge}} \qquad [34a] \]

\[ \beta_{xxx} = \frac{6M_{ge}^2\,\Delta\mu}{E_{ge}^2} \qquad [34b] \]
where $M_{ge}$ is the transition moment between the ground state and excited state, $E_{ge}$ is the corresponding energy difference, and $\Delta\mu$ ($= \mu_e - \mu_g$) is the change in the dipole moment upon excitation. Equation [34b] for $\beta$ is the basis of the solvatochromic method in which $M_{ge}$ is obtained from the extinction coefficient, and $\Delta\mu$ from shifts in absorption due to changes in solvent polarity. Based on this understanding, one way to “engineer” molecules to have large first hyperpolarizabilities is by maximizing the difference between the ground and first excited state dipoles (i.e., large charge transfer); but care must be taken to avoid simultaneously making $M_{ge}$ very small. These simple models are primarily for the “diagonal” components only and give useful information if that is the dominant component, which it often is. In a similar procedure, a few-state model has been developed for $\gamma$ as

\[ \gamma_{xxxx} = \frac{24M_{ge}^2\,\Delta\mu^2}{E_{ge}^3} - \frac{24M_{ge}^4}{E_{ge}^3} + 24\sum_{e'}\frac{M_{ge}^2 M_{ee'}^2}{E_{ge}^2 E_{ge'}} \qquad [35] \]
which can be written in a two-state form by keeping the first two terms only, or in a three-state form by restricting the sum to a single electronic state.46
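The two-state limits can be evaluated with a few lines of code. The sketch assumes the static forms $\alpha = 2M_{ge}^2/E_{ge}$, $\beta = 6M_{ge}^2\Delta\mu/E_{ge}^2$, and the first two terms of the few-state $\gamma$ expression, with prefactors chosen to match the Taylor-series convention implied by the factor of 24 in that expression; other conventions differ by constant factors. The function name and the input numbers are hypothetical.

```python
def two_state_model(m_ge, e_ge, delta_mu):
    """Static two-state estimates of alpha, beta, gamma (atomic units).

    m_ge: ground-to-excited transition moment.
    e_ge: excitation energy.
    delta_mu: change in dipole moment on excitation (mu_e - mu_g).
    Prefactors 2, 6, 24 are an assumed (Taylor-series) convention.
    """
    alpha = 2.0 * m_ge**2 / e_ge
    beta = 6.0 * m_ge**2 * delta_mu / e_ge**2
    # First two terms of the few-state expression for gamma.
    gamma = 24.0 * m_ge**2 * delta_mu**2 / e_ge**3 - 24.0 * m_ge**4 / e_ge**3
    return alpha, beta, gamma


if __name__ == "__main__":
    # Hypothetical charge-transfer chromophore.
    print(two_state_model(m_ge=2.0, e_ge=0.15, delta_mu=3.0))
    # Hypothetical centrosymmetric case: delta_mu = 0 gives beta = 0 and
    # a negative two-state gamma.
    print(two_state_model(m_ge=2.0, e_ge=0.15, delta_mu=0.0))
```

The sketch makes the design trade-off in the text concrete: $\beta$ scales with $\Delta\mu$ but also with $M_{ge}^2$, so increasing charge transfer helps only as long as the transition moment does not collapse.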
Time-Dependent Hartree-Fock

The time-dependent Hartree-Fock (TDHF) method has a lengthy history as a technique for calculating the electronic excitation energies and transition moments of molecular systems. There are two ways to formulate TDHF theory that look quite different but are in fact equivalent. The formulation used below was first developed by Frenkel47 and is based on obtaining approximate solutions to the time-dependent Schrödinger equation

\[ H\Psi = i\hbar\frac{\partial\Psi}{\partial t} \qquad [36] \]

by applying the variational principle

\[ \left\langle \delta\Psi \left| H - i\hbar\frac{\partial}{\partial t} \right| \Psi \right\rangle = 0 \qquad [37] \]

A nice development of the theory is given in a book by McWeeny,48 which shows that the assumption of a single determinant form for $\Psi$ leads to the time-dependent Hartree-Fock equations

\[ \mathbf{F}\mathbf{C} - i\frac{\partial}{\partial t}\mathbf{S}\mathbf{C} = \mathbf{S}\mathbf{C}\boldsymbol{\varepsilon} \qquad [38] \]
In this section, $\mathbf{F}$ refers to the "Fock" matrix, and the electric field is denoted by $E$. Sekino and Bartlett49 extended this method to the computation of properties to any order, with explicit results for $\beta$ and $\gamma$. Karna and Dupuis50 followed with a similar procedure, which incorporated the $2n + 1$ rule for more efficient calculations of $\beta$ and $\gamma$. The essential feature of these techniques is expansion of the quantities in terms of the time-dependent perturbation $\mu \cdot E^a(e^{+i\omega t} + e^{-i\omega t} + 1)$, followed by collection of terms at each order to produce a series of Hartree-Fock-like equations. Completely general equations can be derived by using $\mu\cdot\mathcal{E}$ with $\mathcal{E} = \sum_i E_i(e^{i\omega_i t} + e^{-i\omega_i t})$. Only two frequency terms need to be included to obtain expressions for a general first hyperpolarizability, $\beta_{abc}(-\omega_\sigma;\omega_1,\omega_2)$, and three terms are needed to obtain a general expression for the second hyperpolarizability, $\gamma_{abcd}(-\omega_\sigma;\omega_1,\omega_2,\omega_3)$. Following Sekino and Bartlett,49 the $\mathbf{F}$ matrix can be expanded in terms of $\mathcal{E}$ as

\[ \mathbf{F} = \mathbf{F}^0 + E^a\mathbf{F}^a + (2!)^{-1}E^aE^b\mathbf{F}^{ab} + (3!)^{-1}E^aE^bE^c\mathbf{F}^{abc} + \cdots \qquad [39] \]
where the definitions of $\mathbf{F}^a$, $\mathbf{F}^{ab}$, $\mathbf{F}^{abc}$, etc. depend on the choice of $\mathcal{E}$. Likewise, the $\mathbf{C}$ and $\boldsymbol{\varepsilon}$ matrices can be expanded in terms of the perturbation $\mathcal{E}$. Substituting into Eq. [38] and collecting terms with the same frequency dependence results in the following set of coupled equations:
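\[ \mathbf{F}^a(\omega)\mathbf{C}^0 + \mathbf{F}^0\mathbf{C}^a(\omega) + \omega\mathbf{S}^0\mathbf{C}^a(\omega) = \mathbf{S}^0\mathbf{C}^a(\omega)\boldsymbol{\varepsilon}^0 + \mathbf{S}^0\mathbf{C}^0\boldsymbol{\varepsilon}^a(\omega) \qquad [40a] \]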
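Only the equations through third-order (derivative) are shown, but the procedure can easily be extended to higher orders. These equations are solved subject to the normalization constraint, $\mathbf{C}^\dagger\mathbf{S}\mathbf{C} = 1$, which can also be expanded into order-by-order terms. The details of solving these various equations have been described elsewhere and are essentially the same as the normal self-consistent field (SCF) procedure. It is common to work in the atomic orbital (AO) basis as done in the usual Hartree-Fock procedure. In the AO basis, the quantities of interest are "transformation" matrices, $\mathbf{U}^a$, $\mathbf{U}^{ab}$, etc., defined as $\mathbf{C}^a(\omega) = \mathbf{U}^a(\omega)\mathbf{C}^0$ and $\mathbf{C}^{ab}(\omega_1,\omega_2) = \mathbf{U}^{ab}(\omega_1,\omega_2)\mathbf{C}^0$. From these results, the various components of the density matrix, $\mathbf{D} = \mathbf{C}\mathbf{n}\mathbf{C}^\dagger$, where $\mathbf{n}$ is a matrix of occupation numbers and the dagger indicates the transpose conjugate, can be determined from

\[ \mathbf{D}^a(\omega) = \mathbf{C}^a(\omega)\mathbf{n}\mathbf{C}^{0\dagger} + \mathbf{C}^0\mathbf{n}\mathbf{C}^{a\dagger}(-\omega) \qquad [41b] \]

\[ \mathbf{D}^{ab}(\omega_1,\omega_2) = \mathbf{C}^{ab}(\omega_1,\omega_2)\mathbf{n}\mathbf{C}^{0\dagger} + \mathbf{C}^a(\omega_1)\mathbf{n}\mathbf{C}^{b\dagger}(-\omega_2) + \mathbf{C}^b(\omega_2)\mathbf{n}\mathbf{C}^{a\dagger}(-\omega_1) + \mathbf{C}^0\mathbf{n}\mathbf{C}^{ab\dagger}(-\omega_1,-\omega_2) \qquad [41c] \]

\[ \begin{aligned} \mathbf{D}^{abc}(\omega_1,\omega_2,\omega_3) ={} & \mathbf{C}^{abc}(\omega_1,\omega_2,\omega_3)\mathbf{n}\mathbf{C}^{0\dagger} + \mathbf{C}^a(\omega_1)\mathbf{n}\mathbf{C}^{bc\dagger}(-\omega_2,-\omega_3) \\ & + \mathbf{C}^{bc}(\omega_2,\omega_3)\mathbf{n}\mathbf{C}^{a\dagger}(-\omega_1) + \mathbf{C}^b(\omega_2)\mathbf{n}\mathbf{C}^{ac\dagger}(-\omega_1,-\omega_3) \\ & + \mathbf{C}^{ac}(\omega_1,\omega_3)\mathbf{n}\mathbf{C}^{b\dagger}(-\omega_2) + \mathbf{C}^c(\omega_3)\mathbf{n}\mathbf{C}^{ab\dagger}(-\omega_1,-\omega_2) \\ & + \mathbf{C}^{ab}(\omega_1,\omega_2)\mathbf{n}\mathbf{C}^{c\dagger}(-\omega_3) + \mathbf{C}^0\mathbf{n}\mathbf{C}^{abc\dagger}(-\omega_1,-\omega_2,-\omega_3) \end{aligned} \qquad [41d] \]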
The procedure is to calculate (or guess) an initial density ($\mathbf{D}^a$, $\mathbf{D}^{ab}$, ...), calculate the "Fock" matrices ($\mathbf{F}^a$, $\mathbf{F}^{ab}$, ...), calculate the $\mathbf{U}^a$, $\mathbf{U}^{ab}$, ... matrices, and repeat until there is convergence in either the $\mathbf{U}$ matrices or the density matrices. From the various density terms, $\mathbf{D}^a(\omega)$, $\mathbf{D}^{ab}(\omega_1,\omega_2)$, or $\mathbf{D}^{abc}(\omega_1,\omega_2,\omega_3)$, the properties of interest can be calculated by the traces
where $\boldsymbol{\mu}^a$ is the a-component dipole moment matrix. Solution of the equations above strongly resembles a normal SCF procedure, and one of its major strengths is the ease with which a normal SCF program can be converted to provide polarizability and hyperpolarizability properties; the major difference is that the matrices involved in the TDHF procedure are nonsymmetric: that is, $(\mathbf{D}^{ab})_{\mu\nu} \neq (\mathbf{D}^{ab})_{\nu\mu}$ and, therefore, $(\mathbf{F}^{ab})_{\mu\nu} \neq (\mathbf{F}^{ab})_{\nu\mu}$, so more storage is required. Additionally, the solutions at each order depend on those of the preceding orders, so these must be saved as well. One of the main problems with this TDHF procedure, though, is its poor convergence behavior, which becomes worse as a pole is approached. This simple approach is particularly hard to converge above the first pole, and conventional random phase approximation methods (to be discussed later) work better in this region. More work needs to be done to improve the initial guess and to develop better convergence accelerators.

If $\gamma$ is the highest order hyperpolarizability of interest, it is not necessary to solve the iterative equations for $\mathbf{D}^{abc}(\omega_1,\omega_2,\omega_3)$. Using the detailed procedure of Karna and Dupuis,50 the $2n + 1$ rule can be used to express $\gamma$ in terms of lower order terms from the $\alpha$ and $\beta$ calculations as follows:

\[ \begin{aligned} \gamma_{abcd}(-\omega_\sigma;\omega_1,\omega_2,\omega_3) = \mathrm{Tr}\big[\mathbf{n}\{ & \mathbf{U}^a(-\omega_\sigma)\mathbf{G}^b(\omega_1)\mathbf{U}^{cd}(\omega_2,\omega_3) - \mathbf{U}^{cd\dagger}(-\omega_2,-\omega_3)\mathbf{G}^b(\omega_1)\mathbf{U}^a(-\omega_\sigma) \\ & + \mathbf{U}^a(-\omega_\sigma)\mathbf{G}^c(\omega_2)\mathbf{U}^{bd}(\omega_1,\omega_3) - \mathbf{U}^{bd\dagger}(-\omega_1,-\omega_3)\mathbf{G}^c(\omega_2)\mathbf{U}^a(-\omega_\sigma) + \cdots \}\big] \end{aligned} \]
where $\mathbf{G}^a$ and $\mathbf{G}^{ab}$ are the "Fock" matrices transformed to the molecular orbital (MO) basis. These equations look like a lot of work, but only one evaluation is needed to get the desired component, and the terms are just matrix multiplications, with many having the same form. The time-consuming part is the requirement, generally, for several calculations of lower order properties. For example, to calculate $\gamma(-3\omega;\omega,\omega,\omega)$ in one step, one must have calculated $\alpha(-3\omega;3\omega)$, $\alpha(-\omega;\omega)$, and $\beta(-2\omega;\omega,\omega)$. Similarly, to get $\gamma(-2\omega;\omega,\omega,0)$ in a single step, one must have obtained $\alpha(-2\omega;2\omega)$, $\alpha(-\omega;\omega)$, $\alpha(0;0)$, $\beta(-2\omega;\omega,\omega)$, and $\beta(-\omega;\omega,0)$. In general, to compute $\gamma(-\omega_\sigma;\omega_1,\omega_2,\omega_3)$, one needs $\alpha(-\omega_\sigma;\omega_\sigma)$, $\alpha(-\omega_1;\omega_1)$, $\alpha(-\omega_2;\omega_2)$, $\alpha(-\omega_3;\omega_3)$, $\beta(-(\omega_1+\omega_2);\omega_1,\omega_2)$, $\beta(-(\omega_2+\omega_3);\omega_2,\omega_3)$, and $\beta(-(\omega_1+\omega_3);\omega_1,\omega_3)$. Likewise, if the highest order property of interest is only $\beta(-\omega_\sigma;\omega_1,\omega_2)$, then only polarizability calculations need be performed; that is, $\alpha(-\omega_\sigma;\omega_\sigma)$, $\alpha(-\omega_1;\omega_1)$, and $\alpha(-\omega_2;\omega_2)$ must be computed.

The earlier derivative Hartree-Fock theory of Dykstra and Jasien is an equivalent procedure to the TDHF approach described above for the static-field-only case. An alternate MO-based procedure developed by Sekino and Bartlett looks a lot like the usual random phase approximation (discussed later). The same authors54 also expressed the TDHF approach in terms of the sum-over-states procedure and discussed the effects of truncation in this context. Recently, Karna55 extended the AO-based TDHF procedure to open-shell systems using a UHF-like approach.
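The bookkeeping implied by the $2n + 1$ rule can be written down compactly. The sketch below lists which lower order calculations are needed for a given $\gamma$ or $\beta$, following the general prescription in the text; frequencies are given as integer multiples of a base frequency $\omega$ so that duplicates can be removed exactly. The function names are illustrative only.

```python
def required_for_gamma(w1, w2, w3):
    """Lower order results needed for gamma(-(w1+w2+w3); w1, w2, w3).

    Frequencies are integer multiples of a base frequency.
    Returns (alphas, betas): alphas is a sorted list of frequencies w
    for alpha(-w; w); betas is a sorted list of (wa, wb) pairs for
    beta(-(wa+wb); wa, wb).
    """
    w_sigma = w1 + w2 + w3
    alphas = sorted({w_sigma, w1, w2, w3})
    pairs = [(w1, w2), (w2, w3), (w1, w3)]
    betas = sorted({tuple(sorted(p)) for p in pairs})
    return alphas, betas


def required_for_beta(w1, w2):
    """Polarizabilities needed for beta(-(w1+w2); w1, w2)."""
    return sorted({w1 + w2, w1, w2})


if __name__ == "__main__":
    print("THG   gamma(-3w;w,w,w):", required_for_gamma(1, 1, 1))
    print("EFISH gamma(-2w;w,w,0):", required_for_gamma(1, 1, 0))
    print("SHG   beta(-2w;w,w):   ", required_for_beta(1, 1))
```

For third harmonic generation this returns polarizabilities at $\omega$ and $3\omega$ plus the single SHG $\beta$, and for EFISH it returns polarizabilities at $0$, $\omega$, and $2\omega$ plus $\beta(-\omega;\omega,0)$ and $\beta(-2\omega;\omega,\omega)$, matching the two examples worked in the text.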
Other Methods

The preceding sections outline the three most commonly implemented and currently practical methods for computing nonlinear optical properties. But, as discussed, each of those methods has drawbacks, and a great deal of effort is being expended in the quantum chemistry community to develop alternate methods that are more accurate and more efficient. This section briefly introduces a few of the approaches that currently look most promising. We introduce the reader to these methods but do not provide a detailed discussion.
Propagators

Another way to view the nonlinear optical properties is via the time-dependent response functions:
When the perturbing potential V" is an external electric field and the property of interest A is the dipole moment, the coefficients, indicated in the expansion above by the double angle brackets, are referred to as polarization propagators. These functions are equivalent to the tensor components
and are referred to as the linear, quadratic, and cubic response functions, respectively. There are several reviews on the calculation of polarization propagators.56-58 Two major approximations must be made to obtain the response functions: a choice of a reference function and a choice of an operator manifold. For the linear response function, the choices of a Hartree-Fock reference state and simple “particle-hole” excitation operators (in the second quantization sense) lead to an approximation that is known as the random phase approximation (RPA) and is equivalent to the TDHF method discussed earlier. The quadratic response functions have been calculated within the RPA by Parkinson and Oddershede.59 Cubic response functions have been calculated within the RPA60-62 and within the multiconfiguration RPA (MCRPA).62,63

Møller-Plesset Perturbation Theory

Perturbation theory approaches are essentially improvements to the TDHF results. This type of work was pioneered by Rice and Handy, who defined the polarizability in terms of derivatives of a pseudoenergy, $W = \langle\Psi|(H - i\hbar\,\partial/\partial t)|\Psi\rangle$, with respect to one static and one frequency-dependent field
Aiga et al.66-68 developed a similar quasi-energy derivative method with the polarizability defined in terms of derivatives of frequency-dependent fields as
For hyperpolarizabilities, Rice and Handy actually used equations developed for an arbitrary $\beta(-\omega_\sigma;\omega_1,\omega_2)$, whereas Aiga and Itoh used only a numerical derivative (finite field) approach to calculate $\beta(-\omega;\omega,0)$ and $\gamma(-\omega;\omega,0,0)$.68

Coupled Cluster-Equations of Motion

A third procedure for calculating correlated frequency-dependent properties is the coupled-cluster, equations-of-motion (CC-EOM) method.69 This approach is formally equivalent to solving the sum-over-states expression using a similarity transformed Hamiltonian, $\bar{H} = e^{-T}He^{T}$, where the transformation is obtained from the coupled cluster choice for a reference state, $|\Psi\rangle = e^{T}|0\rangle$. No explicit states need be obtained during this procedure, and the dynamic polarizability can be calculated directly. Sekino and Bartlett70 used a finite field procedure based on CC-EOM polarizabilities obtained with a reference state restricted to including double excitation operators in $T$ (referred to as CCD-EOM) to compute $\beta(-\omega;\omega,0)$ and $\gamma(-\omega;\omega,0,0)$ for NH$_3$ and C,H,.
PRACTICAL CONSIDERATIONS

Basis Sets

Selecting an adequate basis set is generally the most important consideration in obtaining accurate or even reasonable nonlinear optical properties from quantum mechanical calculations. The basis sets needed are much larger than those for normal energy and structure calculations,71 and this is the main reason for the great computational expense associated with NLO calculations. One problem in the literature is that there is little standardization in NLO basis set choice. Indeed, a common technique is to make the basis set as large as possible, until either converged properties are obtained or the limits of affordability are reached.

An NLO basis set is typically a good energy basis set augmented by several extra, very diffuse functions. One of the smallest basis sets in common use is the 6-31G+PD basis developed by Hurst, Dupuis, and Clementi for polyene calculations. Here the normal 6-31G basis is augmented by a set of very diffuse p and d Gaussian functions with exponents of 0.05. The exponents in the 6-31G+PD basis set were selected by testing several basis sets in calculations on C,H, and picking a good compromise between results and basis set size. Yeates and Dudis developed a similar basis set but chose exponents for all first-row atoms to optimize the appropriate excitation energy.

A similar type of basis set was developed by Spackman, who added to a standard 6-31G basis set a set of diffuse s and d Gaussian functions for first-row atoms and a set of diffuse s and p functions for hydrogen. The exponents of the s functions were fixed at 0.25 times the most diffuse s exponent in the 6-31G basis, and the exponents of the additional p and d functions were optimized to maximize the polarizabilities for a series of AH$_n$ compounds.
This basis, referred to as 6-31G(+sd+sp), was reported to give MP2 polarizabilities accurate to ±2% of experiment. Like the 6-31G+PD basis, because of its relatively small size, this basis has also been used for hyperpolarizability calculations.

A more evenly augmented set of standard triple-zeta basis functions was used by Dykstra, Liu, and Malik to develop the ELP basis. In this procedure the first-row-element basis sets were augmented by a diffuse s, two diffuse p, and three diffuse d functions, providing a 6s4p3d basis set. The hydrogen basis was augmented by a single s and two diffuse p functions to give a 4s4p basis.
A similar style of basis set for use in electrical response property calculations can be found in the augmented correlation consistent basis sets of Woon and Dunning.33 These authors have developed a scheme to augment the standard correlation consistent basis sets (cc-pVXZ) by adding one set of diffuse functions for each angular momentum present in the original set. This basis is referred to as aug-cc-pVXZ, where X is D for double-zeta or T for triple-zeta basis sets. Further augmented basis sets have been developed by adding additional diffuse functions formed in an “even-tempered” manner by decreasing the exponents by a fixed ratio. These basis sets are referred to as x-aug-cc-pVXZ basis sets with x = d(oubly), t(riply), and q(uadruply). For example, for nitrogen the standard cc-pVDZ basis is [3s2p1d]. Therefore, the aug-cc-pVDZ basis is [4s3p2d], and the d-aug-cc-pVDZ basis is [5s4p3d].

The polarized basis sets (POL) developed by Sadlej and co-workers are a very popular basis set choice. These sets were designed to improve the calculation of first- and second-order molecular properties (i.e., dipole moments and polarizabilities). They consist of a standard double-zeta Gaussian-type orbital basis with a set of extra functions derived from derivatives of the outer valence functions of the original set, that is, functions of one higher angular momentum with the same exponents. An extension of the Sadlej basis, referred to as POL+, has been promoted by Bartlett and co-workers. The POL+ set is formed by adding a single set of d functions on hydrogen with an exponent of 0.1.

The best way to illustrate the important role of basis set quality in these calculations is by example. For purposes of comparison and to focus on basis set effects alone, we performed TDHF calculations on the HCN molecule with many of the basis sets discussed above and a few others.
All calculations were performed with the free ab initio program GAMESS78 at the same molecular geometry, optimized with the cc-pVDZ basis ($R_{CN}$ = 1.1342 Å, $R_{CH}$ = 1.0666 Å). Table 3 shows the results for $\mu$ and $\alpha$. For the dipole moment, all basis sets give reasonable values with the exception of the 6-31G+PD and cc-pVDZ basis sets, which are over 0.1 debye (D) too small, and the Spackman basis, which is too large by almost 0.1 D. However, all basis sets give answers within 5% of experiment. A good estimate for the limiting value is about 3.282 D, as predicted by the largest basis sets.

For the isotropic polarizability $\langle\alpha\rangle$, greater variation is seen in basis set influence. All the augmented Dunning basis sets give very reasonable values, with the augmented pVTZ results suggesting a limiting $\langle\alpha\rangle$ value of around 16.4 au, as shown in Figure 5. The related Sadlej and POL+ sets give comparable values but use nearly one-third the number of functions. Figure 5 illustrates that the augmentation of the Sadlej basis to give the POL+ basis increases the value of $\langle\alpha\rangle$ toward the estimated limit. The “normal” basis sets (6-311++G**, cc-pVDZ, and cc-pVTZ) have values that are much too small, and their main failure is in the “nonaxial” $\alpha_{xx}$ ($\alpha_{yy}$) values. It should be noted that the specialty 6-31G+PD basis is very poor at calculating $\langle\alpha\rangle$, which emphasizes
Table 3. Basis Set Dependence of μ and α for HCN (μ in D, α in au)

Basis            Functions (C, N/H)    Number of functions    μ_z        α_zz      α_xx      ⟨α⟩
6-311++G**       5s4p1d/4s1p            53                    -3.2914    21.922     9.698    13.772
6-31G+PD         3s2p1d/2s1p            35                    -3.1502    18.601     9.374    12.450
Spackman         4s2p1d/3s1p            38                    -3.3539    21.451    12.709    15.623
ELP              6s4p3d/4s4p            88                    -3.2926    21.849    13.480    16.270
Sadlej           5s3p2d/3s2p            61                    -3.2710    21.913    13.343    16.199
POL+             5s3p2d/3s2p1d          67                    -3.2751    21.904    13.410    16.242
cc-pVDZ          3s2p1d/2s1p            35                    -3.1344    19.605     8.371    12.116
aug-cc-pVDZ      4s3p2d/3s2p            59                    -3.2452    21.381    13.037    15.818
d-aug-cc-pVDZ    5s4p3d/4s3p            83                    -3.2827    21.923    13.624    16.390
t-aug-cc-pVDZ    6s5p4d/5s4p           107                    -3.2820    21.935    13.632    16.400
cc-pVTZ          4s3p2d1f/3s2p1d        85                    -3.2551    21.006    10.902    14.270
aug-cc-pVTZ      5s4p3d2f/4s3p2d       135                    -3.2813    21.908    13.572    16.351
d-aug-cc-pVTZ    6s5p4d3f/5s4p3d       185                    -3.2816    21.916    13.631    16.393
[Figure 5 plot: horizontal axis: Level of Augmentation, 0 to 3.]

Figure 5 Polarizability ($\alpha$) of HCN as a function of basis set.
the point that one must take great care in the use of basis sets for predicting properties for which they were not designed. The Spackman basis gave a relatively good estimate, as advertised, especially considering the very small number of functions used.

The HCN $\beta$ and $\gamma$ values calculated using the same basis sets are shown in Table 4. For $\langle\gamma\rangle$, an ad hoc variational principle is usually employed: basis sets that maximize $\langle\gamma\rangle$ are better. A similar approach is also often used in evaluation of $\langle\alpha\rangle$ results. A comparison of the $\langle\gamma\rangle$ values from the augmented cc-pVDZ and cc-pVTZ basis sets (Figure 6) illustrates that the most important factor is to have a basis set with sufficient diffuse functions. This is particularly obvious when one considers the change caused by adding a single set of functions to either the cc-pVDZ or cc-pVTZ basis sets. It is interesting to note, in this regard, that the smaller Sadlej basis sets give larger $\langle\gamma\rangle$ values than the projected limit of the augmented cc-pVTZ basis sets. Increasing to the POL+ basis reduces the value toward the projected limit. This apparent violation of the ad hoc variational principle reflects the failure of the Sadlej basis to give an adequate estimate of the lower order terms ($\beta$). The $\langle\gamma\rangle$ value obtained from the specialty 6-31G+PD basis is quite reasonable. Because of its good predictive ability and its small size, 6-31G+PD has become very popular for studying larger systems. In contrast, the Spackman basis, which is of similar size, did not fare as well. Finally, the “normal” basis sets like cc-pVDZ and cc-pVTZ give
Table 4. Basis Set Dependence of β and γ for HCN (values in au)

Basis            β_zxx     β_zzz     ⟨β⟩       γ_zzzz    γ_xxxx    γ_xxzz    γ_xxyy    ⟨γ⟩
6-311++G**        3.278     1.200     4.654     839.9     797.4     362.5     265.8     883.2
6-31G+PD          3.553    -3.352     2.252    1193.9    1646.3     572.8     548.8    1575.1
Spackman          9.330     8.166    16.095    1228.4     345.2     356.9     115.1     715.3
ELP               2.102     6.157     6.217    1507.0    1503.4     538.5     501.1    1534.0
Sadlej           -0.484     1.061     0.056    1594.5    2096.3     606.4     698.8    1922.1
POL+              0.798     2.134     2.238    1505.2    2067.0     621.0     689.0    1900.3
cc-pVDZ           8.036    26.146    25.331     121.6      49.6      24.1      16.5      70.0
aug-cc-pVDZ      11.711    15.978    23.640     881.3    1086.4     233.6     362.1     942.6
d-aug-cc-pVDZ     2.362     4.493     5.530    1540.0    1700.3     589.5     566.8    1686.4
t-aug-cc-pVDZ     2.708     4.584     6.001    1587.4    1685.1     598.7     561.7    1695.2
cc-pVTZ           8.572    18.697    21.504     293.9     247.8      90.5      82.6     263.4
aug-cc-pVTZ       3.519     5.946     7.791    1471.4    1462.2     514.3     487.4    1485.5
d-aug-cc-pVTZ     2.409     5.302     6.072    1554.4    1897.4     592.2     632.5    1796.6
values far too small, being only 4 and 15% of the d-aug-cc-pVTZ value, respectively. A comparison of the computed β values highlights the real problem in NLO basis set selection: there is no systematic variational trend to follow. When going from poor to better basis sets, β for HCN tends to become smaller, but this is not a general trend. The sequence of augmented cc-pVTZ results (Figure 7) seems to show uniform convergence, but the cc-pVDZ results deviate at the largest basis set, albeit by a small amount. The best-guess limit is around 6 au. Another important point illustrated by these results is that, unlike for γ and α, adding diffuse functions to a basis set does not necessarily improve the computed results. This is illustrated by the lack of improvement in going from the cc-pVDZ basis to the aug-cc-pVDZ basis, whereas the aug-cc-pVTZ basis gives close to the converged result. As noted above, it is particularly interesting how different the results from the Sadlej and POL+ basis sets, which give excellent <α> and <γ> results, are from those of the augmented basis sets. The Sadlej basis gives a very small β value, with the POL+ result being somewhat larger. The source of these underestimations can be found by looking at the individual components. The βxxz (βyyz) components are very different from those derived from the Dunning augmented basis sets, and the results from the Sadlej basis even have a different sign. Some other basis sets in Table 4 are known to be poor, and this again can be seen by looking at the individual components. The specialty 6-31G+PD basis set, for example, gives a reasonable β value, but the βzzz component has the wrong sign.

Figure 6 Second hyperpolarizability (γ) of HCN as a function of basis set.
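The relationship between the individual tensor components and the averaged values discussed above can be checked with a short script. This is only a sketch: it assumes the standard static-limit expressions for a linear molecule with z as the molecular axis (x and y equivalent), and the function names are hypothetical. The numbers are rows of Table 4 for two of its columns.

```python
def beta_vector_linear(beta_zzz, beta_xxz):
    """Vector part of beta along the molecular (z) axis of a linear molecule.

    Assumes static, fully symmetric components with beta_yyz equal to beta_xxz.
    """
    return 0.6 * (beta_zzz + 2.0 * beta_xxz)


def gamma_mean_linear(g_zzzz, g_xxxx, g_xxzz, g_xxyy):
    """Isotropic average of gamma for a linear molecule (z = molecular axis).

    Because x and y are equivalent, gamma_yyyy = gamma_xxxx and
    gamma_yyzz = gamma_xxzz, which gives the factors 2 and 4 below.
    """
    return (g_zzzz + 2.0 * g_xxxx + 2.0 * g_xxyy + 4.0 * g_xxzz) / 5.0


# Components taken from the first data column of Table 4.
print(round(beta_vector_linear(beta_zzz=1.200, beta_xxz=3.278), 3))
print(round(gamma_mean_linear(839.9, 797.4, 362.5, 265.8), 1))

# Column in which beta_xxz is negative: the two components nearly cancel,
# which is how a very small averaged beta can arise.
print(round(beta_vector_linear(beta_zzz=1.061, beta_xxz=-0.484), 3))
```

Printing the averages next to the components makes it easier to see when a reasonable-looking average hides a component with the wrong sign.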
Figure 7 First hyperpolarizability (β) of HCN as a function of basis set (horizontal axis: level of augmentation).
To illustrate the size of typical NLO basis sets in use, we show the number of Gaussian functions for several of the common basis sets for two molecules: acetylene and p-nitroaniline (Table 5). Most MO programs in use are capable of implementing 5 d functions and 7 f functions, except for the General Atomic and Molecular Electronic Structure System (GAMESS),78 which uses 6 d functions and 10 f functions. The difficulty of predicting accurate NLO properties is clearly apparent when calculations on the fairly small molecule p-nitroaniline could require around 1000 basis functions to overcome basis set deficiencies. From the very limited results presented here, it is impossible to declare that one particular basis set is the best. The choice of basis set is often a compromise between quality and time or cost. It is important to note that the type of basis set needed depends on the nature of the system being studied, and basis sets as large as those mentioned here may not always be needed to provide accurate results. For very delocalized π systems in large polyenes, for example, it has been shown that even the smaller, "normal" basis sets work well.72,79
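The basis set sizes quoted for acetylene and p-nitroaniline follow from simple shell counting, which can be sketched as below. This is a minimal illustration, not part of any program mentioned in the text: the function names and the "heavy/H" contraction-string format are assumptions made for the example. The only chemistry it relies on is that a d shell contributes 6 Cartesian or 5 spherical functions and an f shell 10 or 7.

```python
import re

# Functions contributed by one shell of each angular momentum.
CARTESIAN = {"s": 1, "p": 3, "d": 6, "f": 10}  # (6d, 10f) convention
SPHERICAL = {"s": 1, "p": 3, "d": 5, "f": 7}   # (5d, 7f) convention


def functions_per_atom(contraction, per_shell):
    """Count basis functions for a contraction string such as '5s3p2d'."""
    shells = re.findall(r"(\d+)([spdf])", contraction)
    return sum(int(count) * per_shell[label] for count, label in shells)


def molecule_size(basis, n_heavy, n_hydrogen):
    """Return (Cartesian, spherical) totals for a 'heavy/H' contraction pair."""
    heavy, hydrogen = basis.split("/")
    return tuple(
        n_heavy * functions_per_atom(heavy, table)
        + n_hydrogen * functions_per_atom(hydrogen, table)
        for table in (CARTESIAN, SPHERICAL)
    )


# Acetylene (2 heavy atoms, 2 H) and p-nitroaniline (10 heavy atoms, 6 H)
# with a 5s3p2d/3s2p contraction.
print(molecule_size("5s3p2d/3s2p", n_heavy=2, n_hydrogen=2))   # (70, 66)
print(molecule_size("5s3p2d/3s2p", n_heavy=10, n_hydrogen=6))  # (314, 294)
```

The gap between the two conventions grows with every added d or f shell, which is why it becomes noticeable for the larger augmented sets.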
Table 5. Basis Set Sizes for (6d,10f)/(5d,7f) Implementations

Basis Set        Basis Functions (C, N, O/H)   Acetylene C2H2   p-Nitroaniline NO2C6H4NH2
6-31G*           3s2p1d/2s                     34/32            162/152
6-31G+PD         3s3p1d/2s                     40/38            192/182
Sadlej           5s3p2d/3s2p                   70/66            314/294
POL+             5s3p2d/3s2p1d                 82/76            350/324
aug-cc-pVDZ      4s3p2d/3s2p                   68/64            304/284
d-aug-cc-pVDZ    5s4p3d/4s3p                   90/86            428/398
t-aug-cc-pVDZ    6s5p4d/5s4p                   124/116          553/513
aug-cc-pVTZ      5s4p3d2f/4s3p2d               156/138          680/598
d-aug-cc-pVTZ    6s5p4d3f/5s4p3d               214/188          930/812
t-aug-cc-pVTZ    7s6p5d4f/6s5p4d               272/238          1180/1026

Other Considerations

Semiempirical molecular orbital methods have long been a mainstay for the prediction of NLO properties, particularly with sum-over-states methods, and much of our understanding of how molecular properties affect NLO properties is deduced from them.5,80 However, semiempirical methods have in most instances been found to fail to provide reasonable NLO properties for small molecular systems.2,81 The primary cause of the problems in these systems is traced to the minimal basis sets used in semiempirical methods. As noted earlier, smaller basis sets may be adequate for many larger molecular systems, such as polyenes, and for these systems one expects that semiempirical methods might suffice.82

The role of electron correlation in the prediction of NLO properties has been shown to be very important in several studies.32,83,84 Unfortunately, as already noted, most currently employed methods do not give correlated frequency-dependent properties, and most studies of electron correlation have been done using finite field approaches. The magnitude of the correlation correction varies widely. For example, for large polyenes like C24H26, it has been shown that the MP2 γ may be twice the RHF value.85 At present, routine treatment of large systems rests on the hope that the essential trends and relationships are adequately explained at the RHF level even though the values may not be accurate.

As an alternative to Hartree-Fock semiempirical and ab initio calculations, density functional theory has been used to obtain nonlinear optical properties in both the finite field86,87 and TDHF88,89 (or time-dependent Kohn-Sham) approaches.
BEYOND MOLECULAR ELECTRONIC CALCULATIONS

Ultimately the computational chemist wants to make the model theories as realistic as possible, and the next step in the computation of NLO properties is not just to obtain more accurate and more efficient methods for the electronic properties of isolated (0 K) molecules, but to enlarge the theory to treat realistic NLO systems, which are dynamic, interacting molecules. We now consider aspects of computing NLO properties more closely related to experimentally measured properties.
Molecular Vibrational Calculations

The methods discussed above are, in general, concerned only with obtaining the electronic contribution to polarizabilities and hyperpolarizabilities. A complete treatment of the problem requires inclusion of the vibrational and rotational contributions as well. For many experiments at visible frequencies, these effects may be small. For low frequency or static field experiments, however, these effects have been shown to be as large as or larger than the electronic effects themselves.90 Bishop and Kirtman developed a general approach for calculating the vibrational contributions for polyatomic systems.91,92 A recent overview of this subject can be found in the review by Kirtman and Champagne.93
Condensed Phase Problems

The methods discussed so far apply to single molecules only. The data derived from those calculations should be comparable to those from experiments done at low pressure on pure gases. However, most interest in NLO properties is in condensed phase systems (liquids, polymer films, or crystals). A major area of theoretical interest has been solvent effects, and several techniques have been applied to the calculation of NLO properties.94-97 The most common (and simplest) method is the reaction field model, where the solute molecule is placed in a cavity within the solvent, which is treated as a uniform dielectric medium. Cavity approaches are problematic. How do you pick the cavity size? How do you pick the cavity shape? How do you model stronger, specific interactions (such as hydrogen bonding)? The work of Willetts and Rice94 illustrated the inability of reaction field models to adequately treat solvent effects even though they tried both spherical and ellipsoidal cavities. Mikkelsen et al.96 attempted to provide specific interactions with their solvent model by explicitly including solvent molecules inside the cavity. These and related issues need to be addressed further if computational chemists are to develop truly useful procedures capable of including solvent effects in NLO calculations. Recent work by Cammi, Tomasi, and co-workers98-100 has attempted to address these issues within the polarized continuum model (PCM) and has included studies of frequency-dependent hyperpolarizabilities.

Another way to treat condensed phases is to explicitly study intermolecular interaction effects on NLO properties. Interesting attenuation effects in NLO properties arising from interchain interactions have been shown to exist for interacting ethylene molecules101 and for butadiene and hexatriene molecules held in the alignment corresponding to polyacetylene stretched fibers.95 Another example is found in variable position studies of interacting H(C2H2)nH molecules (with n from 1 to 6).102 More work is needed in this area of treating environmental effects before robust NLO property predictions can be made.
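The cavity-size question raised above for reaction field models can be made concrete with a small sketch. It assumes the simplest textbook case, the Onsager expression for a point dipole at the center of a spherical cavity, which is not necessarily the formulation used in the studies cited. The function names and the numerical values (dielectric constant, radii, dipole) are hypothetical, and atomic units are used throughout.

```python
def onsager_factor(eps, radius_bohr):
    """Onsager reaction-field factor g (atomic units) for a spherical cavity.

    g = 2 (eps - 1) / ((2 eps + 1) a^3), so the field on the solute scales
    as the inverse cube of the cavity radius a.
    """
    if eps < 1.0 or radius_bohr <= 0.0:
        raise ValueError("need eps >= 1 and a positive cavity radius")
    return 2.0 * (eps - 1.0) / ((2.0 * eps + 1.0) * radius_bohr ** 3)


def reaction_field(dipole_au, eps, radius_bohr):
    """Reaction field (au) acting on a point dipole of the given magnitude."""
    return onsager_factor(eps, radius_bohr) * dipole_au


# Hypothetical solute dipole of 1.5 au in a high-dielectric solvent.
for radius in (5.0, 6.0, 7.0):
    field = reaction_field(1.5, eps=78.4, radius_bohr=radius)
    print(f"a = {radius:.1f} bohr -> reaction field = {field:.5f} au")
```

Because the field enters the hyperpolarizabilities nonlinearly, an uncertainty of a bohr or so in the radius can shift the computed property substantially, which is one way to see why the choice of cavity matters.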
SUMMARY

The implementation and limitations of three common methods for obtaining NLO properties (finite field, sum over states, and time-dependent Hartree-Fock) have been discussed, and a very brief introduction has been given to the new methods under development. The ability to obtain results that relate directly to common experimental measurements is improving but still has a long way to go. Theoretical and computational NLO work will clearly continue to be important.
ACKNOWLEDGMENTS

During the writing of this chapter, we benefited from many useful comments by Bernard Kirtman (University of California, Santa Barbara), Shasi Karna (U.S. Air Force Phillips Lab), Tom Cundari (University of Memphis), and the editors of this series.
REFERENCES

1. D. M. Bishop, in Advances in Quantum Chemistry, J. R. Sabin and M. C. Zerner, Eds., Academic Press, San Diego, CA, 1994, Vol. 25, pp. 3-48. Aspects of Non-Linear-Optical Calculations.
2. D. P. Shelton and J. E. Rice, Chem. Rev., 94, 3 (1994). Measurements and Calculations of the Hyperpolarizabilities of Atoms and Small Molecules in the Gas Phase.
3. M. Ratner, Int. J. Quantum Chem., 43, 5 (1992). Electronic Structure Studies of Nonlinear Optical Response in Molecules: An Introduction. (Plus rest of issue.)
4. J. L. Brédas, C. Adant, P. Tackx, A. Persoons, and B. M. Pierce, Chem. Rev., 94, 243 (1994). Third-Order Nonlinear Optical Response in Organic Materials: Theoretical and Experimental Aspects.
5. D. R. Kanis, M. A. Ratner, and T. J. Marks, Chem. Rev., 94, 195 (1994). Design and Construction of Molecular Assemblies with Large Second-Order Optical Nonlinearities. Quantum Chemical Aspects.
6. G. D. Stucky, S. R. Marder, and J. E. Sohn, in Materials for Nonlinear Optics-Chemical Perspectives, S. R. Marder, J. E. Sohn, and G. D. Stucky, Eds., ACS Symposium Series 455, American Chemical Society, Washington, DC, 1991, pp. 2-30. Linear and Nonlinear Polarizability: A Primer.
7. C. E. Dykstra, S.-Y. Liu, and D. J. Malik, in Advances in Chemical Physics, I. Prigogine and S. A. Rice, Eds., Wiley, New York, 1989, Vol. 75, pp. 37-112. Ab Initio Determination of Molecular Electrical Properties.
8. R. W. Boyd, Nonlinear Optics, Academic Press, San Diego, CA, 1992.
9. H. Margenau and G. M. Murphy, The Mathematics of Physics and Chemistry, Van Nostrand, Princeton, NJ, 1956.
10. B. A. Reinhardt, Trends Polym. Sci., 4, 287 (1996). Third-Order Nonlinear Optical Polymers.
11. B. A. Reinhardt, Trends Polym. Sci., 1, 4 (1993). The Status of Third-Order Polymeric Nonlinear Optical Materials.
12. B. A. Reinhardt, in Encyclopedia of Advanced Materials, D. Bloor, R. J. Brook, M. C. Flemings, S. Mahajan, and W. Cahn, Eds., Pergamon Press, Oxford, 1994, pp. 1784-1793. Nonlinear Optical Materials: χ(3) Polymers.
13. R. Dagani, Chem. Eng. News, Sept. 23, 1996, pp. 68-70. Two Photons Shine in 3-D Data Storage.
14. A. Buckley, Adv. Mater., 4, 153 (1992). Polymers for Nonlinear Optics.
15. T. Kaino and S. Tomaru, Adv. Mater., 5, 172 (1993). Organic Materials for Nonlinear Optics.
16. D. F. Eaton, G. R. Meredith, and J. S. Miller, Adv. Mater., 3, 564 (1991). Molecular Nonlinear Optical Materials-Potential Applications.
17. R. Dorn, D. Baums, P. Kersten, and R. Regener, Adv. Mater., 4, 464 (1992). Nonlinear Optical Materials for Integrated Optics: Telecommunications and Sensors.
18. P. N. Prasad and D. J. Williams, Introduction to Nonlinear Optical Effects in Molecules and Polymers, Wiley, New York, 1991.
19. J. Zyss, Ed., Molecular Nonlinear Optics, Academic Press, New York, 1994.
20. G. J. Ashwell and D. Bloor, Eds., Organic Materials for Non-Linear Optics, Vol. III, Royal Society of Chemistry, Cambridge, 1993.
21. G. A. Lindsay and K. D. Singer, Eds., Polymers for Second-Order Nonlinear Optics, ACS Symposium Series 601, American Chemical Society, Washington, DC, 1995.
22. B. E. A. Saleh and M. C. Teich, Fundamentals of Photonics, Wiley-Interscience, New York, 1991.
23. A. Marrakchi, Photonic Switching and Interconnects, Dekker, New York, 1994.
24. A. D. Buckingham, Adv. Chem. Phys., 12, 107-142 (1967). Permanent and Induced Molecular Moments and Long-Range Intermolecular Forces.
25. A. Willetts, J. E. Rice, D. M. Burland, and D. P. Shelton, J. Chem. Phys., 97, 7590 (1992). Problems in the Comparison of Theoretical and Experimental Hyperpolarizabilities.
26. P. W. Atkins and R. S. Friedman, Molecular Quantum Mechanics, 3rd ed., Oxford University Press, Oxford, 1997.
27. D. A. Kleinman, Phys. Rev., 126, 1977 (1962). Nonlinear Dielectric Polarization in Optical Media.
28. D. J. Williams, in Materials for Nonlinear Optics-Chemical Perspectives, S. R. Marder, J. E. Sohn, and G. D. Stucky, Eds., ACS Symposium Series 455, American Chemical Society, Washington, DC, 1991, pp. 31-49. Second-Order Nonlinear Optical Processes in Molecules and Solids.
29. H. D. Cohen and C. C. J. Roothaan, J. Chem. Phys., S43, 34 (1965). Electric Dipole Polarizabilities of Atoms by the Hartree-Fock Method. I. Theory for Closed-Shell Systems.
30. R. J. Bartlett and G. D. Purvis III, Phys. Rev. A, 20, 1313 (1979). Molecular Hyperpolarizabilities. I. Theoretical Calculations Including Correlation.
31. H. A. Kurtz, J. J. P. Stewart, and K. M. Dieter, J. Comput. Chem., 11, 82 (1990). Calculations of the Nonlinear Optical Properties of Molecules.
32. F. Sim, S. Chin, M. Dupuis, and J. E. Rice, J. Phys. Chem., 97, 1158 (1993). Electron Correlation Effects in Hyperpolarizabilities of p-Nitroaniline.
33. D. E. Woon and T. H. Dunning, Jr., J. Chem. Phys., 100, 2975 (1994). Gaussian Basis Sets for Use in Correlated Molecular Calculations. IV. Calculation of Static Electrical Response Properties.
34. G. Maroulis and A. J. Thakkar, J. Chem. Phys., 93, 4164 (1990). Polarizabilities and Hyperpolarizabilities of Carbon Dioxide.
35. D. M. Bishop and G. Maroulis, J. Chem. Phys., 82, 2380 (1985). Accurate Prediction of Static Polarizabilities and Hyperpolarizabilities. A Study on FH (X 1Σ+).
36. M. J. S. Dewar and J. J. P. Stewart, Chem. Phys. Lett., 111, 416 (1984). A New Procedure for Calculating Molecular Polarizabilities: Applications Using MNDO.
37. M. Jaszunski, Chem. Phys. Lett., 140, 130 (1987). A Mixed Numerical-Analytical Approach to the Calculation of Non-Linear Electric Properties.
38. C. Flytzanis, in Quantum Electronics: A Treatise, H. Rabin and C. L. Tang, Eds., Academic Press, New York, 1975, Vol. 1, pp. 9-207. Theory of Nonlinear Optical Susceptibilities.
39. J. O. Morley, P. Pavlider, and D. Pugh, Int. J. Quantum Chem., 43, 7 (1992). On the Calculation of the Hyperpolarizabilities of Organic Molecules by the Sum Over Virtual Excited States Method.
40. P. K. Franken and J. F. Ward, Rev. Mod. Phys., 35, 23 (1963). Optical Harmonics and Nonlinear Phenomena.
41. J. F. Ward, Rev. Mod. Phys., 37, 1 (1965). Calculation of Nonlinear Optical Susceptibilities Using Diagrammatic Perturbation Theory.
42. B. J. Orr and J. F. Ward, Mol. Phys., 20, 513 (1971). Perturbation Theory of the Nonlinear Optical Polarization of an Isolated System.
43. D. R. Kanis, T. J. Marks, and M. A. Ratner, Int. J. Quantum Chem., 43, 61 (1992). Calculation of Quadratic Hyperpolarizabilities for Organic π Electron Chromophores: Molecular Geometry Sensitivity of Second-Order Nonlinear Optical Response.
44. F. Meyers, S. R. Marder, B. M. Pierce, and J. L. Brédas, J. Am. Chem. Soc., 116, 10703 (1994). Electric Field Modulated Nonlinear Optical Properties of Donor-Acceptor Polyenes: Sum-Over-States Investigations of the Relationship Between Molecular Polarizabilities (α, β, and γ) and Bond Length Alternation.
45. T. Inoue and S. Iwata, Chem. Phys. Lett., 167, 566 (1990). Method of Frequency-Dependent Hyperpolarizability Calculation from Large-Scale CI Matrices.
46. C. W. Dirk, L.-T. Cheng, and M. G. Kuzyk, Int. J. Quantum Chem., 43, 27 (1992). A Simplified Three-Level Model Describing the Molecular Third-Order Nonlinear Optical Susceptibility.
47. J. Frenkel, Wave Mechanics, Advanced General Theory, Oxford University Press, London, 1934.
48. R. McWeeny, Methods of Molecular Quantum Mechanics, 2nd ed., Academic Press, San Diego, CA, 1989.
49. H. Sekino and R. J. Bartlett, J. Chem. Phys., 85, 976 (1986). Frequency Dependent Nonlinear Optical Properties of Molecules.
50. S. P. Karna and M. Dupuis, J. Comput. Chem., 12, 487 (1991). Frequency Dependent Nonlinear Optical Properties of Molecules: Formulation and Implementation in the HONDO Program.
51. H. Sekino and R. J. Bartlett, Int. J. Quantum Chem., 43, 119 (1992). New Algorithm for High-Order Time-Dependent Hartree-Fock Theory for Nonlinear Optical Properties.
52. S. P. Karna, Chem. Phys. Lett., 214, 186 (1993). A "Direct" Time-Dependent Coupled Perturbed Hartree-Fock-Roothaan Approach to Calculate Molecular (Hyper)polarizabilities.
53. C. E. Dykstra and P. G. Jasien, Chem. Phys. Lett., 109, 388 (1984). Derivative Hartree-Fock Theory to All Orders.
54. H. Sekino and R. J. Bartlett, in Nonlinear Optical Materials: Theory and Modeling, S. P. Karna and A. T. Yeates, Eds., American Chemical Society, Washington, DC, 1996, pp. 78-101. Sum-Over-State Representation of Non-linear Response Properties in Time-Dependent Hartree-Fock Theory: The Role of State Truncation.
55. S. P. Karna, J. Chem. Phys., 104, 6590 (1996). Spin-Unrestricted Time-Dependent Hartree-Fock Theory of Frequency-Dependent Linear and Nonlinear Optical Properties.
56. J. Linderberg and N. Y. Öhrn, Propagators in Quantum Chemistry, Academic Press, New York, 1973.
57. J. Oddershede, Adv. Chem. Phys., 69, 201 (1987). Propagator Methods.
58. J. Olsen and P. Jørgensen, J. Chem. Phys., 82, 3235 (1985). Linear and Nonlinear Response Functions for an Exact State and for an MCSCF State.
59. W. A. Parkinson and J. Oddershede, J. Chem. Phys., 94, 7251 (1991). Quadratic Response Theory of Frequency-Dependent First Hyperpolarizability. Calculations in the Dipole Length and Mixed-Velocity Formalisms.
60. P. Norman, D. Jonsson, O. Vahtras, and H. Ågren, Chem. Phys., 203, 23 (1996). Non-linear Electric and Magnetic Properties Obtained from Cubic Response Functions in the Random Phase Approximation.
61. P. Norman, D. Jonsson, O. Vahtras, and H. Ågren, Chem. Phys. Lett., 242, 7 (1995). Cubic Response Functions in the Random Phase Approximation.
62. P. Norman, Y. Luo, D. Jonsson, and H. Ågren, J. Chem. Phys., 106, 1827 (1997). The Hyperpolarizability of trans-Butadiene: A Critical Test Case for Quantum Chemical Models.
63. D. Jonsson, P. Norman, and H. Ågren, J. Chem. Phys., 105, 6401 (1996). Cubic Response Functions in the Multiconfiguration Self-Consistent Field Approximation.
64. J. E. Rice and N. C. Handy, J. Chem. Phys., 94, 4959 (1991). The Calculation of Frequency-Dependent Polarizabilities as Pseudo-Energy Derivatives.
65. J. E. Rice and N. C. Handy, Int. J. Quantum Chem., 43, 91 (1992). The Calculation of Frequency-Dependent Hyperpolarizabilities Including Electron Correlation Effects.
66. K. Sasagane, F. Aiga, and R. Itoh, J. Chem. Phys., 99, 3738 (1993). Higher-Order Response Theory Based on the Quasienergy Derivatives: The Derivation of the Frequency-Dependent Polarizabilities and Hyperpolarizabilities.
67. F. Aiga, K. Sasagane, and R. Itoh, J. Chem. Phys., 99, 3779 (1993). Frequency-Dependent Hyperpolarizabilities in the Møller-Plesset Perturbation Theory.
68. F. Aiga and R. Itoh, Chem. Phys. Lett., 251, 372 (1996). Calculation of Frequency-Dependent Polarizabilities and Hyperpolarizabilities by the Second-Order Møller-Plesset Perturbation Theory.
69. J. F. Stanton and R. J. Bartlett, J. Chem. Phys., 99, 5178 (1993). A Coupled-Cluster Based Effective Hamiltonian Method for Dynamic Electric Polarizabilities.
70. H. Sekino and R. J. Bartlett, Chem. Phys. Lett., 234, 87 (1995). Frequency-Dependent Hyperpolarizabilities in the Coupled-Cluster Method: The Kerr Effect for Molecules.
71. D. Feller and E. R. Davidson, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1990, Vol. 1, pp. 1-43. Basis Sets for Ab Initio Molecular Orbital Calculations and Intermolecular Interactions.
72. G. J. B. Hurst, M. Dupuis, and E. Clementi, J. Chem. Phys., 89, 385 (1988). Ab Initio Analytic Polarizability, First and Second Hyperpolarizabilities of Large Conjugated Organic Molecules: Applications to Polyenes C4H6 to C22H24.
73. A. T. Yeates and D. S. Dudis, Abstracts, 208th National Meeting of the American Chemical Society, Washington, DC, August 1994. Moderate-Sized Diffuse Gaussian Basis Sets for the Ab Initio Evaluation of Excited-State Energies and Nonlinear Optical Coefficients.
74. M. A. Spackman, J. Phys. Chem., 93, 7594 (1989). Accurate Prediction of Static Dipole Polarizabilities with Moderately Sized Basis Sets.
75. A. J. Sadlej, Collect. Czech. Chem. Commun., 53, 1995 (1988). Medium-Size Polarized Basis Sets for High-Level Correlated Calculations of Molecular Electric Properties.
76. A. J. Sadlej, Theor. Chim. Acta, 79, 123 (1991). Medium-Size Polarized Basis Sets for High-Level Correlated Calculations of Molecular Properties. II. Second-Row Atoms: Si Through Cl.
77. H. Sekino and R. J. Bartlett, J. Chem. Phys., 98, 3022 (1993). Molecular Hyperpolarizabilities.
78. M. W. Schmidt, K. K. Baldridge, J. A. Boatz, S. T. Elbert, M. S. Gordon, J. H. Jensen, S. Koseki, N. Matsunaga, K. A. Nguyen, S. Su, T. L. Windus, M. Dupuis, and J. A. Montgomery, J. Comput. Chem., 14, 1347 (1993). General Atomic and Molecular Electronic Structure System.
79. B. Kirtman, Int. J. Quantum Chem., 43, 147 (1992). Nonlinear Optical Properties of Conjugated Polyenes from Ab Initio Finite Oligomer Calculations.
80. J. O. Morley and D. Pugh, in Organic Materials for Non-Linear Optics, R. B. Hann and D. Bloor, Eds., Royal Society of Chemistry, London, 1988, pp. 28-39. Semiempirical Calculations of Molecular Hyperpolarizabilities.
81. R. J. Bartlett and H. Sekino, in Nonlinear Optical Materials: Theory and Modeling, S. P. Karna and A. T. Yeates, Eds., American Chemical Society, Washington, DC, 1996, pp. 23-57. Can Quantum Chemistry Provide Reliable Molecular Hyperpolarizabilities?
82. P. K. Korambath and H. A. Kurtz, in Nonlinear Optical Materials: Theory and Modeling, S. P. Karna and A. T. Yeates, Eds., American Chemical Society, Washington, DC, 1996, pp. 133-144. Frequency-Dependent Polarizabilities and Hyperpolarizabilities of Polyenes.
83. C. Adant, M. Dupuis, and J. L. Brédas, Int. J. Quantum Chem., Quantum Chem. Symp., 29, 497 (1995). Ab Initio Study of the Nonlinear Optical Properties of Urea: Electron Correlation and Dispersion Effects.
84. E. Perrin, P. N. Prasad, P. Mougenout, and M. Dupuis, J. Chem. Phys., 91, 4728 (1989). Ab Initio Calculations of Polarizability and Second Hyperpolarizability in Benzene Including Electron Correlation Treated by Møller-Plesset Theory.
85. B. Kirtman, in Nonlinear Optical Materials: Theory and Modeling, S. P. Karna and A. T. Yeates, Eds., American Chemical Society, Washington, DC, 1996, pp. 58-78. Calculation of Nonlinear Optical Properties of Conjugated Polyenes.
86. F. Sim, D. R. Salahub, and S. Chin, Int. J. Quantum Chem., 43, 463 (1992). The Accurate Calculation of Dipole Moments and Dipole Polarizabilities Using Gaussian-Based Density Functional Methods.
87. R. M. Dickson and A. D. Becke, J. Phys. Chem., 100, 16105 (1996). Local Density-Functional Polarizabilities and Hyperpolarizabilities at the Basis-Set Limit.
88. B. J. Dunlap and S. P. Karna, in Nonlinear Optical Materials: Theory and Modeling, S. P. Karna and A. T. Yeates, Eds., American Chemical Society, Washington, DC, 1996, pp. 164-173. A Combined Hartree-Fock and Local-Density-Functional Method to Calculate Linear and Nonlinear Optical Properties of Molecules.
89. A. M. Lee and S. M. Colwell, J. Chem. Phys., 101, 9704 (1994). The Determination of Hyperpolarizabilities Using Density Functional Theory with Nonlocal Functionals.
90. D. M. Bishop, B. Kirtman, H. A. Kurtz, and J. E. Rice, J. Chem. Phys., 98, 8024 (1993). Calculation of Vibrational Dynamic Hyperpolarizabilities for H2O, CO2, and NH3.
91. D. M. Bishop and B. Kirtman, J. Chem. Phys., 95, 2646 (1991). A Perturbation Method for Calculating Vibrational Dynamic Dipole Polarizabilities and Hyperpolarizabilities.
92. D. M. Bishop and B. Kirtman, J. Chem. Phys., 97, 5255 (1992). Compact Formulas for Vibrational Dynamic Dipole Polarizabilities.
93. B. Kirtman and B. Champagne, Int. Rev. Phys. Chem., 16, 389 (1997). Nonlinear Optical Properties of Quasilinear Conjugated Oligomers, Polymers and Organic Molecules.
94. A. Willetts and J. E. Rice, J. Chem. Phys., 99, 426 (1993). A Study of Solvent Effects on Hyperpolarizabilities: The Reaction Field Model Applied to Acetonitrile.
95. J. Yu and M. C. Zerner, J. Chem. Phys., 100, 7487 (1994). Solvent Effect on the First Hyperpolarizabilities of Conjugated Organic Molecules.
96. K. V. Mikkelsen, Y. Luo, H. Ågren, and P. Jørgensen, J. Chem. Phys., 102, 9362 (1995). Sign Change of Hyperpolarizabilities of Solvated Water.
97. I. D. L. Albert, S. di Bella, D. R. Kanis, T. J. Marks, and M. A. Ratner, in Polymers for Second-Order Nonlinear Optics, American Chemical Society, Washington, DC, 1995, pp. 57-65. Solvent Effects on the Molecular Quadratic Hyperpolarizabilities.
98. R. Cammi, M. Cossi, and J. Tomasi, J. Chem. Phys., 104, 4611 (1996). Analytical Derivatives for Molecular Solutes. III. Hartree-Fock Static Polarizability and Hyperpolarizabilities in the Polarizable Continuum Model.
99. R. Cammi, M. Cossi, B. Mennucci, and J. Tomasi, J. Chem. Phys., 105, 10556 (1996). Analytical Hartree-Fock Calculation of the Dynamic Polarizabilities α, β, and γ of Molecules in Solution.
100. R. Cammi and J. Tomasi, Int. J. Quantum Chem., 29, 465 (1995). Nonequilibrium Solvation Theory for the Polarizable Continuum Model: A New Formulation at the SCF Level with Application to the Case of the Frequency-Dependent Linear Electric Response Function.
101. D. S. Dudis, A. T. Yeates, and H. A. Kurtz, Mater. Res. Soc. Symp. Proc., 247, 93 (1992). Intermolecular Effects on Third-Order Nonlinear Optical Properties.
102. S. Chen and H. A. Kurtz, J. Mol. Struct. (THEOCHEM), 388, 79 (1996). NLO Properties of Interacting Polyene Chains.
CHAPTER 6
Sensitivity Analysis in Biomolecular Simulation

Chung F. Wong,* Tom Thacher,† and Herschel Rabitz‡

*Department of Physiology and Biophysics, Mount Sinai School of Medicine, New York, New York 10029-6574, †Virtual Chemistry, Inc., 7770 Reagents Road #252, San Diego, California 92122, ‡Department of Chemistry, Princeton University, Princeton, New Jersey 08544
INTRODUCTION

Molecular simulation is playing an increasingly important role in studying the properties of complicated systems such as proteins, DNAs, lipids, and complexes of biomolecules.1-5 A key advantage of molecular simulation is that it can help one understand in microscopic detail how the components of a complex biomolecular system, and their interactions, determine the functional properties of the system. Yet sorting out the critical factors affecting the properties of a complex biomolecular system is typically not an obvious task because the intramolecular and intermolecular interactions giving rise to those properties are quite complex. Even with the relatively simple force fields currently being used in biomolecular simulations, one needs to select, from an enormous number of possible factors (easily in the thousands), those that are truly significant in determining a key property of a biomolecular system. In principle, one can do chemical modification or genetic experiments to examine the role of a specific functional group, amino acid residue, or interaction in determining a specific property. This approach is still quite expensive to carry out systematically, either experimentally or computationally. Consider a moderate-sized protein with about 200 residues: since more than 200 mutations or chemical modifications are needed to systematically map the residues that are significant in determining a specific property of the protein, this would be a very time-consuming and expensive assessment to carry out. New strategies need to be developed to identify efficiently the determinants of biomolecular structures and functions, and to guide the design of novel bioactive molecules.

Many problems in engineering research are analogous, in principle, to the problem of dissecting the determinants of biomolecular properties. For example, chemical engineers may be interested in optimizing the performance of a chemical reactor. To achieve this goal, they need to identify the key parameters of the reactor that control its performance and then examine how these parameters can be optimized to improve the reactor's output. Models can be set up to simulate the reactions that take place in a chemical reactor. The parameters of the model can then be varied one by one, to examine how each affects the performance of the chemical reactor. The most significant parameters controlling the performance of the reactor can be identified and then optimized to improve this performance. It is difficult to carry out a sensitivity analysis in this manner when multiple parameters influence the performance of the reactor, however, because many simulations with different sets of model parameters need to be carried out. To overcome this problem, engineers have employed a simple trick. They calculate derivatives of observables/properties of a simulation model with respect to the parameters of the model instead of making many explicit variations of model parameters. This clever approach makes it much easier to identify the determinants of system properties in a systematic manner.

This chapter focuses on a recent development: the extension of this idea of sensitivity analysis to the domain of biomolecular modeling. Readers who are interested in studying sensitivity analysis in engineering research, chemical kinetics, or small-molecule dynamics can refer to books and reviews that have been published.6-11 This chapter presents four applications of sensitivity analysis in biomolecular modeling: the identification of the determinants of biomolecular properties, the design of bioactive molecules, the study of error propagation in simulating biomolecular properties arising from nonoptimal biomolecular force field parameters, and the refinement of potential models for biomolecular simulations.
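The contrast drawn above between varying parameters one by one and computing derivatives at a single reference point can be illustrated with a toy reactor model. Everything in this sketch is an assumption made for illustration: a first-order reaction with an Arrhenius rate constant, the parameter names and values, and the use of normalized sensitivities for ranking. It is not a model taken from the chapter.

```python
import math

R_GAS = 8.314  # gas constant, J/(mol K)


def conversion(p):
    """Fractional conversion X = 1 - exp(-k t) with k = A exp(-Ea / (R T))."""
    k = p["A"] * math.exp(-p["Ea"] / (R_GAS * p["T"]))
    return 1.0 - math.exp(-k * p["t"])


def analytic_sensitivities(p):
    """All first-order sensitivities dX/dp from one reference evaluation."""
    k = p["A"] * math.exp(-p["Ea"] / (R_GAS * p["T"]))
    decay = math.exp(-k * p["t"])
    dX_dk = p["t"] * decay
    return {
        "A": dX_dk * k / p["A"],
        "Ea": -dX_dk * k / (R_GAS * p["T"]),
        "T": dX_dk * k * p["Ea"] / (R_GAS * p["T"] ** 2),
        "t": k * decay,
    }


def finite_difference_sensitivities(p, rel_step=1e-6):
    """One-at-a-time central differences: two extra model runs per parameter."""
    result = {}
    for name, value in p.items():
        h = rel_step * abs(value)
        up = dict(p, **{name: value + h})
        down = dict(p, **{name: value - h})
        result[name] = (conversion(up) - conversion(down)) / (2.0 * h)
    return result


reference = {"A": 1.0e7, "Ea": 6.0e4, "T": 350.0, "t": 100.0}
exact = analytic_sensitivities(reference)
numeric = finite_difference_sensitivities(reference)

# Rank parameters by normalized sensitivity, p * dX/dp.
for name in sorted(reference, key=lambda n: -abs(reference[n] * exact[n])):
    print(f"{name:>2}: analytic {exact[name]: .3e}  numeric {numeric[name]: .3e}")
```

For this four-parameter toy the explicit route costs eight extra model evaluations; with thousands of parameters and a long simulation per evaluation, that cost is what makes the derivative route attractive.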
METHODS As mentioned earlier, an efficient means of probing the sensitivity of the properties of a model biomolecular system is to calculate the derivatives of the properties of the system with respect to the model parameters. (Model parameters are not limited to force field parameters: consider, e.g., nonbonded cutoffs for electrostatic interactions.) Because these derivatives probe the responses of
Methods 283
the properties of a biomolecular system when different model parameters are perturbed, they are also called sensitivity coefficients. Model parameters that play no role in determining a given set of properties give sensitivity coefficients with negligible or zero magnitude. On the other hand, model parameters that give large sensitivity coefficients can reveal the key determinants of the properties of the biomolecular system. Sensitivity coefficients are classified as first order, second order, and so on, depending on the order of the derivatives. In most applications, only first- and second-order sensitivity coefficients are calculated and analyzed. Each first-order sensitivity coefficient provides a one-toone relationship between a property and a model parameter. A second-order sensitivity coefficient illuminates the cooperative or anticooperative effects of two model parameters. It is impractical to calculate and study higher order sensitivity coefficients because long simulations are usually required to obtain reliable values for them. Alternatively, one can carry out a singular value decomposition12 (which is equivalent to a principal component a n a l y ~ i s ’ ~of) a sensitivity matrix whose elements are sensitivity coefficients to study how a group of model parameters act together to affect a group of biomolecular properties. Examples of this are presented later. The computational efficiency of a sensitivity analysis stems from the analyst’s ability to obtain all the sensitivity coefficients by carrying out simulations on only one reference system. This is in contrast to making explicit changes to model parameters and then repeating the simulations hundreds or thousands of times. Parameters of different types usually exist in a simulation model. 
Some of the most interesting parameters to include in a sensitivity analysis are the ones that determine the intra- and intermolecular interactions, because the properties of a biomolecular system at any given thermodynamic state are determined solely by these interactions. An interesting multifaceted question to ask is: Which features of the intramolecular and intermolecular interactions are most significant in determining a particular property of a biomolecular system, and to which components of the system can these features be mostly attributed? Such investigations can help to decipher the determinants of biomolecular structures and functions, and they can suggest how (bio)molecules may be modified to alter their structural or functional properties in a well-planned manner. (In this chapter, we use “biomolecules” in statements that pertain specifically to biomolecules (proteins, DNAs, etc.), and we use “(bio)molecules” in statements that pertain to either biomolecules or small molecules such as therapeutic drugs.) The interactions among the atoms of a biomolecular system are usually modeled by approximate potential energy functions such as the one shown in Eq. [1]:1-3,14,15

$$V = \sum_{\text{bonds}} \tfrac{1}{2} K_b (b - b_0)^2 + \sum_{\text{angles}} \tfrac{1}{2} K_\theta (\theta - \theta_0)^2 + \sum_{\text{dihedrals}} K_\phi \left[1 + \cos(n\phi - \delta)\right] + \sum_{\text{pairs } (i,j)} \left[ \frac{A_{ij}}{r_{ij}^{12}} - \frac{B_{ij}}{r_{ij}^{6}} + \frac{q_i q_j}{\epsilon\, r_{ij}} \right] \qquad [1]$$

where the first sum is over all the bonds, the second sum is over all the bond angles, the third sum is over all the dihedral angles, and the last sum is over all the nonbonded pair interactions. In this equation, the harmonic approximation is used to describe the bond-stretching and angle-bending energies, and sinusoidal functions are used to describe energy changes due to rotations about bonds. The last term is the nonbonded interaction term, which is a sum over all pairs of atoms that are separated by at least a specified number of bonds (three, e.g., in the GROMOS16 force field). In this example, the nonbonded interactions are modeled by a sum of Lennard-Jones potentials and Coulombic-type electrostatic potentials. To identify the model parameters that are most significant in affecting biomolecular properties, one needs to calculate many sensitivity coefficients relating these properties to the parameters. For any property characterized by the variable $O$, whose ensemble-averaged value is $\langle O \rangle$, the derivative of $\langle O \rangle$ with respect to a parameter $\lambda_i$ can be shown to be

$$\frac{\partial \langle O \rangle}{\partial \lambda_i} = \left\langle \frac{\partial O}{\partial \lambda_i} \right\rangle - \beta \left[ \left\langle O \frac{\partial H}{\partial \lambda_i} \right\rangle - \langle O \rangle \left\langle \frac{\partial H}{\partial \lambda_i} \right\rangle \right] \qquad [2]$$
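In Eq. [2], $H$ is the classical Hamiltonian of the system of interest and $\beta = 1/k_B T$, in which $k_B$ and $T$ are the Boltzmann constant and the absolute temperature, respectively. If $O$ does not depend explicitly on the parameter $\lambda_i$, Eq. [2] simplifies to:

$$\frac{\partial \langle O \rangle}{\partial \lambda_i} = -\beta \left[ \left\langle O \frac{\partial H}{\partial \lambda_i} \right\rangle - \langle O \rangle \left\langle \frac{\partial H}{\partial \lambda_i} \right\rangle \right] \qquad [3]$$

so that the parametric derivative of $\langle O \rangle$ is simply proportional to the covariance of $O$ and the partial derivative of $H$ with respect to $\lambda_i$. (Equations [2] and [3] and other similar equations shown below allow partial derivatives to be calculated analytically instead of numerically.) To facilitate a comparison of different sensitivity coefficients, it is sometimes convenient to calculate the fractional change of a property due to a fractional change of a model parameter. It is then useful to calculate dimensionless normalized sensitivity coefficients of the form

$$\frac{\partial \ln \langle O \rangle}{\partial \ln \lambda_i} = \frac{\lambda_i}{\langle O \rangle} \frac{\partial \langle O \rangle}{\partial \lambda_i} \qquad [4]$$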
It is useful to point out that two sensitivity coefficients associated with parameters of the same type may have different values. For example, it is common in current biomolecular force fields to use the same force field parameter for chemically similar groups (e.g., the atomic partial charges for all the amide nitrogens have the same value in commonly used force fields such as those in the GROMOS,16 CHARMM,20 and AMBER21 molecular modeling packages).
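As a minimal numerical sketch of the covariance formula of Eq. [3] and of the normalized coefficient described above, consider a one-dimensional harmonic oscillator with H = kx²/2 and O = x², for which the ensemble average of O is k_BT/k and the exact derivative with respect to k is −k_BT/k². The toy system, the function names, and the numerical values are assumptions chosen for illustration; the configurations are drawn directly from the Boltzmann distribution rather than from a molecular dynamics trajectory.

```python
import numpy as np


def sensitivity_from_covariance(obs, dH_dlam, beta):
    """Estimate d<O>/d(lambda) = -beta * cov(O, dH/dlambda) from samples (Eq. [3])."""
    obs = np.asarray(obs, dtype=float)
    dH_dlam = np.asarray(dH_dlam, dtype=float)
    covariance = np.mean(obs * dH_dlam) - np.mean(obs) * np.mean(dH_dlam)
    return -beta * covariance


def normalized_sensitivity(obs, dH_dlam, beta, lam):
    """Dimensionless coefficient (lambda / <O>) * d<O>/d(lambda)."""
    return lam / np.mean(obs) * sensitivity_from_covariance(obs, dH_dlam, beta)


# Toy system (an assumption for illustration): H = 0.5 * k * x**2, O = x**2.
kT = 0.6   # thermal energy, arbitrary units
k = 2.0    # force constant, playing the role of the parameter lambda
rng = np.random.default_rng(0)
x = rng.normal(0.0, np.sqrt(kT / k), size=1_000_000)  # exact Boltzmann samples

obs = x**2          # O evaluated for each configuration
dH_dk = 0.5 * x**2  # dH/dk evaluated for each configuration

estimate = sensitivity_from_covariance(obs, dH_dk, 1.0 / kT)
exact = -kT / k**2  # analytic derivative of <x^2> = kT / k
normalized = normalized_sensitivity(obs, dH_dk, 1.0 / kT, k)

print(f"covariance estimate: {estimate:.4f}, analytic value: {exact:.4f}")
print(f"normalized coefficient: {normalized:.3f} (analytic value is -1)")
```

Both quantities are obtained from a single reference ensemble; the force constant is never changed explicitly.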
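If, however, the environments of the atoms with which these parameters are associated differ, these identical parameters may generate different sensitivities for a given property of a protein. In fact, these identical force field parameters can be viewed as “probes” for studying different local environments of a protein, just as a spectroscopic probe is used to learn about the structural and dynamical properties of different parts of a protein. Therefore, sensitivity analysis provides an efficient tool for thoroughly probing the properties of different parts of a protein. A second-order sensitivity coefficient describing the cooperative/anticooperative effects of two model parameters, $\lambda_i$ and $\lambda_j$, in affecting an ensemble-averaged property has the general form:

$$\frac{\partial^2 \langle O \rangle}{\partial \lambda_i \partial \lambda_j} = \left\langle \frac{\partial^2 O}{\partial \lambda_i \partial \lambda_j} \right\rangle - \beta \left[ \left\langle \delta\!\left(\frac{\partial O}{\partial \lambda_i}\right) \delta\!\left(\frac{\partial H}{\partial \lambda_j}\right) \right\rangle + \left\langle \delta\!\left(\frac{\partial O}{\partial \lambda_j}\right) \delta\!\left(\frac{\partial H}{\partial \lambda_i}\right) \right\rangle + \left\langle \delta O \, \delta\!\left(\frac{\partial^2 H}{\partial \lambda_i \partial \lambda_j}\right) \right\rangle \right] + \beta^2 \left\langle \delta O \, \delta\!\left(\frac{\partial H}{\partial \lambda_i}\right) \delta\!\left(\frac{\partial H}{\partial \lambda_j}\right) \right\rangle \qquad [5]$$

where $\delta X = X - \langle X \rangle$ denotes the deviation of a quantity $X$ from its ensemble average. When the operator $O$ does not depend explicitly on the parameters $\lambda_i$ and $\lambda_j$, Eq. [5] simplifies to:

$$\frac{\partial^2 \langle O \rangle}{\partial \lambda_i \partial \lambda_j} = -\beta \left\langle \delta O \, \delta\!\left(\frac{\partial^2 H}{\partial \lambda_i \partial \lambda_j}\right) \right\rangle + \beta^2 \left\langle \delta O \, \delta\!\left(\frac{\partial H}{\partial \lambda_i}\right) \delta\!\left(\frac{\partial H}{\partial \lambda_j}\right) \right\rangle \qquad [6]$$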
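The formulas involving thermodynamic properties, however, are somewhat different because the formula for calculating a thermodynamic quantity differs from that for calculating an ensemble-averaged quantity. For example, the first-order sensitivity coefficient relating the Helmholtz free energy $A$ of a biomolecular system to a potential parameter $\lambda_i$ is expressed in the form

$$\frac{\partial A}{\partial \lambda_i} = \left\langle \frac{\partial H}{\partial \lambda_i} \right\rangle \qquad [7]$$

The second-order sensitivity coefficient relating the free energy $A$ of a biomolecular system and two potential parameters $\lambda_i$ and $\lambda_j$ is expressed in the form

$$\frac{\partial^2 A}{\partial \lambda_i \partial \lambda_j} = \left\langle \frac{\partial^2 H}{\partial \lambda_i \partial \lambda_j} \right\rangle - \beta \left[ \left\langle \frac{\partial H}{\partial \lambda_i} \frac{\partial H}{\partial \lambda_j} \right\rangle - \left\langle \frac{\partial H}{\partial \lambda_i} \right\rangle \left\langle \frac{\partial H}{\partial \lambda_j} \right\rangle \right] \qquad [8]$$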
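The free energy formulas of Eqs. [7] and [8] can be checked in the same spirit. The sketch below assumes a one-dimensional harmonic oscillator with H = kx²/2, for which A = (k_BT/2) ln k plus a constant, so that the first and second derivatives with respect to k are k_BT/2k and −k_BT/2k². The function names and numerical values are hypothetical and serve only as an illustration.

```python
import numpy as np


def free_energy_first_order(dH_dlam_i):
    """Eq. [7]: dA/d(lambda_i) = <dH/d(lambda_i)>."""
    return float(np.mean(dH_dlam_i))


def free_energy_second_order(dH_dlam_i, dH_dlam_j, d2H_dlam_ij, beta):
    """Eq. [8]: <d2H/dli dlj> - beta * cov(dH/dli, dH/dlj)."""
    dH_i = np.asarray(dH_dlam_i, dtype=float)
    dH_j = np.asarray(dH_dlam_j, dtype=float)
    covariance = np.mean(dH_i * dH_j) - np.mean(dH_i) * np.mean(dH_j)
    return float(np.mean(d2H_dlam_ij) - beta * covariance)


# Toy system (an assumption for illustration): H = 0.5 * k * x**2.
kT = 0.6
k = 2.0
rng = np.random.default_rng(1)
x = rng.normal(0.0, np.sqrt(kT / k), size=1_000_000)  # exact Boltzmann samples

dH_dk = 0.5 * x**2          # first derivative of H with respect to k
d2H_dk2 = np.zeros_like(x)  # H is linear in k, so the second derivative vanishes

first = free_energy_first_order(dH_dk)
second = free_energy_second_order(dH_dk, dH_dk, d2H_dk2, 1.0 / kT)

first_exact = kT / (2.0 * k)
second_exact = -kT / (2.0 * k**2)

print(f"dA/dk:    {first:.4f} (analytic {first_exact:.4f})")
print(f"d2A/dk2: {second:.4f} (analytic {second_exact:.4f})")
```

The second-order coefficient depends on a fluctuation and therefore carries a larger relative statistical error than the first-order coefficient for the same number of samples.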
A special type of sensitivity coefficient probes the structural responses of a biomolecule to perturbations introduced to different parts of the biomolecule. In molecular mechanics,22,23 a Green’s function matrix containing this information can be derived as follows. The x, y, or z component of a force $F_i$ acting on an atom of a molecule is given by

$$F_i = -\frac{\partial V}{\partial x_i} \qquad [9]$$
where $x_i$ is the x, y, or z component of the Cartesian coordinate of an atom. Taking the derivative of the force with respect to a potential parameter $\lambda_j$, one obtains the following equation for the parametric derivative of the force:
The condition
applies when a molecule is at a local or global minimum of its potential energy surface. Equation [10] can also be written in matrix form as follows:
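
$$-\mathbf{H}\mathbf{S} + \mathbf{M} = \mathbf{0} \qquad [12]$$

where $\mathbf{H}$ is the Hessian matrix containing the elements $\partial^2 V/\partial x_i \partial x_j$, $\mathbf{S}$ is a sensitivity matrix containing the parametric derivatives $\partial x_i/\partial \lambda_j$ as matrix elements, and matrix $\mathbf{M}$ is composed of the elements $\partial^2 V/\partial x_i \partial \lambda_j$. A Green’s function matrix satisfying the relation

$$\mathbf{H}\mathbf{G} = \mathbf{I} \qquad [13]$$

where $\mathbf{I}$ is a unit matrix, can be constructed so that the sensitivity matrix can be obtained as follows:

$$\mathbf{S} = \mathbf{G}\mathbf{M} \qquad [14]$$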
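The matrix relations above can be illustrated with a small worked example. The sketch below assumes a hypothetical two-coordinate model, V = k1(x1 − a)²/2 + k2(x2 − x1 − b)²/2, with the reference lengths a and b as parameters. It also assumes a sign convention: the elements of M are taken as the explicit parametric derivatives of the internal forces, ∂F_i/∂λ_j = −∂²V/∂x_i∂λ_j, which is the choice under which S = GM reproduces finite-difference derivatives of the minimized structure. The model, the names, and the numerical values are not taken from the source.

```python
import numpy as np

# Toy model (an assumption for illustration): two coordinates on a line,
# V = 0.5*k1*(x1 - a)**2 + 0.5*k2*(x2 - x1 - b)**2, with parameters (a, b).
k1, k2 = 100.0, 50.0
a, b = 1.0, 1.5

# Hessian H_ij = d2V/dx_i dx_j (independent of x for this quadratic model).
H = np.array([[k1 + k2, -k2],
              [-k2, k2]])

# Assumed sign convention: M_ij = dF_i/d(lambda_j) = -d2V/dx_i d(lambda_j).
M = np.array([[k1, -k2],
              [0.0, k2]])

G = np.linalg.inv(H)  # Green's function matrix, H G = I
S = G @ M             # sensitivity matrix, S_ij = dx_i/d(lambda_j)


def minimum(a, b):
    """Coordinates at the energy minimum, obtained by solving grad V = 0."""
    rhs = np.array([k1 * a - k2 * b, k2 * b])
    return np.linalg.solve(H, rhs)


# Central finite differences of the minimized structure as an independent check.
h = 1e-4
S_fd = np.column_stack([
    (minimum(a + h, b) - minimum(a - h, b)) / (2 * h),
    (minimum(a, b + h) - minimum(a, b - h)) / (2 * h),
])

print("S from the Green's function:\n", S)
print("S from finite differences:\n", S_fd)
```

For this model the minimum lies at x1 = a and x2 = a + b, so the first coordinate responds only to a whereas the second responds equally to a and b; the Green's function route recovers this without reminimizing the structure for each perturbed parameter.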
Because matrix $\mathbf{S}$ contains parametric derivatives of structural variables and matrix $\mathbf{M}$ contains parametric derivatives of forces applied to a molecule, the elements of the Green’s function matrix $\mathbf{G}$ describe structural responses of the molecule when small forces are applied to the atoms of the molecule. (If $F_i = -\partial V/\partial x_i$ is an internal force arising from an intramolecular interaction potential $V$, $-F_i = \partial V/\partial x_i$ can be considered to be a force applied to an atom of a molecule.) One can follow a similar approach to derive a more general Green’s function that measures the change of an ensemble-averaged structural coordinate, rather than a structural coordinate of a molecule at a local or global energy minimum, when a force is applied to an atom of the molecule. Derivation through such a Green’s function approach is more complicated, but there is an easier way to derive this general Green’s function. (We shall call this a generalized Green’s function even though we shall no longer derive this function through a Green’s function approach because the physical insights that this more general function provides are similar to those of the molecular mechanics Green’s function.) In the easier derivation that follows,24 suppose a small perturbation potential $V_p$ of the form:

$$V_p = k_i \left( x_i - \langle x_i \rangle_0 \right) \qquad [15]$$
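is applied to a system in which the atoms are interacting through a potential $V$. In Eq. [15], $x_i$ is a coordinate associated with atom i, $k_i$ is a constant, and $\langle \cdots \rangle_0$ denotes an average over an ensemble of an unperturbed system (i.e., a system with no perturbation potential $V_p$ applied). The perturbation potential $V_p$ imposes a force of $-k_i$ on atom i. An ensemble-averaged coordinate of atom j, $\langle x_j \rangle$, of the system with an applied perturbation potential $V_p$ can be written as

$$\langle x_j \rangle = \frac{\int x_j \, e^{-\beta (V + V_p)} \, d\Gamma}{\int e^{-\beta (V + V_p)} \, d\Gamma} \qquad [16]$$

where $\int \cdots \, d\Gamma$ denotes an integration over all the atomic coordinates. For a small value of $k_i$, Eq. [16] can be approximated by

$$\langle x_j \rangle \approx \frac{\int x_j \, (1 - \beta V_p) \, e^{-\beta V} \, d\Gamma}{\int (1 - \beta V_p) \, e^{-\beta V} \, d\Gamma} \qquad [17]$$

Equation [17] also can be written in the form

$$\langle x_j \rangle \approx \langle x_j \rangle_0 - \beta k_i \langle \Delta x_j \Delta x_i \rangle_0 \qquad [18]$$

where $\Delta x_l = x_l - \langle x_l \rangle_0$. If one expands $\langle x_j \rangle$ on the left-hand side of Eq. [18] as

$$\langle x_j \rangle_0 + \left( \frac{\partial \langle x_j \rangle}{\partial k_i} \right)_0 k_i + \cdots = \langle x_j \rangle_0 - \beta k_i \langle \Delta x_j \Delta x_i \rangle_0 \qquad [19]$$

and compares the coefficients of the linear terms in $k_i$ on both sides of Eq. [19], noting that

$$\frac{\partial \langle x_j \rangle}{\partial k_i} = -\frac{\partial \langle x_j \rangle}{\partial f_i} \qquad [20]$$

the elements $G_{ji}$ of a generalized Green’s function matrix $\mathbf{G}$ can be obtained from the familiar displacement correlation function

$$\langle \Delta x_j \Delta x_i \rangle_0 \qquad [21]$$

as

$$G_{ji} = \frac{\partial \langle x_j \rangle}{\partial f_i} = \beta \langle \Delta x_j \Delta x_i \rangle_0 \qquad [22]$$

where $f_i = -k_i$ is the x, y, or z component of an external force applied to atom i. Each element $G_{ji}$ of the Green’s function matrix measures the extent to which the averaged coordinate $\langle x_j \rangle$ is changed as a result of a small force applied to atom i. In its most general form, the Green’s function matrix is of order 3N × 3N, where N is the number of atoms in the system. However, rigid body translation is usually not of interest, and its contribution to the Green’s function matrix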
can be easily removed. The overall tumbling motion of a molecule, on the other hand, is typically coupled to the intramolecular vibrations of the molecule. In many cases, though, it may be a good approximation to neglect the coupling between the overall tumbling motion and the intramolecular vibrations of a biomolecule, and this contribution to the Green’s function matrix can also be removed, albeit approximately. (Better methods than those currently used are needed for removing the contribution of molecular rotation to Green’s function matrix elements. A recent work on calculating positional covariance matrices from molecular dynamics trajectories showed the results to be dependent on how the rotational components were approximately removed from the positional covariance matrices.25) The resulting Green’s function matrix thus mainly accounts for structural deformations when small forces are applied to different parts of the molecule. The essence of Eq. [22] is that it quantitatively predicts the extent to which the average position of an atom is perturbed when a small force is applied to another atom, in the linear response limit. The extent of the perturbation of the position of an atom by a force applied to another atom depends both on the degree of correlation of the two atoms and on the positional fluctuations of the two atoms about their means. The approach taken here to deriving the generalized Green’s function matrix has several advantages over the molecular mechanics approach described earlier. (1) The molecular mechanics approach only considers a molecule at a local minimum, whereas the generalized Green’s function approach accounts for an ensemble of structures accessible at physiological temperatures. (2) The calculation of a molecular mechanics Green’s function matrix requires the inversion of a Hessian matrix $\mathbf{H}$ of order 3N × 3N, where N is the number of atoms in a molecule; this is difficult to carry out for large biomolecules.
The generalized Green’s function approach, on the other hand, does not require the inversion of a large matrix, and it can therefore be applied more easily to large biomolecules. (3) Solvent effects are easier to include in the generalized Green’s function approach. In the molecular mechanics method, if the Green’s function matrix elements of solute atoms embedded in an explicit solvent environment are sought, one in principle needs to include the solvent coordinates in the Hessian matrix and invert this whole matrix to account for the effects of the solvent on the Green’s function matrix elements. It is probably a good approximation to neglect the matrix elements involving solvent coordinates in the Hessian matrix when only the solute matrix elements of the molecular mechanics Green’s function are needed. Nevertheless, the generalized Green’s function approach does not introduce this additional approximation if the trajectory used to calculate a positional covariance matrix is obtained from a simulation that already includes solvation effects (i.e., by the use of an explicit solvent model in a molecular dynamics simulation).
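The relation between the generalized Green’s function and the positional covariance matrix can be sketched numerically. The example below assumes a hypothetical two-coordinate harmonic system, for which the Boltzmann distribution is Gaussian and β times the covariance matrix should reproduce the inverse Hessian of the molecular mechanics approach. Configurations are sampled directly rather than taken from a trajectory, and no removal of overall translation or rotation is attempted; all names and values are illustrative assumptions.

```python
import numpy as np


def generalized_greens_function(coords, kT):
    """G = beta * <dx dx^T>, estimated from configurations stored as rows of coords."""
    coords = np.asarray(coords, dtype=float)
    dx = coords - coords.mean(axis=0)
    return dx.T @ dx / (len(coords) * kT)


# Hypothetical harmonic system with Hessian H (same toy form as a two-spring chain).
kT = 0.6
k1, k2 = 100.0, 50.0
H = np.array([[k1 + k2, -k2],
              [-k2, k2]])

# For a harmonic potential the Boltzmann distribution is Gaussian
# with covariance kT * H^-1 about the energy minimum.
rng = np.random.default_rng(2)
samples = rng.multivariate_normal(mean=[1.0, 2.5],
                                  cov=kT * np.linalg.inv(H),
                                  size=200_000)

G = generalized_greens_function(samples, kT)

# Linear-response shift of the average coordinates under a small external force.
force = np.array([0.0, 0.5])
predicted_shift = G @ force
exact_shift = np.linalg.inv(H) @ force

print("G from fluctuations:\n", G)
print("inverse Hessian:\n", np.linalg.inv(H))
print("predicted shift:", predicted_shift, "exact shift:", exact_shift)
```

No matrix inversion is needed to obtain G from the sampled configurations; the inverse Hessian appears here only as a reference for this harmonic test case.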
A Green’s function matrix can be manipulated in different ways to gain different insights into the structure and function of a biomolecule. For example, one can calculate

$$G^2_{(\text{atom } k)} = \sum_{j \in (\text{atom } k)} \sum_{i=1}^{3N} G_{ij}^2 \qquad [23]$$

for a protein molecule. (The first sum is over the three Cartesian components of atom k.) This quantity provides a general measure of the structural response of the protein when atom k is perturbed. Atoms associated with large values of $G^2_{(\text{atom } k)}$ may play important roles in determining the structure of the protein. One can also study collective structural responses of a protein molecule by diagonalizing a Green’s function matrix. Such an analysis is somewhat similar to a quasi-normal mode analysis.26-28 In a quasi-normal mode analysis, one focuses on obtaining the effective normal modes of a molecule by constructing an effective Hessian matrix from a molecular dynamics trajectory so that anharmonic effects can be accounted for approximately. The eigenvalues of the effective normal modes can be used to provide estimates of thermodynamic quantities, and the eigenvectors can give useful insights into the local and collective motions of the molecule. In a Green’s function analysis, on the other hand, one can directly identify groups of atoms that cooperate to produce large (or small) local or global structural responses of a molecule by diagonalizing a Green’s function matrix (see section on Applications). For large biomolecules, the diagonalization of a Green’s function matrix of order 3N × 3N is difficult. However, the principal component analysis technique13 provides a useful alternative. As discussed earlier,29 a principal component analysis can be achieved through a singular value decomposition12 of an m × n matrix, where m and n are integers and need not be equal. When a singular value decomposition12 is applied to a Green’s function submatrix of dimension m × n, one can study how small forces applied to the part of the protein defined by n propagate to affect structure in the part of the molecule defined by m. A comparison of this approach with some similar methods is discussed later.
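The two manipulations described above can be sketched as follows. The example assumes that the per-atom response measure is the sum of squared Green’s function elements over the three columns belonging to an atom, and it uses a randomly generated symmetric matrix as a stand-in for a Green’s function matrix of four atoms. The function names, the atom selections, and the matrix itself are hypothetical.

```python
import numpy as np


def atom_response(G):
    """Sum of squared elements of G over all rows and the three columns of each atom."""
    n_atoms = G.shape[0] // 3
    column_sums = np.sum(G**2, axis=0)  # sum over every row i for each column j
    return column_sums.reshape(n_atoms, 3).sum(axis=1)


def coordinate_indices(atoms):
    """Indices of the x, y, and z coordinates of the listed atoms."""
    return np.concatenate([np.arange(3 * a, 3 * a + 3) for a in atoms])


def submatrix_svd(G, response_atoms, perturbed_atoms):
    """SVD of the block of G linking forces on one set of atoms to responses in another."""
    rows = coordinate_indices(response_atoms)
    cols = coordinate_indices(perturbed_atoms)
    sub = G[np.ix_(rows, cols)]
    U, singular_values, Vt = np.linalg.svd(sub, full_matrices=False)
    return sub, U, singular_values, Vt


# Stand-in Green's function matrix for 4 atoms (symmetric, positive semidefinite).
rng = np.random.default_rng(3)
A = rng.normal(size=(12, 12))
G = A @ A.T / 12.0

response = atom_response(G)
sub, U, s, Vt = submatrix_svd(G, response_atoms=[0, 1, 2], perturbed_atoms=[3])

print("per-atom response measure:", response)
print("singular values of the 9 x 3 submatrix:", s)
print("force pattern on atom 3 with the largest effect:", Vt[0])
```

The leading right singular vector gives the direction of force on the perturbed atom that produces the largest structural response in the selected part of the molecule, and the corresponding left singular vector gives the pattern of that response.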
The singular value decomposition (SVD) method,12 and the similar principal component analysis method,13 are powerful computational tools for parametric sensitivity analysis of the collective effects of a group of model parameters on a group of simulated properties. The SVD method is based on an elegant theorem of linear algebra.12 The theorem states that one can represent an m × n matrix $\mathbf{M}$ by a product of three matrices:

$$\mathbf{M} = \mathbf{U}\mathbf{D}\mathbf{V}^{T} \qquad [24]$$

where $\mathbf{U}$ is an m × n matrix that satisfies the relation $\mathbf{U}^{T}\mathbf{U} = \mathbf{I}$, in which $\mathbf{I}$ is the unit matrix; $\mathbf{D}$ is a diagonal matrix whose diagonal elements are known as singular values; and $\mathbf{V}$ is an orthogonal matrix that diagonalizes $\mathbf{M}^{T}\mathbf{M}$. To use this theorem to study cooperative effects of potential parameters on simulated biomolecular properties, one can take $\mathbf{M}$ to be a first-order sensitivity matrix $\mathbf{S}$ containing the elements $\partial O_i/\partial \lambda_j$. For cases involving small perturbations and responses, one can use the following equation to estimate the responses of biomolecular properties to small perturbations of model parameters:

$$d\mathbf{O} = \mathbf{S}\, d\boldsymbol{\lambda} \qquad [25]$$

in which the vector $d\mathbf{O}$ contains the elements $dO_j$ of parametric responses of computed properties, and the vector $d\boldsymbol{\lambda}$ contains the elements $d\lambda_i$ of parameter perturbations. By performing a singular value decomposition on $\mathbf{S}$, one can write $d\mathbf{O}$ as follows:

$$d\mathbf{O} = \mathbf{U}\mathbf{D}\mathbf{V}^{T} d\boldsymbol{\lambda} \qquad [26a]$$

Multiplying on the left by the transpose $\mathbf{U}^{T}$ gives

$$\mathbf{U}^{T} d\mathbf{O} = \mathbf{D}\mathbf{V}^{T} d\boldsymbol{\lambda} \qquad [26b]$$

Then

$$d\mathbf{O}' = \mathbf{D}\, d\boldsymbol{\lambda}' \qquad [26c]$$

where

$$d\mathbf{O}' = \mathbf{U}^{T} d\mathbf{O} \qquad [26d]$$

and

$$d\boldsymbol{\lambda}' = \mathbf{V}^{T} d\boldsymbol{\lambda} \qquad [26e]$$

Therefore, the elements of $d\mathbf{O}'$ are linear combinations of the elements of $d\mathbf{O}$, and the elements of $d\boldsymbol{\lambda}'$ are linear combinations of the elements of $d\boldsymbol{\lambda}$. Because $\mathbf{D}$ is a diagonal matrix, Eq. [26c] represents a set of n decoupled equations. Each equation relates an element of $d\mathbf{O}'$ to an element of $d\boldsymbol{\lambda}'$, that is,

$$dO'_i = D_{ii}\, d\lambda'_i \qquad [27]$$

where $dO'_i$ and $d\lambda'_i$ are the ith elements of the vectors $d\mathbf{O}'$ and $d\boldsymbol{\lambda}'$, respectively, and $D_{ii}$ is the corresponding diagonal element of the matrix $\mathbf{D}$. Therefore, each element of $d\boldsymbol{\lambda}'$ is mapped into an element of $d\mathbf{O}'$ by the scaling factor $D_{ii}$. Because each $dO'_i$ and each $d\lambda'_i$ are linear combinations of the elements of $d\mathbf{O}$ and $d\boldsymbol{\lambda}$, respectively, each of the n decoupled equations describes how a linear combination of parameters affects a linear combination of observables/properties. The equations associated with the largest $D_{ii}$ identify the linear combination of parameters having the most profound influence on a linear combination of biomolecular properties.
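The decoupling described above can be verified with a few lines of linear algebra. The sketch below uses a randomly generated 5 × 3 matrix as a stand-in for a sensitivity matrix (five properties, three parameters) and a hypothetical small parameter perturbation; neither comes from a simulation.

```python
import numpy as np

# Stand-in sensitivity matrix: 5 properties (rows) by 3 parameters (columns).
rng = np.random.default_rng(4)
S = rng.normal(size=(5, 3))
d_lambda = np.array([0.01, -0.02, 0.005])  # hypothetical small parameter changes

# Thin SVD: S = U @ diag(d) @ Vt, with U of shape (5, 3) and Vt of shape (3, 3).
U, d, Vt = np.linalg.svd(S, full_matrices=False)

d_O = S @ d_lambda            # response of the properties
d_O_prime = U.T @ d_O         # transformed responses
d_lambda_prime = Vt @ d_lambda  # transformed parameter perturbations

# Each transformed response depends on a single transformed parameter.
for i in range(len(d)):
    print(f"dO'_{i} = {d_O_prime[i]: .5f}   D_ii * dlambda'_{i} = {d[i] * d_lambda_prime[i]: .5f}")

# Combination of parameters with the largest influence, and the combination
# of properties it affects.
dominant_parameter_combination = Vt[0]
dominant_property_combination = U[:, 0]
```

Because numpy returns the singular values in descending order, the first row of `Vt` and the first column of `U` correspond to the largest scaling factor.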
DEPENDENCE OF SENSITIVITY RESULTS ON THE CHOICE OF FORCE FIELDS

Depending on the force field used in a molecular dynamics or a Monte Carlo simulation, sensitivity coefficients with apparently similar forms may have very different values. For example, the electrostatic energy of a biomolecule in solution may be modeled by using Coulomb's law with effective atomic partial charges that implicitly approximate the effects of electronic polarization of the atoms of the biomolecule by their surroundings. On the other hand, explicit polarization terms can also be used in a force field to calculate the electrostatic energy of a system. Consider the familiar example of water models. Effective charge models, such as the SPC30 and the TIP3P31 models, are commonly used to study the properties of water and aqueous solutions of biomolecules. The atomic partial charges of these water models give a water monomer a dipole moment larger than the corresponding experimental gas phase value. These larger dipole moments of the water models reflect the effects of electronic polarization of water molecules by their environment in the liquid phase. Without larger magnitudes for the atomic partial charges to yield a larger effective dipole moment of a water molecule, a nonpolarizable water model such as SPC or TIP3P would underestimate the electrostatic energy of liquid water. More sophisticated water models account for atomic or molecular polarizability explicitly, so that water molecules can be polarized differently according to their environment.32-40 In these polarizable water models, the atomic partial charges are usually chosen to reproduce the gas phase dipole moment of a water monomer. These atomic partial charges are different from the corresponding charges used in the effective charge water models.
If one calculates the sensitivity coefficients of different properties of an aqueous system with respect to the atomic partial charges of the water models, the sensitivity coefficients obtained from an effective charge model can be different from those of a polarizable model. This difference was found in calculations of the charge sensitivities of different properties of liquid water with two effective charge models (the SPC30 and the TIP3P31 models) and a polarizable water model17,40,41 (Table 1), although the differences were more pronounced for sensitivity coefficients of some types than for others. Therefore, in using sensitivity coefficients to help identify the determinants of (bio)molecular properties, one needs to keep in mind the types of potential functions used in a simulation. For the example of liquid water, a simulated water property may respond to the perturbation of an atomic partial charge of an effective charge water model and the perturbation of an apparently similar atomic partial charge of a polarizable water model in different ways. In a polarizable water model, one can perturb an atomic partial charge with the electronic polarizability of the atom held fixed. On the other hand, since an atomic partial charge of an effective charge model has implicitly included the effects of electronic polarizability in an approximate manner, perturbing an atomic partial charge of an effective charge model also has the effect of modifying the electronic polarizability of the water model. Therefore, just as in genetic studies, in which different point mutations may reflect different determining factors of the properties of a protein, the perturbation of apparently similar parameters of potential models of different types may reflect different determining factors of the properties of a biomolecular system. A user of sensitivity analysis needs to keep this in mind to make proper inferences from sensitivity data.

Table 1. Sensitivity of Several Properties of Liquid Water to the Perturbation of the Atomic Partial Charges of the SPC, TIP3P, and Polarizable Water Models^a

Sensitivity coefficient          Internal Energy   Pressure          Thermal Pressure        Kirkwood
                                 (kcal/mol)        (atm)             Coefficient (atm/deg)   G_k factor
q_Ox ∂O/∂q_Ox, SPC               -27.4 ± 0.2       -32,000 ± 1000    -1.9 ± 0.6              0.2 ± 0.7
q_Ox ∂O/∂q_Ox, TIP3P             -25.6 ± 0.2       -29,400 ± 700     -2.7 ± 0.5              -0.0 ± 0.4
q_Ox ∂O/∂q_Ox, Polarizable       -22.6 ± 0.2       -17,500 ± 500     -0.7 ± 0.6              0.2 ± 0.6
q_Hyd ∂O/∂q_Hyd, SPC             -4.7 ± 0.6        -58,800 ± 500     -1.3 ± 0.6              0.2 ± 0.7
q_Hyd ∂O/∂q_Hyd, TIP3P           -5.4 ± 0.3        -55,500 ± 400     -1.6 ± 0.5              0.2 ± 0.4
q_Hyd ∂O/∂q_Hyd, Polarizable     -0.2 ± 0.2        -44,700 ± 300     0.3 ± 0.6               0.6 ± 0.5

^a The SPC model is from Refs. 17 and 30, the TIP3P model is from Refs. 17 and 31, and the polarizable water model is from Refs. 38-40. O, q_Ox, and q_Hyd denote, respectively, a property of liquid water, the atomic partial charge of the oxygen in a water model, and the atomic partial charge of the hydrogen in the water model.
CONVERGENCE ISSUES

As in simulating any other properties of (bio)molecular systems, it is important to address the convergence characteristics of sensitivity coefficients of different types. The speed of convergence of sensitivity coefficients depends on the observables/properties and on the model parameters involved. The error bars for the simulation of several properties of liquid water in Table 1 illustrate the convergence characteristics of several types of sensitivity coefficient involving atomic partial charges. It is clear that the charge sensitivities of the internal energy and the pressure of liquid water have much smaller relative statistical errors than that of the Kirkwood $G_k$ factor. The Kirkwood $G_k$ factor is often calculated by using the expression:

$$G_k = \frac{\left\langle \left( \sum_{i=1}^{N} \boldsymbol{\mu}_i \right)^2 \right\rangle}{N \langle \mu^2 \rangle} \qquad [28]$$

where $\boldsymbol{\mu}_i$ is the molecular dipole moment of water molecule i, $\langle \mu^2 \rangle$ is the mean squared molecular dipole moment, and N is the total number of water molecules in a simulated system. Thus, the Kirkwood factor $G_k$ is related to the fluctuation of the collective dipole moment $\sum_{i=1}^{N} \boldsymbol{\mu}_i$ of the water molecules in a simulation cell; it is well known that to obtain reliable estimates of this fluctuational property, one needs to carry out long simulations. Sensitivity coefficients relating the free energy of a system to the atomic partial charges are among the sensitivity coefficients that can converge quickly.
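A minimal sketch of how a Kirkwood factor of this kind might be evaluated from stored molecular dipole moments is given below. It assumes the common normalization by N times the mean squared molecular dipole moment, and it is tested on two synthetic limiting cases rather than on simulation data: uncorrelated dipoles, for which the factor should be close to 1, and perfectly aligned dipoles, for which it equals N. The function name and array layout are assumptions.

```python
import numpy as np


def kirkwood_g_factor(dipoles):
    """Kirkwood factor from dipoles of shape (n_frames, n_molecules, 3).

    Assumed normalization: <|sum_i mu_i|^2> / (N * <|mu|^2>).
    """
    dipoles = np.asarray(dipoles, dtype=float)
    n_molecules = dipoles.shape[1]
    total = dipoles.sum(axis=1)                         # collective dipole per frame
    mean_total_sq = np.mean(np.sum(total**2, axis=1))   # <|M|^2>
    mean_mu_sq = np.mean(np.sum(dipoles**2, axis=2))    # <|mu|^2>
    return mean_total_sq / (n_molecules * mean_mu_sq)


rng = np.random.default_rng(5)
n_frames, n_molecules = 5000, 50

# Limiting case 1: randomly oriented, uncorrelated unit dipoles.
v = rng.normal(size=(n_frames, n_molecules, 3))
uncorrelated = v / np.linalg.norm(v, axis=2, keepdims=True)

# Limiting case 2: all dipoles in a frame point in the same direction.
aligned = np.repeat(uncorrelated[:, :1, :], n_molecules, axis=1)

g_uncorrelated = kirkwood_g_factor(uncorrelated)
g_aligned = kirkwood_g_factor(aligned)

print(f"uncorrelated dipoles: G_k = {g_uncorrelated:.3f}")
print(f"aligned dipoles:      G_k = {g_aligned:.3f}")
```

Because the numerator is the mean square of a collective quantity, each frame contributes a single sample of it, which is one way to see why this property converges slowly compared with single-molecule averages.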
Figure 1 Charge sensitivities of the free energy of a solution of glycine dipeptide in methanol. The first three atoms/extended atoms (CM1, C, and O) build up the acetyl N-terminal blocking group. The next five atoms/extended atoms (N, H, CA, C, and O) represent components of a glycine unit. The last three atoms/extended atoms at the C-terminal end (N, H, and CM2) terminate the peptide as an N-methyl amide.
Figure 1 illustrates the calculation of the charge sensitivity of the free energy of an organic compound, N-acetylglycine-N'-methylamide (also called glycine dipeptide or terminally blocked glycine by some authors), in methanol. One can see that the charge sensitivities of the free energy of this system obtained from three 10 ps segments of a 30 ps molecular dynamics simulation are quite similar, even though a large portion of the φ-ψ space has been sampled during the 30 ps simulation (Figure 2). Figure 3 illustrates results obtained from a simulation of the protein bovine pancreatic trypsin inhibitor (BPTI) in a pseudosolvent environment. The simulation was carried out without including explicit bulk solvent molecules, and the screening effects of the bulk solvent on the intramolecular electrostatic interactions were modeled by scaling the charges of the basic and acidic residues so that they retained a net charge of zero, a method employed by the GROMOS molecular dynamics package.16 Figure 3 shows the similarity of results for the sensitivity of the free energy of the protein upon perturbation of the atomic partial charges of the amide nitrogens obtained from two 70 ps segments of a molecular dynamics simulation, suggesting that most of the sensitivity coefficients had converged reasonably well. However, there exist a few residues (e.g., near residue Arg42) that require longer simulation times to achieve statistics comparable to those for the other sensitivity coefficients. This observation is not surprising. Different regions of a protein can differ in flexibility, and the sensitivity coefficients associated with the partial charges located in those flexible
Figure 2 The φ-ψ space sampled by a 30 ps molecular dynamics simulation of a solution of glycine dipeptide in methanol.
regions are expected to converge more slowly. This behavior was also found in avian pancreatic polypeptide (APP).19 The example presented for BPTI also highlights the discussion above, which indicated that sensitivity coefficients associated with apparently similar
Figure 3 Sensitivity of the free energy of bovine pancreatic trypsin inhibitor to perturbations of the atomic partial charges associated with the amide nitrogens of BPTI.
parameters may be very different. All the sensitivity coefficients presented in Figure 3 were associated with the same single property (the free energy of the protein) and the same type of parameter (the atomic partial charge of the amide nitrogens, which has one value in the GROMOS force field16). Nonetheless, it is clear from Figure 3 that the values of these sensitivity coefficients were quite different, reflecting the different environments of the different amide nitrogens in the protein. Parametric structural sensitivities of the protein avian pancreatic polypeptide have also been studied by Zhang et al.,19 who found that these parametric structural sensitivities converged more slowly than parametric free energy sensitivities. The choice of nonbonded cutoffs is another important factor to consider when one is estimating the statistical errors of simulated properties. Many simulations have been carried out by using nonbonded cutoffs (i.e., the interaction potential between two nonbonded atoms is calculated only when the distance separating them is smaller than a user-chosen value). The nonbonded cutoffs for calculating the long-range electrostatic interactions are often only 8 Å and rarely exceed 15 Å. The neglect of long-range electrostatic interactions beyond these relatively small values of nonbonded cutoffs may lead to overestimates of the fluctuational properties of simulated biomolecular systems. The sensitivity $R_{cut}\,\partial G_k/\partial R_{cut}$ of the Kirkwood $G_k$ factor of liquid water to a choice of the nonbonded cutoff value $R_{cut}$ has been studied.40,41 This cutoff is slightly different from the nonbonded cutoff discussed above. Instead of neglecting the interaction energy between atom A and all the atoms more distant from A than the nonbonded cutoff value, $R_{cut}$ separates the space around an atom into an inner region and an outer region.
In the inner region, the electrostatic interactions between atom A and any atom within the inner region were calculated explicitly. The outer region was treated as a dielectric continuum. With this reaction field type of cutoff, $R_{cut}\,\partial G_k/\partial R_{cut}$ was found to be negative for the flexible SPC model, the flexible TIP3P model, and a flexible polarizable water model,17,40,41 suggesting that a better model for properly including long-range electrostatic interactions could reduce the molecular fluctuational properties. (Recall that the Kirkwood factor is proportional to the fluctuation of the collective dipole moment of the water molecules in a simulation cell.) Cutoff sensitivity was further illustrated by a recent Brownian dynamics simulation in which the fluctuation of the collective dipole moment of a NaCl solution calculated by using a 40 Å nonbonded cutoff was about 25% smaller than that obtained by using a 20 Å cutoff. Because the statistical error of a simulated quantity is related to the fluctuation of the property (larger statistical error being expected for a larger fluctuation), a simulation using shorter nonbonded cutoffs may give a larger statistical error than one using longer cutoffs for simulations of the same duration. Keep in mind, too, that a simulated quantity may be directly affected by the choice of the magnitude of the nonbonded cutoff. The simulations of the potentials of mean force between ion pairs illustrate this.44,45 The choice of nonbonded cutoff values can be an important factor to consider in simulating biomolecules that are stabilized by relatively weak
forces. The larger structural fluctuation resulting from the use of a short nonbonded cutoff may unfold a protein or a DNA during a molecular dynamics or Monte Carlo simulation. This artifact can reveal itself quickly in simulations of a short helix, for example, whose structure is stabilized by relatively few interactions.46
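The practice of comparing sensitivity coefficients obtained from separate segments of a trajectory, as done above for the 10 ps and 70 ps segments, can be sketched as a simple block analysis. The example below assumes the harmonic oscillator test case H = kx²/2 with O = x² and uses statistically independent samples; in a real molecular dynamics trajectory successive frames are correlated, so the blocks would need to be longer than the correlation time for the spread between blocks to be a meaningful error estimate. All names and values are illustrative.

```python
import numpy as np


def block_sensitivities(obs, dH_dlam, beta, n_blocks):
    """Covariance estimate of d<O>/d(lambda) evaluated separately on consecutive blocks."""
    obs_blocks = np.array_split(np.asarray(obs, dtype=float), n_blocks)
    dH_blocks = np.array_split(np.asarray(dH_dlam, dtype=float), n_blocks)
    estimates = []
    for o, h in zip(obs_blocks, dH_blocks):
        covariance = np.mean(o * h) - np.mean(o) * np.mean(h)
        estimates.append(-beta * covariance)
    return np.array(estimates)


# Toy data (an assumption for illustration): H = 0.5 * k * x**2, O = x**2.
kT, k = 0.6, 2.0
rng = np.random.default_rng(6)
x = rng.normal(0.0, np.sqrt(kT / k), size=1_200_000)

blocks = block_sensitivities(x**2, 0.5 * x**2, 1.0 / kT, n_blocks=3)
mean = blocks.mean()
std_err = blocks.std(ddof=1) / np.sqrt(len(blocks))
exact = -kT / k**2

print("block estimates:", blocks)
print(f"mean = {mean:.4f}, standard error = {std_err:.4f}, analytic = {exact:.4f}")
```

Close agreement among the block estimates, as in the three segments of Figure 1, suggests that a coefficient has converged; a coefficient whose block estimates scatter widely calls for a longer simulation.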
APPLICATIONS

Determinants of (Bio)molecular Properties

Sensitivity analysis can be applied in the identification of the features of a force field model or the key components of a (bio)molecular system that are most significant in determining the properties of the system. It is not always straightforward to identify these key features. The interactions operating within typical biomolecular systems are characterized by many short-range and long-range interactions; the complex interplay among these interactions of different types often makes it difficult to predict by intuition alone the role of each in influencing the properties of a (bio)molecular system. Several illustrative examples are given below.
Liquid Water

Liquid water is an important medium in which most biomolecules perform their function. Pure liquid water may seem like a very simple system whose properties have easily identified determining factors, but this is not always found to be the case. Table 2 gives examples of studying several thermodynamic properties of liquid water using a flexible SPC water model in a molecular dynamics simulation.17 This flexible water model was characterized by the following interaction potential:
The indexes l and k in the last term represent either O or H; i and j label water molecules; ε and σ are the Lennard-Jones energy well depth and radius, respectively; the superscript 0 is used to label reference (idealized) distances; the $k_j$ are force constants; and N is the number of water molecules in the simulations. The
Table 2. The Most Significant Parameters of the Flexible SPC Water Model That Control Different Thermodynamic Properties of Liquid Water^a
Properties tabulated: Free Energy (kcal/mol), Entropy (cal mol^-1 deg^-1), Internal Energy (kcal/mol), Pressure (atm), Thermal Pressure Coefficient (atm/deg), Heat Capacity (cal mol^-1 deg^-1).
Parameters tabulated: r_OH, r_HH, k_1, σ, ε, q_O, q_H, R_cut.
^a The results presented are the sensitivity coefficients λ ∂O/∂λ, in which λ is one of the potential or model parameters: r_OH (equilibrium bond length of the flexible SPC model), r_HH (equilibrium distance between the two hydrogens of a water molecule in the flexible SPC model), k_1 (harmonic force constant of the bond), σ (Lennard-Jones repulsive parameter of the water model), ε (Lennard-Jones well depth of the water model), q_O (atomic partial charge on the water oxygens), q_H (atomic partial charge on the water hydrogens), and R_cut (cutoff radius in the reaction field model), respectively.
cutoff radius $R_{cut}$ in the reaction field geometry was chosen to be half the basic cubic box length of 19.726 Å. Because Lennard-Jones energies were calculated only between two water oxygens, each $r_{ij}$ was the distance between the oxygen of molecule $i$ and that of molecule $j$. The inclusion of the cross terms in the intramolecular vibrational degrees of freedom allowed couplings among the valence bond angle and the bond lengths of a water molecule. $\varepsilon_{RF}$ is the dielectric constant of the outer region in the reaction field model and was chosen to be 80 in the simulations. The parameters giving the largest sensitivity for each of the selected thermodynamic properties are listed in Table 2. It is clear that different properties of liquid water are controlled by different features of the water model. The free energy of liquid water was found to be affected most by perturbations of the equilibrium bond length and the Lennard-Jones repulsive parameter of the water model, not by perturbations of the atomic partial charges. This result reflects the occurrence of a hydrogen bond network in liquid water. Increasing the equilibrium bond length or decreasing the Lennard-Jones repulsive parameter of the water model enhances the formation of hydrogen bonds, consequently decreasing the free energy of the model liquid water. However, this energetic argument is valid only when the free energy of liquid water is dominated by the contribution from the internal energy rather than by the entropy.
This is the case for the flexible SPC water model because one finds similar parametric sensitivities of the free energy and the internal energy (Table 2).17 The entropy of the liquid, on the other hand, was found to be controlled by a different set of potential parameters of the water model.17 In particular, the flexibility associated with the O-H stretching mode played the most significant role in determining the entropy of liquid water. Making the O-H bond more rigid, by increasing the harmonic force constant $k_1$, decreased the entropy. As expected, increasing the equilibrium bond length of the water model also decreased the entropy of the liquid, because the enhancement of the intermolecular hydrogen bonds gave the liquid a more icelike character. The sensitivity analysis study also revealed that long-range interactions can play an important role in determining the entropy of the liquid. Analysis of the pressure of the liquid gave additional information.17 Whereas it is reasonable to expect that increasing the Lennard-Jones repulsive parameter will increase the pressure of the liquid at constant volume, and that increasing the magnitude of the atomic partial charges will decrease it, it is hard to predict on the basis of intuition alone that increasing the magnitude of the Lennard-Jones well depth may increase the pressure of the liquid.

The effects of different potential and model parameters on different structural and energetic distribution functions were studied by Zhu and Wong. These functions included radial distribution functions between water atoms, the distribution function of the interaction energy of a water molecule with its surroundings [P(u) of Figure 4], and the distribution function of the local electric field at the oxygen of a water molecule projected along the permanent dipole moment vector of a water molecule. Examples from that study are shown in Figure 5 for the flexible SPC water model. It is clear from Figure 5 that these potential/model parameters affected the distribution function P(u) very differently. Perturbations of the O-H bond harmonic force constant $k_1$ mainly influenced the peak and the high energy wing of P(u). Increasing the O-H bond flexibility, effected by decreasing its harmonic force constant, shifted the peak of P(u) toward more favorable interaction energy and reduced the fraction of water molecules having high interaction energies. Increasing the magnitude of the partial charge of the oxygen $q_O$ in the water model broadened the distribution P(u), resulting in increases in both the fraction of water molecules having favorable (negative) interaction energies and the fraction having unfavorable (positive) interaction energies. The increase in the low energy wing of P(u) was greater than that in the high energy wing, so a net gain of water molecules having favorable energies resulted. This observation is consistent with the finding (Tables 1 and 2) that increasing the magnitude of the atomic partial charges of the water oxygens decreased the internal energy of the liquid. Perturbing the atomic partial charges of the water hydrogens $q_H$ had little effect on the low energy wing of P(u) but had a significant effect on the high energy wing. Increasing the nonbonded cutoff $R_{cut}$ of the reaction field model had significant effects on the far wings of P(u), decreasing the fraction of water molecules having very favorable interaction energies and increasing the fraction having very unfavorable energies.

Figure 4 Distribution function P(u) describing the interaction energy (u) of a water molecule with its surroundings. (Horizontal axis: u in kcal/mol.)

These examples indicate the complexity of the relationships between the properties of liquid water and the model parameters determining its intramolecular and intermolecular interactions. Many of the results discussed in this section for liquid water are hard to predict by intuition alone. Water is a comparatively simple system in (bio)molecular modeling, and it is even harder to identify the determinants of the properties of complex biomolecular systems. Sensitivity analysis should thus be a useful tool for sorting out the key contributing factors of interesting biomolecular properties.

Two-Dimensional Square Lattice Model of Protein Folding
Sensitivity analysis has been applied to a two-dimensional square lattice model of protein folding.18 In this example, each residue of a model polypeptide could occupy only the lattice points of a two-dimensional square lattice, and only two types of residue, hydrophobic and hydrophilic, were assumed to exist. Two conformations of a 10-residue model polypeptide on a two-dimensional square lattice are shown in Figure 6. The model polypeptide gained stabilization energy only when two nonbonded hydrophobic groups were in contact, so a conformation with more hydrophobic contacts had a lower energy than one with fewer hydrophobic contacts. The two conformations of Figure 6 have the same energy if all the residues are hydrophobic because they have the same number (4) of nonbonded hydrophobic contacts. To facilitate sensitivity analysis studies, Bleil et al.18 wrote the interaction energy between two residues $i$ and $j$ in contact as the product of two energy parameters $\varepsilon_i$ and $\varepsilon_j$, in which $\varepsilon_i = 0$ if residue $i$ is a hydrophilic residue and $(\varepsilon_i)^2 = \varepsilon_{hh}$ if residue $i$ is a hydrophobic residue. A sensitivity analysis study was carried out by calculating first-order and second-order sensitivity coefficients of the forms $\partial O/\partial\varepsilon_i$ and $\partial^2 O/\partial\varepsilon_i\partial\varepsilon_j$, in which $O$ represented a thermodynamic or ensemble-averaged property of a model polypeptide in the two-dimensional square lattice representation. Figure 7 offers an example of the cooperative effects of two hydrophobic residues on the entropy of a 10-residue model polypeptide (with a homogeneous sequence of hydrophobic residues) on the two-dimensional square lattice. The scaled second-order derivative of the entropy of the model polypeptide with respect to each pair of $\varepsilon_i$ and $\varepsilon_j$, $(\partial^2 S/\partial\varepsilon_i\partial\varepsilon_j)\,\varepsilon_i\varepsilon_j$, is displayed.
The pattern of sensitivity coefficients was found to be complex and dependent on the hydrophobic interaction energy parameter $\varepsilon_{hh}$. At very low hydrophobic interaction energies (0.01 $k_BT$), the most prominent cooperative effects arose from $(i, i+3)$ residue pairs. The cooperative effects from $(i, i+5)$, $(i, i+7)$, and $(i, i+9)$ pairs were also significant. Their sensitivity coefficients were mostly negative, suggesting that the entropy of the model polypeptide decreased when the hydrophobicity of both residues in each of these pairs increased. In the two-dimensional square lattice model, residue pairs separated by even numbers of residues [type 1, i.e., $(i, i+3)$, $(i, i+5)$, $(i, i+7)$, and $(i, i+9)$ pairs] could be in direct contact in the lattice, and it was therefore not surprising to see significant cooperativity between these residue pairs in affecting the entropy of the model polypeptide. On the other hand, residue pairs separated by odd numbers of residues [type 2, i.e., $(i, i+4)$, $(i, i+6)$, and $(i, i+8)$ pairs] could not be in direct contact in the lattice and thus could not contribute to the total hydrophobic energy of the model polypeptide. These pairs did not show prominent cooperativity at low hydrophobic energies. However, as the hydrophobic interaction energy parameter $\varepsilon_{hh}$ was increased to about 0.5 $k_BT$, type 2 residue pairs gave significant positive cooperativity. As the hydrophobic energy parameter was increased further to about 1 $k_BT$, type 1 residue pairs gave positive cooperative effects instead of the negative ones observed at low hydrophobic energies. When the hydrophobic energy parameter was around 2 $k_BT$, all the residue pairs showed positive cooperativity with comparable magnitudes. At even higher hydrophobic energies (near 5 $k_BT$), most type 1 residue pairs became less cooperative than type 2 residue pairs in affecting the entropy of the model polypeptide.

Figure 6 Two conformations of a 10-residue model peptide on a two-dimensional square lattice (panels: helix; antiparallel β strand).

Figure 7 Scaled second-order sensitivity $(\partial^2 S/\partial\varepsilon_i\partial\varepsilon_j)\,\varepsilon_i\varepsilon_j$ of the entropy of a 10-residue model polypeptide for five values of $\varepsilon_{hh}$: (a) 0.01, (b) 0.5, (c) 1.0, (d) 2.0, and (e) 5.0 $k_BT$.

Therefore, the simple two-dimensional square lattice model of protein folding, which was composed of a homogeneous sequence of hydrophobic residues equal in hydrophobicity, exhibited cooperative effects highly dependent on the locations of the residues in the sequence, the hydrophobic energy, and the temperature. More importantly, even residue pairs that had no direct interaction energies could show cooperative/anticooperative effects in influencing system properties. This finding is consistent with the experimental observation that the free energy changes due to multiple mutations could sum nonadditively from the free energy changes due to single mutations even though the residues that were mutated were separated by large distances and therefore had negligible direct interaction energies.47-49 Sensitivity coefficients of second and higher order measure nonadditive effects. Consider the expression
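\[ \Delta S = \sum_i \frac{\partial S}{\partial \varepsilon_i}\,\Delta\varepsilon_i + \frac{1}{2}\sum_{i,j} \frac{\partial^2 S}{\partial \varepsilon_i\,\partial \varepsilon_j}\,\Delta\varepsilon_i\,\Delta\varepsilon_j + \cdots \]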
for estimating the entropy change $\Delta S$ of a model polypeptide. If all the sensitivity coefficients of second and higher order are zero, only the linear terms remain, and the total entropy change can be expressed as a sum of the entropy changes resulting from single "mutations." The same argument applies to the study of free energy and internal energy changes. Conversely, our findings suggest the need to be careful when double mutation experiments are used to probe residue pair interactions in proteins. The study of the two-dimensional square lattice model clearly demonstrates a counterexample in which cooperative effects between two residues can occur even though there is no direct interaction between them. Therefore, results from double mutation experiments50 may not necessarily reflect residue pair interactions. This simple model of protein folding was provided to show that it is not always trivial to identify the determinants of the properties of (bio)molecular systems. The complexity of the problem can increase further for more complicated and realistic models of (bio)molecular systems. Sensitivity analysis should therefore be a useful tool for sorting out the significant factors that determine the properties of these complex systems.
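Because the lattice model is small, the kind of calculation described in this section can be sketched in a few lines. The Python sketch below enumerates every conformation of a 10-residue chain on the square lattice, evaluates the entropy exactly from the partition function, and estimates the scaled second-order sensitivity by central finite differences. The function names, the step size `h`, the sign convention E = -(sum over contacts of eps_i * eps_j) in units of k_B T, and the choice to count symmetry-related conformations separately are assumptions made for illustration; they are not taken from Ref. 18, where the derivatives may have been evaluated differently.

```python
import math

MOVES = ((1, 0), (-1, 0), (0, 1), (0, -1))

def enumerate_contact_maps(n_res):
    """Enumerate all self-avoiding walks of n_res residues on the square lattice.
    Returns, for each walk, the list of nonbonded nearest-neighbor pairs (i, j), i < j."""
    maps = []
    path = [(0, 0)]
    occupied = {(0, 0): 0}          # lattice site -> residue index

    def extend():
        if len(path) == n_res:
            pairs = []
            for j, (x, y) in enumerate(path):
                for dx, dy in MOVES:
                    i = occupied.get((x + dx, y + dy))
                    if i is not None and i < j - 1:   # skip bonded neighbors
                        pairs.append((i, j))
            maps.append(pairs)
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            site = (x + dx, y + dy)
            if site not in occupied:
                occupied[site] = len(path)
                path.append(site)
                extend()
                path.pop()
                del occupied[site]

    extend()
    return maps

def entropy(contact_maps, eps):
    """Entropy S/k_B = ln Z + <E>/k_B T, with E = -sum(eps_i * eps_j) over contacts
    (energies in units of k_B T; sign convention assumed for this sketch)."""
    z = 0.0
    weighted_e = 0.0
    for pairs in contact_maps:
        e = -sum(eps[i] * eps[j] for i, j in pairs)
        w = math.exp(-e)
        z += w
        weighted_e += w * e
    return math.log(z) + weighted_e / z

def scaled_second_order(contact_maps, eps, a, b, h=1e-3):
    """Central-difference estimate of (d2S / d eps_a d eps_b) * eps_a * eps_b, a != b."""
    def s(da, db):
        shifted = list(eps)
        shifted[a] += da
        shifted[b] += db
        return entropy(contact_maps, shifted)
    d2 = (s(h, h) - s(h, -h) - s(-h, h) + s(-h, -h)) / (4.0 * h * h)
    return d2 * eps[a] * eps[b]

maps = enumerate_contact_maps(10)
eps = [math.sqrt(0.01)] * 10        # homogeneous chain with eps_hh = 0.01 k_B T
for a, b in [(0, 3), (0, 4)]:       # one type 1 pair and one type 2 pair
    print((a, b), scaled_second_order(maps, eps, a, b))
```

Exact enumeration is used here for the same reason given in the text: it avoids statistical error, so any nonzero mixed derivative for a pair that can never touch reflects genuine indirect coupling rather than sampling noise.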
Molecular Recognition

Just as it may not be trivial to identify the determinants of the properties of complex biological systems, it is also not necessarily straightforward to suggest effective modifications of (bio)molecules needed to achieve desired biological effects. Sensitivity analysis might be a useful tool for guiding the design of novel bioactive molecules. A few examples of sensitivity analysis relating to molecular design have been published.18,51-53 In these examples, no bioactive compounds were actually designed, but the process of association of molecular systems was modeled. Cieplak et al.52 examined the use of free energy derivative calculations to suggest what types of cation may be bound most strongly by 18-crown-6. In that work, the authors calculated sensitivity coefficients of the form $\partial\Delta G/\partial\lambda_i$, in which $\Delta G$ is the binding free energy of a cation to 18-crown-6 and $\lambda_i$ is a nonbonded interaction parameter. They found that the values of $\partial\Delta G/\partial R^*$, where $R^*$ is the atomic radius of a cation, were -10.9, 2.7, and 4.5 kcal/(mol·Å) for Na+, K+, and Rb+, respectively. The negative sign of $\partial\Delta G/\partial R^*$ for Na+ suggests that a cation larger than Na+ may bind more strongly to 18-crown-6, and the positive sign for K+ and Rb+ suggests that a cation smaller than K+ and Rb+ may bind more favorably to 18-crown-6. Recalling that the size of these ions increases in the order Na+ < K+ < Rb+, and that the charges of these ions are similar, these results suggest that a fictional ion of size between that of Na+ and K+ is optimal for binding to 18-crown-6. Moreover, the magnitudes of $\partial\Delta G/\partial R^*$ for Na+ and K+ suggest that the size of this optimal ion is closer to K+ than to Na+; this result is consistent with the experimental finding that 18-crown-6 binds more tightly to K+ than to Na+ or Rb+. By calculating values of $\partial\Delta G/\partial\lambda_i$ involving the nonbonded parameters of 18-crown-6, these authors also speculated that suitable modifications to 18-crown-6 that would increase the negative charges of the crown ether oxygens might improve the binding to K+.
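The argument about the optimal ion size can be made semiquantitative with a small calculation. The sketch below assumes, purely for illustration, that the derivative of the binding free energy with respect to the cation radius varies linearly with ion size between Na+ and K+, and it locates the point at which the derivative changes sign. This linear interpolation and the function name `zero_crossing_fraction` are assumptions and are not part of the analysis of Cieplak et al.

```python
# Derivatives of the binding free energy with respect to cation radius,
# in kcal/(mol*Angstrom), as quoted in the text.
DG_DR = {"Na+": -10.9, "K+": 2.7, "Rb+": 4.5}

def zero_crossing_fraction(slope_small, slope_large):
    """Fraction of the way from the smaller ion to the larger ion at which a
    linearly interpolated derivative passes through zero (assumed linearity)."""
    if slope_small * slope_large >= 0:
        raise ValueError("the two derivatives must have opposite signs")
    return -slope_small / (slope_large - slope_small)

frac = zero_crossing_fraction(DG_DR["Na+"], DG_DR["K+"])
print(f"estimated optimum lies {frac:.0%} of the way from Na+ to K+")
```

Under this assumption the zero crossing falls at 10.9/(10.9 + 2.7), roughly four-fifths of the way from Na+ toward K+, in line with the qualitative conclusion that the optimal ion is closer in size to K+.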
Because first-order sensitivity coefficients are easier to calculate than higher order sensitivity coefficients, it is likely that the former will be used more frequently in guiding molecular design. However, first-order sensitivity theory can provide reliable predictions only when the properties of interest respond approximately linearly to changes in the model parameters. This linear response limit is satisfied when the perturbations of model parameters are small. For certain applications, such as protein engineering, where one amino acid is mutated into another, the linear response approximation may fail to reliably predict the change in the properties of a protein resulting from a point mutation. It is therefore useful to examine in more detail how well first-order sensitivity theory performs in guiding such predictions. The two-dimensional square lattice protein folding model discussed earlier provides a simple basis for probing this issue. The model has the advantage of allowing one to carry out many exact calculations to check the predictions from first-order sensitivity theory. Unlike molecular dynamics or Monte Carlo simulations, there are no statistical errors or convergence problems associated with the calculations of the properties, and their parametric derivatives, of a model polypeptide on a two-dimensional square lattice. Starting from many 10-residue model polypeptides with different sequences, Bleil et al.18 made different mutations (corresponding to changing a hydrophobic residue into a less hydrophobic or a hydrophilic residue, or to changing a hydrophilic residue into a hydrophobic one) and compared the exact results obtained by making explicit mutations with the approximate results obtained from first-order sensitivity theory. In the exact calculations, the change in a property of a model polypeptide was obtained by computing $\Delta O = O_{new} - O_{old}$, in which $O_{new}$ and $O_{old}$ were the property calculated after and before mutation, respectively. When first-order sensitivity theory was used, $\Delta O$ was calculated by using $\Delta O = (\partial O/\partial\varepsilon_i)\,\Delta\varepsilon_i$, in which $\varepsilon_i$ is the hydrophobic energy parameter of residue $i$ defined earlier, and $\Delta\varepsilon_i$ is the change in the value of $\varepsilon_i$ due to a mutation. The following 13 observables $O$ were included in that study: free energy, entropy, internal energy, equilibrium constants between different classes of conformations characterized by having different numbers of hydrophobic contacts, averaged compactness, averaged number of hydrophobic-hydrophobic interactions, averaged number of bends, averaged number of hydrophobic-hydrophobic interactions per bend, and averaged number of buried residues that were hydrophobic. Table 3 summarizes the ability of first-order sensitivity theory to correctly predict the direction of change due to the mutations. It is clear from the data in Table 3 that first-order sensitivity theory works best when $\varepsilon_i$ and $\Delta\varepsilon_i$ are both small. When $\varepsilon_i$ and $\Delta\varepsilon_i$ are of the order of 2 $k_BT$, the predictive reliability decreases to about 75%. Therefore, first-order sensitivity theory does not always give correct predictions. However, since first-order sensitivity coefficients can usually be calculated more easily than higher order sensitivity coefficients in (bio)molecular simulations, they can be used as a preliminary screening tool for suggesting a small number of modifications to a (bio)molecule that may lead to the desired biological effect.
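The percentages reported in Table 3 follow directly from the counts of correct and incorrect predictions. The short Python check below recomputes them from those counts; the variable and function names are illustrative only.

```python
# (eps_i, delta_eps_i, correct, incorrect); energies in units of k_B T, counts from Table 3
TABLE_3 = [
    (0.1, 0.1, 133, 14),
    (1.0, 1.0, 128, 25),
    (2.0, 2.0, 119, 36),
    (2.0, 1.0, 152, 56),
]

def percent_correct(correct, incorrect):
    """Percentage of predictions whose direction of change was right."""
    return 100.0 * correct / (correct + incorrect)

for eps, d_eps, ok, bad in TABLE_3:
    print(f"eps={eps:g}, delta_eps={d_eps:g}: {percent_correct(ok, bad):.0f}% correct")

total_ok = sum(row[2] for row in TABLE_3)
total_bad = sum(row[3] for row in TABLE_3)
print(f"overall: {percent_correct(total_ok, total_bad):.0f}% of {total_ok + total_bad} predictions")
```

The recomputed values (90, 84, 77, and 73%, with 80% overall) agree with the table and show the steady loss of reliability as the parameter and its perturbation grow.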
More sophisticated (but usually more expensive) calculations and/or suitable experimental studies can then be carried out to sort out, from this small number of suggestions, those that are more likely to achieve the desired biological effects. If experimentation is easier, the predictions can be tested in the laboratory.

Table 3. Predicted Results of Mutations for a Two-Dimensional Square Lattice Model of Protein Folding Using First-Order Sensitivity Theory(a)

eps_i (k_BT)   delta eps_i (k_BT)   Correct   Incorrect   % Correct
0.1            0.1                  133       14          90
1              1                    128       25          84
2              2                    119       36          77
2              1                    152       56          73
Total                               532       131         80

(a) Data from Ref. 18.

An obvious extension of first-order sensitivity theory is to develop higher order theories utilizing higher order sensitivity coefficients. For example, some investigators have considered the Gaussian-type approximations employed in free energy calculations. Specific applications include estimating the pKa's of acidic and basic residues of proteins55 and the excitation energies of chromophores.56 The Gaussian-type approximations can be conveniently arrived at from Zwanzig's statistical mechanical perturbation theory,57 the derivation of which follows. The free energy difference $\Delta A$ between system b and system a can be written as follows:

\[ \Delta A = A_b - A_a = -k_B T \ln \frac{\int e^{-\beta H_b}\, d\Gamma}{\int e^{-\beta H_a}\, d\Gamma} \]
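in which $A_b$ and $A_a$ are the free energies of system b and system a, respectively, $H_b$ and $H_a$ are the classical Hamiltonians of system b and system a, respectively, $\Gamma$ denotes phase space variables (atomic coordinates and their conjugate momenta), and $\beta = 1/k_BT$, where $k_B$ is the Boltzmann constant and $T$ is the absolute temperature. This equation can be further written as follows:

\[ \Delta A = -k_B T \ln \frac{\int e^{-\beta \Delta H}\, e^{-\beta H_a}\, d\Gamma}{\int e^{-\beta H_a}\, d\Gamma} \]

where $\Delta H = H_b - H_a$. By definition,

\[ \frac{\int e^{-\beta \Delta H}\, e^{-\beta H_a}\, d\Gamma}{\int e^{-\beta H_a}\, d\Gamma} = \left\langle e^{-\beta \Delta H} \right\rangle_a \]

where the last term denotes an average over an ensemble of system a. Therefore

\[ \Delta A = -k_B T \ln \left\langle e^{-\beta \Delta H} \right\rangle_a \]

which is the perturbation formula first derived by Zwanzig in 1954.57 The exponential and logarithmic terms of Eq. [31] can be expanded in a Taylor series of $\Delta H$, and the equation

\[ \Delta A = \langle \Delta H \rangle_a - \frac{1}{2 k_B T} \left\langle \left( \Delta H - \langle \Delta H \rangle_a \right)^2 \right\rangle_a \qquad [35] \]

can be obtained if one keeps only terms up to second order in $\Delta H$. This "Gaussian approximation" works well for a number of applications such as the calculation of the pKa of acidic and basic residues of proteins55 and the calculation of the solvation contributions to the excitation energies of tryptophan.56 When $\Delta H$ is dominated by contributions from electrostatic interactions, and when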
these interactions are modeled by Coulomb's law, $\Delta H$ can be written in a very simple form
in which $V_{coul}$ is the electrostatic potential of system a, $q_i$ is the partial charge of atom $i$ of system a, and $q_i + \Delta q_i$ is the corresponding charge in system b. Putting this Coulombic form of $\Delta H$ into the Gaussian formula (Eq. [35]) for calculating free energy changes, one can show that some second-order, third-order, and fourth-order effects in $\Delta q_i$ are included by the Gaussian approximation.

Instead of including nonlinear effects explicitly in predicting free energy changes, one can include nonlinear effects implicitly by using semiempirical linear response theories. Åqvist and co-workers examined a special case of semiempirical linear response theory by studying the binding energies of a number of inhibitors to the protein endothiapepsin.58 In this application, the binding energies of inhibitors to the protein were assumed to be linearly dependent on the averaged inhibitor-protein interaction energies and the averaged inhibitor-solvent interaction energies. Therefore, for the binding process

inhibitor + protein → inhibitor-protein complex

with associated binding free energy change $\Delta A$, the following linear-response-type formula was used to approximate $\Delta A$:

\[ \Delta A = \alpha \left( \langle U^{p}_{electrostatics} \rangle - \langle U^{s}_{electrostatics} \rangle \right) + \gamma \left( \langle U^{p}_{Lennard\text{-}Jones} \rangle - \langle U^{s}_{Lennard\text{-}Jones} \rangle \right) \qquad [37] \]

where $\langle U^{p}_{electrostatics} \rangle$ and $\langle U^{p}_{Lennard\text{-}Jones} \rangle$ are, respectively, the averaged inhibitor-protein electrostatic interaction energy and the averaged Lennard-Jones interaction energy in a solvated inhibitor-protein simulation; $\langle U^{s}_{electrostatics} \rangle$ and $\langle U^{s}_{Lennard\text{-}Jones} \rangle$ are, respectively, the averaged inhibitor-solvent electrostatic interaction energy and the averaged Lennard-Jones interaction energy in an inhibitor-solvent simulation; and $\alpha$ and $\gamma$ are two empirical parameters obtained by fitting Eq. [37] to a set of experimental data. Once the parameters $\alpha$ and $\gamma$ of Eq. [37] have been determined, this relation may be used to predict the binding energies of other similar inhibitors to the protein. In the work of Åqvist et al.,58 $\alpha$ was taken to be 0.5, which has an approximate theoretical basis.59,60 No firm theoretical framework has yet been worked out to guide the choice of $\gamma$, so this parameter was treated as an empirical parameter. Promising results for the binding energies of a number of inhibitors to the protein endothiapepsin were obtained by these authors.58 This
semiempirical linear response theory was later extended by Carlson and Jorgensen61 to calculate the hydration free energies of a number of organic molecules. Carlson and Jorgensen also treated $\alpha$ as an empirical parameter, and they added an extra term that is proportional to the accessible surface area of a solute molecule. Again, encouraging results were obtained in this application. In addition, the method of Åqvist et al. was later employed successfully by Paulsen and Ornstein62 to study the binding of 11 compounds to cytochrome P-450. Because the semiempirical linear response theory appeared to be so successful in several applications, it is useful to think about what can be done to further improve this theory. The addition of accessible surface area terms by Carlson and Jorgensen61 improved the flexibility of the theory, so it is interesting to ask whether one can use a microscopic description for the effects represented by the accessible surface area terms. These terms probably reflect contributions from hydrophobic effects, which may be described by the solvent-solvent interaction energy. In fact, when carrying out sensitivity analysis on several terminally blocked amino acids in methanol, we found that the solvent-solvent interaction energy was altered by the dissolution of a solute (Table 4). A comparison of the charge sensitivities of the Helmholtz free energy of liquid methanol with those obtained from solutions of glycine, threonine, and serine dipeptides in methanol shows that the dissolution of each of these solutes altered the solvent-solvent interaction energies. Therefore, perhaps one could replace the accessible surface area terms by terms involving solvent-solvent interaction energies in the solutions of the protein, the inhibitor, and the protein-inhibitor complex. One might also improve the semiempirical linear response theory by adding an intrasolute interaction term. However, adding this term will increase the number of empirical parameters by one. The use of this extra term is practical only when sufficient experimental data are available and enough simulations are done to allow the determination of the extra parameter. Other possible improvements might include the use of nonlinear scaling relationships. For example, the use of the functional form
where $\beta$ and $\delta$ are additional empirical parameters, may improve the performance of the semiempirical theory, but two more adjustable parameters would need to be fit to available experimental data.

Table 4. Charge Sensitivity of the Helmholtz Free Energy of Liquid Methanol, and Solutions of Glycine (G), Threonine (T), and Serine (S) Dipeptides in Methanol(a)

System          dA/dq_om (kJ/mol·esu)   dA/dq_cm (kJ/mol·esu)   dA/dq_hm (kJ/mol·esu)   Number of Methanols in Simulation
Methanol        10                      -27                     -106                    216
G in methanol   17                      -42                     -173                    196
T in methanol   16                      -42                     -167                    230
S in methanol   15                      -42                     -167                    230

(a) q_om, q_cm, and q_hm are, respectively, the partial charges of the oxygen, the methyl group, and the hydroxyl hydrogen of a methanol molecule in the OPLS force field (Ref. 80). In the simulations, the OPLS parameters were used for methanol, and the GROMOS (Ref. 16) parameters were used for the dipeptides.
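Fitting the empirical parameters of the linear-response-type formula (Eq. [37]) to experimental binding data is an ordinary linear least-squares problem. The sketch below fixes the electrostatic coefficient at 0.5, as in the work of Åqvist et al., and solves for the Lennard-Jones coefficient in closed form. The numerical energy differences are synthetic values invented for this illustration, not simulation or experimental data, and the function names `fit_gamma` and `predict` are hypothetical.

```python
def fit_gamma(d_elec, d_lj, dA_exp, alpha=0.5):
    """Least-squares gamma for dA = alpha * d_elec + gamma * d_lj.
    d_elec and d_lj are differences of averaged interaction energies
    (bound simulation minus solvent simulation) for each inhibitor."""
    numerator = sum((a - alpha * e) * l for a, e, l in zip(dA_exp, d_elec, d_lj))
    denominator = sum(l * l for l in d_lj)
    if denominator == 0.0:
        raise ValueError("Lennard-Jones differences are all zero; gamma is undetermined")
    return numerator / denominator

def predict(d_elec, d_lj, gamma, alpha=0.5):
    """Binding free energy predicted by the linear-response-type formula."""
    return alpha * d_elec + gamma * d_lj

# Synthetic training set (hypothetical numbers, energies in kcal/mol).
d_elec = [-10.0, -6.0, -14.0, -8.0]
d_lj = [-20.0, -30.0, -25.0, -15.0]
gamma_true = 0.16
dA_exp = [predict(e, l, gamma_true) for e, l in zip(d_elec, d_lj)]

gamma_fit = fit_gamma(d_elec, d_lj, dA_exp)
print(f"fitted gamma = {gamma_fit:.3f}")
print(f"predicted dA for a new inhibitor = {predict(-9.0, -22.0, gamma_fit):.2f} kcal/mol")
```

Each additional empirical parameter, such as a surface area coefficient or the exponents of a nonlinear scaling form, adds a column to this fit, which is why the text stresses that enough experimental data must be available to determine them.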
Green's Function/Principal Component Analysis and Essential Dynamics

The idea of the Green's function/principal component analysis is closely related to the essential dynamics approach63-66 recently introduced into biomolecular simulations. Other similar works include those by Garcia,67 Ichiye and Karplus,68 Gō and coworkers, and developers of the quasi-harmonic method.26-28 The basic idea of the essential dynamics approach is to diagonalize a covariance matrix $\sigma$ whose elements are given by the formula
[39]
in which $C_{ij}$ is the positional correlation matrix of Eq. [21]. If $U$ is the matrix that diagonalizes $\sigma$ such that

\[ U^{T} \sigma U = D \qquad [40] \]

and $D$ is a diagonal matrix, $\sigma$ can be written in the form

\[ \sigma = \sum_{i=1}^{N} D_{ii}\, \mathbf{u}_i \mathbf{u}_i^{T} \]

in which $D_{ii}$ is a diagonal element of $D$, and $\mathbf{u}_i$ is the eigenvector represented by the $i$th column of $U$. Therefore, the positional covariance matrix $\sigma$ can be written as a sum of $N$ matrices, where $N$ is the dimension of the matrix $\sigma$. If one arranges the $D_{ii}$ in decreasing order of magnitude, $\sigma$ can be written as a sum of terms with decreasing contributions to $\sigma$. This way, one can identify from the leading terms of Eq. [39] the most important principal components that determine the atomic positional fluctuation of a (bio)molecule. For biomolecules, a few principal components usually dominate the contributions to $\sigma$. This is because the atomic positional fluctuations are dominated by contributions from
a small number of large-amplitude, low-frequency collective modes of the biomolecule. The higher frequency modes introduce only small-amplitude, local atomic fluctuations that are approximately harmonic. The large-amplitude collective motions of certain biomolecules have long been thought to play important functional roles. Previously, these large-amplitude biomolecular motions had been studied mostly by normal mode and quasi-harmonic analyses.26-28 A quasi-harmonic analysis is a special form of normal mode analysis in which the second-order derivatives of the potential energy (which form the elements of a Hessian matrix) are replaced by an effective force constant (or Hessian) matrix constructed from a covariance matrix of positional fluctuations obtained from a molecular dynamics simulation. The use of such an effective force constant (or Hessian) matrix can account for some anharmonic effects that a standard normal mode analysis neglects. From the Green's function described by Eq. [22], it is easy to see that an effective Hessian matrix can be constructed from the inverse of a Green's function matrix that is related to a covariance matrix of positional fluctuations. (Remember that a component of a force acting on an atom can be obtained as the negative of a potential gradient.) A key reason for developing the Green's function/principal component analysis approach is to study the structural responses of biomolecules to perturbations introduced to different parts of the biomolecules. For this application, one works with a Green's function matrix directly rather than using it to construct an effective force constant matrix for a quasi-harmonic analysis. If one diagonalizes a Green's function matrix, one can also study collective structural responses as in a normal mode or quasi-harmonic analysis. A Green's function analysis offers the additional advantage of allowing one to introduce explicit external perturbations (through the calculation of the perturbing forces of Eq. [22]) to study the structural responses introduced by these perturbations. For example, these forces can be calculated from the interaction potential between the averaged structure of a biomolecule and its ligand; Eq. [22] then allows one to use the results from a molecular dynamics or Monte Carlo simulation of the unperturbed biomolecule to predict what structural responses the ligand may introduce when it binds to the biomolecule. Relative to the essential dynamics approach, the Green's function method can provide a more realistic description of how a biomolecular structure may respond when it interacts with its ligand(s). Because the effects of the perturbation forces from a ligand are not accounted for, simply analyzing the collective modes of an isolated biomolecule and trying to make functional inferences from these unperturbed modes can sometimes be misleading. The Green's function approach provides a first-order correction for studying the induction of structural changes to a biomolecule by one or more ligands without actually carrying out a simulation on the biomolecule-ligand(s) complex. If the eigenvectors $\mathbf{u}_i$ of a Green's function matrix corresponding to an uncomplexed biomolecule have already been obtained, the structural response of the biomolecule to the binding of one or more ligands can be obtained from Eqs. [22] and [39] as follows:
One can also use a Green's function matrix directly (without diagonalization) with the relation
which connects each component of an atomic coordinate to the components of the forces acting on the atoms of a biomolecule. The Green's function approach can also be used to account for intramolecular motions in a Brownian dynamics simulation to study the diffusion-influenced reaction rates between two molecules such as an enzyme-substrate pair. Brownian dynamics simulations of the diffusional encounters between biomolecules and their substrates were formerly carried out by assuming the biomolecules and the ligands to be completely rigid. However, this is a very crude assumption: the approach of a ligand may dynamically distort the biomolecule to facilitate entry of the ligand to the active site of the biomolecule. This structural response of the biomolecule can be described by a Brownian dynamics simulation model in which the biomolecule is approximated by a collection of suitably connected spheres undergoing Brownian motion. The Green's function approach provides an alternative whereby the results from a molecular dynamics simulation with an explicit-solvent model are used to describe the dynamical structural response of the biomolecule to an approaching ligand during a Brownian dynamics simulation. In other words, the Green's function matrix for a biomolecule is obtained from a molecular dynamics trajectory, and the forces of Eq. [42] or [43] are obtained in a Brownian dynamics simulation algorithm from the interaction potential between the biomolecule and its approaching ligand. The advantage of the Green's function approach is that solvent effects on the structural fluctuations of a biomolecule can be treated in a more realistic manner because an explicit-solvent model can be used. However, the Green's function approach is applicable only when the structural response of a biomolecule depends approximately linearly on the perturbing forces introduced by the approaching ligand.
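The decomposition of a covariance matrix into principal components, and its use to estimate a structural response, can be illustrated on a two-coordinate toy system. The sketch below assumes that the linear response takes the familiar form (displacement) = (covariance matrix) x (force) / k_B T. Eq. [22] is not reproduced in this section, so this form, the synthetic "trajectory", the applied force, and all function names are assumptions made for illustration.

```python
import math

def covariance(samples):
    """Positional covariance matrix of a list of coordinate tuples."""
    n = len(samples)
    dim = len(samples[0])
    mean = [sum(s[k] for s in samples) / n for k in range(dim)]
    return [[sum((s[a] - mean[a]) * (s[b] - mean[b]) for s in samples) / n
             for b in range(dim)] for a in range(dim)]

def eig_sym_2x2(m):
    """Eigenvalues (in decreasing order) and unit eigenvectors of a symmetric 2x2 matrix."""
    a, b, c = m[0][0], m[0][1], m[1][1]
    mid = 0.5 * (a + c)
    rad = math.hypot(0.5 * (a - c), b)
    theta = 0.5 * math.atan2(2.0 * b, a - c)      # orientation of the principal axis
    vecs = [(math.cos(theta), math.sin(theta)), (-math.sin(theta), math.cos(theta))]
    return [mid + rad, mid - rad], vecs

def response_from_modes(vals, vecs, force, kT=1.0):
    """Displacement as a sum over principal components: sum_i D_ii u_i (u_i . f) / kT."""
    dr = [0.0, 0.0]
    for d, u in zip(vals, vecs):
        proj = u[0] * force[0] + u[1] * force[1]
        dr[0] += d * u[0] * proj / kT
        dr[1] += d * u[1] * proj / kT
    return dr

def response_direct(sigma, force, kT=1.0):
    """Same displacement without diagonalization: sigma f / kT."""
    return [(sigma[i][0] * force[0] + sigma[i][1] * force[1]) / kT for i in range(2)]

# Synthetic trajectory: a large-amplitude collective motion plus a small local one.
n_frames = 200
trajectory = []
for k in range(n_frames):
    t = 2.0 * math.pi * k / n_frames
    big, small = math.cos(t), 0.1 * math.sin(2.0 * t)
    trajectory.append((big + small, big - small))

sigma = covariance(trajectory)
vals, vecs = eig_sym_2x2(sigma)
force = (0.3, -0.1)                                # hypothetical perturbing force
print("fraction of fluctuation in leading component:", vals[0] / sum(vals))
print("response from modes:", response_from_modes(vals, vecs, force))
print("response, direct   :", response_direct(sigma, force))
```

The two routes give the same displacement, as they must, but the mode-by-mode form makes it possible to truncate the sum to the few dominant components when the matrix is large.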
Error Propagation

Sensitivity analysis can also be applied to examine the error propagation that arises from the use of nonoptimal potential constants in empirical force fields. This issue is less well studied in biomolecular simulations because it is expensive to repeat many biomolecular simulations with many different choices of potential parameters, as needed to evaluate the sensitivity of results to the uncertainties of potential energy parameters. However, one can gain some insight into this issue by calculating the derivatives of many properties/observables with respect to each parameter of a potential model; these derivatives can be calculated relatively easily because they are obtained by carrying out simulations on one reference system only. Once these derivatives are obtained, one can use a Taylor series expansion

$$\Delta O = \sum_i \frac{\partial O}{\partial \lambda_i}\,\Delta\lambda_i + \mathcal{O}\!\left([\Delta\lambda_i]^2\right) \qquad [44]$$

to estimate the effects on simulation results when the force field potential constants are modified. In Eq. [44], $O$ is a simulated property, $\Delta\lambda_i$ is the change of the value of the force field parameter $\lambda_i$ from its value in the force field $F_1$, with which a simulation is done, to its value in a different force field $F_2$, and $\Delta O$ provides an estimate of the change of the value of $O$ when the force field is changed from $F_1$ to $F_2$. This approach was used to study the sensitivity of the calculation of the free energy difference between solutions of serine and threonine dipeptides in methanol when the atomic charges of these molecules were changed from those in the GROMOS force field16 (which was used in the simulation) to those of the AMBER21 and OPLS80 force fields. In this study, it was found that the absolute free energy of each system could be quite sensitive to the choice of atomic partial charges, but the differences of the free energies between these two solutions were less sensitive, suggesting the occurrence of cancellation of errors in free energy difference calculations. Upon changing from the GROMOS atomic partial charges to the AMBER atomic partial charges, for example, the free energy of the serine dipeptide changed by 81.9 kJ/mol and that of the threonine dipeptide by 78.5 kJ/mol (both were estimated by using the first-order approximation of Eq. [44]). The change in the free energy difference between these two systems is only 3.4 kJ/mol. A complete analysis should include in Eq. [44] other force field parameters, because there could exist correlations among different potential energy parameters when they were determined from fittings to suitable theoretical and/or experimental data of model compounds. The sensitivity of each absolute free energy to changes of force field parameters might therefore be smaller than in the example above if other force field parameters were included in the analysis.
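The first-order estimate of Eq. [44] and the cancellation of errors in a free energy difference can be illustrated with a short sketch. The derivative values and charge changes below are made up for illustration and are not taken from the serine/threonine study; the function name `first_order_change` is likewise hypothetical.

```python
import numpy as np

def first_order_change(derivatives, delta_params):
    """First-order Taylor estimate: dO ~= sum_i (dO/dlambda_i) * dlambda_i."""
    derivatives = np.asarray(derivatives, dtype=float)
    delta_params = np.asarray(delta_params, dtype=float)
    return float(derivatives @ delta_params)

# Hypothetical derivatives of two free energies with respect to three
# partial charges (kJ/mol per unit charge), each from one reference simulation.
dG_serine = [120.0, -80.0, 45.0]
dG_threonine = [118.0, -79.0, 40.0]

# Hypothetical charge differences between a second force field and the
# reference force field used in the simulation.
delta_q = [0.10, -0.05, 0.02]

change_ser = first_order_change(dG_serine, delta_q)
change_thr = first_order_change(dG_threonine, delta_q)

# Similar derivatives lead to similar shifts, so the difference is small.
change_in_difference = change_ser - change_thr
```

In this toy case each free energy shifts by about 17 kJ/mol while their difference shifts by well under 1 kJ/mol, mirroring the pattern reported in the text (81.9 and 78.5 kJ/mol individually, 3.4 kJ/mol for the difference).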
However, one cannot always rely on adjusting the parameters of one type of potential energy term to compensate for a deficiency in determining the parameters of another potential energy term. For example, the nature of the interactions described by Coulombic terms differs quite markedly from that described by Lennard-Jones terms. If the atomic partial charges employed in Coulombic terms are not
adequate for describing the electrostatic properties of a system, one cannot always rely on adjusting the Lennard-Jones parameters to make up for this deficiency. This argument carries even more weight when one wants to describe additional properties of a system reliably from a simulation model. The more properties one wants to predict properly, the less flexibility one has in adjusting the parameters of potential energy terms of one type to compensate for the inappropriate determination of the parameters of potential energy terms of other types. An advantage of calculating sensitivity coefficients is that they can help identify the parameters that are most responsible for a group of system properties. This information can be obtained by examining the sensitivity coefficients and identifying those having the largest magnitudes. One can also employ the principal component analysis technique,13 so that the effects due to the correlations among potential parameters can be accounted for. It is useful to consider the same simple example of computing the free energy difference between serine and threonine dipeptides in methanol to illustrate this. We carried out a principal component analysis on a sensitivity matrix S containing the matrix elements $\partial O_i/\partial \lambda_j$, where $O_i$ is a calculated free energy of a serine or a threonine dipeptide in methanol and $\lambda_j$ is an atomic partial charge of the solute. By carrying out a singular value decomposition on S as described by Eq. [24] with M = S, one obtains the eigenvalues of the matrix D and the eigenvectors contained in the matrices U and V.
The number of nonzero eigenvalues was at most the smaller of the two dimensions of the sensitivity matrix S: the number of systems simulated (two in this example) and the number of different parameters considered in the analysis (16 in this example; they were the atomic partial charges associated with the 16 atoms/extended atoms listed in Table 5, footnote a). There were thus at most two principal components in this example. To understand the insights that each principal component provided, one could use Eq. [25] to write $dO_1$ and $dO_2$ in the following form:
$$dO_1 = U_{1,1} D_{1,1}\, \mathbf{V}_1 \cdot d\boldsymbol{\lambda} + U_{1,2} D_{2,2}\, \mathbf{V}_2 \cdot d\boldsymbol{\lambda}$$

$$dO_2 = U_{2,1} D_{1,1}\, \mathbf{V}_1 \cdot d\boldsymbol{\lambda} + U_{2,2} D_{2,2}\, \mathbf{V}_2 \cdot d\boldsymbol{\lambda} \qquad [45]$$

where $\mathbf{V}_i$ is an eigenvector contained in the matrix V. Only the eigenvectors corresponding to the two major principal components need to be considered in this example because only these two eigenvectors were associated with nonzero eigenvalues. The first terms of Eqs. [45] arose from the first principal component (associated with the largest eigenvalue), and the second terms of Eqs. [45] originated from the second principal component (associated with the second largest eigenvalue). For the first principal component, $U_{1,1} = -0.74$ and $U_{2,1} = -0.67$ were nearly the same. Accordingly, the first terms produced approximately the same effect on the free energy of the two solutions (recall from Eqs. [26] that the vectors in the matrix U are associated with the observables or simulated properties). Thus, the first principal component is a general measure of the total free energy sensitivities of the two systems. On the other hand, $U_{1,2} = -0.67$ and $U_{2,2} = 0.74$ of the second principal component had about the same magnitude but different signs. Accordingly, the second principal component describes features that accounted for the differences of the free energy of the two solutions. By examining the eigenvector of V for the second principal component, one could identify the atomic partial charges that were most crucial in determining the free energy difference of the two solutions. The largest component of this eigenvector was associated with the atomic partial charge of the γ carbon (in the extended atom representation) of the threonine dipeptide. This result suggested that the free energy difference of the two solutions was largely accounted for by atoms in or near the γ-methyl group of the threonine dipeptide. Consequently, the uncertainty of the free energy difference was largely determined by the uncertainty of the potential parameters associated with atoms near the γ-methyl group of the threonine dipeptide. The uncertainty of the parameters associated with the other atoms of the solutes might not affect the calculated free energy difference of the two solutions very much.

Table 5. Principal Component Analysis of the Free Energies of Serine and Threonine Dipeptides in Methanol at 300 K^a

Quantity      j = 1      j = 2
D_{j,j}       1363       197
U_{1,j}       -0.74      -0.67
U_{2,j}       -0.67       0.74
V_{1,j}       -0.17       0.04
V_{2,j}        0.01       0.05
V_{3,j}       -0.27       0.03
V_{4,j}        0.50       0.05
V_{5,j}        0.43      -0.04
V_{6,j}       -0.17      -0.02
V_{7,j}        0.06       0.38
V_{8,j}        0.05       0.02
V_{9,j}        0.18       0.02
V_{10,j}      -0.10       0.80
V_{11,j}      -0.06      -0.03
V_{12,j}      -0.24      -0.19
V_{13,j}       0.10      -0.01
V_{14,j}       0.44       0.09
V_{15,j}       0.33      -0.01
V_{16,j}       0.05      -0.40

^a D_{j,j} are the diagonal elements of the matrix D in Eq. [24]. U_{i,j} is a matrix element of U, where i labels one of the two solutions and j labels the jth principal component. V_{i,j} is a matrix element of V, where i labels one of the following atoms/extended atoms of the two dipeptides: 1 = CH3, 2 = C, 3 = O, 4 = N, 5 = H, 6 = Cα, 7 = Cβ(S), 8 = Oγ, 9 = H, 10 = Cγ, 11 = C, 12 = O, 13 = N, 14 = H, 15 = CH3, 16 = Cβ(T), and j labels a principal component.

This principal component analysis further demonstrates how cancellation of errors can occur in free energy difference calculations. Although the parameters employed in free energy calculations may not be optimal, cancellations of errors can occur when the free energy difference between two similar systems is calculated as described above. This is because the uncertainties of many parameters produce comparable effects on the free energy of two similar systems. This example also illustrates why it is easier to calculate the difference between two free energy changes than to calculate the free energy changes individually. However, as the systems become more different, more parameters may become significant in determining the free energy difference between the two systems. This subset of “essential” parameters must still be sufficiently reliable to give a proper estimate of the free energy difference between the two systems. Principal component analysis can also be carried out by using simulation data obtained under different conditions, such as at different temperatures, so that more observables can be used to construct a larger sensitivity matrix. This has been done for serine and threonine dipeptides in methanol,29 but the key findings were essentially the same as those described above when the results from only two simulations were used in the analysis.
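The principal component analysis described above can be sketched with a singular value decomposition of a small sensitivity matrix. The 2 x 16 matrix below is made up (it is not the matrix from the dipeptide study): both rows share a common set of sensitivities, and one parameter, arbitrarily placed at index 9, matters only for the second system. The variable names are illustrative.

```python
import numpy as np

n_params = 16
common = np.linspace(-1.0, 1.0, n_params)  # sensitivities shared by both systems (made up)
extra = np.zeros(n_params)
extra[9] = 1.0                             # one parameter that matters only for system 2

# Rows: observables (two free energies); columns: parameters.
S = np.vstack([common, common + extra])

# Thin SVD: S = U @ diag(d) @ Vt, with at most min(2, 16) = 2 singular values.
U, d, Vt = np.linalg.svd(S, full_matrices=False)

# First component: both observables respond in the same direction.
same_sign_first = U[0, 0] * U[1, 0] > 0
# Second component: the observables respond in opposite directions.
opposite_sign_second = U[0, 1] * U[1, 1] < 0
# Parameter that dominates the second (difference-like) component.
key_parameter = int(np.argmax(np.abs(Vt[1])))

# Contribution of each component to dO for a trial parameter change,
# in the same form as the two-term expansion of dO_1 and dO_2.
d_lambda = np.full(n_params, 0.01)
contributions = U * d * (Vt @ d_lambda)    # shape: (2 observables, 2 components)
dO = contributions.sum(axis=1)
```

Here the second component singles out the one parameter that distinguishes the two systems, in the same way that the γ carbon of the threonine dipeptide stands out in Table 5.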
Potential Energy Function Refinement

Sensitivity analysis is also a tool that can help to refine potential energy functions for (bio)molecular simulations. Sensitivity analysis can help one decide whether a specific feature needs to be included in a potential function for describing a specified set of properties of a given class of molecules. For example, because point charge models are commonly used in (bio)molecular modeling, it is useful to inquire whether a dispersed charge representation would improve the description of intra- and intermolecular electrostatic interactions. One study of this type was carried out by Zhu and Wong,40 who included in the force field a squared Lorentzian function $f(\mathbf{r} - \mathbf{r}_k)$, Eq. [46], to describe the smearing out of charges about the oxygens and hydrogens of water molecules in the simulation of liquid water. In Eq. [46], $\mathbf{r}_k$ is the coordinate vector of a water oxygen or hydrogen, and $a$ is a parameter that controls the width of the charge dispersion. In a simulation of several properties of liquid water (internal energy, pressure, isothermal compressibility, Kirkwood factor, radial distribution functions, and distribution function of the interaction energy of a water molecule with its surroundings) using a polarizable water model with this artificially dispersed charge representation, the sensitivity of the
properties to perturbations of $a$, measured by first-order, log-normalized sensitivity coefficients, was two orders of magnitude smaller than the sensitivity to the parameters to which these properties were most sensitive. This finding suggests that a point charge representation is adequate for describing many properties of liquid water. The use of a more expensive dispersed charge representation was not crucial for describing the properties of liquid water discussed above. This example illustrates how sensitivity analysis can help to simplify a potential model for (bio)molecular simulations. The sensitivity coefficients obtained from a sensitivity analysis can also help guide the optimization of the parameters of an empirical force field to best fit a given set of experimental/theoretical data. In fact, first-order sensitivity coefficients are quantities that are typically calculated in least-squares refinement programs for optimizing force field parameters to best fit a set of experimental/theoretical data.15 The difference between a brute force least-squares refinement of model parameters and a sensitivity analysis is that the latter analyzes the informational content provided by these sensitivity coefficients, thereby helping to refine a set of parameters more intelligently. For example, sensitivity coefficients with small absolute magnitudes identify parameters that may not be readily refined by a given set of experimental/theoretical data. Other suitable experimental/theoretical data must be included before these parameters can be reliably determined. Including such poorly determined parameters in a parameter refinement process may degrade the determination of the other parameters.
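A log-normalized first-order sensitivity coefficient of the kind mentioned above can be sketched as follows. The sketch assumes the usual definition, the derivative of ln O with respect to ln λ, and estimates it by a central difference in log space. The observable is a made-up positive function chosen so that one parameter (`q`) matters strongly and another (`a`) only weakly; it is not the water model of the cited study, and all names are illustrative.

```python
import math

def log_normalized_sensitivity(observable, params, name, h=1e-4):
    """Central-difference estimate of d ln O / d ln(lambda) for one parameter.

    Assumes the observable and the parameter are both positive.
    """
    up = dict(params)
    up[name] = params[name] * math.exp(h)
    down = dict(params)
    down[name] = params[name] * math.exp(-h)
    return (math.log(observable(up)) - math.log(observable(down))) / (2.0 * h)

def toy_observable(p):
    """Made-up positive property: strongly dependent on 'q', weakly on 'a'."""
    return p["q"] ** 2 * (1.0 + 0.01 * p["a"])

reference = {"q": 2.0, "a": 1.0}
sensitivities = {
    name: log_normalized_sensitivity(toy_observable, reference, name)
    for name in reference
}

# Rank parameters by the magnitude of their sensitivity coefficients.
ranked = sorted(sensitivities, key=lambda n: abs(sensitivities[n]), reverse=True)
```

Because the coefficients are dimensionless, parameters with different units can be compared directly; in this toy case the coefficient for `a` is roughly two orders of magnitude smaller than that for `q`, which is the kind of evidence that would suggest a feature can be dropped or cannot be refined from the available data.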
An example of parameter optimization is found in the determination of atomic partial charges.81 A popular technique currently in use is to derive the atomic partial charges of a molecule by fitting these charges to the quantum mechanical electrostatic potential calculated at a number of points around the molecule. Partial charges of atoms buried inside the molecule are usually less well defined than those of exposed atoms because the electrostatic potential is usually calculated at points outside the van der Waals surface of the molecule. (Within the van der Waals surface, quantum effects are also significant; classical electrostatics is inadequate for describing the interactions between two atoms that are closer than the sum of their van der Waals radii.) When no additional appropriate data are provided for determining the atomic partial charges, it has been found that imposing constraints on the buried charges to keep them close to some physically meaningful values can help obtain a more reasonable set of charges for the molecule. The charges derived in this way can be more readily transferred to other (similar) molecules, and they better describe intramolecular electrostatic interactions.81

Correlations among parameters and observables can complicate a parameter refinement process. There may be insufficient data to determine N parameters of a force field given N experimental or theoretical data points if there exist significant correlations among the potential constants or among the data used for determining these parameters. A principal component analysis can
analyze such correlation behavior and reveal how many useful relations are really provided by a set of experimental/theoretical data. To help illustrate this application, it is again useful to take the same simple example above for evaluating the free energy of the serine and threonine dipeptides in methanol.29 In this case, the principal component analysis gave only two principal components that were associated with nonzero eigenvalues (or singular values), indicating that two potential parameters at most could be determined by using the free energies of the two solutions. However, the first eigenvalue was almost an order of magnitude larger than the second eigenvalue, suggesting that the first principal component was significantly higher in informational content than the second with respect to determining the atomic partial charges of the two peptides. When this analysis was extended to include nine solutions of the two solutes in two different solvents (methanol and water) at different temperatures, the informational content did not increase much.29 Adding seven more sets of data for assisting the parameter refinement increased the number of useful principal components only by approximately one, and the total number of principal components associated with eigenvalues of significant magnitudes was only three. Therefore, from a sensitivity analysis and a principal component analysis, one can gain insights into how many useful relations can be derived from a given set of experimental/theoretical data for refining force field parameters. These analyses can also be useful in the selection of suitable experimental/theoretical data to use for force field parameterization.
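One simple way to turn "eigenvalues of significant magnitude" into a count is to compare each singular value with the largest one. The sketch below does this for the two values listed in Table 5. The relative thresholds are arbitrary illustrative choices, not criteria taken from the study, and `count_significant` is a hypothetical helper.

```python
def count_significant(singular_values, rel_tol):
    """Count singular values that are at least rel_tol times the largest one."""
    largest = max(singular_values)
    return sum(1 for s in singular_values if s >= rel_tol * largest)

table5_values = [1363.0, 197.0]  # D_11 and D_22 from Table 5

# Ratio of the first to the second value ("almost an order of magnitude").
ratio = table5_values[0] / table5_values[1]

# The answer depends on the (arbitrary) threshold chosen.
n_loose = count_significant(table5_values, rel_tol=0.1)
n_strict = count_significant(table5_values, rel_tol=0.2)
```

With a 10% threshold both components count as useful, whereas a 20% threshold keeps only the first; the dependence on the threshold is a reminder that the number of "useful relations" is a judgment call rather than a sharp quantity.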
Ideally, one would like to include the smallest amount of data containing the largest amount of information; the judicious choice of experimental/theoretical data needed to accomplish this can help reduce the computational costs of refining a set of potential parameters. Similarly, due to the correlation of potential parameters, N relationships may not readily determine N potential parameters. For the example above of serine and threonine dipeptides in methanol,29 the two relationships provided by the two sets of data involved more than two force field parameters, as suggested by the many components of the eigenvectors having significant magnitudes in V. If only two coefficients were nonzero for the two eigenvectors, these two relationships could have readily determined the two parameters. Because more than two coefficients were nonzero, there exist infinitely many combinations of parameters that could give the same free energy values for the two solutions (assuming the free energies of the two solutions are the only data available for determining these parameters). This phenomenon is ubiquitous in (bio)molecular modeling. Different force fields with a similar functional form commonly have different values of analogous parameters. Even so, similar values of certain selected properties can often be obtained by those different force fields. That many combinations of parameters can describe certain properties with comparable accuracy is consistent with
the principal component analysis of dipeptides discussed above, i.e., that different parameter sets may give similar values for a subset of (bio)molecular properties. However, the more properties one requires a force field to describe properly, the more one needs to choose the parameters carefully, because it becomes less likely that adjusting one or more of those parameters will serve to compensate for the deficiency of a poorly determined parameter. Nevertheless, for many applications, it is probably inevitable and adequate to develop ad hoc problem-specific force fields. Force fields must be relatively simple and computationally efficient for studying complex macromolecules such as proteins and DNA. The force fields usually describe properties of certain types better than others, depending on how the force fields were developed. We have already learned from the sensitivity analysis studies of liquid water and a two-dimensional square lattice model of protein folding that different system properties can be determined by different features of a potential model. An example employing a more realistic force field can also be found in the application of sensitivity analysis to study the determinants of the structural and thermodynamic properties of the protein avian pancreatic polypeptide (APP).19 It was found that the size and shape of the protein were determined to a large extent by electrostatic interactions, whereas the free energy of the protein was more sensitive to the surface-area-dependent solvation energy terms that modeled hydrophobic effects. Consequently, it is possible to develop an ad hoc force field that is designed to describe certain classes of (bio)molecular properties properly. The failure of such an ad hoc force field to describe properties of other types does not necessarily indicate that this force field is useless; rather, caution should be exercised in any attempt to apply it to other properties.
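The statement that two relationships leave infinitely many parameter combinations undetermined can be checked numerically: any parameter change lying in the null space of the sensitivity matrix leaves both observables unchanged to first order. The sketch below uses a random 2 x 16 matrix as a stand-in for a real sensitivity matrix; the matrix and the variable names are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
S = rng.normal(size=(2, 16))  # made-up 2 x 16 sensitivity matrix

# Full SVD: rows of Vt beyond the first len(d) span the null space of S
# (valid here because this random S has full row rank).
_, d, Vt = np.linalg.svd(S, full_matrices=True)
null_basis = Vt[len(d):]      # shape (14, 16)

# Any combination of null-space directions changes the parameters
# without changing either observable to first order.
d_lambda = null_basis.T @ rng.normal(size=null_basis.shape[0])
dO = S @ d_lambda
```

With 16 parameters and two observables there are 14 independent directions of this kind, which is one way to see why different force fields with different parameter values can reproduce the same small set of properties.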
CONCLUSIONS

In this chapter, we summarized some of the recent developments in the sensitivity analysis approach for biomolecular simulations. Although more work needs to be done to exploit the full capability of the sensitivity analysis approach, the initial applications of this technique have already generated many useful insights for enhancing biomolecular simulations and improving models for carrying them out. The sensitivity analysis approach is an efficient and effective method for systematically identifying the determinants of interesting biomolecular properties, which can be difficult to identify by intuition alone. Although first-order sensitivity theory is not always reliable in predicting the properties of a structurally modified (bio)molecule, it may be useful as a preliminary classification tool for suggesting a small number of modifications
that can be further exploited with more sophisticated (but more expensive) computational simulations and/or experimental studies. The advantage of first-order sensitivity theory is that it is relatively inexpensive to use. The reliability of sensitivity analysis in the design of novel bioactive compounds can be improved by employing higher order sensitivity theory, with the Gaussian-type approximation as a successful example. The encouraging preliminary applications of semiempirical linear response theories to predicting free energy changes should also fuel further research on exploiting the full capability of this approach. The molecular dynamics/Monte Carlo Green’s function approach,24 which is an extension of the molecular mechanics Green’s function approach and is a special form of sensitivity analysis, is tightly connected to the essential dynamics method introduced recently for studying the possible functional roles of collective modes of biomolecules. An advantage of the Green’s function approach is that the effects arising from the introduction of perturbations to a (bio)molecule by its interacting partners can be explicitly included to predict how these perturbations may affect the structure of the (bio)molecule. The sensitivity analysis approach has also been shown to be useful for studying error propagation due to the use of nonoptimal parameters in biomolecular simulations and for examining how error cancellations may occur in free energy difference calculations. The sensitivity analysis approach can also suggest how potential functions could be simplified and how the parameters of these functions can be effectively refined.
Although more work needs to be carried out to fully examine the utility and limitations of the sensitivity analysis approach in (bio)molecular modeling, this methodology has already produced useful insights into the determinants of (bio)molecular properties that are difficult to obtain by intuition alone. A key strength of this approach is its ability to examine in an efficient manner many possible factors that may determine a set of (bio)molecular properties. It will be interesting to see how the sensitivity analysis approach can be used with other computational and experimental techniques to gain even deeper insights into the determining factors.
ACKNOWLEDGMENTS

Some of the research described in this chapter was supported by the Petroleum Research Fund administered by the American Chemical Society, the National Institutes of Health, the Office of Naval Research, and the Bristol-Myers Squibb Institute for Medical Research. Work carried out in our laboratories involved a number of collaborators: Richard E. Bleil, Axel Brünger, Gauri Misra, Robert B. Nachbar Jr., Clarence Schutt, Tom Simonson, Roberta Susnow, Qiang Wang, Hong Zhang, and Sheng-bai Zhu.
REFERENCES

1. J. A. McCammon and S. C. Harvey, Dynamics of Proteins and Nucleic Acids, Cambridge University Press, Cambridge, 1987.
2. C. L. Brooks III, M. Karplus, and B. M. Pettitt, Proteins: A Theoretical Perspective of Dynamics, Structure, and Thermodynamics, Wiley, New York, 1988.
3. T. P. Lybrand, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1990, Vol. 1, pp. 295-320. Computer Simulation of Biomolecular Systems Using Molecular Dynamics and Free Energy Perturbation Methods.
4. A. E. Torda and W. F. van Gunsteren, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1992, Vol. 3, pp. 143-172. Molecular Modeling Using NMR Data.
5. T. P. Straatsma, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1996, Vol. 9, pp. 81-127. Free Energy by Molecular Simulation.
6. R. Tomovic and M. Vukobratovic, General Sensitivity Theory, American Elsevier, New York, 1972.
7. P. Franck, Introduction to System Sensitivity Theory, Academic Press, New York, 1987.
8. L. Eno and H. Rabitz, Adv. Chem. Phys., 51, 177 (1982). Sensitivity Analysis and Its Role in Quantum Scattering Theory.
9. H. Rabitz, M. Kramer, and D. Dacol, Annu. Rev. Phys. Chem., 34, 419 (1983). Sensitivity Analysis in Chemical Kinetics.
10. H. Rabitz, Chem. Rev., 87, 101 (1987). Chemical Dynamics and Kinetics Phenomena as Revealed by Sensitivity Analysis Techniques.
11. H. Rabitz, Science, 246, 221 (1989). System Analysis at the Molecular Scale.
12. G. E. Forsythe, M. A. Malcolm, A. Michael, and C. B. Moler, Computer Methods for Mathematical Computations, Prentice-Hall, Englewood Cliffs, NJ, 1977.
13. T. W. Anderson, An Introduction to Multivariate Statistical Analysis, Wiley, New York, 1958.
14. U. Dinur and A. T. Hagler, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1991, Vol. 2, pp. 99-164. New Approaches to Empirical Force Fields.
15. J. P. Bowen and N. L. Allinger, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1991, Vol. 2, pp. 81-97. Molecular Mechanics: The Art and Science of Parameterization.
16. W. F. van Gunsteren and H. J. C. Berendsen, GROMOS, Groningen, Netherlands, 1987.
17. S.-b. Zhu and C. F. Wong, J. Chem. Phys., 98, 8892 (1993). Sensitivity Analysis of Water Thermodynamics.
18. R. E. Bleil, C. F. Wong, and H. Rabitz, J. Phys. Chem., 99, 3379 (1995). Sensitivity Analysis of a Two-Dimensional Lattice Model of Protein Folding.
19. H. Zhang, C. F. Wong, T. Thacher, and H. Rabitz, Proteins: Struct., Funct., Genet., 23, 218 (1995). Parametric Sensitivity Analysis of Avian Pancreatic Polypeptide (APP).
20. B. R. Brooks, R. E. Bruccoleri, B. D. Olafson, D. J. States, S. Swaminathan, and M. Karplus, J. Comput. Chem., 4, 187 (1983). CHARMM: A Program for Macromolecular Energy, Minimization, and Dynamics Calculations.
21. S. J. Weiner, P. A. Kollman, D. A. Case, U. C. Singh, C. Ghio, G. Alagona, S. Profeta Jr., and P. Weiner, J. Am. Chem. Soc., 106, 765 (1984). A New Force Field for Molecular Mechanical Simulation of Nucleic Acids and Proteins.
22. R. Susnow, R. B. Nachbar Jr., C. Schutt, and H. Rabitz, J. Phys. Chem., 95, 8585 (1991). Sensitivity of Molecular Structure to Intramolecular Potentials.
23. R. Susnow, R. B. Nachbar Jr., C. Schutt, and H. Rabitz, J. Phys. Chem., 95, 10662 (1991). Study of Amide Structure Through Sensitivity Analysis.
24. C. F. Wong, C. Zheng, J. Shen, J. A. McCammon, and P. G. Wolynes, J. Phys. Chem., 97, 3100 (1993). Cytochrome c: A Molecular Proving Ground for Computer Simulations.
25. P. H. Hünenberger, A. E. Mark, and W. F. van Gunsteren, J. Mol. Biol., 252, 492 (1995). Fluctuation and Cross-Correlation Analysis of Protein Motions Observed in Nanosecond Molecular Dynamics Simulations.
26. M. Born and K. Huang, Dynamical Theory of Crystal Lattices, Clarendon Press, Oxford, 1954.
27. M. Karplus and J. N. Kushick, Macromolecules, 14, 325 (1981). Method for Estimating the Configurational Entropy of Macromolecules.
28. R. M. Levy, M. Karplus, J. Kushick, and D. Perahia, Macromolecules, 17, 1370 (1984). Evaluation of the Configurational Entropy for Proteins: Application to Molecular Dynamics Simulations of an α-Helix.
29. C. F. Wong and H. Rabitz, J. Phys. Chem., 95, 9628 (1991). Sensitivity Analysis and Principal Component Analysis in Free Energy Calculations.
30. H. J. C. Berendsen, J. P. M. Postma, W. F. van Gunsteren, and J. Hermans, in Intermolecular Forces, B. Pullman, Ed., Reidel, Dordrecht, 1981, pp. 331ff.
31. W. L. Jorgensen, J. Chandrasekhar, J. D. Madura, R. W. Impey, and M. L. Klein, J. Chem. Phys., 79, 926 (1983). Comparison of Simple Potential Functions for Simulating Liquid Water.
32. M. Sprik and M. L. Klein, J. Chem. Phys., 89, 7556 (1988). A Polarizable Model for Water Using Distributed Charge Sites.
33. J. W. Halley, J. R. Rustad, and A. Rahman, J. Chem. Phys., 98, 4110 (1993). A Polarizable, Dissociating Molecular Dynamics Model for Liquid Water.
34. D. N. Bernardo, Y. Ding, and K. Krogh-Jespersen, J. Phys. Chem., 98, 4180 (1994). An Anisotropic Polarizable Water Model: Incorporation of All-Atom Polarizabilities into Molecular Mechanics Force Fields.
35. R. D. Mountain, J. Chem. Phys., 103, 3084 (1995). Comparison of a Fixed-Charge and a Polarizable Water Model.
36. I. M. Svishchev, P. G. Kusalik, and R. J. Boyd, J. Chem. Phys., 105, 4742 (1996). Polarizable Point-Charge Model for Water: Results Under Normal and Extreme Conditions.
37. A. A. Chialvo and P. T. Cummings, J. Chem. Phys., 105, 8274 (1996). Engineering a Simple Polarizable Model for the Molecular Simulation of Water Applicable over Wide Ranges of State Conditions.
38. S. Zhu, S. Yao, J. Zhu, S. Singh, and G. W. Robinson, J. Phys. Chem., 95, 6211 (1991). A Flexible/Polarizable Simple Point Charge Water Model.
39. S.-b. Zhu, S. Singh, and G. W. Robinson, J. Chem. Phys., 95, 2791 (1991). A New Flexible/Polarizable Water Model.
40. S.-b. Zhu and C. F. Wong, J. Phys. Chem., 98, 4695 (1994). Sensitivity Analysis of a Polarizable Water Model.
41. S.-b. Zhu and C. F. Wong, J. Chem. Phys., 99, 9047 (1993). Sensitivity Analysis of Distribution Functions of Liquid Water.
42. M. P. Allen and D. J. Tildesley, Computer Simulation of Liquids, Oxford University Press, Oxford, 1987.
43. W. Yu, C. F. Wong, and J. Zhang, J. Phys. Chem., 100, 15280 (1996). Brownian Dynamics Simulations of Polyalanine in Salt Solutions.
44. S. Huston and P. J. Rossky, J. Phys. Chem., 93, 7888 (1989). Free Energies of Association for the Sodium-Dimethyl Phosphate Ion Pair in Aqueous Solution.
45. J. S. Bader and D. Chandler, J. Phys. Chem., 96, 6423 (1992). Computer Simulation Study of the Mean Forces Between Ferrous and Ferric Ions in Water.
46. H. Schreiber and O. Steinhauser, Biochemistry, 31, 5856 (1992). Cutoff Size Does Strongly Influence Molecular Dynamics Results on Solvated Polypeptides.
47. G. K. Ackers and F. R. Smith, Annu. Rev. Biochem., 54, 597 (1985). Effects of Site-Specific Amino Acid Modification on Protein Interactions and Biological Function.
48. S. M. Green and D. Shortle, Biochemistry, 32, 10131 (1993). Patterns of Nonadditivity Between Pairs of Stability Mutations in Staphylococcal Nuclease.
49. V. J. LiCata and G. K. Ackers, Biochemistry, 34, 3133 (1995). Long-Range, Small Magnitude Nonadditivity of Mutational Effects in Proteins.
50. A. R. Fersht, A. Matouschek, and L. Serrano, J. Mol. Biol., 224, 771 (1992). The Folding of an Enzyme. I. Theory of Protein Engineering Analysis of Stability and Pathway of Protein Folding.
51. P. R. Gerber, A. E. Mark, and W. F. van Gunsteren, J. Comput.-Aided Mol. Design, 7, 305 (1993). An Approximate But Efficient Method to Calculate Free Energy Trends by Computer Simulation: Application to Dihydrofolate Reductase-Inhibitor Complexes.
52. P. Cieplak, D. A. Pearlman, and P. A. Kollman, J. Chem. Phys., 101, 627 (1994). Walking on the Free Energy Hypersurface of the 18-Crown-6 Ion System Using Free Energy Derivatives.
53. P. Cieplak and P. A. Kollman, J. Mol. Recognit., 9, 103 (1996). A Technique to Study Molecular Recognition in Drug Design: Preliminary Application of Free Energy Derivatives to Inhibition of a Malarial Cysteine Protease.
54. R. M. Levy, M. Belhadj, and D. B. Kitchen, J. Chem. Phys., 95, 3627 (1991). Gaussian Fluctuation Formula for Electrostatic Free-Energy Changes in Solution.
55. G. S. Del Buono, E. Freire, and R. M. Levy, Proteins: Struct., Funct., Genet., 20, 85 (1994). Intrinsic pKas of Ionizable Residues in Proteins: An Explicit Solvent Calculation for Lysozyme.
56. T. Simonson, C. F. Wong, and A. T. Brünger, J. Phys. Chem. A, 101, 1935 (1997). Classical and Quantum Simulations of Tryptophan in Solution.
57. R. W. Zwanzig, J. Chem. Phys., 22, 1420 (1954). High-Temperature Equation of State by Perturbation Method. I. Nonpolar Gases.
58. J. Åqvist, C. Medina, and J.-E. Samuelsson, Protein Eng., 7, 385 (1994). A New Method for Predicting Binding Affinity in Computer-Aided Drug Design.
59. A. Warshel and S. T. Russell, Q. Rev. Biophys., 17, 283 (1984). Calculations of Electrostatic Interactions in Biological Systems and in Solutions.
60. B. Roux, H.-a. Yu, and M. Karplus, J. Phys. Chem., 94, 4683 (1990). Molecular Basis for the Born Model of Ion Solvation.
61. H. A. Carlson and W. L. Jorgensen, J. Phys. Chem., 99, 10667 (1995). An Extended Linear Response Method for Determining Free Energies of Hydration.
62. M. D. Paulsen and R. L. Ornstein, Protein Eng., 9, 567 (1996). Binding Free Energy Calculations for P450cam-Substrate Complexes.
63. A. Amadei, A. B. M. Linssen, and H. J. C. Berendsen, Proteins: Struct., Funct., Genet., 17, 412 (1993). Essential Dynamics of Proteins.
64. D. M. F. van Aalten and A. Amadei, Proteins: Struct., Funct., Genet., 22, 45 (1995). The Essential Dynamics of Thermolysin: Confirmation of the Hinge-Bending Motion and Comparison of Simulations in Vacuum and Water.
65. R. M. Scheek, N. A. J. Van Nuland, B. L. De Groot, A. B. M. Linssen, and A. Amadei, J. Biomol. NMR, 6, 106 (1995). Structure from NMR and Molecular Dynamics: Distance Restraining Inhibits Motion in Essential Subspace.
66. D. van der Spoel, B. L. de Groot, S. Hayward, H. J. C. Berendsen, and H. J. Vogel, Protein Sci., 5, 2044 (1996). Bending of the Calmodulin Central Helix: A Theoretical Study.
67. A. Garcia, Phys. Rev. Lett., 68, 2696 (1992). Large-Amplitude Nonlinear Motions in Proteins.
68. T. Ichiye and M. Karplus, Proteins: Struct., Funct., Genet., 11, 205 (1991). Collective Motions in Proteins: A Covariance Analysis of Atomic Fluctuations in Molecular Dynamics and Normal Mode Simulations.
69. A. Kitao, F. Hirata, and N. Gō, Chem. Phys., 158, 447 (1991). The Effects of Solvent on the Conformation and the Collective Motions of Protein: Normal Mode Analysis and Molecular Dynamics Simulations of Melittin in Water and in Vacuum.
70. S. Hayward, A. Kitao, F. Hirata, and N. Gō, J. Mol. Biol., 234, 1207 (1993). Effect of Solvent on Collective Motions in Globular Proteins.
71. N. Kobayashi, T. Yamato, and N. Gō, Proteins: Struct., Funct., Genet., 28, 109 (1997). Mechanical Property of a TIM-Barrel Protein.
72. N. Gō, T. Noguti, and T. Nishikawa, Proc. Natl. Acad. Sci. U.S.A., 80, 3696 (1983). Dynamics of a Small Globular Protein in Terms of Low-Frequency Vibrational Modes.
73. B. R. Brooks and M. Karplus, Proc. Natl. Acad. Sci. U.S.A., 80, 6571 (1983). Harmonic Dynamics of Proteins: Normal Modes and Fluctuations in Bovine Pancreatic Trypsin Inhibitor.
74. M. Levitt, C. Sander, and P. S. Stern, J. Mol. Biol., 181, 423 (1985). Protein Normal-Mode Dynamics: Trypsin Inhibitor, Crambin, Ribonuclease and Lysozyme.
75. T. Simonson and D. Perahia, Biophys. J., 61, 410 (1992). Normal Modes of Symmetric Protein Assemblies: Application to the Tobacco Mosaic Virus Protein Disk.
76. R. C. Wade, Trans. Biochem. Soc., 24, 254 (1996). Brownian Dynamics Simulations of Enzyme-Substrate Encounter.
77. S. H. Northrup, S. A. Allison, and J. A. McCammon, J. Chem. Phys., 80, 1517 (1984). Brownian Dynamics Simulation of Diffusion-Influenced Biomolecular Reactions.
78. S. H. Northrup, J. O. Boles, and J. C. L. Reynolds, Science, 241, 67 (1988). Brownian Dynamics of Cytochrome c and Cytochrome c Peroxidase Association.
79. R. C. Wade, M. E. Davis, and B. A. Luty, Biophys. J., 64, 9 (1993). Gating of the Active Site of Triose Phosphate Isomerase: Brownian Dynamics Simulations of Flexible Peptide Loops in the Enzyme.
80. W. L. Jorgensen and J. Tirado-Rives, J. Am. Chem. Soc., 110, 1657 (1988). The OPLS Potential Functions for Proteins.
Energy Minimizations for Crystals of Cyclic Peptides and Crambin. 81. C. I. Bayly, P. Cieplak, W. D. Cornell, and P. A. Kollman, 1. Phys. Chem., 97, 10269 (1993). A Well-Behaved Electrostatic Potential Based Method Using Charge Restraints for Deriving Atomic Charges: The RESP Model. 82. F. A. Momany,J. Phys. Chem., 82,592 (1978).Determination of Partial Atomic Charges from Ab Initio Molecular Electrostatic Potentials. Application to Formamide, Methanol, and Formic Acid. 83. S. R. Cox and D. E. Williams,]. Comput. Chem., 2,304 (1981).Representation of the Molecular Electrostatic Potential by a Net Atomic Charge Model. 84. U. C. Singh and P. A. Kollman, /. Comput. Chem., 5 , 129 (1984).An Approach to Computing Electrostatic Charges for Molecules. 85. C. F. Wong, J. Am. Chem. Soc., 113, 3208 (1991). Systematic Sensitivity Analyses in Free Energy Perturbation Calculations.
CHAPTER 7
Computer Simulation to Predict Possible Crystal Polymorphs

Paul Verwer* and Frank J. J. Leusen†

*CAOS/CAMM Center, University of Nijmegen, P.O. Box 9020, 6500 GL Nijmegen, The Netherlands, and †Molecular Simulations Ltd., 240/250 The Quorum, Barnwell Road, Cambridge, CB5 8RE, United Kingdom
Reviews in Computational Chemistry, Volume 12. Kenny B. Lipkowitz and Donald B. Boyd, Editors. Wiley-VCH, John Wiley and Sons, Inc., New York, © 1998.

INTRODUCTION

Organic molecular solids are often obtained in crystalline form, either as single crystals or as a crystalline powder. The specific stacking of molecules in the crystal, the crystal packing, can influence important properties of the material, including density, color, taste, solubility, rate of dissolution, hygroscopic properties, melting point, chemical stability, conductivity, optical properties, and morphology. Crystallization of a given compound need not always lead to the same packing. Different crystal structures of the same compound (polymorphs) can often be observed, depending on crystallization conditions.¹

Polymorphism poses a problem if it leads to the unexpected formation of different crystal structures in commercial crystallization processes. This behavior is especially important in the pharmaceutical industry. Differences in macroscopic crystal shape (morphology) between polymorphs may, for example, lead to problems during filtration, and the shelf life of the final product may change as a result of changed chemical stability. At the same time, polymorphism may be exploited by selecting the polymorph that has optimal properties or is not protected by patent.

Knowledge of the three-dimensional atomic structure of a crystal will generally be the basis for an understanding of its characteristics. For crystals of sufficient quality and size (0.1 × 0.1 × 0.1 mm³ being a practical minimum), the structure can be determined accurately and reliably via single crystal X-ray diffraction. However, suitable crystals of the compound under investigation cannot always be grown. In some cases, only a powder, thin needles, or platelets can be obtained. In other cases, factors such as a large mosaic spread (i.e., the slight misalignment of small crystal blocks²ᵃ), twinning, or radiation-induced decay of the crystal may hamper structure determination via X-ray diffraction. Crystal structure prediction by computer simulation can be used to propose possible structures in those cases, or for a compound that has not yet been synthesized. In cases of the latter kind, important properties (e.g., the density of new explosive materials or the color of new organic pigments) of still hypothetical structures may be predicted. Different routes of arriving at the crystal structure of a molecule are shown in Figure 1.
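Figure 1 Routes to the crystal structure of a molecule (diagram not reproduced; surviving labels: single crystal diffraction, powder diffraction, trial structure).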
Crystal packing simulations are by no means a recent concept. The field was pioneered by, among others, Kitaigorodskii³ and Williams in the 1960s, and crystal structure determinations by molecular packing analysis were reported some 30 years ago. The determination in 1966 of the crystal structure of dibenzoylmethane by Williams⁴ provides an early example. In this study, the molecule was kept rigid, and the deviation from expected minimum nonbonded distances was used as a quality measure. Verification and refinement of the proposed packing was done by means of X-ray diffraction data. Later work⁵ used a more refined potential energy function. The computer program PCKS,⁶ which minimizes nonbonded interatomic close contacts, was used in the generation of a starting model in the X-ray structure determination of the free radical 2,4,6-triphenylverdazyl.⁷ Here the most intense X-ray reflection was used to obtain an initial angular orientation of the planar molecule. After optimization of several trial structures with PCKS, followed by systematic variation of the three torsional degrees of freedom, a closely packed model structure was obtained, and then refined by means of the single crystal X-ray data. PCKS was also used in 1972 by Zugenmaier and Sarko in their analysis of six monosaccharides.⁸ These investigators were able to generate crystal packings close to those observed with X-ray diffraction by moving a rigid molecule and its symmetry-related copies in a unit cell of known dimensions, minimizing repulsion between nonbonded atoms. The aim of this work was to develop a method capable of predicting polysaccharide structures, for which single crystal X-ray diffraction data were (and still are) difficult to obtain.
In an analysis in 1973 of the effect of packing on the conformation of triphenylphosphine, Brock and Ibers⁹ used the 1972 version of Busing’s program WMIN.¹⁰ The calculations extended beyond the minimization of close contacts between rigid molecules, taking into account internal molecular strain, and van der Waals and Coulomb interactions. Another example of the application of WMIN is the generation of the initial packing in the crystal structure determination of 2-amino-4-methylpyridine,¹¹ as subsequently refined using X-ray data. In this work, which was part of a systematic investigation of simple organic analogs to molecules of biological interest, computer simulation of crystal packing was essentially used to solve the phase problem in X-ray crystallography.

In spite of these early successes, computational methods and algorithms that allow packing simulations with many degrees of freedom, including unit cell parameters and molecular flexibility, were not developed until the 1990s. These developments made true “ab initio” crystal structure predictions possible on the basis of molecular information alone. In this chapter, we survey recently developed algorithms for predicting low energy crystal packings and discuss speed, accuracy, and other aspects of the computational methods involved. We mention the pitfalls encountered in the prediction of crystal structures, and illustrate how experimental information can help speed up the process of generating and identifying the correct structure from a large set of possibilities.
We limit our coverage to crystal structures of small organic molecules. Simulations of other materials (e.g., metal oxides, biopolymers), which often require their own, even more arduous approach to obtain accurate and reliable results, are not discussed here.
THEORY AND COMPUTATIONAL APPROACHES

Crystals

A crystal can be described as a three-dimensional stacking of identical building blocks, the unit cells. Predicting the structure of the crystal thus means predicting the size and shape of the unit cell and the positions of the atoms in it. The magnitudes of the spanning (or lattice) vectors (a, b, c) and the angles between them (α, β, γ) define the unit cell (see Figure 2). In combination with the fractional coordinates of the atoms, the six cell parameters specify the crystal structure.

Figure 2 The vectors defining the unit cell, a, b, and c, and the angles between those vectors, α, β, and γ.

In addition to the lattice symmetry, the atomic arrangement in a crystal often displays extra symmetry (e.g., mirror, rotational, inversion, translational) within the unit cell. If no translational symmetry is present within the unit cell, the cell is called “primitive.” In certain cases it is possible to define a larger unit cell, which has additional symmetry.²ᵇ This larger cell is said to be centered, and it displays extra translational symmetry, for example, along a translation vector ½(a + b). The specific combination of symmetry elements present in a crystal structure defines its space group, and in three dimensions 230 different space groups can be constructed.²ᵇ If symmetry is present within the unit cell, only the coordinates of a unique part of the structure, the asymmetric unit, and the space group are necessary to define the positions of all atoms in the unit cell.

For molecular crystals, the number of molecules per unit cell is labeled Z. The number of molecules per asymmetric unit is usually called Z′. If the molecule is symmetrical itself, the number of molecules per asymmetric unit can be a fraction. Crystal structure prediction software will usually ignore this intramolecular symmetry.

The choice of cell parameters is not unique because the same lattice can be described by different sets of parameters. Rules exist for obtaining a set of standard cell parameters for a given lattice, called the conventional cell.¹²,¹³ The “reduced cell” is the standard primitive (noncentered) cell to describe a given lattice. Andrews and Bernstein¹⁴ describe a method to determine the reduced cell from a given set of cell parameters, and in that report an overview of earlier methods is presented. More recently, a new algorithm for this purpose was developed by Zuo et al.¹⁵
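As an illustration of how the six cell parameters and the fractional coordinates together fix atomic positions, the following minimal sketch converts (a, b, c, α, β, γ) into Cartesian lattice vectors. The function names are hypothetical, and the orientation convention (a along x, b in the xy plane) is an assumption; it is one of several in common use and is not prescribed by the text.

```python
import math


def lattice_vectors(a, b, c, alpha, beta, gamma):
    """Return the lattice vectors (a, b, c) as Cartesian 3-tuples.

    Lengths are in arbitrary units, angles in degrees.
    Assumed convention: a lies along x and b lies in the xy plane.
    """
    al, be, ga = (math.radians(x) for x in (alpha, beta, gamma))
    va = (a, 0.0, 0.0)
    vb = (b * math.cos(ga), b * math.sin(ga), 0.0)
    cx = c * math.cos(be)
    cy = c * (math.cos(al) - math.cos(be) * math.cos(ga)) / math.sin(ga)
    # The square root is real only for a geometrically possible set of angles.
    cz = math.sqrt(c * c - cx * cx - cy * cy)
    return (va, vb, (cx, cy, cz))


def cell_volume(vectors):
    """Unit cell volume from the scalar triple product a . (b x c)."""
    a, b, c = vectors
    cross = (b[1] * c[2] - b[2] * c[1],
             b[2] * c[0] - b[0] * c[2],
             b[0] * c[1] - b[1] * c[0])
    return abs(sum(a[i] * cross[i] for i in range(3)))


def frac_to_cart(frac, vectors):
    """Convert fractional coordinates to Cartesian coordinates."""
    return tuple(sum(frac[k] * vectors[k][i] for k in range(3)) for i in range(3))


# Example: a monoclinic cell with beta = 110 degrees (made-up parameters).
cell = lattice_vectors(5.0, 6.0, 7.0, 90.0, 110.0, 90.0)
print(cell_volume(cell), frac_to_cart((0.5, 0.5, 0.5), cell))
```

Because different parameter sets can describe the same lattice, two cells produced this way may look different and still be equivalent; comparing them requires a reduction step such as the reduced cell mentioned above.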
Thermodynamics

The relative stability of polymorphs at a given temperature and pressure is determined by their differences in Gibbs energy, ΔG:

$$\Delta G = \Delta U + p\,\Delta V - T\,\Delta S \qquad [1]$$

with energy U, pressure p, volume V, temperature T, and entropy S. Thus ΔG depends on differences in packing energy, crystal density, and entropy. The contribution of pΔV is negligible at normal pressure because differences in density between polymorphs for small organic molecules rarely exceed a few percent.¹⁶ Even a difference in density of 10% would amount to an energy difference due to pΔV of only about 1 cal/mol for an organic molecule of mass 300, as in, say, a typical steroid. This is three orders of magnitude smaller than the differences in U that can be found in energy calculations on pairs of steroid polymorphs, where energy differences of a few kilocalories per mole are common. For simulations of high pressure phases, the pressure term can become significant, however, as in the example of benzene at 25 kbar studied by Gibson and Scheraga.¹⁷ Entropy differences need not be negligible at room temperature, but they are usually ignored because reliable calculation¹⁸ is not always straightforward. Effectively, then, the energy at 0 K is taken, leaving only U as the quantity to be calculated. Further assuming that structures with a calculated low energy U are good candidates to be observed in reality, most crystal structure prediction methods rank the predicted structures accordingly. Hence, the true free energy at a given temperature may be slightly different, not only because such approximations were made but also because of inaccuracies in the calculation of U. A hypothetical relationship between true and calculated energy is depicted in Figure 3.

Figure 3 True and modeled relative energies of polymorphs (plot not reproduced; surviving axis labels: relative energy, at 0 K).

In practice, the thermodynamic stability need not be the decisive factor in crystallization because kinetics plays an important role, influenced by such crystallization conditions as supersaturation and solvent environment. The macroscopic shape (morphology) of a crystal is also known to depend on the solvent environment; for examples and references see Weissbuch et al.¹⁹ or Davey et al.²⁰ This dependence is usually attributed to the influence of a solvent on the growth rates of different crystal surfaces. Although solvent has no effect on the relative thermodynamic stability of different polymorphs, it can be a key factor in specific crystallization of one particular polymorph by favoring its growth kinetically. Therefore, calculating quantities such as U or G for a set of predicted polymorphs cannot, in general, tell conclusively which crystal (or crystalline powder) will be observed. Experimental data, like a powder diffraction pattern that is highly specific for a given crystal packing, often are essential for picking the true structure(s) in a set of possible low energy polymorphs from computer simulation.

In conclusion, the crystal structure observed experimentally need not have the lowest possible Gibbs energy; depending on the crystallization conditions, different polymorphs may be grown. It can be assumed, though, that the observed structure is among those having a relatively low Gibbs free energy. The latter can be approximated by the energy U, which is considerably easier to calculate than G.
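The size of the pΔV term quoted above can be checked with a few lines of arithmetic. The sketch below is only an order-of-magnitude estimate: the crystal density of 1.2 g/cm³ is an assumed typical value that does not come from the text, and the function name is hypothetical.

```python
ATM_IN_PA = 101325.0  # 1 atm in pascal
CAL_IN_J = 4.184      # thermochemical calorie in joule


def p_delta_v_cal_per_mol(molar_mass, density, rel_density_change,
                          pressure_pa=ATM_IN_PA):
    """Estimate p*dV (cal/mol) between two polymorphs.

    molar_mass in g/mol, density in g/cm^3 (assumed value for the less
    dense form), rel_density_change as a fraction (0.10 means 10%).
    """
    molar_volume_cm3 = molar_mass / density
    # The denser polymorph occupies V / (1 + x); the difference is dV.
    delta_v_cm3 = molar_volume_cm3 - molar_volume_cm3 / (1.0 + rel_density_change)
    delta_v_m3 = delta_v_cm3 * 1.0e-6
    return pressure_pa * delta_v_m3 / CAL_IN_J


# Ambient pressure: a fraction of a cal/mol, consistent with "about 1 cal/mol".
print(p_delta_v_cal_per_mol(300.0, 1.2, 0.10))
# 25 kbar (2.5e9 Pa): the same volume difference is worth kilocalories per mole.
print(p_delta_v_cal_per_mol(300.0, 1.2, 0.10, pressure_pa=2.5e9))
```

The second call shows why the pressure term can no longer be ignored for high pressure phases: the same volume difference that is negligible at 1 atm becomes comparable to, or larger than, typical lattice energy differences.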
Computational Techniques

A number of computational methods are frequently used in crystal structure prediction programs. To assess the differences between the programs, we list the most important techniques commonly used and mention their strengths and weaknesses.

Potential Energy Functions

The most common method used to calculate the energy of a structure relies on a force field as implemented in molecular mechanics (MM). Reviews on this technique can be found in Bowen and Allinger,²¹ Dinur and Hagler,²² and Pettersson and Liljefors.²³ The MM energy is calculated as a sum of readily identifiable parts, arising from bond stretching ($E_s$), angle bending ($E_b$), torsional interactions ($E_{tor}$), van der Waals interactions ($E_{vdw}$), and electrostatic interactions ($E_{elec}$). The total energy is given by:

$$E_{total} = E_s + E_b + E_{tor} + E_{vdw} + E_{elec} \qquad [2]$$

A separate energy term to account for hydrogen bonds is used in some force fields, and cross-terms (such as stretch-bend) are often used. Not all terms in Eq. [2] are always relevant to each crystal prediction method. If the molecules are considered to be rigid, $E_s$, $E_b$, and $E_{tor}$ are irrelevant because they depend on intramolecular features that do not change between different crystal structures during an energy minimization of the lattice. The obvious advantage of imposing rigidity is that energy calculations and minimizations can be done more quickly, because the number of degrees of freedom is much reduced. Treating molecules as rigid bodies can be a valid approach if the molecules are known to be rigid or have negligible flexibility, as in many of the fused aromatic ring systems used by Chaka et al.,²⁴ or if crystal structures of similar molecules tend to have the same conformation, as can be observed in paraffins. Here, the Cambridge Structural Database²⁵ (CSD) is a valuable source of information.

Further simplifications to the MM energy function can be introduced, such as ignoring the electrostatic contribution for nonpolar compounds, and/or using mainly the repulsive part of the van der Waals function by ignoring the van der Waals energy for atom pairs with an interatomic distance greater than a certain threshold.²⁶ These simplifications will generally improve the speed of calculations, but they do so at some cost in accuracy.

Quantum Mechanics

Quantum mechanics (QM) provides a different means of calculating the internal energy of crystal structures. Software capable of calculating the energy of periodic systems using Hartree-Fock or density functional theory exists (e.g., CASTEP,²⁷ Crystal95,²⁸ FHI96MD²⁹). Unfortunately these programs do not always provide the functionality to carry out energy minimizations. Moreover, these methods are orders of magnitude slower than MM, and, because they account poorly for electron dispersion (correlation), they are not necessarily more accurate in their application to molecular crystals. Currently, the main application of QM calculations on periodic systems seems to be focused on smaller, inorganic compounds like metal oxides, and not (yet) on the crystal structures of organic molecules that are handled easily and reliably by MM.
Scoring Functions

A scoring function is a simple mathematical function (compared to a complete set of MM energy potentials) that estimates energies. It is often used to provide rough energies for a large number of similar molecular systems (e.g., when one is calculating the interaction energy between one molecule and a range of other molecules, each in a number of different orientations, as is done when docking a series of related ligands in a receptor). This approach was used by Hofmann and Lengauer³⁰ to approximate the energies of predicted polymorphs, relying on a scoring function that is derived statistically from a set of observed crystal structures taken from the CSD. The distribution of observed interatomic distances is used to calculate a pair potential function for a given pair of atom types.
Charge Models

The electrostatic contribution to the MM energy is usually calculated as the pairwise interaction between point charges placed on each atom. The resulting energy is easily calculated via Coulomb’s law:

$$E_{elec} = \frac{q_i q_j}{D\, r_{ij}} \qquad [3]$$

with $q_i$ and $q_j$ the charges on the atoms i and j, and $r_{ij}$ the distance between those atoms; D is the effective dielectric constant. Numerous ways exist to assign atomic charges, some of which are computationally inexpensive (e.g., the methods developed by Gasteiger and Marsili,³¹ and by Rappé and Goddard³²). Other methods, such as Mulliken charges³³ or charges fitted to the molecular electrostatic potential (ESP),³⁴,³⁵ use the results of QM calculations. Methods to assign atomic charges have been reviewed by Williams³⁶ and Bachrach.³⁷ Unlike, for example, bond angles, atomic point charges do not represent a physically defined quantity: they are merely a representation that accounts for the effects of a particular electronic distribution throughout the molecule. Because these electronic effects can (in part) be taken into account implicitly in a force field, the choice of the best set of atomic charges generally depends on the force field selected; force fields are usually parameterized using a particular atomic charge scheme.

Electrostatic interactions depend on the electric field around the molecule, which in turn is determined by the molecule’s electron density distribution. It is thus sensible to use atomic charges that optimally reproduce the electrostatic potential in the vicinity of the molecule, so-called ESP-derived charges.³⁴,³⁵ From the electron density distributions calculated by QM packages like MOPAC,³⁸ Gaussian,³⁹ GAMESS-US,⁴⁰ and GAMESS-UK,⁴¹ the corresponding ESP atomic charges can be fitted, either within the program itself or via a separate program. The program MOLDEN,⁴²,⁴³ for example, can be used to generate ESP charges from Gaussian and GAMESS (US/UK) output; the program PDM93⁴⁴ also calculates ESP charges from QM wavefunctions. The general procedure involves choosing a set of points around the molecule, calculating the electrostatic potential at each point, and then fitting atomic charges that best reproduce those calculated potentials. The position of the points where the potential is calculated, as well as the number of points, can be different for different methods (CHELP,⁴⁵ Besler-Merz-Kollman,⁴⁶ CHELPG⁴⁷). However, ESP charges may be less well determined (mathematically) for atoms that are shielded by surrounding atoms, for example, the carbon atom in a methyl group, or more generally, any atom that is not at the surface of a molecule. The resulting artifacts may be avoided by applying appropriate restraints when these atomic charges are derived.⁴⁸

A difficulty also arises if a molecule has conformational flexibility: the ESP charges are likely to be dependent on the molecular conformation, and the molecule may adopt conformations in the proposed crystal structures that are different from that used in the ESP charge calculation. One solution to this problem was proposed by Reynolds, Essex, and Richards,⁴⁹ who fitted atomic charges to electrostatic potentials for several conformations, weighted with the appropriate Boltzmann factor. They applied their method only to alcohols and threonine.
In some cases a computationally less expensive alternative is the method of Bayly, Cieplak, Cornell, and Kollman,⁴⁸ which can force identical charges on atoms that are equivalent through rotational freedom, such as the hydrogens in methyl groups.

Atomic multipoles can be used as an alternative for atomic point charges,⁵⁰,⁵¹ and they are expected to give a better representation of the nonspherical features of the electron density distribution, as found in lone pairs and certain π-electron densities. For example,⁵¹ optimized crystal structures of acetic acid using atomic multipoles were indeed closer to the experimental structure than those based on atomic point charges. The drawback of this method is that the calculation of the electrostatic term is more CPU-intensive than a point charge model.
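The general ESP fitting procedure described above (choose points, obtain the potential at each point, fit the charges) amounts to a linear least-squares problem. The sketch below is a bare-bones illustration and not the algorithm of CHELP, CHELPG, or any other named program: the function name is hypothetical, atomic units are assumed (potential = charge/distance), the point selection is left to the caller, and the only constraint applied is that the charges sum to the total molecular charge (no restraints on buried atoms).

```python
import numpy as np


def fit_esp_charges(atom_xyz, grid_xyz, grid_potential, total_charge=0.0):
    """Least-squares point charges reproducing a potential on grid points.

    Minimizes sum_k (V_k - sum_i q_i / r_ki)^2 subject to sum_i q_i = Q,
    using a Lagrange multiplier (assumed formulation, atomic units).
    """
    atom_xyz = np.asarray(atom_xyz, dtype=float)
    grid_xyz = np.asarray(grid_xyz, dtype=float)
    v = np.asarray(grid_potential, dtype=float)
    # Design matrix: a[k, i] = 1 / |grid point k - atom i|
    dist = np.linalg.norm(grid_xyz[:, None, :] - atom_xyz[None, :, :], axis=2)
    a = 1.0 / dist
    n = atom_xyz.shape[0]
    # Normal equations bordered by the total-charge constraint.
    kkt = np.zeros((n + 1, n + 1))
    kkt[:n, :n] = a.T @ a
    kkt[:n, n] = 1.0
    kkt[n, :n] = 1.0
    rhs = np.concatenate([a.T @ v, [total_charge]])
    return np.linalg.solve(kkt, rhs)[:n]


# Synthetic check: potentials generated from known charges are recovered.
atoms = np.array([[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]])
known = np.array([-0.8, 0.4, 0.4])
axis = [-4.0, -2.0, 2.0, 4.0]
grid = np.array([[x, y, z] for x in axis for y in axis for z in axis])
potential = (1.0 / np.linalg.norm(grid[:, None, :] - atoms[None, :, :], axis=2)) @ known
print(fit_esp_charges(atoms, grid, potential))
```

For a buried atom the corresponding column of the design matrix is nearly a linear combination of the others, which is one way to see why such charges are poorly determined and why restraints are applied in practice.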
Calculations Under Periodic Conditions

The MM energy of a crystal is usually calculated for the asymmetric unit of a unit cell that is supposed to be part of an infinite lattice; thus, the cell is surrounded by an infinite number of identical cells. This convention has no technical implications for the calculation of the terms in the MM energy function that are restricted to atoms within the same molecule (bond-stretching, angle-bending, and torsional interactions, the so-called bonded interactions); these terms are made up of a limited number of contributions. On the other hand, a difficulty arises in the calculation of nonbonded interactions, which occur between all atoms in the complete crystal, with the result that the number of interactions becomes unmanageably large.

One way to deal with the fact that each atom in a periodic system has an infinite number of van der Waals and electrostatic interactions is to use a cutoff radius: interactions between atoms separated by an interatomic distance larger than a predefined value (the cutoff radius) are neglected. This will lead to a systematic error in the van der Waals energy because the neglected part will always be negative (the repulsive van der Waals interactions occur only when the interatomic distance is small, and these interactions are always included). Fortunately, since the magnitude of van der Waals interactions decreases rapidly with increasing distance (following r⁻⁶), the resulting error can be made acceptably small. Electrostatic interactions, however, are more problematic, because their magnitude decreases only slowly, as 1/r, leading to a slow convergence of the resulting energy for electrically neutral systems (see, e.g., Leusen et al.,⁵² their Figure 4, or Gibson and Scheraga,⁵³ their Table 3). The net electrostatic interaction of a point charge with all point charges within the cutoff distance will often suffer from large fluctuations as the cutoff radius is changed. This implies that even for relatively large cutoff radii, the calculated Coulombic energy can have a significant error. A major improvement can be achieved by grouping the atoms (in the case of atomic point charges) into so-called charge groups, small clusters that have no net charge, and including interactions with all atoms within a charge group if one of its members (or its center of mass) is inside the cutoff radius.
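The systematic, always-negative error caused by truncating the attractive r⁻⁶ term can be made concrete with a toy lattice sum. The sketch below is an illustration only: it uses a simple cubic lattice of identical sites with unit coefficient rather than a molecular crystal, and the function names and the continuum tail estimate are assumptions introduced here.

```python
import itertools
import math


def dispersion_sum(cutoff, spacing=1.0):
    """Sum of -1/r^6 between one site of a simple cubic lattice and all
    other sites within `cutoff` (unit coefficient, arbitrary units)."""
    n_max = int(math.ceil(cutoff / spacing))
    total = 0.0
    for i, j, k in itertools.product(range(-n_max, n_max + 1), repeat=3):
        if (i, j, k) == (0, 0, 0):
            continue  # skip the central site itself
        r = spacing * math.sqrt(i * i + j * j + k * k)
        if r <= cutoff:
            total -= 1.0 / r ** 6
    return total


def tail_estimate(cutoff, spacing=1.0):
    """Continuum estimate of the neglected part beyond the cutoff:
    integral of -r^-6 * 4*pi*r^2 * density dr from cutoff to infinity."""
    density = 1.0 / spacing ** 3
    return -4.0 * math.pi * density / (3.0 * cutoff ** 3)


for rc in (2.0, 4.0, 8.0):
    print(rc, dispersion_sum(rc), tail_estimate(rc))
```

Each increase of the cutoff makes the sum more negative, and the neglected tail shrinks as the inverse cube of the cutoff radius. No comparably simple correction exists for the 1/r Coulomb term, which is why charge groups or Ewald summation are used there.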
A more rigorous approach is the Ewald summation.⁵⁴,⁵⁵ This method, first presented by P. P. Ewald in 1921, exploits the periodicity of the system by calculating part of the summations in reciprocal space. When this method is used, the energy converges much faster, and a more accurate result can be obtained. In their study of crystal packing, Gibson and Scheraga⁵³ used a cutoff radius in the calculation of van der Waals and hydrogen bond terms, and Ewald summation for the Coulombic contributions to the energy. Gibson and Scheraga concluded that there is little influence on the final lattice parameters and the number of iterations in the minimization when the cutoff radius is varied. The absolute energy of the minimized structures did change significantly, however.

Recently, van Eijck and Kroon⁵⁶ discussed the implications of dependence of the electrostatic energy of a crystal on its macroscopic shape if the crystal has a nonzero dipole moment. Ewald summation then results in the lowest possible energy. This minimum energy corresponds to the situation that the crystal finds an energetically optimal shape (a needle, with the dipole moment directed along the needle axis, or a platelet with the dipole moment in its plane) and/or that the charges on the crystal surface are counterbalanced by external charges. The latter situation is sometimes referred to as “tinfoil boundary conditions” (because a conductive surface allows for the necessary redistribution of charge); crystallization in water or any other medium with a high dielectric constant may approach this situation. In their paper, van Eijck and Kroon concluded that the minimum achievable energy is best used in crystal structure predictions, thus assuming optimal crystal shape or tinfoil boundary conditions. Ewald summation will directly yield this energy; if a cutoff radius is used, a (simple) correction term must be added to the result.
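To make the split between real and reciprocal space concrete, here is a compact and deliberately unoptimized sketch of the textbook Ewald sum for point charges in a cubic cell. It is not taken from any of the programs cited: the function name, the restriction to a cubic cell, the unit system (energy = q·q/r, i.e., D = 1), and the default truncation limits are all assumptions chosen for the small test case. Omitting the k = 0 term corresponds to the tinfoil boundary conditions discussed above.

```python
import itertools
import math


def ewald_energy(positions, charges, box, alpha, n_real=2, m_recip=6):
    """Coulomb energy of a neutral set of point charges in a cubic periodic
    cell of edge `box`. `alpha` controls the real/reciprocal split; the
    result should not depend on it once both sums are converged."""
    n = len(charges)
    # Real-space part: erfc-screened pair terms over nearby periodic images.
    e_real = 0.0
    shifts = list(itertools.product(range(-n_real, n_real + 1), repeat=3))
    for i in range(n):
        for j in range(n):
            for s in shifts:
                if i == j and s == (0, 0, 0):
                    continue  # an ion does not interact with itself
                d = [positions[j][c] - positions[i][c] + s[c] * box for c in range(3)]
                r = math.sqrt(d[0] ** 2 + d[1] ** 2 + d[2] ** 2)
                e_real += 0.5 * charges[i] * charges[j] * math.erfc(alpha * r) / r
    # Reciprocal-space part: smooth compensating charges, summed over k != 0.
    e_recip = 0.0
    for m in itertools.product(range(-m_recip, m_recip + 1), repeat=3):
        if m == (0, 0, 0):
            continue  # dropping k = 0 means tinfoil boundary conditions
        k = [2.0 * math.pi * mc / box for mc in m]
        k2 = k[0] ** 2 + k[1] ** 2 + k[2] ** 2
        phases = [k[0] * p[0] + k[1] * p[1] + k[2] * p[2] for p in positions]
        s_re = sum(q * math.cos(ph) for q, ph in zip(charges, phases))
        s_im = sum(q * math.sin(ph) for q, ph in zip(charges, phases))
        e_recip += math.exp(-k2 / (4.0 * alpha ** 2)) / k2 * (s_re ** 2 + s_im ** 2)
    e_recip *= 2.0 * math.pi / box ** 3
    # Self-interaction correction for the Gaussian screening charges.
    e_self = -alpha / math.sqrt(math.pi) * sum(q * q for q in charges)
    return e_real + e_recip + e_self


# Rock-salt test cell: 4 cations and 4 anions, nearest-neighbour distance 1.
BOX = 2.0
CATIONS = [(0, 0, 0), (1, 1, 0), (1, 0, 1), (0, 1, 1)]
ANIONS = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]
POSITIONS = CATIONS + ANIONS
CHARGES = [1.0] * 4 + [-1.0] * 4
energy_per_pair = ewald_energy(POSITIONS, CHARGES, BOX, alpha=1.5) / 4.0
print(energy_per_pair)
```

For this test cell the energy per ion pair should approach minus the Madelung constant of the rock-salt structure (about 1.7476), and changing `alpha` should leave the result unchanged within the truncation error. A production code would scale the truncation limits with `alpha` and the cell size rather than fixing them.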
Minimizers

Energy minimization is often an important part of the crystal structure prediction process. For example, many crude trial structures can be generated rapidly, but all must be optimized to obtain low energy crystal packings. The time spent on minimizing a trial structure is usually orders of magnitude longer than the time needed to generate it. Consequently, minimizations are often the most time-consuming step in the complete structure prediction process, and a fast minimization algorithm is important. Several minimization methods are well known, and ready-to-use program codes are available.⁵⁷ The large number of variables and the complexity of the energy function are among the factors making energy minimization a time-consuming process. Considerable speed-up may be achieved by introducing simplifications when the structure is still far from its minimum energy. For example, the number of variables may be reduced by treating the molecules as rigid bodies, and the energy function may be simplified by omitting certain contributions (e.g., electrostatic terms).⁵⁸ Switching between different minimization algorithms may also improve speed.

The efficiency of widely used programs for rigid body minimization of crystal structures was criticized by Gibson and Scheraga. They introduced a new algorithm, based on secant methods (computationally fast methods to compute derivative matrices⁶¹) that efficiently calculate the energy gradient with respect to the minimization variables.

The energy surface described by a force field is usually complicated, containing many local minima in addition to the minima corresponding to the true crystal structures. Simple minimization algorithms will generally produce a minimum energy structure that is near to its starting structure. This implies that many starting structures must be tried to find all relevant minima.
A way to avoid this problem is to agitate the structure at certain points during the simulation, allowing energy barriers between different minima to be overcome and making it possible to reach low energy minima from local minima with a higher energy. A similar idea is used in the method of simulated annealing, where the temperature of the simulated system, which determines the ease at which energy barriers can be overcome, is slowly lowered during the simulation. The OREMWA method62 uses simulated annealing, along with a fast minimizing
algorithm, and is sometimes, but not always, capable of reaching a global minimum starting from local minima.63 However, because observed crystal structures do not necessarily correspond to global energy minima (hence the occurrence of polymorphism), and because force fields are only approximations of the true energy function, a method that searches only for the global energy minimum seems inadequate. One is better served by searching for a set of low energy crystal packings, and simulated annealing can assist in this endeavor, especially when low energy structures encountered during the annealing process are stored for further analysis.
Clustering of Similar Structures

At different stages of a crystal structure prediction, it may be necessary to reduce the number of structures under consideration. One way to do this is to cluster the generated structures and select one structure from each cluster. One type of clustering is the grouping of almost identical structures contained within the same local energy minimum after minimization. A comparison based on cell parameters is problematic, however, because different sets of cell parameters can be used to describe the same lattice, a problem that can (in part) be avoided by comparing the reduced cell parameters, which constitute a unique representation of the cell. Unfortunately, small deviations in coordinates can lead to very different angles for the reduced cell. Therefore, comparison based purely on reduced cell parameters cannot reliably identify similar structures. Fortunately, an alternative set of unique parameters, avoiding the discontinuities of the reduced cell angles, was suggested by Andrews, Bernstein, and Pelletier.64

Karfunkel et al.65 suggested a method to quantify similarity between crystal structures. It is based on the correspondence between X-ray powder diagrams, simulated for the predicted structures. The advantage of this method is that powder diagrams are independent of the mathematical description of the crystal lattice. Powder diagrams give the intensity of the reflected X-ray beam as a function of the reflection angle. Because the peaks are spikelike, and their position and height are very sensitive to small deviations in the structure, the overlap between the corresponding peaks of similar structures is rapidly lost. In this method, the intensity at each point in one powder diagram is compared with the intensity in the environment of the corresponding point in the other diagram (and vice versa).
Thus, the authors avoid the problem of rapid loss of the overlap itself (i.e., the strict point-to-point correspondence) if structures are not identical. The algorithm, written in FORTRAN-77, has been published.65 The commercially available Polymorph Predictor66 program clusters structures by comparing lists of interatomic distances, an approach that is described in more detail in a later section. A second type of clustering concerns the grouping of thousands or even millions of trial structures into a limited number of clusters (tens or hundreds), containing structures that have common features but are not identical. From
each cluster a single structure is then minimized. Here, the aim is to represent as much structural diversity as possible in the small set of structures to be minimized. In principle, this type of clustering can be considered perfect if all structures in one cluster end up as the same crystal structure after minimization, and if all clusters represent different energy minima. From a practical point of view, clustering at this stage can be considered successful if it produces a small subset of all trial structures from which many different structures are obtained after minimization, thus reducing the total CPU time needed. Clustering methods that are sensitive to small structural changes cannot be expected to perform well because the differences between the trial structures are usually quite large. An algorithm suitable for clustering crude trial structures as well as optimized structures was developed by van Eijck and Kroon.67 Simplifying a procedure proposed earlier by Dzyabchenko68 for measuring similarity, the authors base their approach on a comparison of cell parameters and the positions and orientations of structural fragments in different structures, taking into account the transformations allowed by space group symmetry. Currently, their method has been worked out only for the case of one rigid molecule present in the asymmetric unit. Because the method has to account for operations allowed by space group symmetry, it requires specific code for each space group.
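The idea of comparing each point of one powder diagram with the environment of the corresponding point in the other diagram can be sketched as follows. This is a minimal illustration and not the published FORTRAN-77 algorithm: the function names, the window parameter `half_width`, and the choice of taking the closest intensity within the window are all assumptions made here, and both patterns are assumed to be sampled on the same angular grid.

```python
def windowed_difference(p, q, half_width):
    """For each point i of pattern p, take the smallest |p[i] - q[j]| over all
    points j of pattern q that lie within half_width grid points of i, and sum."""
    total = 0.0
    n = len(q)
    for i, p_i in enumerate(p):
        lo = max(0, i - half_width)
        hi = min(n, i + half_width + 1)
        total += min(abs(p_i - q_j) for q_j in q[lo:hi])
    return total


def dissimilarity(p, q, half_width=1):
    """Symmetric measure: compare p against q and q against p ('and vice versa')."""
    return windowed_difference(p, q, half_width) + windowed_difference(q, p, half_width)


# Two spike-like patterns whose single peak is shifted by one grid point.
pattern_a = [0.0, 1.0, 0.0, 0.0, 0.0]
pattern_b = [0.0, 0.0, 1.0, 0.0, 0.0]

print(dissimilarity(pattern_a, pattern_b, half_width=0))  # strict point-to-point comparison
print(dissimilarity(pattern_a, pattern_b, half_width=1))  # tolerant of a one-point shift
```

With a window of zero the measure reduces to a strict point-to-point comparison, which penalizes the small peak shift heavily; with a window of one grid point the two patterns are recognized as equivalent.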
Crystal Structure Prediction Methods

A variety of computational methods for the prediction of crystal packing have emerged during the last decade. At least three approaches to constructing low energy crystal packings can be discerned:

1. Construction of low energy clusters of 10-50 molecules, which can be viewed as the nucleus from which the crystal will eventually grow. The center of such a cluster is assumed to be similar to the final crystal structure. Thus, the crystal structure is to be found by simulating the start of the crystallization process.
2. Construction of configurations containing 1-10 molecules, related by the desired symmetry elements, which are then subjected to lattice symmetry to form crystals. As in method 1, nonperiodic clusters are generated first, but here, instead of having a relatively large cluster size, translational symmetry is introduced to simulate a bulk environment.
3. Generation of a large set of crude molecular packings, subject to the desired space group symmetry, which are then energy optimized. Periodicity is assumed at all stages, and there is no initial consideration of aggregates of a small number of molecules.

We mention these approaches mainly to highlight the most characteristic features of different prediction methods. The efficiency, reliability, and general applicability of each method will vary with the particular implementation.
An example of the first approach is provided by Williams,69 who modeled crystallization nuclei by minimizing the energy of clusters of 2-15 benzene molecules. It was argued, however, that this cluster size is too small to have a significant relation to the crystal structure.70 Calculations on clusters of up to 42 benzene molecules71 reproduced the structure of the benzene crystal at the center of the cluster. Simulations of this type are in principle more straightforward than those imposing lattice symmetry on the final structure. To obtain a reasonable result using such an approach, however, a large number of molecules in the cluster must be used to reduce artificial surface effects. It has been claimed that some of the simulated clusters of benzene molecules correspond to clusters that are proposed in the interpretation of experimental data.62

Several programs exist for the generation of small molecular clusters, which can then be put onto a three-dimensional grid. These include Gavezzotti's PROMET372 and the recently developed FlexCryst.30 The method of Perlstein73-77 also works more or less along these lines. The general idea behind these methods is that strong interactions, such as hydrogen bonds between a small number of molecules, play a decisive role in the formation of the complete crystal structure. Therefore in many cases the crystal structure can be based on a suitable low energy configuration of just a small number of molecules. This procedure may provide an efficient means of generating correct crystal structures in some instances, but it will be unreliable in cases where very stable clusters can be formed that are not observed experimentally. Examples of this include the dimers of acetic acid78 and alloxan.79

PROMET3 generates clusters of molecules that are related by common symmetry operators (inversion center, screw, glide, and translation).80 More than 80% of all structures in the CSD are in space groups formed by these symmetry elements. The clusters are then subjected to translational symmetry, thus creating a trial crystal structure. The molecules are kept rigid in the whole procedure. To obtain minimum energy structures, the trial structures are optimized in a subsequent step, by means of a separate energy minimization program. The method was recently applied to generate crystal packings for a coumarin derivative.

FlexCryst contains a number of uncommon features.30 Unlike most other programs, it computes strong intermolecular interactions between functional groups, such as hydrogen bond centers and phenyl, methyl, and amide groups. Those are used to form clusters of molecules that fill the unit cell, to generate interactions between translated clusters of molecules and to calculate the corresponding translation vectors. Suitable triples of these vectors, which constitute a three-dimensional cell with sufficiently strong interactions along the translation vectors, can then be used as possible lattice vectors. The energy of the resulting crystal structures is evaluated by means of a scoring function, which in turn is derived from the distribution of interatomic distances in known crystal structures in the CSD. The use of a simple function to evaluate energies, together with the absence of repeated optimization, makes the method very fast,
albeit at the expense of accuracy. If one considers the length of the difference vector between true and predicted lattice vectors as a measure of accuracy, FlexCryst generates errors of 0.7 Å compared to 0.1 Å for methods using minimization. These results are somewhat optimistically biased, however, because the molecular conformations found in the experimental crystal structures were used.

The method developed by Perlstein73-77 is based on the construction of stable one-dimensional aggregates that become the building blocks for two-dimensional aggregates, from which eventually three-dimensional structures may be constructed. The method, which builds on work by Scaringe and Perez,81 is currently implemented for the one- and two-dimensional stages. The one-dimensional aggregates are linear chains of molecules that are related by a symmetry operation, the most common operations being translation, glide, screw, and inversion. In the translation aggregate, all molecules are identical in geometry and orientation, and are positioned along a line according to a given repeat distance. In the glide aggregate, subsequent molecules are mirror images, because they are related by a combination of a mirror and a translation operation. In the screw aggregate, the operation that relates subsequent molecules is the combination of a rotation and a translation. The inversion aggregate is composed of molecules that are related by inversion points. Because of the large number of possibilities, a systematic search over all possible aggregates is impossible. To search for low energy aggregates more efficiently, Perlstein used a Monte Carlo (MC) procedure, using the orientational angles of the molecule and the repeat distances as variables. Efficiency was further improved by varying the MC temperature (4000-300 K) and the maximum allowed change in molecular orientation during the simulation.
Energies were calculated using electrostatic interactions based on Gasteiger atomic charges and van der Waals interactions as parameterized in the MM282 force field. Depending on the symmetry element present, experimentally observed aggregates were usually found among the best 10-20 predicted.

A variety of methods exists for the generation of crystal structures by applying lattice symmetry at all stages (as opposed to generating nonperiodic aggregates first). Among them are MPA59,83 and MDCP.84 Other methods, which apply full space group symmetry (including symmetry within the unit cell) to a given number of independent molecules in the cell, include MOLPAK,26 UPACK,85 ICE9,24 the method of Schmidt and Englert,86 and the Polymorph Predictor66 of Molecular Simulations Inc. (MSI). Each of these programs is described below.

In MOLPAK26 (molecular packing), lattice symmetry is introduced first in one dimension by generating a close packing of molecules along a line, and then extended to two and three dimensions in subsequent steps. During packing, space group constraints are accounted for by adding molecules, related via inversion, mirror plane, glide, twofold axis, or twofold screw axis symmetry as needed. The initial orientation of the central molecule is varied
systematically (the program is limited to a single molecule in the asymmetric unit). Trial structures have to be refined to obtain possible crystal packings; the authors of MOLPAK used the WMIN program87 for rigid body refinement.

The crystal structure prediction program MPA59,83 (Molecular Packing Analysis) works by using a Monte Carlo procedure to randomly orient a given number of molecules in a trial unit cell. Following this stochastic sampling of trial unit cells, the energy of each cell is minimized with a rigid body optimizer. Although this procedure imposes no space group symmetry, symmetry may be present in the minimized structures as permitted by the number of molecules in the cell.

UPACK85 (Utrecht crystal packer) was originally developed for the specific problems of predicting crystal structures of monosaccharides,88 flexible molecules that form hydrogen-bonded structures. It generates trial structures systematically, which are then subjected to a rough rigid body minimization. Hydrogen atoms of hydroxyl groups are not included in the model at this stage; rather, the hydroxyl groups are treated as united atoms. After this first quick minimization, equivalent structures are removed by means of a dedicated clustering algorithm.67 Another rigid body minimization is performed after hydroxyl hydrogens have been added, followed by a second clustering step. A final energy minimization using a very strict convergence criterion, followed by another clustering, produces a list of predicted structures. The program is currently limited to triclinic, monoclinic, and orthorhombic space groups, with a single molecule in the asymmetric unit.

ICE924 also starts by systematically generating trial crystal packings. In their studies of mostly aromatic hydrocarbons, the authors of this program used quantum mechanically optimized (3-21G basis set89) geometries that were kept rigid throughout the calculation.
Following energy minimization, which is based only on interactions between molecules in a central cell with their direct neighbors, the energy is recalculated using a cutoff radius of 10 Å and a molecular multipole expansion for the electrostatic interactions. Predicted structures are sorted by energy for each space group, and the top-ranking structures from each space group are combined into the final set of predicted polymorphs.

The MDCP84 (Molecular Dynamics for Crystal Packing) program works by performing a molecular dynamics (MD) run at constant temperature and pressure on a periodic system. The unit cell consists of 4 or 8 rigid molecules (allowing for crystal structures with Z = 1, 2, 4, or 8) and is initially very loosely packed to allow the molecules to change their orientation. During the MD run, low energy structures are stored for minimization at a later stage, thus producing proposed crystal structures, which are finally checked for space group symmetry. Energy calculation is done using the Ewald method for electrostatic interactions and a cutoff radius (typically 14 Å) for the van der Waals terms. The method has been tested with mixed success on the structures of CO2, benzene, pyrimidine, and 1,2-dimethoxyethane.

Schmidt and Englert86 developed a method called CRYSCA (Crystal Structure Calculation) based on rigid body lattice energy minimization of random crystal packings. Their method uses cutoff radii of up to 20 Å or a limited summation including five unit cells in each direction for the nonbonded interactions. The method can handle all space groups and allows molecules to occupy special positions. The method was successfully tested on 25 organic and organometallic compounds, using atomic charges from extended Hückel calculations and van der Waals parameters obtained by carefully combining different published parameter sets.

The MSI Polymorph Predictor66 (PP) is based on a four-step method. Sampling via Monte Carlo simulated annealing provides a starting set of trial structures. These are clustered to delete similar structures, minimized to create low energy crystal packings, and once more clustered to remove duplicates. Since the method is under continuous development and the current implementation differs significantly in some places from the procedure originally published, we present it below in relative detail.

During trial structure generation, angular degrees of freedom (the cell angles, the Eulerian angles describing the orientation of the independent molecules in the cell, and the Eulerian angles of the vectors between them) are varied in a Monte Carlo procedure. In each Monte Carlo step, new angular parameters are chosen based on a "move factor," a number between 0 and 1 that sets the maximum possible change in parameters. A move factor of 1 means that all parts of phase space (all possible combinations of angular parameters) are accessible within one move. Once new angular parameters have been chosen, the translational parameters (cell lengths and distances between independent atoms) are adjusted to relieve close interatomic contacts.
The new trial structure is then accepted or rejected according to the Metropolis algorithm.97 That is, its energy $E_{\mathrm{new}}$ is compared to the energy $E_{\mathrm{old}}$ of the last accepted structure, and it is accepted if $\exp[(E_{\mathrm{old}} - E_{\mathrm{new}})/kT]$ is larger than a random number between 0 and 1. This implies that a structure with lower energy than the last accepted one will always be accepted, and that a structure with higher energy has a probability of being accepted depending on the energy increase and the product $kT$, where $k$ is the Boltzmann constant and $T$ the "temperature" of the simulation. During the packing procedure, the molecules are treated as rigid bodies and the temperature is slowly decreased from several thousand kelvins to 300 K. Thus energy barriers are easily overcome in the beginning of the simulation, and there is a gradual steering toward low energy structures as $T$ drops. The move factor described earlier is used to aid the search. It is doubled every time a trial structure is accepted to encourage the search to visit another area of phase space. Every time a structure is rejected, the move factor is halved to get a more detailed sampling of that region of phase space. At all times, the move factor stays in the range of 0 to 1.

Typically, some 2000 trial structures are generated per space group and are clustered in the second step of the prediction. Clustering is based on interatomic distances, which are grouped according to force field atom type, the element, or the name of the atoms. Thus if four different atom types (say, a, b, c, and d) are present in the structure, 10 combinations of two atom types are possible (a-a, a-b, a-c, a-d, b-b, b-c, b-d, c-c, c-d, and d-d), which means that 10 types of interatomic distance are present. For each structure, a list of interatomic distances (within a certain cutoff radius) is made for all combinations of atom types. Clustering is then based on the similarity between the lists generated for different structures. A cluster is formed by taking the lowest energy trial structure, and adding to this cluster all structures having sufficiently similar distance lists. This process is repeated until all trial structures are clustered, or a preset maximum number of clusters has been created. Generally some 250 clusters are formed. The elegance of this clustering algorithm is its speed; a drawback is sometimes poor discrimination.

In the third step, the lowest energy structure from each cluster is subjected to a full-body minimization (i.e., including molecular flexibility) under space group symmetry constraints, using Ewald summation for the van der Waals as well as the electrostatic terms, and a fast second-derivative minimizer. Finally, in the fourth step the minimized structures are clustered once again, to remove duplicate structures. This automated four-step procedure produces possible polymorphs for a given combination of space group, number of molecules in the asymmetric unit, and molecular starting conformations. It has been applied to a number of crystal structures of organic molecules.
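The Metropolis acceptance test and the move factor update described above for the trial structure generation step can be sketched as follows. This is a minimal illustration, not the Polymorph Predictor implementation: the function names are hypothetical, energies are assumed to be in kcal/mol, and the small lower bound on the move factor is an assumption added here so that repeated halving cannot drive it to zero.

```python
import math
import random

K_B = 0.0019872  # Boltzmann constant in kcal/(mol K)


def metropolis_accept(e_old, e_new, temperature, rng=random):
    """Accept if exp[(E_old - E_new)/kT] exceeds a random number in [0, 1).
    A structure with lower (or equal) energy is always accepted."""
    if e_new <= e_old:
        return True
    return math.exp((e_old - e_new) / (K_B * temperature)) > rng.random()


def update_move_factor(move_factor, accepted, minimum=1e-3):
    """Double the move factor after an acceptance, halve it after a rejection,
    keeping it within (0, 1]. The lower bound 'minimum' is an assumption."""
    if accepted:
        return min(1.0, 2.0 * move_factor)
    return max(minimum, 0.5 * move_factor)


# An uphill step of 1 kcal/mol is accepted far more often at 4000 K than at 300 K.
for temperature in (4000.0, 300.0):
    probability = math.exp(-1.0 / (K_B * temperature))
    print(f"T = {temperature:6.0f} K: acceptance probability = {probability:.2f}")
```

Passing the random number generator as an argument keeps the acceptance test easy to check in isolation; in an actual search loop the default `random` module would normally be used.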
Related Software

It is worth mentioning other programs that can carry out some, but not all of the computational tasks in crystal structure prediction. For example, an efficient rigid body, second-derivative crystal packer, PCK83,100 could be combined with a crude packing generator to produce optimized crystal structures. Such an approach was used by Gavezzotti. Similarly, the DMAREL50 crystal structure relaxation program, which implements a set of distributed multipoles to model electrostatic interactions, could be utilized. Crystal95,28 FHI96MD,29 and CASTEP27 are among the programs that can do ab initio quantum mechanical calculations on crystalline materials. Unfortunately, neither Crystal95 nor FHI96MD has the capability to optimize crystal structures.

Computed crystal structures are not always in a standard cell setting (like the reduced cell12): the packing process may produce a unit cell with very acute angles, and the assignment of axis labels may be nonstandard. Although this makes no difference to the actual crystal structure, inasmuch as it is merely a matter of choosing between equivalent mathematical descriptions of the structure, a standard description of the lattice is often needed when one is comparing structures, converting data to other formats, and so on. Among the computer programs that can generate reduced cell parameters are PLATON101 and NIST*LATTICE.102 Space group symmetry can be detected in periodic structures by the symmetry-finding module in Cerius2 by MSI and by a (freely available) modified version of the library program ACMM.103 Methods have been published to derive both the conventional cell13 and the reduced cell.15
Le Page, Klug, and Tse104 describe a method to derive lattice parameters for atomic clusters, generated from simulations without periodic constraints. Aimed at simulations of inorganic materials, it relies on visual identification of atom pairs that are related by lattice translation. These are used to derive a primitive cell. Possible symmetry elements are then detected by the program MISSYM.105,106
Comparison of Different Techniques

Although in their operation crystal structure prediction programs differ widely, many follow a similar basic approach to the problem. First some form of sampling is carried out, for the generation of a set of trial structures or clusters. Then, from this set, possible crystal structures are generated via packing and energy minimization processes. In these processes, a number of common factors can be identified that influence speed, accuracy, and completeness of the predictions. A number of these factors are discussed below, and Table 1 gives a summary, to the best of our knowledge, based on the sometimes limited data in the relevant papers.

Table 1. Comparison of Characteristic Features of Crystal Structure Prediction Methods

Features (see table notes)

Program      a     b    c    d     e      f    g    h    i
ICE9         S     Y    N    FF    MMP    N    N    Y    Z' = 1
FlexCryst    S     N    N    S     N      N    Y    Y    Z' = 1
MDCP         MD    Y    N    FF    Y      Y    N    N    Z ≤ 4
MOLPAK       S     Y    N    FF    N+Y    N    N    Y    Z' = 1
MPA          R     Y    N    FF    Y      Y    N    N    Z ≤ 4
MSI PP       MC    Y    Y    FF    Y      Y    Y    Y    Unrestricted
PROMET3      S     Y    N    FF    N+Y    Y    Y    Y    Z ≤ 4
CRYSCA       R     Y    N    FF    Y      N    N    Y    Unrestricted
UPACK        S     Y    Y    FF    Y      N    Y    Y    Z' = 1

a Search type: systematic (S)/random (R)/MC/MD.
b Minimization of structures: Y/N.
c Full-body minimization: Y/N.
d Energy function: force field (FF)/scoring function (S).
e Coulombic interactions included: Y/N/(N+Y)/MMP. (N+Y) = in final stages only; MMP = via molecular multipole moments.
f Ewald summation: Y/N.
g Clustering: Y/N.
h Application of symmetry: Y/N.
i Maximum Z or Z' value: 1/2/.../unrestricted.

At the start of a calculation, one or more molecular conformations must be selected. If the simulation is carried out to test a novel prediction method, usually a known crystal structure from the CSD is used. Although it is tempting to use the known, solid state molecular conformation, a more rigorous test is to involve one or more minimum energy conformations generated by molecular mechanics or quantum mechanics.

Variables like starting orientations of molecules and initial cell parameters, which are continuous (as opposed to space group and number of molecules in the cell), can be sampled systematically or randomly, by taking frames from a molecular dynamics simulation or a Monte Carlo procedure. To perform a systematic search on the variables, one must limit the possibilities to certain discrete values. For example, translational parameters can be limited to points on a grid, and rotational parameters can be varied in steps. The step size or density of the grid points will eventually determine the completeness of the deterministic search. Factors defining the expanse of the space to be searched include the number of independent molecules in the unit cell (adding to the number of degrees of freedom), the number of bond rotations, and the size and shape of the molecule. Eventually, with growing complexity of the system, a systematic search will become impractical and random search or Monte Carlo methods more effective. A molecular dynamics simulation is least favorable,107,108 because much time is spent near a few minima, making a thorough sampling rather time-consuming.

In many of the approaches described above, a large part of the CPU time is spent on minimization of trial structures. Important factors here are the minimization algorithm, the energy function, and the number of variables to be minimized. In principle, the minimization variables are all atomic coordinates and the cell parameters; their number can be reduced by imposing space group symmetry or by treating the molecules as rigid bodies. If rigid body constraints are imposed, the molecular conformation cannot change in response to packing forces.
If space group symmetry is imposed, the coupling of the movements of symmetry-related atoms may lead to energy minima that become unstable when symmetry constraints are removed.35 Thus space group constraints may lead to false energy minima. Choosing a more tractable energy expression, by neglecting some of its critical terms at appropriate stages of the minimization, offers another way to reduce computational cost, a strategy followed in PROMET3, where electrostatic interactions are initially left out, and in MOLPAK, which omits the attractive part of the van der Waals interactions. To limit the number of atom-atom interactions to be considered in the calculation of van der Waals and electrostatic interactions, a suitable cutoff radius (UPACK) or an Ewald summation (MPA, Polymorph Predictor) may be used, or both (Gibson and Scheraga83). Finally, the use of lookup tables instead of repeated evaluation of interaction functions may speed up computations.
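The combination of a cutoff radius with a lookup table can be sketched as follows. This is a minimal illustration under stated assumptions: the Lennard-Jones form, its parameters, the class name `PairTable`, and the use of linear interpolation are all chosen here for brevity and are not taken from any of the programs discussed.

```python
def lennard_jones(r, epsilon=0.1, sigma=3.4):
    """Illustrative 12-6 pair potential (parameter values are arbitrary)."""
    s6 = (sigma / r) ** 6
    return 4.0 * epsilon * (s6 * s6 - s6)


class PairTable:
    """Tabulates a pair potential once on a regular grid and then answers
    queries by linear interpolation; beyond the cutoff the energy is zero."""

    def __init__(self, func, r_min, r_cut, n_points):
        self.r_min = r_min
        self.r_cut = r_cut
        self.dr = (r_cut - r_min) / (n_points - 1)
        self.values = [func(r_min + i * self.dr) for i in range(n_points)]

    def __call__(self, r):
        if r >= self.r_cut:
            return 0.0                      # interaction neglected beyond cutoff
        if r <= self.r_min:
            return self.values[0]           # clamp very short distances
        t = (r - self.r_min) / self.dr
        i = min(int(t), len(self.values) - 2)
        frac = t - i
        return (1.0 - frac) * self.values[i] + frac * self.values[i + 1]


table = PairTable(lennard_jones, r_min=2.5, r_cut=10.0, n_points=7501)
for r in (3.5, 4.0, 9.0, 12.0):
    print(f"r = {r:5.2f}  table = {table(r): .6f}  direct = {lennard_jones(r): .6f}")
```

The table trades a small interpolation error for the cost of evaluating powers at every atom pair; the simple truncation at the cutoff used here leaves a small energy discontinuity, which production codes usually treat more carefully.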
Using Experimental Data

If the crystal structure of a particular observed polymorph is to be determined, experimental data of several types can be used at different stages of the prediction. Powder diffraction data as well as other data (from, e.g., solid state NMR or IR spectroscopy) can be used in setting up the simulation. Spectroscopic data may provide information on the orientation of the molecules in the crystal and help in choosing the initial conformations of the molecules in the prediction. Solid state NMR data can be used to determine the number of molecules in the crystal that are not related by symmetry, and thereby the number of molecules in the asymmetric unit. Cell parameters can be obtained from a good powder diffraction pattern by searching for a set of cell axes and angles that would produce reflections at the observed diffraction angles. Such a procedure, called "indexing," can be carried out mostly automatically with programs like TREOR90110 or DICVOL91.111 Once the cell parameters have been established, the cell volume provides an estimate for Z, the number of molecules in the cell. Based on Z and the observed cell parameters, the space groups that are most likely can be identified. For example, if all cell angles are 90°, all cell axes have different lengths, and Z is estimated to be 4, the space group is most probably P2₁2₁2₁. If cell angles and cell lengths all have different values, and Z equals 2, space group P1̄ is most likely. In a later stage of the prediction, powder diffraction data can be compared with simulated powder patterns of proposed polymorphs for identification purposes. A model structure that is close to the experimental one will produce a similar pattern and can then be further refined by means of the Rietveld method.112
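The estimate of Z from the cell volume mentioned above can be sketched as follows. The cell volume formula is the standard expression for a general triclinic cell; the molecular volume estimate of roughly 18 Å³ per non-hydrogen atom is a common rule of thumb for organic crystals and is an assumption introduced here, not a value taken from this chapter. The function names and the example cell are hypothetical.

```python
import math


def cell_volume(a, b, c, alpha, beta, gamma):
    """Volume (in Å^3) of a unit cell from axis lengths (Å) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)


def estimate_z(volume, n_non_h_atoms, volume_per_atom=18.0):
    """Estimate the number of molecules in the cell, assuming each
    non-hydrogen atom occupies about volume_per_atom Å^3 (rule of thumb)."""
    return max(1, round(volume / (volume_per_atom * n_non_h_atoms)))


# Hypothetical orthorhombic cell and a molecule with 14 non-hydrogen atoms.
volume = cell_volume(10.0, 10.0, 10.0, 90.0, 90.0, 90.0)
print(f"V = {volume:.1f} Å^3, estimated Z = {estimate_z(volume, 14)}")
```

When the measured crystal density is available, dividing the cell mass implied by the density by the molecular mass gives a more direct estimate of Z than the atomic volume rule.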
PREDICTING AND EVALUATING CRYSTAL STRUCTURES

The first steps of a crystal structure prediction usually involve choosing the starting molecular conformation(s) and calculation of a set of charges. A complete conformational analysis must be carried out to obtain a good set of conformations that is within an acceptable energy range. Alternatively, the CSD can often provide crystal structures of similar compounds, from which feasible molecular conformations or geometries of ionic complexes can be extracted. Semiempirical or ab initio quantum mechanical methods are then applied to derive a charge distribution (as explained earlier) and to optimize the molecular geometry if necessary. Typically, ESP-derived charges, based on Hartree-Fock calculations using a 6-31G* or 6-31G** basis set, are used. Although MNDO charges, scaled by an appropriate factor, may provide a computationally less expensive alternative, it should be kept in mind that the charge calculation generally takes only a small part of the total computation time spent in a polymorph search. Since, however, these charges may have a large influence on the calculated energy, the time needed for a more elaborate charge calculation may be well spent. Another option is to use a force field such as CFF,114 for
which atomic charges have been optimized together with the other parameters in the force field fitting procedure.

Crystal structures are usually predicted in separate runs for given combinations of space group and number of independent molecules in the unit cell, Z'. Although the explicit use of space group symmetry is, in principle, unnecessary if predictions with a suitable number of independent molecules in the cell are carried out, it will generally be more efficient to use space group symmetry and a small number of independent molecules instead, because this can reduce the number of variables drastically. Choosing space groups for structure prediction and estimating the number of independent molecules to use can be facilitated by using data from the CSD: statistical analysis115,116 of the CSD shows that five space groups (P2₁/c, P1̄, P2₁2₁2₁, C2/c, P2₁) account for approximately 78% of all crystal structures of organocarbon compounds in that database, and only 8.3% of all structures have more than one formula unit in the asymmetric unit. Optically pure chiral compounds obviously cannot crystallize in space groups that contain a mirror or an inversion operation. These compounds can crystallize only in a subset of 65 out of the 230 space groups. Their distribution over these 65 space groups is similar to the distribution of all compounds over this subset: 78% of the chiral compounds crystallize in either P2₁2₁2₁ or P2₁. Keep in mind that these numbers reflect the distribution of solved crystal structures for rather nonspecific sets of molecules. Structures that are less readily solved experimentally, such as those containing more than one (nonsolvent) molecule in the asymmetric unit (Z' > 1), may be underrepresented in the database. Particular subsets of structures may also deviate significantly from trends generally observed: for example,117 40% of the alcohol crystals have Z' > 1.
Still, as a general rule, predicting crystal structures with a single molecule in the asymmetric unit in P2₁/c, P1̄, P2₁2₁2₁, C2/c, and P2₁ for nonchiral compounds; in P2₁/c, P1̄, and C2/c for racemates; and in P2₁2₁2₁ and P2₁ for optically pure chiral compounds is, statistically, a logical place to start.

The first result from a crystal structure prediction is usually a large set of crystal structures for the complete range of space groups and Z values considered. An example is given in Figure 4, where energy and density of predicted crystal structures for acetic acid are plotted in a scatter diagram. Often hundreds of crystal packings are predicted. How relevant are all these minimum energy structures? One criterion for discarding proposed structures is the calculated energy: within 3 kcal/mol of the global minimum seems to be a reasonable acceptance range for crystals of typical organic molecules found in the CSD,16 although this number will depend on the size of the molecule. This cutoff means that, in principle, one can discard all predicted structures more than 3 kcal/mol above the lowest energy structure predicted. This range depends on how well the force field performs for the molecule; adjustments to standard force fields may be necessary (and feasible) in some cases, as in the determination of the structure of 4-amidinoindanone guanylhydrazone by Karfunkel et al.,118 where equilibrium bond distances and angles were shifted to values obtained in ab initio calculations.
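The energy criterion described above can be expressed as a simple filter. The sketch below is only a minimal illustration, assuming each predicted structure is available as a (label, lattice energy in kcal/mol) pair; the function name, the data layout, and the example energies are hypothetical, and the 3 kcal/mol default is the acceptance range quoted in the text.

```python
from typing import List, Tuple


def within_energy_window(structures: List[Tuple[str, float]],
                         window: float = 3.0) -> List[Tuple[str, float]]:
    """Keep structures within `window` kcal/mol of the lowest energy found.

    The returned list is sorted from lowest to highest energy.
    """
    if not structures:
        return []
    e_min = min(energy for _, energy in structures)
    kept = [(label, energy) for label, energy in structures
            if energy - e_min <= window]
    return sorted(kept, key=lambda item: item[1])


# Hypothetical lattice energies (kcal/mol), for illustration only
predicted = [("A", -42.1), ("B", -40.5), ("C", -38.9), ("D", -41.7)]
survivors = within_energy_window(predicted)
```

Because the appropriate range depends on the molecule and on the force field, the window is left as a parameter rather than fixed.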
Figure 4 Energy and density of predicted crystal structures of acetic acid. (Scatter plot; horizontal axis: energy in kcal/mol; vertical axis: density.)
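For reference, the density used in an energy-density diagram follows directly from the unit cell contents. The sketch below is a minimal illustration, assuming the cell volume is given in cubic ångströms and the molar mass in g/mol; the function name and the example cell volume are hypothetical and are not taken from Figure 4.

```python
AVOGADRO = 6.02214076e23  # particles per mole


def crystal_density(z: int, molar_mass: float, cell_volume_A3: float) -> float:
    """Return the density in g/cm^3.

    z              -- number of molecules in the unit cell
    molar_mass     -- molar mass in g/mol
    cell_volume_A3 -- unit cell volume in cubic angstroms
    """
    volume_cm3 = cell_volume_A3 * 1.0e-24  # 1 cubic angstrom = 1e-24 cm^3
    return z * molar_mass / (AVOGADRO * volume_cm3)


# Acetic acid (molar mass about 60.05 g/mol) in a hypothetical cell
rho = crystal_density(z=4, molar_mass=60.05, cell_volume_A3=300.0)
```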
If a Monte Carlo search procedure is used, it often proves more efficient to perform several short simulations rather than one long simulation for each space group. The overlap (or lack thereof) in the low energy structures identified in a series of short runs is a good indication of the effectiveness of the search; if the second or third simulation in a series does not provide any new low energy structures, it can usually be assumed that all relevant minima have been found. If only one long simulation is performed, it is much more difficult to establish the same level of confidence.

After the energy cutoff has been applied, a large number of predicted structures may yet remain. Their number can be further reduced for the following reasons:

1. The same packing structure may have been predicted in different space groups. One example is the prediction of a structure in P1, Z = 2, as well as in P2₁, Z = 1 (which, on the other hand, can be taken as an indicator of the completeness of the search). Intramolecular symmetry elements may lead to the prediction of the same structure in different space groups, with the same number of independent molecules. For example, certain crystal packings of acetic acid, which is mirror-symmetric, can be predicted in both P2₁/c, Z = 4, and P2₁2₁2₁, Z = 4 (both structures have a single independent molecule in the cell). A suitable clustering of the complete set of predicted structures should thus remove duplicate structures of this type.

2. Minimizations are often performed based on an assumed space group symmetry and a certain value of Z. Thus, constraints are imposed that may lead to structures that do not correspond to local energy minima with respect to the degrees of freedom that were constrained. Minimization using a superlattice (no symmetry imposed on the contents of the unit cell) or a supercell (a new cell made of two or more original cells) may lead to a lower energy minimum. Note that this artifact of imposing space group symmetry does not prevent correct crystal structures from being found; it merely leads to the generation of additional, unrealistic crystal structures.

3. Some of the energy minima may be separated from lower minima by a small barrier that would be easily overcome in reality. In practice, these metastable structures would therefore not correspond to stable polymorphs. A brief molecular dynamics simulation on all structures might overcome the small energy barriers.51

4. Even if a force field suitable for the particular class of molecules is used, its limitations may lead to artificial energy minima.58 Recalculation of the low energy structures with a different force field (that is also supposed to work well for the molecule in question) may then eliminate such spurious structures.
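A crude first pass at flagging the duplicates mentioned under point 1 is to group structures whose energy and density agree within small tolerances. The sketch below is an assumption-laden illustration, not the clustering method of any program cited in this chapter; the names, tolerances, and example values are hypothetical, and structures that coincide on these two numbers would still need a geometric comparison before being discarded.

```python
from typing import List, NamedTuple


class Packing(NamedTuple):
    label: str
    energy: float   # lattice energy, kcal/mol
    density: float  # g/cm^3


def remove_probable_duplicates(packings: List[Packing],
                               e_tol: float = 0.01,
                               d_tol: float = 0.005) -> List[Packing]:
    """Keep one representative per (energy, density) cluster.

    Structures are visited from low to high energy; a structure is kept
    only if no already accepted structure matches it within both tolerances.
    """
    unique: List[Packing] = []
    for p in sorted(packings, key=lambda q: q.energy):
        is_duplicate = any(abs(p.energy - u.energy) <= e_tol and
                           abs(p.density - u.density) <= d_tol
                           for u in unique)
        if not is_duplicate:
            unique.append(p)
    return unique


# Hypothetical values: the same packing found in two space groups
candidates = [
    Packing("P21/c #1", -41.80, 1.262),
    Packing("P212121 #3", -41.80, 1.262),
    Packing("P21/c #2", -41.10, 1.240),
]
unique = remove_probable_duplicates(candidates)
```

The same comparison can be used to check whether a further short Monte Carlo run contributes any low energy structure that is not already in the accepted list.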
Finally, if the structure of an experimentally observed polymorph is to be determined, powder diffraction data may help to identify the true crystal structure, as mentioned earlier. A flowchart describing the general procedure is given in Figure 5.
Figure 5 Flowchart describing the general polymorph prediction process. (Flowchart; legible box labels: experimental data such as IR-Raman, solid state NMR, elemental analysis, and crystal structures of other polymorphs; conformational analysis or conformers from crystal structures; information on space group and unit cell dimensions; parameterize force field if necessary; polymorph search per space group and per conformer; analysis of low energy, high density structures; experimental powder pattern; Rietveld refinement; prediction of physico-chemical solid state properties.)

Example: Polymorph Prediction for Estrone

As an illustration of the procedure given above, we will describe a prediction of possible polymorphs for the steroid estrone. The structures of three polymorphs of estrone are available in the CSD,119,120 one structure in space group P2₁, Z = 4, Z' = 2 (two independent molecules in the asymmetric unit) and two structures in P2₁2₁2₁, both with Z = 4, Z' = 1. The structure of the estrone molecule is given in Figure 6. We will assume the crystal structures to be unknown and indicate how experimental data could help predict the correct polymorphs.

Figure 6 Structural formulas of estrone, acetaminophen, 4-amidinoindanone guanylhydrazone (AIGH), acetic acid, benzene, prednisolone tert-butylacetate, and quinacridone.

First, we build models of the molecule we will use in our prediction. The molecule is sketched and optimized using a suitable force field. A search in the CSD for steroids with an identical ring skeleton will reveal that this type of skeleton is rather rigid because of the aromaticity of the first ring. The skeleton of our optimized molecule should fit reasonably well onto those of the experimental structures. Although the skeleton is rigid, there is some conformational flexibility in the molecule: the hydroxyl group has two possible orientations, both in the plane of the aromatic ring. The orientation of this group will have a profound influence on crystal structures, inasmuch as it plays an important role in the formation of hydrogen bonds. The barrier between the two conformations, however, is too high to be overcome during energy minimization. We will therefore have to carry out separate prediction runs for both rotamers. The two rotamers are next optimized using quantum mechanical methods, and ESP charges are calculated for the optimized structures. We now have our starting models.

The following step is to decide in which space groups to predict packing structures and how many independent molecules to place in the cell. Since we have a chiral molecule, and the compound is optically pure, it can crystallize only in space groups that lack a mirror or inversion center. The most common ones are P2₁, P2₁2₁2₁, and P1. If powder diffraction patterns are available, those may be used to obtain cell parameters. For the P2₁ structure, such patterns would indicate a primitive cell with two right angles and a volume corresponding to four molecules in the cell, so one would start with predictions in P2₁, Z' = 2.
The powder patterns of the other polymorphs would indicate a primitive cell with all right angles and four molecules in the unit cell. In this case, the space group is most likely P2₁2₁2₁ with Z' = 1. Other possibilities exist (e.g., P2₁2₁2, Z' = 1) but are statistically less probable. At this stage, one could use solid state NMR data to determine the number of independent molecules in the cell, Z', which in all three cases would be in agreement with the most probable space group and Z' combinations.

Next, crystal structures can be predicted with the program of choice. Taking the predictions for P2₁2₁2₁ as an example, we would obtain two sets of predicted structures, one for each starting conformation of the molecule. Because we have used different point charges in the calculation of these sets of structures, we cannot directly compare the MM energies of structures in different sets. To obtain energies that can be compared, one must use structures that have been optimized using the same point charges. This can be done by transferring the charges used for one conformer to the structures of the other conformer and minimizing those structures once more, or by calculating charges suitable for both conformers (the average of the charges calculated for the different conformers could be used) and minimizing all structures using these charges. Finally, powder diffraction patterns may be calculated for low energy structures and compared with the experimentally obtained patterns. Simulated patterns for one of the experimentally observed estrone polymorphs and for the corresponding predicted structure are given in Figure 7a and 7b. A superposition of the two structures is given in Figure 8.
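The second option mentioned above, a common charge set obtained by averaging, amounts to a per-atom mean over the conformers. The sketch below is a minimal illustration, assuming the atoms are listed in the same order for every conformer; the function name and the charge values are hypothetical.

```python
from typing import List, Sequence


def average_charges(charge_sets: Sequence[Sequence[float]]) -> List[float]:
    """Return the per-atom mean of several charge sets.

    Each inner sequence holds the atomic charges of one conformer,
    with atoms in identical order across conformers.
    """
    if not charge_sets:
        raise ValueError("need at least one charge set")
    n_atoms = len(charge_sets[0])
    if any(len(charges) != n_atoms for charges in charge_sets):
        raise ValueError("all conformers must list the same atoms in the same order")
    n_sets = len(charge_sets)
    return [sum(charges[i] for charges in charge_sets) / n_sets
            for i in range(n_atoms)]


# Hypothetical ESP charges for three atoms in two rotamers
rotamer_1 = [-0.62, 0.41, 0.21]
rotamer_2 = [-0.58, 0.39, 0.19]
common = average_charges([rotamer_1, rotamer_2])
```

Because averaging is linear, the total molecular charge is preserved whenever every input set sums to the same value. All structures would then be minimized again with the common charges before their energies are compared.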
APPLICATION EXAMPLES

In the literature, crystal structure predictions are presented for compounds with known crystal structures (for testing purposes) as well as for compounds for which no experimental data were available or for which only limited data exist (usually X-ray powder diffraction patterns). Rigid molecules composed of (aromatic) ring systems, as well as structures selected randomly from the CSD, are popular for testing purposes. If the molecules are flexible, usually the conformation observed in the true crystal structure is used in the prediction, which will bias the results. Exceptions include the crystal structure prediction for pigment red,65 predictions on monosaccharides, where six torsional angles are assumed unknown,85 and the 4-amidinoindanone guanylhydrazone example discussed below.118 Reports focusing on as yet unknown crystal structures, based on computer simulations, are still rare. One reason for their scarcity is that the field is relatively new; in addition, this type of work often takes place in an industrial environment, where the application examples remain confidential for some time. The sections that follow list a number of application examples. The structural formulas of the compounds involved are given in Figure 6.
Figure 7 The X-ray powder diffraction patterns of one of the experimentally observed polymorphs of estrone (a) and of the corresponding predicted structure (b). (Plots of intensity versus diffraction angle; X-ray radiation, wavelength 1.5418 Å.)
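Peak positions on the diffraction-angle axis of patterns such as those in Figure 7 are related to lattice spacings by Bragg's law, λ = 2d sin θ (first order). The sketch below is a minimal illustration that converts a spacing d into the angle 2θ for the wavelength used in the figure (1.5418 Å); the function name and the example spacing are hypothetical.

```python
import math


def two_theta_degrees(d_spacing: float, wavelength: float = 1.5418) -> float:
    """Return the diffraction angle 2-theta (degrees) from Bragg's law.

    d_spacing and wavelength must be in the same unit (here angstroms).
    """
    sin_theta = wavelength / (2.0 * d_spacing)
    if not 0.0 < sin_theta <= 1.0:
        raise ValueError("no diffraction possible for this spacing and wavelength")
    return 2.0 * math.degrees(math.asin(sin_theta))


# A hypothetical lattice spacing of 5.0 angstroms
angle = two_theta_degrees(5.0)
```

Comparing the positions computed for a predicted cell with the observed peak positions is a quick consistency check before a full pattern simulation or Rietveld refinement.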
Figure 8 Superposition of an experimentally observed polymorph of estrone and the corresponding predicted structure.

The ab initio prediction of the crystal packing of 4-amidinoindanone guanylhydrazone (AIGH) by Karfunkel et al.118 provides an example of the determination of a previously unknown crystal packing by polymorph prediction. This compound was proposed by scientists at Ciba-Geigy (Novartis) as an anticancer drug. Two solvent-free polymorphs, called A and B, were observed experimentally. The crystal structure of B could be solved by single crystal X-ray diffraction, but no crystals of sufficient quality could be grown for polymorph A. However, a powder diffraction pattern of A could be used to obtain reduced lattice parameters and the number of molecules in the unit cell, implying two possibilities for the space group: P1, Z' = 2, and P1̄, Z' = 1, the latter being most probable. The molecule may exist in two tautomeric forms. After energy calculations on different conformers of the two tautomers, four conformers of the most stable tautomer were selected for use in packing predictions in P1̄, using MSI's Polymorph Predictor. In these calculations, the DREIDING-2.21 force field121 was used, with some corrections made to the parameters to permit reproduction of the optimized geometries obtained in ab initio calculations. The predicted crystal structure with lowest energy had cell parameters close to those derived from the powder diffraction pattern, and a satisfactory agreement between the observed and simulated powder patterns could be achieved using Rietveld refinement. Thus, both the conformation of AIGH and the packing of the A polymorph, which were unknown before, were determined.

Another example concerns the analgesic p-hydroxyacetanilide, better known as acetaminophen and marketed as Tylenol (in the United States) and Paracetamol (elsewhere). This substance has two anhydrous polymorphs. The monoclinic form122 is known to be considerably more stable than the orthorhombic modification. Successful crystallization of the latter metastable form has been reported only a few times. Both crystal structures (see Figure 9) involve a two-dimensional hydrogen-bonding motif in the lattice. A study was undertaken, as part of a validation project of MSI's Polymorph Predictor, to determine whether any other polymorphs of this compound could be expected. The polymorph search was performed in 17 space groups (together accounting for more than 95% of known molecular crystal structures) with one molecule in the asymmetric unit. Both experimentally known crystal structures were predicted in the correct stability order. The search, using the DREIDING-2.21 force field in combination with semiempirical MNDO-ESP atomic charges, also suggested a third potential polymorph in the P2₁2₁2₁ space group to be about as stable as the metastable orthorhombic form. However, closer inspection of that structure revealed a distinct hydrogen-bonding pattern, which is incorrectly favored by the DREIDING force field. Recalculation of the lattice energetics with a force field more suitable for distinguishing subtle differences in hydrogen-bonding patterns (CFF114) established that this third structure is actually too unstable to exist.

Figure 9 Experimentally observed polymorphs of acetaminophen. Left: monoclinic form (CSD ref. code: HXACAN01). Right: orthorhombic form (CSD ref. code: HXACAN).

Recently, Payne et al.78 applied the Polymorph Predictor to acetic acid and halogenated analogs thereof. As starting molecular geometries, they used either the known, experimental conformation, the STO-3G optimized structure, or the structure optimized with the 6-31G** basis set.89
For each geometry, 6-31G** ESP atomic charges were calculated and used in the predictions. In all cases, a packing was found that corresponded to the energy-minimized crystal structure, as well as several packings with an even lower energy (within 1 kcal/mol). The authors attributed this apparent error in relative energies of different packings to shortcomings in the DREIDING-2.21 force field, specifically in its hydrogen bond potential. One of their predicted structures in P2₁/c could be refined reasonably well to agree with the X-ray powder diffraction pattern of a high pressure form of acetic acid reported by Bertie and Wilton, for which no experimental crystal structure is known. Thus, the authors78 were able to construct a model that is a good representation of the crystal structure, in the absence of single crystal diffraction data.

Benzene is a popular validation case for crystal structure prediction methods. Two polymorphic forms of the compound are known: a stable one with Z = 4 (ref. 126) and a high pressure, metastable form with Z = 2 (ref. 127). When MSI's Polymorph Predictor is used with the DREIDING-2.21 force field and atomic charges of -0.15 on carbon atoms and +0.15 on hydrogen atoms, both polymorphs can easily be predicted. Because the simulation does not consider pressure, the unit cell vectors of the high pressure polymorph are predicted too long. It is also possible to predict both forms in one single simulation by performing a search in space group P1, with Z = 4. Note, however, that such a prediction with multiple molecules in the asymmetric unit is feasible only for simple, highly symmetrical compounds like benzene. For asymmetrical compounds, such simulations can prove unfeasible because of the large number of degrees of freedom due to the additional molecules in the asymmetric unit.

The steroid prednisolone-t-butylacetate was patented as a glucocorticoid by Merck & Company in 1962. Two anhydrous polymorphs and five solvent-containing "pseudopolymorphs" are known,128 but the crystal structure of one anhydrous form has never been determined experimentally. Indexing of the experimental powder pattern of this elusive polymorph indicates that the crystal belongs to the P2₁2₁2₁ space group. Simulations with the MSI Polymorph Predictor, considering two of the low energy conformations of the compound and using the DREIDING-2.21 force field with MNDO-ESP atomic charges, find a stable P2₁2₁2₁ polymorph with a simulated powder pattern very similar to the experimental pattern.129 Subsequent Rietveld refinement confirmed that this predicted structure is the previously undetermined anhydrous polymorph of prednisolone-t-butylacetate.

Quinacridone and its derivatives represent one of the most important classes of organic pigments, both in terms of annual production and in terms of wide ranging applications. The excellent performance of these pigments is explained by their thermal stability, weather resistance, and lightfastness. The parent compound quinacridone can crystallize in at least three polymorphs,131 each having a different color and a different set of application properties.
Although the experimental crystal structure of the most stable γ polymorph has been reported recently,132 rational control over the system is still impeded by a lack of knowledge of the crystal structures of the α and β forms. The MSI Polymorph Predictor (with the DREIDING-2.21 force field and ab initio 6-31G** ESP atomic charges) predicted all possible crystal forms of quinacridone.133 Powder patterns simulated for the three most stable predicted structures compare well with experimental powder patterns recorded for the three polymorphs of quinacridone. Rietveld refinement of the predicted structures was successful, indicating that the previously undetermined α and β polymorphs are now solved.
ACKNOWLEDGMENTS

We acknowledge the support of this work by the Computational Materials Science Crystallization project, a Dutch research collaboration with academic and industrial partners, focusing on precompetitive research into modeling, packing, morphology, and industrial crystallization of organic compounds. Project information is accessible at http://www.caos.kun.nl/cmsc/. The industrial partners contributing to the project funding are Akzo-Nobel, Organon, Unilever, and DSM. Additional funding support is obtained from the Netherlands Organization for Scientific Research (NWO) and the Netherlands Foundation for Chemical Research (SON).
REFERENCES

1. J. D. Dunitz and J. Bernstein, Acc. Chem. Res., 28, 193 (1995). Disappearing Polymorphs.
2. G. H. Stout and L. H. Jensen, X-Ray Structure Determination, Macmillan, New York, 1970. (a) Chapter 3. (b) Chapter 2.
3. A. I. Kitaigorodskii, Organic Chemical Crystallography, Consultants Bureau, New York, 1961.
4. D. E. Williams, Acta Crystallogr., 21, 340 (1966). Crystal Structure of Dibenzoylmethane.
5. D. E. Williams, Trans. Am. Cryst. Assoc., 6, 21 (1970). Computer Calculation of the Structure and Physical Properties of the Crystalline Hydrocarbons.
6. D. E. Williams, Acta Crystallogr., Sect. A, 25, 464 (1969). A Method of Calculating Molecular Crystal Structures.
7. D. E. Williams, Acta Crystallogr., Sect. B, 29, 96 (1973). Crystal Structure of 2,4,6-Triphenylverdazyl.
8. P. Zugenmaier and A. Sarko, Acta Crystallogr., Sect. B, 28, 3158 (1972). Packing Analysis of Carbohydrates and Polysaccharides. 1. Monosaccharides.
9. C. P. Brock and J. A. Ibers, Acta Crystallogr., Sect. B, 29, 2426 (1973). Conformational Analysis of the Triphenylphosphine Molecule in the Free and Solid States.
10. W. R. Busing, Acta Crystallogr., Sect. A, 28, S252 (1972). A Computer Program to Aid in the Understanding of Interatomic Forces in Molecules and Crystals.
11. A. Kvick and J. H. Noordik, Acta Crystallogr., Sect. B, 33, 2862 (1977). Hydrogen Bond Studies. CXXI. Structure Determination of 2-Amino-4-methylpyridine by Molecular Packing Analysis and X-Ray Diffraction.
12. T. Hahn, Ed., International Tables for Crystallography, Reidel, Dordrecht, 1983, Vol. A, pp. 734-744.
13. Y. Le Page, J. Appl. Crystallogr., 15, 255 (1982). The Derivation of the Axes of the Conventional Unit Cell from the Dimensions of the Buerger-Reduced Cell.
14. L. C. Andrews and H. J. Bernstein, Acta Crystallogr., Sect. A, 44, 1009 (1988). Lattices and Reduced Cells as Points in 6-Space and Selection of Bravais Lattice Type by Projections.
15. L. Zuo, J. Muller, M.-J. Philippe, and C. Esling, Acta Crystallogr., Sect. A, 51, 943 (1995). A Refined Algorithm for the Reduced-Cell Determination.
16. A. Gavezzotti and G. Filippini, J. Am. Chem. Soc., 117, 12299 (1995). Polymorphic Forms of Organic Crystals at Room Conditions: Thermodynamic and Structural Implications.
17. K. D. Gibson and H. A. Scheraga, J. Phys. Chem., 99, 3765 (1995). Crystal Packing Without Symmetry Constraints. 2. Possible Crystal Packings of Benzene Obtained by Energy Minimization from Multiple Starts.
18. G. Filippini and C. M. Gramaccioli, Acta Crystallogr., Sect. B, 42, 605 (1986). Thermal Motion Analysis in Tetraphenylmethane: A Lattice-Dynamical Approach.
19. I. Weissbuch, R. Popovitz-Biro, M. Lahav, and L. Leiserowitz, Acta Crystallogr., Sect. B, 51, 115 (1995). Understanding and Control of Nucleation, Habit, Dissolution and Structure of Two- and Three-Dimensional Crystals Using 'Tailor-Made' Auxiliaries.
20. R. J. Davey, S. J. Maginn, S. J. Andrews, S. N. Black, A. M. Buckley, D. Cottier, P. Dempsey, R. Plowman, J. E. Rout, D. R. Stanley, and A. Taylor, J. Chem. Soc., Faraday Trans., 90, 1003 (1994). Morphology and Polymorphism in Molecular Crystals: Terephthalic Acid.
21. J. P. Bowen and N. L. Allinger, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1991, Vol. 2, pp. 81-97. Molecular Mechanics: The Art and Science of Parameterization.
22. U. Dinur and A. T. Hagler, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1991, Vol. 2, pp. 99-164. New Approaches to Empirical Force Fields.
23. I. Pettersson and T. Liljefors, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1996, Vol. 9, pp. 167-189. Molecular Mechanics Calculated Conformational Energies of Organic Molecules: A Comparison of Force Fields.
24. A. M. Chaka, R. Zaniewski, W. Youngs, C. Tessier, and G. Klopman, Acta Crystallogr., Sect. B, 52, 165 (1996). Predicting the Crystal Structure of Organic Molecular Materials.
25. F. H. Allen and O. Kennard, Chem. Design Autom. News, 8, 31 (1993). 3D Search and Research Using the Cambridge Structural Database. The URL is http://www.ccdc.cam.ac.uk/.
26. J. R. Holden, Z. Du, and H. Ammon, J. Comput. Chem., 14, 422 (1993). Prediction of Possible Crystal Structures for C-, H-, N-, O-, and F-Containing Organic Compounds.
27. M. C. Payne, M. P. Teter, D. C. Allan, T. A. Arias, and J. D. Joannopoulos, Rev. Mod. Phys., 64, 1045 (1992). Iterative Minimization Techniques for Ab Initio Total-Energy Calculations: Molecular Dynamics and Conjugate Gradients.
28. R. Dovesi, V. R. Saunders, C. Roetti, M. Causà, N. M. Harrison, R. Orlando, and E. Aprà, Crystal-Electronic Structure of Periodic Systems, User Manual (1996). The URL is http://gserv1.dl.ac.uk/TCSC/Software/CRYSTAL/.
29. R. Stumpf and M. Scheffler, Comput. Phys. Commun., 79, 447 (1994). Simultaneous Calculation of the Equilibrium Atomic Structure and its Electronic Ground State Using Density-Functional Theory. The URL is http://www.fhi-berlin.mpg.de/th/fhi96md/code.html.
30. D. W. M. Hofmann and T. Lengauer, Acta Crystallogr., Sect. A, 53, 225 (1997). A Discrete Algorithm for Crystal Structure Prediction of Organic Molecules.
31. J. Gasteiger and M. Marsili, Tetrahedron, 36, 3219 (1980). Iterative Partial Equalization of the Orbital Electronegativity-A Rapid Access to Atomic Charges.
32. A. K. Rappé and W. A. Goddard III, J. Phys. Chem., 95, 3358 (1991). Charge Equilibration for Molecular Dynamics Simulations.
33. R. S. Mulliken, J. Chem. Phys., 23, 1833 (1955). Electronic Population Analysis on LCAO-MO (Linear Combination of Atomic Orbitals-Molecular Orbital) Molecular Wave Functions.
34. F. A. Momany, J. Phys. Chem., 82, 592 (1978). Determination of Partial Atomic Charges from Ab Initio Molecular Electrostatic Potentials. Application to Formamide, Methanol, and Formic Acid.
35. S. R. Cox and D. E. Williams, J. Comput. Chem., 2, 304 (1981). Representation of the Molecular Electrostatic Potential by a Net Atomic Charge Model.
36. D. E. Williams, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1991, Vol. 2, pp. 219-271. Net Atomic Charge and Multipole Models for the Ab Initio Molecular Electric Potential.
37. S. M. Bachrach, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1994, Vol. 5, pp. 171-227. Population Analysis and Electron Densities from Quantum Mechanics.
38. J. J. P. Stewart, MOPAC93, Fujitsu Limited, Tokyo, 1993. The URL is http://www.fujitsu.com/.
39. M. J. Frisch, G. W. Trucks, H. B. Schlegel, P. M. W. Gill, B. G. Johnson, M. A. Robb, J. R. Cheeseman, T. Keith, G. A. Petersson, J. A. Montgomery, K. Raghavachari, M. A. Al-Laham, V. G. Zakrzewski, J. V. Ortiz, J. B. Foresman, J. Cioslowski, B. B. Stefanov, A. Nanayakkara, M. Challacombe, C. Y. Peng, P. Y. Ayala, W. Chen, M. W. Wong, J. L. Andres, E. S. Replogle, R. Gomperts, R. L. Martin, D. J. Fox, J. S. Binkley, D. J. DeFrees, J. Baker, J. P. Stewart, M. Head-Gordon, C. Gonzalez, and J. A. Pople, Gaussian 94, Gaussian Inc., Pittsburgh, PA, 1995.
40. M. W. Schmidt, K. K. Baldridge, J. A. Boatz, S. T. Elbert, M. S. Gordon, J. H. Jensen, S. Koseki, N. Matsunaga, K. A. Nguyen, S. Su, T. L. Windus, M. Dupuis, and J. A. Montgomery, J. Comput. Chem., 14, 1347 (1993). General Atomic and Molecular Electronic Structure System.
41. M. Guest, J. Kendrick, J. van Lenthe, K. Schoeffel, and P. Sherwood, GAMESS-UK Users Guide and Reference Manual. Computing for Science (CFS) Ltd., Daresbury Laboratory, UK, 1994.
42. G. Schaftenaar, MOLDEN, QCPE Bulletin, 12 (1992), Program No. 619, Quantum Chemistry Program Exchange, Indiana University, Bloomington, Indiana, USA. The URL for MOLDEN is http://www.caos.kun.nl/~schaft/molden/molden.html. See also http://qcpe5.chem.indiana.edu/qcpe.html. E-mail: [email protected].
43. G. Schaftenaar and J. H. Noordik, J. Comput. Aided Mol. Design, submitted (1998). MOLDEN: A Pre- and Post-Processing Program for Molecular and Electronic Structures.
44. D. E. Williams, PDM93, Electrostatic Potential-Derived Charges and Multipoles, 1993. Department of Chemistry, University of Louisville, Louisville, KY 40292. E-mail: dew01@xray5.chem.louisville.edu.
45. L. E. Chirlian and M. M. Francl, J. Comput. Chem., 8, 894 (1987). Atomic Charges Derived from Electrostatic Potentials: A Detailed Study.
46. B. H. Besler, K. M. Merz, and P. A. Kollman, J. Comput. Chem., 11, 431 (1990). Atomic Charges Derived from Semiempirical Methods.
47. C. M. Breneman and K. B. Wiberg, J. Comput. Chem., 11, 361 (1990). Determining Atom-Centered Monopoles from Molecular Electrostatic Potentials. The Need for High Sampling Density in Formamide Conformational Analysis.
48. C. I. Bayly, P. Cieplak, W. D. Cornell, and P. A. Kollman, J. Phys. Chem., 97, 10269 (1993). A Well-Behaved Electrostatic Potential Based Method Using Charge Restraints for Deriving Atomic Charges: The RESP Model.
49. C. A. Reynolds, J. W. Essex, and W. G. Richards, J. Am. Chem. Soc., 114, 9075 (1992). Atomic Charges for Variable Molecular Conformations.
50. D. J. Willock, S. L. Price, M. Leslie, and C. R. A. Catlow, J. Comput. Chem., 16, 628 (1995). The Relaxation of Molecular Crystal Structures Using a Distributed Multipole Electrostatic Model.
51. W. T. M. Mooij, B. P. van Eijck, S. P. Price, P. Verwer, and J. Kroon, J. Comput. Chem., 19, 459 (1998). Crystal Structure Predictions for Acetic Acid.
52. F. J. J. Leusen, H. J. Bruins Slot, J. H. Noordik, A. D. van der Haest, H. Wynberg, and A. Bruggink, Recl. Trav. Chim. Pays-Bas, 111, 111 (1992). Towards a Rational Design of Resolving Agents. Part IV. Crystal Packing Analyses and Molecular Mechanics Calculations for Five Pairs of Diastereomeric Salts of Ephedrine and a Cyclic Phosphoric Acid.
53. K. D. Gibson and H. A. Scheraga, J. Phys. Chem., 99, 3752 (1995). Crystal Packing Without Symmetry Constraints. 1. Test of a New Algorithm for Determining Crystal Structures by Energy Minimization.
54. D. E. Williams, Acta Crystallogr., Sect. A, 27, 452 (1971). Accelerated Convergence of Crystal-Lattice Potential Sums.
55. N. Karasawa and W. A. Goddard III, J. Phys. Chem., 93, 7320 (1989). Acceleration of Convergence for Lattice Sums.
56. B. P. van Eijck and J. Kroon, J. Phys. Chem. B, 101, 1096 (1997). Coulomb Energy of Polar Crystals.
57. W. H. Press, B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling, in Numerical Recipes: The Art of Scientific Computing, Cambridge University Press, Cambridge, 1987, pp. 274-334. Minimization or Maximization of Functions.
58. A. Gavezzotti, Acta Crystallogr., Sect. B, 52, 201 (1996). Polymorphism of 7-Dimethylaminocyclopenta[c]coumarin: Packing Analysis and Generation of Trial Crystal Structures.
59. D. E. Williams, Acta Crystallogr., Sect. A, 52, 326 (1996). Ab Initio Molecular Packing Analysis.
60. D. C. Sorescu, B. M. Rice, and D. L. Thompson, J. Phys. Chem. B, 101, 798 (1997). Intermolecular Potential for the Hexahydro-1,3,5-trinitro-1,3,5-s-triazine Crystal (RDX): A Crystal Packing, Monte Carlo and Molecular Dynamics Study.
61. J. E. Dennis and R. B. Schnabel, Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Prentice-Hall, Englewood Cliffs, NJ, 1983, pp. 167-217.
62. D. E. Williams, Chem. Phys. Lett., 192, 538 (1992). OREMWA Prediction of the Structure of Benzene Clusters: Transition from Subsidiary to Global Energy Minima.
63. T. Shoda and D. E. Williams, J. Mol. Struct. (THEOCHEM), 357, 1 (1995). Molecular Packing Analysis. Part 3. The Prediction of m-Nitroaniline Crystal Structure.
64. L. C. Andrews, H. J. Bernstein, and G. A. Pelletier, Acta Crystallogr., Sect. A, 36, 248 (1980). A Perturbation Stable Cell Comparison Technique.
65. H. R. Karfunkel, B. Rohde, F. J. J. Leusen, R. J. Gdanitz, and G. Rihs, J. Comput. Chem., 14, 1125 (1993). Continuous Similarity Measure Between Nonoverlapping X-Ray Powder Diagrams of Different Crystal Modifications.
66. Cerius2 User Guide, March 1997, Molecular Simulations Inc., 9685 Scranton Road, San Diego, CA. The URL is http://www.msi.com/.
67. B. P. van Eijck and J. Kroon, J. Comput. Chem., 18, 1036 (1997). Fast Clustering of Equivalent Structures in Crystal Structure Prediction.
68. A. V. Dzyabchenko, Acta Crystallogr., Sect. B, 50, 414 (1994). Method of Crystal-Structure Similarity Searching.
69. D. E. Williams, Acta Crystallogr., Sect. A, 36, 715 (1980). Calculated Energy and Conformation of Clusters of Benzene Molecules and Their Relationship to Crystalline Benzene.
70. B. W. van de Waal, Acta Crystallogr., Sect. A, 37, 762 (1981). Significance of Calculated Cluster Conformations of Benzene: Comment on a Publication by D. E. Williams.
71. S. Oikawa, M. Tsuda, H. Kato, and T. Urabe, Acta Crystallogr., Sect. B, 41, 437 (1985). Growth Mechanism of Benzene Clusters and Crystalline Benzene.
72. A. Gavezzotti, PROMET3, A Program for the Generation of Possible Crystal Structures from the Molecular Structure of Organic Compounds, 1994. Available on request from A. Gavezzotti, University of Milan, Via Veneziano 21, I-20133 Milan, Italy. E-mail: gave@stinch12.csmtbo.mi.cnr.it.
73. J. Perlstein, J. Am. Chem. Soc., 114, 1955 (1992). Molecular Self-Assemblies: Monte Carlo Prediction for the Structure of the One-Dimensional Translation Aggregate.
74. J. Perlstein, J. Am. Chem. Soc., 116, 455 (1994). Molecular Self-Assemblies. 2. A Computational Method for the Prediction of the Structure of One-Dimensional Screw, Glide, and Inversion Molecular Aggregates and Implications for the Packing of Molecules in Monolayers and Crystals.
75. J. Perlstein, Chem. Mater., 6, 319 (1994). Molecular Self-Assemblies. 3. Quantitative Predictions for the Packing Geometry of Perylenedicarboximide Translation Aggregates and the Effects of Flexible End Groups. Implications for Monolayers and Three-Dimensional Crystal Structure Predictions.
76. J. Perlstein, J. Am. Chem. Soc., 116, 11420 (1994). Molecular Self-Assemblies. 4. Using Kitaigorodskii's Aufbau Principle for Quantitatively Predicting the Packing Geometry of Semiflexible Organic Molecules in Translation Monolayer Aggregates.
77. J. Perlstein, K. Steppe, S. Vaday, and E. M. N. Ndip, J. Am. Chem. Soc., 118, 8433 (1996). Molecular Self-Assemblies. 5. Analysis of the Vector Properties of Hydrogen Bonding in Crystal Engineering.
78. R. S. Payne, R. J. Roberts, R. C. Rowe, and R. Docherty, J. Comput. Chem., 19, 1 (1998). The Generation of Crystal Structures of Acetic Acid and Its Halogenated Analogues.
79. D. S. Coombes, G. K. Nagi, and S. L. Price, Chem. Phys. Lett., 265, 532 (1997). On the Lack of Hydrogen Bonds in the Crystal Structure of Alloxan.
80. A. Gavezzotti, J. Am. Chem. Soc., 113, 4622 (1991). Generation of Possible Crystal Structures from the Molecular Structure for Low-Polarity Organic Compounds.
81. R. P. Scaringe and S. Perez, J. Phys. Chem., 91, 2394 (1987). A Novel Method for Calculating the Structure of Small-Molecule Chains on Polymeric Templates.
References 363 82. N. L. Allinger, J. Am. Chem. SOC.,99, 8127 (1977). Conformational Analysis. 130. MM2. A Hydrocarbon Force Field Utilizing V, and V, Torsional Terms. 83. D. E. Williams, Program mpalmpg, Molecular Packing Analysis/Molecular Packing Graphics, 1996. Department of Chemistry, University of Louisville, Louisville, KY 40292. 84. N. Tajima, T. Tanaka, T. Arikawa, T. Sukarai, S. Teramae, and T. Hirano, Bull. Chem. Soc. Jpn., 68, 519 (1995). A Heuristic Molecular-Dynamics Approach for the Prediction of a Molecular Crystal Structure. 85. B. P. van Eijck, W. T. M. Mooij, and J. Kroon, Acta Crystullogr., Sect. B, 51, 99 (1995). Attempted Prediction of the Crystal Structures of Six Monosaccharides. 86. M. U. Schmidt and U. Englert,]. Chem. SOC.,Dalton Trans., 2077 (1996).Prediction of Crystal Structures. 87. W. R. Busing, WMIN, A Computer Program to Model Molecules and Crystals in Terms of Potential Energy Functions. Report ORNL-5747, 1981. Oak Ridge National Laboratory, Oak Ridge, TN 37831. E-mail: [email protected]. 88. R. J. Woods, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1996, Vol. 9, pp. 129-165. The Application of Molecular Modeling Techniques to the Determination of Oligosaccharide Solution Conformations. 89. D. Feller and E. R. Davidson, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1990, Vol. 1, pp. 1-44. Basis Sets for Ab Initio Molecular Orbital Calculations and Intermolecular Interactions. 90. W. F. van Gunsteren and H. J. C. Berendsen, Angew. Chem., Int. Ed. Engl., 29,992 (1990). Computer Simulation of Molecular Dynamics: Methodology, Applications, and Perspectives in Chemistry. 91. T. P. Lybrand, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1990, Vol. 1, pp. 295-320. Computer Simulation of Biomolecular Systems Using Molecular Dynamics and Free Energy Perturbation Methods. 
92. T. Arikawa, N. Tajima, S. Tsuzuki, K. Tanabe, and T. Hirano,J. Mol. Struct. (THEOCHEM), 339,115 (1995). A Possible Crystal Structure of 1,2-Dimethoxyethane: Prediction Based on a Lattice Variable Molecular Dynamics. 93. R. Hoffmann,]. Chem. Phys., 39, 1397 (1963). An Extended Hiickel Theory. I. Hydrocarbons./. Chem. Phys., 40,2745 (1964).Extended Hiickel Theory. 11. u Orbitals in the Azines. J. Chem. Phys., 40, 2474 (1964). Extended Huckel Theory. 111. Compounds of Boron and Nitrogen. J. Chem. Phys., 40,2480 (1964). Extended Hiickel Theory. IV. Carbonium Ions. 94. R. J. Gdanitz, Chem. Pbys. Lett., 190, 391 (1992). Prediction of Molecular Crystal Structures by Monte Carlo Simulated Annealing Without Reference to Diffraction Data. 95. H. R. Karfunkel and R. J. Gdanitz,]. Comput. Chem., 13,1171 (1992). Ab Initio Prediction of Possible Crystal Structures on the Basis of Molecular Information Only. 96. H. R. Karfunkel, F. J. J. Leusen, and R. J. Gdanitz, ]. Cornput.-Aided Muter. Design, 1, 177 (1993). The Ab Initio Prediction of Yet Unknown Molecular Crystal Structures by Solving the Crystal Packing Problem. 97. N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller, 1. Chem. Phys., 21, 1087 (1953). Equation of State Calculations by Fast Computing Machines. 98. H. R. Karfunkel and F. J. J. Leusen, Speedup, 6, 43 (1992). Practical Aspects of Predicting Possible Crystal Structures on the Basis of Molecular Information Only. 99. F. J. J. Leusen,J. Crystl. Growth, 166, 900 (1996). Ah Initio Prediction of Polymorphs. 100. D. E. Williams, PCK83. QCPE Program No. 548, 1983. Quantum Chemistry Program Exchange, Indiana University, Bloomington, IN 47405. 101. A. L. Spek, Acta Crystallogr., Sect. A, 46, C34 (1990). PLATON, An Integrated Tool for the Analysis of the Results of a Single Crystal Structure Determination. 102. V. L. Karen and A. D. Mighell, NIST*LATTICE, A Program to Analyze Lattice Relationships, Technical Note 1290, 1991. 
National Institute of Standards and Technology, Gaithersburg, MD 20899. E-mail: [email protected].
364 Comtmter Simulation to Predict Possible Crystal Polymorbhs 103. K. Mika, J. Hauck, and U. Funk-Kath, J . Appl. Crystullogr,, 27, 1052 (1994). Space-Group Recognition with the Modified Library Program ACMM. 104. Y. Le Page, D. D. Klug, and J. S. Tse, J . Appl. Crystullogr., 29, 503 (1996). Derivation of Conventional Crystallographic Descriptions of New Phases from Results of Ab-Initio Inorganic Structure Modelling. 105. Y. Le Page,]. Appl. Crystullogr., 20,264 (1987). Computer Derivation of the Symmetry Elements Implied in a Structure Description. 106. Y. Le Page,]. Appl. Crystullogr., 21, 983 (1988).MISSYM 1.1-A Flexible New Release. 107. M. Saunders, K. N. Houk, Y.-D. Wu, W. C. Still, M. Lipton, G. Chang, and W. C. Guida, J . Am. Chem. Soc., 112,1419 (1990). Conformations of Cycloheptadecane. A Comparison of Methods for Conformational Searching. 108. A. R. Leach, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1991, Vol. 2, pp. 1-55. A Survey of Methods for Searching the Conformational Space of Small and Medium-Sized Molecules. 109. N. G. Hunt and F. E. Cohen, ]. Comput. Chem., 17,1857 (1996). Fast Lookup Tables for Interatomic Interactions. 110. P.-E. Werner, L. Eriksson, and M. Westdahl,]. Appl. Crystullogr., 18,367 (1985). TREOR, a Semi-Exhaustive Trial-and-Error Powder Indexing Program for All Symmetries. 111. A. Boultif and D. Louer,]. Appl. Crystullogr., 24, 987 (1991). Indexing of Powder Diffraction Patterns for Low-Symmetry Lattices by the Successive Dichotomy Method. 112. H. M. Rietveld, J. Appl. Crystullogr., 2, 65 (1969). A Profile Refinement Method for Nuclear and Magnetic Structures. 113. M. Orozco and F. J. Luque, J . Comput. Chem., 11, 909 (1990). On the Use of AM1 and MNDO Wave Functions to Compute Accurate Electrostatic Charges. 114. M. J. Hwang, T. P. Stockfisch, and A. T. Hagler, J. Am. Chem. Soc., 116, 2515 (1994). Derivation of Class I1 Force Fields. 2. 
Derivation and Characterization of a Class I1 Force Field, CFF93, for the Alkyl Functional Group and Alkane Molecules. 115. N. Padmaja, S. Ramakumar, and M. A. Viswamitra, Actu Crystullogr., Sect. A , 46, 725 (1990). Space-Group Frequencies of Proteins and of Organic Compounds with More than One Formula Unit in the Asymmetric Unit. 116. W. H. Baur and D. Kassner, Actu Crystullogr., Sect. B, 48,356 (1992).The Perils of Cc: Comparing the Frequencies of Falsely Assigned Space Groups with Their General Population. 117. A. Gavezzotti and G. Filippini,J. Phys. Chem., 98,4831 (1994). Geometry of the Intermolecular X-H..Y (X, Y = N, 0 )Hydrogen Bond and the Calibration of Empirical Hydrogen-Bond Potentials. 118. H. R. Karfunkel, Z. J. Wu, A. Burkhard, G. Rihs, D. Sinnreich, H. M. Buerger, and J. Stanek, Actu Crystullogr., Sect. B, 52, 555 (1996). Crystal Packing Calculations and Rietveld Refinement in Elucidating the Crystal Structure of Two Modifications of 4-Amidinoindanone Guanylhydrazone. 119. T. D. J. Debaerdernaeker, Cryst. Struct. Commun., 1, 39 (1972). 3-Hydroxyestra-1,3,5(10)trien-17-one (Estrone), C,,H,,O,. 120. B. Busetta, C. Courseille, and M. Hospital, Actu Crystullogr., Sect. B, 29,298 (1973). Structures Cristallines et Moliculaires de Trois Formes Polymorphes de I'Oestrone. 121. S. L. Mayo, B. D. Olafson, and W. A. Goddard HI,]. Phys. Chem., 94, 8897 (1990).DREIDING: A Generic Force Field for Molecular Simulations. 122. M. Haisa, S. Kashino, R. Kaiwa, and H. Maeda, Actu Crystullogr., Sect. B, 32,1283 (1976). The Monoclinic Form of p-Hydroxyacetanilide. 123. M. Haisa, S. Kashino, and H. Maeda, Actu Crystullogr., Sect. B, 30,2510 (1974). The Orthorhombic Form of p-Hydroxyacetanilide. 124. J. E. Bertie and R. W. Wilton,J. Chem. Phys., 75, 1639 (1981). Acetic Acid Under Pressure: The Formation Below 0 "C, X-Ray Powder Diffraction Pattern, and Far-Infrared Absorption Spectrum of Phase 11.
References 365 125. T. Shoda, K. Yamahara, K. Okazaki, and D. E. Williams, J. Mol. S t r u t . (THEOCHEM), 333,267 (1995).Molecular Packing Analysis of Benzene Crystals. Part 2. Prediction of Experimental Crystal Structure Polymorphs at Low and High Pressure. 126. C . E. Weir, G. J. Piermarini, and S. Block,]. Chem. Phys., 50,2089 (1969).Crystallography of Some High-pressure Forms of C,H,, CS,, Br,, CCI,, and KNO,. 127. R. Fourme, D. Andrk, and M. Renaud, Actu Crystallogr., Sect. B, 27,1275 (1 971). A Redetermination and Group-Refinement of the Molecular Packing of Benzene I1 at 25 kilobars. 128. S. R. Byrn, P. A. Sutton, B. Tobias, J. Frye, and P. Main,]. Am. Chem. SOC., 110, 1609 (1988). The Crystal Structure, Solid-state NMR Spectra, and Oxygen Reactivity of Five Crystal Forms of Prednisolone tert-Butylacetate. 129. F. J. J. Leusen, unpublished results. 130. K. J. North, in Pigment Handbook, P. A. Lewis, Ed., Wiley, New York, New York, 1988, Vol. 1, Section 1-D-j. 131. C. W. Manger and W. S. Struve, U.S. Patent 2,844,581 (1958);W. S. Struve, U.S. Patent 2,844,485 (1959). 132. G. D. Potts, W. Jones, J. F. Bullock, S. J. Andrews, and S. J. Maginn, 1. Chem. Soc., Chem. Commtln., 2565 (1994).The Crystal Structure of Quinacridone: An Archetypal Pigment. 133. F. J. J. Leusen, M . R. S. Pinches, N. E. Austin, S. J. Maginn, R. Lovell, H. R. Karfunkel, and E. F. Paulus, to be published.
CHAPTER 8
Computational Chemistry in France: A Historical Survey
Jean-Louis Rivail and Bernard Maigret
Laboratoire de Chimie théorique, Unité Mixte de Recherche au Centre National de la Recherche Scientifique (CNRS) No. 7565, Institut Nancéien de Chimie moléculaire, Université Henri Poincaré, Nancy 1, Domaine Universitaire Victor Grignard, B.P. 239, 54506 Vandœuvre-lès-Nancy, France
INTRODUCTION
The development of stored-program digital computers was perceived as a major event by many theoretical chemists all over the world. Therefore, the chronology analyzed in the case of the United States1 may apply to the situation in France, provided some adjustments, mainly with respect to a difference in aggregate computing power, are made to compensate for the size of our country. Computational chemistry started with early attempts to integrate the Schrödinger equation for molecules. This activity is, of course, still flourishing and gives rise to intensive numerical calculations. As time progressed, other applications appeared. These include statistical computations by means of Monte Carlo or molecular dynamics simulations, in which the thought process is different from that in quantum chemistry because a simulation tracks a large number of events designed to mimic what happens at the microscopic level in a macroscopic sample. This kind of approach can be classified as “numerical experiments.” A third field in which computers became an indispensable tool for chemists is that of information processing. This is another important aspect of computers and computational science, which are expressed by the French words ordinateurs and informatique, respectively. This short historical essay starts by providing some details on the beginning of computational quantum chemistry in France after World War II. We then consider the computational aspects of statistical mechanics. The main contributions of French scientists to the development of specialized software are briefly recounted. Processing chemical information is then mentioned, followed by a description of the evolution of computing facilities available to our chemists; the extent of government funding of the field is mentioned as well. The importance of the facts surrounding the roots of computational chemistry in France can be safely evaluated from the perspective of some distance in time. We do not consider ourselves capable of providing an objective opinion on what happened during the past ten years or so. Accordingly, our bibliography is strongly focused on early writings, which may not be known by younger readers. The few references we selected deal with seminal papers which, according to French tradition, were often published in French, either in the Comptes Rendus de l’Académie des Sciences (C.R. Acad. Sci. Paris) or the Journal de Chimie Physique. In general the important work was followed by other papers or reviews published in various specialized journals, and often in English.
EARLY AGE OF THEORETICAL CHEMISTRY
In France, before World War II, the application of wave mechanics to understanding the structure of matter was first a subject for physicists dealing with the electronic structure of atoms. In the 1930s, if one excepts Louis de Broglie, who spent most of his life working on the interpretation of quantum mechanics, the most prominent French scientist in the field of electronic structure of atoms was Léon Brillouin. Interest in molecules, and chemical implications, came later. The first use of the term “theoretical chemistry” can be found in the name of a laboratory called “Centre de Chimie théorique de France,” founded in Paris in 1943 by Raymond Daudel under the patronage of Louis de Broglie and Irène and Frédéric Joliot-Curie. This laboratory obtained official acknowledgment in 1948 when the Centre National de la Recherche Scientifique (CNRS) started to support its activities. Theoretical chemistry was officially recognized as a branch of science in France a half-century ago. In April 1948 an international symposium organized in Paris under the auspices of the CNRS and the Rockefeller Foundation offered French chemists the opportunity to interact with the world leaders in this new science, including C. A. Coulson, J. A. A. Ketelaar, H. C. Longuet-Higgins, and R. S. Mulliken. In addition, the first chair entitled “theoretical chemistry”
was created in October 1948 at the University of Nancy, and its first holder was Jean Barriol, whose early works dealt with molecular quantum mechanics from a rather basic point of view (group theory). Afterward Barriol became involved with the study of electric polarization effects on single molecules and, later, on liquids from both theoretical and experimental points of view. This growing interest in theoretical chemistry became quite visible during the 1950s. Bernard Pullman, who started his career in a CNRS position, was offered a professorship in quantum chemistry by the Sorbonne in 1954. His reputation was already well established on the basis of his early works, mainly devoted to the properties of π-electron systems, and embodied in the book Les théories électroniques de la chimie organique. The book, which was written in collaboration with his wife, Alberte Pullman, and published in 1953, can be considered as another founding event of theoretical chemistry in France. At the same time, several other universities invited a theoretical chemist to join their faculty. Thus André Julg moved to Marseille in 1957, and in 1958 Bordeaux created a chair of theoretical chemistry for Jean Hoarau. Following these institutions were Rennes (for Claude Guérillot) and Pau (for Jean Deschamps). The other major events of this period are the transformation, in 1957, of Daudel’s Centre de Chimie théorique de France into a CNRS research center called Centre de Mécanique Ondulatoire Appliquée (CMOA) and the foundation in 1958 of a laboratory of theoretical biochemistry by Bernard and Alberte Pullman, resulting in the move of the Pullmans to the Institut de Biologie Physicochimique in Paris. The CMOA moved in 1962 into a vast building, north of Paris, in which the CNRS installed a CDC 7600 multipurpose computer, devoted to both atomic and molecular computations. 
During the same period, a new quantum chemistry group was founded at the École Normale Supérieure, under the supervision of Josiane Serre. The situation then remained stable for several years, so that the list of French universities having a group active in the field of theoretical chemistry was still rather limited. This situation can be explained by noting that the various departments of chemistry were composed exclusively of experimentalists, chemists who considered the development of their own disciplines to be more important and did not expect much from theory, which they regarded as some kind of “icing on the cake.” This attitude, which may have been observed in other countries, was particularly strong in France because of old academic traditions dating from the time when the leaders of French chemistry did not accept the atomic theory. As a consequence, activity in theoretical chemistry was above a critical threshold only in Paris, where, thanks to Raymond Daudel and the Pullmans, quite an active intellectual life developed. The vitality of these groups is evident in their scientific production and also in two books that played an important, worldwide role in promoting computational quantum chemistry: Quantum Chemistry, Methods and Applications, published in 1959 by Raymond Daudel, Roland Lefebvre, and Carl Moser,2 and Quantum Biochemistry, published in 1963 by Bernard and Alberte Pullman.3
A community started to develop as a result of the monthly seminars of the CMOA. In addition, the summer schools organized in Menton, in the South of France, soon became an important international meeting place and, from a French point of view, played an important role in helping the small minority of theoretical chemists to take part in the worldwide adventure of their discipline. Fortunately the situation was not as stark as it appears if one looks only at the number of academic positions at that time. The CNRS, which created its own hierarchy for full-time research scientists (equivalent to the hierarchy of university professors), offered such positions to many theoretical chemists, thereby expanding the number of theoretical chemists active in France. Indeed, if one limits the list to our now retired colleagues, one should remember that Alberte Pullman, Gaston Berthier, Odilon Chalvet, Carl Moser, Roland Lefebvre, and Alain Veillard spent their full careers with the CNRS. The main characteristic of French theoretical chemistry of the 1950s, as elsewhere in the world for the most part, is that the computations were performed by hand with a desktop mechanical calculator. The self-consistent field (SCF) computation on the π system of azulene (10 electrons) by André Julg is a typical example of the work done by the pioneers. After having computed the 4500 integrals required for this system, Julg started the SCF iterations, which exhibited a strong divergence that resisted the standard numerical convergence recipes of the time. By looking at the results, he found a very efficient graphical method, which led him to the solution before his competitors, who were using a computer.4 Nevertheless, his own estimate of the human time spent on this problem is more than 4000 hours! During these early years, the French theoretical chemists played an important part in the applications of quantum mechanics to chemistry as described in many papers. 
Among the most original contributions, one may select a few topics such as: the relationship between the electronic structure of aromatic compounds and carcinogenicity;5 the so-called “lodge” theory, which is one of the early attempts to analyze the electronic density of atoms and molecules on the basis of information theory;6 an application of London (gauge-invariant) atomic orbitals (GIAO) to the computation of molecular magnetic susceptibility;7 the first unrestricted Hartree-Fock computations;8 and an improved SCF-LCAO method for π electrons.9
COMPUTATIONAL QUANTUM CHEMISTRY
Modern quantum chemistry is strongly dependent on time-consuming computations, so the introduction of the first computers, at the end of the
1950s, initiated a new era in theoretical chemistry. The first all-electron valence bond calculation was performed on the hydrogen fluoride (HF) molecule in 1953.10 The first all-electron SCF studies using molecular orbitals (MO) represented by linear combinations of atomic orbitals (LCAO), still on diatomics, came later.11,12 Finally the first ab initio computation using a Gaussian-type function basis set on a polyatomic molecule, hydrazine, was published in 1966.13 The general tendency that soon appeared in the French community, as elsewhere,1,14 was for the theoretical chemists to split into two categories: the methodologists, who were trying to improve the accuracy and the efficiency of the methods, and those who were in contact with experiment. The methodologists were mainly concerned with electron correlation. Most of them stem from the Pullmans’ group and later from the École Normale Supérieure, and, in both cases, Gaston Berthier played an important part in the development of the French school. In continuation of the early work of Brillouin,15 multiconfiguration computations became a permanent area of interest,16 leading to the generalized Brillouin theorem.17 These works are at the basis of multiconfiguration self-consistent field (MCSCF) methodologies that are still in use.18 In addition, the problem of electron correlation and the selection of the configurations in configuration interaction (CI) computations received an efficient solution in the form of a method with a long name: configuration interaction by perturbation with multiconfigurational zeroth-order wavefunction selected iteratively (CIPSI).19 The main promoters of this method were Jean-Paul Malrieu and Jean-Pierre Daudey. Malrieu moved from Paris to Toulouse in 1974, where he joined Philippe Durand’s group of theoretical physicists. 
Daudey joined them in 1978, and nowadays the Toulouse group is quite prosperous and active in the fields of electron correlation, pseudopotentials, and effective Hamiltonians. Similarly, Bernard Lévy moved from the École Normale Supérieure to Orsay in 1985, where his group is mainly concerned with the accurate treatment of large (from a quantum chemist’s point of view) systems. The second category of quantum chemists has more members working in various fields. Chemical reactivity is, of course, one of the major subjects for chemists, and several groups soon took a leading position in the field.20 This is particularly true for the group that Lionel Salem founded when he settled in Orsay.21 He introduced an orbital analysis in chemical reactivity studies and initiated reaction dynamics studies. His group has spawned groups in Lyon (Bernard Bigot), Montpellier (Odile Eisenstein and Claude Leforestier), and Paris (Alain Sevin); but the theoretical chemistry laboratory in Orsay, now headed by Xavier Chapuisat, maintains its tradition of excellence. In 1968, Alain Veillard moved to Strasbourg, where he started a laboratory that soon became recognized for the study of transition metal compounds.22 Later, the laboratory, now directed by Elise Kochanski, broadened its interests to include the study of intermolecular interactions.
In Nancy, Jean Barriol retired in 1974. His successor, Jean-Louis Rivail, started the very early quantum chemical studies of solvated species,23 and when Bernard Maigret joined the group in 1991, the field of investigation was widened to include biomolecular systems. Among the early theoretical chemistry groups is the laboratory in Bordeaux, where Jean-Claude Rayez introduced reaction dynamics and kinetics studies in connection with experimentalists working on molecular beams. The group in Pau, under the direction of Alain Dargelos, is concerned with the computation of spectroscopic properties. Daudel’s CMOA disappeared in 1984, but most of his co-workers moved into the Université Pierre et Marie Curie, where Marcel Allavena founded a laboratory called DIM (dynamique des interactions moléculaires). This group is now part of the Paris theoretical chemistry laboratory directed by Alain Sevin. In the field of theoretical biochemistry, the successes of Alberte and Bernard Pullman are now legendary. This group played a truly monumental role in the development of biomolecular computing in France, and the many major scientists trained in their laboratory were the seeds for the extensions of computational chemistry in our country. The tradition of this laboratory is maintained by one of these seeds, namely Richard Lavery, who focused on modeling nucleic acids.24,25 But many other researchers who defended their Ph.D. theses in this laboratory continue to pursue their own research in the basic spirit of the Pullmans. Since about 1960, and especially after the pivotal publication of their book reporting simple Hückel (π-electron) molecular orbital calculations on biomolecules,3 the Pullmans saw their laboratory become one of the most attractive and creative centers for molecular computations. 
Their pioneering work and leadership were acknowledged by an extraordinary number of awards for their major contribution to the development and recognition of what is now called computational chemistry. Because they awakened scientists to the possibility that computations were feasible and informative for drug molecules, Bernard and Alberte Pullman opened the door to the present state of the art in ligand design, quantitative structure-activity relationships (QSAR), and molecular simulations on biomolecules. Mention should also be made of a new field that appeared in the late 1980s, theoretical astrochemistry, under the guidance of Yves Ellinger. The late Pierre Claverie was one of the most influential French researchers in the field of intermolecular interactions. His work dedicated to the foundation of molecular force fields and their links with quantum chemistry was truly visionary.26,27 From about 1970, Claverie pointed out the importance of taking polarization effects into account in molecular mechanics, and he proposed the concept of self-encased levels of computation regarding solvent effects, thereby showing the road to the present development of hybrid quantum mechanics/molecular mechanics (QM/MM) methods.
STATISTICAL MECHANICS
With the seminal work of Jean Yvon28 on the statistical mechanical treatment of liquids, one would have expected an original school in this field. This did not happen, probably because of the difficulties of the subject and the success of quantum mechanics. Hence, for years, the macroscopic properties of matter were far from the concern of theoretical chemists, except in rare cases.29 Following the advent of the method of Monte Carlo sampling, a decisive change occurred in the late 1960s. The work of Loup Verlet30 played an important part, together with the availability of increasingly powerful computational facilities, in the development of molecular dynamics and of simulations of molecular liquids. Verlet’s co-workers in Orsay are still active in this field. Some years later, Savo Bratos founded in Paris a laboratory in which the theoretical treatment of molecular liquids soon reached a high level.31 Nowadays, some important chemical problems are approached by means of computational statistical thermodynamics.32 In the meantime, the barriers between various theoretical fields have tended to vanish, and the interplay between statistical mechanics and quantum chemistry is becoming stronger and stronger, leading to a more comprehensive approach to chemical problems in condensed phases.
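The Monte Carlo sampling mentioned above can be illustrated by a minimal sketch. It is a generic textbook Metropolis scheme, not a reconstruction of any program written by the groups cited here, and every name and parameter in it (the function name, the one-dimensional harmonic potential, the step size, the seed) is an assumption chosen for illustration. For the potential U(x) = 0.5 k x^2 the exact thermal average of x^2 is kT/k, which gives a simple check on the sampled value.

```python
import math
import random


def metropolis_mean_square(n_steps=200_000, kT=1.0, k_spring=1.0,
                           step=2.0, seed=0):
    """Estimate <x^2> for U(x) = 0.5 * k_spring * x**2 by Metropolis sampling.

    Illustrative sketch only: the model system and all parameters are
    hypothetical choices, not taken from the chapter.
    """
    rng = random.Random(seed)      # private generator, reproducible runs
    x = 0.0
    energy = 0.0                   # U(0) = 0
    total = 0.0
    for _ in range(n_steps):
        # Propose a symmetric random displacement.
        trial = x + rng.uniform(-step, step)
        trial_energy = 0.5 * k_spring * trial * trial
        delta = trial_energy - energy
        # Accept downhill moves always, uphill moves with Boltzmann weight.
        if delta <= 0.0 or rng.random() < math.exp(-delta / kT):
            x, energy = trial, trial_energy
        # The current state is counted whether or not the move was accepted.
        total += x * x
    return total / n_steps


if __name__ == "__main__":
    estimate = metropolis_mean_square()
    print(f"sampled <x^2> = {estimate:.3f}, exact kT/k = 1.000")
```

The sampled average approaches kT/k only statistically, so the agreement improves slowly with the number of steps; this is the sense in which such simulations act as “numerical experiments” on a model system.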
SOFTWARE DEVELOPMENT
Among the outstanding contributions to computational quantum chemistry, we have mentioned the CIPSI program,19 which allows efficient post-Hartree-Fock computations. A Hartree-Fock molecular orbital program, ASTERIX, adapted to vector and parallel computers, was developed in Strasbourg.33 Before these codes, which deal with high-level ab initio computations, some successful attempts were made to use semiempirical methods to solve some problems not tractable at a more rigorous level. The best example is probably the perturbation configuration interaction using localized orbitals (PCILO) method,34 which combines the simplicity of the CNDO (and INDO) approximations with a moderate configuration interaction on a basis of localized orbitals. This method gave a fair estimate of conformational energy changes. Thanks to this method, the Pullmans’ group produced a series of pioneering works in the conformational analysis of modest-sized biomolecules of many types. The discovery of the so-called C7 conformation of dipeptides was a result.35 The first system of programs to allow a full analytic computation of energy derivatives and geometry optimization within the framework of any semiempirical MO theory method was written in 1972 under the name of GEOMO.36 It has been further improved by including solvent effect simulations, giving rise to the GEOMOS package.37 On the basis of a sound analysis of intermolecular interactions, performed by means of a quantum perturbational approach, Claverie derived a force field that could suitably represent intermolecular interactions.26 The electrostatic interactions are described by means of a distributed multipole analysis, and induction effects are taken into account. The force field sum of interactions between fragments computed ab initio (SIBFA)27 originated from this study and was subsequently applied successfully to many biophysical problems. Also in the field of biomacromolecules, several powerful methods for conformational analysis and shape descriptors of nucleic acids24 and proteins25 have been recently developed by Lavery, who maintains the tradition of the theoretical biochemistry laboratory in Paris. Several groups have a special interest in chemical information handling, in particular the subjects of chemical structure storage and computer-assisted chemical synthesis. The most remarkable French contribution in this area is the DARC system developed by Jacques-Émile Dubois for molecular encoding and chemical information retrieval.38
COMPUTATIONAL FACILITIES The 1950s were characterized by early uses of computers for solving chemical problems. The first mention of such results can be found in the papers of the Daudel and Pullman39 groups. These first computations were made possible by access granted to those scientists by hardware companies (Bull, IBM). Nevertheless, during the second half of the 1950s, several academic institutions started purchasing their own computers (an IBM 604, soon replaced by a model 650, was installed at the University of Nancy in 1957). These local, multipurpose computing centers developed along with computer technology, but without offering the highest computational power of the moment, as available elsewhere.
At the end of the 1960s, these facilities appeared insufficient to chemists, who then asked for the creation of a specialized center devoted to theoretical chemistry or, at least, to scientific computing. The CNRS director, Pierre Jacquinot, and his successor, Hubert Curien, favored the second solution and entrusted an astrophysicist, Janine Connes, with the creation of a national center for scientific computing. This facility, called Centre Inter-Regional de Calcul Electronique (CIRCE), was founded on January 1, 1969. From its very beginning, the organization hosted another institution named the Centre Européen de Calculs Atomiques et Moléculaires (CECAM), an international project that organized workshops on computational chemistry and physics. In 1993, CIRCE was replaced by another organization still in Orsay, Institut pour le Développement et la Recherche en Informatique Scientifique (IDRIS), and CECAM moved to Lyon.
The first vector computer in France (a Cray 1) was bought by the Atomic Energy Agency (CEA) and opened to academics in 1981. In 1983, vector computing was organized independently from CIRCE, to be shared by several research institutions including the CNRS. This situation changed with the termination of CIRCE and the creation of IDRIS, which now offers vector and parallel computing facilities (three Cray supercomputers: a C98, a C94, and a T3E). In the meantime, most of the local computing centers disappeared and, at the behest of the Ministry of Education, another national center was founded in Montpellier, the Centre National Universitaire Sud de Calcul (CNUSC). The CNUSC is accessible from everywhere in the country by means of the Internet.
INDUSTRY The development of computational chemistry at French chemical and pharmaceutical companies is relatively recent. Only Roussel-Uclaf (Romainville) developed a real computational chemistry group in the 1970s, built around Gilles Moreau and N. Claude Cohen, in close collaboration with the Pullmans’ laboratory. The two Roussel-Uclaf workers developed, respectively, an original method for QSAR (the autocorrelation method) and the SCRIPT program for molecular modeling. Both these tools are still being used and developed inside the current organization of the company (Hoechst Marion Roussel). Rhône-Poulenc Rorer has groups of computational chemists in France, England, and the United States. Like most other pharmaceutical companies, however, Rhône-Poulenc makes use of commercial software as “black boxes,” and thus the pioneering efforts of French software developers are of little influence.
TEACHING COMPUTATIONAL CHEMISTRY Most of the curricula in chemistry or physical chemistry include at least an introduction to scientific computing. Specific applications to chemistry are taught in many universities. Computational chemistry (in French, Chimie informatique, where “informatique” is an adjective) is considered a specialty and appears at the predoctoral level. Since 1987, a national predoctoral program (Diplôme d’Études Approfondies), called Chimie informatique et théorique, has been taught jointly in seven universities: Henri Poincaré (Nancy), Paris Sud (Orsay), Pierre et Marie Curie (Paris VI), Denis Diderot (Paris VII), Rennes I, Louis Pasteur (Strasbourg), and Paul Sabatier (Toulouse). Its aim is to teach, at a research level, basic computational science, quantum chemistry, and molecular modeling; other applications of computers in chemistry are not excluded, however. For example, some students pursue work in diverse fields such as experimental control and data processing in nuclear magnetic resonance spectroscopy. So far, all the students, especially those who completed a thesis in computational chemistry, have been able to find either academic or industrial positions.
GOVERNMENT FUNDING In France a large part of academic research (universities and the CNRS) is supported by the government. The early work in theoretical chemistry would not have been possible without full public funding. We have mentioned, for example, the crucial role played by the CNRS in the development of theoretical chemistry. Similarly, the computing centers, in particular CIRCE, CNUSC, and now IDRIS, are almost entirely funded by national bodies. In addition, some research programs intended for developing special aspects of computing have been launched by the CNRS. This has been the case for a program on chemical modeling, which was a joint CNRS-IBM initiative in 1988. More recently, a project was initiated on computer-aided synthesis, and another project, which is sponsored in part by the French Petroleum Institute (IFP), deals with quantum mechanics applied to heterogeneous catalysis. Finally, specialized programs dealing, for instance, with astrophysics and astrochemistry are also concerned with computing. Among the other governmental research institutions having some activity in the field of computational chemistry, we have mentioned CEA. The National Research Institute for Informatics and Automatics (INRIA) is another example. Both institutions collaborate with the CNRS and the universities.
CONCLUSION In half a century, the impact of computers in chemistry has developed to an extent that was probably difficult to predict when the first attempts were made. This situation is common to other sciences in France and elsewhere. Progress correlates with spectacular increases in computer performance. Obviously, the evolution is not finished, and, no doubt, some studies that are out of reach of today’s computers will be feasible on the hardware of the future. For the time being, the users of existing computers do not share much among themselves except the use of the machines. For the future, one may infer that some of them, in particular those who develop new codes, will benefit even
more than now by exchanging knowledge and solutions, mainly because the machines are becoming more and more complex, and the variety of problems handled by them is rapidly becoming more comprehensive in scope. This evolution is becoming apparent in many places; for instance, probably one of the most successful attempts at making connections between mathematicians, computer scientists, physicists, chemists, geologists, and other scientists is occurring at the Charles Hermite Center in Nancy.40 In this joint project involving local universities, the INRIA, and the CNRS, there is an intense scientific activity dealing with new computational strategies for intensive computation and modeling to run on a 64-processor Origin 2000 Silicon Graphics parallel computer. In the near future, improved software and optimal use of such modern computers will significantly enhance interdisciplinary cooperation.
ACKNOWLEDGMENTS The authors are grateful to their colleagues Gaston Berthier, Janine Connes, Raymond Daudel, André Julg, Jean Hoarau, and Alberte Pullman for providing useful information.
REFERENCES
1. J. D. Bolcer and R. B. Hermann, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1994, Vol. 5, pp. 1-63. The Development of Computational Chemistry in the United States.
2. R. Daudel, R. Lefebvre, and C. Moser, Quantum Chemistry, Methods and Applications, Wiley-Interscience, New York, 1959.
3. B. Pullman and A. Pullman, Quantum Biochemistry, Wiley-Interscience, New York, 1963.
4. A. Julg, C.R. Acad. Sci. Paris, 239, 1498 (1954). Structure électronique de l’azulène: Étude par la méthode du champ self-consistent. A. Julg, J. Chim. Phys., 52, 377 (1955). Étude de l’azulène par la méthode du champ moléculaire self-consistent.
5. A. Pullman, C.R. Acad. Sci. Paris, 221, 140 (1945). Mise en évidence d’une liaison très apte à l’addition (région K) chez certaines molécules cancérigènes.
6. R. Daudel, S. Odiot, and H. Brion, C.R. Acad. Sci. Paris, 238, 458 (1954). La notion de loge et la signification géométrique de la notion de couche dans le cortège électronique des atomes. H. Brion, R. Daudel, and S. Odiot, J. Chim. Phys., 51, 553 (1954). Théorie de la localisabilité des corpuscules. IV. Emploi de la notion de loge dans l’étude des liaisons chimiques.
7. M. Mayot, G. Berthier, and B. Pullman, J. Phys. Radium, 12, 652 (1951). Calcul quantique de l’anisotropie diamagnétique des molécules organiques. I. La méthode. G. Berthier, M. Mayot, and B. Pullman, J. Phys. Radium, 12, 717 (1951). Calcul quantique de l’anisotropie diamagnétique des molécules organiques. II. Principaux groupes d’hydrocarbures aromatiques. J. Hoarau, J. Chim. Phys., 57, 855 (1960). Calcul de l’anisotropie diamagnétique de quelques systèmes graphitiques.
8. G. Berthier, C.R. Acad. Sci. Paris, 238, 91 (1954). Extension de la méthode du champ moléculaire self-consistent à l’étude des états à couches incomplètes. G. Berthier, J. Chim. Phys., 51, 363 (1954). Configurations électroniques incomplètes. Partie I. La méthode du champ moléculaire self-consistent et l’étude des états à couches incomplètes.
9. A. Julg, J. Chim. Phys., 58, 19 (1960). Traitement L.C.A.O. amélioré des molécules conjuguées. I. Théorie générale. Applications aux hydrocarbures.
10. D. Kastler, C.R. Acad. Sci. Paris, 236, 1271 (1953). Théorie quantique de la molécule d’acide fluorhydrique.
11. G. Berthier, Mol. Phys., 2, 225 (1959). A Self-consistent Field for the H, Molecule.
12. H. Brion, C. Moser, and M. Yamazaki, J. Chem. Phys., 30, 673 (1959). Electronic Structure of Nitric Oxide.
13. A. Veillard, Theor. Chim. Acta, 5, 413 (1966). Quantum Mechanical Calculations on Barriers to Internal Rotation. I. Self-consistent Field Wavefunctions and Potential Energy Curves for the Hydrazine Molecule in the Gaussian Approximation.
14. S. J. Smith and B. T. Sutcliffe, in Reviews in Computational Chemistry, K. B. Lipkowitz and D. B. Boyd, Eds., VCH Publishers, New York, 1997, Vol. 10, pp. 271-316. The Development of Computational Chemistry in the United Kingdom. See also, C. A. Coulson, Rev. Mod. Phys., 32, 170 (1960). Present State of Molecular Structure Calculations.
15. L. Brillouin, Actualités Scientifiques et Industrielles, Vol. 159, Hermann, Paris, 1934. Les champs “self-consistents” de Hartree et de Fock.
16. R. Lefebvre, C.R. Acad. Sci. Paris, 237, 1158 (1953). Sur l’application de la méthode d’interaction de configuration aux molécules.
17. B. Lévy and G. Berthier, Int. J. Quantum Chem., 2, 307 (1968). Generalized Brillouin Theorem for the Multiconfigurational SCF Method.
18. B. Lévy, Chem. Phys. Lett., 4, 17 (1969). Multi-configuration Self-consistent Wavefunctions of Formaldehyde. (This is probably the first MCSCF computation on a polyatomic molecule.)
19. B. Huron, J. P. Malrieu, and P. Rancurel, J. Chem. Phys., 58, 5745 (1973). Iterative Perturbation Calculations of Ground and Excited State Energies from Multiconfigurational Zeroth-Order Wavefunctions.
20. R. Daudel and O. Chalvet, Colloques Internationaux du CNRS (CNRS, Paris, 1958), Calcul des Fonctions d’Onde Moléculaires, No. 82, p. 389. Théorie du mécanisme des réactions. III. Sur l’application de la Chimie Quantique à la détermination des mécanismes de réaction.
21. L. Salem, J. Am. Chem. Soc., 90, 543 (1968). Intermolecular Orbital Theory of the Interaction Between Conjugated Systems. I. General Theory. L. Salem, J. Am. Chem. Soc., 90, 553 (1968). Intermolecular Orbital Theory of the Interaction Between Conjugated Systems. II. Thermal and Photochemical Cycloadditions.
22. H. M. Gladney and A. Veillard, Phys. Rev., 180, 385 (1969). A Limited Basis Set Hartree-Fock Theory of NiF6(4-).
23. D. Rinaldi and J.-L. Rivail, Theor. Chim. Acta, 32, 57 (1973). Polarisabilités moléculaires et effet diélectrique de milieu à l’état liquide. Étude théorique de la molécule d’eau et de ses dimères.
24. R. Lavery and H. Sklenar, J. Biomol. Struct. Dyn., 6, 63 (1988). The Definition of Generalized Helicoidal Parameters and of Axis Curvature for Irregular Nucleic Acids. R. Lavery and H. Sklenar, J. Biomol. Struct. Dyn., 6, 655 (1989). Defining the Structure of Irregular Nucleic Acids: Conventions and Principles. R. Lavery, in Unusual DNA Structures, R. D. Wells and S. C. Harvey, Eds., Springer-Verlag, New York, 1988, pp. 189-206. DNA Flexibility Under Control: the JUMNA Algorithm and Its Application to BZ Junctions. R. Lavery, K. Zakrzewska, and H. Sklenar, Comput. Phys. Commun., 91, 135 (1995). JUMNA: Junction Minimisation of Nucleic Acids.
25. H. Sklenar, C. Etchebest, and R. Lavery, Proteins: Struct., Funct., Genet., 6, 46 (1989). Describing Protein Structure: A General Algorithm Yielding Complete Helicoidal Parameters and a Unique Overall Axis.
26. M. J. Huron and P. Claverie, Chem. Phys. Lett., 4, 429 (1969). Practical Improvements for the Calculation of Intermolecular Energies. M. J. Huron and P. Claverie, Chem. Phys. Lett., 9, 194 (1971). Study of Solute-Solvent Interactions.
27. N. Gresh, P. Claverie, and A. Pullman, Int. J. Quantum Chem., 13, 243 (1979). Intermolecular Interactions: Reproduction of the Results of Ab Initio Supermolecule Computations by an Additive Procedure. N. Gresh, P. Claverie, and A. Pullman, Theor. Chim. Acta, 66, 1 (1984). Theoretical Studies of Molecular Conformation. Derivation of an Additive Procedure of the Computation of Intramolecular Interaction Energies. Comparison with Ab Initio SCF Computations. N. Gresh, A. Pullman, and P. Claverie, Theor. Chim. Acta, 67, 11 (1985). Theoretical Studies of Molecular Conformation. II. Application of the SIBFA Procedure to Molecules Containing Carbonyl and Carboxylate Oxygens and Amide Nitrogens.
28. J. Yvon, Actualités Scientifiques et Industrielles, Vol. 203, Hermann, Paris, 1935. La théorie statistique des fluides et l’équation d’état.
29. J. L. Greffe and J. Barriol, C.R. Acad. Sci. Paris, 270C, 253 (1970). Contribution au calcul statistique du facteur g de Kirkwood pour des liquides polaires purs.
30. L. Verlet, Phys. Rev., 159, 98 (1967). Computer “Experiments” on Classical Fluids. I. Thermodynamical Properties of Lennard-Jones Molecules. L. Verlet, Phys. Rev., 165, 201 (1968). Computer “Experiments” on Classical Fluids. II. Equilibrium Correlation Functions.
31. Y. Guissani, B. Guillot, and S. Bratos, J. Chem. Phys., 88, 5850 (1988). The Statistical Mechanics of the Ionic Equilibrium of Water: A Computer Simulation Study.
32. G. Wipff and L. Troxler, in Computational Approaches in Supramolecular Chemistry, G. Wipff, Ed., NATO ASI Series, Kluwer, Amsterdam, 1994, pp. 319-348. MD Simulations on Synthetic Ionophores and Their Cation Complexes: Comparisons of Aqueous/Non-aqueous Solvents.
33. R. Ernenwein, M.-M. Rohmer, and M. Bénard, Comput. Phys. Commun., 58, 305 (1990). A Program System for Ab Initio MO Calculations on Vector and Parallel Processing Machines. I. Evaluation of Integrals. M.-M. Rohmer, J. Demuynck, M. Bénard, R. Wiest, C. Bachmann, C. Henriet, and R. Ernenwein, Comput. Phys. Commun., 60, 127 (1990). A Program System for Ab Initio MO Calculations on Vector and Parallel Processing Machines. II. SCF Closed-Shell and Open-Shell Iterations. R. Wiest, J. Demuynck, M. Bénard, M.-M. Rohmer, and R. Ernenwein, Comput. Phys. Commun., 62, 107 (1991). A Program System for Ab Initio MO Calculations on Vector and Parallel Processing Machines. III. Integral Reordering and Four-Index Transformation.
34. S. Diner, J. P. Malrieu, and P. Claverie, Theor. Chim. Acta, 13, 1 (1969). Localized Bond Orbitals and the Correlation Problem. I. Perturbation Calculation of the Ground-State Energy. J. P. Malrieu, P. Claverie, and S. Diner, Theor. Chim. Acta, 13, 18 (1969). Localized Bond Orbitals and the Correlation Problem. II. Application to π-Electron Systems. S. Diner, J. P. Malrieu, F. Jordan, and M. Gilbert, Theor. Chim. Acta, 15, 100 (1969). Localized Bond Orbitals and the Correlation Problem. III. Energy Up to the Third Order in the Zero-Differential Overlap Approximation. Application to σ-Electron Systems. F. Jordan, M. Gilbert, J. P. Malrieu, and U. Pincelli, Theor. Chim. Acta, 15, 211 (1969). Localized Bond Orbitals and the Correlation Problem. IV. Stability of the Perturbation Energies with Respect to Bond Hybridization and Polarity. P. Claverie, J. P. Daudey, S. Diner, C. L. Giessner-Prettre, M. Gilbert, J. Langlet, J. P. Malrieu, U. Pincelli, and B. Pullman, Quantum Chemistry Program Exchange, Bloomington, IN. QCPE Program No. 220, PCILO: Perturbation Configuration Interaction Using Localized Orbital Method in the CNDO Hypothesis. J. Langlet, J. P. Malrieu, J. Douady, Y. Ellinger, and R. Subra, Quantum Chemistry Program Exchange, Bloomington, IN. QCPE Program No. 327, PCIRAD: The PCILO Method Extended to Localized Open Shell Systems. J. Douady, B. Barone, Y. Ellinger, and R. Subra, Quantum Chemistry Program Exchange, Bloomington, IN. QCPE Program No. 371, PCILINDO: The PCILO Method in the INDO Approximation.
35. B. Pullman and A. Pullman, Adv. Protein Chem., 28, 347 (1974). Molecular Orbital Calculations on the Conformation of Amino Acid Residues of Proteins.
36. D. Rinaldi and J.-L. Rivail, C.R. Acad. Sci. Paris, 274C, 1664 (1972). Recherche rapide de la géométrie d’une molécule à l’aide des méthodes LCAO semiempiriques ne faisant intervenir que des intégrales mono- et bicentriques. D. Rinaldi, Quantum Chemistry Program Exchange, Bloomington, IN. QCPE Program No. 290, GEOMO: A System of Programs for the Quantitative Determination of Molecular Geometries and Molecular Orbitals.
37. D. Rinaldi, P. E. Hoggan, and A. Cartier, Quantum Chemistry Program Exchange, Bloomington, IN. QCPE Program No. 584, GEOMOS: Semiempirical SCF System Dealing with Solvent Effects and Solid Surface Adsorption.
38. J.-E. Dubois, D. Laurent, and H. Viellard, C.R. Acad. Sci. Paris, 263C, 764 (1966). Système de documentation et de recherches de corrélations (DARC). Principes généraux. J.-E. Dubois, D. Laurent, and H. Viellard, C.R. Acad. Sci. Paris, 263C, 1245 (1966). Système DARC. Description structurale et polymatricielle (DSP). Écriture des matrices formelles.
39. M. Mayot, H. Berthod, G. Berthier, and A. Pullman, J. Chim. Phys., 53, 774 (1956). Calcul des intégrales polycentriques relatives à l’étude des structures moléculaires. I. Intégrale tricentrique homonucléaire du type Coulomb-échange.
40. Centre Charles Hermite. http://www.loria.fr/CCW
Author Index Abell, G. C., 238 Abeygunawardana, C., 61 Abraham, F. E., 69 Abrarnowitz, M., 203 Ackers, G. K., 325 Adant, C., 274,278 Agren, H., 277,279 Aiga, F., 277 Akke, M., 61 Alagona, G., 323 Albert, I. D. L., 279 Alder, B. J., 61, 202, 237 Alexandrowicz, Z., 6 2 , 6 4 Al-Laham, M. A., 360 Allan, D. C., 360 Allen, A. T., 135 Allen, F. H., 360 Allen, M. P., 61, 135,202,203, 324 Allinger, N. L., 237,239, 323,359, 363 Allison, S. A., 326 Alonso, J. A., 201 Amadei, A., 325 Ammon, H., 360 Andersen, H. C., 136 Anderson, A. G., 63, 70, 71 Anderson, T. W., 323 Andr6, D., 365 Andres, J. L., 360 Andrews, L. C., 359,362 Andrews, S. J., 359, 365 Apra, E., 360 Aqvist, J., 69, 73, 325 Archontis, G., 71, 72 Arfken, G., 136 Arias, T. A., 360 Arikawa, T., 36 Ashcroft, N. W., 202 Ashwell, G. J., 275 Atkins, P. W., 275 Attard, P., 202 Auffinger, P., 202 Austin, N. E., 365
Avdeev, V. I., 199 AvilCs, F. X., 71 Ayala, P. Y., 360 Bachmann, C., 379 Bachrach, S. M., 360 Bader, J. S., 72,324 Baker, J., 360 Balbes, L. M., 237 Baldridge, K. K., 278,360 Balluffi, R. W., 239 Barabino, G., 201 Baranyai, A., 136 Barber, M., 203 Barford, R. A., 73 Barker, J. A., 66, 69 Barnes, A. J., 134 Barnes, P., 200 Barojas, J., 135 Barone, B., 379 Barriol, J., 379 Bartlett, R. J., 275, 276, 277, 278 Bartolotti, L. J., 201, 237 Basch, H., 205 Bascle, J., 62 Bash, P. A,, 69, 205 Baskes, M. I., 237, 239 Baums, D., 275 Baur, W. H., 364 Bayly, C. I., 326, 361 Baysal, C., 73 Beazley, D. M., 239 Becke, A. D., 278 Becker, J. M., 61 Beers, Y.,200 Belch, A. C., 67 Belhadj, M., 72, 325 Bell, C. D., 68 Bellemans, A., 134 BCnard, M., 379 Benjamin, I., 199,202,205 Ben-Naim, A., 64
381
382 Author Index Bennett, C., 64 Berard, D. R., 200,201,202,204 Berendsen, H. J. C., 66,69,70, 71,134, 136, 199,200,323,324,325,363 Berens, P. H., 136 Bereolos, P., 64 Berg, B. A., 62, 63, 74 Berkowitz, M., 67,201,202 Berkowitz, M. L., 202,203 Bernardo, D. N., 324 Berne, B. J., 67, 134, 200,205 Bernstein, H. J., 359, 362 Bernstein, J., 359 Berthier, G., 377, 378, 380 Berthod, H., 380 Bertie, J. E., 364 Bertolini, D., 200 Bertoni, C. M., 202 Besler, B. H., 361 Beutler, T. C., 67, 70, 71 Beveridge, D. L., 61,63, 67, 69,202 Bhanot, G., 74 Bienstock, R. J., 65 Binder, K., 61,66 Bishop, D. M., 274, 276,278 Biswas, R., 238 Bixon, M., 134 Bjork, A., 136 Black, S., 74 Black, S. N., 359 Blake, J. F., 71 Bleil, R. E., 323 Block, S., 365 Bloor, D., 275, 278 Boatz, J. A., 278, 360 Bocker, J., 200,201 Boczko, E. M., 67 Bogusz, S., 203 Bolcer, J. D., 377 Boles, J. O., 326 Bolger, M. B., 70 Boresch, S., 71 Borges, G. L., 204 Boris, J. P., 136 Born, M., 324 Boudon, S., 70 Boultif, A., 364 Bouzida, D., 67, 204 Bowen, J. P., 237,323,359 Bowles, R. K., 66 Boyd, D. B., v, vi, vii, 63, 64, 134, 135,201, 202,205,236,237,277, 323, 359,360, 363,364,377,378
Boyd, R. J., 324 Boyd, R. W., 275 Brady, J.%64 Bratos, S., 379 Bredas, J. L., 274,276,278 Bremi, T., 67 Breneman, C. M., 361 Brenner, D. W., 237,238, 239 Briels, W. J., 68 Brillouin, L., 378 Brinkley, J. S., 360 Brion, H., 377, 378 Brock, C. P., 359 Brodsky, A. M., 74,204 Brook, R. J., 275 Brooks, B. R., 64,65,203,323,326 Brooks, C. L., 111, 63,67, 68, 69, 70, 135, 136,323 Brown, B. C., 66 Brown, F. K., 69,70 Bruccoleri, R. E., 64,323 Bruggink, A., 361 Bruins Slot, H. J.%361 Briinger, A., 70, 325 Bruni, F., 200 Bruschweiler, R., 61 Buckingham, A. D., 275 Buckley, A., 275 Buckley, A. M., 359 Buckner, J. K., 68, 70 Buerger, H. M., 364 Bullock, J. F., 365 Burgess, A. W., 64 Burkert, U., 239 Burkhard, A., 364 Burland, D. M., 275 Busetta, B., 364 Busing, W. R., 359, 363 Byrn, S. R., 365 Cahn, W., 275 Calhoun, A., 199 Calvin, M. D., 199 Cammi, R., 279 Car, R., 237 Card, D. N., 74 Carlson, H. A., 71, 72, 73,325 Carlsson, A. E., 238 Carnie, S. L., 200 Carter, P., 74 Cartier, A,, 380 Case, D. A., 65, 323 Cassettari, M., 200
Author lndex 383 Catlow, C. R. A., 361 Causa, M., 360 Ceperley, D. M., 202,237 Chadi, D. J., 238 Chagas, C., 69 Chaka, A. M., 360 Challacombe, M., 360 Chalvet, O., 378 Champagne, B., 278 Chan, C. T., 238 Chandler, D., 72, 134, 324 Chandrasekhar, I., 136 Chandrasekhar, J., 68, 199, 324 Chang, G., 364 Chang, I., 62 Chao, K.-C., 64 Cheeseman, J. R., 360 Chen, S., 279 Chen, S.-w., 68 Chen, W., 360 Cheng, L.-T., 276 Chialvo, A. A., 324 Chin, S., 276, 278 Chirlian, L. E., 361 Choi, C., 135 Chu, Z.-T., 63, 70, 73 Ciccotti, G., 134, 135, 203 Cieplak, P., 70, 325,326, 361 Cioslowski, J., 236, 360 Clarke, J. H. R., 203 Claverie, P., 378, 379 Clementi, E., 277 Clough, S. A., 200 Cohen, D., 205 Cohen, F. E., 364 Cohen, H. D., 275 Colton, R. J., 238 Colwell, S. M., 278 Coombes, D. S., 362 Cornell, W. D., 326,361 Cossi, M., 279 Cottier, D., 359 Coulson, C. A., 200, 378 Courseille, C., 364 Covell, D. G., 69, 72 Cox, S . R., 326, 360 Cramer, C. J., 63 Cram&, H., 64 Cross, A. J., 69 Cross, P. C., 136 Cummings, P. T., 324 Cun-xin, W., 70 Cushman, J. H., 202
Cyrot-Lackmann, F., 238 Czerminski, R., 135 Dacol, D., 323 Dagani, R., 275 Daggett, V., 70 Dahlquist, G., 136 Dang, L. X., 67,68,70 Darden, T., 203 Dauber-Osguthorpe, P., 65 Daudel, R., 377, 378 Daudey, J. P., 379 Daura, X., 71 Davey, R. J., 359 Davidson, E. R., 236,277, 363 Davis, M. E., 326 Davis, T. F., 74 Daw, M. S., 237 Dawnkaski, E. J., 238 Day, P. N., 205 De Groot, B. L., 325 de Pablo, J. J., 64 Dean, P. M., 63 DeBaerdemaeker, T. D. J., 364 DeBolt, S . E., 136 Decius, J. C., 136 DeFrees, D. J., 360 Del Buono, G. S., 325 Dempsey, P., 359 Demuynck, J., 379 Dennis, J. E., 362 Denti, T. Z. M., 71,72 DePristo, A. E., 239 Dewar, M. J. S . , 276 di Bella, S., 279 DiCapua, F. M., 61 Dickson, R. M., 278 Diederich, F., 71, 72 Diestier, D. J., 202 Dieter, K. M., 275 Diner, S., 379 Ding, Y., 324 DiNola, A., 66 Dinur, U., 323, 359 Dirk, C. W., 276 Docherty, R., 362 Dorn, R., 275 Douady, J., 379 Dovesi, R., 360 Drabold, D. A., 238 Du, Z., 360 Dubois, J.-E., 380 Dudis, D. S., 277, 279
384 Author lndex Duffy, E. M., 72 Dunitz, J. D., 359 Dunlap, B. J., 278 Dunning, T. H., Jr., 276 Dupuis, M., 276, 277, 278, 360 Durell, S. R., 71 Dykstra, C. E., 275, 276 Dzyabchenko, A. V., 362 Eastwood, J. W., 136 Eaton, D. F., 275 Eck, B., 200 Edberg, R., 136 Edholm, O., 66 Eerden, J. V., 68 Eisenberg, D., 200 Eisenmenger, F., 74 Elber, R., 65, 135 Elbert, S. T., 278, 360 Elert, M. L., 238 Ellinger, Y., 379 Englert, U., 363 Eno, L,, 323 Ercolessi, F., 238 Erickson, B. W., 72 Eriksson, L., 364 Ernenwein, R., 379 Ernst, R. R., 6 7 Erpenbeck, J. J., 62 Escobedo, F. A., 64 Esling, C., 359 Esselnik, K., 66 Essex, J. W., 71, 361 Essmann, U., 203 Etchebest, C., 378 Evans, D. J., 135, 136,204 Evans, W. A. B., 135 Eyring, H., 237 Failor, B. H., 136 Fedders, P. A., 238 Feil, D., 68, 363 Feller, D., 236, 277 Feller, S. E., 203 Ferrante, J., 238 Ferrario, M., 135,203 Ferrenberg, A. M., 74 Fersht, A. R., 325 Fickett, W., 64 Field, M. J., 205 Filippini, G., 359, 364 Fincham, D., 135
Finney, J. L., 200 Finnis, M. W., 237 Fischer, J., 66 Fisher, G. B., 200 Fixman, M., 66, 134 Flannery, B. P., 203, 361 Flemings, M. C., 275 Flurchick, K., 201, 237 Flytzanis, C., 276 Foiles, S. M., 238 Foresman, J. B., 360 Forester, T. R., 136 Forsythe, G. E., 323 Foster, K., 201 Foulkes, W. M. C., 237 Fourme, R., 365 Fox, D. J., 360 Franck, P., 323 Francl, M. M., 361 Franken, P. K., 276 Fraternali, F., 68 Frauenheim, T., 238 Frenkel, D., 66,202,276 Friedman, R. A., 68 Friedman, R. S., 275 Friere, E., 325 Friesner, R. A., 66 Frisch, M. J., 360 Frye, J., 365 Fuhua, H., 70 Funk-Kath, U., 364 Galazka, W., 63 Gans, P. J., 62 Gao, J., 68, 70, 205 Garcia, A., 325 Gardner, A. A., 204 Garel, T., 6 2 Garmer, D., 205 Garrison, B. J., 238 Gasteiger, J., 360 Gavezzotti, A., 359,361, 362,364 Gavotti, C., 201 Gdanitz, R. j., 362, 363 Gear, C. W., 136 Gelin, B. R., 61,134 Genest, M., 65 Cerber, P. R., 70, 325 Gerhardts, R. R., 201 Ghio, C., 323 Gibbs, J. W., 64 Gibson, J. B., 237
Author Index 385 Gibson, K. D., 65, 359,361 Gierasch, L. M., 65 Gies, P., 202 Giessner-Prettre, C. L., 379 Gilbert, M., 379 Gill, P. M. W., 360 Gilson, M. K., 68 Gladney, H. M., 378 Gland, J. L., 200 Glosli, J .N., 204, 205 Go, M., 65 GO, N., 61, 65, 135, 326 Goddard, W. A., 111, 360, 361, 364 Golab, J. T., 205 Goland, A. N., 237 Goldstein, H., 135, 203 Gomperts, R., 360 Gonzalez, C., 360 Goodwin, P. D., 238 Gordon, J. G., 204 Gordon, M. S., 205,278, 360 Gosh, I., 69 Gough, C. A., 135 Grahame, D. C., 199 Gramaccioli, C. M., 359 Grassberger, P., 62 Green, C. D., 200 Green, S. M., 325 Greengard, L. F., 203 Greffe, J. L., 379 Gresh, N., 379 Grigera, J. R., 136, 200 Guirdia, E., 67 Gubbins, K. E., 64, 135 Guest, M., 361 Guida, W. C., 364 Guillot, B., 379 Guissani, Y., 379 Gurskii, Z., 201 Guttman, C. M., 62 Haak, J. R., 69 Hagler, A. T., 61, 62, 65, 323, 359, 364 Hahn, T., 359 Haisa, M., 364 Halley, J. W., 199, 201, 324 Hammersley, J. M., 64 Hammonds, K. D., 135 Handscombe, D. C., 64 Handy, N. C., 277 Hann, R. B., 278 Hansen, J.-P., 61, 66
Hansmann, U. H. E., 63,74 Hansson, T., 73 Hao, M.-H., 63,74 Harkema, S., 68 Harp, G. D., 134 Harris, J., 237 Harrison, J. A., 238, 239 Harrison, N. M., 360 Harvey, S. C., 64,68,323,378 Hawk, J., 364 Hautman, J., 203 Haydock, R., 237 Hayward, S., 325, 326 Head, J. D., 199 Head-Gordon, M., 360 Hegger, R., 62 Heine, V., 238 Heinzinger, K., 199,200,201,202,203 Helfand, E., 135 Henderson, D., 66, 69,201, 204 Hendrickson, T. F., 62 Henriet, C., 379 Hermann, R. B., 377 Hermans, J., 63, 69, 70, 71, 72,200, 324 Hesselink, F. T., 65 Heyes, D. M., 203 Hill, T. L., 63 Hiller, L. A., 62 Hirano, T., 363 Hirara, F., 326 Hirono, S., 70 Hirschfelder, J., 237 Ho, K. M., 238 Hoarau, J., 377 Hockney, R. W., 136 Hodes, R. S., 72 Hoffmann, R., 363 Hofmann, D. W. M., 360 Hogenson, G. J., 74 Hoggan, P. E., 380 Hohenberg, P., 201, 237 Holden, J. R., 360 Holian, B. L., 239 Hooft, R. W. W., 67 Hoover, W. G., 61,66, 136,203 Horsfield, A. I?, 238 Hospital, M., 364 Houk, K. N., 364 Howard, J. N., 204 Hshitsume, N., 64 Huang, K., 63,324 Hummer, G., 73,203
386 Author Index Hunenberger, P. H., 71, 324 Hunt, N. G., 364 Hunter, J. E., 111, 74 Huron, B., 378, 379 Hurst, G. J. B., 277 Huston, S. E., 68, 324 Hwang, J.-K., 69, 72 Hwang, M. J., 364 Ibers, J. A., 359 Ichimura, H., 64 Ichiye, T., 325 Impey, R. W., 72,199,324 Inoue, T., 276 Irikura, K. K., 65 Isaacson, L. M., 62 Ito, T., 204 Itoh, R., 277 Iwata, S., 276 Jackson, J. L., 64 Jacobsen, K. W., 239 Jacobson, J. D., 64 Jacucci, G., 67 Jancso, G., 72 Jang, S., 203 Janssen, L. H. M., 71 Jasien, P. G., 276 Jaszunski, M., 276 Jayaram, B., 202 Jensen, J. H., 205,278, 360 Jensen, L. H., 359 Jerkiewicz, G., 200 Jin, S., 199 Joannopoulos, J. D., 360 Johnson, B., 201 Johnson, B. G., 360 Johnson, K. W., 66 Jolles, G., 63 Jones, W., 365 Jones-Hertzog, D. K., 71 Jonsson, D., 277 Jordan, F., 379 Jrargensen, P., 277, 279 Jorgensen, W. L., 63,68, 69, 70, 71, 72,73, 199,324,325,326 Julg, A., 377, 378 Kahn, M., 64 Kaino, T., 275 Kaiwa, R., 364 Kalos, M. H., 64
Kanis, D. R., 274, 276,279 Karasawa, N., 361 Karen, V. L., 363 Karfunkel, H. R., 362, 363, 364, 365 Karirn, 0. A,, 67 Karna, S. P.,276,277,278 Karplus, M., 61,63,64,65,66,67,68, 70, 71,72,134,205,323,324,325,326 Kashino, S., 364 Kassner, D., 364 Kastler, D., 378 Kato, H., 362 Katz, H., 135 Kay, L. E., 61 Keith, T., 360 Kernball, C., 200 Kendrick, J., 361 Kennard, O., 360 Kersten, P., 275 King, G., 73 Kinoshita, M., 201 Kirkwood, J. G., 61 Kirtman, B., 199,278 Kitaigorodskii, A. I., 359 Kitao, A., 326 Kitchen, D. B., 72, 325 Kitson, D. H., 65 Klein, G. P., 200 Klein, L. S., 64 Klein, M., 203 Klein, M. L., 72, 135, 199,200, 203, 324 Kleinman, D. A., 275 Kloprnan, G., 360 Klug, D. D., 364 Kobayashi, H., 204 Kobayashi, N., 326 Koerber, S. C., 62, 65 Kohler, T., 238 Kohlrneyer, A., 202 Kohn, W., 201,237 Kolinski, A., 63 Kollman, P. A., 61, 63, 67, 68, 69, 70, 135, 136,323,325,326,361 Konishi, Y., 66 Korambath, P. K., 278 Koseki, S., 278, 360 Koster, G. F., 238 Kramer, M., 323 Krauss, M., 205 Kriebel, C., 66 Krogh-Jespersen, K., 324 Kroon, J., 67,361, 362,363
Author Index 387 Krynicki, K., 200 Kubo, R., 64 Kuczwra, K., 70 Kuharski, R. A., 72 Kumar, S., 67, 204 Kurnar, S. K., 72 Kurtz, H. A., 275, 278,279 Kusalik, P. G., 324 Kushick, J. N., 61,66, 324 Kutteh, R., 136, 202 Kuzyk, M. G., 276 Kvick, A., 359 Kwon, I., 238 Ladd, A. J. C., 136 Lahav, M., 359 Lambrakos, S. G., 136 Lanczos, C., 135 Lang, N. D., 201,239 Langlet, J., 379 Langley, D. R., 202 Langridge, R., 69 Larter, R., 64 Laurent, D., 380 Lavery, R., 63,378 Le Page, Y., 359,364 Leach, A. R., 364 Lebowitz, J. L., 200 Lee, A. M., 278 Lee, C. Y., 66,67,204 Lee, F. S., 70 Lee, H., 203 Lee, J., 63, 65 Lefebvre, R., 377, 378 Leiserowitz, L., 359 Lengauer, T., 360 Leslie, M., 361 Leusen, F. J. J., 361,362,363,365 Levesque, D., 66, 135 Levitt, M., 326 LCvy, B., 378 Levy, R. M., 66,72,324,325 Lewis, P. A., 365 Li, L., 72 Li, X. P., 238 Li, Z., 66 LiCata, V. J., 325 Liljefors, T., 360 Lim, D., 71 Linderberg, J., 277 Lindsay, G. A., 275 Linssen, A. B. M., 325
Lipkowitz, K. B., v, vi, vii, 63, 64, 134, 135, 201, 202, 205, 236, 237, 277, 323, 359, 360, 363, 364, 377, 378 Lipton, M., 364 Liu, H., 72 Liu, S.-Y., 275 Lobaugh, J., 200 Lomdahl, P. S., 239 Louer, D., 364 Lovell, R., 365 Luo, Y., 277, 279 Luque, F. J., 364 Luty, B. A., 326 Lybrand, T. P., 63, 69, 134, 202, 323, 363 MacCrackin, F. L., 62 MacKerell, A. D., Jr., 70 Madey, T. E., 200 Madura, J. D., 199, 324 Maeda, H., 364 Maginn, S. J., 359, 365 Mahajan, S., 275 Main, P., 365 Malcolm, M. A., 323 Malik, D. J., 275 Malrieu, J. P., 378, 379 Manger, C. W., 365 Mansfield, M. L., 66 March, N. H., 201 Marchesi, M., 201 Marcus, R. A., 72, 73 Marder, S. R., 274, 275, 276 Marichal, G., 134 Margenau, H., 275 Mark, A. E., 63, 70, 71, 72, 324, 325 Marks, T. J., 274, 276, 279 Maroulis, G., 276 Marrakchi, A., 275 Marrone, T. J., 68 Marshall, A. W., 64 Marsili, M., 360 Martin, R. L., 360 Martyna, G. J., 203, 205 Mascarella, S. W., 237 Matouschek, A., 325 Matsunaga, N., 278, 360 Maye, P. V., 68, 72 Mayo, S. L., 364 Mayot, M., 377, 380 Mazur, J., 62 McCammon, J. A., 61, 63, 64, 67, 68, 69, 71, 72, 134, 135, 204, 323, 324, 326
McGrath, E., 135 McGuire, R. F., 64 McDonald, I. R., 61, 66, 135 McDonald, N. A., 71 McWeeny, R., 276 Medina, C., 73, 325 Mehrotra, P. K., 67 Meirovitch, E., 64, 65 Meirovitch, H., 62, 64, 65, 66, 73 Melroy, O. R., 204 Memon, M. K., 136 Mennucci, B., 279 Menon, M., 238 Meredith, G. R., 275 Mermin, D. N., 202 Mertz, J. E., 135 Merz, K. M., Jr., 70, 361 Methfessel, M., 237 Metropolis, N., 61, 203, 363 Meyer, M., 203 Meyers, F., 276 Mezei, M., 63, 67, 68, 69, 70, 71, 72 Michael, A., 323 Mighell, A. D., 363 Mika, K., 364 Mikkelsen, K. V., 279 Mildvan, A. S., 61 Milgram, M., 237 Miller, J. S., 275 Miller, W. H., 237 Mitchell, M. J., 71 Mitra, S. K., 136 Miyamoto, S., 135 Moler, C. B., 323 Momany, F. A., 64, 326, 360 Mon, K. K., 74 Montgomery, J. A., 278, 360 Montroll, E. W., 200 Mooij, W. T. M., 361, 363 Moran, B., 136 Moriguchi, I., 70 Morley, J. O., 276, 278 Morris, G. P., 136 Moser, C., 377, 378 Mostow, M., 62 Motakabbir, K., 201, 202 Mougenout, P., 278 Mountain, R. D., 204, 324 Mowbray, S. L., 73 Mruzik, M. R., 69 Müller, A., 66 Muller, J., 359
Mulliken, R. S., 360 Murad, S., 135, 204 Murphy, G. M., 275 Nachbar, R. B., Jr., 323, 324 Nagi, G. K., 362 Nagumo, M., 136 Naider, F., 61 Nanayakkara, A., 360 Nazmutdinov, R. R., 199, 200, 204 Ndip, E. M. N., 362 Needs, R. J., 237 Nelson, J. S., 239 Némethy, G., 62, 64 Neuhaus, T., 62 Newton, C. G., 63 Nguyen, K. A., 278, 360 Nguyen, T. B., 72 Nicholas, J. B., 136 Nicholas, J. D., 200 Nishikawa, T., 326 Noguti, T., 326 Noordik, J. H., 359, 361 Norman, P., 277 Norskov, J. K., 239 North, R. J., 365 Northrup, S. H., 67, 326 Nosé, S., 203 Nunes, W., 238 O’Connell, T., 69, 72 Oddershede, J., 277 Odiot, S., 377 Öhrn, N. Y., 277 Oikawa, S., 362 Okamoto, Y., 63, 74 Okazaki, K., 365 Olafson, B. D., 64, 323, 364 Olsen, J., 277 Ooi, T., 65 Oran, E. S., 136 Orban, J., 136 Orland, H., 62 Orlando, R., 360 Ornstein, R. L., 325 Orozco, M., 364 Orr, B. J., 276 Ortiz, J. V., 360 Orville-Thomas, W. J., 134 Osguthorpe, D. J., 65 Ossicini, S., 202 Owicki, J. C., 69
Padmaja, N., 364 Padró, J. A., 67 Palmer, A. G., 61 Panagiotopoulos, A. Z., 72 Pangali, C., 67 Parkinson, W. A., 277 Parrinello, M., 237, 238 Parsonage, N. G., 64 Pastor, R. W., 203 Patey, G. N., 66, 200, 201, 202, 203, 204 Paulsen, M. D., 325 Paulus, E. F., 365 Pavlider, P., 276 Payne, M. C., 238, 360 Payne, P. W., 67 Payne, R. S., 362 Pear, M. R., 67, 134 Pearlman, D. A., 70, 135, 325 Pecina, O., 202, 204 Pedersen, L. G., 203 Pelletier, G. L., 362 Peng, C. Y., 360 Perahia, D., 66, 324, 326 Perdew, J. P., 237 Perera, L., 203 Perez, S., 362 Perkyns, J. S., 71 Perlstein, J., 362 Perrin, E., 278 Persoons, A., 274 Peterson, G. A., 360 Pettersson, I., 360 Pettitt, B. M., 63, 323 Pettitt, M., 71 Philippe, M.-J., 359 Philpott, M. R., 201, 204, 205 Pierce, B. M., 274, 276 Piermarini, G. J., 365 Pincelli, U., 379 Pinches, M. R. S., 365 Plowman, R., 359 Pohorille, A., 204 Polatoglu, H. M., 237 Pontikis, V., 203 Pople, J. A., 360 Popovitz-Biro, R., 359 Porezag, D., 238 Porter, J. D., 204 Postma, J. P. M., 69, 200, 324 Potts, G. D., 365 Pound, G. M., 69 Powles, J. G., 135
Prasad, P. N., 275, 278 Pratt, L. R., 204 Press, W. H., 203, 361 Price, D. L., 199, 201 Price, S. L., 361, 362 Price, S. P., 361 Prigogine, I., 275 Probst, M., 199 Prod’hom, B., 72 Profeta, S., Jr., 323 Pugh, D., 276, 278 Pullman, A., 377, 379, 380 Pullman, B., 69, 377, 379 Purisima, E. O., 66 Purvis, G. D., III, 275 Puska, M. J., 239 Quentrec, B., 135 Querol, E., 71 Quinn, J. E., 200 Quirke, N., 67 Rabin, H., 276 Rabitz, H., 323, 324 Rablen, P. R., 71 Raecker, T. J., 239 Raghavachari, K., 360 Raghavan, K., 201 Rahman, A., 134, 237, 324 Ramakumar, S., 364 Rancurel, P., 378 Rao, M., 67 Rappé, A. K., 360 Ratner, M. A., 274, 276, 279 Ravimohan, C., 69 Ravishanker, G., 202 Read, A. J., 237 Rebertus, D. W., 134 Ree, F. H., 61 Regener, R., 275 Reinhardt, B. A., 275 Reinhardt, W. P., 73, 74, 204 Renaud, M., 365 Replogle, E. S., 360 Resat, H., 68, 70, 71, 72 Rey, R., 67 Reynolds, C. A., 361 Reynolds, J. C. L., 326 Ricci, M. A., 200 Rice, B. M., 361 Rice, J. E., 68, 274, 275, 276, 277, 278 Rice, S. A., 275
Richards, W. G., 361 Richer, J., 204 Rick, S. W., 200 Rietveld, H. M., 364 Rihs, G., 362, 364 Rinaldi, D., 378, 379, 380 Rivail, J.-L., 378, 379 Rivier, J., 62, 65 Rizo, J., 65 Robb, M. A., 360 Roberts, R. J., 362 Roberts, V. A., 65 Robertson, D. H., 238 Robertson, I. J., 238 Robinson, G. W., 324 Rocklin, V., 203 Roetti, C., 360 Rohde, B., 362 Rohmer, M.-M., 379 Rojas, O. L., 66 Rojnuckarin, A., 203 Roothaan, C. C. J., 275 Rose, D. A., 202, 205 Rose, J. H., 238 Rosenberg, J. M., 67 Rosenbluth, A. W., 61, 62, 203, 363 Rosenbluth, M. N., 61, 62, 203, 363 Ross, M., 66 Rossky, P. J., 67, 68, 204, 324 Rothman, L. S., 200 Rout, J. E., 359 Roux, B., 68, 325 Rowe, R. C., 362 Rubin, R. J., 62 Russell, S. T., 325 Rustad, J. R., 324 Ryckaert, J. P., 134, 135, 136, 203 Sabin, J. R., 274 Sadlej, A. J., 277, 278 St-Amant, A., 237 Salahub, D. R., 204, 278 Saleh, B. E. A., 275 Salem, L., 378 Salsburg, Z. W., 64 Salvador, R., 74 Salvetti, G., 200 Samuelsson, J.-E., 73, 325 Sander, C., 326 Sarko, A., 359 Sasagane, K., 277 Saunders, M., 364
Saunders, V. R., 360 Sawyer, D. W., 200 Scaringe, R. P., 362 Schaftenaar, G., 361 Scheek, R. M., 325 Scheffler, M., 360 Scheraga, H. A., 61, 62, 63, 64, 65, 66, 69, 73, 74, 135, 359, 361 Schlegel, H. B., 360 Schmickler, W., 199, 201, 202, 204 Schmidt, M. U., 363 Schmidt, M. W., 278, 360 Schnabel, R. B., 362 Schneider, S. E., 72 Schoeffel, K., 361 Schoen, M., 202 Schon, J. C., 74 Schreiber, D. E., 69 Schreiber, H., 325 Schutt, C., 323, 324 Schwalm, M., 201 Schweighofer, K. J., 202 Scott, H. L., 66 Scrivastava, D., 238 Seifert, G., 238 Sekino, H., 276, 277, 278 Sellers, H., 199, 205 Selmi, M., 71 Serrano, L., 325 Severance, D. L., 71 Sham, L. J., 201, 237 Sham, Y. Y., 73 Shankar, S., 69 Sharon, R., 61 Shelley, J. C., 200, 203, 204 Shelton, D. P., 274, 275 Shen, J., 324 Shenderova, O., 237, 239 Sherwood, P., 361 Shing, K. S., 64 Shoda, T., 362, 365 Shortle, D., 325 Showalter, K., 64 Siepmann, J. I., 200, 201 Silva, S. J., 199 Sim, F., 276, 278 Simonson, T., 70, 325, 326 Sinclair, J. E., 237 Singer, K., 61, 66 Singer, K. D., 275 Singh, S., 324 Singh, U. C., 69, 135, 323, 326
Sinnott, S. B., 238, 239 Sinnreich, D., 364 Sippl, M. J., 64 Sklenar, H., 378 Skolnick, J., 63 Slater, J. C., 238 Smit, B., 66, 202 Smith, D. E., 67 Smith, E. R., 200 Smith, F. R., 325 Smith, J. A., 72 Smith, J. R., 238 Smith, P. E., 70, 71, 72 Smith, S. F., 68 Smith, S. J., 378 Smith, W., 136 Sneddon, S. F., 68 Sohn, J. E., 274, 275 Sommer, M. S., 70 Somorjai, R. L., 135 Soper, A. K., 200 Sorensen, L. B., 204 Sorescu, D. C., 361 Soriaga, M. P., 200 Soukoulis, C. M., 238 Spackman, M. A., 277 Speedy, R. J., 66 Spek, A. L., 363 Spohr, E., 199, 200, 201, 202, 203, 204 Sprik, M., 72, 200, 201, 324 Squire, D. R., 61 Stanek, J., 364 Stanley, D. R., 359 Stanton, J. F., 237, 277 States, D. J., 64, 323 Stefanov, B. B., 360 Stegun, I. A., 203 Steinhauser, O., 325 Steppe, K., 362 Stern, P. S., 61, 326 Stevens, W. J., 205 Stewart, J. J. P., 275, 276, 360 Still, W. C., 364 Stillinger, F. H., 65, 134, 200, 237 Stivers, J. T., 61 Stockfisch, T. P., 364 Stout, G. H., 359 Straatsma, T. P., 63, 135, 136, 200, 202, 323 Struve, W. S., 365 Stuart, S. J., 200 Stucky, G. D., 274, 275 Stumm, P., 238
Stumpf, R., 360 Su, S., 278, 360 Subbaswamy, K. R., 238 Subra, R., 379 Sudhakar, P. V., 199 Sukarai, T., 363 Susnow, R., 323, 324 Sussman, F., 69 Sutcliffe, B. T., 378 Sutton, A. P., 238, 239 Sutton, P. A., 365 Svishchev, I. M., 324 Swaminathan, S., 64, 69, 323 Swendsen, R. H., 67, 74, 204 Swope, W. C., 136 Szabo, A., 66, 73 Szewczuk, Z., 66 Szleifer, I., 72 Tackx, P., 274 Tajima, T., 363 Talbot, J., 64 Tanabe, K., 363 Tanaka, T., 363 Tang, C. L., 276 Taylor, A., 359 Taylor, R. S., 238 Teich, M. C., 275 Teller, A. H., 61, 203, 363 Teller, E., 61, 203, 363 Tembe, B. L., 69 Teramae, S., 363 Tersoff, J., 237, 238 Tessier, C., 360 Teter, M. P., 360 Teukolsky, S. A., 203, 361 Thacher, T., 323 Thakkar, A. J., 276 Thiel, P. A., 200 Thirumalai, D., 204 Thompson, D. L., 361 Tidor, B., 65, 70 Tildesley, D. J., 61, 135, 202, 203, 324 Tirado-Rives, J., 63, 70, 326 Tobias, B., 365 Tobias, D. J., 67, 68, 135, 136 Tomaru, S., 275 Tomasi, J., 279 Tomovick, R., 323 Toney, M. F., 204 Topley, B., 237 Topper, R. Q., 63
Torda, A. E., 323 Torrie, G. M., 61, 200, 204 Tossatti, E., 238 Tóth, G., 199, 202, 204 Trasatti, S., 199 Tropsha, A., 69, 71, 72 Troxler, L., 72, 379 Trucks, G. W., 360 Truhlar, D. G., 63 Tse, J. S., 364 Tsuda, M., 362 Tsuda, Y., 66 Tsuzuki, S., 363 Tuckerman, M. E., 203, 205 Uhlmann, S., 238 Uosaka, K., 200 Urabe, T., 362 Ursenbach, C. P., 199 Usui, T., 64 Vaday, S., 362 Vahtras, O., 277 Vaidehi, N., 205 Valleau, J. P., 61, 66, 74, 204 van Aalten, D. M. F., 325 van de Waal, B. W., 362 van der Haest, A. D., 361 van der Spoel, D., 325 van Eijck, B. P., 67, 361, 362, 363 van Gunsteren, W. F., 63, 67, 68, 70, 71, 72, 134, 200, 323, 324, 325, 363 van Helden, S. P., 71 van Lenthe, J., 361 Van Nuland, N. A. J., 325 van Schaik, R. C., 70 Van Zandt, L. L., 136 Vanderbilt, D., 238 Vásquez, M., 62, 65, 67, 73 Veillard, A., 378 Veillard, H., 380 Velikson, B., 62 Verlet, L., 61, 66, 136, 379 Verwer, P., 361 Vesely, F. J., 136 Vetterling, W. T., 203, 361 Vineyard, G. H., 237 Viswamitra, M. A., 364 Vogel, H. J., 325 Voter, A. F., 67 Voth, G. A., 199, 200, 203 Vukobratovic, M., 323
Wade, R. C., 326 Wainwright, T. E., 61 Wall, F. T., 62 Wallqvist, A., 69, 71, 72 Walter, R., 135 Wang, C. Z., 238 Wang, J., 66, 69, 71 Wang, L., 72 Wang, Y., 71 Ward, J. F., 276 Warshel, A., 63, 68, 69, 70, 72, 73, 205, 325 Watanabe, M., 73, 74, 200, 204 Webb, S. P., 205 Weber, T. A., 65, 134, 237 Weich, F., 238 Weiner, J. H., 134 Weiner, P., 63, 323 Weiner, S. J., 323 Weir, C. E., 365 Weissbuch, I., 359 Wells, R. D., 378 Werner, P.-E., 364 Wesolowski, T. A., 205 Westdahl, M., 364 Wheeler, D. J., 62 White, C. T., 238 Whitlock, P. A., 64 Whitman, C. P., 61 Whitten, J. L., 199 Wiberg, K. B., 361 Widder, D. V., 136 Widom, B., 64 Wieckowski, A., 200 Wiesler, D. G., 204 Wiest, R., 379 Wilkinson, A. J., 63 Willetts, A., 275, 278 Williams, D. E., 326, 359, 360, 361, 362, 363, 365 Williams, D. J., 275 Willock, D. J., 361 Wilson, E. B., Jr., 136 Wilson, K. R., 136 Wilson, M. A., 204 Wilton, R. W., 364 Windus, T. L., 278, 360 Windwer, S., 62 Winkelmann, J., 66 Wipff, G., 69, 72, 379 Witschel, W., 202 Wolff, J., 65
Wolynes, P. G., 324 Wong, C. F., 69, 323, 324, 325, 326 Wong, M. W., 360 Wood, R. H., 71 Wood, W. W., 64 Woods, R. J., 363 Woon, D. E., 276 Wright, A. F., 239 Wu, Y.-D., 364 Wu, Z. J., 364 Wynberg, H., 361 Xia, X., 202 Xu, C. H., 238 Xu, Y., 72 Yamahara, K., 365 Yamaotsu, N., 70 Yamato, T., 326 Yamazaki, M., 378 Yan, Y., 72 Yang, D., 61 Yao, S., 324 Yarwood, J., 134 Ye, X., 201 Yeates, A. T., 277, 278, 279 Yee, D., 204 Young, M. A., 202 Youngs, W., 360
Yu, H.-a., 325 Yu, J., 278 Yu, W., 324 Yue, S.-Y., 66 Yun, R. H., 70, 71 Yun-yu, S., 70 Yvon, J., 379 Zacharias, M., 135 Zakharov, I. I., 199 Zakrzewska, K., 378 Zakrzewski, V. G., 360 Zangwill, A., 201 Zaniewski, R., 360 Zerner, M. C., 274, 278 Zhang, H., 323 Zhang, J., 324 Zheng, C., 324 Zhidomirow, G. M., 199 Zhu, J., 324 Zhu, S.-b., 323, 324 Zhu, S.-B., 201 Zinn, A. S., 204 Zou, S. J., 239 Zugenmaier, P., 359 Zunger, A., 237 Zuo, L., 359 Zwanzig, R. W., 61, 325 Zyss, J., 275
Reviews in Computational Chemistry, Volume 12 Edited by Kenny B. Lipkowitz, Donald B. Boyd Copyright © 1998 by Wiley-VCH, Inc.
Subject Index Computer programs are denoted in boldface; databases and journals are in italics. Ab initio calculations, 148, 235, 344 Absolute entropy functional, 50 Absolute free energy of binding, 39 Acetamide, 37 Acetaminophen, 352, 356, 357 Acetic acid, 335, 340, 348, 349, 352 Acetylene, 271 ACMM, 344 Adaptive umbrella sampling, 28 Adiabatic switching, 58 Aggregates, 341 Alanine, 32 Alanine dipeptide, 29, 37 Alcohols, 335, 348 Alkanes, 75 Alloxan, 340 Alpha-helix, 30 Aluminum surface energy, 216 AMBER, 284 AMBER force field, 315 Amidinoindanone guanylhydrazone, 353 Analytical method of constraint dynamics, 80, 84, 89, 98, 101 Analytical potential energy function characteristics, 211 Angle-bend constraints, 82, 118, 121, 123, 130, 133 Angular distributions, 182 Anharmonic effects, 290, 313 Antamanide, 29 Antibody, 39 Argon, 209 Aromatic hydrocarbons, 342 ASTERIX, 373 Asymmetric unit, 331, 348 Atomic multipoles, 335 Atomic orbitals (AO), 213, 217, 220, 260 Atomic charges, 190, 292, 293, 301, 319, 334, 335, 347, 352, 357, 358 Atomistic simulations, 207 Avian pancreatic polypeptide (APP), 296, 321
Azurin, 36 Basis set dependence, 267 Basis sets, 210, 216, 265 3-21G, 342 6-31G, 265 6-31G*, 272, 347 6-31G**, 347, 357 6-31G + PD, 265, 267, 272 6-31G(+sd+sp), 265 6-311++G**, 266, 267 aug-cc-pVDZ, 267, 272 aug-cc-pVTZ, 267, 272 aug-cc-pVXZ, 266 cc-pVDZ, 266, 267 cc-pVTZ, 266, 267 cc-pVXZ, 266 d-aug-cc-pVDZ, 267, 272 d-aug-cc-pVTZ, 267, 272 double-zeta, 266 ELP, 265, 267 POL, 266 POL+, 266, 267, 272 Sadlej, 267, 272 Spackman, 267 STO-3G, 357 t-aug-cc-pVDZ, 267, 272 t-aug-cc-pVTZ, 272 triple-zeta, 265, 266 x-aug-cc-pVXZ, 266 Benzamidine, 32, 40 Benzene, 132, 331, 340, 342, 352, 357 Beta-sheets, 30 Bethe lattice, 229 Binding energy, 144, 217, 233 Bioactive molecules, 282 Biomolecular simulation, 281 Biomolecules, 313 Biphenyl, 254, 255 Boltzmann probability, 1, 4, 7, 12, 15, 20, 41, 43, 50, 56
Bond angle bending, 125, 284, 333 Bond angle constraint, 125 Bond distance, 211, 217, 230 Bond order, 228 Bond stretching, 118, 125, 284, 333 Bond stretching constraints, 77, 80, 83, 91, 92, 106, 115, 116, 118, 129, 130 Boundary conditions, 29, 176 Bovine pancreatic trypsin inhibitor (BPTI), 21, 28, 295, 296 Brillouin theorem, 371 Brownian dynamics, 165, 297, 314 Bulk adsorption energy, 190 Bulk materials, 242 Bulk modulus, 217, 219, 235 Bulk properties, 210 Bulk susceptibilities, 243 Bulk water, 141, 190, 195 Bulk water-semiconductor interface, 144 Butadiene, 273 Butane, 23, 79, 91 C,H,, 265 Caloric integration, 24 Cambridge Structural Database (CSD), 333, 340, 345, 347, 348, 353 Cancellation of errors, 318 Canonical (NVT) ensemble, 158, 163, 166 Carbon dioxide, 342 Carcinogenicity, 370 Car-Parrinello molecular dynamics, 144, 210 Cartesian coordinates, 77, 78, 84, 110, 155 CASTEP, 333, 344 Cell parameters, 330, 338, 347 Centre de Mécanique Ondulatoire Appliquée (CMOA), 369 Centre Européen de Calculs Atomiques et Moléculaires (CECAM), 374 Centre National de la Recherche Scientifique (CNRS), 368 CFF force field, 347, 357 Chain attrition problem, 44 Charge groups, 336 Charge models, 334 Charge sensitivities, 292, 295, 311 Charge transfer, 258 Charged metal surfaces, 195, 197 Charges, 190, 292, 293, 301, 319, 334, 335, 347, 352, 357, 358 CHARMM, 17, 284 CHELP, 335 CHELPG, 335 Chemical and Engineering News, v, ix, xi
Chemical dynamics, 209 Chemical information retrieval, 374 Chemical literature, v Chemical potential, 175 Chemical reactor, 282 Chemical stability, 327 Chemisorption, 139 Chiral compounds, 348, 352 Chromophores, 244, 309 Chymotrypsin, 32 CIPSI, 373 Close contacts, 329 Close-packed metals, 233 Clustering, 338 Coherent anti-Stokes Raman spectroscopy, 244 Cohesive energy, 217, 227, 228, 235 Collective motions, 313 Complementary error function, 157 Computational chemistry, v, 241, 367 Computational Chemistry List (CCL), vi Computer packages, 255 Computer simulation, 1, 137, 327 Computers, 374, 377 Condensed phase systems, 273 Configuration, 18 Configuration interaction (CI), 213, 257 Configuration space, 1, 12, 144, 173 Conformation, 18, 335 Conformational analysis, 347 Conformational search, 58 Conjugate momenta, 309 Constrained coordinates, 98, 103 Constrained degrees of freedom, 90 Constraint correction, 83, 99, 132 Constraint dynamics, 78, 111, 116 Constraint equations, 84 Constraint forces, 103 Constraints, 161, 165, 319, 350 Construction probability, 41, 45 Conventional cell, 331 Converged properties, 265 Convergence, 38, 47, 336 Convergence criteria, 256 Convergence of sensitivity coefficients, 294 Convergence rate, 123 Conversion factors, 251 Cooperative effects, 283, 291, 303 Cooperativity, 304, 306 Correlation energies, 213 Corrugated metal surfaces, 145, 146, 164, 176, 177, 180, 193, 194 Coulombic interactions, 155, 212
Coulomb’s law, 146, 292, 310, 334 Coupled cluster methods, 248 Coupled-cluster equations-of-motion method, 264 Coupled holonomic constraints, 80, 89 Covariance matrices, 289, 312, 313 18-Crown-6, 307 CRYSCA, 345 Crystal density, 331 Crystal packing, 336, 337 Crystal polymorphs, 327 Crystal structure prediction, 328, 339, 347, 353 Crystal surfaces, 332 Crystal95, 333, 344 Crystalline solids, 216 Crystallization, 327, 332, 339 Crystals, 251, 273, 330 Cutoff radius, 29, 154, 300, 336, 342, 343, 346 Cytochrome P-450, 311 d orbitals, 226 D’Alembert’s principle, 95 DARC system, 374 Database of physical properties, 208 de Broglie wavelength, 167 Decaglycine, 23, 48, 52 Defects, 138, 210, 218, 223, 225, 232, 234 Degenerate four-wave mixing, 246 Degrees of freedom, 76, 333 Deletions from an ensemble, 173 Density functional theory (DFT) calculations, 149, 150, 208, 209, 212, 214, 215, 219, 220, 232, 333 Density matrices, 261 Density of states, 222, 223 Density profiles, 166 Desktop mechanical calculator, 370 Detailed balance condition, 16, 56, 169, 173, 174 Diamond, 211, 219, 221, 230, 231 Diatomics, 216, 217 Dibenzoylmethane, 329 1,2-Dichloroethane, 29 DICVOL91, 347 Dielectric constant, 337 Diffuse functions, 265, 268 Diffusion coefficient, 187, 189 Diffusional-influenced reaction rates, 314 Digital computers, 367 Dimethoxyethane, 342 Dipeptides, 311
Dipole moment, 141, 149, 191, 243, 250, 254, 256, 257, 266, 292, 294, 336 Dipole moment matrix, 261 Direct methods for computing entropy, 19, 49 Displacement correlation function, 288 Distributed multipoles, 344 Distribution function, 301 DMAREL, 344 DNA, 21, 281 Docking, 334 Dodecane, 30 Domain of applicability, 123 Double excitation operators, 265 DREIDING force field, 355, 357, 358 Drug design, 209 ECEPP, 17, 21, 48, 52 EEM method, 90 Effective medium theory, 208, 226, 231 Effective pair potential, 228 Einstein harmonic oscillator formula, 21 Elastic constants, 211, 236 Electric field, 195, 196, 242, 247, 254, 256, 259, 301 Electrical response property calculations, 266 Electrode, 191 Electron correlation, 213, 254, 272, 371 Electron density, 179, 212, 214, 232, 334 Electron dispersion, 334 Electron gas, 148, 150, 213, 215, 232 Electronic energy, 214 Electrooptic modulation, 244 Electrostatic interactions, 321, 333, 336, 342, 344, 346 Electrostatic potentials, 156, 192, 319 Electrostatically derived charges, 335, 347, 352, 357, 358 Embedded-atom method, 226, 231, 233, 234 Empirical bond order model, 208, 226 Empirical potential energy functions, 17 Endothiapepsin, 310 Energies, 211 Energy fluctuation, 34 Energy minimization, 333, 337 Energy surface, 337 Enrichment method, 48 Ensemble averages, 5, 12 Ensembles, 159 Entropy, 1, 2, 5, 9, 19, 20, 22, 23, 24, 36, 41, 55, 58, 300, 303, 306, 331 Entropy functional, 50, 52 Enzyme-ligand binding, 32 Equations of motion, 82, 85, 89, 160, 162, 164
Equilibration, 165 Equivalent alternative constraints, 109, 133 Ergodic process, 15 Error function, 157 Error propagation, 314 Errors, 89, 100, 101, 130, 132, 216 Essential dynamics approach, 312 Estrone, 350, 352, 355 Ethane, 31, 37 Ethylene, 273 Euler angles, 78, 151, 170, 173 Euler equations of rotational motion, 78 Ewald summation, 156, 336, 342, 344, 346 Exchange-correlation functional, 213, 214, 215 Exchange-correlation potential, 214 Excitation energies, 257, 265 Excited state, 247 Excluded volume, 42 Extended Hückel theory (EHT), 218, 219, 343 Extensive variables, 6 False energy minima, 346 Fermi energy, 222 FHI96MD, 333, 344 Finite difference methods, 160 Finite field method, 252, 254 Finnis-Sinclair potential, 208, 220, 226, 229, 235 Fitting databases, 235 Flexcryst, 340, 345 Flexibility in crystal structures, 211, 333 Fluctuations, 6, 12, 23, 166 Fluids, 17, 138 p-Fluorobenzamidine, 32 Force fields, 2, 17, 22, 36, 37, 210, 281, 292, 315, 319, 333, 337, 347, 350 Forces of constraint, 89, 102 Fractional coordinates, 330 France, 367 Free energy, 1, 20, 32, 36, 41, 59, 295 Free energy difference calculations, 318 Free energy of binding, 32, 39, 40 Free energy of solvation, 34, 41 Free energy perturbation (FEP), 2, 39, 40 Frequency doubling, 244 Frequency upconversion lasing, 246 Frequency-dependent properties, 256, 257, 264 Friction term, 165 Friedel oscillations, 178 Fullerenes, 219
Galactose receptor, 40 GAMESS, 266, 271, 335 Gasteiger atomic charges, 341 Gaussian, 335 Gauss’s principle of least constraint, 77, 95 GEOMO, 374 Ghost molecule, 34 Gibbs free energy, 6, 41, 331, 332 Global minimum, 338 Glucocorticoid, 358 Glue model, 226 Glycine, 32 Glycine dipeptide, 29, 295, 296 Glycol, 29 GMP, 36 Gonadotropin-releasing hormone (GnRH), 52 Grain boundary, 211 Gramicidin A, 30 Grand canonical ensemble, 158 Grand canonical MC, 29, 173 Graphene sheets, 231 Green’s function, 286, 287, 289, 290, 312, 313, 314, 322 GROMOS, 284, 295 GROMOS force field, 284, 315 Ground state energy, 216 Hamiltonian, 10, 11, 22, 26, 30, 33, 38, 58, 77, 140, 147, 156, 212, 219, 221, 248, 254, 257, 264, 284, 309 Hamilton’s principle, 95 Hardware companies, 374 Harmonic approximation, 20 Harmonic entropy, 20 Harmonic force constant, 300 Harris functional, 208, 215, 216, 220 Hartree potential, 212 Hartree-Fock calculations, 252, 255, 333, 347 Hartree-Fock equation, 213, 259 HCN, 266, 267, 268, 270 Helix bundle, 30 Helix-coil transition, 21, 38, 56 Hellmann-Feynman theorem, 248 Helmholtz free energy, 2, 5, 286, 311 Hessian, 20, 287, 289, 313 Hexatriene, 273 High pressure phases, 331 High pressure polymorph, 357 Histogram method, 59 Holonomic constraints, 75, 77, 82, 83, 106, 128, 133
Hückel calculations, 372 Hund’s rule, 222 Hydration free energies, 311 Hydrazine, 371 Hydrocarbon interface, 191 Hydrogen bonds, 139, 166, 176, 186, 195, 300, 333, 340, 351, 357 Hydrogen fluoride, 371 Hydrophilic residue, 308 Hydrophobic contacts, 303, 308 Hydrophobic effects, 311, 321 Hyperpolarizability, 243, 247, 248, 249, 250, 252, 257, 258, 259, 264, 270, 271, 273 Hypothetical scanning method, 17, 49 Ice lattices, 195 ICES, 341, 342, 345 Ideal chain, 42 Image potentials, 146, 144, 148, 197 Imaging enhancements, 246 Importance sampling, 13, 23, 26, 41, 47, 54, 167 Initial conditions, 159 Insertions in an ensemble, 173 Insulin, 21 Integration algorithm, 82, 84, 100, 132 Interatomic forces, 207 Interfaces, 137, 144, 207 Internal coordinate constraints, 75, 82, 110, 111, 115, 130 Internal coordinates, 77 Ionic solutions, 198 IR spectroscopy, 347 Ising model, 3, 6, 7, 24, 53, 54, 59 Jellium model, 140, 143, 148, 152, 178, 208, 232, 234, 235 Jellium potential, 150 Job cuts in industry, viii Job opportunities for computational chemists, v, ix Kerr effect, 249, 251 Kinetic energy operator, 212 Kirkwood equation, 10 Kirkwood factor, 294 Kleinman symmetry, 249 Kohn-Sham density functional theory, 149 Kohn-Sham orbitals, 214, 220 Lagrangian dynamics, 77, 78
Lagrangian multipliers, 78, 81, 82, 85, 89, 98, 102, 113, 161 Lasers, 244, 246 Lattice chain models, 56 Lattice constant, 217, 219, 235 Lattice coordinates, 155 Lattice models, 42, 58, 226, 228 Lattices, 195, 335, 338 Lattice spin models, 59 Lattice symmetry, 330, 339, 341 Lattice vectors, 330 Law of cosines, 124 Layering, 196 Leapfrog Verlet algorithm, 132 Leibniz rule, 86, 87 Lennard-Jones constants, 10, 17 Lennard-Jones energy, 298 Lennard-Jones fluids, 25, 31, 173 Lennard-Jones potential, 4, 144, 154, 192, 210 Leu-enkephalin, 21, 23, 52, 56 Ligand, 39, 334 Ligand binding, 30 Linear buildup procedures, 41 Linear constraints, 77 Linear response approximation (LRA), 39 Linear response theory, 310 Linear scaling algorithms, 218 Linearization, 104, 108 Liouville’s theorem, 58 Lipids, 281 Liquid crystals, 251 Liquids, 273 Local density approximation (LDA), 150, 215, 219, 232 Local electronic bond energy, 227 Local energy minima, 337, 350 Local states (LS) method, 3, 17, 25, 51, 52 Localized microstates, 18 Lodge theory, 370 Low frequency modes, 76 Lysozyme, 39 M site, 142 Mach-Zehnder interferometer, 245 Magnetic susceptibility, 370 Many-body analytic potential energy function, 210 Materials simulation, 207, 210 Matrix method, 83, 94, 103, 105, 111, 116, 118, 120
Matrix of constraint displacements, 113 Maxwell-Boltzmann distribution, 159 Maxwell’s equations, 242 MDCP, 341, 342, 345 Mean field approximation, 149 Melting points, 236 Mercury surfaces, 139, 176, 177, 181, 182, 186 Mercury-mercury potentials, 145 Mercury-water interface, 179, 185, 192 Metal clusters, 138, 139, 144, 197 Metal surfaces, 137, 138, 140, 147, 148, 152, 186, 196, 231 Metals, 230, 231, 233 Metal-water interfaces, 137, 143, 153, 175, 193, 194 Metastable structures, 350 Met-enkephalin, 21, 56, 58 Methanol, 31, 311, 317 Method of strides, 47 Method of undetermined multipliers, 102 Method of undetermined parameters, 81, 95, 101, 111, 126 Methyl chloride, 29 9-Methyladenine, 30 4-Methylpyridine, 329 1-Methylthymine, 30 Metropolis algorithm, 343 Metropolis Monte Carlo, 1, 13, 15, 52, 55, 56, 166, 175 Microcanonical ensemble, 158 Minimal basis sets, 272 Minimization, 350 Minimizers, 337 Minimum free energy principle, 6, 7, 45, 54 MISSYM, 345 MM2, 341 MNDO, 254, 255 Mobility, 187 Modeling, 140 MOLDEN, 335 Molecular conformations, 345, 347 Molecular design, 307 Molecular dynamics (MD), 1, 13, 15, 17, 41, 52, 75, 133, 140, 152, 159, 186, 196, 292, 298, 313, 322, 342, 346, 350, 373 Molecular dynamics trajectories, 289 Molecular electrostatic potential (MEP), 334 Molecular flexibility, 329, 344 Molecular mechanics (MM), 286, 333 Molecular orbital energies, 221
Molecular orbital (MO) basis, 262 Molecular packing analysis, 329 Molecular properties, 247 Molecular recognition, 306 Molecular vibrations, 110 Møller-Plesset (MP) perturbation theory, 264 MOLPAK, 341, 342, 345, 346 Moments, 222, 223 Moments theorem, 224 Monolayers, 144 Monosaccharides, 329, 342, 353 Monte Carlo (MC) calculations, 1, 13, 15, 25, 31, 44, 55, 56, 140, 152, 166, 168, 174, 187, 196, 292, 341, 343, 349, 373 MOPAC, 252, 335 Morphology, 327 Morse function, 235 MPA, 341, 342, 345, 346 MSHAKE, 132 MSI PP, 345 Mulliken charges, 334 Multicanonical algorithm, 3, 16, 56 Multicanonical probability, 56 Multiphoton pumping mechanisms, 246 Multipole expansion, 342 Multistage sampling method, 59 Myoglobin, 32 Naphthalene, 132 Native structure, 35 Newtonian mechanics, 78 NIST*LATTICE, 344 p-Nitroaniline, 271 NMR, 347, 353 Nonbonded cutoffs, 297, 303 Nonbonded interactions, 210 Nonbonded parameters, 307 Nonchiral compounds, 348 Nonholonomic constraints, 95 Nonlinear effects, 310 Nonlinear constraints, 107 Nonlinear optical (NLO) properties, 236, 241, 252, 256, 263 Nonlinear scaling relationships, 312 Nonlinearity, 99 Nonphysical transformations, 31, 32, 35 Non-self-consistent treatments, 149 Nonvariational methods, 248 Normal mode analysis, 20, 290, 313 Nosé-Hoover thermostats, 163 Numerical breakdown, 255
Numerical drift, 81 Numerical experiments, 367 Numerical integration, 89, 90, 160 NVT ensemble, 163 One-electron approximation, 212 OPLS force field, 315 Optical bistability, 245 Optical data storage, 246 Optical Kerr effect, 244 Optical rectification, 244 Optical signal processing, 245 Optical storage devices, 244 Order parameters, 182, 194 OREMWA method, 337 Orientational potentials, 144 Packing energy, 331 Pair-additive interactions, 210 Paraffins, 333 Parameter optimization, 319 Partial atomic charges, 190, 292, 301 Partially constrained coordinates, 97, 101 Partially rigid models, 79 Particle mesh Ewald, 158 Partition function, 4, 9, 16, 18, 48, 59, 167, 169 PCILO, 373 PCKS, 329 PCK83, 344 PDM93, 335 Peptide growth simulation, 38 Peptides, 56 Periodic boundary conditions, 154, 155 Periodic conditions, 335 Periodic interactions, 156 Perturbation potential, 287 Perturbation series expansion, 247 Perturbation theory, 248, 256, 264, 309 Pharmaceutical industry, x, 327, 375 Pharmaceutical Research and Manufacturers of America (PhRMA), viii Phase conjugation, 246 Phase space, 5, 58, 77, 165, 309 Phase space volume, 58 Phonons, 145 Physical transformation, 40 Physisorption, 137, 139 Physisorption of water, 140, 143, 182 Pigment red, 353 pKa calculations, 31, 39, 309 Platinum-water potential, 146, 185
PLATON, 344 Pockels effect, 244 Point mutation, 307 Poisson-Boltzmann method, 29 Poisson’s equation, 152 Polarizability, 243, 247, 248, 252, 258, 266, 292 Polarizable water model, 318 Polarization, 147, 149, 191, 242, 292 Polarization propagators, 263 Polarized continuum model (PCM), 273 Polyacetylene, 273 Polyalanine, 38 Polyenes, 265, 271, 272 Polymer films, 273 Polymers, 2, 42 Polymorph prediction process, 351 Polymorph Predictor, 338, 343, 345, 346, 355 Polymorphs, 327, 350 Polypeptides, 30, 36 Polysaccharide structures, 329 Positional fluctuations, 313 Potential drop at interfaces, 180, 190, 192 Potential energy function, 36, 139, 142, 146, 207, 208, 211, 283, 329, 333 Potential energy function refinement, 318 Potential energy surface (PES), 2, 19, 209 Potential of mean force (PMF), 25, 26, 58, 60, 76, 198, 297 Powder diffraction, 328, 332, 338, 347, 350, 352, 353, 354 Predictor-corrector SHAKE algorithm, 132 Prednisolone t-butylacetate, 352, 358 Primitive cell, 331, 352 Principal component analysis (PCA), 283, 290, 312, 316, 317, 320 Probability density, 1 PROMET3, 340, 345, 346 Propagators, 263 Protein engineering, 307 Protein environment, 35 Protein folding, 18, 30, 42, 56, 58, 303, 307, 308 Proteins, 2, 17, 22, 37, 39, 56, 75, 281 Pseudoenergy, 264 Pseudopotential, 210 Pyrimidine, 342 Quadrupole moments, 191 Quantum chemistry, 248, 370 Quantum mechanical bonding, 208
Quantum mechanical calculations, 144, 212, 265 Quantum mechanical entropy, 21 Quantum mechanics (QM), 333, 368 Quantum Monte Carlo calculations, 150 Quasi-harmonic approximation, 22, 312, 313 Quaternions, 78, 126, 173 Quinacridone, 352, 358 Racemates, 348 Radial distribution function, 26, 141, 143, 166, 180, 185, 300 Radius of gyration, 30 Random coil, 51 Random number generator, 174 Random phase approximation (RPA) methods, 261, 263 Random walk, 2, 42 Ras protein, 52 RATTLE, 82, 83, 128, 129, 132, 133 Reaction coordinates, 27, 30, 76 Reaction field model, 273 Receptor, 334 Reciprocal space, 336 Reduced cell, 331, 338, 344 Redundancy of constraints, 79, 109 Reference state, 39 Refractive index, 243, 245, 246 Research and development (R&D) expenditures, viii Response functions, 264 Restraint potentials, 36, 38 Rietveld method, 347 Rigid body minimization, 337, 342 Rigid body translation, 288 Rigid bodies, 333, 343, 353 Rigid models, 78 Rigid water model, 116, 126 RNase T, 36 Rotamers, 352 Rotation matrix, 170 Runge-Kutta integration algorithm, 91 Sampling theory, 12 Scanning method, 3, 7, 44, 46, 51 Scanning probe microscope (SPM), 138, 197 Scanning transition probabilities, 49 Schrödinger equation, 212 Scoring functions, 334, 340 SCRIPT, 375 Second harmonic generation (SHG), 243, 244, 245, 251
Second-moment approximation, 224, 225 Self-avoiding walks (SAWs), 42, 50 Self-consistent field (SCF), 260 Self-intersecting walks, 42, 50 Semiempirical molecular orbital approximations, 218, 271 Sensitivity analysis, 281, 290, 311 Sensitivity coefficients, 283, 284, 285, 307, 319 Sensitivity matrix, 287, 291, 316 Serine dipeptide, 315 SETTLE method, 82, 83, 132 SHAKE method, 82, 83, 106, 108, 110, 111, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 128, 129, 132, 133 Shear constants, 233 SIBFA force field, 374 Silver-water interfaces, 185 Simple fluids, 18 Simple sampling, 14, 42, 44 Simulated annealing, 337 Simulation box, 155 Singular value decomposition (SVD), 283, 290, 316 Slater determinant, 213 Slow growth thermodynamic integration, 29, 34, 37 Smooth surfaces, 176 Smooth truncation, 156 SN2 reaction, 29 Sodium chloride (NaCl), 29 Software, xiii, 225, 373 Solutes, 311 Solvation free energy, 31 Solvent effects, 289, 314 Solvent-accessible surface area (SASA), 40 Solvent-free polymorphs, 355 Solvochromatic method, 258 Somatostatin, 23 Space group constraints, 346 Space group symmetry, 331, 340, 342, 344, 348, 350 SPC water model, 141, 163, 292, 298, 299 SPC/E water model, 123, 124, 126, 141, 163 Specific heat, 6, 9, 11, 24 Spectral density, 189 Square lattice, 42, 48, 53, 225, 303, 308 State function, 32 Statistical mechanics, 4, 209, 373 Step-by-step buildup, 3 Sterically accessible regions, 192
Stochastic models method, 7, 53 Stochastic process, 15 Störmer algorithm, 101 Structural diversity, 339 Structural properties, 180 Structural response, 313, 314 Structure-binding relations, 209 Sum-over-states (SOS) methods, 252, 256, 263 Supercell, 350 Supercomputers, 375 Superlattice, 350 Surface area, 40, 311 Surface corrugation, 144, 197 Surface effects, 340 Surface polarization, 139 Surface properties, 210, 223 Susceptibilities, 242, 248 Symmetry, 331, 340, 342, 344, 348, 350 Systematic search, 346 Target state, 39 Tautomeric forms, 355 Taylor series, 95, 100, 104, 107, 114, 243, 247 Teaching computational chemistry, 375 Temperature constraint, 95 Theoretical biochemistry, 369, 372 Theoretical chemistry, 368 Thermal expansion, 236 Thermodynamic cycle, 33, 34 Thermodynamic integration, 2, 9, 11, 24, 31, 33, 36, 37, 76 Thermodynamic perturbation, 36, 76 Thermodynamic properties, 286 Thermodynamics, 331 Third harmonic generation, 244 Three-dimensional grid, 340 Threonine, 335 Threonine dipeptide, 315 Tight binding method, 208, 218 Time scales, 75, 175, 176, 186 Time steps, 90, 133, 161, 162, 198 Time-dependent Hartree-Fock method, 258 Time-dependent response functions, 263 Time-dependent Schrödinger equation, 259 Tinfoil boundary conditions, 337 TIP3P, 292 TIP4P, 141, 163 TIP-4FP, 141, 163 Torsional constraints, 120, 130, 131 Torsional interactions, 333
Trajectory, 101, 209 Transferability, 211 Transformation paths, 37 Transition moments, 257, 258 Transition probabilities, 2, 41, 49, 51, 56 TREOR90, 347 Trial crystal packings, 342 Trial structures, 340, 345 Triangulation procedure, 82, 109, 123 Tribochemistry, 231 Triphenylphosphine, 329 Triphenylverdazyl, 329 Trypsin, 32, 40 Tryptophan, 309 Two-dimensional bias function, 29 Two-electron integral calculations, 255 Two-photon upconverted emission, 247 Umbrella sampling, 14, 16, 23, 24, 25, 27, 30 Unconstrained coordinates, 102, 103 Undetermined parameters, 81, 82, 98, 100, 111 Unit cells, 329, 330, 331, 344 United atom model, 92, 342 Unrestricted Hartree-Fock (UHF) computations, 370 UPACK, 341, 342, 345, 346 Valence bond calculation, 371 Valence electrons, 148, 197 van der Waals energy, 336 van der Waals interactions, 333, 346 van der Waals surface, 319 Variational principle, 213, 233, 259 Velocity autocorrelation function, 188 Velocity Verlet integration algorithm, 83, 126 Verlet integration algorithm, 83, 101, 102, 111, 126, 127, 160, 164 Vibrational calculations, 273 Vibrational dynamics, 110, 113 Vibrational frequency, 217 Vibrational modes, 141 Visualization, 176 Volume, 40 Water, 17, 27, 29, 31, 80, 109, 139, 141, 142, 144, 148, 159, 160, 161, 162, 166, 173, 177, 181, 183, 184, 190, 193, 293, 298, 301, 318 Water density profile, 194 Water models, 292
Water-metal potentials, 144, 148 Water-water interactions, 37, 194 Wave equation, 242 Wavefunction, 212, 213 Weak coupling, 76 Wide microstates, 18 Wigner-Seitz radius, 148 Wilson G matrix, 113 Wilson vectors, 110, 112, 118, 119 Windows, 24
WMIN, 329, 342 Work function, 149 World Wide Web, v, xiii Xenon, 32 X-ray powder diffraction, 328, 332, 338, 347, 350, 352, 353, 354 X-ray scattering, 194 Zwanzig equation, 10