THE PHYSICAL CHEMIST’S TOOLBOX
Robert M. Metzger
Copyright © 2012 by John Wiley & Sons, Inc. All rights reserved
Published by John Wiley & Sons, Inc., Hoboken, New Jersey
Published simultaneously in Canada
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.
Library of Congress Cataloging-in-Publication Data
Metzger, R. M. (Robert M.), 1940–
The physical chemist’s toolbox / by Robert M. Metzger.
pages cm
Includes index.
ISBN 978-0-470-88925-1 (hardback)
1. Chemistry, Physical and theoretical. I. Title.
QD453.3.M48 2012
541–dc23 2011045248
Printed in the United States of America
10 9 8 7 6 5 4 3 2 1
Contents
Foreword
Chapter 1 | Introduction: A Physical Chemist’s Toolbox
Chapter 2 | Particles, Forces, and Mathematical Methods
Chapter 3 | Quantum Mechanics
Chapter 4 | Thermodynamics
Chapter 5 | Statistical Mechanics
Chapter 6 | Kinetics, Equilibria, and Electrochemistry
Chapter 7 | Symmetry
Chapter 8 | Solid-State Physics
Chapter 9 | Electrical Circuits, Amplifiers, and Computers
Chapter 10 | Sources, Sensors, and Detection Methods
Chapter 11 | Instruments
Chapter 12 | From Crystals to Molecules
Appendix
Index
Foreword
In the first few years of the 21st Century it has become clear that an intrinsically cross-disciplinary perspective, in which expertise across the whole spectrum from condensed-matter physics through chemistry and materials science as well as molecular biology, will be necessary for successful research in almost any of these overlapping fields. However as more-and-more knowledge is accumulated and the complexity of that knowledge increases, researchers must bend over backward to avoid the ever-present tendency to overspecialize. This textbook aims to counter the pressure to over-specialize to which many doctoral programs are subject by offering a convergent ensemble perspective on the common elements that have developed. The coverage is broad, including relevant concepts and equations from classical mechanics, electricity and magnetism, optics, special relativity, quantum mechanics, thermodynamics, statistical mechanics, kinetics, electrochemistry, crystallography, solid-state physics and electronics. The key formulas are presented succinctly, but with derivations and interspersed problems, so the student can easily assimilate and understand the intimate interconnections. Metzger has added two very useful chapters focused on instrumentation in order to introduce the readers quickly to a wide range of major research experimental approaches. He has included a detailed assessment of the types and value of the experimental information obtainable. Finally a bonus chapter has been included consisting of recent special topics. Interspersed throughout the text are occasional historical “sidelines”, some amusing but always interesting and informative. These inserts, by occasionally breaking up the rhythm of general approach which is towards deeper understanding, help the learning process immeasurably by making it clear that the scientific advance is first-and-foremost a human endeavor driven by curiosity. Overall, the book is a tour de force. 
The book offers a challenging and refreshing approach and it deserves to become a much used and dog-eared basic text, in fact a key reference on every book shelf, rather than a door-stop. It should become a basic text for the next generation of graduate students (post-graduate in UK parlance). Metzger’s aim seems to be to lead students towards a much more broadly based outlook than is the case at present, and it is to be hoped that professors will be inspired to lead them to this outlook. This is a revolutionary book, which does not aim to replace the more detailed traditional specialist courses in, say, quantum mechanics, statistical mechanics or solid-state physics. It just reveals a refreshing unity to the whole range of subjects in a very profound yet compact way.
SIR HAROLD WALTER KROTO, FRS
FLORIDA STATE UNIVERSITY
CHAPTER
1
Introduction: A Physical Chemist’s Toolbox
“Indocti discant, amentque meminisse periti.” Charles Jean Hénault (1685–1770) in Nouvel Abrégé Chronologique de l’Histoire de France jusqu’à la Mort de Louis XIV
Jack Sherman: “Dr. Pauling, how does one get good ideas?”
Linus Pauling: “Well, I guess one must have many ideas, and throw away the bad ones.” Linus Carl Pauling (1901–1994)
“Never give in. Never give in. Never, never, never, never, never give in.” Sir Winston Churchill (1874–1965) at Harrow School, 29 October 1941
The Physical Chemist’s Toolbox, Robert M. Metzger. © 2012 John Wiley & Sons, Inc. Published 2012 by John Wiley & Sons, Inc.
This compendium, “vade mecum,” or toolbox is an abbreviated introduction to, or review of, theory and experiment in physics and chemistry. The term “vade mecum” or “go with me” was the first tentative title for this book; it was associated with the learned and boring Baedeker® guidebooks for travel in the early 1900s: These Baedekers have been replaced with heavily illustrated and less boring Dorling–Kindersley® guides. Most students in 2011 who know some Latin would ask: “vade mecum?” Go with me? Where? Why? The intended audience for this toolbox is the beginning researcher, who often has difficulty in reconciling recent or past classroom knowledge in the undergraduate or first-year graduate curriculum with the topics and research problems current in research laboratories in the twenty-first century. While several excellent and specialized monographs exist for all the topics
discussed in this book, to my knowledge there is no single compact book that adequately covers the disparate techniques needed for scientific advances in the twenty-first century. In particular, there is a need to find “What will this or that technique do for my research problem?” The aim of this toolbox is thus fourfold:
1. Summarize the theory common to chemistry and physics (Chapters 2–6).
2. Introduce topics and techniques that lead to instrumentation (Chapters 7–9).
3. Discuss the advanced instrumentation available in research (Chapters 10 and 11).
4. Travel a path from crystals to nanoparticles to single molecules (Chapter 12).
The book is interspersed with problems to do: some trivial, some difficult. This expedient can keep the volume more compact, and it becomes a useful pedagogical tool. This book tries to be a mathematically deep, yet brief and useful compendium of several topics, which can and should be covered by more specialized books, courses, and review articles. Throughout, the aim is to bring the novice up to speed.
The teaching of chemistry leaned rather heavily toward mathematical and physical rigor in the 1960s, but this fervor was lost, as chemical, physical, and biochemical complexity eluded simple mathematical precision. Alas, chemical and biological phenomena are usually determined by small but significant differences between two very large quantities, whose accurate calculation is often difficult! Lamentably, the recent educational trend has been to train what could be called one-dimensional scientists, very good in one subfield but blissfully unaware of the rest. It is sad that we no longer produce those broadly trained scientists of past generations, who were willing to delve into new problems far from their original interest: I am thinking of Hans Bethe, Peter Debye, Enrico Fermi, Linus Pauling, or Edward Teller.
This toolbox tries to adhere to this older and broader tradition, redress the temporary malady, and help restore the universality of scientific inquiry.
To the instructor: This toolbox could form the basis of a one-year graduate course in physical chemistry and/or analytical chemistry, perhaps team-taught; it should be taught with mandatory problem sets (students will connect the dots by doing the suggested problems) and with recourse to traditional texts that cover, for example, quantum mechanics or statistical mechanics in much greater detail. I am reminded of the very successful one-year team-taught courses such as “Western Civilization” at Stanford University in the 1960s! I have taught the toolbox several times at the University of Alabama as a one-semester course, but found the pace exhausting. To the many students who took my course: Thanks for being so patient.
To Chemistry and Physics departments: The toolbox could become a valuable resource for all entering graduate students, so maybe students, even in areas far from physical chemistry, should be encouraged to buy it and work at it on their own.
To the student: (1) Do the problems; (2) read around in specialized reference texts that may be suggested either in this toolbox or by your instructor(s); (3) discover whether the toolbox could be developed in new directions. To myself: To adapt Tom Lehrer’s (1928– ) famous quip, I am embarrassed to realize that at my present age Mozart had been dead for 36 years. Alan MacDiarmid (1927–2007) once said “Chemistry is about people”: In this spirit, full names and birth and death dates are given to all the scientists quoted in this book; such brief historical data may help illuminate how and when science was done. I have resisted mentioning who was a Nobel prize winner: too many to list, and some worthy scientists—for example, Mendeleyeff, Eyring, Edison, Slater, and Tesla—were not honored. I owe a deep debt of gratitude to many people who have educated me over several decades, as live teachers and silent authors. In particular, I am indebted to Professor Willard Frank Libby (1908–1980), who taught us undergraduates at UCLA to love current research problems and led us into quite a few wild-goose chases; Professor Harden Marsden McConnell (1927– ), who led us at Caltech and Stanford by example to see what are the interesting problems and what are “trivial” problems; Professor Linus Carl Pauling (1901–1994), who taught me electrical and magnetic susceptibilities with his incomparable photographic recall of data and dates, and with his insight and humanity about current events; Dr. Richard Edward Marsh (1922– ) and Professor Paul Gravis Simpson (1937–1978), who taught me crystallography; Professor Michel Boudart (1925– ), who introduced me to heterogeneous catalysis; Mr. William D. 
Good (1937–1978), who taught me combustion calorimetry; Professor Sukant Kishore Tripathy (1952–2000), who introduced me to Langmuir–Blodgett films; and finally, Professor Richard Phillips Feynman (1918–1983), who taught me about the Schwarzschild singularity and event horizons and who was a source of deep inspiration, pleasant conversations, and mischievous fun. Thanks are also due to two persons who helped me greatly in my academic career and taught me a thing or two about what good science really means: Professor Andrew Peter Stefani (1927– ) of the University of Mississippi and Professor Michael Patrick Cava (1926–2010) of the University of Alabama. Professor Carolyn J. Cassady (University of Alabama) kindly allowed me to use an experiment she had devised for students of mass spectrometry. The following books have inspired me: (1) Principles of Modern Physics by Robert B. Leighton, (2) Theoretical Physics by Georg Joos, (3) The Feynman Lectures on Physics by Richard P. Feynman, and (4) Principles of Instrumental Analysis by Doug Skoog, James Holler, and Stanley Crouch. In this twenty-first century, much help was obtained online from Wikipedia, but “caveat emptor”! Writing is teaching but also learning; Marcus Porcius Cato (234–149 BC), who was echoing Solon (630–560 BC), said “I dare to say again: ‘senesco discens plurima.’” Thanks are due to several friends and colleagues, who corrected errors and oversights in the early drafts: Professor Massimo Carbucicchio (University of Parma, Italy), Professor Michael Bowman, Dan Goebbert, Shanlin Pan, and Richard Tipping (University of Alabama), Professor Harris J. Silverstone (Johns Hopkins University), Professor Zoltan G. Soos (Princeton University),
Dr. Ralph H. Young (Eastman Kodak Co.), and Adam Csoeke-Peck (Brentwood, California). The errors that remain are all mine; errare humanum est, sed perseverare diabolicum [Lucius Annaeus Seneca (ca. 4 BC–AD 65)]. To the reader who finds errors, my apologies: I will try to correct the errors for the next edition; echoing what Akira Kurosawa (1910–1998) said in 1989, when he received an honorary Oscar for lifetime achievements in cinematography: “So sorry, [I] hope to do better next time.”
CHAPTER
2
Particles, Forces, and Mathematical Methods
“Viribus Unitis” [with united forces] Emperor Franz Josef the First (and the Last) (1830–1916)
“It is difficult to make predictions, especially about the future.” Yogi Berra (1925– )
This chapter summarizes the fundamental forces in nature, reviews some mathematical methods, and discusses electricity, magnetism, special relativity, optics, and statistics.
Sideline. The name “physics” derives from the Greek word φύσις (= nature, essence): Early physicists like Newton were called natural philosophers. The word “chemistry,” through its Arabic precursor alchimya, derives from the Greek word χημι (= black earth), a tribute to the Egyptians’ embalming arts. Mathematics comes from the Greek μάθημα (= learning, study). Algebra comes from the Arabic “al-jabr” (= transposition [to the other side of an equation]). Calculus (as in “infinitesimal calculus”) is the Latin word for a small pebble.
2.1 FUNDAMENTAL FORCES, ELEMENTARY PARTICLES, NUCLEI AND ATOMS
The four fundamental forces, their governing equations, the mediating particles, their relative magnitudes, and their ranges are listed in Table 2.1.
Table 2.1 The Fundamental Forces
Force | Law | Equation | Mediating Particle | Relative Magnitude | Range
Gravitation | Newton’s law | $\mathbf{F}_{12} = -G m_1 M_2 \mathbf{r}_{12}/r_{12}^{3}$ | Graviton (?) | $10^{-39}$ | Infinite
Electricity | Coulomb’s law | $\mathbf{F}_{12} = q_1 q_2 \mathbf{r}_{12}/4\pi\varepsilon_0 r_{12}^{3}$ | Photon | $10^{-2}$ | Infinite
Weak nuclear | — | — | Vector boson | $10^{-5}$ | $10^{-18}$ m
Strong nuclear | Inter-quark | — | Gluonᵃ | 1 | $10^{-15}$ m
Strong nuclear | Inter-nucleon | — | Pion (gluon?)ᵃ | 1 | $10^{-15}$ m
Source: Adapted from Serway [1].
ᵃ For nucleon–nucleon strong interactions within nuclei, pions (= two-quark particles; see below) may be the mediating particles: Gluons are probably not involved directly, since the nucleons have no “color charge.” The inter-nucleon potential goes to zero beyond 1.7 fm = $1.7 \times 10^{-15}$ m.
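The relative magnitudes in Table 2.1 can be made concrete with a rough numerical comparison of the first two forces. The sketch below evaluates Newton's and Coulomb's laws for two protons; the proton mass and the CODATA value of the vacuum permittivity are outside values assumed for this illustration, and the function names are hypothetical. Because both forces fall off as the inverse square of the distance, their ratio does not depend on the separation.

```python
import math

# Constants in SI units. G and e are the values quoted in this chapter's table
# of fundamental constants; EPS0 and M_PROTON are CODATA 2010 values assumed
# for this sketch.
G = 6.67384e-11              # gravitational constant, m^3 kg^-1 s^-2
E_CHARGE = 1.602176565e-19   # elementary charge, C
EPS0 = 8.854187817e-12       # permittivity of vacuum, F m^-1 (assumed)
M_PROTON = 1.672621777e-27   # proton rest mass, kg (assumed, not from the text)


def gravitational_force(m1, m2, r):
    """Magnitude of Newton's force, G*m1*m2/r^2, in newtons."""
    return G * m1 * m2 / r**2


def coulomb_force(q1, q2, r):
    """Magnitude of Coulomb's force, |q1*q2|/(4*pi*eps0*r^2), in newtons."""
    return abs(q1 * q2) / (4.0 * math.pi * EPS0 * r**2)


def gravity_to_coulomb_ratio(r=1.0e-15):
    """Ratio of the two force magnitudes for a pair of protons a distance r apart."""
    return gravitational_force(M_PROTON, M_PROTON, r) / coulomb_force(E_CHARGE, E_CHARGE, r)


if __name__ == "__main__":
    ratio = gravity_to_coulomb_ratio()
    print(f"F_gravity / F_Coulomb for two protons: {ratio:.2e}")
```

The ratio comes out many tens of orders of magnitude below unity, which is roughly what the gravitation and electricity rows of Table 2.1 suggest; the exact figure depends on which pair of particles is chosen.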
The first (and weakest) force is Newton’s¹ force of universal gravitation (1687) [2]:
$$\mathbf{F}_{12} = -G m_1 M_2 \mathbf{r}_{12} / r_{12}^{3} \tag{2.1.1}$$
which describes the attractive force $\mathbf{F}_{12}$ between two bodies of masses $m_1$ and $M_2$ placed a distance $r_{12}$ apart, where $G$ is the constant of gravitation. The largest visible objects in the universe (galaxies, stars, quasars, planets, satellites, comets) are held together by this weakest force, which may be transmitted by a presumed but hitherto unobserved mediating particle called the graviton. Its range extends to the whole universe. Masses are always positive.
The second force is the electrical force, which obeys Coulomb’s² law (1785) [3]:
$$\mathbf{F}_{12} = q_1 q_2 \mathbf{r}_{12} / 4\pi\varepsilon_0 r_{12}^{3} \tag{2.1.2}$$
which describes the attractive (or repulsive) force $\mathbf{F}_{12}$ between two electrical charges $q_1$ and $q_2$ (positive or negative) placed $r_{12}$ apart, where $\varepsilon_0$ is the electrical permittivity of vacuum. The fundamental electrical monopole (electron) is probably infinitely stable; the mediating particle for the electrical force (photon) is observed and well understood. Magnetism is usually due to moving electrical charges, but its monopole has never been seen, so magnetism is not really an independent force; atoms have magnetic properties, and in wires the gegenions of electrical currents are “stationary,” yet the overall charge is zero: Hence magnetism is a special relativistic effect. As explained below, electricity and magnetism are well described by Maxwell’s³ four field equations [4].
The third force is the “weak nuclear” or “Fermi”⁴ force (1934), which governs the decay of many radioactive particles and of the free neutron; it explains “beta decay” and positron emission (e.g., the free neutron decays with a half-life of about 10 minutes into a proton, an electron, and an electron antineutrino). The weak force has a very narrow range.
¹ Sir Isaac Newton (1642–1727).
² Charles-Augustin de Coulomb (1736–1806).
³ James Clerk Maxwell (1831–1879).
⁴ Enrico Fermi (1901–1954).
The fourth and strongest force in the universe is the “strong nuclear force,” which binds together the nuclei and the constituents of atomic nuclei, but has an extremely narrow range. Indirect experimental evidence exists for a mediating particle (gluon). Nucleons (neutrons, protons) and maybe nuclei consist of “elementary” particles called quarks, which have never been seen free, although proton–proton scattering experiments show that protons consist of “lumps,” which may be the best experimental evidence for quarks. Between 1900 and 1960 a zoo of 100-odd stable and unstable elementary particles was discovered; the shortest-lived among them were called “resonances”; quarks were proposed in 1964 by Zweig⁵ and Gell-Mann⁶ to help order this zoo. Within the nucleus, the inter-nucleon “strong” force was traditionally thought of as being mediated by pions (themselves combinations of two quarks). The nuclear “shell model” assigns quantum numbers to the protons and neutrons that have merged to form a certain nucleus. Certain “magic values” of these nuclear quantum numbers explain why certain nuclei are more stable (have longer lives) than others.
Sideline. The name “quark” comes from a sentence in Joyce’s⁷ Finnegans Wake; a free quark has never been isolated, but physicists have not looked in German grocery stores, where Quark is a well-known special soft cheese!
In 1960 electrical and weak forces were merged by Glashow⁸ into electroweak theory. Evolving in the 1960s and 1970s from the quark hypothesis, the Standard Model of Glashow, Weinberg⁹, and Salam¹⁰ explains nucleons and other particles (hadrons, baryons, and mesons) as unions of either three or two “quarks” each, with a new set of ad hoc “quantum numbers.” This Standard Model has a symmetry basis in the finite special unitary group SU(3), along with a mathematical expression in quantum chromodynamics, but does not yield a force field.
These seemingly provisional ex post facto arguments and quantum numbers are reminiscent of the chemical arguments used by Mendeleyeff¹¹ in 1869 to construct the Periodic Table of chemical elements (whose explanation had to wait for quantum mechanics in the 1920s).
Sideline. Mendeleyeff divorced his wife in 1882 and married a student: By the rules of the Russian Orthodox Church, he became a bigamist, and according to an Edict of the Russian Czar, only members of the Church in good standing could teach in Russian Universities. When apprised of the dilemma, Czar Alexander III¹² said: “Mendeleyeff may have two wives, but I only have one Mendeleyeff”: Professor Mendeleyeff kept his job!
Table 2.2 lists the presently known fundamental particles (unobserved quarks and some neutrinos), the elementary particles, and the observed
⁵ George Zweig (1937–).
⁶ Murray Gell-Mann (1929–).
⁷ James Augustine Aloysius Joyce (1882–1941).
⁸ Sheldon Glashow (1932–).
⁹ Steven Weinberg (1933–).
¹⁰ Mohammed Abdus Salam (1926–1996).
¹¹ Dmitri Ivanovich Mendeleyeff (1834–1907).
¹² Alexander III Alexandrovich Romanov, Czar of All the Russias (1845–1894).
Table 2.2 Fundamental (Quark, Gluon, Graviton, Neutrino) and Elementary (= Fundamental Plus 2-Quark and 3-Quark Combinations) Particlesᵃ
Particle Name | Symbol | Lifetime (t/s) | Relative Mass m0 | Relative Electron Charge Q | Spin S | Parity P | Isospin T
Quarks [fundamental (with six “flavors” u, d, s, c, b, and t) but so far unobserved as single particles]
Up quark | $u$ | ∞? | 4.6 | +2/3 | 1/2 | +1 | 1/2
Anti-up | $\bar{u}$ | ∞? | 4.6 | −2/3 | 1/2 | −1 | 1/2
Down quark | $d$ | ∞? | 16 | −1/3 | 1/2 | +1 | 1/2
Anti-down | $\bar{d}$ | ∞? | 16 | +1/3 | 1/2 | −1 | 1/2
Charmed quark | $c$ | ∞? | 2490 | +2/3 | 1/2 | +1 | 0
Anti-charmed | $\bar{c}$ | ∞? | 2490 | −2/3 | 1/2 | −1 | 0
Strange quark | $s$ | ∞? | 200 | −1/3 | 1/2 | +1 | 0
Anti-strange | $\bar{s}$ | ∞? | 200 | +1/3 | 1/2 | −1 | 0
Bottom quark | $b$ | ∞? | 8480 | −1/3 | 1/2 | +1 | 0
Anti-bottom | $\bar{b}$ | ∞? | 8480 | +1/3 | 1/2 | −1 | 0
Top quark | $t$ | ∞? | >3.4E5 | +2/3 | 1/2 | +1 | 0
Anti-top | $\bar{t}$ | ∞? | >3.4E5 | −2/3 | 1/2 | −1 | 0
Fundamental interaction carriers (for gluons, combination of color and anti-color)
Photon | $\nu$ | ∞ | 0 | 0 | 1 | −1 | 1,0
Vector boson | $W^{+}$ | ? | 1.6E5 | +1 | 1 | — | 1
Vector boson | $Z$ | ? | 1.8E5 | 0 | 1 | — | 1
Vector boson | $W^{-}$ | ? | 1.6E5 | −1 | 1 | — | 1
Gluon1 | $r\bar{y}$ | ? | 0 | 0 | 1 | — | 0
Gluon2 | $r\bar{b}$ | ? | 0 | 0 | 1 | — | 0
Gluon3 | $y\bar{r}$ | ? | 0 | 0 | 1 | — | 0
Gluon4 | $y\bar{b}$ | ? | 0 | 0 | 1 | — | 0
Gluon5 | $b\bar{r}$ | ? | 0 | 0 | 1 | — | 0
Gluon6 | $b\bar{y}$ | ? | 0 | 0 | 1 | — | 0
Gluon7 | $(r\bar{r} - y\bar{y})\,2^{-1/2}$ | ? | 0 | 0 | 1 | — | 0
Gluon8 | $(r\bar{r} + y\bar{y} - 2b\bar{b})\,6^{-1/2}$ | ? | 0 | 0 | 1 | — | 0
Electron neutrino | $\nu_e$ | ∞ | 0 | 0 | 1/2 | — | —
Electron antineutrino | $\bar{\nu}_e$ | ∞ | 0 | 0 | 1/2 | — | —
Muon neutrino | $\nu_\mu$ | ∞ | 0 | 0 | 1/2 | — | —
Muon antineutrino | $\bar{\nu}_\mu$ | ∞ | 0 | 0 | 1/2 | — | —
Tau neutrino | $\nu_\tau$ | ∞ | 0 | 0 | 1/2 | — | —
Tau antineutrino | $\bar{\nu}_\tau$ | ∞ | 0 | 0 | 1/2 | — | —
Graviton? | $G$ | 0? | 0 | 0 | 2 | — | —
Higgs boson? | $H^{0}$ | ? | >224 | 0 | 0 | — | —
Leptons (elementary particles)
Electron | $e^{-}$ | ∞
Positron | $e^{+}$ | ∞
Muon | $\mu^{-}$ | 2.2E-6
Positive muon | $\mu^{+}$ | —
Tau | $\tau^{-}$ | […]
Positive tau | $\tau^{+}$ | […]
[…]
[…] $1.6 \times 10^{31}$ kg; otherwise the star will decay into a white dwarf.
Several models were adopted to explain the structure of stable and radioactive nuclei. The liquid drop model assumes that protons and neutrons coalesce to form a liquid drop of high density (spherical, or prolate spheroidal, or oblate spheroidal); Weizsäcker’s²⁰ semiempirical mass formula of 1935
¹⁹ Gregory of Ockham and de Saint-Pourçain (ca. 1288–1348).
²⁰ Carl Friedrich Freiherr von Weizsäcker (1912–2007).
accounts fairly well for the masses $M(Z, A)$ (in atomic mass units) for stable nuclei:
$$M(Z, A) = 1.007825\,Z + 1.008665\,(A - Z) - 0.01691\,A + 0.01911\,A^{2/3} + 0.000763\,Z^{2} A^{-1/3} + 0.10175\,(Z - A/2)^{2} A^{-1} + (-1,\ 0,\ \text{or}\ +1)\,0.012\,A^{-1/2} \tag{2.1.5}$$
However, the liquid-drop model does not account for the relative stability of certain nuclei called “islands of (relative) nuclear stability” (Z and/or N = 2, 8, 20, 28, 50, 82, 126, 184). The shell model of Göppert-Mayer²¹ and Jensen²² posits populating nuclear states as if the nucleons occupied the lowest possible quantum states for a three-dimensional harmonic oscillator, but with an energy correction due to a nuclear spin–orbit interaction: The nuclear “spin” quantum numbers I and “orbital” quantum numbers M couple strongly as I · M; this nuclear spin–orbit interaction (invented in analogy to the electron spin–orbit interaction) is, however, due to an unknown potential function; nevertheless, this model does account nicely for magic number stability and nuclear excited states. There is an acrostic “spuds if pug dish of pig” that serves as a mnemonic for the ordering 1s, 1p, 1d, 2s, 1f, 2p, 1g, 2d, 3s, 1h, 2f, 3p, 1i, 2g (before nuclear spin–orbit splittings). The final model that accounts for nuclear stabilities must, of course, be the strong force, or rather the residual component of the strong force that works outside of quark confinement.
Natural or artificial radioactive nuclei can exhibit several decay modes: α decay (N′ = N − 2; Z′ = Z − 2; A′ = A − 4; with emission of a ₂He⁴ nucleus), which is dominant for elements of atomic number greater than Pb; β⁻ decay or electron emission (N′ = N − 1; Z′ = Z + 1; A′ = A; this involves the weak force and the extra emission of an antineutrino); positron or β⁺ decay (N′ = N + 1; Z′ = Z − 1; A′ = A; emission of a positron and a neutrino; this also involves the weak force); γ decay (no changes in N or Z); and electron capture (N′ = N + 1; Z′ = Z − 1; A′ = A; capture of an orbital electron with emission of a neutrino; this involves the weak force).
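As an illustration of the semiempirical mass formula (2.1.5), the following sketch evaluates it for a given Z and A. The sign convention for the last (pairing) term is an assumption of this sketch, since the text lists only the three possible values: −1 is used for even-even nuclei, 0 for odd A, and +1 for odd-odd nuclei, which is the usual choice. The function names are hypothetical.

```python
def pairing_sign(Z, A):
    """Return -1 (even Z, even N), 0 (odd A), or +1 (odd Z, odd N).

    Assumed convention: even-even nuclei are the most tightly bound, so their
    mass is lowered by the pairing term.
    """
    if A % 2 == 1:
        return 0
    return -1 if Z % 2 == 0 else 1


def semiempirical_mass(Z, A):
    """Atomic mass M(Z, A) in atomic mass units, term by term from Eq. (2.1.5)."""
    N = A - Z
    return (
        1.007825 * Z                                # Z hydrogen atoms
        + 1.008665 * N                              # N free neutrons
        - 0.01691 * A                               # volume term
        + 0.01911 * A ** (2.0 / 3.0)                # surface term
        + 0.000763 * Z**2 * A ** (-1.0 / 3.0)       # Coulomb repulsion
        + 0.10175 * (Z - A / 2.0) ** 2 / A          # neutron-proton asymmetry
        + pairing_sign(Z, A) * 0.012 * A ** (-0.5)  # pairing term
    )


if __name__ == "__main__":
    for name, Z, A in [("Fe-56", 26, 56), ("Pb-208", 82, 208), ("U-238", 92, 238)]:
        print(f"{name}: M = {semiempirical_mass(Z, A):.4f} u")
```

For Fe-56 the formula lands within a few thousandths of a mass unit of the measured atomic mass (about 55.9349 u), which is the sense in which it "accounts fairly well" for stable nuclei.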
There is also internal conversion, from a metastable nucleus to a more stable nucleus with no particle emission. Very useful is a wall chart of all nuclides developed by the Knolls Atomic Power Laboratory of the General Electric Company in the 1950s and subsequently updated often (Appendix, Table A) [6].
The Periodic Table of the Chemical Elements (Table 2.3) was first organized by Mendeleyeff in 1869 [7] well before quantum mechanics and the modern theory of atomic structure, by using group analogies in chemical and physical properties; Mendeleyeff even predicted two as yet undiscovered elements (Ga, Ge) and left spaces for them in his table.
Sideline. In the 1780s Lavoisier first pinpointed the irreducibility of chemical elements (like hydrogen and oxygen) and their combination in chemical compounds (like water). In the early 1800s Dalton²³ revived the
²¹ Maria Göppert-Mayer (1906–1972).
²² Johannes Hans Daniel Jensen (1907–1973).
²³ John Dalton (1766–1844).
Table 2.3 The Periodic Table of the Known Chemical Elementsᵃ
Entries within each group: atomic number, symbol, atomic weight.
Grp 1 /1A/: 1 H 1.0079; 3 Li 6.941; 11 Na 22.9898; 19 K 39.0983; 37 Rb 85.4678; 55 Cs 132.9054511; 87 Fr (223)
Grp 2 /2A/: 4 Be 9.0122; 12 Mg 24.3050; 20 Ca 40.078; 38 Sr 87.62; 56 Ba 137.327; 88 Ra (226)
Grp 3 /3B/: 21 Sc 44.955912; 39 Y 88.90585; 57 La* 138.90547; 89 Ac‡ (227)
Grp 4 /4B/: 22 Ti 47.867; 40 Zr 91.224; 72 Hf 178.49; 104 Rf (267)
Grp 5 /5B/: 23 V 50.9415; 41 Nb 92.90638; 73 Ta 180.94788; 105 Db (268)
Grp 6 /6B/: 24 Cr 51.9961; 42 Mo 95.96; 74 W 183.84; 106 Sg (271)
Grp 7 /7B/: 25 Mn 54.938045; 43 Tc (98); 75 Re 186.207; 107 Bh (272)
Grp 8 /8B/: 26 Fe 55.845; 44 Ru 101.07; 76 Os 190.23; 108 Hs (270)
Grp 9 /8B/: 27 Co 58.933195; 45 Rh 102.90550; 77 Ir 192.217; 109 Mt (276)
Grp 10 /8B/: 28 Ni 58.6934; 46 Pd 106.42; 78 Pt 195.084; 110 Ds (281)
Grp 11 /1B/: 29 Cu 63.546; 47 Ag 107.8682; 79 Au 196.966569; 111 Rg (280)
Grp 12 /2B/: 30 Zn 65.38; 48 Cd 112.411; 80 Hg 200.59; 112 Cn (285)
Grp 13 /3A/: 5 B 10.811; 13 Al 26.9815; 31 Ga 69.723; 49 In 114.818; 81 Tl 204.3833; 113 Uut (284)
Grp 14 /4A/: 6 C 12.011; 14 Si 28.0855; 32 Ge 72.64; 50 Sn 118.710; 82 Pb 207.2; 114 Uuq (289)
Grp 15 /5A/: 7 N 14.0067; 15 P 30.9738; 33 As 74.92160; 51 Sb 121.760; 83 Bi 208.98040; 115 Uup (288)
Grp 16 /6A/: 8 O 15.9994; 16 S 32.066; 34 Se 78.96; 52 Te 127.60; 84 Po (209); 116 Uuh (293)
Grp 17 /7A/: 9 F 18.9984; 17 Cl 35.4527; 35 Br 79.904; 53 I 126.90447; 85 At (210); 117 Uus
Grp 18 /8A/: 2 He 4.0026; 10 Ne 20.1797; 18 Ar 39.948; 36 Kr 83.798; 54 Xe 131.293; 86 Rn (222); 118 Uuo (294)
*: 58 Ce 140.116; 59 Pr 140.90765; 60 Nd 144.242; 61 Pm (145); 62 Sm 150.36; 63 Eu 151.964; 64 Gd 157.25; 65 Tb 158.92535; 66 Dy 162.500; 67 Ho 164.93032; 68 Er 167.259; 69 Tm 168.93421; 70 Yb 173.054; 71 Lu 174.9668
‡: 90 Th 232.03806; 91 Pa 231.03588; 92 U 238.02891; 93 Np (237); 94 Pu (244); 95 Am (243); 96 Cm (247); 97 Bk (247); 98 Cf (251); 99 Es (252); 100 Fm (257); 101 Md (258); 102 No (259); 103 Lr (262)
ᵃ The 18 groups (Grp n) are the modern ones; the older grouping is given inside slashes.
Table 2.4 Fundamental Constantsᵃ
Name | Symbol | SI Value
Gravitational constant | $G$ | $6.67384 \times 10^{-11}$ m³ kg⁻¹ s⁻²
Rest mass of electron | $m_e$ | $9.1093897 \times 10^{-31}$ kg
Electrical charge of electron | $e$ | $1.602176565 \times 10^{-19}$ C
Speed of light | $c$ | $2.99792458 \times 10^{8}$ m s⁻¹
Planck’s constant of action | $h$ | $6.6260696 \times 10^{-34}$ J s
Planck’s reduced constant of action | $\hbar \equiv h/2\pi$ | $1.0545716 \times 10^{-34}$ J s
Free electron g-factor | $g_e$ | 2.002319304386
Avogadro’s number | $N_A$ | $6.0221413 \times 10^{23}$ molecules (gram-mole)⁻¹
Gas constant | $R$ | 8.3144622 J mol⁻¹ K⁻¹
Source: P.J. Mohr, B.N. Taylor, and D.B. Newell, “The 2010 CODATA Recommended Values of the Fundamental Physical Constants” (National Institute of Standards and Technology, Gaithersburg, MD 20899, 2011).
ᵃ gram-mole = molar mass in grams.
Table 2.5 Other Constants
Name | Symbol | Equation | SI Value
Sommerfeld fine-structure constant | $\alpha$ | $= e^{2}/2\varepsilon_0 c h$ | 1/137.035999 ($= e^{2}/\hbar c$ in esu)
Gravitational acceleration at sea level at equator | $g$ | | 9.78031 m s⁻²
Boltzmann’s constant | $k_B$ | $= R/N_A$ | $1.3806488 \times 10^{-23}$ J K⁻¹
Electrical permittivity of vacuum | $\varepsilon_0$ | $= \mu_0^{-1} c^{-2}$ | $8.854187817 \times 10^{-12}$ F m⁻¹
Magnetic permeability of vacuum | $\mu_0$ | $= 4\pi \times 10^{-7}$ | $1.2566370614 \times 10^{-6}$ N A⁻²
Quantized Hall resistance | $R_0$ | $= h e^{-2}$ | 25,812.807443 Ω
Magnetic flux quantum | $\Phi_0$ | $= h/2e$ | $2.06783376 \times 10^{-15}$ Wb
Source: P.J. Mohr, B.N. Taylor, and D.B. Newell, “The 2010 CODATA Recommended Values of the Fundamental Physical Constants” (National Institute of Standards and Technology, Gaithersburg, MD 20899, 2011).
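The defining equations in Table 2.5 can be checked against the base values of Table 2.4 with a few lines of arithmetic. The sketch below is only an illustration: the dictionary keys and the function name are hypothetical, and the input values are those tabulated above, rounded as printed.

```python
import math

# Input values from Tables 2.4 and 2.5 (SI units).
C = 2.99792458e8              # speed of light, m s^-1
H = 6.6260696e-34             # Planck's constant, J s
E_CHARGE = 1.602176565e-19    # elementary charge, C
N_A = 6.0221413e23            # Avogadro's number, mol^-1
R_GAS = 8.3144622             # gas constant, J mol^-1 K^-1
MU0 = 4.0 * math.pi * 1.0e-7  # magnetic permeability of vacuum, N A^-2


def derived_constants():
    """Evaluate the 'Equation' column of Table 2.5 from the inputs above."""
    eps0 = 1.0 / (MU0 * C**2)                         # eps0 = 1/(mu0 c^2)
    return {
        "eps0": eps0,
        "k_B": R_GAS / N_A,                           # k_B = R/N_A
        "alpha": E_CHARGE**2 / (2.0 * eps0 * C * H),  # alpha = e^2/(2 eps0 c h)
        "R_0": H / E_CHARGE**2,                       # R_0 = h/e^2
        "Phi_0": H / (2.0 * E_CHARGE),                # Phi_0 = h/(2e)
    }


if __name__ == "__main__":
    for name, value in derived_constants().items():
        print(f"{name:6s} = {value:.9e}")
    print(f"1/alpha = {1.0 / derived_constants()['alpha']:.4f}")
```

Agreement is limited by the number of digits carried in the inputs, so the last one or two printed digits may differ from the tabulated values.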
ancient idea of indivisible “fundamental” atoms proposed by Leucippus²⁴ and his pupil Democritus.²⁵ Dalton also demonstrated two laws (of definite and multiple proportions); as a result, relative empirical formulas and tables of relative atomic weights were established for a growing list of chemical compounds. But for several decades the molecular structure of water was erroneously assumed to be “HO.” Avogadro’s²⁶ 1811 principle that at constant pressure and temperature equal volumes of gases contained an equal number of molecules was ignored until Cannizzaro’s 1858 work,²⁷ circulated at the Karlsruhe conference of 1860, convinced the German chemists to finally take Avogadro’s principle seriously: Shazam! The molecular formula for water became H₂O, all relative scales rolled into one, and Mendeleyeff could then build his periodic table!
²⁴ Leucippus of Elea (early 5th century BC).
²⁵ Democritus of Abdera (ca. 460 BC–ca. 370 BC).
²⁶ Lorenzo Romano Amedeo Carlo Bernadette Avogadro, Conte di Quaregna e Cerreto (1776–1856).
²⁷ Stanislao Cannizzaro (1826–1910).
2.2 GRAVITATION

The gravitational force is the weakest force in nature, but it binds together the most massive bodies in the universe. The force is measured in newtons (N) in the SI system, but in dynes in the cgs system (see Appendix, Table A). This force can be rewritten in terms of a vector gravitational field $\mathbf{F}_1(\mathbf{r}_2)$ experienced by particle 2 at position $\mathbf{r}_2$, due to the existence of a particle 1 of mass $m_1$ at $\mathbf{r}_1$, and mediated by a continuous, if virtual, flow of gravitons emanating from particle 1:

$$\mathbf{F}_1(\mathbf{r}_2) = -G m_1 \mathbf{r}_{12}\, r_{12}^{-3} \tag{2.2.1}$$

where $\mathbf{r}_{12} \equiv \mathbf{r}_2 - \mathbf{r}_1$. This gravitational field can be integrated once to yield the scalar gravitational potential $U_1$ (an energy):

$$U_1 = -G m_1 |\mathbf{r} - \mathbf{r}_1|^{-1} \tag{2.2.2}$$

where the field is the negative gradient of the potential $U_1$ evaluated at any “field point” $\mathbf{r}$, for example, $\mathbf{r} = \mathbf{r}_2$ (except at the singular position $\mathbf{r} = \mathbf{r}_1$):

$$\mathbf{F}_1 = -\nabla U_1 = -\mathbf{e}_r (\partial/\partial r) U_1 \tag{2.2.3}$$

where $\mathbf{e}_r$ is the unit vector in the radial direction. This potential energy is measured in joules, J (1 J ≡ 1 N m) in the SI system, or in erg (1 erg ≡ 1 dyne cm) in the cgs system. We can also define the gravitational potential energy $U_{12}$ as the potential energy of the two-body system:

$$U_{12} = -G m_1 m_2 r_{12}^{-1} \tag{2.2.4}$$
2.3 REVIEW OF MATHEMATICAL CONCEPTS

When a function $y = f(x)$ is specified (e.g., $y = x^4 + 3 \sin x + \tanh x$), then x is the independent variable and y is the dependent variable, whose value is computed once numbers are assigned to the (one or more) independent variables for f. In other words, a function is a recipe for going from a variable (x) to a number (f(x)). An equation restricts the function to a definite value that it must attain after evaluation; for example, $x^3 + 3 \sin x + \tanh x = 55$ means that we must “solve” the equation for x (i.e., compute x) such that that value of x satisfies the given equation. A functional $F[g] = 33$ means that the explicit functional form of g is not known, or not knowable, but its use must yield the definite value of 33. In other words, a functional is a recipe for going from a function (g) to a number (F[g]). Algebraic equations in one variable $\lambda$ of order n = 1 through 4 can be solved explicitly.
If n = 2, the quadratic equation

$$\lambda^2 + a\lambda + b = 0 \tag{2.3.1}$$

has two solutions:

$$\lambda_1 = -a/2 + (1/2)[a^2 - 4b]^{1/2}, \qquad \lambda_2 = -a/2 - (1/2)[a^2 - 4b]^{1/2} \tag{2.3.2}$$
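As a quick numerical illustration of Eq. (2.3.2), here is a minimal Python sketch. The function name `quadratic_roots` is an illustrative choice, not something defined in the text; the complex square root from the standard `cmath` module is used so that a negative discriminant is handled without special cases.

```python
import cmath


def quadratic_roots(a, b):
    """Roots of the monic quadratic lambda**2 + a*lambda + b = 0, Eq. (2.3.2)."""
    disc = a * a - 4 * b          # discriminant a^2 - 4b
    root = cmath.sqrt(disc)       # complex square root also covers disc < 0
    return (-a + root) / 2, (-a - root) / 2


if __name__ == "__main__":
    print(quadratic_roots(-3, 2))   # positive discriminant: two real roots
    print(quadratic_roots(2, 5))    # negative discriminant: complex-conjugate pair
```

Both roots are returned as Python complex numbers; for a nonnegative discriminant the imaginary parts are zero.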
If the discriminant $D \equiv a^2 - 4b < 0$, then both roots are complex; if $D = 0$, then the roots are real and degenerate (equal to each other). The solution for n = 2 was known to the Egyptians in the Middle Kingdom (ca. 2160–1700 BC), the Hindus (Brahmagupta28 in 628 AD), and in its geometrical form to the ancient Greeks (Euclid29 and Diophantus30).

If n = 3, for the cubic equation

$$\lambda^3 + a\lambda^2 + b\lambda + c = 0 \tag{2.3.3}$$

the solution was found by del Ferro31 and Tartaglia,32 published by Cardano33 in 1545, and confirmed by Ferrari34 as

$$\lambda = u - p/3u - a/3 \quad \text{(this encapsulates Viète's35 substitution)} \tag{2.3.4}$$

where

$$p \equiv b - a^2/3, \qquad q \equiv (2/27)a^3 - (1/3)ab + c, \qquad u \equiv \left\{-q/2 \pm \left[(1/4)q^2 + p^3/27\right]^{1/2}\right\}^{1/3}$$
Two distinct roots are possible, one for each choice of the ± sign. The three cube roots of 1, namely, (i) $\exp(2\pi i/3) = \cos(120^\circ) + i \sin(120^\circ) = -0.500000 + i\,0.866025 = -(1/2) + i(3^{1/2}/2)$, (ii) $\exp(4\pi i/3) = \cos(240^\circ) + i \sin(240^\circ) = -0.500000 - i\,0.866025 = -(1/2) - i(3^{1/2}/2)$, and (iii) $\exp(6\pi i/3) = 1$, provide three roots; this times two is six roots, which do reduce to only three. To get the three correct roots $\lambda_1$, $\lambda_2$, and $\lambda_3$, it is essential that u (and not $\lambda$) be premultiplied by factors of $\exp(6\pi i/3)$, $\exp(2\pi i/3)$, and $\exp(4\pi i/3)$, or else wrong results will be obtained. The discriminant $D \equiv 18abc - 4a^3c + a^2b^2 - 4b^3 - 27c^2$ determines the nature of the three roots: If D > 0, then there are three distinct real roots; if D = 0, then all three roots are real (but some are degenerate); if D < 0, then there are one real root and two complex and mutually conjugate roots. An “umbrella” or “monic” formula, which is foolproof, is

$$\lambda_1 = -\frac{a}{3} - \frac{1}{3}\left\{\frac{1}{2}\left[2a^3 - 9ab + 27c + \left[(2a^3 - 9ab + 27c)^2 - 4(a^2 - 3b)^3\right]^{1/2}\right]\right\}^{1/3} - \frac{1}{3}\left\{\frac{1}{2}\left[2a^3 - 9ab + 27c - \left[(2a^3 - 9ab + 27c)^2 - 4(a^2 - 3b)^3\right]^{1/2}\right]\right\}^{1/3}$$

$$\lambda_2 = -\frac{a}{3} + \frac{1 + i\,3^{1/2}}{6}\left\{\frac{1}{2}\left[2a^3 - 9ab + 27c + \left[(2a^3 - 9ab + 27c)^2 - 4(a^2 - 3b)^3\right]^{1/2}\right]\right\}^{1/3} + \frac{1 - i\,3^{1/2}}{6}\left\{\frac{1}{2}\left[2a^3 - 9ab + 27c - \left[(2a^3 - 9ab + 27c)^2 - 4(a^2 - 3b)^3\right]^{1/2}\right]\right\}^{1/3}$$

$$\lambda_3 = -\frac{a}{3} + \frac{1 - i\,3^{1/2}}{6}\left\{\frac{1}{2}\left[2a^3 - 9ab + 27c + \left[(2a^3 - 9ab + 27c)^2 - 4(a^2 - 3b)^3\right]^{1/2}\right]\right\}^{1/3} + \frac{1 + i\,3^{1/2}}{6}\left\{\frac{1}{2}\left[2a^3 - 9ab + 27c - \left[(2a^3 - 9ab + 27c)^2 - 4(a^2 - 3b)^3\right]^{1/2}\right]\right\}^{1/3} \tag{2.3.5}$$

28 Brahmagupta (598–668).
29 Euclid of Alexandria (fl[oruit] ca. 300 BC).
30 Diophantus (born between 220 and 214 AD, died between 284 and 298 AD).
31 Scipione del Ferro (1495–1528).
32 Niccolò Fontana, nicknamed Il Tartaglia “the stutterer” (1499–1557).
33 Girolamo Cardano (1501–1576).
34 Lodovico Ferrari (1522–1565).
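Equivalently, one can define:

$$A \equiv \left\{-q/2 + \left[(1/4)q^2 + p^3/27\right]^{1/2}\right\}^{1/3}, \qquad B \equiv \left\{-q/2 - \left[(1/4)q^2 + p^3/27\right]^{1/2}\right\}^{1/3} \tag{2.3.6}$$

The three solutions are then

$$\lambda_1 = -a/3 + A + B$$
$$\lambda_2 = -a/3 - (1/2)(A + B) + i\,(3^{1/2}/2)(A - B)$$
$$\lambda_3 = -a/3 - (1/2)(A + B) - i\,(3^{1/2}/2)(A - B) \tag{2.3.7}$$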
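The warning that u, and not λ, must be multiplied by the cube roots of unity can be made concrete with a short Python sketch of Eq. (2.3.4). This is only an illustration under stated assumptions: the function name `cubic_roots` and the absolute tolerance `1e-14` used to detect u = 0 are choices made here, not part of the text, and the tolerance would need rescaling for coefficients of very different magnitude.

```python
import cmath


def cubic_roots(a, b, c):
    """Roots of lambda**3 + a*lambda**2 + b*lambda + c = 0 via Eq. (2.3.4)."""
    p = b - a * a / 3
    q = 2 * a**3 / 27 - a * b / 3 + c
    s = cmath.sqrt(q * q / 4 + p**3 / 27)
    u_cubed = -q / 2 + s                  # upper sign in the definition of u
    if abs(u_cubed) < 1e-14:              # upper sign gave u = 0: use the lower sign
        u_cubed = -q / 2 - s
    if abs(u_cubed) < 1e-14:              # p = q = 0: one triply degenerate root
        return [complex(-a / 3)] * 3
    u = u_cubed ** (1 / 3)                # principal complex cube root
    omega = cmath.exp(2j * cmath.pi / 3)  # primitive cube root of unity
    roots = []
    for k in range(3):
        uk = u * omega**k                 # premultiply u (not lambda) by 1, omega, omega^2
        roots.append(uk - p / (3 * uk) - a / 3)
    return roots


if __name__ == "__main__":
    # (lambda - 1)(lambda - 2)(lambda - 3) = lambda^3 - 6 lambda^2 + 11 lambda - 6
    for r in cubic_roots(-6, 11, -6):
        print(r)
```

Because the arithmetic is complex throughout, real roots come back with imaginary parts at the level of rounding error rather than exactly zero.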
Consider $d \equiv (1/4)q^2 + p^3/27$; within overall multiplicative factors, this d is equivalent to, but opposite in sign to, the discriminant $D \equiv 18abc - 4a^3c + a^2b^2 - 4b^3 - 27c^2$ defined above. If d > 0, there will be 1 real root and 2 complex-conjugate roots. If d = 0, there will be 3 real roots, of which at least 2 are equal. If d < 0, there will be 3 real and unequal roots; if d < 0, then define the following:

$$\cos\theta \equiv (-q/2)/(-p^3/27)^{1/2}$$

and the 3 unequal real roots (for d < 0 and D > 0) are

$$\lambda_1 = -a/3 + 2(-p/3)^{1/2} \cos(\theta/3)$$
$$\lambda_2 = -a/3 + 2(-p/3)^{1/2} \cos(\theta/3 + 120^\circ)$$
$$\lambda_3 = -a/3 + 2(-p/3)^{1/2} \cos(\theta/3 + 240^\circ) \tag{2.3.8}$$

If n = 4, the general quartic equation

$$\lambda^4 + a\lambda^3 + b\lambda^2 + c\lambda + d = 0 \tag{2.3.9}$$
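has a solution:

$$\lambda = -a/4 + (1/2)\left\{\pm W + (\pm 1)^{*} \left[-(3\alpha + 2y \pm 2\beta/W)\right]^{1/2}\right\} \tag{2.3.10}$$

where

$$\alpha \equiv -(3/8)a^2 + b$$
$$\beta \equiv (1/8)a^3 - (1/2)ab + c$$
$$\gamma \equiv -(3/256)a^4 + (1/16)a^2 b - (1/4)ac + d$$
$$P \equiv -(1/12)\alpha^2 - \gamma$$
$$Q \equiv -(1/108)\alpha^3 + (1/3)\alpha\gamma - (1/8)\beta^2$$
$$R \equiv -(1/2)Q \pm \left[(1/4)Q^2 + (1/27)P^3\right]^{1/2}$$
$$U \equiv R^{1/3}$$
$$y \equiv -(5/6)\alpha + U - P/3U \quad \text{if } U \neq 0; \qquad y \equiv -(5/6)\alpha + U - Q^{1/3} \quad \text{if } U = 0$$
$$W \equiv [\alpha + 2y]^{1/2} \tag{2.3.11}$$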
where all the upper signs “travel together” (the starred sign $(\pm 1)^{*}$ is chosen independently). This solution was found by Ferrari in 1545. If n > 4, Abel36 showed in 1824 that there can be no general closed-form solution in radicals [8]. Thus, numerical methods must be used when n > 4.

Plane Trigonometric Functions. In a right plane triangle with right angle $\gamma = 90^\circ$, $\sin\alpha \equiv A/C$, where A is the segment opposite to the angle $\alpha$, and C is the hypotenuse; $\cos\alpha \equiv B/C$; $\tan\alpha \equiv A/B = \sin\alpha/\cos\alpha$; $\sin\beta \equiv B/C$; $\cos\beta \equiv A/C$; $\alpha + \beta = 90^\circ$; $\sec\alpha \equiv 1/\cos\alpha$; $\operatorname{cosec}\alpha \equiv 1/\sin\alpha$; $\operatorname{cotan}\alpha \equiv 1/\tan\alpha$; $\sin^2\alpha + \cos^2\alpha = 1$; $\sin(-x) = -\sin x$; $\cos(-x) = \cos x$; $\tan(-x) = -\tan x$; $\sin(x \pm y) = \sin x \cos y \pm \cos x \sin y$; $\cos(x \pm y) = \cos x \cos y \mp \sin x \sin y$; $2 \sin x \cos y = \sin(x + y) + \sin(x - y)$; $2 \cos x \cos y = \cos(x + y) + \cos(x - y)$; $2 \sin x \sin y = \cos(x - y) - \cos(x + y)$. Thus, sin x and tan x are odd functions of x, while cos x is an even function of x. In some countries, tan x is written as tg x, and cotan x is written cotg x. Inverse functions: If $x = \cos y$, then $y = \cos^{-1} x$. Be careful: $\cos^{-1} x \neq 1/\cos x$!

Hyperbolic Functions
$$\exp(x) \equiv e^x \equiv 2.718281828^x \tag{2.3.12}$$
$$\sinh x \equiv (1/2)[\exp(x) - \exp(-x)] \tag{2.3.13}$$
$$\cosh x \equiv (1/2)[\exp(x) + \exp(-x)] \tag{2.3.14}$$
$$\tanh x \equiv \sinh x/\cosh x = [\exp(x) - \exp(-x)]/[\exp(x) + \exp(-x)] \tag{2.3.15}$$
$$\operatorname{cotanh} x \equiv \cosh x/\sinh x = [\exp(x) + \exp(-x)]/[\exp(x) - \exp(-x)] \tag{2.3.16}$$

36 Niels Henrik Abel (1802–1829).

Differential calculus was developed independently by Newton and Leibniz.37

Derivatives

$$(d/dx)\,x^n = n x^{n-1} \tag{2.3.17}$$
$$(d/dx)\,x^{-n} = -n x^{-n-1} \tag{2.3.18}$$
$$(d/dx) \sin x = \cos x \tag{2.3.19}$$
$$(d/dx) \cos x = -\sin x \tag{2.3.20}$$
$$(d/dx) \tan x = \sec^2 x = 1/\cos^2 x \tag{2.3.21}$$
$$(d/dx) \operatorname{cotan} x = -\operatorname{cosec}^2 x \tag{2.3.22}$$
$$(d/dx) \sec x = \sec x \tan x \tag{2.3.23}$$
$$(d/dx) \operatorname{cosec} x = -\operatorname{cosec} x \operatorname{cotan} x \tag{2.3.24}$$
$$(d/dx)\,e^x = e^x \equiv (d/dx) \exp(x) = \exp(x) \tag{2.3.25}$$
$$(d/dx) \sin^{-1} x = [1 - x^2]^{-1/2} \tag{2.3.26}$$
$$(d/dx) \cos^{-1} x = -[1 - x^2]^{-1/2} \tag{2.3.27}$$
$$(d/dx) \tan^{-1} x = [1 + x^2]^{-1} \tag{2.3.28}$$
$$(d/dx) \operatorname{cotan}^{-1} x = -[1 + x^2]^{-1} \tag{2.3.29}$$
Differential operators: $(d/dx)f(x) \equiv df/dx \equiv f'(x)$; $(d/dx)(d/dx)f(x) = (d^2 f/dx^2) \equiv f''(x)$; $(d/dx)^n f(x) \equiv (d^n f/dx^n) = f^{(n)}(x)$.

Product Rule

$$d(uv)/dx = u(dv/dx) + v(du/dx) \tag{2.3.30}$$

Integrals

$$\int x^n\,dx = (n + 1)^{-1} x^{n+1} + C \quad (n \neq -1) \tag{2.3.31}$$

37 Gottfried Wilhelm von Leibniz (1646–1716).
where C is a constant;

$$\int \sin x\,dx = -\cos x + C \tag{2.3.32}$$
$$\int \cos x\,dx = \sin x + C \tag{2.3.33}$$
$$\int \tan x\,dx = -\log|\cos x| + C \tag{2.3.34}$$
$$\int \operatorname{cotan} x\,dx = \log|\sin x| + C \tag{2.3.35}$$

Integration by Parts

$$\int u\,dv = uv - \int v\,du \tag{2.3.36}$$
Taylor38 Series

$$f(x) = f(a) + \frac{(x - a)}{1!}\left.\frac{d}{dx}f(x)\right|_{x=a} + \frac{(x - a)^2}{2!}\left.\frac{d^2}{dx^2}f(x)\right|_{x=a} + \frac{(x - a)^3}{3!}\left.\frac{d^3}{dx^3}f(x)\right|_{x=a} + \cdots + \frac{(x - a)^n}{n!}\left.\frac{d^n}{dx^n}f(x)\right|_{x=a} + \cdots \tag{2.3.37}$$

Maclaurin39 Series (= Taylor series for a = 0):

$$\exp(x) = 1 + x + x^2/2 + x^3/6 + x^4/24 + x^5/120 + \cdots \tag{2.3.38}$$
$$\exp(-x) = 1 - x + x^2/2 - x^3/6 + x^4/24 - x^5/120 + \cdots \tag{2.3.39}$$
$$\cos x = 1 - x^2/2 + x^4/24 - x^6/720 + \cdots \tag{2.3.40}$$
$$\sin x = x - x^3/6 + x^5/120 - x^7/5040 + \cdots \tag{2.3.41}$$
$$\tan x = x + x^3/3 + (2/15)x^5 + (17/315)x^7 + \cdots \tag{2.3.42}$$
$$\operatorname{cotan} x = 1/x - x/3 - x^3/45 - (2/945)x^5 - x^7/4725 - \cdots \tag{2.3.43}$$
$$\sec x = 1 + x^2/2 + (5/24)x^4 + (61/720)x^6 + (277/8064)x^8 + \cdots \tag{2.3.44}$$

38 Sir Brook Taylor (1685–1731).
39 Colin Maclaurin (1698–1746).
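How quickly such series converge is easy to see numerically. The following Python sketch sums the first few terms of the Maclaurin series for exp(x) and sin x and compares them with the library values; the function names `maclaurin_exp` and `maclaurin_sin` and the sample point x = 0.5 are illustrative choices made here, not part of the text.

```python
import math


def maclaurin_exp(x, n_terms):
    """Partial sum of the exp(x) series: x**k / k! for k = 0 .. n_terms - 1."""
    return sum(x**k / math.factorial(k) for k in range(n_terms))


def maclaurin_sin(x, n_terms):
    """Partial sum of the sin x series: alternating odd powers x**(2k+1) / (2k+1)!."""
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(n_terms))


if __name__ == "__main__":
    x = 0.5
    for n in (2, 4, 6, 8):
        # print the truncation error of each partial sum
        print(n, maclaurin_exp(x, n) - math.exp(x), maclaurin_sin(x, n) - math.sin(x))
```

For small |x| the error shrinks roughly like the first omitted term, which is why a handful of terms usually suffices.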
Euler40 Formula

$$\exp(ix) = \cos x + i \sin x \tag{2.3.45}$$

where $i \equiv (-1)^{1/2}$, which gives the funny-looking but nevertheless true result:

$$\exp(i\pi) = -1 \tag{2.3.46}$$

PROBLEM 2.3.1. Prove Eq. (2.3.21) by using the product rule for derivatives, Eq. (2.3.30).

Sums

$$\sum_{i=1}^{n} a_i \equiv a_1 + a_2 + a_3 + \cdots + a_{n-1} + a_n \tag{2.3.47}$$

The Einstein41 summation convention is that a sum is understood for any repeated indices over their full range:

$$a_i b_i \equiv \sum_{i=1}^{n} a_i b_i \tag{2.3.48}$$

Arithmetic Series. If $a_i \equiv a + id$, then

$$\sum_{i=1}^{n} a_i = \sum_{i=1}^{n} (a + id) = na + [n(n + 1)/2]\,d \tag{2.3.49}$$

Geometric Series. If $a_i \equiv a d^i$, then

$$\sum_{i=0}^{n} a_i = \sum_{i=0}^{n} a d^i = a + ad + ad^2 + \cdots + a d^n = a \sum_{i=0}^{n} d^i = \frac{a(1 - d^{n+1})}{1 - d} \tag{2.3.50}$$

If n is infinite, $a_i \equiv a d^i$, and $|d| < 1$, then

$$\sum_{i=0}^{\infty} a d^i = \frac{a}{1 - d} \tag{2.3.51}$$

40 Leonhard Euler (1707–1783).
41 Albert Einstein (1879–1955).
PROBLEM 2.3.2. Verify Eq. (2.3.50).

PROBLEM 2.3.3. Verify Eq. (2.3.51).
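Before attempting an algebraic proof, the closed forms can be spot-checked numerically. The Python sketch below assumes the index conventions stated in its docstrings (arithmetic terms a + i·d summed over i = 1..n, geometric terms a·d^i summed over i = 0..n); the function names and the sample values a = 2, d = 0.5, n = 10 are illustrative choices, not taken from the text.

```python
def arithmetic_sum(a, d, n):
    """Direct sum of a_i = a + i*d for i = 1 .. n."""
    return sum(a + i * d for i in range(1, n + 1))


def geometric_sum(a, d, n):
    """Direct sum of a_i = a * d**i for i = 0 .. n."""
    return sum(a * d**i for i in range(n + 1))


a, d, n = 2.0, 0.5, 10
arith_closed = n * a + n * (n + 1) / 2 * d        # closed form for the arithmetic series
geom_closed = a * (1 - d ** (n + 1)) / (1 - d)    # closed form for the finite geometric series
geom_limit = a / (1 - d)                          # n -> infinity limit, valid for |d| < 1

if __name__ == "__main__":
    print(arithmetic_sum(a, d, n), arith_closed)
    print(geometric_sum(a, d, n), geom_closed, geom_limit)
```

A numerical agreement of this kind is only a sanity check, not a proof; the problems still ask for the derivation.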
Partial Fractions. It is often convenient or desirable (e.g., in some difficult integrations) to break up a complicated factored polynomial expression in the denominator into partial fractions involving new denominators of order no higher than 2. For instance, it can be shown that the fraction on the left can be decomposed into a sum of the simpler fractions on the right:

$$\frac{x + 3}{(x - 1)^2 (x - 2)(x - 3)(x^2 + 2x + 2)} = A(x - 1)^{-1} + B(x - 1)^{-2} + C(x - 2)^{-1} + D(x - 3)^{-1} + (Ex + F)(x^2 + 2x + 2)^{-1}$$

where the coefficients A, B, C, D, and especially E and F are nonzero. These coefficients are found by “brute force.” The two rules for how to set up the partial fractions are as follows:

(i) If a linear factor $ax + b$ occurs n times in the denominator, then to this factor will correspond a sum of n partial fractions: $A_1 (ax + b)^{-1} + A_2 (ax + b)^{-2} + \cdots + A_n (ax + b)^{-n}$ with $A_n \neq 0$;

(ii) if a quadratic factor $ax^2 + bx + c$ occurs n times in the denominator, then to this factor will correspond a sum of n partial fractions: $(A_1 x + B_1)(ax^2 + bx + c)^{-1} + (A_2 x + B_2)(ax^2 + bx + c)^{-2} + \cdots + (A_n x + B_n)(ax^2 + bx + c)^{-n}$ with $A_n x + B_n \neq 0$.

PROBLEM 2.3.4. Evaluate the coefficients A, B, C, D, E, and F in the equation

$$\frac{x + 3}{(x - 1)^2 (x - 2)(x - 3)(x^2 + 2x + 2)} = A(x - 1)^{-1} + B(x - 1)^{-2} + C(x - 2)^{-1} + D(x - 3)^{-1} + (Ex + F)(x^2 + 2x + 2)^{-1}$$

The Lagrange Method of Undetermined Multipliers. To prove important statistical mechanical results in Chapter 5, we need the method of undetermined multipliers, due to Lagrange.42 This method can be enunciated as follows: Assume that a function $f(x_1, x_2, \ldots, x_n)$ of n variables $x_1, x_2, \ldots, x_n$ is subject to two auxiliary conditions:

$$g(x_1, x_2, \ldots, x_n) = 0 \tag{2.3.52}$$
$$h(x_1, x_2, \ldots, x_n) = 0 \tag{2.3.53}$$

We seek extrema (maxima, minima, or saddle points) of f, subject to these two conditions. We shall show that there exist two constants, defined as $\alpha$ and $\beta$ (these two are known as the Lagrange multipliers), such that the system of n + 2 equations

$$\partial f/\partial x_i + \alpha\,\partial g/\partial x_i + \beta\,\partial h/\partial x_i = 0 \quad (i = 1, 2, \ldots, n) \tag{2.3.54}$$
$$g(x_1, x_2, \ldots, x_n) = 0 \tag{2.3.55}$$
$$h(x_1, x_2, \ldots, x_n) = 0 \tag{2.3.56}$$

when solved, will provide the desired extremum for f.

42 Joseph Louis Lagrange = Giuseppe Lodovico Lagrangia (1736–1813).

We want to find the conditions for which df = 0. Remember that, while the constraints are g = 0 and h = 0, in general f ≠ 0. Define the differential:

$$df \equiv \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\,dx_i \tag{2.3.57}$$

We want to find when df = 0. Since g = 0 and h = 0, therefore also, a fortiori,

$$dg \equiv \sum_{i=1}^{n} \frac{\partial g}{\partial x_i}\,dx_i = 0 \tag{2.3.58}$$

and also

$$dh = \sum_{i=1}^{n} \frac{\partial h}{\partial x_i}\,dx_i = 0 \tag{2.3.59}$$

Rewriting Eq. (2.3.54) now yields

$$\partial f/\partial x_i = -\alpha\,\partial g/\partial x_i - \beta\,\partial h/\partial x_i \quad (i = 1, 2, \ldots, n) \tag{2.3.60}$$

so that finally

$$df = -\alpha \sum_{i=1}^{n} \frac{\partial g}{\partial x_i}\,dx_i - \beta \sum_{i=1}^{n} \frac{\partial h}{\partial x_i}\,dx_i = -\alpha \cdot 0 - \beta \cdot 0 = 0 \tag{2.3.61}$$

that is, Eq. (2.3.54), which serves as the set of conditions on $\alpha$ and $\beta$, guarantees that df = 0, as desired. Proving whether this extremum in $f(x_1, x_2, \ldots, x_n)$ is a maximum, a minimum, or a saddle point is usually not done analytically (e.g., by further differentiation to make sure that $d^2 f > 0$ for a minimum, $d^2 f < 0$ for a maximum, etc.), but instead by recourse to physical arguments. Indeed, the values of the Lagrange multipliers $\alpha$ and $\beta$ can often be found from physical arguments.
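A small worked case may help fix the idea. The Python sketch below uses a hypothetical example chosen here, not taken from the text: extremize f = x² + y² + z² subject to g = x + y + z − 1 = 0 and h = x − y = 0. For this choice the n + 2 = 5 equations of the method are linear in (x, y, z, α, β), so they can be solved exactly with rational arithmetic; the helper `solve_linear` is a plain Gauss-Jordan elimination written for this illustration.

```python
from fractions import Fraction


def solve_linear(matrix, rhs):
    """Solve matrix * unknowns = rhs by Gauss-Jordan elimination with exact fractions."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # find a row at or below the diagonal with a nonzero entry in this column
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


# Unknowns ordered as (x, y, z, alpha, beta).
equations = [
    [2, 0, 0, 1, 1],    # df/dx + alpha*dg/dx + beta*dh/dx = 2x + alpha + beta = 0
    [0, 2, 0, 1, -1],   # df/dy + alpha*dg/dy + beta*dh/dy = 2y + alpha - beta = 0
    [0, 0, 2, 1, 0],    # df/dz + alpha*dg/dz + beta*dh/dz = 2z + alpha = 0
    [1, 1, 1, 0, 0],    # constraint g: x + y + z = 1
    [1, -1, 0, 0, 0],   # constraint h: x - y = 0
]
right_hand_side = [0, 0, 0, 1, 0]

x, y, z, alpha, beta = solve_linear(equations, right_hand_side)

if __name__ == "__main__":
    print(x, y, z, alpha, beta)
```

In this example the second constraint turns out not to bind (β = 0), which illustrates that a multiplier can vanish when its constraint is already satisfied at the extremum of the remaining problem.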
2.4 MECHANICS, VECTORS, TENSORS, AND DETERMINANTS

Force is defined by Newton’s second law:

$$\mathbf{F} = \partial \mathbf{p}/\partial t \tag{2.4.1}$$

where $\mathbf{F}$ is the force, t is the time, and $\mathbf{p}$ is the particle’s linear momentum (this equation is relativistically correct). When the momentum is given by the product of the particle’s mass m times its velocity $\mathbf{v}$:

$$\mathbf{p} = m\mathbf{v} \tag{2.4.2}$$

(this is not valid at relativistic speeds) and if $\mathbf{a} = d\mathbf{v}/dt$, then

$$\mathbf{F} = m\mathbf{a} \tag{2.4.3}$$
The rest mass m of any particle or celestial body can be considered in three ways: (i) as a proportionality constant between force and acceleration; (ii) as a curvature of the space–time continuum around a massive body (the effects of Einstein’s theory of general relativity were relabeled by Wheeler43 as “geometrodynamics” [9]); (iii) as a fundamental property, of dimension [M], defined by Eq. (2.1.2) or by Eq. (2.4.3) as “inertial mass” in outer space, or as “amount of material.” Interpretation (ii) has triumphed, but one may still argue about what m really is. If m is a fundamental “essence,” of dimension [M], then force and field have dimensions [M] [L] [T]$^{-2}$, while energy has dimensions [M] [L]$^2$ [T]$^{-2}$. What rest mass an elementary particle should have may be predictable if the Higgs boson is ever found.

Five unit systems should be summarized here:

(A) The SI (Système International) units use kilograms, meters, seconds, amperes, kelvin, mole ($6.022 \times 10^{23}$ molecules per gram-mole, and not per kg-mole), and candela for [M], [L], [T], current, absolute temperature, mole, and luminous intensity, respectively. It started from an MKS (m-kg-s) system and included an electrical unit as part of the definition, as first suggested by Giorgi44 in 1904. There is a very slight modification of SI, used in nonlinear optics, confusingly dubbed MKS by its users, but called SI′ here.

(B) The older cgs units started from the French Academy work of 1793 defining the gram and the meter, and use grams, centimeters, and seconds for [M], [L], and [T], respectively. To define electrical and magnetic quantities, cgs comes in two flavors: cgs-esu, or simply esu (where statCoulombs are the units for electrical charge), and cgs-emu, or simply emu, where the Oersted45 is the unit of magnetic field. There are other variants of the cgs units: Gaussian46 and Heaviside47-Lorentz.
43 John Archibald Wheeler (1911–2008).
44 Giovanni Giorgi (1871–1950).
45 Hans Christian Oersted (1777–1851).
46 Karl Friedrich Gauss (1777–1855).
47 Oliver Heaviside (1850–1925).
(C) In addition to the SI and cgs systems, we can define a system of Hartree48 atomic units (a.u.). Alas, a slightly different set of Rydberg49 a.u. also exists, but will not be discussed here. The Hartree atomic units are defined so that (i) the unit of length [L] is $a_0$ = 1 bohr = $5.29177 \times 10^{-11}$ m = radius of the “Bohr50 orbit” for hydrogen, (ii) the unit of mass [M] is 1 electron mass = $m_e$ = $9.109 \times 10^{-31}$ kg, (iii) the unit of [action] is [M] [L]$^2$ [T]$^{-1}$ = $(h/2\pi) \equiv \hbar$ = reduced Planck constant of action = $1.055 \times 10^{-34}$ J s, (iv) the unit of electrical charge is the proton charge e = $1.602 \times 10^{-19}$ C. This is equivalent to putting $\hbar = 1$, $m_e = 1$, $e = 1$ in all formulas. As a consequence, (v) the unit of time [T] = time for 1 electron to travel 1 Bohr radius = $2.419 \times 10^{-17}$ s, (vi) the unit of energy [M] [L]$^2$ [T]$^{-2}$ = 1 hartree = twice the ionization energy of the hydrogen atom = $4.360 \times 10^{-18}$ J.

(D) In Planck units, of interest to quantum gravity and to early cosmology, $\hbar = 1$, as in the atomic units, but c = speed of light in vacuo = 1, and G = gravitational constant = 1.

(E) In Astronomical units (unfortunately, also called a.u.) the unit of mass is the solar mass ($1.98892 \times 10^{30}$ kg), and the unit of length is the mean distance from earth to sun ($1.49597871464 \times 10^{11}$ m).

For a single particle of mass m and momentum $\mathbf{p}$, or velocity $\mathbf{v} = (d\mathbf{r}/dt)$, the kinetic energy T (originally dubbed “vis viva,” or live energy!) is defined as

$$T \equiv p^2/2m = mv^2/2 \tag{2.4.4}$$
T is positive definite. The potential energy U introduced above depends on an arbitrary choice of its zero, which depends on “tradition,” that is, on the whim of the first or loudest experimenter: U becomes positive or negative, relative to that zero, depending on that tradition.

A concept helpful for solving simple celestial mechanics problems is the centrifugal acceleration a of any body moving with a speed v in a circular orbit of radius r:

$$a = v^2/r \tag{2.4.5}$$

which points outward along $\mathbf{r}$. Its opposite is the centripetal acceleration, which keeps a body on its circular path and which, expressed as a vector, is $\mathbf{a} = d^2\mathbf{r}/dt^2 = -v^2 \mathbf{r}\, r^{-2}$. In general, the acceleration $\mathbf{a}$ will have a radial component, the centripetal acceleration ($v^2/r$) along a unit (inward) vector $\mathbf{e}_n$, and a tangential component, along the unit tangent vector $\mathbf{e}_t$: $\mathbf{a} = d^2\mathbf{r}/dt^2 = (dv/dt)\,\mathbf{e}_t + (v^2/r)\,\mathbf{e}_n$.

PROBLEM 2.4.1. Given that the average earth–moon distance is $R_{em} = 3.844 \times 10^{8}$ m and that the moon’s revolution around the earth is 27.3 days (from which its tangential orbital velocity is $v_m = 1.0186 \times 10^{3}$ m s$^{-1}$), compute the mass of the earth.
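Returning briefly to the Hartree atomic units of item (C): the derived units of length, energy, and time follow from the defining constants, and the quoted values can be cross-checked in a few lines of Python. This is a sketch under stated assumptions: the SI inputs are rounded 2010 CODATA values entered by hand, and the variable names are illustrative. The relations used are the standard ones, $a_0 = 4\pi\varepsilon_0\hbar^2/(m_e e^2)$, 1 hartree = $\hbar^2/(m_e a_0^2)$, and unit of time = $\hbar$/hartree.

```python
import math

# Rounded 2010 CODATA values in SI units (assumed inputs for this sketch)
HBAR = 1.054571726e-34       # reduced Planck constant, J s
M_E = 9.10938291e-31         # electron mass, kg
E_CHARGE = 1.602176565e-19   # proton charge, C
EPS0 = 8.854187817e-12       # electrical permittivity of vacuum, F/m

bohr = 4 * math.pi * EPS0 * HBAR**2 / (M_E * E_CHARGE**2)   # unit of length, m
hartree = HBAR**2 / (M_E * bohr**2)                          # unit of energy, J
au_time = HBAR / hartree                                     # unit of time, s

if __name__ == "__main__":
    print(f"bohr    = {bohr:.5e} m")
    print(f"hartree = {hartree:.4e} J")
    print(f"time    = {au_time:.4e} s")
```

The results agree with the values listed under (C) to the number of digits quoted there.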
48 Douglas Rayner Hartree (1897–1958).
49 Johannes Robert Rydberg (1854–1919).
50 Niels Henrik David Bohr (1885–1965).
PROBLEM 2.4.2. Given the mass of the earth $M_e = 5.977 \times 10^{24}$ kg and its mean radius $R_e = 6371$ km, verify that the acceleration due to mean gravity at sea level at the equator is

$$g = G M_e R_e^{-2} = 9.780 \text{ m s}^{-2} \tag{2.4.6}$$

PROBLEM 2.4.3. Show that the relative gravitational potential energy at a height h above the earth’s surface is

$$U_{rel} = mgh \tag{2.4.7}$$

where $U_{rel} = 0$ at the earth’s surface. Use the Maclaurin series:

$$(1 + x)^{-1} = 1 - x + x^2 - x^3 + x^4 - \cdots \tag{2.4.8}$$
PROBLEM 2.4.4. Show that the escape velocity $v_{esc}$ from the earth’s gravitational field is $1.1 \times 10^{4}$ m s$^{-1}$. Given the necessary escape kinetic energy $(1/2)\,m v_{esc}^2 = (3/2)\,k_B T$, where $k_B = 1.3807 \times 10^{-23}$ J K$^{-1}$ atom$^{-1}$ is Boltzmann’s51 constant, which molecules, at an effective temperature of 30,000 K, can leak out from the earth’s atmosphere into space? Is this temperature reasonable?

PROBLEM 2.4.5. Show that the gravitational potential energy of an object of mass m at the earth’s surface is only 7% due to the earth, and 93% due to the sun (the earth–sun distance is 149,600,000 km; the mass of the sun is $1.985 \times 10^{30}$ kg). So why do we not fall off the earth and tumble toward the sun?

PROBLEM 2.4.6. What is the center of gravity in the earth–sun trajectory (sun–earth distance = $1.496 \times 10^{11}$ m; earth mass = $5.977 \times 10^{24}$ kg; sun mass = $1.985 \times 10^{30}$ kg)?

PROBLEM 2.4.7. If a satellite is to reach an orbit 100 km above the surface of the earth, what tangential velocity must it have as it enters the orbit? How long will it take to make one revolution around the earth (earth mass = $5.977 \times 10^{24}$ kg; earth radius = $6.371 \times 10^{6}$ m)?

Since forces are represented by vectors, we next review some properties of vectors, with particular applications to crystals. The position vector $\mathbf{r}$ is usually given in a Cartesian (orthogonal) system, but crystals are defined as symmetric objects with translational symmetry, with a fundamental or unit cell, which is the basic nonrepeating unit that is not necessarily orthogonal (see Sections 7.1 and 7.10). The unit cell has axes $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$ (measured in nm or in Å); the axes do form a right-handed system, and are, in general, the edges of an oblique parallelepiped (in the lowest symmetry system, the triclinic system, see Fig. 2.4); the angle between $\mathbf{a}$ and $\mathbf{b}$ is called $\gamma$; the angle between $\mathbf{b}$ and $\mathbf{c}$ is called $\alpha$, and the angle between $\mathbf{c}$ and $\mathbf{a}$ is called $\beta$. The Cartesian system is named after Descartes.52

51 Ludwig Boltzmann (1844–1906).
52 René Descartes (1596–1650).
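FIGURE 2.4 The general unit cell in a triclinic (lowest symmetry) crystal. The unit cell has sides a, b, c, angles $\alpha$, $\beta$, and $\gamma$, and volume V. (Labels in the figure: unit vectors $\mathbf{e}_a$, $\mathbf{e}_b$, $\mathbf{e}_c$; sides a, b, c; angles $\alpha$, $\beta$, $\gamma$.)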
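Sideline. Descartes, a late riser, died in Sweden, maybe of pneumonia, because Queen Christina, who had hired him, insisted that he teach her philosophy at dawn. The French broadsheet La Gazette d’Anvers announced Descartes’ death by “En Suède un sot vient de mourir qui disait qu’il pouvait vivre aussi longtemps qu’il voulait” (“In Sweden a fool has just died who said that he could live as long as he wanted”).

The dot or scalar inner product between vectors $\mathbf{a}$ and $\mathbf{b}$ is a scalar quantity, defined by

$$\mathbf{a} \cdot \mathbf{b} \equiv |\mathbf{a}||\mathbf{b}| \cos\gamma = \mathbf{b} \cdot \mathbf{a} \tag{2.4.11}$$

where the angle between the $\mathbf{a}$ and $\mathbf{b}$ axes is $\gamma$; similarly:

$$\mathbf{b} \cdot \mathbf{c} = |\mathbf{b}||\mathbf{c}| \cos\alpha = \mathbf{c} \cdot \mathbf{b} \tag{2.4.12}$$
$$\mathbf{c} \cdot \mathbf{a} = |\mathbf{c}||\mathbf{a}| \cos\beta = \mathbf{a} \cdot \mathbf{c} \tag{2.4.13}$$

One can also define the $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$ system in terms of any arbitrarily oriented Cartesian system:

$$\mathbf{a} \equiv a_x \mathbf{e}_x + a_y \mathbf{e}_y + a_z \mathbf{e}_z$$
$$\mathbf{b} \equiv b_x \mathbf{e}_x + b_y \mathbf{e}_y + b_z \mathbf{e}_z$$
$$\mathbf{c} \equiv c_x \mathbf{e}_x + c_y \mathbf{e}_y + c_z \mathbf{e}_z \tag{2.4.14}$$

but in crystallography it is customary to align either the $\mathbf{b}$ axis with $\mathbf{e}_y$ or the $\mathbf{c}$ axis with $\mathbf{e}_z$. The inner product can be defined in a space of any dimensions from two to infinity, and it obeys the distributive and commutative laws. Of course, if $\mathbf{a}$ and $\mathbf{b}$ are orthogonal, then the dot product $\mathbf{a} \cdot \mathbf{b}$ is zero; in the Cartesian system used above we have

$$\mathbf{e}_x \cdot \mathbf{e}_y = \mathbf{e}_y \cdot \mathbf{e}_z = \mathbf{e}_z \cdot \mathbf{e}_x = 0 \tag{2.4.14}$$

and the length, or norm, of the unit vectors is, of course, unity:

$$\mathbf{e}_x \cdot \mathbf{e}_x = \mathbf{e}_y \cdot \mathbf{e}_y = \mathbf{e}_z \cdot \mathbf{e}_z = 1 \tag{2.4.15}$$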
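The vector product or “cross” product (term coined by Gibbs53) is defined only in three-dimensional space: The vector product, or cross product, of vectors $\mathbf{a}$ and $\mathbf{b}$ is a vector $\mathbf{v}$, whose magnitude is $|\mathbf{a}||\mathbf{b}| \sin\gamma$, where $\gamma$ is the angle between $\mathbf{a}$ and $\mathbf{b}$, whose direction is perpendicular to both $\mathbf{a}$ and $\mathbf{b}$, and whose orientation is such that $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{v}$ form a right-handed system:

$$\mathbf{v} = \mathbf{a} \times \mathbf{b} \equiv \mathbf{e}_v |\mathbf{a}||\mathbf{b}| \sin\gamma \tag{2.4.16}$$

In a Cartesian coordinate system, by applying the definition of a cross product in this orthogonal system, the unit vectors $\mathbf{e}_x$, $\mathbf{e}_y$, $\mathbf{e}_z$ are related as follows:

$$\mathbf{e}_x \times \mathbf{e}_y = \mathbf{e}_z = -\mathbf{e}_y \times \mathbf{e}_x; \quad \mathbf{e}_y \times \mathbf{e}_z = \mathbf{e}_x = -\mathbf{e}_z \times \mathbf{e}_y; \quad \mathbf{e}_z \times \mathbf{e}_x = \mathbf{e}_y = -\mathbf{e}_x \times \mathbf{e}_z \tag{2.4.17}$$

while the cross product of any vector with itself vanishes:

$$\mathbf{e}_x \times \mathbf{e}_x = \mathbf{e}_y \times \mathbf{e}_y = \mathbf{e}_z \times \mathbf{e}_z = 0 \tag{2.4.18}$$

By using the distributive property of the vector product (Problem 2.4.8) and using Eq. (2.4.17):

$$\mathbf{v} = \mathbf{a} \times \mathbf{b} = (a_x \mathbf{e}_x + a_y \mathbf{e}_y + a_z \mathbf{e}_z) \times (b_x \mathbf{e}_x + b_y \mathbf{e}_y + b_z \mathbf{e}_z) = -\mathbf{b} \times \mathbf{a}$$
$$= \mathbf{e}_x (a_y b_z - a_z b_y) + \mathbf{e}_y (a_z b_x - a_x b_z) + \mathbf{e}_z (a_x b_y - a_y b_x) \tag{2.4.19}$$

and then remembering the properties of 3 × 3 determinants, one sees

$$\mathbf{v} = \mathbf{a} \times \mathbf{b} = \begin{vmatrix} \mathbf{e}_x & \mathbf{e}_y & \mathbf{e}_z \\ a_x & a_y & a_z \\ b_x & b_y & b_z \end{vmatrix} \tag{2.4.20}$$

PROBLEM 2.4.8. Verify the distributive property of the vector product:

$$\mathbf{A} \times (\mathbf{B} + \mathbf{C} + \mathbf{D} + \cdots) = \mathbf{A} \times \mathbf{B} + \mathbf{A} \times \mathbf{C} + \mathbf{A} \times \mathbf{D} + \cdots \tag{2.4.21}$$

PROBLEM 2.4.9. Prove

$$\mathbf{A} \times (\mathbf{B} \times \mathbf{C}) = \mathbf{B}(\mathbf{A} \cdot \mathbf{C}) - (\mathbf{A} \cdot \mathbf{B})\mathbf{C} \tag{2.4.22}$$

PROBLEM 2.4.10. Prove

$$(\mathbf{A} \times \mathbf{B}) \cdot (\mathbf{C} \times \mathbf{D}) = (\mathbf{A} \cdot \mathbf{C})(\mathbf{B} \cdot \mathbf{D}) - (\mathbf{B} \cdot \mathbf{C})(\mathbf{A} \cdot \mathbf{D}) \tag{2.4.23}$$

PROBLEM 2.4.11. Prove

$$(\mathbf{A} \times \mathbf{B}) \times (\mathbf{C} \times \mathbf{D}) = \mathbf{C}(\mathbf{D} \cdot \mathbf{A} \times \mathbf{B}) - \mathbf{D}(\mathbf{A} \times \mathbf{B} \cdot \mathbf{C}) \tag{2.4.24}$$

53 Josiah Willard Gibbs, Jr. (1839–1903).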
The cross product is anticommutative, that is, it changes sign when the factors are reversed. Indeed, the cross product is really a “pseudovector,” or “axial vector,” which has all the desirable properties of vectors, plus one undesirable one: A pseudovector is “married” to a right-handed system (in a left-handed system its magnitude is the same, but its sign changes). To differentiate them from pseudovectors, the “normal” vectors are also sometimes called “polar vectors.” The anticommutation is also implicit in the properties of a determinant. Geometrically, the magnitude of the vector product of $\mathbf{a}$ and $\mathbf{b}$ is the area A of the parallelogram with $\mathbf{a}$ and $\mathbf{b}$ as the sides:

$$A = |\mathbf{a} \times \mathbf{b}| = |\mathbf{a}||\mathbf{b}| \sin\gamma \tag{2.4.25}$$

We next calculate the volume V of the parallelepiped of Fig. 2.4: We need the projection of $\mathbf{c}$ onto the direction normal to both $\mathbf{a}$ and $\mathbf{b}$; this demands the dot product of $\mathbf{c}$ times the cross product of $\mathbf{a}$ and $\mathbf{b}$; indeed (in a cyclic fashion) it can be seen that

$$V = (\mathbf{a} \times \mathbf{b}) \cdot \mathbf{c} = (\mathbf{b} \times \mathbf{c}) \cdot \mathbf{a} = (\mathbf{c} \times \mathbf{a}) \cdot \mathbf{b} \tag{2.4.26}$$

and using the commutative properties of the scalar product and the anticommutative properties of the vector product one sees the cyclic permutation of vectors as giving a positive value for the volume, while any twofold permutation yields a negative volume (left-handed system!):

$$V = \mathbf{c} \cdot (\mathbf{a} \times \mathbf{b}) = \mathbf{a} \cdot (\mathbf{b} \times \mathbf{c}) = \mathbf{b} \cdot (\mathbf{c} \times \mathbf{a}) = -(\mathbf{b} \times \mathbf{a}) \cdot \mathbf{c} = -(\mathbf{c} \times \mathbf{b}) \cdot \mathbf{a} = -(\mathbf{a} \times \mathbf{c}) \cdot \mathbf{b} = -\mathbf{c} \cdot (\mathbf{b} \times \mathbf{a}) = -\mathbf{a} \cdot (\mathbf{c} \times \mathbf{b}) = -\mathbf{b} \cdot (\mathbf{a} \times \mathbf{c}) \tag{2.4.27}$$

V is also “married” to a right-handed system, so V should be called a pseudoscalar. To evaluate V, we need the projection of $\mathbf{c}$ onto the direction normal to both $\mathbf{a}$ and $\mathbf{b}$; this is not easy (see Problem 2.4.12). The result is

$$V = a\,b\,c \left[1 - \cos^2\alpha - \cos^2\beta - \cos^2\gamma + 2 \cos\alpha \cos\beta \cos\gamma\right]^{1/2} \tag{2.4.28}$$

An easier result to see from Eqs. (2.4.21) and (2.4.22) is

$$V = \mathbf{a} \cdot (\mathbf{b} \times \mathbf{c}) = \begin{vmatrix} a_x & a_y & a_z \\ b_x & b_y & b_z \\ c_x & c_y & c_z \end{vmatrix} \tag{2.4.29}$$

A detailed discussion of reciprocal space is given in Section 7.10.
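The two expressions for the cell volume, the closed formula in the cell lengths and angles and the scalar triple product of Cartesian axis vectors, can be compared numerically. The Python sketch below is an illustration under stated assumptions: it places a along x and b in the xy plane (one common but arbitrary orientation), and the function names and the sample triclinic cell (5, 6, 7 with angles 80°, 95°, 100°) are invented here for the check.

```python
import math


def cell_vectors(a, b, c, alpha, beta, gamma):
    """Cartesian axis vectors of a triclinic cell (angles in radians).

    Assumed orientation: a along x, b in the xy plane, c with positive z.
    """
    ca, cb, cg = math.cos(alpha), math.cos(beta), math.cos(gamma)
    sg = math.sin(gamma)
    va = (a, 0.0, 0.0)
    vb = (b * cg, b * sg, 0.0)
    cx = c * cb                              # fixes the angle beta between c and a
    cy = c * (ca - cb * cg) / sg             # fixes the angle alpha between b and c
    cz = math.sqrt(c * c - cx * cx - cy * cy)
    return va, vb, (cx, cy, cz)


def cross(u, v):
    """Cartesian components of the vector product u x v."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Scalar product u . v."""
    return sum(ui * vi for ui, vi in zip(u, v))


def volume_formula(a, b, c, alpha, beta, gamma):
    """Cell volume from lengths and angles (closed formula)."""
    ca, cb, cg = math.cos(alpha), math.cos(beta), math.cos(gamma)
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


if __name__ == "__main__":
    cell = (5.0, 6.0, 7.0, math.radians(80), math.radians(95), math.radians(100))
    va, vb, vc = cell_vectors(*cell)
    print(dot(va, cross(vb, vc)), volume_formula(*cell))
```

For an orthogonal cell (all angles 90°) both expressions reduce to the product abc.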
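PROBLEM 2.4.12. Prove Eq. (2.4.28).

“Ski Slopes,” “Hernias,” and Curls. In Cartesian space the “del” (or nabla, “Assyrian harp,” or atled, “backwards delta”) operator or function $\nabla$ is given by

$$\nabla \equiv \mathbf{e}_x (\partial/\partial x) + \mathbf{e}_y (\partial/\partial y) + \mathbf{e}_z (\partial/\partial z) \tag{2.4.30}$$

The grad $g \equiv \nabla g$ could also be called the “ski-slope” function: It shows the vector sum of the partial derivatives along three Cartesian axes for a scalar function g(x, y, z) and has the value of a vector:

$$\operatorname{grad} g(x, y, z) \equiv \nabla g \equiv \mathbf{e}_x (\partial g/\partial x) + \mathbf{e}_y (\partial g/\partial y) + \mathbf{e}_z (\partial g/\partial z) \tag{2.4.31}$$

When applied to a mountain, $\nabla g$ points along the line of steepest slope (uphill; $-\nabla g$ gives the direction of steepest descent for the most ardent and least risk-averse skier!). We could call the div $\mathbf{u} = \nabla \cdot \mathbf{u}$ function the “hernia function”: It shows how a vector function $\mathbf{u}(x, y, z)$ herniates, that is, sprouts in the x, y, and z directions at the point (x, y, z); in Cartesian space the div or “del dot” operator can be represented by

$$\operatorname{div} \mathbf{u} \equiv \nabla \cdot \mathbf{u} \equiv (\partial u_x/\partial x) + (\partial u_y/\partial y) + (\partial u_z/\partial z) \tag{2.4.32}$$

The curl $\mathbf{u}$, or $\nabla \times \mathbf{u}$, or rot $\mathbf{u}$, or “curly function” describes how “curly” a vector field (array) of vectors $\mathbf{u}$ is. Verify that in a Cartesian system the pseudovector:

$$\operatorname{curl} \mathbf{u} \equiv \operatorname{rot} \mathbf{u} \equiv \nabla \times \mathbf{u} \equiv \begin{vmatrix} \mathbf{e}_x & \mathbf{e}_y & \mathbf{e}_z \\ \partial/\partial x & \partial/\partial y & \partial/\partial z \\ u_x & u_y & u_z \end{vmatrix} = \mathbf{e}_x (\partial u_z/\partial y - \partial u_y/\partial z) + \mathbf{e}_y (\partial u_x/\partial z - \partial u_z/\partial x) + \mathbf{e}_z (\partial u_y/\partial x - \partial u_x/\partial y) \tag{2.4.33}$$

As defined above, $\nabla \times \mathbf{u}$ is a pseudovector. Incidentally, consider the face of a hairy monkey, or the face of an ape-man: if it is taken to be spherical, it can be covered by facial hair everywhere, but the topology of putting hair all over a sphere requires that there be at least two whorls, or points on the sphere where the hair density is zero, and the curl of the hair is maximized.

The Laplacian54, or del-squared, operator is $\nabla \cdot \nabla = \nabla^2$ (sometimes shown as $\Delta$ to further confuse the poor reader); $\nabla \cdot \nabla$ is given in Cartesian space as

$$\operatorname{div} \operatorname{grad} \equiv \nabla \cdot \nabla = \nabla^2 = \partial^2/\partial x^2 + \partial^2/\partial y^2 + \partial^2/\partial z^2 \tag{2.4.34}$$

54 Pierre-Simon, Marquis de Laplace (1749–1827).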
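PROBLEM 2.4.13. Verify the following identity, valid for any vector $\mathbf{v}$:

$$\nabla \times (\nabla \times \mathbf{v}) = \nabla(\nabla \cdot \mathbf{v}) - (\nabla \cdot \nabla)\mathbf{v} = \nabla(\nabla \cdot \mathbf{v}) - \nabla^2 \mathbf{v} \tag{2.4.35}$$

or, in a different notation:

$$\operatorname{curl} \operatorname{curl} \mathbf{v} = \operatorname{grad}(\operatorname{div} \mathbf{v}) - (\operatorname{div} \operatorname{grad})\,\mathbf{v} \tag{2.4.36}$$

PROBLEM 2.4.14. Calculate the volume of a tetrahedron of sides $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$ (where $|\mathbf{a}| = |\mathbf{b}| = |\mathbf{c}|$).

PROBLEM 2.4.15. Prove that $\nabla \cdot \nabla \times \mathbf{U} = 0$ for any vector $\mathbf{U}$.

Tensors. While a scalar (a pure number, real or complex) p merely multiplies the length of a vector $\mathbf{V}$:

$$\mathbf{V}' = p\mathbf{V} \tag{2.4.37}$$

(i.e., multiplies all three components of $\mathbf{V}$ by p), there are 3 × 3 tensors $\boldsymbol{\alpha}$ that can also rotate vectors: they are known as tensors of rank two:

$$\mathbf{V}'' \equiv \boldsymbol{\alpha}\mathbf{V} \tag{2.4.37}$$

These tensors are square matrices. If we are dealing with a (column) vector $\mathbf{V}$ (or tensor of rank one) in three-dimensional space, then the tensor $\boldsymbol{\alpha}$ of rank 2 will have, in general, 3 × 3 = 9 independent elements:

$$\boldsymbol{\alpha} \equiv \begin{pmatrix} \alpha_{11} & \alpha_{12} & \alpha_{13} \\ \alpha_{21} & \alpha_{22} & \alpha_{23} \\ \alpha_{31} & \alpha_{32} & \alpha_{33} \end{pmatrix} \tag{2.4.38}$$

where each tensor element $\alpha_{ij}$ is a scalar (real or complex). If a tensor $\boldsymbol{\beta}$ is symmetric, that is, if $\beta_{ij} = \beta_{ji}$, then this tensor has only six independent elements:

$$\boldsymbol{\beta} \equiv \begin{pmatrix} \beta_{11} & \beta_{12} & \beta_{13} \\ \beta_{12} & \beta_{22} & \beta_{23} \\ \beta_{13} & \beta_{23} & \beta_{33} \end{pmatrix} \tag{2.4.39}$$

A tensor $\boldsymbol{\gamma}$ is Hermitian55 if its off-diagonal elements are the complex conjugates of the corresponding elements across the main diagonal, $\gamma_{ij} = \gamma_{ji}^{*}$ (this Hermitian condition is written in matrix form as $\boldsymbol{\gamma} = \boldsymbol{\gamma}^{\dagger}$); a tensor $\boldsymbol{\delta}$ is unitary if the inverse of the matrix is equal to its Hermitian conjugate: $\boldsymbol{\delta} = (\boldsymbol{\delta}^{\dagger})^{-1}$.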
55 Charles Hermite (1822–1901).

The trace Tr of a square tensor is the scalar sum of its diagonal elements:

$$\operatorname{Tr}(\boldsymbol{\alpha}) \equiv \sum_i \alpha_{ii} = \{\text{if } \boldsymbol{\alpha} \text{ is a } 3 \times 3 \text{ tensor}\} = \alpha_{11} + \alpha_{22} + \alpha_{33} \tag{2.4.40}$$
Of course, n × n tensors of rank 2 can be defined for dimensions n > 3: They occur frequently in four-dimensional special and general relativity theories.

Determinants. To each square matrix a of dimension n we can associate a determinant det a as follows:

$$\det a = \sum_{(j_1, j_2, \ldots, j_n)} (-1)^h a_{1j_1} a_{2j_2} a_{3j_3} \cdots a_{nj_n} = \sum_{i=1}^{n} a_{ik} A_{ik} \quad (k = 1, 2, \ldots, n) \tag{2.4.41}$$

where the indices $(j_1, j_2, \ldots, j_n)$ are permutations of the “natural” or perfectly sequential order of the first n integers (1, 2, 3, . . ., n) and h is the number of twofold exchanges of any two elements needed to recreate this “natural” order; the cofactor or minor $A_{ik}$ of any element $a_{ik}$ is defined as the subdeterminant created by eliminating the ith row and kth column of the original determinant, multiplied by $(-1)^{i+k}$. A determinant can be computed as a sum of the minors of any one given row (or column) (choose one):

$$\det a = \sum_{i=1}^{n} a_{ik} (-1)^{i+k} |A_{ik}| \quad (k = 1, 2, \ldots, n) \tag{2.4.42}$$

where the operator | | also means “determinant,” and $|A_{ik}|$ here denotes the bare subdeterminant, with the sign factor written out explicitly. Det a = 0 if (1) all elements of any row are zero, or (2) all elements of any column are zero, or (3) any two rows have equal elements in the same order, or (4) any two columns have equal elements in the same order. The evaluation of a determinant with n ≥ 4 rows and columns is laborious. For 3 × 3 determinants, Sarrus’56 rule is simple: Repeat and append the first two columns at the end, and multiply diagonally down three times with a +1 prefactor ($a_{11}a_{22}a_{33}$, then $a_{12}a_{23}a_{31}$, then $a_{13}a_{21}a_{32}$), then multiply diagonally up three times with a −1 prefactor ($a_{31}a_{22}a_{13}$, then $a_{32}a_{23}a_{11}$, then $a_{33}a_{21}a_{12}$). For example:

$$\begin{vmatrix} 5 & 7 & 9 \\ 3 & 2 & 14 \\ 12 & 6 & 2 \end{vmatrix} = (+1)\,5 \cdot 2 \cdot 2 + (+1)\,7 \cdot 14 \cdot 12 + (+1)\,9 \cdot 3 \cdot 6 + (-1)\,12 \cdot 2 \cdot 9 + (-1)\,6 \cdot 14 \cdot 5 + (-1)\,2 \cdot 3 \cdot 7 = 20 + 1176 + 162 - 216 - 420 - 42 = 680$$

A matrix A is singular if det A = 0; conversely, A is nonsingular if det A ≠ 0.

56 Pierre Frédéric Sarrus (1798–1861).
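Sarrus' rule translates directly into a few lines of Python. The sketch below is only an illustration (the function name `det3_sarrus` is a choice made here); the sample matrix is the one implied by the six triple products in the worked example above, with rows (5, 7, 9), (3, 2, 14), and (12, 6, 2).

```python
def det3_sarrus(m):
    """Determinant of a 3x3 matrix, given as a list of three rows, by Sarrus' rule."""
    (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = m
    down = a11 * a22 * a33 + a12 * a23 * a31 + a13 * a21 * a32   # diagonals with +1 prefactor
    up = a31 * a22 * a13 + a32 * a23 * a11 + a33 * a21 * a12     # diagonals with -1 prefactor
    return down - up


example = [[5, 7, 9],
           [3, 2, 14],
           [12, 6, 2]]

if __name__ == "__main__":
    print(det3_sarrus(example))   # 1358 - 678 = 680
```

The rule works only for 3 × 3 determinants; it does not generalize to larger matrices, for which the cofactor expansion or a numerical routine is needed.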
If the matrix A is symmetric, Hermitian, or unitary, then there is a system of $3 \times 3$ rotation matrices R (and their inverses $R^{-1}$) which will rotate the matrix elements $A_{ij}$ so that the only nonzero elements will appear on the diagonal; this is known as a similarity transformation or as a principal-axis transformation or diagonalization:
$$A_{\mathrm{diag}} \equiv R^{-1} A R$$  (2.4.43)
This transformation is crucial when some measured $3 \times 3$ matrix has nine experimental values, and a coordinate system is sought which will highlight the physically significant components of this matrix, which may be as few as three after the appropriate similarity transformation into the “right” coordinate system. Written out in full, the relationship between a matrix and its inverse is $R R^{-1} = R^{-1} R = E$, where E is the unit matrix:
$$E \equiv \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$  (2.4.44)
The problem of finding the rotation matrix that will “diagonalize” some symmetric, Hermitian, or unitary matrix A can be recast as an eigenvalue–eigenvector problem: We seek the characteristic solutions to the problem
$$K \equiv A - \lambda E$$  (2.4.45)
where $\lambda$ is one of $n$ scalar (real or complex) “characteristic values” or “eigenvalues,” and K is the characteristic matrix of A (a matrix constructed by adjoining all column vectors to each other). In particular, the scalar $n$th-degree equation:
$$K(\lambda) \equiv \det K = \det|A - \lambda E| = 0$$  (2.4.46)
has $n$ solutions (the $n$ eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$), so that $K(\lambda)$ can be rewritten as
$$K(\lambda) = (\lambda - \lambda_1)(\lambda - \lambda_2)(\lambda - \lambda_3) \cdots (\lambda - \lambda_n) = 0$$
The determinant may factor naturally: this is rare. For $n = 2$, 3, or 4 one seeks the roots of a quadratic, cubic, or quartic polynomial equation in $\lambda$, or one must resort to numerical methods. In the case $n = 7$, the eigenvalue matrix $\Lambda \equiv R^{-1} A R$ is the diagonal matrix:
36
2
PA R T I C L E S , F O R C E S , A N D M A T H E M A T I C A L M E T H O D S
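$$\Lambda \equiv \begin{pmatrix}
\lambda_1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & \lambda_2 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & \lambda_3 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & \lambda_4 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \lambda_5 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & \lambda_6 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & \lambda_7
\end{pmatrix}$$  (2.4.47)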
The problem of finding the $n$ eigenvalues ($\lambda_i$, $i = 1, 2, \ldots, n$) and the $n$ corresponding eigenvectors x:
$$A x = \lambda x$$  (2.4.48)
is thus equivalent to diagonalizing the matrix A. The $n \times n$ eigenvalue–eigenvector equation or secular equation (so-called because astronomers used it to describe motions of distant planets tracked over the course of several centuries, or saecula) is given by
$$(A - \lambda E)\, x = O$$  (2.4.49)
where E is the $n \times n$ unit matrix and O is the $n \times 1$ null vector. Its solution x has the eigenvector components
$$x_i = (-1)^{i+j}\, |A - \lambda_j E|_{ij}; \qquad i = 1, 2, 3; \quad j = 1, 2, \ldots, n$$  (2.4.50)
where $|\;|$ means “determinant”; and the $A_{ij}$ are the $(n-1) \times (n-1)$ “minors” of the original $n \times n$ determinant $|A - \lambda E|$, obtained by eliminating the $i$th row and $j$th column of the original determinant. The eigenvector coefficients are then rescaled by normalization. If the eigenvalues are not degenerate, then the eigenvectors are automatically mutually orthogonal; if the eigenvalues are degenerate, then the eigenvectors can be made to be orthogonal. The $n \times n$ rotation matrix R is constructed by adjoining the $n$ eigenvectors (each an $n \times 1$ column) side by side. The inverse rotation matrix $R^{-1}$ is the transpose of R. The transformation of A into $\Lambda$ is known as a similarity transformation:
$$R^{-1} A R = \Lambda$$  (2.4.51)
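For matrices larger than about 3 × 3 the secular determinant is rarely solved by hand. As a hedged illustration, the sketch below diagonalizes a real symmetric matrix numerically by Jacobi plane rotations; this is a different technique from the characteristic-polynomial route described above, swapped in because it is short and needs only the standard library. The function name `jacobi_eigen` and the small test matrix are assumptions made for the example.

```python
import math


def jacobi_eigen(a, sweeps=50, tol=1e-12):
    """Diagonalize a real symmetric matrix by cyclic Jacobi rotations.

    Returns (eigenvalues, R); the columns of R are the eigenvectors,
    so that R^T A R is (numerically) diagonal.
    """
    n = len(a)
    a = [list(map(float, row)) for row in a]          # work on a copy
    r = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:                                  # off-diagonal part is gone
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # angle (|theta| <= pi/4) that zeroes the element a[p][q]
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):                     # A <- A J (columns p and q)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):                     # A <- J^T A (rows p and q)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):                     # accumulate R <- R J
                    rkp, rkq = r[k][p], r[k][q]
                    r[k][p], r[k][q] = c * rkp - s * rkq, s * rkp + c * rkq
    return [a[i][i] for i in range(n)], r


A = [[2.0, 0.0, 0.0],
     [0.0, 3.0, 4.0],
     [0.0, 4.0, 9.0]]
eigenvalues, R = jacobi_eigen(A)
print(sorted(eigenvalues))                        # close to [1, 2, 11]
print(sum(eigenvalues), A[0][0] + A[1][1] + A[2][2])   # the trace is preserved
```

Because every step is an orthogonal rotation, the accumulated R satisfies $R^{-1} = R^{T}$, and the sum of the eigenvalues equals the trace of the starting matrix.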
It is important to realize that the trace of the matrix A (the sum of the diagonal terms) is invariant under a similarity transformation.
PROBLEM 2.4.16. Invert the symmetric matrix:
$$A = \begin{pmatrix} 21 & 3 & 6 \\ 3 & 14 & 7 \\ 6 & 7 & 12 \end{pmatrix}$$
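PROBLEM 2.4.17. Diagonalize the $3 \times 3$ symmetric real matrix:
$$A = \begin{pmatrix} 21 & 3 & 6 \\ 3 & 14 & 7 \\ 6 & 7 & 12 \end{pmatrix}$$
by first solving the determinantal equation:
$$\det \begin{vmatrix} 21 - \lambda & 3 & 6 \\ 3 & 14 - \lambda & 7 \\ 6 & 7 & 12 - \lambda \end{vmatrix} = 0$$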
Finally, check that the trace of A is equal to the trace of $\Lambda$.
PROBLEM 2.4.18. Find the eigenvectors for the matrix:
$$A = \begin{pmatrix} 21 & 3 & 6 \\ 3 & 14 & 7 \\ 6 & 7 & 12 \end{pmatrix}$$
which in Problem 2.4.17 was transformed into the diagonal eigenvalue matrix:
$$\Lambda = \begin{pmatrix} 24.032324397377 & 0 & 0 \\ 0 & 19.474154527374 & 0 \\ 0 & 0 & 3.493521075249 \end{pmatrix}$$
PROBLEM 2.4.19. What is the inverse of the rotation matrix R
$$R = \begin{pmatrix} 0.875212771955 & 0.333162753797 & 0.350720950142 \\ 0.072043610084 & 0.80670936955 & 0.586540460086 \\ 0.478343311470 & 0.48808049803 & 0.730044590288 \end{pmatrix}$$
which was computed in Problem 2.4.17?
PROBLEM 2.4.20. We need to learn how to rotate coordinate systems. In the xy plane a counterclockwise rotation by an angle $\phi$ causes a rotation from vectors x, y to new vectors $x'$, $y'$:
$$x' = x \cos\phi + y \sin\phi$$  (2.4.52)
$$y' = -x \sin\phi + y \cos\phi$$  (2.4.53)
which can be represented as a column vector $X = (x\ y)$ rotated to a new column vector $X' = (x'\ y')$ by premultiplication by the square rotation matrix $A = (a_{ij})$:
$$\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$$  (2.4.54)
$$X' = A X$$  (2.4.55)
Since the coordinate system is simply rotated, not shrunk or expanded in any way, the matrix A must be orthonormal:
$$\sum_{i} a_{ij}\, a_{ik} = \delta_{jk} \qquad (j, k = 1, 2)$$  (2.4.56)
To do a full rotation in three-dimensional space, Eulerian rotation angles can be defined, as three successive rotations, each in two dimensions. The first rotation (rotation matrix A) is a counterclockwise rotation by $\alpha$ degrees about the z axis: It leaves the z axis unchanged ($z' = z$), but it rotates the axis x to $x'$ by $\alpha$ degrees and rotates the axis y to $y'$ by $\alpha$ degrees.
$$X' = A X$$  (2.4.57)
The second rotation (rotation matrix B) is a counterclockwise rotation by $\beta$ degrees about the axis $x'$: It leaves the $x'$ axis unchanged ($x'' = x'$), but it rotates $y'$ to $y''$ by $\beta$ degrees and rotates $z'$ to $z''$ by $\beta$ degrees:
$$X'' = B X'$$  (2.4.58)
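The third rotation (rotation matrix C) is a counterclockwise rotation by $\gamma$ degrees about the axis $z''$: It leaves the $z''$ axis unchanged ($z''' = z''$), but it rotates $x''$ to $x'''$ by $\gamma$ degrees and rotates $y''$ to $y'''$ by $\gamma$ degrees: $X''' = C X''$. Show that the rotation matrices are
$$A = \begin{pmatrix} \cos\alpha & \sin\alpha & 0 \\ -\sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{pmatrix}; \quad
B = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\beta & \sin\beta \\ 0 & -\sin\beta & \cos\beta \end{pmatrix}; \quad
C = \begin{pmatrix} \cos\gamma & \sin\gamma & 0 \\ -\sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{pmatrix}$$  (2.4.59)
Show also that the overall Eulerian rotation matrix $E = C B A$ (the rotations act on X in the order A, then B, then C, so that $X''' = C B A X$) is given by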
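$$E = \begin{pmatrix}
\cos\gamma\cos\alpha - \sin\gamma\sin\alpha\cos\beta & \cos\gamma\sin\alpha + \sin\gamma\cos\alpha\cos\beta & \sin\beta\sin\gamma \\
-\sin\gamma\cos\alpha - \cos\gamma\sin\alpha\cos\beta & -\sin\gamma\sin\alpha + \cos\gamma\cos\alpha\cos\beta & \sin\beta\cos\gamma \\
\sin\alpha\sin\beta & -\cos\alpha\sin\beta & \cos\beta
\end{pmatrix}$$  (2.4.60)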
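Show also that the inverse matrix $E^{-1}$ is equal to the transpose matrix $E^{T}$ (where rows and columns are interchanged):
$$E^{-1} = \begin{pmatrix}
\cos\gamma\cos\alpha - \sin\gamma\sin\alpha\cos\beta & -\sin\gamma\cos\alpha - \cos\gamma\sin\alpha\cos\beta & \sin\alpha\sin\beta \\
\cos\gamma\sin\alpha + \sin\gamma\cos\alpha\cos\beta & -\sin\gamma\sin\alpha + \cos\gamma\cos\alpha\cos\beta & -\cos\alpha\sin\beta \\
\sin\beta\sin\gamma & \sin\beta\cos\gamma & \cos\beta
\end{pmatrix}$$  (2.4.61)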
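The two “show that” statements above can be checked numerically before they are proved by hand. The sketch below is only an illustration: it assumes the usual sign convention for the three plane rotations (with $-\sin$ below the diagonal in A and C, and in the last row of B), works in radians rather than degrees, and composes the rotations in the order in which they are applied to X (A first, then B, then C), that is, as the matrix product C B A. The helper names are hypothetical.

```python
import math


def matmul(x, y):
    """Product of two 3 x 3 matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(x):
    return [[x[j][i] for j in range(3)] for i in range(3)]


def rot_z(angle):
    """Plane rotation about the current z axis (form of A and C)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_x(angle):
    """Plane rotation about the current x axis (form of B)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]


def euler_product(alpha, beta, gamma):
    """X''' = C (B (A X)): A acts first, so the overall matrix is C B A."""
    return matmul(rot_z(gamma), matmul(rot_x(beta), rot_z(alpha)))


def euler_closed_form(alpha, beta, gamma):
    """The nine entries of the overall Eulerian matrix, typed out one by one."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    return [
        [cg * ca - sg * sa * cb, cg * sa + sg * ca * cb, sb * sg],
        [-sg * ca - cg * sa * cb, -sg * sa + cg * ca * cb, sb * cg],
        [sa * sb, -ca * sb, cb],
    ]


def max_abs_diff(x, y):
    return max(abs(x[i][j] - y[i][j]) for i in range(3) for j in range(3))


angles = (0.3, 1.1, -0.7)                 # arbitrary test angles, in radians
E = euler_product(*angles)
identity = [[float(i == j) for j in range(3)] for i in range(3)]
print(max_abs_diff(E, euler_closed_form(*angles)))       # product vs closed form
print(max_abs_diff(matmul(E, transpose(E)), identity))   # E E^T vs unit matrix
```

Both printed numbers should be at the level of floating-point rounding, which is consistent with $E^{-1} = E^{T}$.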
PROBLEM 2.4.21. If the Eulerian matrix E given by Eq. (2.4.60) must equal the rotation matrix R determined in Problem 2.4.20 above:
$$R = \begin{pmatrix} 0.875212771955 & 0.333162753797 & 0.350720950142 \\ 0.072043610084 & 0.80670936955 & 0.586540460086 \\ 0.478343311470 & 0.48808049803 & 0.730044590288 \end{pmatrix}$$
compute the relevant Eulerian angles $\alpha$, $\beta$, and $\gamma$. This is useful, for example, when the principal axes of a physically measured tensor in a crystal must be oriented relative to laboratory Cartesian axes.
When we rotate a contravariant $n \times 1$ column vector (for position, velocity, momentum, electric field, etc.) we premultiply it by an $n \times n$ rotation tensor R. When, instead, we transform the coordinate system in which such vectors are defined, then the coordinate system and, for example, the $\nabla$ operator are covariant $1 \times n$ row vectors, which are transformed by the tensor $R^{-1}$ that is the reciprocal of R. A “dot product” or inner product $a \cdot b$ must be the multiplication of a row vector a by a column vector b, to give a single number (scalar) as the result. This will be expanded further in the discussion of special relativity (Section 2.13) and of crystal symmetry (Section 7.10).
Spherical Trigonometry. Spherical trigonometry is essential for terrestrial and celestial navigation (but not really for the physics and chemistry discussed in this book). Nevertheless, its brief presentation here should help scientific travelers estimate their travel distance and appreciate how the angles we all learned in plane trigonometry are quite different on a spherical surface. In a spherical triangle (a triangle localized on the surface of a sphere), angles A, B, C as well as sides a, b, c are measured in radians (Fig. 2.5) [10]. Here is a summary of the properties of spherical triangles:
(1) Any angle A, B, or C must be less than 180°;
(2) $a + b + c < 360°$;
(3) any side is less than the sum of the other two sides;
(4) $180° < A + B + C < 540°$;
(5) the sine law is
$$\sin A / \sin a = \sin B / \sin b = \sin C / \sin c$$  (2.4.62)
(6) the cosine law for the sides is
$$\cos a = \cos b \cos c + \sin b \sin c \cos A$$
$$\cos b = \cos c \cos a + \sin c \sin a \cos B$$  (2.4.63)
$$\cos c = \cos a \cos b + \sin a \sin b \cos C$$
FIGURE 2.5 (Left) Spherical triangle with sides a, b, c and opposite angles A, B, and C. (Right) Computation of orthodrome between San Francisco (SFO) and Tokyo (TYO), and plane triangle ABD. [Labels in the right panel: A, SFO; B, TYO; C, North pole (90°N); D, the point at the latitude of TYO and the longitude of SFO; b, segment of meridian; c, orthodrome.]
(7) the cosine law for the angles is
$$\cos A = -\cos B \cos C + \sin B \sin C \cos a$$
$$\cos B = -\cos C \cos A + \sin C \sin A \cos b$$  (2.4.64)
$$\cos C = -\cos A \cos B + \sin A \sin B \cos c$$
(8) the half-angle formulas are
$$\sin(A/2) = [\sin(s-b)\sin(s-c) / \sin b \sin c]^{1/2}$$
$$\sin(B/2) = [\sin(s-c)\sin(s-a) / \sin c \sin a]^{1/2}$$  (2.4.65)
$$\sin(C/2) = [\sin(s-a)\sin(s-b) / \sin a \sin b]^{1/2}$$
where $s \equiv (1/2)(a + b + c)$, and
$$\tan(A/2) = p / \sin(s-a)$$
$$\tan(B/2) = p / \sin(s-b)$$  (2.4.66)
$$\tan(C/2) = p / \sin(s-c)$$
where $p \equiv [\sin(s-a)\sin(s-b)\sin(s-c) / \sin s]^{1/2}$;
(9) the four analogies of Napier:
$$\sin[(A-B)/2] / \sin[(A+B)/2] = \tan[(a-b)/2] / \tan(c/2)$$
$$\cos[(A-B)/2] / \cos[(A+B)/2] = \tan[(a+b)/2] / \tan(c/2)$$  (2.4.67)
$$\sin[(a-b)/2] / \sin[(a+b)/2] = \tan[(A-B)/2] / \cot(C/2)$$
$$\cos[(a-b)/2] / \cos[(a+b)/2] = \tan[(A+B)/2] / \cot(C/2)$$
and cyclically:
$$\sin[(B-C)/2] / \sin[(B+C)/2] = \tan[(b-c)/2] / \tan(a/2)$$
$$\cos[(B-C)/2] / \cos[(B+C)/2] = \tan[(b+c)/2] / \tan(a/2)$$
$$\sin[(b-c)/2] / \sin[(b+c)/2] = \tan[(B-C)/2] / \cot(A/2)$$
$$\cos[(b-c)/2] / \cos[(b+c)/2] = \tan[(B+C)/2] / \cot(A/2)$$  (2.4.68)
$$\sin[(C-A)/2] / \sin[(C+A)/2] = \tan[(c-a)/2] / \tan(b/2)$$
$$\cos[(C-A)/2] / \cos[(C+A)/2] = \tan[(c+a)/2] / \tan(b/2)$$
$$\sin[(c-a)/2] / \sin[(c+a)/2] = \tan[(C-A)/2] / \cot(B/2)$$
$$\cos[(c-a)/2] / \cos[(c+a)/2] = \tan[(C+A)/2] / \cot(B/2)$$
(10) Gauss’s formula:
$$\sin[(A-B)/2] = \sin[(a-b)/2] \cos(C/2) / \sin(c/2)$$  (2.4.69)
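As a hedged illustration of how these laws are used, the sketch below solves a spherical triangle from two sides and the included angle using only the cosine law for the sides, and applies the same law to the polar triangle formed by two places and the North pole to get a great-circle distance. The function names, the convention that east longitudes are positive, and the default circumference of 40 million meters are assumptions made for the example.

```python
import math


def _acos(x):
    """acos with the argument clamped to [-1, 1] against rounding error."""
    return math.acos(max(-1.0, min(1.0, x)))


def third_side(a, c, B):
    """Side b opposite the included angle B (all in radians), cosine law for the sides."""
    return _acos(math.cos(c) * math.cos(a) + math.sin(c) * math.sin(a) * math.cos(B))


def solve_sas(a_deg, c_deg, B_deg):
    """Given sides a, c and included angle B (degrees), return (b, A, C) in degrees.

    Assumes a non-degenerate triangle (no side equal to 0 or 180 degrees).
    """
    a, c, B = (math.radians(v) for v in (a_deg, c_deg, B_deg))
    b = third_side(a, c, B)
    # the same cosine law, now solved for the two remaining angles
    A = _acos((math.cos(a) - math.cos(b) * math.cos(c)) / (math.sin(b) * math.sin(c)))
    C = _acos((math.cos(c) - math.cos(a) * math.cos(b)) / (math.sin(a) * math.sin(b)))
    return tuple(math.degrees(v) for v in (b, A, C))


def orthodrome(lat1, lon1, lat2, lon2, circumference=4.0e7):
    """Great-circle distance between two points given in degrees.

    The two co-latitudes are the known sides of the polar triangle and the
    longitude difference is the included angle at the pole.
    """
    dlon = abs(lon2 - lon1) % 360.0
    if dlon > 180.0:
        dlon = 360.0 - dlon
    arc = third_side(math.radians(90.0 - lat1), math.radians(90.0 - lat2),
                     math.radians(dlon))
    return arc / (2.0 * math.pi) * circumference


# an octant of the sphere: three sides and three angles of 90 degrees each
print(solve_sas(90.0, 90.0, 90.0))
# pole to equator is one quarter of the circumference
print(orthodrome(90.0, 0.0, 0.0, 0.0))
```

The octant example also illustrates the remark that a spherical triangle can contain three right angles.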
Note that one, two, or three right angles may coexist in the same right spherical triangle!
The orthodrome between points A and B on the surface of a geoid or earth is the shortest distance between A and B on this surface; the orthodrome is a segment of a great circle (e.g., a meridian) passing through both A and B. A loxodrome (from the Greek loxos for slanted and dromos for course) is a path on the earth’s surface that is followed when a compass is kept pointing in the same direction: It is a straight line on a Mercator$^{57}$ projection of the globe, precisely because such a projection is designed to have the property that all paths along the earth’s surface that preserve the same directional bearing appear as straight lines. Nunes$^{58}$ thought that the loxodrome was the shortest distance between two points on a sphere (he was wrong). Many centuries ago, it was difficult for a ship’s navigator to follow a great circle, because this required constant changes of compass heading. The solution was to follow a loxodrome, also known as a rhumb line, by navigating along a constant direction. In middle latitudes, at least, this didn’t lengthen the journey unduly. If a loxodrome is continued indefinitely around a sphere, it will produce a spherical spiral, or a logarithmic spiral on a polar projection. The distance between two points, measured along a loxodrome, is simply the absolute value of the secant of the bearing (azimuth) times the north–south distance (except for circles of latitude). The airports of Helsinki, Finland and Anchorage, Alaska are almost on the same parallel, so one can fly due west from Helsinki and get comfortably close to Anchorage, but it surely is not the shortest path (the loxodrome distance would be 9493 km, while the orthodrome, the shortest distance, is 6520 km). The loxodrome spirals from one pole to the other, with an angle setting equal to the “compass setting.” Close to the poles, the loxodromes closely resemble logarithmic spirals. The total length of the loxodrome from the N pole to the S pole is, assuming a perfect sphere, the length of the meridian divided by the cosine of the bearing away from true north. On a sphere that has coordinates $\phi$ (latitude), $\lambda$ (longitude), and $\alpha$ (azimuth), the equation of a loxodrome is
$$\mathrm{gd}^{-1}(\phi) = \cosh^{-1}(\sec\phi) = \ln[\sec\phi\,(1 + \sin\phi)]$$  (2.4.70)
$$\lambda = \tan\alpha\; \mathrm{gd}^{-1}(\phi) + \lambda_0$$  (2.4.71)
where $\mathrm{arcgd}(\phi) \equiv \mathrm{gd}^{-1}(\phi)$ is the inverse Gudermannian$^{59}$ function, and $\lambda_0$ is the longitude where the loxodrome passes the equator. Here the Gudermannian function is $\mathrm{gd}\, z \equiv 2\tan^{-1}(\exp(z)) - \pi/2$, while the inverse Gudermannian function is given by $\mathrm{gd}^{-1}(z) \equiv \ln[\tan(\pi/4 + z/2)] = \int_{0}^{z} \sec z'\, dz' = \ln(\sec z + \tan z)$. Another way of defining a loxodrome is
$$\tan\alpha = \frac{\lambda_2 - \lambda_1}{\cosh^{-1}(\sec\phi_2) - \cosh^{-1}(\sec\phi_1)} = \frac{\lambda_2 - \lambda_1}{\int_{\phi=\phi_1}^{\phi=\phi_2} \sec\phi\, d\phi}$$  (2.4.72)
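The loxodrome relations can be tried out numerically. The following is a minimal sketch, assuming angles in degrees at the interface, east longitudes taken as positive, and a spherical earth; the function names are illustrative. It codes the Gudermannian function and its inverse, the constant bearing of Eq. (2.4.72), and the “secant of the bearing times the north–south distance” rule for the length.

```python
import math


def gd(z):
    """Gudermannian function: gd z = 2 arctan(exp z) - pi/2."""
    return 2.0 * math.atan(math.exp(z)) - math.pi / 2.0


def inv_gd(phi):
    """Inverse Gudermannian, ln[tan(pi/4 + phi/2)], for |phi| < pi/2 (radians)."""
    return math.log(math.tan(math.pi / 4.0 + phi / 2.0))


def loxodrome_azimuth(lat1, lon1, lat2, lon2):
    """Constant bearing (degrees from true north) joining two points.

    tan(alpha) = (lambda2 - lambda1) / [gd^-1(phi2) - gd^-1(phi1)].
    The longitude difference is used as given; no attempt is made to pick
    the shorter way around the globe.
    """
    dlon = math.radians(lon2 - lon1)
    dpsi = inv_gd(math.radians(lat2)) - inv_gd(math.radians(lat1))
    return math.degrees(math.atan2(dlon, dpsi))


def loxodrome_length(lat1, lat2, azimuth_deg, circumference=4.0e7):
    """|sec(bearing)| times the north-south distance (not valid along a parallel)."""
    north_south = abs(lat2 - lat1) / 360.0 * circumference
    return north_south / abs(math.cos(math.radians(azimuth_deg)))


phi = 0.5   # radians
print(inv_gd(phi), math.log(1.0 / math.cos(phi) + math.tan(phi)))  # two equal forms
print(gd(inv_gd(phi)))                                             # recovers phi
print(loxodrome_azimuth(10.0, 20.0, 30.0, 20.0))                   # due north: 0.0
```

Using `inv_gd` rather than $\cosh^{-1}(\sec\phi)$ keeps the sign of the latitude, so the same code works in the southern hemisphere.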
The celestial sphere is a fixed sphere of infinite radius, concentric with the center of the earth. The celestial North and South poles (PN and PS) are an extension of the earth’s North and South poles to infinity. The celestial equator is the great circle whose poles are PN and PS. The (local) zenith point Z is the point vertically above an observer at some arbitrary point on the earth’s
57 Gerardus Mercator = Gerhard de Cremer (1512–1594).
58 Pedro Nunes (1502–1578).
59 Christoph Gudermann (1798–1852).
FIGURE 2.6 The celestial sphere, with local zenith Z, and a point B (a star). Note the spherical triangle $BZP_N$. [Labels in the figure: celestial North Pole $P_N$ (fixed); celestial South Pole $P_S$ (fixed); celestial equator (fixed); local zenith Z; local nadir $Z'$; (local) celestial horizon; (local) vertical circle; hour circle; angles t, z, and $\phi$.]
surface: The point at the antipodes is the nadir point $Z'$ (Fig. 2.6). The (local) celestial horizon corresponds to what we call conventionally the horizon, but is strictly the great circle whose poles are Z and $Z'$. Consider a point B within view of the observer (e.g., a star): The great circle that passes through $P_N$, $P_S$, and B is called the hour circle. The hour circle of Z is called the (local) observer’s meridian. The astronomical triangle is $\Delta P_N ZB$ (or $\Delta P_N Z'B$ in the southern hemisphere), with arc $P_N Z = 90° - L$ = co-latitude of observer (or co-declination of zenith), arc $P_N B = 90° - \delta$ = co-declination of star, arc $BZ = 90° - h$ = co-altitude of star, angle $\angle ZP_N B = t$ = hour angle of star, angle $\angle P_N ZB = z$ = azimuth of star, and finally angle $\angle ZBP_N = \phi$ = position angle of star.
PROBLEM 2.4.22. Given two sides ($a = 135.8233°$, $c = 60.0817°$) and the included angle ($B = 142.2100°$) of a spherical triangle, compute b, A, and C [10].
PROBLEM 2.4.23. Given two angles ($A = 57.9480°$, $B = 137.3425°$) and the included side ($c = 94.8017°$) of a spherical triangle, compute a, b, and C [10].
PROBLEM 2.4.24. Compute the orthodrome, or shortest distance on the earth’s surface, between San Francisco, California (SFO, latitude 37°40′ N = 37.67° N; longitude 122°26′ W = 122.43° W) and Tokyo, Japan (TYO, latitude 35°42′ N = 35.70° N, longitude 139°46′ E = 139.77° E), assuming that the earth is a sphere of circumference 40 million meters. Compute also the initial compass heading as you leave San Francisco in the direction of Tokyo (this compass heading changes continuously as you travel the orthodrome).
Moment of Inertia. We next discuss momentum, rotation, torque, moment of inertia, and angular momentum. A body with velocity v and mass m is defined to have (linear) momentum p:
$$p = m v$$  (2.4.73)
For a rigid body with differential of mass dm (kg), the total mass m is given by integration:
$$m = \int dm$$  (2.4.74)
and the moment of inertia I, or second moment of the mass (units [M][L]$^2$), is given by the rank-2 tensor I:
$$I = \int r\, r\, dm$$  (2.4.75)
which for discrete masses is usually written as
$$I = \sum_{i} r_i\, r_i\, m_i$$  (2.4.76)
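For a handful of point masses the discrete sum of Eq. (2.4.76) is a few lines of code. The sketch below takes the equation literally, as the dyadic (outer-product) sum $\sum_i m_i\, r_i r_i$; the function name `second_moment` and the sample masses and positions are made up for the illustration.

```python
def second_moment(masses, positions):
    """Rank-2 tensor sum_i m_i r_i r_i for point masses at Cartesian positions."""
    tensor = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, positions):
        for j in range(3):
            for k in range(3):
                tensor[j][k] += m * r[j] * r[k]   # (j, k) element of the dyad r r
    return tensor


# two point masses on the x and y axes (arbitrary units)
I = second_moment([1.0, 2.0], [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)])
for row in I:
    print(row)
```

The result is a symmetric 3 × 3 matrix, so it can be brought to principal axes by the similarity transformation of Eq. (2.4.51).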
This moment of inertia is essential for the analysis of rotational spectra of molecules. For anisotropic solids or for molecules, the moment of inertia I is a second-rank tensor, with three principal-axis components $I_1$, $I_2$, and $I_3$. This moment of inertia is important when the body rotates with angular frequency $\omega$ radians per second, or with $\nu$ revolutions per second ($\nu$ Hz). The angular momentum of a body of mass m and momentum p is the pseudovector L:
$$L \equiv r \times p = m\, r \times v$$  (2.4.77)
with units [M][L]$^2$[T]$^{-1}$. It is obvious from Eq. (2.4.77) why the nineteenth-century scientists also called angular momentum the (first) “moment of momentum.” If this body rotates with constant angular frequency $\omega$ radians per second, then the tangential velocity v in Eq. (2.4.77) is used to define $\boldsymbol{\omega}$ as a pseudovector oriented along L:
$$\boldsymbol{\omega} \equiv r \times v\, r^{-2}$$  (2.4.78)
The kinetic energy $E = (1/2) m v^2$ for linear motion corresponds elegantly to $E = (1/2) I \omega^2$ for angular motion. Using the tangential velocity $v = \boldsymbol{\omega} \times r$, the angular momentum L can be rewritten as
$$L = m \boldsymbol{\omega} r^2 = I \boldsymbol{\omega}$$  (2.4.79)
If a body with angular momentum L is subjected to a force F perpendicular to it, then it experiences a torque or turning force T:
$$T \equiv r \times F = dL/dt$$  (2.4.80)
This torque, or moment of force (units [M][L]$^2$[T]$^{-2}$), will turn the orientation of the body in space, without changing the magnitude of its angular momentum. All gyroscopes maintain (by electrical means) their angular momentum and orientation L in any gravitational field F, pointing in the same direction in inertial space, and are used as “inertial guidance systems.” If the mass of a gyroscope (Fig. 2.7) is large enough, it can be used in large ships and ocean liners as a stabilizer, to decrease the pitch, yaw, and roll of the ship in heavy seas. The concept of torque is also important in describing the changes in orientation of electrons or nuclei in magnetic fields for electron spin resonance and nuclear magnetic resonance. The circular analog to Newton’s second law:
$$F = dp/dt = d(mv)/dt = ma$$  ((2.4.3))
can now be written for angular motion:
$$T = dL/dt = d(I\boldsymbol{\omega})/dt = I\boldsymbol{\alpha}$$  (2.4.81)
where $\boldsymbol{\alpha}$ is the angular acceleration:
$$\boldsymbol{\alpha} = d\boldsymbol{\omega}/dt$$  (2.4.82)
FIGURE 2.7 The gyroscope, or universal or Cardano’s suspension at the intersection of each set of circles, where six mechanical bearings allow complete freedom of rotation about three axes. The rotor’s motion is maintained electrically to have constant angular velocity; the rotor will maintain its spin orientation unchanged in “inertial space,” that is, with respect to the “fixed stars.”
Note to the Reader. In this book, double parentheses are used for equation numbers whenever a previously presented equation is repeated for emphasis.
PROBLEM 2.4.25. A ship with deadweight 50,000 metric tons is stabilized by a gyroscope of mass 5.1 tons and a diameter of 1.8 m, rotating at 1800 Hz. Calculate its moment of inertia and kinetic energy.
PROBLEM 2.4.26. Show from Eq. (2.4.78) that $v = \boldsymbol{\omega} \times r$.
2.5 HOOKE’S LAW, STRESS–STRAIN TENSORS, AND PRINCIPAL-AXIS TRANSFORMATIONS
Hooke$^{60}$ suggested in 1660 “ut tensio, sic vis” [11], that is, that a mechanical spring with spring constant $k_H$, if stretched below its elastic limit (from its resting length $r_0$ to some distorted length r), is subject to a restoring force $-k_H (r - r_0)$; this leads to a potential energy
$$U = (1/2)\, k_H\, |r - r_0|^2$$  (2.5.1)
and to a restoring force
$$F = -k_H (r - r_0)$$  (2.5.2)
60 Robert Hooke (1635–1703).
FIGURE 2.8 The motion of a mass suspended from an elastic spring of force constant $k_H$ satisfying Hooke’s law. [Labels in the figure: $k_H$; displacement y, ranging from $-y_{max}$ to $y_{max}$.]
where $k_H$ is known as the Hooke’s law constant. If a mass m is attached to this spring (Fig. 2.8), then, using Newton’s second law:
$$F = -k_H (r - r_0) = m\, d^2(r - r_0)/dt^2$$  (2.5.3)
which, rewritten in terms of the amount stretched $y \equiv r - r_0$, yields
$$-k_H\, y = m\, d^2y/dt^2$$  (2.5.4)
This equation is essential for understanding the classical analog of vibrational spectroscopy. After setting the initial condition $y = 0$ at $t = 0$, the differential equation can be integrated twice to yield
$$y = A \sin[(k_H/m)^{1/2}\, t] = A \sin(\omega t)$$  (2.5.5)
where the angular frequency $\omega$ (radians per second) is defined by
$$\omega \equiv (k_H/m)^{1/2}$$  (2.5.6)
and the frequency $\nu$ (in hertz = cycles per second) is
$$\nu \equiv (1/2\pi)\,\omega = (1/2\pi)(k_H/m)^{1/2}$$  (2.5.7)
The maximum amplitude A depends on the elastic modulus of the spring, that is, the spring can stretch only so far (say to $y_{max}$), so that it can still recover elastically by a backwards motion, which also obeys Hooke’s law; then
$$y = y_{max} \sin[(k_H/m)^{1/2}\, t]$$  (2.5.8)
At the extrema of motion ($y = \pm y_{max}$) the velocity $v = (dy/dt)$ drops to zero, while the absolute value of the acceleration $|d^2y/dt^2|$ and the vibrational potential energy U are at their maxima. At $y = 0$, the velocity and the kinetic energy are at maxima, while $U = 0$.
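The statements about the extrema and the zero crossing can be made concrete with numbers. The sketch below is an illustration only; the function name `oscillator` and the sample values of $k_H$, m, and $y_{max}$ are assumptions. It evaluates Eqs. (2.5.6) to (2.5.8), together with the kinetic and potential energies, at a chosen time.

```python
import math


def oscillator(k_h, m, y_max):
    """Return (omega, nu, state), where state(t) gives (y, v, U, K) at time t."""
    omega = math.sqrt(k_h / m)           # angular frequency, rad/s
    nu = omega / (2.0 * math.pi)         # frequency, Hz

    def state(t):
        y = y_max * math.sin(omega * t)              # displacement
        v = y_max * omega * math.cos(omega * t)      # velocity dy/dt
        potential = 0.5 * k_h * y * y                # U = (1/2) k_H y^2
        kinetic = 0.5 * m * v * v                    # K = (1/2) m v^2
        return y, v, potential, kinetic

    return omega, nu, state


omega, nu, state = oscillator(k_h=4.0, m=1.0, y_max=0.1)
quarter_period = 0.25 / nu
print(omega, nu)
print(state(0.0))              # y = 0: all of the energy is kinetic
print(state(quarter_period))   # y = y_max: all of the energy is potential
```

At every time the sum U + K stays equal to $(1/2) k_H y_{max}^2$, which is the classical energy of the oscillation.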
By a change of phase ($y = y_{max}$ at time $t = 0$), the solution to Eq. (2.5.4) can be expressed also by a cosine function of the type $y = A \cos(\omega t)$, or, in general, the solution is the equation for classical simple harmonic motion:
$$y(t) = A \exp(i\omega t) + B \exp(-i\omega t)$$  (2.5.9)
where $i = (-1)^{1/2}$; one of the two constants can be reset by choosing an appropriate initial condition, while the other, as before, depends on the physics, that is, on the elastic modulus. Hooke’s law will be taken up again in Section 5.7.
Sideline. Newton declared (with false modesty?) that his achievements were possible because “he was walking on the shoulders of giants”; his competitor, Robert Hooke, was a hunchback! That same remark, “nani gigantium humeris insidentes,” was not original with Newton: It was first attributed to Bernard of Chartres (twelfth century).
For a three-dimensional body, discussions of elastic responses in the framework of Hooke’s law become more complicated. One defines a $3 \times 3$ stress tensor P [12], which is the force (with units of newtons) expressed in a Cartesian coordinate system:
$$P = \begin{pmatrix} P_{11} & P_{12} & P_{13} \\ P_{21} & P_{22} & P_{23} \\ P_{31} & P_{32} & P_{33} \end{pmatrix}$$  (2.5.10)
In general, the off-diagonal elements of P can be nonzero, but are usually symmetrical, $P_{ij} = P_{ji}$, so that, of the nine terms of P, only at most six are unique. If the force P is isotropic or has cubic symmetry, then only one term is unique, $P_{11} = P_{22} = P_{33}$, and the off-diagonal terms vanish. The stress on a body can be either (a) positive and tensile (extending the body), or (b) negative and contractile (reducing its dimension). As discussed in Section 7.7, crystals, particularly organic crystals, usually exist in lower-symmetry nonorthogonal systems; for them the off-diagonal terms of P become important. The response of a crystal to the stress tensor P is a series of fractional displacements, small compared to any dimension of the body; these fractional displacements are called strains and are denoted by the strain (or dilatation) tensor s:
$$s \equiv \begin{pmatrix} s_{11} & s_{12} & s_{13} \\ s_{21} & s_{22} & s_{23} \\ s_{31} & s_{32} & s_{33} \end{pmatrix}$$  (2.5.11)
The strain component $s_{12}$ is usually the deformation of the body along axis 1, due to a force along axis 2; the strain tensor s is usually symmetrical, $s_{ij} = s_{ji}$, and thus, of the nine terms of s, at most six are unique. Both P and s can be represented as ellipsoids of stress and strain, respectively, and can be reduced to a diagonal form (e.g., $P'$) along some preferred orthogonal system of axes, oblique to the laboratory frame or to the frame of the crystal, but characteristic for the solid; the transformation to this diagonal form is a principal-axis transformation of the type
$$P' = X P X^{-1}$$  (2.5.12)
$$s' = Z s Z^{-1}$$  (2.5.13)
where X (or Z) is the $3 \times 3$ transformation matrix and $X^{-1}$ (or $Z^{-1}$) is the inverse of X (or Z):
$$X X^{-1} = Z Z^{-1} = E$$  (2.5.14)
where E is the diagonal $3 \times 3$ unit matrix. In this principal-axis or diagonal system (usually identical for P and s), we can talk about the six unique nonzero terms $P'_{11}$, $P'_{22}$, $P'_{33}$, $s'_{11}$, $s'_{22}$, and $s'_{33}$. The cubic dilatation, or change in volume per unit volume, D, is the trace, or sum of the diagonal terms of $s'$:
$$D = s'_{11} + s'_{22} + s'_{33}$$  (2.5.15)
Poisson’s$^{61}$ ratio $\sigma$ is the change in length per unit length, usually the contraction in one direction due to the dilatation in the perpendicular direction (for isotropic elastic bodies, $-1.0 < \sigma < +0.5$). Young’s$^{62}$ modulus Y, also called the linear modulus of elasticity, is the $3 \times 3$ tensor of the stress P divided by the strain s:
$$Y_{ij} \equiv P_{ij} / s_{ij}$$  (2.5.16)
A body will obey Young’s modulus only if it is stretched or compressed within its elastic limit; if this limit is exceeded, structural failure ensues. For a one-dimensional system, or for a cubic crystal, Young’s modulus reduces to the Hooke’s law constant $k_H$:
$$Y_{11} = Y_{22} = Y_{33} = k_H$$  (2.5.17)
The reciprocal of $k_H$ is the scalar compressibility $\kappa$:
$$\kappa = 1 / k_H$$  (2.5.18)
The volume compressibility tensor $\kappa$ is given by its nine terms ($i, j = 1, 2, 3$):
$$\kappa_{ij} = s_{ij} / P_{ij}$$  (2.5.19)
The shear modulus $\mu$ is given by
$$\mu = Y / [2(1 + \sigma)]$$  (2.5.20)
61 Siméon Denis Poisson (1781–1840).
62 Thomas Young (1773–1829).
PROBLEM 2.5.1. Find the equation of motion for a pendulum of mass m suspended by a rod of length l from the ceiling, which at any instant of time makes an angle $\theta$ with the vertical. Show that for small angles, where $\sin\theta \approx \theta$, simple harmonic motion results. For a grandfather’s clock ($l = 1$ m) determine its period.
2.6 LAGRANGE’S FUNCTION AND HAMILTON’S FUNCTION
We now introduce [13] the classical expressions for Lagrange’s function L and for Hamilton’s$^{63}$ function H. These two functions L and H allow us to solve classical problems, by focusing on the energies of the problem, rather than on forces, and thus present certain conceptual advantages; the mathematical labor is the same: From $F = m\, d^2x/dt^2$ it takes two integrations to obtain x(t); from F to either L or H involves one integration, but then one must do one more integration of L or H with respect to t, to obtain x(t). L and H also become important when the Hamiltonian operator is developed in quantum mechanics (Section 3.1). Let a system of N particles of masses $m_i$ and Cartesian positions $(x_i, y_i, z_i) \equiv (r_{i1}, r_{i2}, r_{i3})$ have kinetic energy T, which is a function of only the velocities $(dr_{ij}/dt)$ [13]:
$$T(dr_{ij}/dt) \equiv \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{3} m_i \left(\frac{dr_{ij}}{dt}\right)^2 = T(p_{ij}) = \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{3} \frac{p_{ij}^2}{m_i}$$  (2.6.1)
This system is called “conservative,” or “holonomic,” if and only if it satisfies the following two conditions: (i) It is subject to forces that are expressible as derivatives of a potential energy U. (ii) This potential energy U does not depend explicitly either on time or on the speed of the particles. Frictional systems, for instance, are not holonomic. In particular, let $U(r_{ij})$ depend only on the positions of the particles: for instance, for the harmonic oscillator in one dimension, $U = (1/2) k_H x^2$. Then we can define a Lagrangian function L as $L \equiv T - U$:
$$L(r_{ij}, dr_{ij}/dt) \equiv T(dr_{ij}/dt) - U(r_{ij}) \qquad (i = 1, 2, \ldots, N;\ j = 1, 2, 3)$$  (2.6.2)
This Lagrangian is a function of the positions and speeds of the particles. Newton’s second-order differential equations of motion:
$$m_i\, d^2 r_{ij}/dt^2 = -\partial U / \partial r_{ij}$$  (2.6.3)
can now be replaced formally by Lagrange’s second-order differential equations of motion:
$$(d/dt)\,[\partial L / \partial(dr_{ij}/dt)] = \partial L / \partial r_{ij} \qquad (i = 1, 2, \ldots, N;\ j = 1, 2, 3)$$  (2.6.4)
63 Sir William Rowan Hamilton (1805–1865).
Rather than deal directly with the speeds of the particles $dr_{ij}/dt$, it is useful to rewrite the kinetic energy T in terms of their momenta $p_i = m_i (dr_i/dt)$, or their components $p_{ij}$:
$$T(p_{ij}) = \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{3} \frac{p_{ij}^2}{m_i}$$  (2.6.5)
Hamilton’s function H is now defined as $H \equiv T + U$:
$$H(r_{ij}, p_{ij}) \equiv T(p_{ij}) + U(r_{ij}) \qquad (i = 1, 2, \ldots, N;\ j = 1, 2, 3)$$  (2.6.6)
Then Hamilton’s equations of motion are first-order differential equations:
$$\partial H / \partial p_{ij} = dr_{ij}/dt \qquad (i = 1, 2, \ldots, N;\ j = 1, 2, 3)$$  (2.6.7)
$$\partial H / \partial r_{ij} = -dp_{ij}/dt \qquad (i = 1, 2, \ldots, N;\ j = 1, 2, 3)$$  (2.6.8)
The mathematics is not easier overall, but the emphasis is changed.
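Because Hamilton’s equations are first order, they lend themselves to step-by-step numerical integration. The sketch below is a hedged illustration for the one-dimensional harmonic oscillator, $H = p^2/2m + (1/2) k_H x^2$; the leapfrog stepping scheme, the function names, and the parameter values are choices made for the example, not something prescribed by the text.

```python
import math


def hamiltonian(x, p, k_h, m):
    """H = T + U for the one-dimensional harmonic oscillator."""
    return p * p / (2.0 * m) + 0.5 * k_h * x * x


def integrate_hamilton(k_h, m, x0, p0, dt, steps):
    """Advance (x, p) with leapfrog steps built from Hamilton's equations.

    dx/dt = +dH/dp = p / m      dp/dt = -dH/dx = -k_h * x
    """
    x, p = x0, p0
    for _ in range(steps):
        p -= 0.5 * dt * k_h * x     # half step in momentum
        x += dt * p / m             # full step in position
        p -= 0.5 * dt * k_h * x     # second half step in momentum
    return x, p


k_h, m = 1.0, 1.0                   # so that omega = 1 rad/s
dt, steps = 1.0e-3, 6283            # roughly one period, t = 6.283 s
x, p = integrate_hamilton(k_h, m, x0=1.0, p0=0.0, dt=dt, steps=steps)
t = dt * steps
print(x, math.cos(t))               # numerical vs analytical position
print(p, -math.sin(t))              # numerical vs analytical momentum
print(hamiltonian(x, p, k_h, m))    # stays close to the initial value 0.5
```

The near-constancy of H along the trajectory reflects the fact that the system is conservative in the sense defined above.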
PROBLEM 2.6.1. Solve the simple harmonic motion problem by using (i) Lagrange’s and (ii) Hamilton’s equations of motion.
2.7 ELECTROMAGNETISM
Given two popular systems of units, SI (= Système International, or rationalized MKSC, or Giorgi, or MKSA) and the older cgs (centimeter–gram–second) system, we give first two fundamental equations, Coulomb’s law of 1785 and Ampère’s$^{64}$ law of 1826, in both systems. We restate Coulomb’s law for electrostatics, Eq. (2.1.2), for the force $F_{12}$ between two electrical charges $q_1$ and $q_2$ separated by a distance $r_{12}$:
$$F_{12} = q_1 q_2\, r_{12} / (4\pi\varepsilon_0 r_{12}^3) \quad \text{(SI)}; \qquad F_{12} = q_1 q_2\, r_{12}\, r_{12}^{-3} \quad \text{(cgs-esu)}$$  (2.7.1)
64 André-Marie Ampère (1775–1836).
Ampère’s law (or the law of Biot$^{65}$ and Savart$^{66}$) for the force between two electrical currents $j_1$ and $j_2$ is given by
$$F_{12} = (\mu_0/4\pi) \int dv_1 \int dv_2\; j_1 \times (j_2 \times r_{12})\, r_{12}^{-3} \quad \text{(SI)}; \qquad F_{12} = \int dv_1 \int dv_2\; j_1 \times (j_2 \times r_{12})\, r_{12}^{-3} \quad \text{(cgs-emu)}$$  (2.7.2)
The older, cgs definition considers the electrical charge (in cgs-esu) as a secondary quantity, statcoulombs, where 1 statcoulomb $\equiv$ 1 statC $\equiv$ 1 g$^{1/2}$ cm$^{3/2}$ s$^{-1}$ (i.e., units of [M]$^{1/2}$[L]$^{3/2}$[T]$^{-1}$); the advantage of cgs units is that 1 statcoulomb$^2$ = 1 dyne. Equation (2.7.2) in cgs-emu form defines an electrical current, the abampere = 1 g$^{1/2}$ cm$^{1/2}$ s$^{-1}$ (i.e., units of [M]$^{1/2}$[L]$^{1/2}$[T]$^{-1}$). By convention, 1 abampere $\equiv$ 1 abA = 10 amperes; again, it is nice to see that 1 abA$^2$ $\equiv$ 1 dyne. The disadvantage of cgs-esu and cgs-emu, however, is that the electromagnetic unit of current, favored by magneticians, is not the same as the electrostatic unit: In detail, if $\rho_{cgs\text{-}esu}$ is the charge density and u is the velocity with which this charge density moves, and c is the speed of light, then
$$1.0\, j_{cgs\text{-}emu} = 1.0\, \rho_{cgs\text{-}esu}\, u / c$$  (2.7.3)
In SI this difficulty is avoided by making the electrical charge [Q] a new, fourth fundamental quantity, of equal importance as kg for [M], m for [L], or s for [T]; SI uses the practical units of coulomb for electrical charge, ampere $\equiv$ 1 coulomb s$^{-1}$ for electrical current, henry$^{67}$ for magnetic inductance, weber$^{68}$ for magnetic flux ($\equiv$ 1 volt s), tesla$^{69}$ $\equiv$ weber m$^{-2}$ for magnetic induction, farad$^{70}$ for capacitance, ohm$^{71}$ for electrical resistance, siemens$^{72}$ for electrical conductance, and so on. To get this practical advantage, however, SI must introduce two convenient, if physically meaningless, constants: $\varepsilon_0$, the “electrical permittivity of vacuum,” defined as
$$\varepsilon_0 \equiv 10^{7} / (4\pi c^2) = 8.854187817 \times 10^{-12}\ \text{F m}^{-1} \quad \text{(SI)}$$  (2.7.4)
where c is the speed of light in vacuum, and $\mu_0$, the “magnetic permeability of vacuum”:
$$\mu_0 \equiv 4\pi \times 10^{-7} = 1.256637061436 \times 10^{-6}\ \text{H m}^{-1} \quad \text{(SI)}$$  (2.7.5)
65 Jean-Baptiste Biot (1774–1862).
66 Félix Savart (1791–1841).
67 Joseph Henry (1797–1878).
68 Wilhelm Eduard Weber (1804–1891).
69 Nikola Tesla (1856–1943).
70 Michael Faraday (1791–1867).
71 Georg Simon Ohm (1789–1854).
72 Ernst Werner von Siemens (1816–1892).
The product of $\varepsilon_0$ and $\mu_0$ is physically meaningful:
$$\varepsilon_0\, \mu_0 = c^{-2} \quad \text{(SI)}$$  (2.7.6)
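The numerical values quoted for the two constants, and the relation between them, are quickly verified. The sketch below assumes the defined value of the speed of light, c = 299 792 458 m/s, which is not stated in this section; the variable names are arbitrary.

```python
import math

C_LIGHT = 299_792_458.0                              # speed of light in vacuum, m/s
EPS0 = 1.0e7 / (4.0 * math.pi * C_LIGHT ** 2)        # F/m, from 10^7 / (4 pi c^2)
MU0 = 4.0 * math.pi * 1.0e-7                         # H/m, from 4 pi x 10^-7

print(EPS0)                       # about 8.854e-12
print(MU0)                        # about 1.2566e-06
print(EPS0 * MU0 * C_LIGHT ** 2)  # should be 1 to within rounding
```

With these definitions the product $\varepsilon_0 \mu_0 c^2$ is exactly 1 by construction, so the printed value only tests floating-point rounding.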
A minor variant of the SI system, called MKS by experimentalists in nonlinear optics, is called SI′ here and defined below. Alas, there are also other systems: “rationalized” or “unrationalized”, Heaviside–Lorentz, “atomic” (where $m_e = c = h = 1$), and so on. The “rationalized” and “unrationalized” versions differ in how they apportion the pesky factor $4\pi$ (surface area of a sphere with unit radius that is involved in surface integrals) between the various electrical and the magnetic variables. The Coulomb force is mediated by (virtual) photons. Following Faraday, we introduce the artificial concept of the electric field E due to an electric charge distribution $\rho(r)$:
$$E = (1/4\pi\varepsilon_0) \int dv(r)\, \rho(r)\, r\, r^{-3} \quad \text{(SI)}; \qquad E = \int dv(r)\, \rho(r)\, r\, r^{-3} \quad \text{(cgs-esu)}$$  (2.7.7)
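For a single charge $q$ at a distance $r$ from the point of observation, Eq. (2.7.7) reduces to

$$\mathbf{E} = q\mathbf{r}/4\pi\varepsilon_0 r^3 \quad \text{(SI)}; \qquad \mathbf{E} = q\mathbf{r}\,r^{-3} \quad \text{(cgs-esu)} \tag{2.7.8}$$

Since magnetic monopoles have never been found, the irreducible magnetic entity must be the magnetic dipole. In Ampere’s law, we can define the magnetic flux density or magnetic induction $\mathbf{B}$ due to, or induced by, an electrical current $\mathbf{j}$ at a distance $r$, as

$$\mathbf{B} = (\mu_0/4\pi)\int dv(\mathbf{r})\,\mathbf{j}\times\mathbf{r}\,r^{-3} \quad \text{(SI)}; \qquad \mathbf{B} = (1/c)\int dv(\mathbf{r})\,\mathbf{j}\times\mathbf{r}\,r^{-3} \quad \text{(cgs-emu)} \tag{2.7.9}$$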
This result explains electrical motors. Thus, the fundamental fields are $\mathbf{E}$ and $\mathbf{B}$. In an older school of thought, the fundamental magnetic field was taken as $\mathbf{H}$, because of a symmetry between $\mathbf{E}$ and $\mathbf{H}$ explained below. We next see how $\mathbf{E}$ and $\mathbf{B}$ are modified inside matter; this will generate the “constitutive equations.”

Material Media and Their Reaction to External Fields. In a material medium, a charge distribution can induce some charge separations, or dipoles, which help to minimize the total energy. Similarly, an external magnetic field will induce some magnetic dipoles in the medium to counteract this field. To handle these effects, an electric polarization (or electrical dipole moment per unit volume) $\mathbf{P}$ and a magnetization (or magnetic dipole moment per unit volume) $\mathbf{M}$ are defined. If the medium is linear and isotropic, these two new vectors $\mathbf{P}$ and $\mathbf{M}$ are proportional to $\mathbf{E}$ and to $\mathbf{H}$, respectively:

$$\mathbf{P} = \varepsilon_0\boldsymbol{\chi}\mathbf{E} \quad \text{(SI)}; \qquad \mathbf{P} = \boldsymbol{\chi}\mathbf{E} \quad \text{(cgs-esu)} \tag{2.7.10}$$

$$\mathbf{M} = \boldsymbol{\chi}_m\mathbf{H} \quad \text{(SI)}; \qquad \mathbf{M} = \boldsymbol{\chi}_m\mathbf{H} \quad \text{(cgs-emu)} \tag{2.7.11}$$
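where $\boldsymbol{\chi}$ is the 3 × 3 linear volume electric susceptibility tensor, and $\boldsymbol{\chi}_m$ is the linear volume magnetic susceptibility tensor. If the medium is isotropic, then both $\boldsymbol{\chi}$ and $\boldsymbol{\chi}_m$ become scalars; if one divides $\chi$ and $\chi_m$ by the mass, molar mass, or density of the medium, one gets the corresponding mass, molar, or specific susceptibility, respectively. If the electric field is very high, there are nonzero higher-order (nonlinear) electric susceptibilities. It is further convenient to define the 3 × 3 linear dielectric constant (or relative electrical permittivity) tensor $\boldsymbol{\varepsilon}$, and the dielectric displacement vector $\mathbf{D}$ (as a “net field”: charges + induced dipoles):

$$\mathbf{D} = \varepsilon_0\boldsymbol{\varepsilon}\mathbf{E} = \varepsilon_0\mathbf{E} + \varepsilon_0\boldsymbol{\chi}_{\mathrm{SI}}\mathbf{E} = \varepsilon_0\mathbf{E} + \mathbf{P} \quad \text{(SI)}$$

$$\mathbf{D} = \varepsilon_0\boldsymbol{\varepsilon}\mathbf{E} = \varepsilon_0\mathbf{E} + \boldsymbol{\chi}_{\mathrm{SI'}}\mathbf{E} = \varepsilon_0\mathbf{E} + \mathbf{P} \quad \text{(SI′)}$$

$$\mathbf{D} = \boldsymbol{\varepsilon}\mathbf{E} = \mathbf{E} + 4\pi\mathbf{P} = (1 + 4\pi\boldsymbol{\chi}_{\mathrm{esu}})\mathbf{E} \quad \text{(cgs-esu)} \tag{2.7.12}$$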
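This introduces SI′, a minor variant of SI, which is conventionally (and confusingly) called MKS and is used in nonlinear optics. Equation (2.7.12), linking $\mathbf{D}$ to $\mathbf{E}$, is the “first constitutive equation.” The magnetic case is similar: The magnetic induction $\mathbf{B}$ is the appropriately scaled sum of the magnetic field $\mathbf{H}$ and the magnetization $\mathbf{M}$:

$$\mathbf{B} = \mu_0\boldsymbol{\mu}\mathbf{H} = \mu_0(1 + \boldsymbol{\chi}_m)\mathbf{H} = \mu_0(\mathbf{H} + \mathbf{M}) \quad \text{(SI)}; \qquad \mathbf{B} = \boldsymbol{\mu}\mathbf{H} = \mathbf{H} + 4\pi\mathbf{M} \quad \text{(cgs-emu)} \tag{2.7.13}$$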
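This defines the magnetic field intensity $\mathbf{H}$; $\boldsymbol{\mu}$ is the 3 × 3 linear magnetic permeability tensor. Equation (2.7.13), linking $\mathbf{B}$ to $\mathbf{H}$, is the “second constitutive equation.” The magnetic field $\mathbf{H}$ is expressed explicitly by

$$\mathbf{H} = \mu_0^{-1}\mathbf{B} - \mathbf{M} = \mu_0^{-1}\boldsymbol{\mu}^{-1}\mathbf{B} \quad \text{(SI)}; \qquad \mathbf{H} = \mathbf{B} - 4\pi\mathbf{M} = \mathbf{B}/(1 + 4\pi\chi_m) \quad \text{(cgs)} \tag{2.7.14}$$

In material media we add the empirical Ohm’s law, a “third constitutive equation”:

$$\mathbf{J} = \boldsymbol{\sigma}\mathbf{E} \quad \text{(SI)} \tag{2.7.15}$$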
where $\boldsymbol{\sigma}$ is the 3 × 3 linear electrical conductivity tensor of the medium (siemens m$^{-1}$); $\mathbf{J}$ is the electrical current, and $\sigma$ is the reciprocal of the electrical resistivity. As will be discussed in detail in Section 8.1, Ohm’s law, obeyed by most metals, is valid only if the conductivity is limited by scattering processes (electrons or holes scattering off inclusions, lattice defects, or phonons). Semiconductors typically have a nonlinear dependence of the current on the applied electric field, because the number of carriers depends on temperature and on reaching or exceeding an Arrhenius-type$^{73}$ activation energy (“nonohmic behavior”). In vacuum, or within a single atom or a single molecule, conductivity occurs only by quantum-mechanical tunneling over very short distances (0.1 to about 5 nm); this tunneling is not linear with voltage. A quantum of conductivity can be defined.

73 Svante August Arrhenius (1859–1927).

Maxwell’s Equations. Combining all earlier experimental results, in 1865 Maxwell obtained the following four fundamental equations for electromagnetism for media at rest [4]:
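$$\nabla\cdot\mathbf{D} = \rho \quad \text{(SI)}; \qquad \nabla\cdot\mathbf{D} = 4\pi\rho \quad \text{(cgs)} \tag{2.7.16}$$

$$\nabla\cdot\mathbf{B} = 0 \quad \text{(SI)}; \qquad \nabla\cdot\mathbf{B} = 0 \quad \text{(cgs)} \tag{2.7.17}$$

$$\nabla\times\mathbf{H} = \mathbf{J} + \partial\mathbf{D}/\partial t \quad \text{(SI)}; \qquad \nabla\times\mathbf{H} = 4\pi\mathbf{J} + \partial\mathbf{D}/\partial t \quad \text{(cgs)} \tag{2.7.18}$$

$$\nabla\times\mathbf{E} = -\partial\mathbf{B}/\partial t \quad \text{(SI)}; \qquad \nabla\times\mathbf{E} = -\partial\mathbf{B}/\partial t \quad \text{(cgs)} \tag{2.7.19}$$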
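Of these, the first is Gauss’s law; the second could be called Gauss’s law for magnetic fields; the third is the generalized form of Ampere’s law; the fourth is Faraday’s law of induction. Since electrical charges do exist, but magnetic monopoles have never been found, the source density is nonzero on the right-hand side of Eq. (2.7.16), but must be zero in Eq. (2.7.17). The integral forms of these four Maxwell equations are

$$\oiint \mathbf{D}\cdot d\mathbf{S} = q \quad \text{(SI)}; \qquad \oiint \mathbf{D}\cdot d\mathbf{S} = 4\pi q \quad \text{(cgs)} \tag{2.7.20}$$

$$\oiint \mathbf{B}\cdot d\mathbf{S} = 0 \quad \text{(SI)}; \qquad \oiint \mathbf{B}\cdot d\mathbf{S} = 0 \quad \text{(cgs)} \tag{2.7.21}$$

$$\oint \mathbf{B}\cdot d\mathbf{l} = \mu_0 I + \varepsilon_0\mu_0\, d\Phi_E/dt \quad \text{(SI)}; \qquad \oint \mathbf{B}\cdot d\mathbf{l} = 4\pi J + d\Phi_E/dt \quad \text{(cgs)} \tag{2.7.22}$$

$$\oint \mathbf{E}\cdot d\mathbf{l} = -d\Phi_B/dt \quad \text{(SI)}; \qquad \oint \mathbf{E}\cdot d\mathbf{l} = -(1/c)\, d\Phi_B/dt \quad \text{(cgs)} \tag{2.7.23}$$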
where $d\mathbf{S}$ is the surface element of a closed surface (and the double integrals are over the whole surface), $d\mathbf{l}$ is a line element (and the integrals are over a closed loop), $q$ is the total electrical charge enclosed by the surface, $\Phi_E$ is the electric flux, $\Phi_B$ is the magnetic flux, and $J$ is the electrical current. To be historically accurate, Maxwell did not use vector notation, gradients, and curls, but published eight relevant equations in 1865 using quaternions; these later expanded to 20 equations. The brief forms shown above are the culmination of later improvements by Gibbs, Hertz$^{74}$, and Heaviside and were known variously as the Hertz–Heaviside or Maxwell–Heaviside equations. Since Maxwell unified electricity and magnetism and understood the concept of an electromagnetic wave, the simpler name “Maxwell equations” was bestowed on them by Einstein. Once again, Eqs. (2.7.20)–(2.7.23) are, respectively, Gauss’s law, Gauss’s law in magnetism, the generalized form of Ampere’s law, and Faraday’s law of induction.

The four Maxwell equations [Eqs. (2.7.16)–(2.7.19)] plus two constitutive equations [Eqs. (2.7.12) and (2.7.13)] are six equations in four unknown fields ($\mathbf{E}$, $\mathbf{H}$, $\mathbf{D}$, and $\mathbf{B}$) and two unknown tensors ($\boldsymbol{\varepsilon}$ and $\boldsymbol{\mu}$); thus they can be solved uniquely.

The Lorentz force $\mathbf{F}$ on a particle of charge $q$ in an electromagnetic field is given by

$$\mathbf{F} = q(\mathbf{E} + \mathbf{v}\times\mathbf{B}) \quad \text{(SI)}; \qquad \mathbf{F} = q(\mathbf{E} + \mathbf{v}\times\mathbf{B}) \quad \text{(cgs)} \tag{2.7.24}$$
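To make the vector structure of the SI form of Eq. (2.7.24) concrete, here is a minimal sketch that evaluates the Lorentz force for a proton moving along x through a magnetic field along z. The helper names `cross` and `lorentz_force`, and the field and velocity values, are illustrative assumptions and not taken from the text.

```python
def cross(a, b):
    """Cross product of two 3-component vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def lorentz_force(q, E, v, B):
    """F = q (E + v x B), SI form of Eq. (2.7.24)."""
    vxB = cross(v, B)
    return tuple(q * (E[i] + vxB[i]) for i in range(3))

q = 1.602176634e-19      # proton charge, C
E = (0.0, 0.0, 0.0)      # no electric field (assumed), V/m
v = (1.0e6, 0.0, 0.0)    # velocity along x (assumed), m/s
B = (0.0, 0.0, 1.0)      # magnetic field along z (assumed), T

F = lorentz_force(q, E, v, B)
print("F =", F, "N")     # the force points along -y, normal to both v and B
```

Because the force is always normal to the velocity, it does no work on the particle and only bends its path, which is the basis of the applications listed next.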
The Lorentz force encompasses the Lenz$^{75}$ “right-hand rule” between $\mathbf{v}$, $\mathbf{B}$, and $\mathbf{F}$ and also explains how cyclotrons and mass spectrometers work. Practical applications of the Lorentz force are (i) the cyclotron (Problem 2.7.1) with its cyclotron frequency:

$$\omega = v/r = qB/m \quad \text{(SI)}; \qquad \omega = v/r = qB/mc \quad \text{(cgs)} \tag{2.7.25}$$
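(ii) other particle accelerators that use bending magnets (such as the synchrotron), (iii) ion cyclotron resonance, and (iv) mass spectrometry.

The energy density $u$ (J m$^{-3}$) in an electromagnetic field is given by

$$u = (1/2)(\mathbf{E}\cdot\mathbf{D} + \mathbf{B}\cdot\mathbf{H}) \quad \text{(SI)}; \qquad u = (1/8\pi)(\mathbf{E}\cdot\mathbf{D} + \mathbf{B}\cdot\mathbf{H}) \quad \text{(cgs)} \tag{2.7.26}$$

The propagation of energy is best described by the Poynting$^{76}$ pseudovector $\mathbf{S}$:

$$\mathbf{S} \equiv \mathbf{E}\times\mathbf{H} \tag{2.7.27}$$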
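For electric fields $\mathbf{E}$ and magnetic fields $\mathbf{H}$, either in a vacuum or in isotropic media, the wave equation for propagation is the electromagnetic wave equation (Problem 2.7.4):

$$\nabla^2\mathbf{E} - \varepsilon\varepsilon_0\mu\mu_0\,\partial^2\mathbf{E}/\partial t^2 = 0 \quad \text{(SI)}; \qquad \nabla^2\mathbf{E} - \varepsilon\mu\,\partial^2\mathbf{E}/\partial t^2 = 0 \quad \text{(cgs)} \tag{2.7.28}$$

$$\nabla^2\mathbf{H} - \varepsilon\varepsilon_0\mu\mu_0\,\partial^2\mathbf{H}/\partial t^2 = 0 \quad \text{(SI)}; \qquad \nabla^2\mathbf{H} - \varepsilon\mu\,\partial^2\mathbf{H}/\partial t^2 = 0 \quad \text{(cgs)} \tag{2.7.29}$$

74 Heinrich Rudolph Hertz (1857–1894).
75 Heinrich Friedrich Emil Lenz (1804–1865).
76 John Henry Poynting (1852–1914).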
FIGURE 2.9 Representation of the E, B, and k vectors for a transverse electromagnetic wave in vacuo: the E and B fields are in phase, and the three mutually orthogonal vectors E, B, and k form a right-handed set by Lenz’s law. In cgs units in vacuo, the E and B vectors have the same magnitude.
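These waves were initially called “Hertzian waves,” for example by Marconi$^{77}$, the inventor of the radio. The solutions to these wave equations can be shown to be transverse electromagnetic (TEM) waves (Problem 2.7.5, Fig. 2.9). In a vacuum (i.e., for $\mu = 1$ and $\varepsilon = 1$) Maxwell’s equations simplify to

$$\nabla\cdot\mathbf{E} = \varepsilon_0^{-1}\rho \quad \text{(SI)}; \qquad \nabla\cdot\mathbf{E} = 4\pi\rho \quad \text{(cgs)} \tag{2.7.30}$$

$$\nabla\cdot\mathbf{B} = 0 \quad \text{(SI)}; \qquad \nabla\cdot\mathbf{B} = 0 \quad \text{(cgs)} \tag{2.7.31}$$

$$\nabla\times\mathbf{B} = \mu_0\mathbf{J} + \mu_0\,\partial\mathbf{D}/\partial t \quad \text{(SI)}; \qquad \nabla\times\mathbf{B} = 4\pi\mathbf{J} + \partial\mathbf{D}/\partial t \quad \text{(cgs)} \tag{2.7.32}$$

$$\nabla\times\mathbf{E} = -\partial\mathbf{B}/\partial t \quad \text{(SI)}; \qquad \nabla\times\mathbf{E} = -\partial\mathbf{B}/\partial t \quad \text{(cgs)} \tag{2.7.33}$$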
In a conducting medium ($\sigma \neq 0$) with no source of charge ($\rho = 0$), the Maxwell equations yield the first telegraph equation (so named from a transmission-line theory for long-range telegraphy developed by Heaviside) [14]:

$$\nabla^2\mathbf{E} - \varepsilon\varepsilon_0\mu\mu_0\,\partial^2\mathbf{E}/\partial t^2 - \mu\mu_0\sigma\,\partial\mathbf{E}/\partial t = 0 \quad \text{(SI)} \tag{2.7.34}$$

The third term is a damping term, which allows for the possibility that a wave is absorbed by the medium: this is called the evanescent wave; the quantity $\sigma$ is also called the optical conductivity (at zero frequency it becomes the electrical conductivity). The evanescent wave is exploited in near-field scanning optical microscopy. If waves propagate along $x$, so that $\partial/\partial y = 0$ and $\partial/\partial z = 0$, then $E_x = H_x = 0$. Next, assume $E_y(x,t) = f(x)\exp(i\omega t) \neq 0$ and $E_z(x,t) = g(x)\exp(i\omega t) = 0$; that is, assume plane-polarized light with the $\mathbf{E}$ vector in the $xy$ plane. Then the differential equation to be solved is more simply

$$d^2f/dx^2 + \left(\varepsilon\varepsilon_0\mu\mu_0\omega^2 - i\mu\mu_0\sigma\omega\right)f = 0$$

The space-dependent solution is

$$f(x) = A\exp\left\{-i\left[\varepsilon\varepsilon_0\mu\mu_0\omega^2 - i\mu\mu_0\sigma\omega\right]^{1/2}x\right\}$$

One can define a complex index of refraction $\tilde{n}$ by its real part $n$ and its imaginary part $-i\kappa n$:

$$\omega n(1 - i\kappa) \equiv \omega\tilde{n} \equiv \left[\varepsilon\varepsilon_0\mu\mu_0\omega^2 - i\mu\mu_0\sigma\omega\right]^{1/2} \quad \text{(SI)} \tag{2.7.35}$$

77 Guglielmo Marconi (1874–1937).
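A second telegraph equation [14] can be written for a magnetic field:

$$\nabla^2\mathbf{B} - \varepsilon\varepsilon_0\mu\mu_0\,\partial^2\mathbf{B}/\partial t^2 - \mu\mu_0\sigma\,\partial\mathbf{B}/\partial t = 0 \tag{2.7.36}$$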
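The two telegraph equations are derived in Problem 2.7.11. Next, the three assumptions $\sigma = 0$, $\rho = 0$, and $\mathbf{E}(\mathbf{r},t) = \mathbf{E}(\mathbf{r})\exp(i\omega t)$ will lead from Eq. (2.7.34) to the real form of the Helmholtz$^{78}$ equation:

$$\nabla^2\mathbf{E} + \varepsilon\varepsilon_0\mu\mu_0\omega^2c^{-2}\mathbf{E} = 0 \quad \text{(SI)}; \qquad \nabla^2\mathbf{E} + \varepsilon\mu\omega^2c^{-2}\mathbf{E} = 0 \quad \text{(cgs)} \tag{2.7.37}$$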
with a real wavevector $k \equiv (\varepsilon\varepsilon_0\mu\mu_0)^{1/2}(\omega/c)$. The Helmholtz equation resembles the spatial part of the classical wave equation for matter waves (waves in ocean, sound waves, vibrations of a string, electromagnetic waves in vacuum, etc.) of amplitude $\Phi = \Phi(\mathbf{r},t)$:

$$\nabla^2\Phi(\mathbf{r},t) - v^{-2}\,\partial^2\Phi(\mathbf{r},t)/\partial t^2 = 0 \tag{2.7.38}$$

The solution to this classical wave equation may be factored into space and time factors:

$$\Phi(\mathbf{r},t) = f(\mathbf{r})g(t) = f(\mathbf{r})\left[A\exp(i\omega t) + B\exp(-i\omega t)\right] \tag{2.7.39}$$

where $\nabla^2f(\mathbf{r}) = -(\omega/v)^2f(\mathbf{r})$ and $d^2g(t)/dt^2 = -\omega^2g(t)$, $v$ is the (assumed constant) wave speed, $\omega$ is the angular frequency, and $A$ and $B$ are constants whose values are determined from the boundary conditions.

Finally, Maxwell’s four equations (2.7.30)–(2.7.33), the three constitutive equations (2.7.12), (2.7.13), and (2.7.15), and the three assumptions $\rho = 0$, $\mathbf{E}(\mathbf{r},t) = \mathbf{E}(\mathbf{r})\exp(i\omega t)$, and $\mathbf{H}(\mathbf{r},t) = \mathbf{H}(\mathbf{r})\exp(i\omega t)$, when considered together, yield (Problem 2.7.13) the complex form of the Helmholtz equation:

$$\nabla^2\mathbf{E} + \omega^2\mu_0\mu(\varepsilon\varepsilon_0 - i\sigma/\omega)\mathbf{E} = 0 \quad \text{(SI)} \tag{2.7.40}$$

which, simplified as $\nabla^2\mathbf{E} + k^2\mathbf{E} = 0$, yields a complex wavevector $k$:

$$k \equiv \omega\left[\mu\mu_0(\varepsilon\varepsilon_0 - i\sigma/\omega)\right]^{1/2} \tag{2.7.41}$$
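The complex wavevector of Eq. (2.7.41) is easy to evaluate numerically. The sketch below does so for the copper parameters quoted in Problem 2.7.14 ($\sigma = 5.8 \times 10^7$ S m$^{-1}$, $\omega = 3.20 \times 10^{15}$ s$^{-1}$, $\varepsilon = \mu = 1$) and compares the decay length $1/|\mathrm{Im}\,k|$ with the good-conductor estimate $(2/\mu_0\sigma\omega)^{1/2}$. The function name `complex_wavevector` is hypothetical, and using the DC conductivity at an optical frequency is a crude assumption made only for illustration.

```python
import cmath
import math

mu0 = 4.0 * math.pi * 1.0e-7       # H/m, Eq. (2.7.5)
c = 299_792_458.0                  # m/s
eps0 = 1.0 / (mu0 * c**2)          # F/m, from Eq. (2.7.6)

def complex_wavevector(omega, sigma, eps_r=1.0, mu_r=1.0):
    """k = omega * [mu mu0 (eps eps0 - i sigma/omega)]^(1/2), Eq. (2.7.41)."""
    return omega * cmath.sqrt(mu_r * mu0 * (eps_r * eps0 - 1j * sigma / omega))

omega = 3.20e15                    # angular frequency, rad/s
sigma = 5.8e7                      # conductivity of Cu, S/m (DC value, assumed valid here)

k = complex_wavevector(omega, sigma)
delta_exact = 1.0 / abs(k.imag)                        # 1/e decay length of the field
delta_approx = math.sqrt(2.0 / (mu0 * sigma * omega))  # good-conductor limit

print(f"k            = {k:.4e} 1/m")
print(f"delta_exact  = {delta_exact * 1e9:.3f} nm")
print(f"delta_approx = {delta_approx * 1e9:.3f} nm")
```

Both estimates come out near 3 nm, because for these parameters $\sigma/\omega$ exceeds $\varepsilon\varepsilon_0$ by more than three orders of magnitude, so the imaginary term dominates $k^2$.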
If in Eq. (2.7.35) $\sigma = 0$ and $\kappa = 0$, then

$$n^2 = \mu\mu_0\varepsilon\varepsilon_0 \quad \text{(SI)}; \qquad n^2 = \mu\varepsilon \quad \text{(cgs)} \tag{2.7.42}$$

78 Heinrich Ludwig Ferdinand von Helmholtz (1821–1894).
Table 2.6 Constants $k_1$ Through $k_6$ Needed to Write Eqs. (2.7.43)–(2.7.49) in Several Unit Systems$^a$

| Unit System | [L] | [M] | [T] | [Q] | $k_1$ | $k_2$ | $k_3$ | $k_4$ | $k_5$ | $k_6$ |
|---|---|---|---|---|---|---|---|---|---|---|
| cgs-esu (unrat.) | cm | g | s | — | 1 | 1 | 1 | 1 | $4\pi$ | $c^{-2}$ |
| cgs-emu (unrat.) | cm | g | s | — | 1 | 1 | 1 | $c^{-2}$ | $4\pi$ | 1 |
| cgs-Gaussian (unrat.) | cm | g | s | — | 1 | $c^{-2}$ | $c$ | 1 | $4\pi$ | 1 |
| cgs-Heaviside–Lorentz (rat.) | cm | g | s | — | $1/4\pi$ | $(4\pi c^2)^{-1}$ | $c$ | 1 | 1 | 1 |
| a.u. (Hartree) (unrat.) | $a_0$ | $m_e$ | $t_0$ | $e$ | 1 | 1 | $\alpha$ | 1 | $4\pi$ | $c^{-2}$ |
| SI (rat.) and SI′ (rat.) | m | kg | s | coulomb | $1/4\pi$ | $1/4\pi$ | 1 | $10^7/4\pi c^2$ | 1 | $4\pi \times 10^{-7}$ |

$^a$ Here $c$ = speed of light in cgs or SI units, $a_0$ is the Bohr radius (see Section 3.1), and $t_0 = (2\varepsilon_0^2h^3/\pi m_ee^4)$. The label “rat.” (rationalized) means that no factor of $4\pi$ appears explicitly in the Maxwell equations (2.7.43)–(2.7.46) or in the constitutive equations (2.7.47) and (2.7.48). For “unrat.” the factor $4\pi$ appears explicitly in both sets of equations (see Table 2.7).
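A convenient way of comparing many unit systems used in electromagnetism is to rewrite Maxwell’s equations, by introducing six constants ($k_1$ to $k_6$) plus the dimensionless Sommerfeld$^{79}$ fine-structure constant $\alpha$, and remembering that for SI $\varepsilon_0 \equiv 10^7/4\pi c^2$ and $\mu_0 \equiv 4\pi \times 10^{-7}$:

$$\nabla\cdot\mathbf{D} \equiv 4\pi k_1\rho \tag{2.7.43}$$

$$\nabla\cdot\mathbf{B} \equiv 0 \tag{2.7.44}$$

$$\nabla\times\mathbf{H} \equiv 4\pi k_2k_3\mathbf{J} + (k_2k_3/k_1)(\partial\mathbf{D}/\partial t) \tag{2.7.45}$$

$$\nabla\times\mathbf{E} + (1/k_3)(\partial\mathbf{B}/\partial t) \equiv 0 \tag{2.7.46}$$

$$\mathbf{D} = k_4\mathbf{E} + k_5\mathbf{P} \tag{2.7.47}$$

$$\mathbf{H} = (1/k_6)\mathbf{B} - k_5\mathbf{M} \tag{2.7.48}$$

$$\alpha \equiv 1/137.03599968 = e^2/\hbar c \ \text{(in cgs-esu)} = e^2/4\pi\varepsilon_0\hbar c \ \text{(in SI)} \tag{2.7.49}$$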
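The SI form of Eq. (2.7.49) can be checked directly from tabulated constants. The following minimal sketch uses CODATA 2018 values typed in by hand (an assumption of this example, not data from the text) and prints $1/\alpha$.

```python
import math

e = 1.602176634e-19       # elementary charge, C (exact in the 2019 SI)
hbar = 1.054571817e-34    # reduced Planck constant, J s
c = 299_792_458.0         # speed of light, m/s (exact)
eps0 = 8.8541878128e-12   # vacuum permittivity, F/m (CODATA 2018)

# SI form of Eq. (2.7.49): alpha = e^2 / (4 pi eps0 hbar c)
alpha = e**2 / (4.0 * math.pi * eps0 * hbar * c)

print(f"alpha   = {alpha:.10f}")
print(f"1/alpha = {1.0 / alpha:.6f}")
```

The result, close to 1/137.036, carries no units, which is why the same number must emerge from the cgs-esu expression $e^2/\hbar c$.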
The values of the six constants $k_1$ through $k_6$ are listed in Table 2.6. The resultant equations are given in Table 2.7. There is some ambiguity in defining magnetic fields in the “Hartree” atomic units: We follow here the Gaussian convention, in which a plane electromagnetic wave has electric and magnetic fields of equal magnitudes in vacuum. In the alternate Hartree–Lorentz convention (not used here), the magnetic field is derived from the Lorentz force, and the magnetic induction $\mathbf{B}$ will contain $\alpha$ implicitly.

PROBLEM 2.7.1. Cyclotron problem: A particle with mass $m$, charge $q$, and velocity $\mathbf{v}$ injected into a magnetic field $\mathbf{B}$ normal to its velocity will feel a force $\mathbf{F}$ normal to both $\mathbf{v}$ and $\mathbf{B}$. Using Eq. (2.7.25), show that it will move in a circle of radius $r = mv/qB$. Lawrence$^{80}$ and Livingston$^{81}$ in the early 1930s accelerated elementary particles to high kinetic energies in an early particle accelerator named the cyclotron.
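For a feeling of the magnitudes involved in Problem 2.7.1, the sketch below evaluates the SI cyclotron frequency of Eq. (2.7.25) and the orbit radius $r = mv/qB$ for a proton. The field of 1 T and the speed of $10^6$ m s$^{-1}$ are assumed values chosen for illustration, and the function names are hypothetical.

```python
import math

def cyclotron_frequency(q, B, m):
    """Angular cyclotron frequency omega = qB/m (SI form of Eq. 2.7.25)."""
    return q * B / m

def orbit_radius(m, v, q, B):
    """Radius r = m v / (q B) of the circular orbit (SI)."""
    return m * v / (q * B)

q_p = 1.602176634e-19   # proton charge, C
m_p = 1.67262192e-27    # proton mass, kg
B = 1.0                 # magnetic field, T (assumed)
v = 1.0e6               # speed normal to B, m/s (assumed, nonrelativistic)

omega_c = cyclotron_frequency(q_p, B, m_p)
radius = orbit_radius(m_p, v, q_p, B)

print(f"omega = {omega_c:.4e} rad/s  (f = {omega_c / (2 * math.pi) / 1e6:.2f} MHz)")
print(f"r     = {radius * 100:.3f} cm")
```

Note that the frequency does not depend on $v$: faster particles travel on larger circles in the same time, which is what lets a fixed-frequency voltage accelerate them (as long as relativistic mass changes are negligible).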
79 Arnold Sommerfeld (1868–1951).
80 Ernest Orlando Lawrence (1901–1958).
81 Milton Stanley Livingston (1905–1986).
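Table 2.7 Maxwell’s and Constitutive Equations, and Lorentz Force in Various Unit Systems$^a$

| Unit System | Maxwell Equations | Constitutive Equations | Lorentz Force |
|---|---|---|---|
| cgs-esu (unrationalized) | $\nabla\cdot\mathbf{D} = 4\pi\rho$, $\nabla\cdot\mathbf{B} = 0$, $\nabla\times\mathbf{H} = 4\pi\mathbf{J} + \partial\mathbf{D}/\partial t$, $\nabla\times\mathbf{E} + \partial\mathbf{B}/\partial t = 0$ | $\mathbf{D} = \mathbf{E} + 4\pi\mathbf{P}$; $\mathbf{H} = c^2\mathbf{B} - 4\pi\mathbf{M}$ | $\mathbf{E} + \mathbf{v}\times\mathbf{B}$ |
| cgs-emu (unrationalized) | $\nabla\cdot\mathbf{D} = 4\pi\rho$, $\nabla\cdot\mathbf{B} = 0$, $\nabla\times\mathbf{H} = 4\pi\mathbf{J} + \partial\mathbf{D}/\partial t$, $\nabla\times\mathbf{E} + \partial\mathbf{B}/\partial t = 0$ | $\mathbf{D} = c^{-2}\mathbf{E} + 4\pi\mathbf{P}$; $\mathbf{H} = \mathbf{B} - 4\pi\mathbf{M}$ | $\mathbf{E} + \mathbf{v}\times\mathbf{B}$ |
| cgs-Gaussian (unrationalized) | $\nabla\cdot\mathbf{D} = 4\pi\rho$, $\nabla\cdot\mathbf{B} = 0$, $c\nabla\times\mathbf{H} = 4\pi\mathbf{J} + \partial\mathbf{D}/\partial t$, $c\nabla\times\mathbf{E} + \partial\mathbf{B}/\partial t = 0$ | $\mathbf{D} = \mathbf{E} + 4\pi\mathbf{P}$; $\mathbf{H} = \mathbf{B} - 4\pi\mathbf{M}$ | $\mathbf{E} + c^{-1}\mathbf{v}\times\mathbf{B}$ |
| cgs-Heaviside–Lorentz (rationalized) | $\nabla\cdot\mathbf{D} = \rho$, $\nabla\cdot\mathbf{B} = 0$, $c\nabla\times\mathbf{H} = \mathbf{J} + \partial\mathbf{D}/\partial t$, $c\nabla\times\mathbf{E} + \partial\mathbf{B}/\partial t = 0$ | $\mathbf{D} = \mathbf{E} + \mathbf{P}$; $\mathbf{H} = \mathbf{B} - \mathbf{M}$ | $\mathbf{E} + c^{-1}\mathbf{v}\times\mathbf{B}$ |
| a.u. (Hartree) (unrationalized) | $\nabla\cdot\mathbf{D} = 4\pi\rho$, $\nabla\cdot\mathbf{B} = 0$, $\alpha^{-1}\nabla\times\mathbf{H} = 4\pi\mathbf{J} + \partial\mathbf{D}/\partial t$, $\nabla\times\mathbf{E} + \alpha\,\partial\mathbf{B}/\partial t = 0$ | $\mathbf{D} = \mathbf{E} + 4\pi\mathbf{P}$; $\mathbf{H} = c^2\mathbf{B} - 4\pi\mathbf{M}$ | $\mathbf{E} + \mathbf{v}\times\mathbf{B}$ |
| SI (and SI′) (rationalized) | $\nabla\cdot\mathbf{D} = \rho$, $\nabla\cdot\mathbf{B} = 0$, $\nabla\times\mathbf{H} = \mathbf{J} + \partial\mathbf{D}/\partial t$, $\nabla\times\mathbf{E} + \partial\mathbf{B}/\partial t = 0$ | $\mathbf{D} = \varepsilon_0\mathbf{E} + \mathbf{P}$; $\mathbf{H} = \mu_0^{-1}\mathbf{B} - \mathbf{M}$ | $\mathbf{E} + \mathbf{v}\times\mathbf{B}$ |

$^a$ For a.u., the unit of energy is the hartree, not the Rydberg, and the Gaussian convention is used: an electromagnetic wave in vacuum has E and H components of equal magnitude.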
PROBLEM 2.7.2. Another application of the Lorentz force is the selection of particles by a mass spectrometer. A dilute plasma of ions, each bearing a charge $q$ and mass $m$, is accelerated by an electric field to a uniform velocity $v$, then drifts into a uniform magnetic field $B$, in which ions of different mass describe circular orbits of different radius $r$. The mass/charge ratio is given by $m/q = rB/v$.

PROBLEM 2.7.3. Prove that Eq. (2.7.36) follows from Eq. (2.7.32).
PROBLEM 2.7.4. Derive the wave equation for the E field, Eq. (2.7.28), as follows:
(i) Consider a region with no net charges ($\rho = 0$) and no external sources of electromotive force, where $\mu$ and $\varepsilon$ are scalars that are isotropic (not dependent on coordinates) and also independent of time.
(ii) Operate with $\nabla\times$, that is, take the curl of both sides of Eq. (2.7.33), but replace $\mathbf{B}$ by $\mu\mu_0\mathbf{H}$.
(iii) Substitute Ampere’s law, Eq. (2.7.18), into the previous result, but replace $\mathbf{D}$ by $\varepsilon\varepsilon_0\mathbf{E}$.
(iv) Use the vector identity (2.4.36) $\nabla\times\nabla\times\mathbf{v} = \nabla(\nabla\cdot\mathbf{v}) - \nabla^2\mathbf{v}$, then remember from Gauss’s law, Eq. (2.7.16), that for $\rho = 0$, $\nabla\cdot\mathbf{D} = \varepsilon\varepsilon_0\nabla\cdot\mathbf{E} = 0$.
(v) Use Ohm’s law, Eq. (2.7.15), to get rid of $\mathbf{J}$, and obtain the general wave equation, valid in isotropic media or in free space.
(vi) Show that in an insulator ($\sigma = 0$) the third term drops out.
(vii) In conductors the term $\partial^2\mathbf{E}/\partial t^2$ is usually vanishingly small and can be neglected [15].
PROBLEM 2.7.5. Show that, both in a dielectric insulator and in a vacuum, a plane-wave electromagnetic field solution propagating along $x$, whose amplitude depends only on the coordinate $x$ and on the time $t$, can have no component along $x$, that is, show that it must be a transverse electric wave [13].

PROBLEM 2.7.6. For the situation of Problem 2.7.4, find the plane-wave solution.

PROBLEM 2.7.7. For the plane polarized wave propagating along $x$, as described in Problem 2.7.6, find the magnetic field.

PROBLEM 2.7.8. A photon of wavelength $\lambda = 500$ nm propagating in vacuum ($\varepsilon = 1$) has energy $W = hc/\lambda$. If it is equivalent to a wave packet constrained in a volume of 1 nm$^3$, then estimate its electric field (ignore its magnetic field).

PROBLEM 2.7.9. If the electric field $E$ of an electromagnetic wave in vacuum is $3 \times 10^4$ V/m, please estimate its magnetic field $H$.

PROBLEM 2.7.10. Let an electron of mass $m_e = 9.11 \times 10^{-31}$ kg and charge $e = 1.602 \times 10^{-19}$ C rotate in the first elliptical (quasi-circular) Bohr orbit (radius $= a_0 = 0.0529$ nm) around a proton ($M_p = 1.673 \times 10^{-27}$ kg) at a speed of $2.22 \times 10^6$ m s$^{-1}$. Compute (i) the gravitational energy, (ii) the electrical energy, and (iii) the magnetic energy of a system of two such electrons in coplanar circular orbits around protons 2 nm apart.

PROBLEM 2.7.11. Derive the two “telegraph equations,” Eqs. (2.7.34) and (2.7.36).

PROBLEM 2.7.12. From Eq. (2.7.34) derive the Helmholtz equation, Eq. (2.7.37) [14].

PROBLEM 2.7.13. Start from Maxwell’s equations (2.7.30)–(2.7.33), use the three constitutive equations (2.7.12), (2.7.13), and (2.7.15), and then assume $\rho = 0$ to get

$$\nabla\cdot\mathbf{E} = 0, \qquad \nabla\cdot\mathbf{H} = 0, \qquad \nabla\times\mathbf{E} + \mu_0\mu\,\partial\mathbf{H}/\partial t = 0, \qquad \nabla\times\mathbf{H} - \sigma\mathbf{E} - \varepsilon\varepsilon_0\,\partial\mathbf{E}/\partial t = 0$$

For plane waves $\mathbf{E}(\mathbf{r},t) = \mathbf{E}(\mathbf{r})\exp(i\omega t)$ and $\mathbf{H}(\mathbf{r},t) = \mathbf{H}(\mathbf{r})\exp(i\omega t)$:

$$\nabla\times\mathbf{E} + i\omega\mu_0\mu\mathbf{H} = 0, \qquad \nabla\times\mathbf{H} - \sigma\mathbf{E} - i\omega\varepsilon_0\varepsilon\mathbf{E} = 0$$

Derive the complex form of the Helmholtz equation, Eq. (2.7.40) [14].

PROBLEM 2.7.14. An electromagnetic wave of wavelength $\lambda = 589.3$ nm (and therefore angular frequency $\omega = 3.20 \times 10^{15}$ rad s$^{-1}$) penetrates into bulk Cu as $\exp(-\omega n\kappa z/c)$. Estimate the skin depth $\delta \equiv (2\mu^{-1}\sigma^{-1}\omega^{-1})^{1/2}$ at which the wave is attenuated to $(1/e)$ of its value at the surface. Use:

$$n^2\kappa^2 = (c^2/2)\left\{\left[\varepsilon_0^2\varepsilon^2\mu_0^2\mu^2 + (\mu\mu_0\sigma/\omega)^2\right]^{1/2} - \varepsilon_0\varepsilon\mu_0\mu\right\}$$

assuming $\mu = 1$, $\varepsilon = 1$, $\sigma = 5.8 \times 10^7$ S m$^{-1}$, $\omega = 3.20 \times 10^{15}$ rad s$^{-1}$, and $n = 0.62$. (Since $n < 1$, the phase velocity of the light wave in Cu will exceed the speed of light, but of course its group velocity will not, and cannot, do so) [14].

PROBLEM 2.7.15. The original version of Ampere’s law had been simply

$$\nabla\times\mathbf{B} = \mu_0\mathbf{J} \quad \text{(SI)} \tag{2.7.50}$$

But Maxwell realized that calculating the divergence ($\nabla\cdot$) of Eq. (2.7.50) would yield

$$\nabla\cdot\mu_0\mathbf{J} = 0 \quad \text{(SI)} \tag{2.7.51}$$

which would disagree with a continuity equation that must hold whenever the charge density varies with time:

$$\nabla\cdot\mu_0\mathbf{J} = -\mu_0\,\partial\rho/\partial t \quad \text{(SI)} \tag{2.7.52}$$
How did Maxwell resolve this dilemma?

PROBLEM 2.7.16. The transformation from spherical polar $(r, \theta, \varphi)$ to Cartesian $(x, y, z)$ coordinates is

$$x = r\sin\theta\cos\varphi, \qquad y = r\sin\theta\sin\varphi, \qquad z = r\cos\theta$$

where the ranges are symmetrical in Cartesian space, $(-\infty < x < \infty)$, $(-\infty < y < \infty)$, $(-\infty < z < \infty)$, but not “symmetrical” in spherical polar coordinates: $(0 \le r \le \infty)$, $(0 \le \text{colatitude } \theta \le \pi)$, $(0 \le \text{longitude } \varphi \le 2\pi)$; the reverse transformation is

$$r = +\left(x^2 + y^2 + z^2\right)^{1/2}, \qquad \varphi = \tan^{-1}(y/x), \qquad \theta = \cos^{-1}\left[z/\left(x^2 + y^2 + z^2\right)^{1/2}\right]$$

The Jacobian$^{82}$ of this transformation is $dV = dx\,dy\,dz = r^2\sin\theta\,dr\,d\theta\,d\varphi$. Now show that

$$\nabla^2 = \partial^2/\partial x^2 + \partial^2/\partial y^2 + \partial^2/\partial z^2 = \partial^2/\partial r^2 + (2/r)(\partial/\partial r) + (1/r^2)\,\partial^2/\partial\theta^2 + (\cos\theta/r^2\sin\theta)(\partial/\partial\theta) + (1/r^2\sin^2\theta)\,\partial^2/\partial\varphi^2 \tag{2.7.53}$$

(Every student should go through this pain only once in his/her lifetime).

82 Carl Gustav Jacob Jacobi (1804–1851).

Scalar and Vector Potentials. It is often useful to define an electrical (scalar) potential $\phi(\mathbf{r})$ and a magnetic (vector) potential $\mathbf{A}(\mathbf{r})$ such that
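$$\mathbf{E}(\mathbf{r}) = -\nabla\phi(\mathbf{r}) - \partial\mathbf{A}/\partial t \quad \text{(SI)}; \qquad \mathbf{E}(\mathbf{r}) = -\nabla\phi(\mathbf{r}) - (1/c)\,\partial\mathbf{A}/\partial t \quad \text{(cgs)} \tag{2.7.54}$$

$$\mathbf{B}(\mathbf{r}) = \nabla\times\mathbf{A}(\mathbf{r}) \quad \text{(SI)}; \qquad \mathbf{B}(\mathbf{r}) = \nabla\times\mathbf{A}(\mathbf{r}) \quad \text{(cgs)} \tag{2.7.55}$$

The fields $\mathbf{E}$ and $\mathbf{B}$ derived from these potentials are invariant under the “gauge transformations” (written here for the SI form):

$$\phi \to \phi - (\partial\psi/\partial t), \qquad \mathbf{A} \to \mathbf{A} + \nabla\psi$$

To remove arbitrariness, one can also select these two potentials to satisfy the Lorentz condition:

$$\nabla\cdot\mathbf{A} + \mu\varepsilon(\partial\phi/\partial t) + \mu\sigma\phi = 0 \tag{2.7.56}$$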
When $\phi$ and $\mathbf{A}$ are so selected, then one can obtain the two inhomogeneous wave equations:

$$\nabla^2\phi - \varepsilon\mu\,\partial^2\phi/\partial t^2 - \mu\sigma(\partial\phi/\partial t) = -\rho/\varepsilon \tag{2.7.57}$$

$$\nabla^2\mathbf{A} - \varepsilon\mu\,\partial^2\mathbf{A}/\partial t^2 - \mu\sigma(\partial\mathbf{A}/\partial t) = -\mu\mathbf{J} \tag{2.7.58}$$

which mathematically look identical, but involve charge density and current density as sources, respectively. These equations are related to Eqs. (2.7.34) and (2.7.36). If the scalar potential does not depend on time, then Eq. (2.7.57) becomes Poisson’s equation:

$$\nabla^2\phi = -\rho/\varepsilon \tag{2.7.59}$$

and if there is no source of charge, then Eq. (2.7.59) becomes Laplace’s equation:

$$\nabla^2\phi = 0 \tag{2.7.60}$$
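Laplace’s equation, Eq. (2.7.60), rarely has a closed-form solution for realistic boundaries, but it is simple to solve numerically. The sketch below is one standard approach (Gauss–Seidel relaxation on a square grid with a five-point stencil), not a method taken from the text; the grid size, tolerance, boundary values, and function name are all assumptions made for illustration. One edge of the square is held at potential 1 and the other three at 0.

```python
def solve_laplace(n=21, tol=1e-8, max_sweeps=20000):
    """Relax the 2D Laplace equation on an n x n grid.

    Boundary condition (assumed for this example): the edge i = 0 is held
    at potential 1, the other three edges at 0. Each interior point is
    repeatedly replaced by the average of its four neighbours, which is
    the finite-difference form of Laplace's equation.
    """
    phi = [[0.0] * n for _ in range(n)]
    for j in range(n):
        phi[0][j] = 1.0
    for _ in range(max_sweeps):
        max_change = 0.0
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                new = 0.25 * (phi[i - 1][j] + phi[i + 1][j]
                              + phi[i][j - 1] + phi[i][j + 1])
                max_change = max(max_change, abs(new - phi[i][j]))
                phi[i][j] = new
        if max_change < tol:   # converged: a full sweep changed nothing noticeable
            break
    return phi

phi = solve_laplace()
center = phi[10][10]
print(f"potential at the centre of the square: {center:.6f}")
```

By symmetry and superposition the potential at the centre must be one quarter of the value on the charged edge (four such solutions, each rotated by 90°, add up to a uniform potential of 1), which gives a convenient check on the numerical result.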
Multipoles. One can express [15] the multipole expansion of the energy of a charge distribution in an external field by defining the free energy $G$ of a localized charge distribution $\rho(\mathbf{r})$, which is placed in an external potential $\phi(\mathbf{r})$ (which has no charge distribution associated with it):

$$G \equiv \int \rho(\mathbf{r})\phi(\mathbf{r})\,dv(\mathbf{r}) \tag{2.7.61}$$

Free energy will be defined in Section 4.6. If $\phi(\mathbf{r})$ varies slowly enough in the region where $\rho(\mathbf{r})$ is significant, then one can use a Maclaurin expansion of $\phi(\mathbf{r})$ around some suitably chosen origin $\mathbf{r} = 0$:

$$\phi(\mathbf{r}) = \phi(0) + \mathbf{r}\cdot\nabla\phi(0) + (1/2)\sum_i\sum_j x_ix_j\,\partial^2\phi(0)/\partial x_i\partial x_j + \cdots \tag{2.7.62}$$

where the summations extend over the three Cartesian components ($x_i$, $i = 1, 2, 3$) of $\mathbf{r}$. By using Eq. (2.7.54) in the form $\mathbf{E}(\mathbf{r}) = -\nabla\phi(\mathbf{r})$, and using Eq. (2.7.30) in the simplified form $\nabla\cdot\mathbf{E} = 0$ [to justify adding a mathematically convenient term $(1/6)r^2\nabla\cdot\mathbf{E}(0)$, since the trace of a quadrupole moment cannot be measured experimentally], one obtains

$$\phi(\mathbf{r}) = \phi(0) - \mathbf{r}\cdot\mathbf{E}(0) - (1/6)\sum_i\sum_j\left(3x_ix_j - r^2\delta_{ij}\right)\partial E_j(0)/\partial x_i + \cdots \tag{2.7.63}$$

where $\delta_{ij}$ is the Kronecker delta:

$$\delta_{ij} = \begin{cases} 0 & \text{if } i \neq j \\ 1 & \text{if } i = j \end{cases} \tag{2.7.64}$$

Thus finally, the free energy contributions to $G$ show explicitly that the electrostatic charge $q$ interacts with the potential, the electric dipole moment vector $\boldsymbol{\mu}$ (Fig. 2.10) interacts with the external electric field $\mathbf{E}$, the traceless electric quadrupole moment $Q_{ij}$ interacts with the external field gradient, and so on:

$$G = q\phi(0) - \boldsymbol{\mu}\cdot\mathbf{E}(0) - (1/6)\sum_i\sum_j Q_{ij}\,\partial E_j(0)/\partial x_i + \cdots \tag{2.7.65}$$

where

$$q \equiv \int \rho(\mathbf{r})\,dv(\mathbf{r}) \tag{2.7.66}$$

$$\boldsymbol{\mu} \equiv \int \rho(\mathbf{r})\,\mathbf{r}\,dv(\mathbf{r}) \tag{2.7.67}$$

$$Q_{ij} \equiv \int \rho(\mathbf{r})\left(3x_ix_j - r^2\delta_{ij}\right)dv(\mathbf{r}) \tag{2.7.68}$$
FIGURE 2.10 The dipole moment $\mu$ of water, H$_2$O, in the gas phase ($\mu$ = 1.85 debyes = $6.17 \times 10^{-30}$ C m), with partial positive charges ($0.5\delta^+$) conceptually localized on the hydrogen atoms and a partial negative charge ($\delta^-$) localized on the oxygen atom. The chemists and physicists disagree about the direction of the static electric dipole moment for this or any other molecule: The physicists run the positive dipole moment vector direction from the locus of partial negative charge (tail) to the locus of positive charge (head); unfortunately, the chemists follow Pauling’s$^{83}$ convention in The Nature of the Chemical Bond and do the reverse!
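The two values of the water dipole moment quoted with Fig. 2.10 are related by a unit conversion that is worth checking once. The minimal sketch below uses the definition 1 debye = $10^{-18}$ statC cm, which in SI becomes $10^{-21}/c$ C m with $c$ in m s$^{-1}$; the variable names are illustrative.

```python
c = 299_792_458.0             # speed of light, m/s

# 1 debye = 1e-18 statC cm; 1 statC = 1/(10 c) C and 1 cm = 1e-2 m,
# so 1 debye = 1e-21 / c coulomb metre.
DEBYE_IN_C_M = 1.0e-21 / c

mu_water_debye = 1.85         # gas-phase value quoted with Fig. 2.10
mu_water_si = mu_water_debye * DEBYE_IN_C_M

print(f"1 D        = {DEBYE_IN_C_M:.5e} C m")
print(f"mu(H2O)    = {mu_water_si:.3e} C m")
```

The result reproduces the $6.17 \times 10^{-30}$ C m given in the figure.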
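Electric Dipole Moments. In many cases the (di)electric polarization $\mathbf{P}$ is proportional to the electric field strength $\mathbf{E}$. The relation between the electric displacement $\mathbf{D}$ and the electric field strength $\mathbf{E}$ is given by

$$\mathbf{D} = \varepsilon_0\boldsymbol{\varepsilon}\mathbf{E} = \varepsilon_0\mathbf{E} + \varepsilon_0\boldsymbol{\chi}_{\mathrm{SI}}^{(1)}\mathbf{E} = \varepsilon_0\mathbf{E} + \mathbf{P} \quad \text{(SI)}$$

$$\mathbf{D} = \varepsilon_0\boldsymbol{\varepsilon}\mathbf{E} = \varepsilon_0\mathbf{E} + \boldsymbol{\chi}_{\mathrm{SI'}}^{(1)}\mathbf{E} = \varepsilon_0\mathbf{E} + \mathbf{P} \quad \text{(SI′)}$$

$$\mathbf{D} = \boldsymbol{\varepsilon}\mathbf{E} = \mathbf{E} + 4\pi\boldsymbol{\chi}_{\mathrm{esu}}^{(1)}\mathbf{E} = \mathbf{E} + 4\pi\mathbf{P} \quad \text{(cgs-esu)} \tag{2.7.12}$$

If $\varepsilon$ and $\chi$ are scalars, then the (di)electric polarization $\mathbf{P}$ is given by

$$\mathbf{P} = \varepsilon_0(\varepsilon - 1)\mathbf{E} = \varepsilon_0\chi_{\mathrm{SI}}^{(1)}\mathbf{E} \quad \text{(SI)}$$

$$\mathbf{P} = \varepsilon_0(\varepsilon - 1)\mathbf{E} = \chi_{\mathrm{SI'}}^{(1)}\mathbf{E} \quad \text{(SI′)}$$

$$\mathbf{P} = [(\varepsilon - 1)/4\pi]\mathbf{E} = \chi_{\mathrm{esu}}^{(1)}\mathbf{E} \quad \text{(cgs-esu)} \tag{2.7.69}$$

This polarization has two components: the induced polarization $\mathbf{P}_\alpha$ (due to movement of the centers of charge, or to the static electric dipole polarizability $\alpha$ of molecules) and the dipole polarization $\mathbf{P}_\mu$ (due to the orientation of the permanent dipoles $\boldsymbol{\mu}$ in the applied electric field $\mathbf{E}$):

$$\mathbf{P} = \mathbf{P}_\alpha + \mathbf{P}_\mu \tag{2.7.70}$$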
This second effect will be computed first. A molecule of permanent electric dipole moment $\boldsymbol{\mu}$ (C m), when put into an external electric field $\mathbf{E}$ (V m$^{-1}$), assumes an angle $\theta$ with respect to $\mathbf{E}$; its energy $\Delta U$ (J) is then

$$\Delta U = -\boldsymbol{\mu}\cdot\mathbf{E} = -\mu E\cos\theta \tag{2.7.71}$$
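It helps to see how small the orientation energy of Eq. (2.7.71) usually is compared with thermal energy. The sketch below evaluates $\Delta U$ for the water dipole of Fig. 2.10 in an assumed field of $10^7$ V m$^{-1}$ and compares it with $k_BT$ at an assumed temperature of 298.15 K; the function name and both numerical choices are illustrative.

```python
import math

def dipole_energy(mu, E, theta):
    """Energy (J) of a dipole mu (C m) at angle theta (rad) to a field E (V/m), Eq. (2.7.71)."""
    return -mu * E * math.cos(theta)

mu_water = 6.17e-30     # C m, gas-phase water (Fig. 2.10)
E_field = 1.0e7         # V/m, an assumed, fairly strong laboratory field
k_B = 1.380649e-23      # Boltzmann constant, J/K
T = 298.15              # K, assumed temperature

U_aligned = dipole_energy(mu_water, E_field, 0.0)      # most stable orientation
U_opposed = dipole_energy(mu_water, E_field, math.pi)  # least stable orientation
ratio = abs(U_aligned) / (k_B * T)

print(f"U(aligned) = {U_aligned:.3e} J")
print(f"U(opposed) = {U_opposed:.3e} J")
print(f"mu*E / kT  = {ratio:.4f}")
```

With these numbers $\mu E/k_BT$ is only about 0.015, so thermal motion almost completely randomizes the dipole orientations and only a small net alignment survives.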
In intense electric fields $\mathbf{E}$, the electric dipole moment vector $\boldsymbol{\mu}$ is no longer a constant, but acquires field-dependent higher-order contributions:

$$\mu_i(\mathbf{E}) = \mu_{0i} + \sum_k \alpha_{ik}E_k + \sum_k\sum_l \beta_{ikl}E_kE_l + \sum_k\sum_l\sum_m \gamma_{iklm}E_kE_lE_m + \cdots \tag{2.7.72}$$

where $\boldsymbol{\alpha}$ is the rank-2 electric polarizability tensor with 9 elements, $\boldsymbol{\beta}$ is the rank-3 first hyperpolarizability tensor with 3 × 3 × 3 = 27 elements, and $\boldsymbol{\gamma}$ is the rank-4 second hyperpolarizability tensor, with 3 × 3 × 3 × 3 = 81 elements. The next tensor $\boldsymbol{\delta}$ has not yet been measured. The indices $k$, $l$, and $m$ range over values 1, 2, 3, corresponding to $x$, $y$, and $z$ coordinates. In abbreviated notation, Eq. (2.7.72) becomes

$$\boldsymbol{\mu} = \boldsymbol{\mu}_0 + \boldsymbol{\alpha}\mathbf{E} + \boldsymbol{\beta}\mathbf{E}\mathbf{E} + \boldsymbol{\gamma}\mathbf{E}\mathbf{E}\mathbf{E} + \cdots \tag{2.7.73}$$

83 Linus Carl Pauling (1901–1994).
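The macroscopic version of this expansion in powers of $\mathbf{E}$ involves the electric susceptibility $\chi$:

$$\chi_{ij}(\mathbf{E}) = \chi^{0}_{ij} + \sum_k \chi^{(1)}_{ijk}E_k + \sum_k\sum_l \chi^{(2)}_{ijkl}E_kE_l + \sum_k\sum_l\sum_m \chi^{(3)}_{ijklm}E_kE_lE_m + \cdots \tag{2.7.74}$$

where $\boldsymbol{\chi}$ is the 3 × 3 electrical susceptibility tensor and $\boldsymbol{\chi}^{(n)}$ is the $n$th-order contribution to it. One talks about tensor elements $\chi^{(2)}_{1213}$ or, equivalently, about $\chi^{(2)}_{xyxz}$. Since the susceptibility $\boldsymbol{\chi}$ is now a tensor, the electric (or dielectric) polarization (or polarization density) $\mathbf{P}$ becomes, in a formal extension of Eq. (2.7.10),

$$\mathbf{P} = \varepsilon_0\boldsymbol{\chi}\mathbf{E} \quad \text{(SI)}; \qquad \mathbf{P} = \boldsymbol{\chi}\mathbf{E} \quad \text{(cgs-esu)}; \qquad \mathbf{P} = \boldsymbol{\chi}\mathbf{E} \quad \text{(SI′)} \tag{2.7.75}$$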
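The polarizabilities are calculated from the electrical field dependence of either the total molecular energy or the dipole moment; the $\chi^{(2)}_{ijkl}$ in crystals obey the laws of crystal symmetry and are measured using powerful laser sources. When materials have nonlinear optical effects, then Eq. (2.7.69) must be modified by writing $\mathbf{P}$ as a power series in the electric field $\mathbf{E}$:

$$\mathbf{P} = \boldsymbol{\chi}_{\mathrm{esu}}^{(0)} + \boldsymbol{\chi}_{\mathrm{esu}}^{(1)}\mathbf{E} + \boldsymbol{\chi}_{\mathrm{esu}}^{(2)}\mathbf{E}\mathbf{E} + \boldsymbol{\chi}_{\mathrm{esu}}^{(3)}\mathbf{E}\mathbf{E}\mathbf{E} + \cdots \quad \text{(esu)}$$

$$\mathbf{P} = \varepsilon_0\left[\boldsymbol{\chi}_{\mathrm{SI}}^{(0)} + \boldsymbol{\chi}_{\mathrm{SI}}^{(1)}\mathbf{E} + \boldsymbol{\chi}_{\mathrm{SI}}^{(2)}\mathbf{E}\mathbf{E} + \boldsymbol{\chi}_{\mathrm{SI}}^{(3)}\mathbf{E}\mathbf{E}\mathbf{E} + \cdots\right] \quad \text{(SI)}$$

$$\mathbf{P} = \boldsymbol{\chi}_{\mathrm{SI'}}^{(0)} + \boldsymbol{\chi}_{\mathrm{SI'}}^{(1)}\mathbf{E} + \boldsymbol{\chi}_{\mathrm{SI'}}^{(2)}\mathbf{E}\mathbf{E} + \boldsymbol{\chi}_{\mathrm{SI'}}^{(3)}\mathbf{E}\mathbf{E}\mathbf{E} + \cdots \quad \text{(SI′)} \tag{2.7.76}$$

where the units are (i) dimensionless for $\chi_{\mathrm{esu}}^{(1)}$, cm$^2$ statC$^{-1}$ for $\chi_{\mathrm{esu}}^{(2)}$, and cm$^4$ statC$^{-2}$ for $\chi_{\mathrm{esu}}^{(3)}$; (ii) dimensionless for $\chi_{\mathrm{SI}}^{(1)}$, m V$^{-1}$ (in practice, a few pm V$^{-1}$) for $\chi_{\mathrm{SI}}^{(2)}$, and m$^2$ V$^{-2}$ for $\chi_{\mathrm{SI}}^{(3)}$; (iii) C$^2$ J$^{-1}$ m$^{-1}$ for $\chi_{\mathrm{SI'}}^{(1)}$, C$^3$ J$^{-2}$ for $\chi_{\mathrm{SI'}}^{(2)}$, and C$^4$ m J$^{-3}$ for $\chi_{\mathrm{SI'}}^{(3)}$. $\boldsymbol{\chi}^{(1)}$ is a rank-2 tensor with 9 terms; $\boldsymbol{\chi}^{(2)}$ is a rank-3 tensor with 27 terms; $\boldsymbol{\chi}^{(3)}$ is a rank-4 tensor, with 81 terms.

If the molecule or the solid has a center of inversion symmetry, then the permanent dipole moment $\boldsymbol{\mu}_0$ vanishes, as do the even-order molecular tensors $\boldsymbol{\beta}$ and $\boldsymbol{\delta}$ and the even-order susceptibilities $\boldsymbol{\chi}^{(2)}$, $\boldsymbol{\chi}^{(4)}$, and $\boldsymbol{\chi}^{(6)}$. All matter, with or without a center of inversion symmetry, has nonzero values for the odd-order molecular tensors $\boldsymbol{\alpha}$ and $\boldsymbol{\gamma}$, and all odd-order susceptibilities $\boldsymbol{\chi}^{(1)}$, $\boldsymbol{\chi}^{(3)}$, $\boldsymbol{\chi}^{(5)}$, and so on. If the crystal has symmetry, then the number of unique tensor components is vastly reduced. The components have values that depend strongly on the frequency of the electromagnetic radiation used to probe them. A practical application of nonlinear optics is frequency-doubling of the high-powered near-infrared Nd-YAG laser light from its most intense wavelength 1.06 μm to the frequency-doubled wavelength 530 nm, typically using expensive lithium niobate crystals. To confuse theoreticians, the experimentalists use half-sized $d$ tensor components:

$$2d_{ij} \equiv \chi_{ij}^{(2)} \tag{2.7.77}$$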
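Finally, one can define an electric displacement vector $\mathbf{D}(\omega)$, along with three tensors, for the optical dielectric constant $\boldsymbol{\varepsilon}(\omega)$, the effective optical susceptibility $\boldsymbol{\chi}_{\mathrm{eff}}(\omega)$, and the optical index of refraction $\mathbf{n}(\omega)$:

$$\mathbf{D}(\omega) = \varepsilon_0\mathbf{E}(\omega) + \mathbf{P}(\omega) = \varepsilon_0\boldsymbol{\varepsilon}(\omega)\mathbf{E}(\omega) \quad \text{(SI)}; \qquad \mathbf{D}(\omega) = \mathbf{E}(\omega) + 4\pi\mathbf{P}(\omega) = \boldsymbol{\varepsilon}(\omega)\mathbf{E}(\omega) \quad \text{(cgs)} \tag{2.7.78}$$

$$\mathbf{P}(\omega) = \varepsilon_0\boldsymbol{\chi}_{\mathrm{eff}}(\omega)\mathbf{E}(\omega) \quad \text{(SI)}; \qquad \mathbf{P}(\omega) = \boldsymbol{\chi}_{\mathrm{eff}}(\omega)\mathbf{E}(\omega) \quad \text{(cgs)} \tag{2.7.79}$$

$$\mathbf{n}(\omega)\mathbf{n}(\omega) = \boldsymbol{\varepsilon}(\omega) = 1 + \boldsymbol{\chi}_{\mathrm{eff}}(\omega) \quad \text{(SI)}; \qquad \mathbf{n}(\omega)\mathbf{n}(\omega) = \boldsymbol{\varepsilon}(\omega) = 1 + 4\pi\boldsymbol{\chi}_{\mathrm{eff}}(\omega) \quad \text{(cgs)} \tag{2.7.80}$$

including tensor effects:

$$\boldsymbol{\chi}_{\mathrm{eff}} = \boldsymbol{\chi}^{(1)} + \boldsymbol{\chi}^{(2)}\cdot\mathbf{E} + \boldsymbol{\chi}^{(3)}:\mathbf{E}\mathbf{E} + \boldsymbol{\chi}^{(4)}\,\vdots\,\mathbf{E}\mathbf{E}\mathbf{E} + \cdots \tag{2.7.81}$$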
If a sample is subjected to the sum of two electrical fields, a direct current (DC), i.e., frequency-independent, component $E_{DC}$ and a frequency-dependent “optical field” $E(\omega)$,

$$E = E_{DC} + E(\omega) = E_{DC} + E_0\cos(\omega t - kz) \tag{2.7.82}$$

then, ignoring tensor effects, the polarization becomes

$$P = \chi^{(1)}[E_{DC} + E_0\cos(\omega t - kz)] + \chi^{(2)}[E_{DC} + E_0\cos(\omega t - kz)]^2 + \chi^{(3)}[E_{DC} + E_0\cos(\omega t - kz)]^3 + \cdots \tag{2.7.83}$$

Keeping only the expansion terms in $\omega$, and using the trigonometric identity $\cos(3\omega) = 4\cos^3\omega - 3\cos\omega$, we get

$$P(\omega) = \chi^{(1)}E_0\cos(\omega t - kz) + 2\chi^{(2)}E_{DC}E_0\cos(\omega t - kz) + 3\chi^{(3)}E_{DC}^2E_0\cos(\omega t - kz) + (3/4)\chi^{(3)}E_0^3\cos(\omega t - kz) + \cdots \equiv \chi_{\mathrm{eff}}E_0\cos(\omega t - kz) \tag{2.7.84}$$

The nonlinear effective refractive index $n(\omega)$ can be defined by

$$n(\omega)^2 \equiv 1 + 4\pi\chi_{\mathrm{eff}} \quad \text{(cgs)} \tag{2.7.85}$$

which then gives

$$n(\omega)^2 = 1 + 4\pi\left[\chi^{(1)} + 2\chi^{(2)}E_{DC} + 3\chi^{(3)}E_{DC}^2 + (3/4)\chi^{(3)}E_0^2 + \cdots\right] \quad \text{(cgs)} \tag{2.7.86}$$
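The coefficients in Eq. (2.7.84) can be verified numerically without any algebra: insert the field of Eq. (2.7.82) into the scalar power series of Eq. (2.7.83), truncated after the cubic term, and project the resulting polarization onto $\cos(\omega t - kz)$ over one period. The sketch below does this with arbitrary, purely illustrative values of the susceptibilities and field amplitudes; the function name is hypothetical.

```python
import math

def fundamental_amplitude(chi1, chi2, chi3, E_dc, E0, n=2048):
    """Amplitude of the cos(theta) component of P(theta), theta = omega*t - k*z.

    P is the scalar series of Eq. (2.7.83) truncated after chi3. The discrete
    Fourier projection is exact here because P is a trigonometric polynomial
    of degree 3, far below n.
    """
    total = 0.0
    for m in range(n):
        theta = 2.0 * math.pi * m / n
        E = E_dc + E0 * math.cos(theta)
        P = chi1 * E + chi2 * E**2 + chi3 * E**3
        total += P * math.cos(theta)
    return 2.0 * total / n

# Arbitrary illustrative values (not physical data)
chi1, chi2, chi3 = 0.5, 0.02, 0.003
E_dc, E0 = 2.0, 1.5

numeric = fundamental_amplitude(chi1, chi2, chi3, E_dc, E0)
# chi_eff * E0 according to Eq. (2.7.84)
analytic = (chi1 + 2 * chi2 * E_dc + 3 * chi3 * E_dc**2 + 0.75 * chi3 * E0**2) * E0

print(f"numerical projection : {numeric:.10f}")
print(f"Eq. (2.7.84)         : {analytic:.10f}")
```

The two numbers agree to rounding error, confirming the factors 2, 3, and 3/4; the terms oscillating at $2\omega$ and $3\omega$, and the constant terms, are discarded by the projection just as they are in the text.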
If the linear refractive index is $n_0(\omega)$, and if one also defines

$$n_0(\omega)^2 \equiv 1 + 4\pi\chi^{(1)} \tag{2.7.87}$$

then one gets

$$n(\omega)^2 = n_0(\omega)^2\left[1 + 8\pi\chi^{(2)}E_{DC}\,n_0(\omega)^{-2} + 12\pi\chi^{(3)}E_{DC}^2\,n_0(\omega)^{-2} + 3\pi\chi^{(3)}E_0^2\,n_0(\omega)^{-2}\right] \tag{2.7.88}$$

which, after taking square roots and using the series expansion $[1 + x]^{1/2} \approx 1 + x/2$, valid for small $x$, yields

$$n(\omega) \approx n_0(\omega) + 4\pi\chi^{(2)}E_{DC}\,n_0(\omega)^{-1} + 6\pi\chi^{(3)}E_{DC}^2\,n_0(\omega)^{-1} + 1.5\pi\chi^{(3)}E_0^2\,n_0(\omega)^{-1} \tag{2.7.89}$$

It is useful to define the light intensity $I(\omega)$ as

$$I(\omega) \equiv cn_0(\omega)E_0^2/8\pi \quad \text{(cgs)} \tag{2.7.90}$$

where $c$ is again the speed of light, so that the nonlinear index of refraction, or refractive index, at the frequency $\omega$ can be rewritten as

$$n(\omega) = n_0(\omega) + n_1E_{DC} + n_2(0)E_{DC}^2 + n_2(\omega)I(\omega) \quad \text{(cgs)} \tag{2.7.91}$$

where the linear electrooptic (or Pockels)$^{84}$ effect is given by

$$n_1 \equiv 4\pi\chi^{(2)}/n_0(\omega) \quad \text{(cgs)} \tag{2.7.92}$$

while the quadratic electrooptic effect is given by

$$n_2(0) \equiv 6\pi\chi^{(3)}/n_0(\omega) \quad \text{(cgs)} \tag{2.7.93}$$

and the optical Kerr$^{85}$ effect is given by

$$n_2(\omega) = 12\pi^2\chi^{(3)}/cn_0(\omega)^2 \quad \text{(cgs)} \tag{2.7.94}$$
Note that both n1 and n2(0) depend on the DC electric field, but not on the frequency-dependent light intensity I(o), while n2(o) depends on the intensity of the AC electric field at the frequency o—that is, on the intensity of the input light I(o). Similar theoretical expressions can be written for the magnetic susceptibility, but the nonlinear effects are much milder.
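The factor $12\pi^2$ in Eq. (2.7.94) follows from requiring that $n_2(\omega)I(\omega)$ reproduce the last term of Eq. (2.7.89) when the intensity of Eq. (2.7.90) is inserted. The sketch below checks this consistency numerically in cgs units; the sample values of $n_0$, $\chi^{(3)}$, and $E_0$ are arbitrary assumptions, and the function names are hypothetical.

```python
import math

def intensity(n0, E0, c):
    """Light intensity I = c n0 E0^2 / (8 pi), Eq. (2.7.90), cgs."""
    return c * n0 * E0**2 / (8.0 * math.pi)

def kerr_coefficient(chi3, n0, c):
    """Optical Kerr coefficient n2(omega) = 12 pi^2 chi3 / (c n0^2), Eq. (2.7.94), cgs."""
    return 12.0 * math.pi**2 * chi3 / (c * n0**2)

c = 2.99792458e10   # speed of light, cm/s
n0 = 1.5            # linear refractive index (assumed)
chi3 = 1.0e-14      # third-order susceptibility in esu (assumed, illustrative)
E0 = 10.0           # optical field amplitude, statV/cm (assumed)

term_from_2_7_89 = 1.5 * math.pi * chi3 * E0**2 / n0
term_from_2_7_94 = kerr_coefficient(chi3, n0, c) * intensity(n0, E0, c)

print(f"1.5 pi chi3 E0^2 / n0 = {term_from_2_7_89:.6e}")
print(f"n2(omega) * I(omega)  = {term_from_2_7_94:.6e}")
```

The two expressions coincide, since $(12\pi^2\chi^{(3)}/cn_0^2)(cn_0E_0^2/8\pi) = 1.5\pi\chi^{(3)}E_0^2/n_0$.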
84 Friedrich Carl Alwin Pockels (1865–1913).
85 John Kerr (1824–1907).
2.8 WEAK FORCES

The decay of a free neutron ${}_0n^1$ into a proton ${}_1p^1$, an electron $e^-$, and an electron antineutrino $\bar{\nu}_e$:

$${}_0n^1 \to {}_1p^1 + e^- + \bar{\nu}_e \tag{2.8.1}$$

(with a half-life of about 10 minutes) is an example of beta decay, which is governed by the weak force, a force $10^3$ times weaker than the electromagnetic force (see Table 2.2). The particles carrying this interaction, the W and Z vector bosons, are surprisingly massive (see Table 2.2). A theory unifying electromagnetic and weak forces, the electroweak theory, has been developed. Neutrinos (name given by Fermi) carry little mass and occur with great abundance in the universe, but are very difficult to detect, because they often pass through the mass of the earth without deflection or detection. Huge tanks of liquid chlorinated hydrocarbons in deep mines have been successfully tested for neutrino-induced formation of a few atoms of radioactive argon.

Many decays of artificial radio-isotopes occur by beta decay. A natural example occurs in the upper atmosphere, where ${}_6\mathrm{C}^{14}$ is generated by cosmic rays (neutron bombardment of ${}_7\mathrm{N}^{14}$) to yield a constant fraction ($1.3 \times 10^{-12}$) of radioactive carbon in living matter by photosynthetic absorption of ${}_6\mathrm{C}^{14}\mathrm{O}_2$; it decays by beta decay, with a half-life of 5730 years, into stable ${}_7\mathrm{N}^{14}$; this is the basis of Libby’s$^{86}$ archeological carbon-14 dating method.
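The arithmetic behind carbon-14 dating is first-order decay with the 5730-year half-life quoted above. The sketch below is a minimal illustration; the function names are hypothetical, and it assumes that the sample started with the same carbon-14 fraction as living matter and has exchanged no carbon since.

```python
import math

T_HALF_C14 = 5730.0   # half-life of carbon-14 in years (value quoted in the text)

def remaining_fraction(age_years):
    """Fraction of the original carbon-14 left after age_years."""
    return 0.5 ** (age_years / T_HALF_C14)

def age_from_fraction(fraction):
    """Age in years of a sample that retains the given fraction of its carbon-14."""
    return -T_HALF_C14 * math.log(fraction) / math.log(2.0)

print(f"fraction left after 5730 y : {remaining_fraction(5730.0):.3f}")
print(f"age for 25% remaining      : {age_from_fraction(0.25):.0f} y")
print(f"age for 10% remaining      : {age_from_fraction(0.10):.0f} y")
```

A sample retaining one quarter of its carbon-14 is two half-lives old, 11,460 years; beyond roughly ten half-lives too little activity remains for the method to be practical.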
2.9 STRONG FORCES AND NUCLEAR STRUCTURE: ISOTOPES
Nucleons are in a dense soup, which can be thought of as consisting of so many protons and neutrons; in practice, the nuclear densities, $10^{14}$–$10^{15}$ g cm$^{-3}$ = $10^{17}$–$10^{18}$ kg m$^{-3}$, are so great that the particles may be in intimate contact with each other. The strong force was thought to be mediated by pions, but may also be mediated by gluons, as it must be for single hadrons. Various calculations (nuclear “shell model”) that can make limited predictions of nuclear stability depend on nuclear spin I and nuclear angular momentum quantum numbers, but no distance-dependent potentials have emerged. The nuclei are less massive than the sum of their constituent protons and neutrons; this mass defect, or nuclear binding energy, reckoned per nucleon, increases with atomic number up to Fe. This binding energy is colossal, and it can be partially released in fission and fusion reactions (A-bombs, H-bombs, nuclear power plants). The synthesis of elements from H to Fe in stars can be explained as energy-efficient processes. The nucleosynthesis of elements past Fe cannot fuel stars, because the mass defect per nucleon becomes ever smaller. The nuclear stability seems to approach a ratio of about 1.5 neutrons per proton for the heavy elements. Nuclei larger than U are increasingly difficult to make and have ever shorter half-lives. Early predictions of “islands of stability” around element number 120 have proven to be wishful thinking.
86 Willard Frank Libby (1908–1980).
Most elements have several isotopes, some found in nature (natural radioactivity, restricted to the chemical elements above Pb), but most made artificially.
2.10 LACK OF DISTANCE-DEPENDENT POTENTIALS FOR STRONG AND WEAK FORCES
All of eighteenth- and nineteenth-century mathematical physics was based on continua, on the solution of second-order partial differential equations, and on microscopic extensions of macroscopic Newtonian ideas of distance-dependent potentials. Quantum mechanics (in its wave-mechanical formulation), classical mechanics, and electrodynamics all have potential energy functions U(r) which are some function of the interparticle distance r. This works well if the particles are much smaller than the distances that typically separate them, and if experiments can test the distance dependence of the potentials directly. This technique becomes problematic when the particles touch, as do, for example, the constituents of atomic nuclei. Already, spin forced us to consider quantization without potentials. Many other strange quantum numbers have been posited, with no help from continuum mathematics. Perturbation expansions become unreliable, since the interaction is no longer small compared with some overriding field. Nucleon–nucleon potentials are discussed in terms of pion exchange, and may also be discussed in terms of quark–gluon interactions. However, quarks seem to be trapped in deep potential wells, so that they cannot be torn apart easily. When the particles are too close, or the potential wells are too deep, then the old tricks do not seem to apply. What to do? The so-called “standard model” allows one to understand elementary particle classification in terms of quarks and gluons; the strong force is described by quantum chromodynamics, and the weak and electromagnetic forces have been united in the electroweak theory. Only gravitational forces (with their not-yet-detected graviton) seem to escape from a unified theory. However, masses cannot be predicted very well. It is posited that a Higgs field and a Higgs boson (not yet seen) may explain mass.
String theory posits 10 or 11 space–time dimensions (26 in its original bosonic form) instead of the standard four of special relativity (x, y, z, and ct): Is string theory useful? Is it necessary?
2.11 THE SIZE OF FUNDAMENTAL PARTICLES
How big is an electron? The “classical radius” of an electron (or Lorentz radius, or Thomson scattering length), $r_{cl} = e^2 m_e^{-1} c^{-2} = 2.818 \times 10^{-15}$ m, originates from equating the Coulomb potential energy to the Einstein rest-mass energy: $e^2/r_{cl} = m_e c^2$ (Problem 2.11.1). Another measure of the electron size is its Compton$^{87}$ wavelength: $\lambda_C = h/m_e c = 2.426 \times 10^{-12}$ m, that is, the length below which particle creation and annihilation (and therefore quantum field theory) come into play: If a photon of energy in excess of $m_e c^2$ is used to “find” the electron, then this photon may itself become a new electron, thus mooting the question of where the electron had been! More precisely, as discussed in Section 3.1, the position uncertainty $\Delta x$ and the momentum uncertainty $\Delta p_x$ are linked by the Heisenberg uncertainty principle: $\Delta x\,\Delta p_x \geq \hbar/2$; if we use for $\Delta p_x$ the relativistic momentum of the electron, $p = m_e c$, then the uncertainty in its position is given by $\Delta x \geq (\hbar/2)/m_e c = \lambda_C/4\pi$. Electron–electron scattering provides an estimated collision distance between two electrons of about $10^{-16}$ m (much smaller than that of an atom, $10^{-10}$ m, or even of a nucleus, $10^{-14}$ m). Of course, electrons seem to be “point particles” with negligible radius: Is this correct? Scattering experiments, often using electrons as “bullets,” yield a scattering cross section for hadrons, and from this cross section one can obtain an estimated size for the hadrons. The original Rutherford$^{88}$ experiment, which arrived at the small value for the nuclear size, relative to the size of the atom, used a beam of $({}_2\mathrm{He}^4)^{2+}$ nuclei (alpha particles) from a Ra source. If one considers the wave nature of light, one may think that the photon size is roughly equal to its wavelength (say 500 nm); however, when the photon is absorbed by an atom, it “disappears” within a body of radius 0.5 nm; this is a manifestation of the intricacies of the wave-particle duality, which are discussed in Section 3.39.
PROBLEM 2.11.1. Evaluate the classical radius of the electron $r_0$, by assuming that the rest mass of the electron $m_e$ is totally due to its electrostatic potential energy $e^2/4\pi\varepsilon_0 r_0$.
PROBLEM 2.11.2. Evaluate the speed of the orbital motion of an electron that has orbital angular momentum $m_e v r_0 = \hbar$, if the mass is concentrated at the electron radius $r_0$ estimated above either from electron–electron scattering ($10^{-16}$ m) or from its “classical radius” ($2.818 \times 10^{-15}$ m) or from the Compton wavelength $\lambda_C$ ($2.426 \times 10^{-12}$ m).
87 Arthur Holly Compton (1892–1962).
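The length scales quoted in this section can be checked with a few lines of arithmetic. The sketch below works in SI units (so the classical radius carries the factor $1/4\pi\varepsilon_0$, as in Problem 2.11.1) and uses CODATA-style constants typed in by hand; the function names are illustrative assumptions, not part of the text.

```python
import math

# Physical constants in SI units (values entered by hand for this sketch)
E_CHARGE = 1.602176634e-19    # elementary charge, C
EPSILON_0 = 8.8541878128e-12  # vacuum permittivity, F/m
M_ELECTRON = 9.1093837015e-31 # electron rest mass, kg
C_LIGHT = 299_792_458.0       # speed of light, m/s
H_PLANCK = 6.62607015e-34     # Planck constant, J s


def classical_electron_radius() -> float:
    """r_cl from e^2 / (4 pi eps0 r_cl) = m_e c^2."""
    return E_CHARGE**2 / (4.0 * math.pi * EPSILON_0 * M_ELECTRON * C_LIGHT**2)


def compton_wavelength() -> float:
    """lambda_C = h / (m_e c)."""
    return H_PLANCK / (M_ELECTRON * C_LIGHT)


def heisenberg_position_uncertainty() -> float:
    """Delta x = (hbar/2) / (m_e c) = lambda_C / (4 pi)."""
    return compton_wavelength() / (4.0 * math.pi)


if __name__ == "__main__":
    print(f"classical radius    = {classical_electron_radius():.4e} m")
    print(f"Compton wavelength  = {compton_wavelength():.4e} m")
    print(f"lambda_C / (4 pi)   = {heisenberg_position_uncertainty():.4e} m")
```

The three lengths differ by orders of magnitude, which is one way to see that "the size of the electron" depends on which physical argument is used to define it.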
2.12 THE PHYSICAL MEANING OF QUANTUM NUMBERS
What do quantum numbers mean? As we shall see in Section 3.5, the three spatial quantum numbers ($n$, $l$, $m_l$) for the H atom identify the allowed eigenstates for the solution of the Schrödinger$^{89}$ equation, with certain characteristic energies and spatial features (e.g., the angular momentum quantum number describes how much angular momentum the atom has). But then, what is the meaning of the “electron spin” quantum number? Electron spin can be visualized as the “helicity” of the particle. Schrödinger suggested a zig-zag picture or “Zitterbewegung” or “trembling motion” for the electron: From the solutions to the relativistically correct Dirac$^{90}$ equation in space–time, the electron seems to oscillate (“zig-zag”) between two states: a massless “zig” particle with right-handed helicity and a massless “zag” particle with left-handed helicity; these two particles are mutual sources for each other, and they are coupled by a coupling constant that bears the rest mass of the electron. In normal three-dimensional space, the velocity of “zig” and “zag” (at the speed of light) continually reverses, but the direction of spin remains constant. Since photons are absorbed to change the spin projection of an electron from $-1/2$ to $+1/2$, the spin of the photon must be 1. The graviton has been postulated to carry a spin of 2, because of the symmetry of the equations in Einstein’s general theory of relativity (gravity comes from the rank-2 stress-energy tensor). Other quantum numbers (charm, strangeness, etc.) were invented ad hoc to preserve, at least for fermions, the hypothesis that every fundamental particle must have a unique set of quantum numbers. These newer quantum numbers may have more abstruse physical significance.
88 Ernest Rutherford, first Baron Rutherford of Nelson (1871–1937).
89 Erwin Rudolf Josef Alexander Schrödinger (1887–1961).
90 Paul Adrien Maurice Dirac (1902–1984).
2.13 SPECIAL RELATIVITY
There was a young lady named Bright,
whose speed was much faster than light;
she started one day
on her relative way
and came back on the previous night
[Edward Teller (1908–2003)].

There once was a sprinter in action
Who lost his best race by a fraction:
Ere he breasted the tape
He had altered his shape
By the Fitzgerald-Lorentz contraction
[E. Teller].
Special relativity, a revolutionary theory introduced by Einstein in 1905 [16], recognized the null result of the 1887 Michelson$^{91}$–Morley$^{92}$ experiment: the speed of light was independent of the seasons, that is, it did not add to or subtract vectorially from the speed of the “ether wind.” Einstein postulated that the speed of light in vacuo, or the speed of information and energy transfer, is a universal constant c, independent of frame of reference. Phase velocities in excess of c are allowed, but group velocities for waves (which transmit energy or information) are limited by c in vacuo. In material media, the speed is c/n, where n is the index of refraction. Tachyons (particles moving faster than light) are just gleams in some theoreticians’ eyes. The following discussion is adapted from Leighton [17]. If a particle moves at speed V in the x-direction, then the usual “Galilean”$^{93}$ transformation of coordinates from a stationary system S to a system S′ moving with the particle is:
$$x' = x - Vt \tag{2.13.1}$$
$$y' = y \tag{2.13.2}$$
$$z' = z \tag{2.13.3}$$
$$t' = t \tag{2.13.4}$$
91 Albert Abraham Michelson (1852–1931).
92 Edward Williams Morley (1838–1923).
93 Galileo Galilei (1564–1642).
This says that the time clocks are the same in both systems. This is incompatible with Maxwell’s equations, as shown below by using Gauss’s law, Eq. (2.7.16), and the Lorentz force, Eq. (2.7.24). Assume that the two systems S and S′ move at velocities $\mathbf{v}$ and $\mathbf{v}'$, with relative velocity $\mathbf{V} = \mathbf{v}' - \mathbf{v}$. If we use the Galilean transformation and assume that the charge $q$ and the electric displacement $\mathbf{D}$ are the same in the two systems:
$$\nabla' \cdot \mathbf{D} = \rho' = q'/\Delta x'\Delta y'\Delta z' = q/\Delta x\,\Delta y\,\Delta z = \rho = \nabla \cdot \mathbf{D} \tag{2.13.5}$$
and if the Lorentz force must also be the same in both systems, $\mathbf{F} = q(\mathbf{E} + \mathbf{v} \times \mathbf{B}) = \mathbf{F}' = q(\mathbf{E}' + \mathbf{v}' \times \mathbf{B}')$, then (taking $\mathbf{B} = \mathbf{B}'$)
$$\mathbf{E} = \mathbf{E}' + \mathbf{V} \times \mathbf{B}' \tag{2.13.6}$$
For simplicity, assume zero electrical polarization: $\mathbf{P}' = \mathbf{P} = 0$; then $\mathbf{D} = \mathbf{D}' + \varepsilon_0 \mathbf{V} \times \mathbf{B}'$. Finally,
$$\nabla' \cdot \mathbf{D}' = \rho' - \varepsilon_0 \nabla' \cdot (\mathbf{V} \times \mathbf{B}') \tag{2.13.7}$$
which brings out a “second term,” which depends on V. All attempts, including the 1887 Michelson–Morley experiment, to measure the speed of light at various times during the year, hoping to measure the speed V of the “luminiferous ether” (which was thought necessary to propagate electromagnetic waves), gave null results; the speed of light was the same, no matter where the earth was with respect to the cosmos. This forced a shift away from the Galilean transformation at relativistic speeds. When this speed V becomes relativistic, then the Lorentz transformation steps in:
$$x' = (1 - V^2/c^2)^{-1/2}(x - Vt) = \gamma(x - \beta ct) \tag{2.13.8}$$
$$y' = y \tag{2.13.9}$$
$$z' = z \tag{2.13.10}$$
$$t' = (1 - V^2/c^2)^{-1/2}(t - Vx/c^2) = \gamma(t - \beta x/c) \tag{2.13.11}$$
where
$$\gamma \equiv (1 - V^2/c^2)^{-1/2} \tag{2.13.12}$$
$$\beta \equiv V/c \tag{2.13.13}$$
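As a numerical illustration of Eqs. (2.13.8) and (2.13.11), the sketch below applies a boost along x and compares it with the Galilean transformation of Eqs. (2.13.1) and (2.13.4). The function names and the sample values are assumptions made for this example only.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in vacuo, m/s


def gamma_factor(V: float) -> float:
    """gamma = (1 - V^2/c^2)^(-1/2), Eq. (2.13.12)."""
    beta = V / C_LIGHT
    if abs(beta) >= 1.0:
        raise ValueError("|V| must be smaller than c")
    return 1.0 / math.sqrt(1.0 - beta * beta)


def lorentz_boost(x: float, t: float, V: float) -> tuple[float, float]:
    """Return (x', t') for a boost of speed V along x, Eqs. (2.13.8), (2.13.11)."""
    g = gamma_factor(V)
    return g * (x - V * t), g * (t - V * x / C_LIGHT**2)


def galilean_boost(x: float, t: float, V: float) -> tuple[float, float]:
    """Return (x', t') for the Galilean transformation, Eqs. (2.13.1), (2.13.4)."""
    return x - V * t, t


if __name__ == "__main__":
    x, t = 1.0e9, 2.0                            # an event in S (m, s)
    print(lorentz_boost(x, t, 0.6 * C_LIGHT))    # relativistic speed
    print(lorentz_boost(100.0, 1.0, 30.0))       # everyday speed ...
    print(galilean_boost(100.0, 1.0, 30.0))      # ... agrees with Galileo
```

At everyday speeds the two transformations are numerically indistinguishable; at V = 0.6c the boost mixes x and t, while the combination $x^2 - c^2t^2$ keeps the same value in both frames.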
Equation (2.13.8) is called the Lorentz–FitzGerald$^{94}$ contraction of space; Eq. (2.13.11) is the Einstein time dilatation: A clock advances more slowly in a system moving at a high speed V. When $V \ll c$, $\gamma \approx 1$, $\beta \approx 0$, and the Lorentz transformation reduces to the Galilean transformation. The Lorentz transformation has the following cute property. If two events are measured in coordinate system S as separated by $\Delta x$, $\Delta y$, and $\Delta z$, and time $\Delta t$, and they are measured also in coordinate system S′ as being separated by different amounts of space $\Delta x'$, $\Delta y'$, $\Delta z'$, and time $\Delta t'$, then the Lorentz invariance requires
$$(\Delta s)^2 \equiv (\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2 - c^2(\Delta t)^2 = (\Delta x')^2 + (\Delta y')^2 + (\Delta z')^2 - c^2(\Delta t')^2 \tag{2.13.14}$$
When time is not involved, this invariance is the well-known and quite ordinary rotation of a coordinate system in x, y, z space:
$$(\Delta r)^2 \equiv (\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2 = (\Delta x')^2 + (\Delta y')^2 + (\Delta z')^2 \tag{2.13.15}$$
which causes no dilatation or contraction of 3-space. The Lorentz invariance is best analyzed in four-space, by introducing a 4 × 1 column vector X:
$$X = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} = \begin{pmatrix} x \\ y \\ z \\ ict \end{pmatrix} \tag{2.13.16}$$
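(here $i = (-1)^{1/2}$). There are other possible definitions, with and without an explicit $i$, with and without a $+$ sign. For the Lorentz transformation given above, the transformation matrix is
$$C = \begin{pmatrix} c_{11} & c_{12} & c_{13} & c_{14} \\ c_{21} & c_{22} & c_{23} & c_{24} \\ c_{31} & c_{32} & c_{33} & c_{34} \\ c_{41} & c_{42} & c_{43} & c_{44} \end{pmatrix} = \begin{pmatrix} \gamma & 0 & 0 & i\beta\gamma \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -i\beta\gamma & 0 & 0 & \gamma \end{pmatrix} \tag{2.13.17}$$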
This matrix equation relating the four-vector X to the four-vector X′ is
$$X' = CX \tag{2.13.18}$$
or, using the Einstein summation convention:
$$x'_\mu = c_{\mu\nu} x_\nu \tag{2.13.19}$$
which means
$$x'_\mu = \sum_{\nu=1}^{4} c_{\mu\nu} x_\nu \qquad (\mu = 1, 2, 3, 4) \tag{2.13.20}$$
94 George Francis FitzGerald (1851–1901).
It can be shown that det C = 1 (Problem 2.13.1). The trace of C, or sum of its diagonal terms, is $2 + 2\gamma$. Since det C = 1, we can consider the Lorentz transformation matrix C as the four-dimensional analog of the Eulerian rotation in 3-space. We now seek quantities that are “covariant with the Lorentz transformation,” that is, are “relativistically correct.” We next define in this new four-space a few essential quantities: The proper time $\Delta\tau$ is defined by
$$(\Delta\tau)^2 \equiv -c^{-2}\Delta x_\mu \Delta x_\mu = (\Delta t)^2 - c^{-2}\left[(\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2\right] \tag{2.13.21}$$
If $(\Delta\tau)^2 > 0$, then $\Delta\tau$ is real and represents a “time-like” interval; if $(\Delta\tau)^2 < 0$, then $\Delta\tau$ is imaginary and represents a “space-like” interval in proper time. We can also write
$$(\Delta\tau)^2 = (\Delta t)^2 - c^{-2}(\Delta r)^2 = (\Delta t)^2\left(1 - V^2/c^2\right) = \gamma^{-2}(\Delta t)^2 \tag{2.13.22}$$
A velocity 4-vector is obtained by differentiating the position 4-vector with respect to the proper time:
$$U_\mu \equiv dx_\mu/d\tau \tag{2.13.23}$$
whence we can obtain, for the Lorentz transformation, Eq. (2.13.16), using $(dx/d\tau) = (dx/dt)(dt/d\tau)$, and so forth:
$$U_1 = \gamma(dx/dt); \quad U_2 = \gamma(dy/dt); \quad U_3 = \gamma(dz/dt); \quad U_4 = ic\gamma \tag{2.13.24}$$
The 4-vector linear momentum is defined by
$$P_\mu \equiv m_0 U_\mu \tag{2.13.25}$$
where $m_0$ is the rest mass (more about $m_0$ below). Therefore the momentum components are
$$P_1 = \gamma m_0(dx/dt); \quad P_2 = \gamma m_0(dy/dt); \quad P_3 = \gamma m_0(dz/dt); \quad P_4 = icm_0\gamma \tag{2.13.26}$$
The 4-vector force then is
$$F_\mu = d(m_0\,dx_\mu/d\tau)/d\tau \tag{2.13.27}$$
The relativistic mass m is defined as
$$m \equiv \gamma m_0 \tag{2.13.28}$$
The total energy W of a particle is defined by
$$W \equiv cP_4/i = m_0\gamma c^2 = m_0c^2 + T = m_0c^2\left[1 + (1/2)u^2/c^2 + (3/8)u^4/c^4 + \cdots\right] \tag{2.13.29}$$
Therefore the kinetic energy T is equal to $(1/2)m_0u^2$ only for speeds $u \ll c$. Rest mass and kinetic energy T may be interconverted. The law of conservation of momentum becomes
$$\sum_i P_\mu(i) = \sum_j P_\mu(j) \tag{2.13.30}$$
for the particles i = 1, 2, . . . present before the collision or interaction (on the left) and for the particles j = 1, 2, . . . present after the event (on the right). In particular, the conservation of $P_4$ implies from Eq. (2.13.29) that rest mass and kinetic energy may be interconverted. Hence also, if $p^2 = m_0^2\left[(dx/d\tau)^2 + (dy/d\tau)^2 + (dz/d\tau)^2\right]$ is the square of the ordinary momentum, then
$$W^2 = p^2c^2 + m_0^2c^4 \tag{2.13.31}$$
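Equations (2.13.29) and (2.13.31) lend themselves to a quick numerical check. The sketch below, with illustrative function names and an electron rest mass typed in by hand, evaluates the total energy, the kinetic energy, the truncated series of Eq. (2.13.29), and the energy-momentum relation for a particle of speed u.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light, m/s


def gamma_factor(u: float) -> float:
    """gamma = (1 - u^2/c^2)^(-1/2)."""
    return 1.0 / math.sqrt(1.0 - (u / C_LIGHT) ** 2)


def total_energy(m0: float, u: float) -> float:
    """W = gamma m0 c^2, Eq. (2.13.29)."""
    return gamma_factor(u) * m0 * C_LIGHT**2


def kinetic_energy(m0: float, u: float) -> float:
    """T = W - m0 c^2."""
    return (gamma_factor(u) - 1.0) * m0 * C_LIGHT**2


def kinetic_energy_series(m0: float, u: float) -> float:
    """First two terms of the expansion: m0 c^2 [(1/2) b^2 + (3/8) b^4]."""
    b2 = (u / C_LIGHT) ** 2
    return m0 * C_LIGHT**2 * (0.5 * b2 + 0.375 * b2 * b2)


def momentum(m0: float, u: float) -> float:
    """Ordinary relativistic momentum p = gamma m0 u."""
    return gamma_factor(u) * m0 * u


if __name__ == "__main__":
    m0 = 9.1093837015e-31          # electron rest mass, kg (assumed value)
    for frac in (0.01, 0.1, 0.9):
        u = frac * C_LIGHT
        print(frac, kinetic_energy(m0, u), 0.5 * m0 * u * u)
```

The printed pairs show the Newtonian value $(1/2)m_0u^2$ tracking the exact kinetic energy closely at 0.01c and falling well short of it at 0.9c.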
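The Dirac equation is a covariant version of the Schrödinger equation:
$$H\psi = \left\{c\,\boldsymbol{\alpha} \cdot \left[(\hbar/i)\nabla - |e|\mathbf{A}\right] + \beta m_0c^2 + e\phi\right\}\psi = -(\hbar/i)(\partial\psi/\partial t) \tag{2.13.32}$$
where, as above, $|e|$ is the charge on the electron, c is the speed of light, $\mathbf{A}$ is the magnetic vector potential, and $\phi$ is the scalar electric potential, but now $\boldsymbol{\alpha}$ is a traceless 4 × 4 matrix with the following Cartesian components:
$$\alpha_x \equiv \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}, \quad \alpha_y \equiv \begin{pmatrix} 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \\ 0 & -i & 0 & 0 \\ i & 0 & 0 & 0 \end{pmatrix}, \quad \alpha_z \equiv \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \\ 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{pmatrix} \tag{2.13.33}$$
and $\beta$ is a traceless 4 × 4 matrix with components
$$\beta \equiv \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \tag{2.13.34}$$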
The Schrödinger and Dirac equations will be discussed in detail in Chapter 3.
PROBLEM 2.13.1. For Eq. (2.13.17) show that det C = 1.
PROBLEM 2.13.2. For Eq. (2.13.17), find $C^{-1}$, the inverse matrix to C.
PROBLEM 2.13.3. Verify Eq. (2.13.24).
PROBLEM 2.13.4. Let us define the Schwarzschild$^{95}$ singularity. A photon of total energy $E = h\nu = mc^2$ becomes unable to escape the gravitational potential of a massive spherical black hole (a term popularized by Wheeler) of density $\rho = 10^{14}$ g cm$^{-3}$ = $10^{17}$ kg m$^{-3}$ (the density of a light atomic nucleus) and radius R (Fig. 2.11).
$$E = mc^2 = GmM/R = Gm\rho(4/3)\pi R^3/R \tag{2.13.35}$$
Find the radius of the black hole, and determine its mass M. Compare M to the mass of our sun, $1.989 \times 10^{30}$ kg. Discuss the Schwarzschild singularity and Hawking’s$^{96}$ explanation.
FIGURE 2.11 Photon ($h\nu = mc^2$) cannot escape gravitational pull of black hole (radius R, density $\rho = 10^{14}$ g cm$^{-3}$).
95 Karl Schwarzschild (1873–1916).
2.14 ELEMENTS OF OPTICS
We first discuss Young’s double-slit experiment (1803). Start with a source of light S that creates a beam of monochromatic light (either a modern monochromatic laser or a multichromatic source followed by a wavelength-range-limiting diffraction grating or prism) (Fig. 2.12). This light impinges on a single slit, then goes through a double slit with slit–slit separation h, followed by a photographic plate or fluorescent screen at a distance D from the double slit (pinholes would be acceptable, but the analysis is simpler, assuming slits of finite width and infinite length). The light from source S travels a path $r_1$ from the upper slit to the screen, and it travels a different path $r_2$ from the lower slit to the same point P on the screen. The intensity at P is determined by the phase difference $\delta$ between the two paths:
$$\delta = \phi_2 - \phi_1 + k(r_2 - r_1) \tag{2.14.1}$$
96 Stephen William Hawking (1942–2018).
FIGURE 2.12 Young double-slit experiment: source S, single slit, double slit (separation h), and screen at distance D; the paths $r_1$, $r$, and $r_2$ (at angles $\theta_1$ and $\theta_2$, with phases $\phi_1$ and $\phi_2$) meet at the point P, at height x on the screen.
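The angles are shown geometrically as
$$\sin\theta_1 = (r - r_1)/(h/2) \tag{2.14.2}$$
$$\sin\theta_2 = (r_2 - r)/(h/2) \tag{2.14.3}$$
Now, if the observation point is at a distance x such that $x \ll D$, so that angles $\theta_1$ and $\theta_2$ are small, then $\sin\theta_1 \approx \tan\theta_1$ and $\sin\theta_2 \approx \tan\theta_2$, so
$$\sin\theta_1 = (r - r_1)/(h/2) \approx \tan\theta_1 = x/D \tag{2.14.4}$$
$$\sin\theta_2 = (r_2 - r)/(h/2) \approx \tan\theta_2 = (x + h/2)/D \tag{2.14.5}$$
If we further assume that $h \ll x \ll D$, then, adding Eqs. (2.14.4) and (2.14.5), we may neglect terms in $h^2$ and get
$$r_2 - r_1 \approx xh/D \tag{2.14.6}$$
When two waves with vector electric fields $\mathbf{E}_1$ and $\mathbf{E}_2$ interfere with each other, then the intensity I of their vector sum $\mathbf{E}_1 + \mathbf{E}_2$ can be written as
$$I = \langle E^2 \rangle = \langle(\mathbf{E}_1 + \mathbf{E}_2) \cdot (\mathbf{E}_1 + \mathbf{E}_2)\rangle = \langle E_1^2 \rangle + \langle E_2^2 \rangle + 2\langle \mathbf{E}_1 \cdot \mathbf{E}_2 \rangle \tag{2.14.7}$$
This can be rewritten in terms of the individual intensities $I_1$ and $I_2$:
$$I = I_1 + I_2 + 2(I_1 I_2)^{1/2}\cos\delta \tag{2.14.8}$$
Therefore, for the Young double-slit experiment we get finally, with $k = 2\pi/\lambda$ and $\Delta\phi \equiv \phi_2 - \phi_1$ in Eq. (2.14.1),
$$\delta = \Delta\phi + 2\pi xh/\lambda D \tag{2.14.9}$$
Whenever the argument $\delta$ of the cosine function changes by $2\pi$, the detected output on the screen goes from light to light, or from dark to dark. Thus, the condition for adjacent light maxima at $x_1$ and $x_2$ is
$$(2\pi x_1 h/\lambda D + \Delta\phi) - (2\pi x_2 h/\lambda D + \Delta\phi) = 2\pi \tag{2.14.10}$$
which simplifies to
$$(x_1 - x_2) = \lambda D/h \tag{2.14.11}$$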
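The fringe spacing of Eq. (2.14.11) and the two-beam intensity of Eq. (2.14.8) can be illustrated with a small calculation. This sketch takes the phase difference as $\delta = \Delta\phi + 2\pi xh/\lambda D$, consistent with Eq. (2.14.1); the function names and the sample geometry (Na D light, 0.5 mm slit separation, 1 m to the screen) are assumptions chosen for illustration.

```python
import math


def fringe_spacing(wavelength: float, D: float, h: float) -> float:
    """Distance between adjacent maxima, x1 - x2 = lambda D / h, Eq. (2.14.11)."""
    return wavelength * D / h


def phase_difference(x: float, wavelength: float, D: float, h: float,
                     delta_phi: float = 0.0) -> float:
    """delta = delta_phi + 2 pi x h / (lambda D)."""
    return delta_phi + 2.0 * math.pi * x * h / (wavelength * D)


def two_beam_intensity(I1: float, I2: float, delta: float) -> float:
    """I = I1 + I2 + 2 sqrt(I1 I2) cos(delta), Eq. (2.14.8)."""
    return I1 + I2 + 2.0 * math.sqrt(I1 * I2) * math.cos(delta)


if __name__ == "__main__":
    lam, D, h = 589e-9, 1.0, 0.5e-3   # m; illustrative values only
    dx = fringe_spacing(lam, D, h)
    print(f"fringe spacing = {dx * 1e3:.3f} mm")
    for x in (0.0, dx / 2, dx):
        delta = phase_difference(x, lam, D, h)
        print(x, two_beam_intensity(1.0, 1.0, delta))
```

With equal beam intensities the pattern swings between four times the single-beam intensity at a maximum and zero halfway between maxima.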
The above discussion implicitly obeys Huygens’$^{97}$ principle, that each point on a spherical wavefront can be regarded as the source of a secondary wavelet (another spherical wave), as well as Fermat’s$^{98}$ principle of least time. The index of refraction n for a given medium at any frequency is defined by the speed of light in vacuum, c, divided by the speed of light in that medium, v:
$$n \equiv c/v \tag{2.14.12}$$
Here are some values, measured at the yellow Na D line (wavelength $\lambda$ = 589 nm): For vacuum, n = 1 by definition; for air, n = 1.000294; for CO$_2$, n = 1.000449; for Pyrex glass, n = 1.470 (Table 2.8); for water, n = 1.333; for most other materials, 1 < n < 2. If there is absorption within the medium, then the refractive index becomes complex, as discussed earlier, Eq. (2.7.35), and can be rewritten as
$$\tilde{n} \equiv n - ik \tag{2.14.13}$$
where n and k are real but $\tilde{n}$ is complex. If the medium is anisotropic, then n becomes a 3 × 3 tensor $\mathbf{n}$ with three principal-axis diagonal values $n_a$, $n_b$, and $n_c$ localized within the crystal or oriented polymer. When a monochromatic parallel beam of light propagating in a medium of one refractive index arrives at a second medium of different refractive index (real or complex scalar, or tensor), several things can happen: (i) reflection back into the first medium, (ii) refraction, (iii) anisotropic refraction, or (iv) absorption into the second medium. To be specific, let an electromagnetic wave travel in the direction $\mathbf{u}$ (Fig. 2.13). Define the interface (dividing) plane between medium 1 (with scalar real index of refraction $n_1$) and medium 2 (with scalar real index of refraction $n_2$) as the xy plane, and define the normal to the dividing plane as the z axis. Call the xz plane the plane of incidence (defined as the plane, normal to the dividing xy plane, that contains the vector $\mathbf{u}$). Without loss of generality, let the $\mathbf{u}$ vector make an angle $\theta_i$ with the z axis:
$$\mathbf{u} = \mathbf{e}_x \sin\theta_i + \mathbf{e}_z \cos\theta_i \tag{2.14.14}$$
The reflected beam $\mathbf{u}'$ makes the angle of reflection $\theta_r$ with the negative z axis and the angle $\phi_r$ with the x axis:
$$\mathbf{u}' = \mathbf{e}_x \sin\theta_r \cos\phi_r + \mathbf{e}_y \sin\theta_r \sin\phi_r - \mathbf{e}_z \cos\theta_r \tag{2.14.15}$$
97 Christiaan Huygens (1629–1695).
98 Pierre de Fermat (1601?–1665).
FIGURE 2.13 Incident wave $\mathbf{u}$, reflected wave $\mathbf{u}'$, and refracted (transmitted or attenuated) wave $\mathbf{u}''$ at a planar interface. The index of refraction, or refractive index, of medium 1 (top, e.g. air) is $n_1 = c/v_1 = [\varepsilon_1\mu_1/\varepsilon_0\mu_0]^{1/2}$; the index of refraction of medium 2 (bottom) is $n_2$ (shown for case $n_2 > n_1$; $n_{2a}$, $n_{2b}$, $n_{2c}$ for crystals). The incident plane contains the incident and the reflected beam wavevectors. The parallel (∥) and perpendicular (⊥) polarization components of the electric field are defined relative to the plane of incidence.
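and the refracted (transmitted) beam $\mathbf{u}''$ makes the angle of refraction $\theta_t$ with the z axis and the angle $\phi_t$ with the x axis:
$$\mathbf{u}'' = \mathbf{e}_x \sin\theta_t \cos\phi_t + \mathbf{e}_y \sin\theta_t \sin\phi_t + \mathbf{e}_z \cos\theta_t \tag{2.14.16}$$
The electric field vectors are then, respectively:
$$\mathbf{E} = \mathbf{A}\exp\{i\omega_i[t - (x\sin\theta_i + z\cos\theta_i)n_1/c]\} \tag{2.14.17}$$
$$\mathbf{E}' = \mathbf{A}'\exp\{i\omega_r[t - (x\sin\theta_r\cos\phi_r + y\sin\theta_r\sin\phi_r - z\cos\theta_r)n_1/c] + i\delta_r\} \tag{2.14.18}$$
$$\mathbf{E}'' = \mathbf{A}''\exp\{i\omega_t[t - (x\sin\theta_t\cos\phi_t + y\sin\theta_t\sin\phi_t + z\cos\theta_t)n_2/c] + i\delta_t\} \tag{2.14.19}$$
where, for generality, changes in the angular frequency $\omega_r$ and $\omega_t$ and phase shifts $\delta_r$ and $\delta_t$ have been allowed. The magnetic field vectors are
$$\mathbf{H} = (\varepsilon_1/\mu_1)^{1/2}\,\mathbf{u} \times \mathbf{E} \tag{2.14.20}$$
$$\mathbf{H}' = (\varepsilon_1/\mu_1)^{1/2}\,\mathbf{u}' \times \mathbf{E}' \tag{2.14.21}$$
$$\mathbf{H}'' = (\varepsilon_2/\mu_2)^{1/2}\,\mathbf{u}'' \times \mathbf{E}'' \tag{2.14.22}$$
At the interface the boundary conditions must be
$$A_x + A'_x = A''_x, \quad A_y + A'_y = A''_y, \quad H_x + H'_x = H''_x, \quad H_y + H'_y = H''_y \tag{2.14.23}$$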
which can be satisfied if and only if the following five conditions are met simultaneously:
(i) $\omega_i = \omega_r = \omega_t$: no change in frequency.
(ii) $\delta_r$ and $\delta_t$ = either 0 or $\pi$ radians ($\delta_r = \pi$ for reflection, $\delta_t = 0$ for refraction).
(iii) $\phi_r = \phi_t = 0$: both reflected and refracted rays must lie in the plane of incidence.
(iv) $\sin\theta_i = \sin\theta_r$: angle of incidence = angle of reflection.
(v) $n_1 \sin\theta_i = n_2 \sin\theta_t$: this is Snell’s$^{99}$ law.
Thus, an electromagnetic wave reflected at an interface between media of different refractive indices undergoes a phase shift of $\pi$ radians; when it is refracted, it suffers no phase shift. The fifth condition above is Snell’s law of refraction:
$$n_{12} \equiv n_2/n_1 = \sin\theta_i/\sin\theta_t \tag{2.14.24}$$
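Snell’s law, Eq. (2.14.24), is easy to evaluate numerically. The following minimal sketch uses a hypothetical helper name and returns None when the equation has no real solution for the refraction angle; angles are in radians.

```python
import math
from typing import Optional


def snell_refraction_angle(n1: float, n2: float,
                           theta_i: float) -> Optional[float]:
    """Solve n1 sin(theta_i) = n2 sin(theta_t) for theta_t (radians).

    Returns None when n1 sin(theta_i) / n2 exceeds 1, i.e. when no real
    refraction angle satisfies Eq. (2.14.24).
    """
    sin_t = n1 * math.sin(theta_i) / n2
    if abs(sin_t) > 1.0:
        return None
    return math.asin(sin_t)


if __name__ == "__main__":
    # Air (n = 1.000294) into water (n = 1.333) at 45 degrees incidence.
    theta_t = snell_refraction_angle(1.000294, 1.333, math.radians(45.0))
    print(math.degrees(theta_t))
    # Water into air at 60 degrees: no real refraction angle exists.
    print(snell_refraction_angle(1.333, 1.000294, math.radians(60.0)))
```

Going into the denser medium the ray bends toward the normal; going the other way, large enough incidence angles leave the equation without a real solution.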
When these conditions are met, one can evaluate the transmission (t) and reflection (r) coefficients [14]. In each case below, the second form holds when $\mu_1 = \mu_2$. For perpendicular (N, ⊥, s, or senkrecht) polarization (E vector perpendicular to plane of incidence but parallel to dividing plane) the transmission coefficient $t_\perp$ and the reflection coefficient $r_\perp$ are given by
$$t_\perp = 2/(1 + \mu_1\tan\theta_i/\mu_2\tan\theta_t) \approx 2\cos\theta_i\sin\theta_t/\sin(\theta_i + \theta_t) \tag{2.14.25}$$
$$r_\perp = (1 - \mu_1\tan\theta_i/\mu_2\tan\theta_t)/(1 + \mu_1\tan\theta_i/\mu_2\tan\theta_t) \approx -\sin(\theta_i - \theta_t)/\sin(\theta_i + \theta_t) \tag{2.14.26}$$
For parallel (P, p, ∥ or waagerecht) polarization (E vector in the plane of incidence) the transmission coefficient $t_\parallel$ and the reflection coefficient $r_\parallel$ are
$$t_\parallel = 2\cos\theta_i\sin\theta_t/(\sin\theta_t\cos\theta_t + \mu_1\sin\theta_i\cos\theta_i/\mu_2) \approx 2\cos\theta_i\sin\theta_t/[\sin(\theta_i + \theta_t)\cos(\theta_i - \theta_t)] \tag{2.14.27}$$
$$r_\parallel = (\mu_1\sin 2\theta_i/\mu_2\sin 2\theta_t - 1)/(\mu_1\sin 2\theta_i/\mu_2\sin 2\theta_t + 1) \approx \tan(\theta_i - \theta_t)/\tan(\theta_i + \theta_t) \tag{2.14.28}$$
If we need these ratios for the energy, we must use Poynting vectors, which involve the squares of the electric field amplitudes, and obtain the reflectivity R and the transmittivity T (where R + T = 1):
$$R = |\mathbf{E}'|^2/|\mathbf{E}|^2 \tag{2.14.29}$$
and
$$T = n_2\cos\theta_t\,|\mathbf{E}''|^2/n_1\cos\theta_i\,|\mathbf{E}|^2 \tag{2.14.30}$$
99 Willebrord Snell van Royen (1591–1626).
FIGURE 2.14 Reflection coefficient r and reflectivity R versus angle of incidence $\theta_i$ (degrees) for case $n_i = n_1 = 1.0$ and case $n_t = n_2 = 1.5$, for parallel (∥) and perpendicular (⊥) polarization; the Brewster angle $\theta_B$ is marked.
If light goes from a lower-index medium to a higher-index medium ($n_2 > n_1$), then the reflectivity R becomes large only at relatively large incident angles $\theta_i$, approaching grazing incidence; if the light goes from a higher-index medium to a lower-index medium ($n_1 > n_2$), then the reflectivity becomes 1 beyond relatively smaller incident angles [14]. This total internal reflection can be understood from Snell’s law (2.14.24), where the angles $\theta_i$ and $\theta_t$ are defined only between 0 and $\pi/2$ radians (Figs. 2.14 and 2.15): As $\theta_i$ increases to a critical value $\theta_c$, $\theta_t$ reaches $\pi/2$ (this $\theta_c$, with $\sin\theta_c = n_2/n_1$, is the critical angle of incidence); beyond it Snell’s law “fails,” there is no refracted beam, and all light is reflected back into the same medium (total internal reflection back into medium 1). A different phenomenon is Brewster’s$^{100}$ angle for maximum parallel polarization. As the angle of incidence $\theta_i$ is changed systematically, the relative intensities of the reflected and refracted rays change. At a particular angle, called Brewster’s angle, the reflected beam for parallel polarization has zero intensity ($R_\parallel = 0$), and all the energy of the parallel-polarized incident wave goes into the refracted wave. If $\theta_t = \pi/2 - \theta_i$, then $\sin\theta_t = \cos\theta_i$, and Eq. (2.14.24) reduces to Brewster’s law:
$$n_{12} = n_2/n_1 = \tan\theta_i \tag{2.14.31}$$
This reflected beam intensity goes to zero only for parallel polarization, as can be seen in Eq. (2.14.28) when $\tan(\theta_i + \theta_t)$ becomes infinite, so that $r_\parallel$ goes to zero. When Eq. (2.14.31) holds, then unpolarized incident light $\mathbf{u}$ will yield a plane-polarized reflected ray $\mathbf{u}'$ at Brewster’s angle $\theta_i$ (the refracted ray $\mathbf{u}''$, at 90° to the reflected ray, will be only weakly, partially polarized). For parallel polarization, the light refracted into medium 2 has its maximum relative intensity at Brewster’s angle, but this phenomenon can be seen for a few degrees around the Brewster angle.
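The reflection coefficients, Brewster’s angle, and the critical angle can be explored numerically for the two cases plotted in Figs. 2.14 and 2.15. The sketch below assumes nonmagnetic media ($\mu_1 = \mu_2$) and writes the coefficients in the equivalent refractive-index form, which avoids a 0/0 at normal incidence; the function names are illustrative, and the sign convention is the one in which $r_\perp = -\sin(\theta_i - \theta_t)/\sin(\theta_i + \theta_t)$ and $r_\parallel = \tan(\theta_i - \theta_t)/\tan(\theta_i + \theta_t)$.

```python
import math


def refraction_angle(n1: float, n2: float, theta_i: float) -> float:
    """theta_t from Snell's law; raises ValueError beyond the critical angle."""
    sin_t = n1 * math.sin(theta_i) / n2
    if sin_t > 1.0:
        raise ValueError("total internal reflection: no refracted beam")
    return math.asin(sin_t)


def fresnel_reflection(n1: float, n2: float,
                       theta_i: float) -> tuple[float, float]:
    """Return (r_perp, r_par) for real n1, n2 and mu1 = mu2 (assumed)."""
    theta_t = refraction_angle(n1, n2, theta_i)
    ci, ct = math.cos(theta_i), math.cos(theta_t)
    r_perp = (n1 * ci - n2 * ct) / (n1 * ci + n2 * ct)
    r_par = (n2 * ci - n1 * ct) / (n2 * ci + n1 * ct)
    return r_perp, r_par


def brewster_angle(n1: float, n2: float) -> float:
    """tan(theta_B) = n2 / n1, Eq. (2.14.31)."""
    return math.atan(n2 / n1)


def critical_angle(n1: float, n2: float) -> float:
    """sin(theta_c) = n2 / n1; defined only for n1 > n2."""
    if n1 <= n2:
        raise ValueError("critical angle requires n1 > n2")
    return math.asin(n2 / n1)


if __name__ == "__main__":
    n1, n2 = 1.0, 1.5
    for deg in (0, 30, 56.31, 80):
        r_perp, r_par = fresnel_reflection(n1, n2, math.radians(deg))
        print(deg, r_perp**2, r_par**2)   # reflectivities R_perp, R_par
    print(math.degrees(brewster_angle(n1, n2)))
    print(math.degrees(critical_angle(1.5, 1.0)))
```

The reflectivities are the squares of the coefficients; for $n_1 = 1.0$ and $n_2 = 1.5$ the parallel reflectivity passes through zero at Brewster’s angle, while for the reversed pair the function raises an error beyond the critical angle, where no refracted beam exists.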
100 Sir David Brewster (1781–1868).
FIGURE 2.15 Reflection coefficient r and reflectance R versus angle of incidence $\theta_i$ (degrees) for case $n_i = n_1 = 1.5$ and case $n_t = n_2 = 1.0$, for parallel (∥) and perpendicular (⊥) polarization; the Brewster angle $\theta_B$ and the region of total internal reflection are marked.
Fresnel’s$^{101}$ formulas [13] give the parallel (∥) and perpendicular (⊥) components of the reflected and refracted light beams:
$$A'_\parallel = A_\parallel\,\tan(\theta_i - \theta_t)/\tan(\theta_i + \theta_t) \tag{2.14.32}$$
$$A'_\perp = -A_\perp\,\sin(\theta_i - \theta_t)/\sin(\theta_i + \theta_t) \tag{2.14.33}$$
$$A''_\parallel = A_\parallel\,2\cos\theta_i\sin\theta_t/[\sin(\theta_i + \theta_t)\cos(\theta_i - \theta_t)] \tag{2.14.34}$$
$$A''_\perp = A_\perp\,2\cos\theta_i\sin\theta_t/\sin(\theta_i + \theta_t) \tag{2.14.35}$$
PROBLEM 2.14.1. Derive Fresnel’s formulas, Eqs. (2.14.32) to (2.14.35).
PROBLEM 2.14.2. Show that the ratio R of the reflected intensity I′ to the incident intensity I at normal incidence ($\theta$ = 0) is
$$R = I'/I = A'^2/A^2 = (n_{12} - 1)^2/(n_{12} + 1)^2 \tag{2.14.36}$$
PROBLEM 2.14.3. [4] Consider a complex index of refraction for a metal:
$$N \equiv n(1 - ik) \tag{2.14.37}$$
(there are many different sign conventions). In metals, for instance, n and k become functions of the dielectric constant $\varepsilon$, the magnetic permeability $\mu$, the electrical conductivity $\sigma$, and the light frequency $\nu$:
$$n^2 - n^2k^2 = \varepsilon\mu \tag{2.14.38}$$
$$n^2k = \mu\sigma/\nu \tag{2.14.39}$$
Show that Snell’s law must now be modified to $n_{12}(1 - ik_{12}) = \sin\theta/\sin\theta''$.
Light emanating from some source, such as the sun or a light bulb, vibrates in all directions at right angles to the direction of propagation and is unpolarized. When emitted from atoms or molecules, light is polarized: it can have (1) plane polarization, (2) circular polarization, (3) elliptical polarization. If a light wave propagating along z has an electric field vector E in the xy plane, then its polarization can be described by a normalized Jones$^{102}$ vector (1941):
$$|\psi\rangle \equiv \begin{pmatrix} E_x(t)/\sqrt{E_x(t)^2 + E_y(t)^2} \\ E_y(t)/\sqrt{E_x(t)^2 + E_y(t)^2} \end{pmatrix} \tag{2.14.40}$$
Thus, if light is linearly polarized along x, along y, and at 45° from both x and y, the Jones vectors are, respectively,
$$\begin{pmatrix} 1 \\ 0 \end{pmatrix}; \quad \begin{pmatrix} 0 \\ 1 \end{pmatrix}; \quad \text{and} \quad \begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix} \tag{2.14.41}$$
101 Augustin-Jean Fresnel (1788–1827).
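The normalization in Eq. (2.14.40) can be written as a short function. This is a minimal sketch with hypothetical names; it accepts real or complex field components and reproduces the three linear-polarization examples of Eq. (2.14.41).

```python
import math


def jones_vector(Ex: complex, Ey: complex) -> tuple[complex, complex]:
    """Normalize (Ex, Ey) so that |Ex|^2 + |Ey|^2 = 1, as in Eq. (2.14.40)."""
    norm = math.sqrt(abs(Ex) ** 2 + abs(Ey) ** 2)
    if norm == 0.0:
        raise ValueError("the field amplitude must be nonzero")
    return Ex / norm, Ey / norm


def linear_polarization(angle: float) -> tuple[float, float]:
    """Jones vector for linear polarization at `angle` (radians) from x."""
    return jones_vector(math.cos(angle), math.sin(angle))


if __name__ == "__main__":
    print(jones_vector(3.0, 0.0))                   # polarized along x
    print(jones_vector(0.0, 2.0))                   # polarized along y
    print(linear_polarization(math.radians(45.0)))  # 45 degrees from x and y
```

Only the ratio of the two components matters for the polarization state, which is why fields of different overall amplitude map onto the same normalized vector.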
Isotropic (Cubic or Rhombohedral) Minerals. Halite (NaCl); fluorite (CaF$_2$); garnets (X$_3$Y$_2$(SiO$_4$)$_3$, where X = Mg, Mn, Fe(II), Ca, and Y = Al, Fe(III), Cr); periclase (MgO) (Table 2.8).
Anisotropic Minerals (Uniaxial or Biaxial). These differ from isotropic minerals because they exhibit birefringence: (1) The velocity of light varies, depending on the direction through the mineral; (2) they show double refraction; (3) the index of refraction n is not a scalar but instead a 3 × 3 tensor, diagonalized in a principal-axis system with two or three distinct diagonal values $n_a$, $n_b$, and $n_c$. When light enters an anisotropic mineral, it is split into two rays of different velocity, which vibrate at right angles to each other. In anisotropic minerals there are one or two directions through the mineral, along which light behaves as if the mineral were isotropic; this (these) direction(s) is (are) referred to as the optic axis (axes). Birefringence can also arise in certain rare magnetic materials with anisotropic magnetic permeabilities. Many colored anisotropic materials also display a change of color with orientation; this is pleochroism or dichroism. Hexagonal and tetragonal minerals (e.g., calcite CaCO$_3$, quartz SiO$_2$, MgF$_2$, tourmaline, BN) have one optic axis and are optically uniaxial. Orthorhombic, monoclinic, and triclinic minerals (e.g., sulfur, mica, turquoise, selenite) have two optic axes and are optically biaxial. For instance, a light beam traveling through calcite (CaCO$_3$), a uniaxial anisotropic mineral, is split into two rays that vibrate at right angles to each
other, because they cross lattice atoms or ions with different efficiencies: (1) the ordinary (or slow) ray, labeled o, which obeys Snell's law, Eq. (2.14.24), with refractive index $n_o = n_a = 1.658$ and a speed of $1.81 \times 10^8$ m s$^{-1}$ ($= 3.0 \times 10^8/1.658$), and (2) the extraordinary (or fast) ray, labeled e, which does not follow Snell's law, with $n_e = n_b = n_c = 1.486$ and a speed of $2.02 \times 10^8$ m s$^{-1}$ ($= 3.0 \times 10^8/1.486$). For this extraordinary ray the electric displacement D and the electric field E are no longer parallel. The difference $\Delta \equiv n_e - n_o$ is the birefringence; multiplied by the path length through the crystal, it gives the optical retardation. The direction of the principal axes of the index of refraction tensor n can be described by the indicatrix. For isotropic crystals the indicatrix is a sphere. For positive uniaxial crystals it is a prolate spheroid ($n_e > n_o$); for negative uniaxial crystals it is an oblate spheroid ($n_o > n_e$). For orientations away from the principal-axis orientations, the extraordinary ray will have a refractive index $n_e'$ intermediate between $n_o$ and $n_e$.

Table 2.8 Index of Refraction Components $n_a$, $n_b$, and $n_c$ (Measured at Na D Line, λ = 589.29 nm)

Name               Formula          Type                     Optic axis   n_a       n_b       n_c
Water @ 20°C       H2O              Isotropic (liquid)       -            1.3330    1.3330    1.3330
Benzene @ 20°C     C6H6             Isotropic (liquid)       -            1.501     1.501     1.501
Ethanol @ 20°C     C2H5OH           Isotropic (liquid)       -            1.361     1.361     1.361
Silicone oil       (SiO2)n          Isotropic (liquid)       -            1.52045   1.52045   1.52045
Diamond            C                Isotropic                -            2.419     2.419     2.419
Sr titanate        SrTiO3           Isotropic                -            2.41      2.41      2.41
Fused silica       SiO2             Isotropic (disordered)   -            1.45846   1.45846   1.45846
Pyrex® glass       BSixOy           Isotropic (disordered)   -            1.470     1.470     1.470
Halite             NaCl             Isotropic                -            1.516     1.516     1.516
Water ice          H2O              Uniaxial                              1.309     1.313     1.313
Sellaite           MgF2             Uniaxial                              1.378     1.390     1.390
Quartz             SiO2             Uniaxial                              1.54424   1.55335   1.55335
Wurtzite           ZnS              Uniaxial                              2.356     2.378     2.378
Rutile             TiO2             Uniaxial                              2.616     2.903     2.903
Cinnabar           HgS              Uniaxial                              2.854     3.201     3.201
Calcite            CaCO3            Uniaxial                              1.658     1.486     1.486
Tourmaline         Complex(a)       Uniaxial                              1.669     1.638     1.638
Sapphire           α-Al2O3          Uniaxial                              1.7681    1.7599    1.7599
Tridymite          SiO2             Biaxial                               1.469     1.47      1.473
Mica (muscovite)   Complex(a)       Biaxial                               1.5601    1.5936    1.5977
Turquoise          Complex(a)       Biaxial                               1.61      1.62      1.66
Topaz              Al2SiO4(F,OH)2   Biaxial                               1.619     1.62      1.627
Sulfur             S8               Biaxial                               1.95      2.043     2.240
Borax              Na2B4O7·10H2O    Biaxial                               1.447     1.47      1.472
Lanthanite         Complex(a)       Biaxial                               1.52      1.587     1.613
Stibnite           Sb2S3            Biaxial                               3.194     4.303     4.46

(a) Chemical formula is too complex for this table.

102 Robert Clark Jones (1916–2004).

Circular polarization of electromagnetic radiation is a polarization such that the tip of E, at a fixed point in space, describes a circle as time progresses. At one instant in time, the tip of E describes a helix along the direction of wave propagation k. The magnitude of the electric field vector is constant as it rotates. Circular polarization is a limiting case of elliptical polarization. The other special case is the easier-to-understand linear polarization. Circular (and elliptical) polarization is possible because the propagating E and H fields
have two orthogonal components with independent amplitudes and phases, but the same frequency. A circularly polarized wave may be resolved into two linearly polarized waves of equal amplitude, 90° (π/2 radians) apart in phase, with their planes of polarization normal to each other. Circular polarization may be referred to as right or left, depending on the direction in which E rotates (two conventions exist: right-threaded screw motion in physics, left-threaded in electrical engineering). Circular dichroism (CD) is the differential absorption of left- and right-handed circularly polarized light by molecules that have optical handedness or optical activity (this happens for most molecules of biological interest: for example, the α helix, β sheet, and random-coil regions of proteins and the double helix of nucleic acids have recognizable CD spectral signatures). X-ray diffraction will be discussed in Section 8.3.

Mirrors. For reflection from a plane mirror, an observer situated at a distance u from it sees a “virtual image” of himself/herself at a distance

$$ v = -u \tag{2.14.42} $$

that is, a distance u “behind” the mirror (the negative sign marks a virtual image); the observer sees in this virtual image a mirror reflection (“left-hand” appears as “right-hand,” but “up” remains “up” and “down” is seen as “down”). In the discussion below, we assume that the angle of incidence $\theta_i$ is equal to the angle of reflection $\theta_r$:

$$ \theta_i = \theta_r \tag{2.14.43} $$

The relative intensities of the primary and reflected rays are not considered here. For a concave mirror of radius of curvature R (Fig. 2.16), a point object at a distance u from the mirror forms a real image at a distance v from the mirror, where

$$ 1/u + 1/v \approx 2/R \tag{2.14.44} $$
(“Real image” means that the image is on the same side of the mirror as the object.) This relationship, the mirror formula (it has the same form as the “thin-lens formula”), is approximate; it holds exactly only when u = v = R, that is, when a point source and its image are both at the center of curvature C.

FIGURE 2.16 A spherical concave mirror of radius R = CP = CQ reflects an object at O (mirror-object distance OP ≡ u) into an image at I (mirror-image distance IP ≡ v). If the object moves to infinity (u = ∞), then all rays from that object will converge at the principal focus point F: f = FP = (1/2) CP = R/2; that is, the principal focal length f is half of the radius of curvature.

Also, if the object is moved to I, then the image will form at O; that is, the points O and I are “conjugate,” and
u and v are called conjugate distances. Finally, if u becomes infinite, u = ∞, then v assumes the special value f, called the principal focal length: all parallel rays coming in from infinity will focus onto the same point F. Conversely, a point source at F will create a parallel bundle of rays going out to infinity. In Eq. (2.14.44), setting u = ∞ and v = f yields f = R/2. Therefore we can rephrase Eq. (2.14.44) as

$$ 1/u + 1/v \approx 1/f \tag{2.14.45} $$
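As a numerical illustration of Eqs. (2.14.44) and (2.14.45), the following minimal Python sketch solves the mirror formula for the image distance. The function name `mirror_image_distance` and the sample distances are hypothetical, chosen only for illustration; the paraxial approximation of the text is assumed throughout.

```python
import math

def mirror_image_distance(u, R):
    """Image distance v from 1/u + 1/v = 2/R = 1/f (paraxial approximation).

    u and R are in the same length unit; u > 0 for a real object.
    A negative result corresponds to a virtual image.
    """
    f = R / 2.0                      # principal focal length, f = R/2
    if math.isinf(u):
        return f                     # object at infinity: rays meet at F
    inv_v = 1.0 / f - 1.0 / u
    if inv_v == 0.0:
        return math.inf              # object at F: parallel outgoing beam
    return 1.0 / inv_v

# Hypothetical example: object 0.30 m from a concave mirror with R = 0.20 m
v = mirror_image_distance(0.30, 0.20)
print(f"v = {v:.3f} m")
```

Swapping the roles of object and image (placing the object at the computed v) returns the original u, which is the conjugate-point property mentioned above.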
Similarly, for the plane mirror, R = ∞, Eq. (2.14.44) will reduce to Eq. (2.14.42).

PROBLEM 2.14.4. Prove Eq. (2.14.44).

When the mirror is convex instead of concave, then the image is “virtual”; that is, it forms on the opposite side of the mirror from the object. The same Eq. (2.14.44) holds, if we assume u > 0 always, but v < 0, R < 0, and f < 0 for the “virtual” case, and v > 0, R > 0, and f > 0 for the “real” case. The reciprocal of the focal length f (expressed in meters) is measured in diopters; it gives the “strength” of a mirror or lens.

The images produced by mirrors or lenses can be constructed graphically (Fig. 2.17). The transverse linear magnification m, that is, the ratio of the transverse length h′ = II′ of the image to the transverse length h = OO′ of the object, is obtained from Fig. 2.17a: from the similar triangles OO′C and CII′ one sees that |h′|/|h| = II′/OO′ = CI/CO = (R − v)/(u − R). However, it is wise to take h as a positive quantity for a right-side-up object, but h′ as a negative quantity for an inverted (upside-down) image. So we define m ≡ h′/h = −(R − v)/(u − R); using Eq. (2.14.44) in the form R = 2uv/(u + v), one gets the transverse linear magnification (spherical mirror):

$$ m \equiv h'/h = -v/u \tag{2.14.46} $$

FIGURE 2.17 Graphical location of object OO′ and image II′. (a) Concave mirror, f > 0: real object OO′ and real image II′: draw a parallel line from O′ to the mirror surface (point Q). From Q draw an arrow through the focus F. From O′ draw an arrow through the center C. The two arrows meet at the image point I′. (b) Convex mirror, f < 0: real object and virtual image. (c) Concave mirror, f > 0: virtual object and real image.

PROBLEM 2.14.5. Complete the proof of Eq. (2.14.46).

For the longitudinal magnification, that is, the magnification along the optical axis of a spherical mirror, one calculates the differential of Eq. (2.14.45): from 1/u + 1/v = 1/f = constant, the differential form is $-u^{-2}\,du - v^{-2}\,dv = 0$, whence the longitudinal linear magnification (spherical mirror) is

$$ dv/du = -v^2/u^2 \tag{2.14.47} $$

Other conic sections exist: paraboloidal mirrors, ellipsoidal mirrors, and hyperboloidal mirrors. In paraboloidal mirrors, all rays parallel to the axis (paraxial or not) converge at the same focus. In ellipsoidal mirrors, all rays emanating at focus 1 converge at focus 2.

For refraction at a single spherical interface of radius R that separates two media of refractive indices $n_1$ and $n_2$, the lens equation is

$$ n_1/u + n_2/v = (n_2 - n_1)/R \tag{2.14.48} $$

and the first and second principal focal lengths or foci are defined as $f_2$ (by setting u = ∞) and as $f_1$ (by setting v = ∞):

$$ f_2 = R n_2/(n_2 - n_1) \tag{2.14.49} $$

$$ f_1 = R n_1/(n_2 - n_1) \tag{2.14.50} $$
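The two focal lengths of Eqs. (2.14.49) and (2.14.50) can be checked against the image equation (2.14.48) with a short Python sketch. The function names and the numerical values (an air-to-glass interface with R = 0.10 m) are illustrative assumptions, not taken from the text.

```python
def surface_focal_lengths(n1, n2, R):
    """First and second principal focal lengths of one spherical interface."""
    f1 = R * n1 / (n2 - n1)          # Eq. (2.14.50): obtained by setting v = infinity
    f2 = R * n2 / (n2 - n1)          # Eq. (2.14.49): obtained by setting u = infinity
    return f1, f2

def surface_image_distance(u, n1, n2, R):
    """Solve n1/u + n2/v = (n2 - n1)/R, Eq. (2.14.48), for v."""
    return n2 / ((n2 - n1) / R - n1 / u)

# Hypothetical example: air (n1 = 1.0) to glass (n2 = 1.5), R = 0.10 m
f1, f2 = surface_focal_lengths(1.0, 1.5, 0.10)
print(f1, f2)                                         # focal lengths in metres
print(surface_image_distance(1.0e9, 1.0, 1.5, 0.10))  # distant object: v approaches f2
```

Note that $f_2 - f_1 = R$ for any pair of indices, which follows directly from the two definitions.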
For a thin lens bounded by two spherical surfaces of radii $R_1$ and $R_2$ (Fig. 2.18), the two refractions are additive and yield

$$ 1/u + 1/v = (n - 1)(1/R_1 - 1/R_2) = 1/f \tag{2.14.51} $$

while the transverse linear magnification becomes m = −v/u. For compound lenses, as in a single-lens reflex camera, the focal lengths add as follows:

$$ 1/f = 1/f_1 + 1/f_2 + \cdots \tag{2.14.52} $$

FIGURE 2.18 Real image formation for a biconvex lens (positive, converging) with refractive index n > 1. As long as u > f, a real image forms, upside down, on the opposite side of the lens from the object.

FIGURE 2.19 Prism of measured refractive angle α and unknown refractive index n (surrounded by a medium of refractive index 1) bends light through a measured angle δ, whose minimum is found by scanning the incident angle θ. External angles of triangles yield δ = 2(θ − ρ) and α = 2ρ, and Snell's law gives n = sin θ/sin ρ = sin[(α + δ)/2]/sin(α/2).
The lensmaker's equation is an elaboration of Eq. (2.14.51):

$$ 1/f = (n - 1)\left[ 1/R_1 - 1/R_2 + (n - 1)d/(n R_1 R_2) \right] \tag{2.14.53} $$

where f is the focal length of the lens, n is the refractive index of the lens material, $R_1$ is the radius of curvature of the lens surface closest to the light source ($R_1 > 0$ if the lens surface is convex), $R_2$ is the radius of curvature of the lens surface farthest from the light source ($R_2 < 0$ if the lens surface is concave), and d is the thickness of the lens (distance along the lens axis between the two surface vertices). As above, if f is given in meters, then 1/f is in “diopters.” If $d \ll (R_1, R_2)$ (“thin-lens approximation”), then

$$ 1/f \approx (n - 1)\left[ 1/R_1 - 1/R_2 \right] \tag{2.14.54} $$
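To see how much the thickness term matters, the sketch below evaluates Eq. (2.14.53) and its thin-lens limit, Eq. (2.14.54), for the same lens. The function name `lens_power` and the lens dimensions are hypothetical; the sign convention for $R_1$ and $R_2$ is the one stated above.

```python
def lens_power(n, R1, R2, d=0.0):
    """Lens power 1/f in diopters (R1, R2, d in metres), Eq. (2.14.53).

    With d = 0 this reduces to the thin-lens form, Eq. (2.14.54).
    """
    return (n - 1.0) * (1.0 / R1 - 1.0 / R2 + (n - 1.0) * d / (n * R1 * R2))

# Hypothetical biconvex lens: n = 1.5, R1 = +0.10 m, R2 = -0.10 m
thin = lens_power(1.5, 0.10, -0.10)            # thin-lens approximation
thick = lens_power(1.5, 0.10, -0.10, d=0.01)   # 1 cm thick lens
print(f"thin:  {thin:.3f} diopters, f = {1.0 / thin:.4f} m")
print(f"thick: {thick:.3f} diopters, f = {1.0 / thick:.4f} m")
```

For this biconvex example the product $R_1 R_2$ is negative, so a finite thickness slightly lowers the power relative to the thin-lens estimate.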
Prisms and Gratings. For a prism of refractive index n and refracting angle α, there is an angle of incidence $\theta_i$ for which the deviation angle δ is a minimum: this can be used, for example, to determine the refractive index n of a liquid placed inside a hollow prism (Fig. 2.19).

We next discuss how to wavelength-select visible radiation. There have been two traditional kinds of optical elements: prisms and gratings (Fig. 2.20). In particular, the Bunsen103 prism is a 60° prism, made of fused silica (normal glass absorbs too much light below 350 nm); a single natural quartz crystal is acentric and will thus affect the polarization, so the Cornu104 biprism consists of two quartz crystals, 30° each, cut from right-handed quartz and left-handed quartz and glued together. Another design is the Littrow105 prism, which uses a 30° cut and a reflective metal mirror on the back.

Primary gratings are quartz substrates with a set of a few thousand very finely cut grooves, all exactly parallel, with a distance d between them: they are labor-intensive and expensive to manufacture. Replica gratings are much cheaper: a soft polymer block [e.g., poly(methyl methacrylate)] is pressed into a primary grating, cured and dried, and then covered with a thin layer of Al or Ni. The grating equation resembles Bragg's106 law (discussed in Section 8.3): a planar wavefront of wavelength λ with angle of incidence i is diffracted at an angle r with the wavelets in phase; this requires that the path length difference AB plus CD be an integer (n) times the wavelength λ:

$$ n\lambda = AB + CD = d(\sin i + \sin r) \tag{2.14.55} $$

FIGURE 2.20 Prisms and gratings: (a) 60° prism; (b) Cornu biprism (two 30° halves); (c) Littrow prism (30° prism with a mirror on the back); (d) grating with groove spacing d, incident angle i and diffracted angle r measured from the normal to the grating, and path segments AB and CD.

103 Robert Wilhelm Eberhard Bunsen (1811–1899).
104 Marie Alfred Cornu (1841–1902).
105 Otto von Littrow (1843–1864).
106 Sir William Lawrence Bragg (1890–1971).

FIGURE 2.21 Monochromators: (a) Czerny–Turner grating monochromator (entrance slit, concave mirror, grating, concave mirror, exit slit); (b) Bunsen prism monochromator (entrance slit, collimating lens, prism, focusing lens, exit slit). In both, wavelengths λ1 and λ2 are dispersed across the exit slit.

Prisms and gratings are usually rotated mechanically inside a sealed unit called a monochromator (Fig. 2.21). The light-gathering power of a monochromator is given, as in photographic lens systems, by its f-number; its ability to resolve small differences in wavelength λ is given by its angular dispersion,

$$ dr/d\lambda = n/(d \cos r) \tag{2.14.56} $$

or by its reciprocal linear dispersion 1/D,

$$ 1/D \equiv d\lambda/dy = d \cos r/(nf) \tag{2.14.57} $$

where y = fr is the distance along the focal plane, f is the focal length, and d, r, and n are defined in Eq. (2.14.55).
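The grating equation (2.14.55) and the angular dispersion (2.14.56) can be explored numerically. The following Python sketch is only an illustration: the function names, the 1200 grooves/mm grating, and the 10° angle of incidence are assumed values, and the sign convention is the one of Eq. (2.14.55), with i and r measured from the grating normal.

```python
import math

def diffraction_angle(wavelength, d, i, order=1):
    """Solve n*lambda = d*(sin i + sin r) for r (radians), Eq. (2.14.55)."""
    s = order * wavelength / d - math.sin(i)
    if abs(s) > 1.0:
        raise ValueError("no propagating diffracted beam for this order")
    return math.asin(s)

def angular_dispersion(d, r, order=1):
    """dr/d(lambda) = n / (d cos r), Eq. (2.14.56), in rad per metre."""
    return order / (d * math.cos(r))

# Hypothetical grating: 1200 grooves per mm, Na D line, 10 degrees incidence
d = 1.0e-3 / 1200.0
lam = 589.29e-9
i = math.radians(10.0)
r = diffraction_angle(lam, d, i)
print(f"r = {math.degrees(r):.2f} degrees")
print(f"dr/dlambda = {angular_dispersion(d, r) * 1e-9:.3e} rad per nm")
```

A finite-difference derivative of `diffraction_angle` with respect to wavelength reproduces `angular_dispersion`, which is a quick consistency check on Eq. (2.14.56).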
2.15 FOUNDATIONS OF ELLIPSOMETRY

In a beam of light (transverse electromagnetic radiation) the three vectors of electric field E, magnetic field H, and light propagation direction k are mutually perpendicular, forming a right-handed set (E, H, k). The anisotropy or polarization is usually specified by the direction of the vector E relative to some fixed laboratory axis system. Light emitted by individual atoms or molecules is always polarized, while a macroscopic light beam emitted by randomly oriented atoms or molecules will consist of a huge number (Avogadro's number's worth) of such emitters, and the direction of E relative to laboratory coordinates will be random; such light is called unpolarized or natural light. A polarizer or polarization filter (usually an oriented crystal) can suppress most of the light polarized in one direction, relative to the optical axes of the crystal, and transmit most of the component polarized in the perpendicular direction; if the polarization is complete, the result is plane-polarized (or linearly polarized) light. In general, the polarized light will consist of two mutually perpendicular plane-polarized components; depending on the amplitude of these two waves and their relative phase,
the combined electric vector traces out an ellipse; the wave is elliptically polarized. Elliptical and plane polarization can be interconverted by birefringent crystal filters. In a circularly polarized wave, the tip of E traces out a helix along the propagation direction. If linearly polarized (plane-polarized) light of intensity $I_0$ is incident onto a polarizer, then the intensity I of the transmitted light will depend upon the angle θ between the direction of the light polarization and the orientation of maximum transmission of the crystal: $I = I_0 \cos^2\theta$.

Ellipsometry measures the change in the polarization of light undergoing oblique reflection from a sample surface. Linearly polarized light, when reflected from a surface, will in general become elliptically polarized, because of the presence of the thin boundary layer between the two media. The dependence of the parameters of the elliptically polarized light on the optical constants of the layer can be found on the basis of the Fresnel formulas described above.

Maxwell's equations, using several unit systems (see Appendix Table H) and Cartesian coordinates, for J = 0 and ρ = 0, can be combined into a 6 × 6 matrix form:

$$
\begin{pmatrix}
0 & 0 & 0 & 0 & -\partial/\partial z & \partial/\partial y \\
0 & 0 & 0 & \partial/\partial z & 0 & -\partial/\partial x \\
0 & 0 & 0 & -\partial/\partial y & \partial/\partial x & 0 \\
0 & \partial/\partial z & -\partial/\partial y & 0 & 0 & 0 \\
-\partial/\partial z & 0 & \partial/\partial x & 0 & 0 & 0 \\
\partial/\partial y & -\partial/\partial x & 0 & 0 & 0 & 0
\end{pmatrix}
\begin{pmatrix} E_x \\ E_y \\ E_z \\ H_x \\ H_y \\ H_z \end{pmatrix}
= Q(\partial/\partial t)
\begin{pmatrix} D_x \\ D_y \\ D_z \\ B_x \\ B_y \\ B_z \end{pmatrix}
\tag{2.15.1}
$$
where Q ≡ 1 for SI and cgs units, but Q ≡ (1/c) for Gaussian units. For brevity, looking at Eq. (2.15.1), let G be the 6 × 1 column vector involving the components of electric field E and of magnetic field H on the left-hand side; let C be the 6 × 1 column vector involving the electric displacement D and the magnetic induction B on the right-hand side; let the 6 × 6 traceless matrix on the left-hand side be called O; then Eq. (2.15.1) is abbreviated as

$$ OG = Q(\partial/\partial t)C \ \ \text{(SI)}; \qquad OG = Q(\partial/\partial t)C \ \ \text{(cgs)} \tag{2.15.2} $$
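The constitutive equations are

$$ D = \varepsilon_0 \varepsilon E \ \ \text{(SI)}; \qquad D = \varepsilon E \ \ \text{(cgs-esu)} \tag{2.7.12} $$

$$ B = \mu_0 \mu H \ \ \text{(SI)}; \qquad B = \mu H \ \ \text{(cgs-emu)} \tag{2.7.13} $$

$$ D = \rho H \ \ \text{(SI)}; \qquad D = \rho H \ \ \text{(cgs)} \tag{2.15.3} $$

$$ B = \rho' E \ \ \text{(SI)}; \qquad B = \rho' E \ \ \text{(cgs)} \tag{2.15.4} $$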
where μ, ε, ρ, and ρ′ are rank-two tensors with complex matrix elements, which depend on the electronic and magnetic properties of the material: ε is the dielectric tensor or relative electric permittivity tensor, μ is the magnetic permeability tensor, and ρ and ρ′ are optical rotation tensors, needed to describe optical activity, Faraday effect, and so on. In Eqs. (2.15.3) and (2.15.4) it is assumed that the medium has no nonlinear optical effects (D is taken to be linearly dependent on E) and no nonlinear magnetic effects (B is assumed to be linearly dependent on H). Equations (2.15.1) and (2.15.2) do not directly include the linear response of the material to E or H; that linear response is given by Eqs. (2.7.12), (2.7.13), (2.15.3), and (2.15.4), which can be recast into a new and formally convenient form; the column vectors G and C are next linked by a second 6 × 6 matrix M, called the optical matrix:

$$ MG = C \tag{2.15.5} $$
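M contains all the information about the anisotropic optical properties of the medium that supports the electromagnetic fields; some of its 36 tensor components are the target of the ellipsometric measurements of anisotropic crystals. This optical matrix M is thus defined by

$$
M \equiv \begin{pmatrix} \varepsilon & \rho \\ \rho' & \mu \end{pmatrix}
= \begin{pmatrix}
\varepsilon_{11} & \varepsilon_{12} & \varepsilon_{13} & \rho_{11} & \rho_{12} & \rho_{13} \\
\varepsilon_{21} & \varepsilon_{22} & \varepsilon_{23} & \rho_{21} & \rho_{22} & \rho_{23} \\
\varepsilon_{31} & \varepsilon_{32} & \varepsilon_{33} & \rho_{31} & \rho_{32} & \rho_{33} \\
\rho'_{11} & \rho'_{12} & \rho'_{13} & \mu_{11} & \mu_{12} & \mu_{13} \\
\rho'_{21} & \rho'_{22} & \rho'_{23} & \mu_{21} & \mu_{22} & \mu_{23} \\
\rho'_{31} & \rho'_{32} & \rho'_{33} & \mu_{31} & \mu_{32} & \mu_{33}
\end{pmatrix}
\tag{2.15.6}
$$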
where ε [$\varepsilon_{ij} = M_{ij}$ (i, j = 1, ..., 3)] is the complex dielectric (or relative electric permittivity) tensor, μ [$\mu_{ij} = M_{i+3,j+3}$, i, j = 1, ..., 3] is the permeability tensor, and ρ and ρ′ are the optical rotation tensors ρ [$\rho_{ij} = M_{i,j+3}$, i, j = 1, ..., 3] and ρ′ [$\rho'_{ij} = M_{i+3,j}$, i, j = 1, ..., 3]. The 3 × 3 complex refractive index tensor N ≡ n − ik is related to the dielectric tensor ε and the magnetic tensor μ by the Maxwell relation:

$$ \varepsilon_0 \mu_0 \varepsilon \mu = (n - ik)^2 \ \ \text{(SI)}; \qquad \varepsilon \mu = (n - ik)^2 \ \ \text{(cgs)} \tag{2.15.7} $$
where $i = (-1)^{1/2}$, n is the ratio of the phase velocity of an electromagnetic wave in vacuum to its phase velocity in the anisotropic crystal (Snell's law), and k is related to the absorption of light energy during the propagation of the light wave within the crystal. Of course, for nonmagnetic media we have μ = 1. The experimental aim is to derive from ellipsometric data the tensor ε:

$$
\varepsilon = \begin{pmatrix} \varepsilon_1 & 0 & 0 \\ 0 & \varepsilon_2 & 0 \\ 0 & 0 & \varepsilon_3 \end{pmatrix}
\tag{2.15.8}
$$
which has three complex principal-axis values $\varepsilon_1$, $\varepsilon_2$, $\varepsilon_3$ in a Cartesian principal-axis system 1, 2, 3 that is oriented in some fashion relative to the known crystal unit cell axes of the anisotropic crystalline material. From these complex values, assuming μ = 1, one obtains the principal-axis values of the real and imaginary parts of the refractive index tensor:

$$
\varepsilon_1 = (n_1 - ik_1)^2, \qquad
\varepsilon_2 = (n_2 - ik_2)^2, \qquad
\varepsilon_3 = (n_3 - ik_3)^2
\tag{2.15.9}
$$
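Equation (2.15.9) can be inverted numerically with a complex square root. The short Python sketch below is one way to do this, under the sign convention ε = (n − ik)² used here; the function names and the sample values n = 1.5, k = 0.2 are hypothetical. The principal root returned by `cmath.sqrt` has a non-negative real part, which selects n ≥ 0.

```python
import cmath

def epsilon_from_n_k(n, k):
    """Complex dielectric constant from epsilon = (n - i k)^2, Eq. (2.15.9)."""
    return complex(n, -k) ** 2

def n_k_from_epsilon(eps):
    """Invert Eq. (2.15.9): return (n, k) with n >= 0."""
    root = cmath.sqrt(eps)           # principal root, real part >= 0
    return root.real, -root.imag     # n - i k = root

# Hypothetical absorbing medium
eps = epsilon_from_n_k(1.5, 0.2)     # equals 2.21 - 0.6j
n, k = n_k_from_epsilon(eps)
print(eps, n, k)
```

With this convention an absorbing medium (k > 0) has a dielectric constant with a negative imaginary part.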
Thus, in principle, one can obtain the complex index of refraction from the upper left-hand quarter of the optical tensor M = [$M_{ij}$, (i, j = 1, ..., 3)]. We now explain how these optical constants can be derived from ellipsometry. By combining Eqs. (2.15.2) and (2.15.5), the full spatial wave equation to be solved is

$$ OG = Q(\partial/\partial t)MG \ \ \text{(SI)}; \qquad OG = Q(\partial/\partial t)MG \ \ \text{(cgs)} \tag{2.15.10} $$
The geometry of the ellipsometric measurement is shown in Fig. 2.13. The anisotropic material medium under study has coordinate z > 0, while the isotropic ambient medium (air) has z < 0; the electromagnetic field components of interest are the tangential components $E_x$, $E_y$, $H_x$, and $H_y$; the plane of incidence is the xz plane. Let the time dependence of the electromagnetic fields be the phase factor exp(iωt), where ω is the angular frequency of the light. By the symmetry of Fig. 2.13, there is no variation of any field component along the y direction; hence in Eq. (2.15.1) we can set

$$ \partial/\partial y = 0 \tag{2.15.11} $$

If ξ denotes the x component of the wavevector of the incident wave of frequency ω, in units of ω/c, then all fields should vary in the x direction as exp(−iωξx/c); thus another condition for Eq. (2.15.1) is

$$ \partial/\partial x = -i\omega\xi/c \tag{2.15.12} $$
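These two conditions reduce Eq. (2.15.1) to

$$
\begin{pmatrix}
0 & 0 & 0 & 0 & -\partial/\partial z & 0 \\
0 & 0 & 0 & \partial/\partial z & 0 & i\omega\xi/c \\
0 & 0 & 0 & 0 & -i\omega\xi/c & 0 \\
0 & \partial/\partial z & 0 & 0 & 0 & 0 \\
-\partial/\partial z & 0 & -i\omega\xi/c & 0 & 0 & 0 \\
0 & i\omega\xi/c & 0 & 0 & 0 & 0
\end{pmatrix}
\begin{pmatrix} E_x \\ E_y \\ E_z \\ H_x \\ H_y \\ H_z \end{pmatrix}
= (i\omega/c)\, M
\begin{pmatrix} E_x \\ E_y \\ E_z \\ H_x \\ H_y \\ H_z \end{pmatrix}
\tag{2.15.13}
$$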
(SI or cgs units). The above matrix equation is equivalent to two linear homogeneous algebraic equations (the third and sixth equations) and four linear differential equations (the first, second, fourth, and fifth equations); the third equation is

$$ -\xi H_y = M_{31}E_x + M_{32}E_y + M_{33}E_z + M_{34}H_x + M_{35}H_y + M_{36}H_z \tag{2.15.14} $$

and the sixth equation is

$$ \xi E_y = M_{61}E_x + M_{62}E_y + M_{63}E_z + M_{64}H_x + M_{65}H_y + M_{66}H_z \tag{2.15.15} $$
We need the tangential components of E and H (namely $E_x$, $E_y$, $H_x$, and $H_y$) as explicit variables, so we want to get rid of $H_z$ and $E_z$; this is done by solving Eqs. (2.15.14) and (2.15.15) explicitly and simultaneously for $E_z$ and $H_z$ in terms of $E_x$, $E_y$, $H_x$, and $H_y$. These expressions for $E_z$ and $H_z$ are then substituted into the remaining four differential equations, to produce four linear homogeneous first-order differential equations in the four tangential field variables $E_x$, $E_y$, $H_x$, $H_y$. For convenience, we define a 4 × 1 generalized field vector ψ:

$$ \psi \equiv \begin{pmatrix} E_x \\ H_y \\ E_y \\ -H_x \end{pmatrix} \tag{2.15.16} $$

The four differential equations involving ψ can be abbreviated to

$$
\partial\psi/\partial z = -i(\omega/c)D\psi = -i(\omega/c)
\begin{pmatrix}
D_{11} & D_{12} & D_{13} & D_{14} \\
D_{21} & D_{22} & D_{23} & D_{24} \\
D_{31} & D_{32} & D_{33} & D_{34} \\
D_{41} & D_{42} & D_{43} & D_{44}
\end{pmatrix} \psi
\tag{2.15.17}
$$
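where D is a new 4 × 4 matrix, called the differential propagation matrix, that depends on M and ξ. The relations between the elements of the 4 × 4 matrix D and the elements of the 6 × 6 matrix M are

$$
\begin{aligned}
D_{11} &= M_{51} + (M_{53} + \xi)a_{31} + M_{56}a_{61} \\
D_{12} &= M_{55} + (M_{53} + \xi)a_{35} + M_{56}a_{65} \\
D_{13} &= M_{52} + (M_{53} + \xi)a_{32} + M_{56}a_{62} \\
D_{14} &= -M_{54} - (M_{53} + \xi)a_{34} - M_{56}a_{64} \\
D_{21} &= M_{11} + M_{13}a_{31} + M_{16}a_{61} \\
D_{22} &= M_{15} + M_{13}a_{35} + M_{16}a_{65} \\
D_{23} &= M_{12} + M_{13}a_{32} + M_{16}a_{62} \\
D_{24} &= -M_{14} - M_{13}a_{34} - M_{16}a_{64} \\
D_{31} &= -M_{41} - M_{43}a_{31} - M_{46}a_{61} \\
D_{32} &= -M_{45} - M_{43}a_{35} - M_{46}a_{65} \\
D_{33} &= -M_{42} - M_{43}a_{32} - M_{46}a_{62} \\
D_{34} &= M_{44} + M_{43}a_{34} + M_{46}a_{64} \\
D_{41} &= M_{21} + M_{23}a_{31} + (M_{26} - \xi)a_{61} \\
D_{42} &= M_{25} + M_{23}a_{35} + (M_{26} - \xi)a_{65} \\
D_{43} &= M_{22} + M_{23}a_{32} + (M_{26} - \xi)a_{62} \\
D_{44} &= -M_{24} - M_{23}a_{34} - (M_{26} - \xi)a_{64}
\end{aligned}
\tag{2.15.18}
$$

with the auxiliary definitions

$$
\begin{aligned}
a_{31} &\equiv (M_{61}M_{36} - M_{31}M_{66})/d \\
a_{32} &\equiv [(M_{62} - \xi)M_{36} - M_{32}M_{66}]/d \\
a_{34} &\equiv (M_{64}M_{36} - M_{34}M_{66})/d \\
a_{35} &\equiv [M_{65}M_{36} - (M_{35} + \xi)M_{66}]/d \\
a_{61} &\equiv (M_{63}M_{31} - M_{33}M_{61})/d \\
a_{62} &\equiv [M_{63}M_{32} - M_{33}(M_{62} - \xi)]/d \\
a_{64} &\equiv (M_{63}M_{34} - M_{33}M_{64})/d \\
a_{65} &\equiv [M_{63}(M_{35} + \xi) - M_{33}M_{65}]/d
\end{aligned}
\tag{2.15.19}
$$

and further

$$ d \equiv M_{33}M_{66} - M_{36}M_{63} \tag{2.15.20} $$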
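$$ \xi \equiv n_m \sin\theta_i \tag{2.15.21} $$

[so that, consistent with Eq. (2.15.12), the x component of the incident wavevector is $(\omega/c)\xi = (\omega/c)\, n_m \sin\theta_i$],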
where $n_m$ is the real refractive index of the isotropic ambient medium (air). Formally, if one has the experimental values of the dielectric tensor ε, the magnetic permeability tensor μ, and the optical rotation tensors ρ and ρ′ for the substrate, one can construct first the optical matrix M, then the differential propagation matrix D, and ξ, which, to repeat, gives the x component of the wavevector of the incident wave. Once D is known, the law of propagation (wave equation) for the generalized field vector ψ (the components of E and H parallel to the x and y axes) is specified by Eq. (2.15.17). Experimentally, one travels this path backwards.

Consider the relationship between D and the dielectric tensor ε. In ellipsometry, there is reflection and transmission by the surface (z = 0) of a semi-infinite anisotropic substrate (biaxial crystal) in contact with an isotropic ambient (air, for z < 0). Suppose that this semi-infinite anisotropic medium (the crystal) is homogeneous and that its optical matrix M is independent of z (if M, and hence D, does depend on z, that is, on how far into the crystal one goes, then the problem becomes much more difficult). If the optical matrix M of the substrate is independent of z, then so is the differential propagation matrix D; if D is independent of z and has an eigenvalue ζ, to be found below, the solution of Eq. (2.15.17) is given by

$$ \psi(z) = \psi(0)\exp(-i\omega\zeta z/c) \tag{2.15.22} $$

Here ψ(0) is the value of the generalized field vector of the plane wave at z = 0 (the crystal surface). Substituting Eq. (2.15.22) into the original differential equation, Eq. (2.15.17), one gets

$$ -i(\omega\zeta/c)\psi(z) = -i(\omega/c)D\psi(z) \tag{2.15.23} $$

which has nontrivial solutions if and only if the following determinant vanishes:

$$ \det | D - \zeta I | = 0 \tag{2.15.24} $$
where I is the 4 × 4 identity matrix. This yields four eigenvalues $\zeta_1$, $\zeta_2$, $\zeta_3$, and $\zeta_4$, and four plane-wave solutions {$\psi(z) = \psi(0)\exp(-i\omega\zeta_m z/c)$, m = 1, ..., 4} for $E_x$, $H_y$, $E_y$, and $H_x$. The four eigenvalues correspond to four values of the z component of the propagation vectors of the electromagnetic waves. Two of these are associated with waves propagating in the +z direction (into the biaxial crystal), and two waves will propagate in the negative z direction (out of the crystal into the air). Consider the two refracted plane waves propagating in the +z direction, generated by the incident waves (which traveled in the +z direction); let $\zeta_1$ and $\zeta_2$ be the corresponding two eigenvalues of Eq. (2.15.24) with a positive real part; let $c_1\psi_1(0)$ and $c_2\psi_2(0)$ be the associated eigenvectors, which are known up to the constant amplitude factors $c_1$ and $c_2$. These coefficients $c_1$ and $c_2$ will be determined by matching tangential electric and magnetic field components, in the ambient (air) and in the substrate (crystal), at their common interface z = 0. In terms of the generalized field vectors, the boundary conditions assume the form

$$ \psi_i(0^-) + \psi_r(0^-) = c_1\psi_1(0^+) + c_2\psi_2(0^+) \tag{2.15.25} $$
where the subscripts i and r indicate the incident and reflected wave components of the total field ψ in the ambient (air): $0^-$ and $0^+$ represent the ambient and substrate sides of the z = 0 interface. Equation (2.15.25) shows, on the left-hand side, the generalized field of the incident and reflected waves on the air side of the interface, while the right-hand side shows the waves generated inside the biaxial crystal. [Equation (2.15.25) is a conservation of energy condition, but since $\psi_r(0)$ has propagation vector opposite to $\psi_i(0)$, there is a positive sign in front of $\psi_r(0)$.]

We now need a relationship between H and E, involving the refractive index n. Consider the plane-wave case, that is, the case in which E depends only on one of the three coordinates, say x; that is, $E_x$, $E_y$, and $E_z$ can depend only on x, but not on y or z. For this plane wave, Eqs. (2.7.30) and (2.7.28) can be shown to yield $E_x = 0$ and $H_x = 0$. The wave equation, Eq. (2.7.28), now reduces to

$$ \partial^2 E_y/\partial x^2 = \varepsilon\mu c^{-2}\, \partial^2 E_y/\partial t^2 \tag{2.15.26} $$

A similar equation is written for $E_z$. The general solution to Eq. (2.15.26) is

$$ E_y = A \exp\left[ i\omega\left( t - \varepsilon^{1/2}\mu^{1/2} x c^{-1} \right) \right] \tag{2.15.27} $$

This solution is inserted into Eqs. (2.7.28) and (2.7.26):

$$ (\mu/c)(\partial H_z/\partial t) = -\partial E_y/\partial x = i\omega\varepsilon^{1/2}\mu^{1/2}c^{-1} A \exp\left[ i\omega\left( t - \varepsilon^{1/2}\mu^{1/2} x/c \right) \right] \tag{2.15.28} $$

After integrating with respect to t and using Eq. (2.16.31), one gets

$$ H_z = \varepsilon^{1/2}\mu^{-1/2} E_y \ \ \text{(cgs)} \tag{2.15.29} $$
For nonmagnetic media we have μ = 1, so the Maxwell relation, Eq. (2.15.7), yields finally

$$ H_z = nE_y \ \ \text{(cgs)} \tag{2.15.30} $$

Using this in Eq. (2.15.25), we want to know the generalized field vectors $\psi_i$ and $\psi_r$ of the incident and reflected waves. We can write down $\psi_i$ and $\psi_r$ in terms of components parallel (∥, or p) and perpendicular (⊥, or s) to the plane of incidence. In a nonmagnetic (μ = 1) optically isotropic medium, Eq. (2.15.30) shows that the magnetic field components are simply related to their associated orthogonal electric field components through the index of refraction n:

$$ H_\parallel/E_\perp = H_\perp/E_\parallel = n \tag{2.15.31} $$
Consider the plane-polarized incident beam, with amplitude $E_{\parallel i}$ in the plane of incidence (the xz plane, hence with x and z components; see Fig. 2.13) and $E_{\perp i}$ in the direction normal to the plane of incidence (the y axis), traversing an isotropic medium (air) of refractive index $n_m$, and making an angle of incidence $\theta_i$ with the surface normal. For the component polarized in the plane of incidence, $E_y = 0$; with Eq. (2.15.31) this yields also $H_x = 0$. From the definition of the generalized field vector, Eq. (2.15.16), and Eq. (2.15.31) we can get the s and p components of $\psi_i$:

$$
\psi_{\parallel i} = E_{\parallel i} \begin{pmatrix} \cos\theta_i \\ n_m \\ 0 \\ 0 \end{pmatrix}, \qquad
\psi_{\perp i} = E_{\perp i} \begin{pmatrix} 0 \\ 0 \\ 1 \\ n_m\cos\theta_i \end{pmatrix}
\tag{2.15.32}
$$

and

$$ \psi_i \equiv \psi_{\parallel i} + \psi_{\perp i} \tag{2.15.33} $$

Also,

$$
\psi_{\parallel r} = E_{\parallel r} \begin{pmatrix} -\cos\theta_i \\ n_m \\ 0 \\ 0 \end{pmatrix}, \qquad
\psi_{\perp r} = E_{\perp r} \begin{pmatrix} 0 \\ 0 \\ 1 \\ -n_m\cos\theta_i \end{pmatrix}
\tag{2.15.34}
$$

and

$$ \psi_r \equiv \psi_{\parallel r} + \psi_{\perp r} \tag{2.15.35} $$
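where i and r denote the incident and reflected beams, respectively, assuming $\theta_i = \theta_r$. After substituting into Eq. (2.15.25) the $\psi_i$ and $\psi_r$ given by Eqs. (2.15.32) to (2.15.35), and denoting by $n_m$ the index of refraction of the ambient (air), one gets

$$
\begin{aligned}
(E_{\parallel i} - E_{\parallel r})\cos\theta_i &= c_1\psi_{11} + c_2\psi_{12} \\
(E_{\parallel i} + E_{\parallel r})\, n_m &= c_1\psi_{21} + c_2\psi_{22} \\
(E_{\perp i} + E_{\perp r}) &= c_1\psi_{31} + c_2\psi_{32} \\
(E_{\perp i} - E_{\perp r})\, n_m\cos\theta_i &= c_1\psi_{41} + c_2\psi_{42}
\end{aligned}
\tag{2.15.36}
$$

where $\psi_{jk}$ is the jth component (j = 1, ..., 4) of the kth column eigenvector (k = 1, 2), that is, of $\psi_1(0^+)$ and $\psi_2(0^+)$. Define the 2 × 1 column vector c, whose two elements determine the relative amplitudes ($c_1$ and $c_2$) of the two refracted (transmitted) waves propagating into the crystal. Define the total electric field vectors

$$ E_i \equiv E_{i\parallel} + E_{i\perp} \tag{2.15.37} $$

$$ E_r \equiv E_{r\parallel} + E_{r\perp} \tag{2.15.38} $$

and define the 2 × 2 complex-amplitude reflection matrix R by

$$ E_r \equiv RE_i \tag{2.15.39} $$

or, in detail,

$$
\begin{pmatrix} E_{\parallel r} \\ E_{\perp r} \end{pmatrix}
= \begin{pmatrix} R_{\parallel\parallel} & R_{\parallel\perp} \\ R_{\perp\parallel} & R_{\perp\perp} \end{pmatrix}
\begin{pmatrix} E_{\parallel i} \\ E_{\perp i} \end{pmatrix}
= \begin{pmatrix} R_{\parallel\parallel}E_{\parallel i} + R_{\parallel\perp}E_{\perp i} \\ R_{\perp\parallel}E_{\parallel i} + R_{\perp\perp}E_{\perp i} \end{pmatrix}
\tag{2.15.40}
$$

Then the four equations in Eq. (2.15.36) become

$$ c = S_1(E_i - E_r), \qquad c = S_2(E_i + E_r) \tag{2.15.41} $$

where we define further

$$
S_1 \equiv \cos\theta_i \begin{pmatrix} \psi_{11} & \psi_{12} \\ \psi_{41}/n_m & \psi_{42}/n_m \end{pmatrix}^{-1}
\quad \text{and} \quad
S_2 \equiv \begin{pmatrix} \psi_{21}/n_m & \psi_{22}/n_m \\ \psi_{31} & \psi_{32} \end{pmatrix}^{-1}
\tag{2.15.42}
$$

Elimination of c finally leads to an expression of the reflection matrix R in terms of the generalized field eigenvectors and the index of refraction of the ambient medium (air):

$$ R = (S_1 + S_2)^{-1}(S_1 - S_2) \tag{2.15.43} $$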
So far, however, one still needs an expression for the reflection matrix that shows how to extract from it the tensor elements for the refractive index tensor of the biaxial medium. We seek the reflection matrix R for the semi-infinite anisotropic biaxial medium. Inserting the principal-axis dielectric tensor, Eq. (2.15.8), with μ = 1 and ρ = ρ′ = 0, into Eqs. (2.15.18) to (2.15.20), we can relate the 4 × 4 differential propagation matrix D to the dielectric tensor ε. Then it can be shown that

$$
D = \begin{pmatrix}
0 & 1 - \xi^2/\varepsilon_3 & 0 & 0 \\
\varepsilon_1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & \varepsilon_2 - \xi^2 & 0
\end{pmatrix}
\tag{2.15.44}
$$

The eigenvalues of D are given by

$$ \zeta_{1,3} = \pm\left[ \varepsilon_1\left( 1 - \xi^2/\varepsilon_3 \right) \right]^{1/2} \tag{2.15.45} $$

$$ \zeta_{2,4} = \pm\left( \varepsilon_2 - \xi^2 \right)^{1/2} \tag{2.15.46} $$
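The eigenvalues above can be verified directly. The following pure-Python sketch builds the matrix D of Eq. (2.15.44) and checks that Dψ = ζψ for the eigenvalues of Eqs. (2.15.45) and (2.15.46), using the eigenvectors quoted in Eqs. (2.15.47) and (2.15.48). It assumes that ξ is the dimensionless quantity $n_m \sin\theta_i$ (as required for $\varepsilon_2 - \xi^2$ to be meaningful); the function names and the complex dielectric constants are hypothetical test values.

```python
import cmath

def biaxial_D(e1, e2, e3, xi):
    """4 x 4 differential propagation matrix of Eq. (2.15.44)."""
    return [
        [0, 1 - xi**2 / e3, 0, 0],
        [e1, 0, 0, 0],
        [0, 0, 0, 1],
        [0, 0, e2 - xi**2, 0],
    ]

def matvec(A, x):
    """Matrix-vector product for nested lists."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def eigenpairs(e1, e2, e3, xi):
    """(eigenvalue, eigenvector) pairs, ordered as zeta_1, zeta_2, zeta_3, zeta_4."""
    a = cmath.sqrt(1 - xi**2 / e3)
    b = cmath.sqrt(e1)
    s = cmath.sqrt(e2 - xi**2)
    return [
        (a * b, [a, b, 0, 0]),
        (s, [0, 0, 1, s]),
        (-a * b, [a, -b, 0, 0]),
        (-s, [0, 0, 1, -s]),
    ]

# Hypothetical weakly absorbing biaxial crystal; xi = n_m sin(theta_i) assumed
e1, e2, e3, xi = 2.25 - 0.1j, 2.4 - 0.05j, 2.6 - 0.2j, 0.5
D = biaxial_D(e1, e2, e3, xi)
for zeta, vec in eigenpairs(e1, e2, e3, xi):
    residual = max(abs(l - zeta * v) for l, v in zip(matvec(D, vec), vec))
    print(zeta, residual)
```

Because D is block-diagonal, the p-type pair ($E_x$, $H_y$) and the s-type pair ($E_y$, $H_x$) decouple, which is why each eigenvector has only two nonzero components.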
The eigenvectors are then

$$
\psi_1 = \begin{pmatrix} \sqrt{1 - \xi^2/\varepsilon_3} \\ \sqrt{\varepsilon_1} \\ 0 \\ 0 \end{pmatrix}, \qquad
\psi_3 = \begin{pmatrix} \sqrt{1 - \xi^2/\varepsilon_3} \\ -\sqrt{\varepsilon_1} \\ 0 \\ 0 \end{pmatrix}
\tag{2.15.47}
$$

$$
\psi_2 = \begin{pmatrix} 0 \\ 0 \\ 1 \\ \sqrt{\varepsilon_2 - \xi^2} \end{pmatrix}, \qquad
\psi_4 = \begin{pmatrix} 0 \\ 0 \\ 1 \\ -\sqrt{\varepsilon_2 - \xi^2} \end{pmatrix}
\tag{2.15.48}
$$

where positive eigenvalues are associated with $\psi_1$ and $\psi_2$. We need next to obtain the reflection matrix R for this case. It has been shown that the reflection matrix reduces to

$$
\begin{aligned}
R_{\parallel\parallel} = R_{11} &= \left[ \varepsilon_1^{1/2}\cos\theta_i - n_m\left( 1 - \xi^2\varepsilon_3^{-1} \right)^{1/2} \right] \Big/ \left[ \varepsilon_1^{1/2}\cos\theta_i + n_m\left( 1 - \xi^2\varepsilon_3^{-1} \right)^{1/2} \right] \\
R_{\parallel\perp} = R_{12} &= 0 \\
R_{\perp\perp} = R_{22} &= \left[ n_m\cos\theta_i - \left( \varepsilon_2 - \xi^2 \right)^{1/2} \right] \Big/ \left[ n_m\cos\theta_i + \left( \varepsilon_2 - \xi^2 \right)^{1/2} \right] \\
R_{\perp\parallel} = R_{21} &= 0
\end{aligned}
\tag{2.15.49}
$$

In the monoclinic case, in which the dielectric tensor is not in the principal-axis system,

$$
\varepsilon = \begin{pmatrix} \varepsilon_{xx} & 0 & \varepsilon_{xz} \\ 0 & \varepsilon_{yy} & 0 \\ \varepsilon_{xz} & 0 & \varepsilon_{zz} \end{pmatrix}
\tag{2.15.50}
$$
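the reflection matrix values are

$$
\begin{aligned}
R_{\parallel\parallel} = R_{11} &= \left[ \left( \varepsilon_{xx} - \varepsilon_{xz}^2\varepsilon_{zz}^{-1} \right)^{1/2}\cos\theta_i - n_m\left( 1 - \xi^2\varepsilon_{zz}^{-1} \right)^{1/2} \right] \left[ \left( \varepsilon_{xx} - \varepsilon_{xz}^2\varepsilon_{zz}^{-1} \right)^{1/2}\cos\theta_i + n_m\left( 1 - \xi^2\varepsilon_{zz}^{-1} \right)^{1/2} \right]^{-1} \\
R_{\parallel\perp} = R_{12} &= 0 \\
R_{\perp\perp} = R_{22} &= \left[ n_m\cos\theta_i - \left( \varepsilon_{yy} - \xi^2 \right)^{1/2} \right] \left[ n_m\cos\theta_i + \left( \varepsilon_{yy} - \xi^2 \right)^{1/2} \right]^{-1} \\
R_{\perp\parallel} = R_{21} &= 0
\end{aligned}
\tag{2.15.51}
$$

We next calculate the null setting of an ellipsometer from the reflection matrix in an anisotropic sample. The Jones vector for the reflected light is given by Eq. (2.15.40); for a general anisotropic sample the off-diagonal elements of the reflection matrix R are nonzero.

We next seek the fundamental equation of ellipsometry. In the isotropic case (where n″ and n are both scalar) the diagonal 2 × 2 reflection matrix R is given by

$$
\begin{pmatrix} E'_\parallel \\ E'_\perp \end{pmatrix}
= \begin{pmatrix} R_\parallel & 0 \\ 0 & R_\perp \end{pmatrix}
\begin{pmatrix} E_\parallel \\ E_\perp \end{pmatrix}
\tag{2.15.52}
$$

The complex reflectance ratio ρ is defined by

$$ \rho \equiv R_\parallel/R_\perp = \left[ E'_\parallel/E_\parallel \right] \big/ \left[ E'_\perp/E_\perp \right] \tag{2.15.53} $$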
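Using E(t) = E exp[i(ωt + φ)] = E exp[iωt] exp[iφ] for all polarizations, we get
$$\rho \equiv R_\parallel/R_\perp = \left\{E'_\parallel E_\perp / E'_\perp E_\parallel\right\}\exp\left\{i\left[\phi'_\parallel-\phi'_\perp-\phi_\parallel+\phi_\perp\right]\right\} \qquad (2.15.54)$$
This can be simplified by defining four new angles, ψ, ψ′, δ, and δ′, where
$$\tan\psi \equiv E_\parallel/E_\perp; \quad \tan\psi' \equiv E'_\parallel/E'_\perp; \quad \delta \equiv \phi_\parallel-\phi_\perp; \quad \text{and} \quad \delta' \equiv \phi'_\parallel-\phi'_\perp \qquad (2.15.55)$$
then
$$\rho = \left[\tan\psi'/\tan\psi\right]\exp\left[i(\delta'-\delta)\right] \qquad (2.15.56)$$
and with two more trivial definitions, namely,
$$\tan\Psi \equiv \tan\psi'/\tan\psi \quad \text{and} \quad \Delta \equiv \delta'-\delta \qquad (2.15.57)$$
we finally get the fundamental equation of ellipsometry:
$$\rho = \tan\Psi\,\exp(i\Delta) \qquad (2.15.58)$$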
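Equation (2.15.58) can be inverted directly: Ψ is the arctangent of the modulus of ρ, and Δ is its phase. The short sketch below illustrates this; the function name and the sample values of Ψ and Δ are assumptions chosen for illustration.

```python
import cmath
import math

def psi_delta_from_rho(rho):
    """Return (Psi, Delta) in degrees for a complex reflectance ratio rho = tan(Psi) exp(i Delta)."""
    psi = math.degrees(math.atan(abs(rho)))
    delta = math.degrees(cmath.phase(rho))  # phase is returned in the interval (-180, 180] degrees
    return psi, delta

# Build rho from assumed ellipsometric angles and recover them
rho = math.tan(math.radians(30.0)) * cmath.exp(1j * math.radians(120.0))
psi, delta = psi_delta_from_rho(rho)
```

Because the phase is returned in (−180°, 180°], a value of Δ quoted in the range 0° to 360° may need 360° added to the result.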
In the ellipsometer, let P, A, and Q be the angular settings of the polarizer, analyzer, and compensator, respectively; these angles are measured
counterclockwise for an observer looking into the light beam, and they are zero when the transmission axes of the polarizer and analyzer and the fast axis of the compensator are in the plane of incidence. If the compensator is a perfect quarterwave plate and if the electric field intensities are normalized, then the Jones vector representing the polarization of the light before it strikes the surface of the sample is given by
$$\begin{pmatrix} E_{\parallel i} \\ E_{\perp i} \end{pmatrix} = \begin{pmatrix} \cos Q\cos(P-Q) + i\sin Q\sin(P-Q) \\ \sin Q\cos(P-Q) - i\cos Q\sin(P-Q) \end{pmatrix} \qquad (2.15.59)$$
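The Jones vector above can be rebuilt step by step: the polarizer output is rotated into the compensator frame, the slow-axis component is retarded by a quarter wave, and the result is rotated back. The sketch below assumes a retardation factor of −i on the slow axis (a sign convention chosen here so that the first component matches the text); the function names are illustrative.

```python
import math

def jones_from_elements(P, Q):
    """Polarizer at angle P followed by an ideal quarter-wave plate with fast axis at Q (radians)."""
    ex, ey = math.cos(P), math.sin(P)               # normalized field leaving the polarizer
    fast = math.cos(Q) * ex + math.sin(Q) * ey      # component along the fast axis, cos(P - Q)
    slow = -math.sin(Q) * ex + math.cos(Q) * ey     # component along the slow axis, sin(P - Q)
    slow = -1j * slow                               # quarter-wave retardation (assumed sign convention)
    e_par = math.cos(Q) * fast - math.sin(Q) * slow   # rotate back to the plane of incidence
    e_perp = math.sin(Q) * fast + math.cos(Q) * slow
    return e_par, e_perp

def jones_closed_form(P, Q):
    """Closed-form components for the same arrangement."""
    c, s = math.cos(P - Q), math.sin(P - Q)
    return (math.cos(Q) * c + 1j * math.sin(Q) * s,
            math.sin(Q) * c - 1j * math.cos(Q) * s)

P, Q = 0.7, 0.3   # arbitrary test angles in radians
built = jones_from_elements(P, Q)
closed = jones_closed_form(P, Q)
mismatch = max(abs(a - b) for a, b in zip(built, closed))
intensity = abs(closed[0])**2 + abs(closed[1])**2   # should equal 1 for normalized fields
```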
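If the reflected beam can be extinguished by the analyzer, then the phases of E∥r and E⊥r must be equal, and the imaginary part of the ratio E∥r/E⊥r must be zero. Substituting this condition into Eq. (2.15.59) yields
$$F\sin(2P-2Q) + G\cos(2P-2Q) + H = 0 \qquad (2.15.60)$$
where F, G, and H are defined by
$$F \equiv \mathrm{Re}\left(R_{11}R_{22}^* - R_{12}R_{21}^*\right) \qquad (2.15.61)$$
$$G \equiv \left[\mathrm{Im}\left(R_{11}R_{22}^* + R_{12}R_{21}^*\right)\right]\sin 2Q + \left[\mathrm{Im}\left(R_{11}R_{21}^* - R_{12}R_{22}^*\right)\right]\cos 2Q \qquad (2.15.62)$$
$$H \equiv \mathrm{Im}\left(R_{11}R_{21}^* + R_{12}R_{22}^*\right) \qquad (2.15.63)$$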
The two solutions of Eq. (2.15.60) are
$$P_1 = (1/2)\left\{\sin^{-1}\left[-H/\left(F^2+G^2\right)^{1/2}\right] - \tan^{-1}(G/F)\right\} + Q \qquad (2.15.64)$$
$$P_2 = (1/2)\left\{\pi - \sin^{-1}\left[-H/\left(F^2+G^2\right)^{1/2}\right] - \tan^{-1}(G/F)\right\} + Q \qquad (2.15.65)$$
where H² < F² + G² must hold. Equation (2.15.60) gives the relation between the polarizer and compensator that will provide a null setting. Equations (2.15.64) and (2.15.65) give a solution for P for a fixed compensator or a fixed polarizer. The plane-polarized reflected beam is extinguished when the setting of analyzer angle A satisfies the following equation:
$$\tan A = -\left[R_{11}E_{\parallel i} + R_{12}E_{\perp i}\right] / \left[R_{21}E_{\parallel i} + R_{22}E_{\perp i}\right] \qquad (2.15.66)$$
Two real values of A are obtained from Eq. (2.15.66), one for each of the two values of P from Eqs. (2.15.64) and (2.15.65). The two sets of null settings for P and A are related to the different zone settings, but sign conventions vary from instrument to instrument. In brief, the steps needed to calculate the instrumental settings P and A from given values of the permittivity ε, permeability μ, and optical rotation ρ, ρ′ tensors are as follows:
1. Set up ε, μ, ρ, and ρ′ and compute the elements of M.
2. Compute the elements of the differential propagation matrix D.
3. Determine the eigenvalues zi and the eigenvectors ci of D.
4. Find the two eigenvectors ci associated with each of the eigenvalues zi.
5. Compute the refractive index values from the relationship between ci, zi, and D.
6. Compute the reflection matrix R.
7. Compute the polarizer and analyzer angles P and A from the reflection matrix R.
The scheme above might be called the "forward" calculation. The practical method to compute "backward" (that is, to calculate the dielectric constant tensor values, or the complex index of refraction, from a set of observed polarizer and analyzer angles) is not presented here. Instead, for a biaxial crystal, the technique is as follows:
1. Use a single wavelength of light.
2. Collect P and A values at 15° intervals (which will repeat after a 180° rotation) in four zones (thus, 12 × 4 unique data) and at two angular settings of the ellipsometer (two different angles of incidence); thus there are 96 unique data.
3. Test initial values of the complex index of refraction (n_x − iκ_x, n_y − iκ_y, n_z − iκ_z) and of three Eulerian rotation angles for the crystal face. Thus there are nine parameters to be determined, for 96 data points: The system is mathematically overdetermined.
4. Once a reasonable starting set of values for the nine parameters is found, refine these parameters using the Simplex algorithm.
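The last step of the forward calculation can be sketched numerically. The code below takes a reflection matrix R and a compensator angle Q, forms F, G, and H with the signs that follow from substituting the incident Jones vector of Eq. (2.15.59) into the null condition, solves for the two polarizer angles, and then checks that the reflected beam is indeed linearly polarized. The reflection-matrix entries are invented test values, the function names are illustrative, and `atan2(G, F)` is used as a quadrant-safe stand-in for tan⁻¹(G/F); all of these are assumptions of this sketch.

```python
import math

def incident_jones(P, Q):
    """Incident Jones vector for polarizer angle P and quarter-wave compensator angle Q (radians)."""
    c, s = math.cos(P - Q), math.sin(P - Q)
    return (math.cos(Q) * c + 1j * math.sin(Q) * s,
            math.sin(Q) * c - 1j * math.cos(Q) * s)

def null_polarizer_angles(R, Q):
    """Two polarizer angles that satisfy F sin(2P-2Q) + G cos(2P-2Q) + H = 0."""
    (R11, R12), (R21, R22) = R
    F = (R11 * R22.conjugate() - R12 * R21.conjugate()).real
    G = ((R11 * R22.conjugate() + R12 * R21.conjugate()).imag * math.sin(2 * Q)
         + (R11 * R21.conjugate() - R12 * R22.conjugate()).imag * math.cos(2 * Q))
    H = (R11 * R21.conjugate() + R12 * R22.conjugate()).imag
    norm = math.hypot(F, G)
    if abs(H) >= norm:
        raise ValueError("no null setting: H^2 < F^2 + G^2 is violated")
    phase = math.atan2(G, F)          # quadrant-safe form of arctan(G/F)
    base = math.asin(-H / norm)
    return (0.5 * (base - phase) + Q, 0.5 * (math.pi - base - phase) + Q)

# Hypothetical anisotropic reflection matrix (assumed values, for illustration only)
R = ((0.30 + 0.10j, 0.02 - 0.01j), (0.015 + 0.02j, -0.45 + 0.05j))
Q = math.radians(45.0)

residuals, analyzer_angles = [], []
for P in null_polarizer_angles(R, Q):
    e_par, e_perp = incident_jones(P, Q)
    r_par = R[0][0] * e_par + R[0][1] * e_perp      # reflected parallel component
    r_perp = R[1][0] * e_par + R[1][1] * e_perp     # reflected perpendicular component
    residuals.append((r_par * r_perp.conjugate()).imag)        # zero for linear polarization
    analyzer_angles.append(math.atan(-(r_par / r_perp).real))  # analyzer null angle A
```

In a "backward" refinement, a routine like this would be called repeatedly inside the minimizer, with R recomputed from trial optical constants at every step.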
2.16 TRANSFORMS
Various mathematical transforms and their reverse, or "back" transforms, are often used in succession to "clean up" and filter out noise effects from experimental data, by first calculating the full transform, then eliminating the higher-order terms which are ascribed to noise, and finally computing the back transform. They are also very useful for solving differential equations. Furthermore, transforms are often used only one way, taking data from coordinate space to momentum space, or from coordinate space to time space, and so on. Common to all these techniques is the transformation integral:
$$F(k) = \int_{x=a}^{x=b} f(x)\,K(k,x)\,dx \qquad (2.16.1)$$
where K(k, x) is the "kernel." The limits of integration a, b differ from transform to transform. A short list of transforms is
1. Fourier¹⁰⁷ transform [$K(k,x) = \exp(-ikx)$]
2. Laplace transform [$K(k,x) = \exp(-kx)$]
3. Hadamard¹⁰⁸ transform
4. Wavelet transform
5. Abel transform [$K(k,x) = 2x\,(x^2-k^2)^{-1/2}$]
6. Hankel¹⁰⁹ or Fourier–Bessel¹¹⁰ transform [$K(k,x) = J_m(kx)$, where $J_m(kx)$ is the mth-order Bessel function]
7. Hartley¹¹¹ transform
8. Hilbert¹¹² transform [$K(k,z) = \mathrm{PV}\,(k-z)^{-1}$, where PV stands for principal value]
9. Linear canonical transform
10. Mellin¹¹³ transform [$K(k,z) = k^{z-1}$]
11. Radon¹¹⁴ transform
12. Stieltjes¹¹⁵ transform
13. Sumudu or "smooth" transform [$K(p,x) = p^{-1}\exp(-x/p)$]
¹⁰⁷ Jean-Baptiste Fourier (1768–1830).
¹⁰⁸ Jacques Solomon Hadamard (1866–1963).
We shall discuss only the first four here. The Fourier transform of a one-dimensional function f(x) is F(k), defined by
$$F(k) = \int_{x=-\infty}^{x=\infty} f(x)\exp(-ikx)\,dx \qquad (2.16.2)$$
and the corresponding reverse or back Fourier transform is
$$f(x) = \frac{1}{2\pi}\int_{k=-\infty}^{k=\infty} F(k)\exp(ikx)\,dk \qquad (2.16.3)$$
The factor of 2π is sometimes evenly assigned as a factor $(2\pi)^{-1/2}$ for both the forward and inverse transforms. Remember exp(ikx) = cos(kx) + i sin(kx). There are real (cosine) and complex (sine) versions of the transform. If the function is three-dimensional, the transform becomes
$$F(\mathbf{k}) = \int_{\mathbf{r}=-\infty}^{\mathbf{r}=\infty} f(\mathbf{r})\exp(-i\mathbf{k}\cdot\mathbf{r})\,d\mathbf{r} = \int_{x=-\infty}^{x=\infty}\int_{y=-\infty}^{y=\infty}\int_{z=-\infty}^{z=\infty} f(\mathbf{r})\exp(-i\mathbf{k}\cdot\mathbf{r})\,dx\,dy\,dz$$
and the inverse transform is
$$f(\mathbf{r}) = \frac{1}{(2\pi)^3}\int_{\mathbf{k}=-\infty}^{\mathbf{k}=\infty} F(\mathbf{k})\exp(i\mathbf{k}\cdot\mathbf{r})\,d\mathbf{k} \qquad (2.16.4)$$
¹⁰⁹ Hermann Hankel (1839–1873).
¹¹⁰ Friedrich Bessel (1784–1846).
¹¹¹ Ralph Vinton Lyon Hartley (1888–1970).
¹¹² David Hilbert (1862–1943).
¹¹³ Hjalmar Mellin (1854–1933).
¹¹⁴ Johann Karl August von Radon (1887–1956).
¹¹⁵ Thomas Joannes Stieltjes (1856–1894).
The Fourier transforms of aperiodic functions f(x) are boring. If, however, the functions f(x) or f(r) are periodic with period L or **L**, then the transforms and inverse transforms become sums instead of integrals. If the three Dirichlet¹¹⁶ conditions are satisfied: (i) periodicity: f(x) = f(x + L); (ii) f(x) is continuous in the interval 0 ≤ x ≤ L except at a finite number of points; and (iii) f(x) has a finite number of maxima or minima in the period L, then the Fourier series for f(x) is given by two infinite series:
$$f(x) = a_0/2 + \sum_{n=1}^{n=\infty} a_n\cos(2\pi nx/L) + \sum_{n=1}^{n=\infty} b_n\sin(2\pi nx/L) \qquad (2.16.5)$$
where the coefficients are:
$$a_n \equiv \frac{2}{L}\int_{x=0}^{x=L} f(x)\cos(2\pi nx/L)\,dx \qquad (2.16.6)$$
$$b_n \equiv \frac{2}{L}\int_{x=0}^{x=L} f(x)\sin(2\pi nx/L)\,dx \qquad (2.16.7)$$
Note that if the function is even: f(−x) = f(x), then only the coefficients a_n in the cosine series are nonzero; if the function is odd: f(−x) = −f(x), then only the coefficients b_n in the sine series are nonzero. Figure 2.22 shows how including more terms of a Fourier expansion for the odd periodic step function [f(x) = 1 for 0 < x < π radians, f(x) = −1 for π < x < 2π, etc.] approaches a reasonable representation of the original step function. If the sums in Eq. (2.16.5) are truncated at n = N − 1, then one has a discrete Fourier transform. The discrete Fourier transform is then
$$X(k) = \sum_{n=0}^{n=N-1} x_n\exp(-2\pi ikn/N) \qquad \{k = 0, 1, 2, \ldots, N-1\} \qquad (2.16.8)$$
and the inverse discrete Fourier transform is
$$x_n = \frac{1}{N}\sum_{k=0}^{k=N-1} X(k)\exp(2\pi ikn/N) \qquad \{n = 0, 1, 2, \ldots, N-1\} \qquad (2.16.9)$$
¹¹⁶ Johann Peter Gustav Lejeune Dirichlet (1805–1859).
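The forward and inverse discrete transforms can be written out directly as double loops. The sketch below is a naive O(N²) implementation, assuming the common sign convention with a negative exponent in the forward transform and a positive exponent in the inverse; the sample data are arbitrary.

```python
import cmath

def dft(samples):
    """Naive discrete Fourier transform: X(k) = sum_n x_n exp(-2 pi i k n / N)."""
    N = len(samples)
    return [sum(samples[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def inverse_dft(spectrum):
    """Naive inverse transform: x_n = (1/N) sum_k X(k) exp(+2 pi i k n / N)."""
    N = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

samples = [1.0, 2.0, 0.0, -1.0, 0.5, 3.0, -2.0, 1.5]
spectrum = dft(samples)
recovered = inverse_dft(spectrum)

# The round trip should reproduce the data; X(0) is simply the sum of the samples
roundtrip_error = max(abs(a - b) for a, b in zip(samples, recovered))
dc_error = abs(spectrum[0] - sum(samples))
```

Filtering in the sense described at the start of this section amounts to zeroing selected entries of `spectrum` before calling `inverse_dft`.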
FIGURE 2.22. The odd periodic step function {f(x) = 1 for 0 < x < π, f(x) = −1 for π < x < 2π} has a Fourier sine series f(x) = (4/π)[sin x + (1/3) sin 3x + (1/5) sin 5x + ...]. The illustration shows how including more and more successive terms in the Fourier series improves the fit to the step function. [Plot of f(x) versus x (degrees), from −300° to 300°: the step function together with its partial sums of one to five terms.]
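The convergence illustrated in the figure can be reproduced with a few lines of code. The sketch below evaluates partial sums of the sine series at x = π/2, where the step function equals 1; the function name and the chosen numbers of terms are illustrative assumptions.

```python
import math

def square_wave_partial_sum(x, n_terms):
    """Sum of the first n_terms of (4/pi) [sin x + (1/3) sin 3x + (1/5) sin 5x + ...]."""
    return (4.0 / math.pi) * sum(math.sin((2 * j + 1) * x) / (2 * j + 1)
                                 for j in range(n_terms))

x = math.pi / 2                      # middle of the positive half-period, where f(x) = 1
term_counts = (1, 2, 5, 50)
approximations = [square_wave_partial_sum(x, n) for n in term_counts]
errors = [abs(value - 1.0) for value in approximations]
```

At this point the error shrinks steadily as terms are added; near the discontinuities at x = 0 and x = π the partial sums overshoot instead (the Gibbs phenomenon), so convergence there is much slower.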
There are trigonometric additive and multiplicative relations for both cos(nx) and sin(nx) in terms of (ultimately) sin x and cos x:
$$\sin(nx) = \sin[(n-1)x + x] = \sin[(n-1)x]\cos x + \cos[(n-1)x]\sin x \qquad (2.16.10)$$
$$\cos(nx) = \cos[(n-1)x + x] = \cos[(n-1)x]\cos x - \sin[(n-1)x]\sin x \qquad (2.16.11)$$
Once the relatively compute-intensive calculation of sin x and cos x is finished, the trigonometric expansions with multiplications and sums are relatively fast; this computational advantage was used in the Beevers¹¹⁷–Lipson¹¹⁸ method of computing Fourier maps in early X-ray structure determinations. A computationally efficient algorithm for the discrete Fourier transform, the fast Fourier transform (FFT), due to Cooley¹¹⁹ and Tukey¹²⁰, has been implemented in digital computer programs for Fourier transform spectroscopy.
¹¹⁷ Cecil Arnold Beevers (1908–2001).
¹¹⁸ Henry Lipson (1910–1991).
¹¹⁹ James William Cooley (1926–).
¹²⁰ John Wilder Tukey (1915–2000).
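The recurrences of Eqs. (2.16.10) and (2.16.11) translate into a simple loop that needs only one sine and one cosine evaluation. The sketch below is an illustration of that idea, not a reconstruction of the Beevers–Lipson procedure itself; the function name is an assumption.

```python
import math

def sines_and_cosines(x, n_max):
    """Return lists [sin(0x), ..., sin(n_max x)] and [cos(0x), ..., cos(n_max x)] by recurrence."""
    sin_x, cos_x = math.sin(x), math.cos(x)   # the only trigonometric evaluations
    sines, cosines = [0.0], [1.0]
    for _ in range(n_max):
        s_prev, c_prev = sines[-1], cosines[-1]
        sines.append(s_prev * cos_x + c_prev * sin_x)     # Eq. (2.16.10)
        cosines.append(c_prev * cos_x - s_prev * sin_x)   # Eq. (2.16.11)
    return sines, cosines

x = 0.37
sines, cosines = sines_and_cosines(x, 20)
max_error = max(max(abs(sines[n] - math.sin(n * x)), abs(cosines[n] - math.cos(n * x)))
                for n in range(21))
```

Rounding errors accumulate slowly along the recurrence, so for very large n an occasional direct evaluation may be advisable.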
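The Kronecker¹²¹ delta is defined as
$$\delta_{mn} \equiv \begin{cases} 0 & \text{if } m \neq n \\ 1 & \text{if } m = n \end{cases} \qquad (2.16.12)$$
This Kronecker delta can be called a "sum killer":
$$\sum_{n=0}^{n=\infty} a_n\delta_{mn} = a_m \qquad (2.16.13)$$
The Dirac delta function is defined as
$$\delta(x) \equiv 0 \quad \text{for } x \neq 0 \qquad (2.16.14)$$
and
$$\delta(x) \equiv \infty \quad \text{for } x = 0 \qquad (2.16.15)$$
provided that
$$\int_{x=-\infty}^{x=\infty} \delta(x)\,dx \equiv 1 \qquad (2.16.16)$$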
The Dirac delta function, very often used in quantum mechanics, is an infinitely tall but infinitely thin function, which is anathema to some mathematicians (it is called an improper function). In analogy to what the Kronecker delta does, the Dirac delta function could be called an "integral killer" because
$$\int_{x=-\infty}^{x=\infty} \delta(x-a)\,f(x)\,dx = f(a) \qquad (2.16.17)$$
Another very useful result is
$$\frac{1}{2\pi}\int_{k=-\infty}^{k=\infty} \exp(ikx)\,dk = \delta(x) \qquad (2.16.18)$$
The Fourier transform of a Gaussian is another Gaussian.
Convolution Theorem. The convolution (German: Faltung, i.e., folding) of a function f(x) with a function g(x) that has a different origin is the very useful function C(y) defined by
$$C(y) \equiv f * g \equiv \int_{x=0}^{x=y} f(x)\,g(x+y)\,dx \qquad (2.16.19)$$
If C(y) = 0 for y < 0, then the above upper integration limit y can safely go to infinity:
$$C(y) = \int_{x=0}^{x=\infty} f(x)\,g(x+y)\,dx \qquad (2.16.20)$$
The convolution theorem states that the Fourier transform of C is the product of the Fourier transform of f and the complex conjugate of the Fourier transform of g. There is also another convolution:
$$D(y) \equiv f ** g \equiv \int_{x=0}^{x=y} f(x)\,g(x-y)\,dx \qquad (2.16.21)$$
¹²¹ Leopold Kronecker (1823–1891).
with similar properties. Thus, if the Fourier transform of f is F, and if the Fourier transform of g is G, then the Fourier transform of f * g (Eq. (2.16.19)) is F G* (where * denotes the complex conjugate); the Fourier transform of f ** g (Eq. (2.16.21)) is F G. A similar theorem is valid for Laplace transforms.
PROBLEM 2.16.1. Show that
$$\int_{x=0}^{x=L} \exp(i2\pi mx/L)\exp(-i2\pi nx/L)\,dx = \delta_{mn}L$$
PROBLEM 2.16.2. Prove Eq. (2.16.18).
PROBLEM 2.16.3. Show that the Heaviside function, defined by {H(x) ≡ 0 for x < 0, H(x) ≡ 1 for x > 0, H(0) ≡ 1/2}, is the integral of the delta function.
PROBLEM 2.16.4. Prove Parseval's¹²² theorem: If $F(k) = \int_{x=-\infty}^{x=\infty} f(x)\exp(-ikx)\,dx$ is the Fourier transform of f(x), and if f(x) is the inverse transform of F(k): $f(x) = \frac{1}{2\pi}\int_{k=-\infty}^{k=\infty} F(k)\exp(ikx)\,dk$, then the integral of |f(x)|² over all x equals (1/2π) times the integral of |F(k)|² over all k.
¹²² Marc-Antoine Parseval (1755–1836).
PROBLEM 2.16.5. Show that the three-dimensional Fourier transform of the normalized Gaussian function $f(\mathbf{r}) = 2^{3/4}\pi^{-3/4}a^{-3/2}\exp(-r^2/a^2)$ is another Gaussian function, namely $F(\mathbf{k}) = (2\pi a^2)^{3/4}\exp(-k^2a^2/4)$.
Laplace Transforms. When Eqs. (2.16.2) and (2.16.3) are modified by using a real transform kernel exp(−kx) instead of exp(−ikx), then we have the Laplace transform:
$$F(k) = \int_{x=0}^{x=\infty} f(x)\exp(-kx)\,dx \equiv L[f(x)] \equiv L[f(x); k] \qquad (2.16.22)$$
and the corresponding inverse transform is
$$f(x) = \frac{1}{2\pi i}\int_{k=c-i\infty}^{k=c+i\infty} F(k)\exp(kx)\,dk \qquad (2.16.23)$$
This transform is abbreviated as L[f(x)] or as L[f(x); k]. The Laplace transform is linear: The transform of a sum of functions is equal to the sum of transforms of each function. The Laplace transform of the first derivative of a function, f′(x) ≡ df(x)/dx, is given by
$$L[f'(x)] = k\,L[f(x)] - f(0) \qquad (2.16.24)$$
Similarly for the second derivative:
$$L[f''(x)] = k^2 L[f(x)] - f'(0) - k f(0) \qquad (2.16.25)$$
$$L\{d^2f(x)/dx^2\} = k\,L\{df(x)/dx\} - df(0)/dx \qquad (2.16.26)$$
The Laplace transform of the integral of a function is given by
$$L\left[\int_{x=0}^{x=x} f(x)\,dx\right] = k^{-1}L[f(x)] \qquad (2.16.27)$$
The inverse Laplace transform of F(k) is given by
$$L^{-1}[F(k)] = f(x) \qquad (2.16.28)$$
Formally, this inverse transform is given by
$$L^{-1}[F(k)] = \frac{1}{2\pi i}\int_{k=c-i\infty}^{k=c+i\infty} F(k)\exp(kx)\,dk \qquad (2.16.29)$$
Some transforms are listed in Table 2.9. Laplace transforms are useful, inter alia, in complicated chemical kinetics problems.
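Entries of such a table can be spot-checked by evaluating the defining integral numerically. The sketch below uses a composite trapezoid rule on a truncated interval and compares the result with two standard transform pairs, exp(−ax) ↔ 1/(k + a) and sin(ax) ↔ a/(k² + a²); the truncation point, step count, and parameter values are assumptions chosen so that the neglected tail is negligible.

```python
import math

def laplace_numeric(f, k, x_max=50.0, n_steps=100000):
    """Approximate F(k) = integral from 0 to infinity of f(x) exp(-k x) dx by the trapezoid rule on [0, x_max]."""
    h = x_max / n_steps
    total = 0.5 * (f(0.0) + f(x_max) * math.exp(-k * x_max))
    for i in range(1, n_steps):
        x = i * h
        total += f(x) * math.exp(-k * x)
    return total * h

a, k = 0.7, 1.3   # arbitrary positive test values

numeric_exp = laplace_numeric(lambda x: math.exp(-a * x), k)
exact_exp = 1.0 / (k + a)

numeric_sin = laplace_numeric(lambda x: math.sin(a * x), k)
exact_sin = a / (k**2 + a**2)
```

The same routine can be pointed at any other f(x) in the table, provided k is large enough for the integrand to have decayed by `x_max`.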
Table 2.9 A Few Laplace Transforms [18–20]

| f(x) | F(k) |
|---|---|
| $a$ | $a\,k^{-1}$ |
| $f(x)$ | $\int_{x=0}^{x=\infty}\exp(-kx)\,f(x)\,dx$ |
| $df(x)/dx$ | $kF(k) - f(0^+)$ |
| $d^nf(x)/dx^n$ | $k^nF(k) - \sum_{i=1}^{i=n} k^{i-1}\left[d^{n-i}f(x)/dx^{n-i}\right]_{x=0^+}$ |
| $x$ | $k^{-2}$ |
| $x^n$ | $n!\,k^{-n-1}$ |
| $\delta(x-a)$ {a > 0} | $\exp(-ka)$ |
| $\exp(-ax)$ {a is complex} | $(k+a)^{-1}$ |
| $x\exp(-ax)$ {a is complex} | $(k+a)^{-2}$ |
| $a^{-1}[1-\exp(-ax)]$ | $k^{-1}(k+a)^{-1}$ |
| $(1-ax)\exp(-ax)$ | $k\,(k+a)^{-2}$ |
| $(a-b)^{-1}[a\exp(-ax) - b\exp(-bx)]$ | $k\,(k+a)^{-1}(k+b)^{-1}$ |
| $(b-a)^{-1}[\exp(-ax) - \exp(-bx)]$ | $(k+a)^{-1}(k+b)^{-1}$ |
| $(b-a)^{-1}(c-a)^{-1}\exp(-ax) + (a-b)^{-1}(c-b)^{-1}\exp(-bx) + (a-c)^{-1}(b-c)^{-1}\exp(-cx)$ | $(k+a)^{-1}(k+b)^{-1}(k+c)^{-1}$ |
| $a^{-2}[1 - \exp(-ax) - ax\exp(-ax)]$ | $k^{-1}(k+a)^{-2}$ |
| $(c-b)^{-1}[(a-b)\exp(-bx) - (a-c)\exp(-cx)]$ | $(k+a)(k+b)^{-1}(k+c)^{-1}$ |
| $\sin(ax)$ | $a\,(k^2+a^2)^{-1}$ |
| $\cos(ax)$ | $k\,(k^2+a^2)^{-1}$ |
| $\sinh(ax)$ | $a\,(k^2-a^2)^{-1}$ |
| $\cosh(ax)$ | $k\,(k^2-a^2)^{-1}$ |
| $[(n-1)!]^{-1}x^{n-1}$ | $k^{-n}$ |
| $(\pi x)^{-1/2}$ | $k^{-1/2}$ |
| $2(x/\pi)^{1/2}$ | $k^{-3/2}$ |
| $(y/2)(\pi\sigma x^3)^{-1/2}\exp(-y^2/4\sigma x)$ | $\exp[-(k/\sigma)^{1/2}y]$ |
| $(\sigma/\pi x)^{1/2}\exp(-y^2/4\sigma x)$ | $(\sigma/k)^{1/2}\exp[-(k/\sigma)^{1/2}y]$ |
| $\mathrm{erfc}[(y/2)(\sigma x)^{-1/2}]$ | $k^{-1}\exp[-(k/\sigma)^{1/2}y]$ |
| $\exp(a^2x)\,\mathrm{erfc}(ax^{1/2})$ | $k^{-1/2}(k^{1/2}+a)^{-1}$ |
PROBLEM 2.16.6. Solve by Laplace transform methods the classical linear harmonic oscillator differential equation m d²y/dt² = −k_H y(t), where k_H is the Hooke's law force constant, with the initial condition dy/dt = 0 at t = 0. Note: Use p for the Laplace transform variable, to not confuse it with the Hooke's law force constant k_H!
The Hadamard transform is also called the Walsh¹²³–Hadamard, or Hadamard–Rademacher¹²⁴–Walsh, or Walsh, or Walsh–Fourier transform.
¹²³ Joseph Leonard Walsh (1895–1973).
¹²⁴ Hans Adolph Rademacher (1892–1969).
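The Hadamard transform of index m, H_m, is a 2^m × 2^m matrix whose elements are, apart from a common normalization factor, either +1 or −1, and which can be defined recursively:
$$H_m \equiv \frac{1}{\sqrt{2}}\begin{pmatrix} H_{m-1} & H_{m-1} \\ H_{m-1} & -H_{m-1} \end{pmatrix} \quad \text{for } m > 0 \qquad (2.16.30)$$
and
$$H_0 \equiv 1 \qquad (2.16.31)$$
so that
$$H_1 = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \qquad (2.16.32)$$
$$H_2 = \frac{1}{2}\begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{pmatrix} \qquad (2.16.33)$$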
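The recursive definition maps directly onto a short recursive function. The sketch below builds H_m as a list of rows and checks that the rows of H_2 are orthonormal, which is what the factor 1/√2 at each level is for; the function name is an illustrative assumption.

```python
import math

def hadamard(m):
    """Normalized Hadamard matrix H_m (2^m x 2^m) built from H_0 = 1 by the block recursion."""
    if m == 0:
        return [[1.0]]
    previous = hadamard(m - 1)
    scale = 1.0 / math.sqrt(2.0)
    top = [[scale * v for v in row] + [scale * v for v in row] for row in previous]
    bottom = [[scale * v for v in row] + [-scale * v for v in row] for row in previous]
    return top + bottom

H2 = hadamard(2)

# Rows should be orthonormal: H2 times its transpose is the identity matrix
size = len(H2)
gram = [[sum(H2[i][k] * H2[j][k] for k in range(size)) for j in range(size)]
        for i in range(size)]
orthonormality_error = max(abs(gram[i][j] - (1.0 if i == j else 0.0))
                           for i in range(size) for j in range(size))
```

Because the normalized matrix is symmetric and orthogonal, applying it twice returns the original data, so the transform is its own inverse.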
This transform is used in quantum computation, in signal processing, and in JPEG-4 compression of visual data.
Wavelet Transform. In the continuous wavelet transform, a function f(x) is decomposed into a set of (unspecified) orthonormal and square-integrable basis functions ψ(s, τ, x):
$$F(s,\tau) = \int f(x)\,\psi^*(s,\tau,x)\,dx \qquad (2.16.34)$$
and the corresponding inverse wavelet transform is
$$f(x) = \iint F(s,\tau)\,\psi(s,\tau,x)\,d\tau\,ds \qquad (2.16.35)$$
This transform uses the unspecified single mother wavelet ψ(s, τ, x) to generate other wavelets by scaling and translation:
$$\psi(s,\tau,x) = s^{-1/2}\,\psi\big((x-\tau)/s\big) \qquad (2.16.36)$$
The advantage of this transform is that its kernel ψ(s, τ, x) is left unspecified. The discrete wavelet transform was invented by Haar¹²⁵, used by petroleum geologists to extract meaningful data from noisy seismograms, and later utilized in JPEG2000 pixel compression.
¹²⁵ Alfred Haar (1885–1933).
2.17 CONTOUR INTEGRATION AND KRAMERS–KRONIG RELATIONS
For a complex function f(z) = f1(z) + if2(z) of the complex variable z, which is analytic in the upper half-plane of z and decays faster than $|z|^{-1}$, the two Kramers–Kronig¹²⁶,¹²⁷ relations are
$$f_1(z) = \frac{1}{\pi}P\left[\int_{t=-\infty}^{t=\infty}\frac{f_2(t)}{t-z}\,dt\right] \qquad (2.17.1)$$
and
$$f_2(z) = -\frac{1}{\pi}P\left[\int_{t=-\infty}^{t=\infty}\frac{f_1(t)}{t-z}\,dt\right] \qquad (2.17.2)$$
where P denotes the Cauchy¹²⁸ principal value. Thus the real and imaginary parts of f(z) are interrelated. This result can be obtained from the Cauchy integral formula: If a function f is analytic everywhere within and on a closed contour C, and if z is any point interior to C, then
$$f(z) = \frac{1}{2\pi i}\oint_C \frac{f(t)}{t-z}\,dt \qquad (2.17.3)$$
where the integral is taken in the positive sense (counterclockwise) around C. This powerful result has been used to evaluate many a difficult integral. To derive the Kramers–Kronig relation, the contour C, shown in Fig. 2.23, follows the real ("x") axis, except for a hump over the "pole" at x = t (defined as the point where $[z-t]^{-1}$ becomes infinite) and a semicircle in the upper half-plane at infinity. The integral is then split into three parts. The length of the segment at infinity increases proportionally to |z|, but its integral component vanishes, if and only if f(z) vanishes faster than $|z|^{-1}$. What is left is the segment along the real axis and the half-circle around the pole:
$$\oint_C \frac{f(z)}{z-t}\,dz = -i\pi f(t) + P\left[\int_{z=-\infty}^{z=+\infty}\frac{f(z)}{z-t}\,dz\right] = 0 \qquad (2.17.4)$$
Rearranging yields a compact form of the Kramers–Kronig relations:
$$f(t) = \frac{1}{i\pi}P\left[\int_{z=-\infty}^{z=+\infty}\frac{f(z)}{z-t}\,dz\right] \qquad (2.17.5)$$
FIGURE 2.23. Contour for Kramers–Kronig relations, including pole at x = t. [Axes: x = Re(z), y = Im(z).]
¹²⁶ Hendrik Anthony Kramers (1894–1952).
¹²⁷ Ralph Kronig (1904–1995).
¹²⁸ Augustin-Louis, Baron Cauchy (1789–1857).
The spectral response function χ(ω), a function of the angular frequency ω, is the sum of an in-phase real part χ1(ω), which is an even function of ω, and an out-of-phase (dissipative) imaginary part χ2(ω), which is an odd function of ω:
$$\chi(\omega) = \chi_1(\omega) + i\chi_2(\omega) \qquad (2.17.6)$$
Two Kramers–Kronig relations show χ1(ω) as the integral over the imaginary part χ2(ω′):
$$\chi_1(\omega) = \frac{2}{\pi}P\left[\int_{\omega'=0}^{\omega'=+\infty}\frac{\omega'\chi_2(\omega')}{\omega'^2-\omega^2}\,d\omega'\right] \qquad (2.17.7)$$
and χ2(ω) as the integral over the real part χ1(ω′):
$$\chi_2(\omega) = -\frac{2\omega}{\pi}P\left[\int_{\omega'=0}^{\omega'=+\infty}\frac{\chi_1(\omega')}{\omega'^2-\omega^2}\,d\omega'\right] \qquad (2.17.8)$$
This may seem trivial, but it is the very important result that, if one measures χ1(ω), one can calculate χ2(ω), and conversely; put differently, the Kramers–Kronig relations show that the absorptive and dispersive properties of a medium are not independent of each other. An experimental difficulty is that one must truncate the integrations at some maximum measured frequency ω: this may lead to considerable error. A similar result is obtained for the complex dielectric constant ε, which also consists of a real and even function of ω, ε1(ω), and an odd and imaginary part ε2(ω) (often written as κ):
$$\varepsilon(\omega) = \varepsilon_1(\omega) + i\varepsilon_2(\omega) = \varepsilon_1(\omega) + i\kappa(\omega) \qquad (2.17.9)$$
These relationships have been used extensively in nonlinear optics.
2.18 TREATMENT OF ERRORS
An error associated with a measurement, called the "uncertainty," is usually the smallest reading that can be read or estimated (by interpolation) from an instrument, or it is the "resolution" of that instrument (the smallest interval of the value measured available on that instrument). Assume that you make N measurements of a quantity x and get results (data) x1, x2, ..., xN. Of course, you cannot know which datum of these N data is the "true" value. But you can evaluate the mean, or average, trivially:
$$\langle x\rangle \equiv (1/N)\sum_{i=1}^{i=N} x_i \equiv (x_1 + x_2 + x_3 + \cdots + x_N)/N \qquad (2.18.1)$$
There is no guarantee that the mean, ⟨x⟩, is the true value, either: The true value is in the hands of God. But you can guess that, barring systematic errors (misreading of instruments, neglect of major factors affecting the measurement), a large number of repetitive measurements may yield a mean or average that may approach the "true" value (as N tends to infinity, ⟨x⟩ may tend to the true value). The deviation of any datum from the mean, or its "residual," is
$$d_i \equiv x_i - \langle x\rangle \qquad (2.18.2)$$
Of course, d_i can be positive or negative. The average deviation, ⟨d⟩, is defined as
$$\langle d\rangle \equiv (1/N)\sum_{i=1}^{i=N}|x_i - \langle x\rangle| \equiv \left(|x_1 - \langle x\rangle| + |x_2 - \langle x\rangle| + \cdots + |x_N - \langle x\rangle|\right)/N \qquad (2.18.3)$$
Here | | denotes the absolute value: |x| = +x if x > 0, |x| = −x if x < 0. Before we proceed, we must admit that, if ⟨x⟩ is known and used, then of the N + 1 values {x1, x2, ..., xN, plus ⟨x⟩}, only N are truly mutually independent. Therefore the denominator N in Eq. (2.18.3) should be replaced by N − 1. Accordingly, the variance σ² is defined by
$$\sigma^2 \equiv [1/(N-1)]\sum_{i=1}^{i=N}|x_i - \langle x\rangle|^2 \qquad (2.18.4)$$
Its square root is the estimated standard deviation, or sample standard deviation, σ, which is universally quoted as a good estimate of the probable error ε in the measurement:
$$\sigma = \left[(N-1)^{-1}\sum_{i=1}^{i=N}|x_i - \langle x\rangle|^2\right]^{1/2} \qquad (2.18.5)$$
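These definitions are easy to evaluate by hand or in a few lines of code. The sketch below computes the mean, the average deviation as written in Eq. (2.18.3), and the sample standard deviation with the N − 1 denominator; the six temperature readings are invented for illustration, and the standard-library `statistics` module is used only as a cross-check.

```python
import math
import statistics

readings = [17.2, 16.9, 17.0, 17.4, 16.8, 17.1]   # hypothetical repeated measurements
N = len(readings)

mean = sum(readings) / N                                       # Eq. (2.18.1)
residuals = [x - mean for x in readings]                       # Eq. (2.18.2)
average_deviation = sum(abs(d) for d in residuals) / N         # Eq. (2.18.3), denominator N as written
variance = sum(d**2 for d in residuals) / (N - 1)              # Eq. (2.18.4)
sigma = math.sqrt(variance)                                    # Eq. (2.18.5)

# Cross-check against the standard library (statistics.stdev also divides by N - 1)
library_sigma = statistics.stdev(readings)
```

A result would then typically be quoted as the mean plus or minus `sigma`, rounded to a sensible number of figures.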
2.19 PROPAGATION OF ERRORS
Rule 1. When adding or subtracting measurements with their associated errors, the absolute errors are summed (never subtracted). For instance, in "17.0 ± 0.5 °C", the "0.5 °C" is the absolute error.
Rule 2. When multiplying or dividing measurements with associated errors, the percentage errors (or the fractional errors) are summed.
Assume that in your experiment you have measured four variables x, y, z, and w and have determined errors Δx, Δy, Δz, and Δw in them. Assume further that a formula r = f(x, y, z, w) must be used to determine a final result r. What is the expected error in r? The approximate expression for the differential dr is given by
$$dr = (\partial f/\partial x)_{y,z,w}\,dx + (\partial f/\partial y)_{z,w,x}\,dy + (\partial f/\partial z)_{w,x,y}\,dz + (\partial f/\partial w)_{x,y,z}\,dw \qquad (2.19.1)$$
which, in practice, is used as follows:
$$\Delta r = (\partial f/\partial x)_{y,z,w}\,\Delta x + (\partial f/\partial y)_{z,w,x}\,\Delta y + (\partial f/\partial z)_{w,x,y}\,\Delta z + (\partial f/\partial w)_{x,y,z}\,\Delta w \qquad (2.19.2)$$
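Equation (2.19.2) can be applied to any formula by estimating the partial derivatives numerically. The sketch below does this with central differences and, following Rule 1, sums the magnitudes of the individual contributions so that errors never cancel; taking absolute values is an assumption of this sketch (the equation as printed carries no absolute-value signs), and the example formula r = xy/z and its values are invented for illustration.

```python
def propagated_error(f, values, errors, relative_step=1e-6):
    """Sum of |df/dv_i| * error_i, with partial derivatives from central differences."""
    total = 0.0
    for i, (value, error) in enumerate(zip(values, errors)):
        h = relative_step * max(abs(value), 1.0)
        shifted_up = list(values)
        shifted_down = list(values)
        shifted_up[i] = value + h
        shifted_down[i] = value - h
        partial = (f(*shifted_up) - f(*shifted_down)) / (2.0 * h)
        total += abs(partial) * error     # magnitudes are added, never subtracted
    return total

def formula(x, y, z):
    """Hypothetical working formula r = x y / z."""
    return x * y / z

values = (2.0, 3.0, 4.0)
errors = (0.1, 0.2, 0.1)

r = formula(*values)
delta_r = propagated_error(formula, values, errors)

# Rule 2 cross-check: for products and quotients the fractional errors add
fractional_sum = sum(e / v for v, e in zip(values, errors))
```

For this example the two routes agree: Δr/r equals the sum of the fractional errors, as Rule 2 states.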
2.20 STATISTICS
We introduce a few results from combinatorics, the science of ordering and sorting objects into "boxes" or "containers"; these results will be useful for developing the Maxwell–Boltzmann (MB), Fermi–Dirac (FD), and Bose¹²⁹–Einstein (BE) statistics in Chapter 5.
PROBLEM 2.20.1. Show that the number of ways of ordering N distinguishable objects (= number of permutations of N distinguishable objects) is given by
$$n_1 = N! \qquad (2.20.1)$$
PROBLEM 2.20.2. Prove that the number of ways of placing N distinguishable objects into R distinguishable boxes, so that there are N_1 objects in box 1, N_2 objects in box 2, and so on, is given by
$$n_2 = \frac{N!}{\prod_{i=1}^{i=R} N_i!} \qquad (2.20.2)$$
PROBLEM 2.20.3. Prove that the number of ways of selecting N distinguishable objects from a set of G distinguishable objects (where G > N) is given by the binomial coefficient:
$$n_3 = \binom{G}{N} \equiv \frac{G!}{N!\,(G-N)!} \qquad (2.20.3)$$
PROBLEM 2.20.4. Prove that the number of ways of placing N indistinguishable objects into G distinguishable containers is given by
$$n_4 = \frac{(G+N-1)!}{(G-1)!\,N!} \qquad (2.20.4)$$
PROBLEM 2.20.5. Show that the number of ways of placing N distinguishable objects into G distinguishable containers is given by
$$n_5 = \frac{(G+N-1)!}{(G-1)!} \qquad (2.20.5)$$
PROBLEM 2.20.6. Stirling's¹³⁰ approximation for large N is given by
$$N! \approx (2\pi N)^{1/2}N^N\exp(-N) \qquad (2.20.6)$$
which can be approximated adequately as
$$\ln N! \approx N\ln N - N \qquad (2.20.7)$$
Calculate the percent error between ln N! and N ln N − N for N = 50 and for N = 65.
PROBLEM 2.20.7. The gamma function Γ(x), invented by Leonhard Euler, is given by either of two integrals:
$$\Gamma(x) \equiv \int_{t=0}^{t=\infty} t^{x-1}\exp(-t)\,dt = \int_{t=0}^{t=1}\left[\ln(1/t)\right]^{x-1}dt$$
valid for positive real x, or for complex x with a positive real part.
(a) First show that
$$\Gamma(x+1) = x\,\Gamma(x) \qquad (2.20.8)$$
(b) Second, show that for integer positive N the gamma function becomes a factorial:
$$\Gamma(N) = (N-1)! \qquad (2.20.9)$$
(c) Third, show that
$$\Gamma(1/2) = \pi^{1/2} \qquad (2.20.10)$$
¹²⁹ Satyendra Nath Bose (1894–1974).
¹³⁰ James Stirling (1692–1770).
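Before attempting the proofs, the counting formulas can be made plausible by brute-force enumeration for small numbers. The sketch below counts selections (Eq. 2.20.3) and placements of indistinguishable objects (Eq. 2.20.4) explicitly and compares them with the factorial expressions; the function names and the choice G = 5, N = 3 are illustrative assumptions.

```python
import itertools
import math

def count_selections(G, N):
    """Number of ways to choose N objects out of G distinguishable ones, by enumeration."""
    return sum(1 for _ in itertools.combinations(range(G), N))

def count_indistinguishable_placements(N, G):
    """Number of occupation-number sets (N_1, ..., N_G) with N_1 + ... + N_G = N, by enumeration."""
    return sum(1 for occupation in itertools.product(range(N + 1), repeat=G)
               if sum(occupation) == N)

G, N = 5, 3

n3_counted = count_selections(G, N)
n3_formula = math.factorial(G) // (math.factorial(N) * math.factorial(G - N))

n4_counted = count_indistinguishable_placements(N, G)
n4_formula = math.factorial(G + N - 1) // (math.factorial(G - 1) * math.factorial(N))
```

Enumeration grows very quickly with G and N, so it serves only as a sanity check for small cases, not as a substitute for the proofs.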
2.21 GAUSSIAN, BINOMIAL, AND POISSON DISTRIBUTIONS
This standard deviation σ occurs in the Gaussian or normal error probability function P_Gaussian(x):
$$P_{\mathrm{Gaussian}}(x) = (2\pi)^{-1/2}\sigma^{-1}\exp\left(-x^2/2\sigma^2\right) \qquad (2.21.1)$$
The infinite integral of this function equals 1. That is what "normal" or "normalized" means, but in general parlance, all distributions are defined to be normalized. Thus the name "normal" for the Gaussian distribution is probably a testament to its prevalence.
$$\int_{x=-\infty}^{x=\infty} P_{\mathrm{Gaussian}}(x)\,dx = \frac{1}{\sqrt{2\pi}}\frac{1}{\sigma}\int_{x=-\infty}^{x=\infty}\exp\left(-x^2/2\sigma^2\right)dx = 1 \qquad (2.21.2)$$
Its finite one-sided integral is known as the error function:
$$\mathrm{erf}(x) \equiv \frac{2}{\sqrt{\pi}}\int_{t=0}^{t=x}\exp(-t^2)\,dt = \frac{\sqrt{2}}{\sigma\sqrt{\pi}}\int_{t=0}^{t=\sqrt{2}\sigma x}\exp\left(-t^2/2\sigma^2\right)dt \qquad (2.21.3)$$
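The error function gives the probability that a Gaussian-distributed value lies within a given number of standard deviations of the mean: integrating Eq. (2.21.1) from −kσ to +kσ gives erf(k/√2). The sketch below evaluates this with the standard-library `math.erf`; the function name is an illustrative assumption.

```python
import math

def gaussian_probability_within(k):
    """Probability that a normally distributed value lies within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))

p_one_sigma = gaussian_probability_within(1.0)
p_two_sigma = gaussian_probability_within(2.0)
p_three_sigma = gaussian_probability_within(3.0)
```

These three values are the familiar figures of roughly 68%, 95%, and 99.7% that underlie the common "two sigma" and "three sigma" statements.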
The Gaussian distribution function is most valid when the number of data, N, is very large. When there are fewer data, in a smaller sample, a different distribution function is better; it was designed by Gosset¹³¹ who, in an excess of modesty, would only let it be known as the "Student t" distribution:
$$T_{\mathrm{Student}}(x) \equiv \Gamma(N/2)\,[\pi(N-1)]^{-1/2}\,[\Gamma((N-1)/2)]^{-1}\left\{1 + (N-1)^{-1}x^2\right\}^{-N/2} \qquad (2.21.4)$$
where Γ(x) is the gamma function. The Student t distribution function is normalized:
$$\int_{x=-\infty}^{x=+\infty} T_{\mathrm{Student}}(x)\,dx = 1 \qquad (2.21.5)$$
When N is large, T_Student(x) does approach P_Gaussian(x). If we wish to find the range of values x over which the integral of the Student t distribution function has a value T_0.95 of, say, 0.95 (i.e., 95%), this is the integral:
$$T_{0.95} = 0.95 = \int_{x=-t}^{x=+t} T_{\mathrm{Student}}(x)\,dx \qquad (2.21.6)$$
This has been tabulated extensively (see Table 2.10; but watch out for subtly different conventions, that is, limits on the integral in Eq. (2.21.6)). It is often said that "within a confidence limit of 95%, the mean of a large number of measurements is within 2σ of the true value" and that "within a confidence limit of 99%, the mean is within 3σ of the true value." These two estimates can be found in Table 2.10 under the entries P = 0.95, which yields x = 1.96 ≈ 2 for f = ∞, and under P = 0.99, x = 2.58 ≈ 3 for f = ∞.

Table 2.10 Critical Values of x (from CRC Handbook) for the Student t Distribution

| f | P = 0.50 (P′ = 0.75) | P = 0.80 (P′ = 0.90) | P = 0.90 (P′ = 0.95) | P = 0.95 (P′ = 0.975) | P = 0.98 (P′ = 0.99) | P = 0.99 (P′ = 0.995) | P = 0.999 (P′ = 0.9995) |
|---|---|---|---|---|---|---|---|
| 1 | 1.00 | 3.08 | 6.31 | 12.7 | 31.8 | 63.7 | 637.0 |
| 2 | 0.816 | 1.89 | 2.92 | 4.30 | 6.96 | 9.92 | 31.6 |
| 3 | 0.765 | 1.64 | 2.35 | 3.18 | 4.54 | 5.84 | 12.9 |
| 4 | 0.741 | 1.53 | 2.13 | 2.78 | 3.75 | 4.60 | 8.61 |
| 5 | 0.727 | 1.48 | 2.02 | 2.57 | 3.36 | 4.03 | 6.87 |
| 6 | 0.718 | 1.44 | 1.94 | 2.45 | 3.14 | 3.71 | 5.96 |
| 7 | 0.711 | 1.41 | 1.89 | 2.36 | 3.00 | 3.50 | 5.41 |
| 8 | 0.706 | 1.40 | 1.86 | 2.31 | 2.90 | 3.36 | 5.04 |
| 9 | 0.703 | 1.38 | 1.83 | 2.26 | 2.82 | 3.25 | 4.78 |
| 10 | 0.700 | 1.37 | 1.81 | 2.23 | 2.76 | 3.17 | 4.59 |
| 15 | 0.691 | 1.34 | 1.75 | 2.13 | 2.60 | 2.95 | 4.07 |
| 20 | 0.687 | 1.33 | 1.72 | 2.09 | 2.53 | 2.85 | 3.85 |
| 30 | 0.683 | 1.31 | 1.70 | 2.04 | 2.46 | 2.75 | 3.65 |
| ∞ | 0.674 | 1.28 | 1.64 | 1.96 | 2.33 | 2.58 | 3.29 |

Note: P is the probability that the mean of the population (call it μ) does not differ from the sample mean ⟨x⟩ by a factor of more than x, the tabular entry for a given number f of degrees of freedom (f = number of independent data). In contrast, P′ is the probability that μ does not exceed ⟨x⟩, or, alternatively, the probability that ⟨x⟩ does not exceed μ by a factor of more than x for a given number of degrees of freedom f.

Binomial Distribution. The probability of m "successes" and (n − m) "failures" in n independent trials, each with probability of success p, is given by
$$P_{\mathrm{binomial}}(m) \equiv \frac{n!}{m!\,(n-m)!}\,p^m(1-p)^{n-m} \qquad (2.21.7)$$
¹³¹ William Sealy Gosset (1876–1937).
It can be shown that for large n, the binomial distribution becomes the Gaussian distribution (see Problem 2.21.1).
Poisson Distribution. Another limit of the binomial distribution will yield the Poisson distribution: By letting both n → ∞ and p → 0, but keeping the product np constant (e.g., np = a, where a is a constant), we will see that [n!/(n − m)!] → n^m and (1 − p)^(n−m) → exp(−a), whence
$$P_{\mathrm{Poisson}}(m) = a^m\exp(-a)/m! \qquad (2.21.8)$$
Of course, $\sum_{m=0}^{m=\infty} P_{\mathrm{Poisson}}(m) = 1$. The Poisson distribution is applicable for the case of a large number of experiments, each with a small probability of success p, such that the mean number of successes is a = np.
PROBLEM 2.21.1. Prove that for large n, the binomial distribution becomes the Gaussian distribution.
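The limiting process can be watched numerically by holding the product np fixed while n grows. The sketch below compares the binomial and Poisson probabilities for a mean of 2 successes in 100,000 trials; the parameter values and function names are assumptions chosen for illustration.

```python
import math

def binomial_probability(m, n, p):
    """Probability of m successes in n independent trials with success probability p."""
    return math.comb(n, m) * p**m * (1.0 - p)**(n - m)

def poisson_probability(m, a):
    """Poisson probability of m successes when the mean number of successes is a."""
    return a**m * math.exp(-a) / math.factorial(m)

a = 2.0          # fixed mean number of successes, a = n p
n = 100000       # large number of trials
p = a / n        # correspondingly small probability per trial

max_difference = max(abs(binomial_probability(m, n, p) - poisson_probability(m, a))
                     for m in range(10))
poisson_total = sum(poisson_probability(m, a) for m in range(60))   # should be very close to 1
```

Repeating the comparison with smaller n (say 10 or 100) shows the difference growing, which is the numerical face of the requirement that n be large and p small.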
2.22 LEAST SQUARES OR LINEAR REGRESSION ANALYSIS
A traditional method of treating N measured data (x_i, y_i, i = 1, 2, ..., N), if the theoretical equation y_theo(x) governing them is known, is to fit the data to the equation, obtaining the best slope m and the best intercept b, so that the sum of the squares of the deviations of the measured y_i from the fitted values, $\sum_{i=1}^{i=N}[y_i - y_{\mathrm{theo}}(x_i)]^2$, is minimized; the simplest case is the linear equation (that chemists really love!):
$$y_{\mathrm{theo}}(x) = mx + b \qquad (2.22.1)$$
The general procedure is as follows. Define and compute the four sums:
$$\sum x \equiv \sum_{i=1}^{i=N} x_i, \qquad \sum y \equiv \sum_{i=1}^{i=N} y_i, \qquad \sum x^2 \equiv \sum_{i=1}^{i=N} x_i^2, \qquad \sum xy \equiv \sum_{i=1}^{i=N} x_iy_i \qquad (2.22.2)$$
(note that the sum $\sum_{i=1}^{i=N} y_i^2$ is not computed and that the problem is asymmetrical; it assumes that the independent variable x is free from error and that the error is all in the measurement of the dependent variable y). Then the "best" slope and "best" intercept are given by
$$m_{\mathrm{best}} = \left[N\sum xy - \sum x\sum y\right] / \left[N\sum x^2 - \left(\sum x\right)^2\right]$$
$$b_{\mathrm{best}} = \left[\sum x^2\sum y - \sum x\sum xy\right] / \left[N\sum x^2 - \left(\sum x\right)^2\right] \qquad (2.22.3)$$
When this is done, the sum of the squared deviations $\sum_{i=1}^{i=N}[y_i - (mx_i + b)]^2$ will have been minimized. This is the method of least squares, introduced in 1805 by Legendre¹³² and later used in a statistical study of the late 1800s that showed that the fraction of tall people in a sample human population tended to return or "regress" to a lower average height; the name "regression analysis" stuck. Another measure of interest is the sample correlation coefficient, or Pearson's r, or Pearson's¹³³ product-moment formula, or the linear correlation of x and y:
$$r = \left[N\sum xy - \sum x\sum y\right] \Big/ \left\{\left[N\sum x^2 - \left(\sum x\right)^2\right]^{1/2}\left[N\sum y^2 - \left(\sum y\right)^2\right]^{1/2}\right\} \qquad (2.22.4)$$
where $\sum y^2 \equiv \sum_{i=1}^{i=N} y_i^2$. Note that −1 ≤ r ≤ 1: If |r| = 1, then all points lie exactly on one line; if r = 0, then no straight line is any better than any other. If the function f(x) is not linear in x, then nonlinear regression is followed: The transcendental equation is recast in a linear form by suitable (if often approximate) transformations (e.g., computing its logarithm), and one proceeds as in the linear case, with a suitable back-transformation at the end.
PROBLEM 2.22.1. Prove that Eqs. (2.22.3) do indeed minimize $\sum_{i=1}^{i=N}[y_i - (mx_i + b)]^2$.
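The recipe of this section (form the sums, then evaluate the slope, intercept, and correlation coefficient) is short enough to code directly. The sketch below does so for five invented data points that lie close to a straight line; the function name and the data are assumptions for illustration.

```python
import math

def linear_least_squares(xs, ys):
    """Best slope, best intercept, and Pearson r for a straight-line fit y = m x + b."""
    N = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_yy = sum(y * y for y in ys)              # needed only for r
    denominator = N * sum_xx - sum_x**2
    slope = (N * sum_xy - sum_x * sum_y) / denominator
    intercept = (sum_xx * sum_y - sum_x * sum_xy) / denominator
    r = (N * sum_xy - sum_x * sum_y) / (
        math.sqrt(denominator) * math.sqrt(N * sum_yy - sum_y**2))
    return slope, intercept, r

# Hypothetical data scattered about a line of slope near 2 and intercept near 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]

slope, intercept, r = linear_least_squares(xs, ys)
```

For a nonlinear model that can be linearized (for example by taking logarithms), the same function can be applied to the transformed data, followed by the back-transformation of the fitted parameters.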
2.23 GENERAL REFERENCES Mathematical methods are discussed in references 21–27 (some of these are “golden oldies”). Integrals, series, and other tables can be found in [28–33]. Of course, much labor can now be saved by using the computer program package Mathematica.
REFERENCES 1. R. A. Serway, Physics for Scientists and Engineers, 4th edition, Saunders, Philadelphia, 1996. 2. I. Newton, Philosophiae Naturalis Principia Mathematica, Pepys, London, 1687. 3. C. A. de Coulomb, [Trois] Memoires sur l’Electricit e et le Magnetisme, Hist. Acad. R. Sci. 569–577, 578–611, 612–638 (1785). 4. J. C. Maxwell, A dynamical theory of the electromagnetic field, Philos. Trans. R. Soc. London 155:459–512 (1865). 5. en.wikipedia.org/wiki/File:Binding_energy_curve_-_common.isotopes.svg
132 Adrien-Marie Legendre (1752–1833).
133 Karl Pearson (1857–1936).
6. (i) G. Friedlander, M. Perlman, J. R. Stehn, and E. F. Clancy, Chart of the Nuclides, Knolls Atomic Power Laboratory, operated by the General Electric Co., 5th edition, April 1956; operated by Bechtel Marine Propulsion, 17th edition, 2010 (www.nuclidechart.com). (ii) National Nuclear Data Center, Brookhaven National Laboratory, Upton, NY. (iii) Korea Atomic Energy Research Institute (2000). (iv) Universal Nuclear Chart (www.nucleonica.net). 7. (a) D. Mendeleev, Experience on the system of the elements [in Russian], Zh. Russkogo Khim. Obshchestva 1(2–3):35 (1869). (b) D. Mendeleev, Experience on the system of the elements [in German], Z. Prakt. Chem. 106:251 (1869). (c) D. Mendeleev, On the relationship of the properties of the elements to their atomic weights [in Russian], Zh. Russkogo Khim. Obshchestva 1:60–77 (1869). [Abstracted in Z. Chem. 12:405–406 (1869)]. (d) D. Mendeleev, Natural system of the elements and its application to prediction of properties of yet undiscovered elements [in Russian], Zh. Russkogo Khim. Obshchestva 3:25–56 (1871). (e) D. Mendelejeff, Zur Frage über das System der Elemente, Ber. Deutsch. Chem. Gesell. 4:348–352 (1871). 8. N. H. Abel, Mémoire sur les équations algébriques, où on démontre l’impossibilité de la résolution de l’équation générale du cinquième degré, Christiania, Norway, 1824. 9. J. A. Wheeler, Geometrodynamics, Academic Press, New York, 1963. 10. K. L. Nielsen and J. H. Vanlonkuyyzen, Plane and Spherical Trigonometry, Barnes and Noble, New York, 1946. 11. R. Hooke, Ut Tensio Sic Vis (1678). 12. G. Joos, Theoretical Physics, 3rd edition, Hafner, New York, 1950. 13. N. Davidson, Statistical Mechanics, McGraw-Hill, New York, 1962. 14. R. D. Guenther, Modern Optics, Wiley, New York, 1990. 15. J. D. Jackson, Classical Electrodynamics, 3rd edition, Wiley, New York, 1999. 16. A. Einstein, Zur Elektrodynamik bewegter Körper, Ann. Phys. 17:891 (1905). 17. R. B.
Leighton, Principles of Modern Physics, McGraw-Hill, New York, 1959. 18. J. I. Steinfeld, J. S. Francisco, and W. L. Hase, Chemical Kinetics and Dynamics, 2nd edition, Prentice-Hall, Upper Saddle River, NJ, 1998. 19. A. Erdélyi, W. Magnus, F. Oberhettinger, and F. Tricomi, Tables of Integral Transforms, vols. 1 and 2, McGraw-Hill, New York, 1954. 20. A. J. Bard and L. R. Faulkner, Electrochemical Methods: Fundamentals and Applications, 2nd edition, Wiley, New York, 2001. 21. R. Courant, Differential and Integral Calculus, Vol. II, Interscience, New York, 1959, pp. 17–18. 22. C. L. Perrin, Mathematics for Chemists, Wiley-Interscience, New York, 1970. 23. J. Mathews and R. L. Walker, Mathematical Methods of Physics, 2nd edition, W. A. Benjamin, Menlo Park, CA, 1970. 24. H. Margenau and G. M. Murphy, The Mathematics of Physics and Chemistry, 2nd edition, Van Nostrand, Princeton, NJ, 1961. 25. H. Margenau and G. M. Murphy, The Mathematics of Physics and Chemistry, Vol. 2, Van Nostrand–Reinhold, New York, 1964. 26. R. V. Churchill, Complex Variables and Applications, McGraw-Hill, New York, 1960. 27. R. V. Churchill, Fourier Series and Boundary-Value Problems, 2nd edition, McGraw-Hill, New York, 1963. 28. H. B. Dwight, Tables of Integrals and Other Mathematical Data, 4th edition, Macmillan, New York, 1961. 29. I. S. Gradshteyn and I. M. Ryzhik, Table of Integrals, Series, and Products, 4th edition, Academic Press, New York, 1965. 30. M. Abramowitz and I. A. Stegun, eds., Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, National Bureau of Standards, Applied
Mathematics Series, Vol. 55, U. S. Government Printing Office, Washington, DC, 1964. 31. S. Zhang and J. Jin, Computation of Special Functions, Wiley-Interscience, New York, 1996. 32. E. Jahnke and F. Emde, Tables of Functions with Formulae and Curves, 4th edition, Dover, New York, 1945. 33. L. B. W. Jolley, Summation of Series, 2nd revised edition, Dover, New York, 1961.
CHAPTER
3
Quantum Mechanics
[In the beginning] Genesis 1, 1
“Finally a contradiction: now we can get some work done.” Niels Bohr (1885–1962)
“All science is either physics or stamp collecting.” Ernest Rutherford, first baron Rutherford of Nelson (1871–1937)
3.1 QUANTUM POSTULATES
A few fundamental and axiomatic postulates must be assumed for quantum mechanics, the first of which is the existence of quantized processes (for example, the quantization of atomic and molecular energy levels). Quantum mechanics is the fusion of wave mechanics with matrix mechanics, which were both developed in parallel in Germany in the late 1920s. The tradition of nineteenth-century mathematical physics of solving second-order linear partial differential equations in mechanics, electricity, and magnetism is partially satisfied within the wave-mechanical form of quantum mechanics by the Schrödinger¹ equation [1] (1926: subrelativistic conditions) and by the Dirac² equation [2] (1928: the relativistically correct extension of the Schrödinger equation). This tradition, wedded to the assumption that all
1 Erwin Rudolf Josef Alexander Schrödinger (1887–1961).
2 Paul Adrien Maurice Dirac (1902–1984).
The Physical Chemist’s Toolbox, Robert M. Metzger. © 2012 John Wiley & Sons, Inc. Published 2012 by John Wiley & Sons, Inc.
infinitesimal changes are allowed, must be modified for quantum theory and for subatomic physics: now only discrete jumps are known, and the comfortable continuum of infinitesimal calculus, or of differential forms, must be modified as we approach the atomic and subatomic world. Where the Schrödinger or Dirac equations apply, quantization will appear from sundry mathematical conditions on the existence of physically meaningful solutions to these differential equations. However, in nonrelativistic terms, spin does not come from a differential equation: it comes from the assumption of spin matrices, or from “necessity” (the Dirac equation does yield spin = 1/2 solutions, but not higher spin). So we must posit quantum numbers (see Section 2.12) even when there are no differential equations in the background to “comfort us.” This is especially true for the weak and strong forces, where no distance-dependent potential energy functions have been developed. So, what is left? Well, here are the requirements for quantum mechanics:
1. Energy levels, Bohr’s³ orbital angular momenta [3], and spin angular momenta can be quantized: there is a set of integers or half-integers (n, m, etc.) for which stationary states of the system exist. Transitions between these energy levels involve the emission or absorption of quantized particles of light (photons) of energy hν:
$\Delta E_{nm} = E_n - E_m = h\nu$    (3.1.1)
where h is Planck’s⁴ constant of action and $\hbar \equiv h/2\pi$. For energies that are intermediate between these “allowed” levels, no emission or absorption of light will occur.
2. Particles and waves have a dual nature: particles with mass m and momentum p = mv can become waves, with an equivalent de Broglie⁵ wavelength λ [4] given by
$\lambda = h/p$    (3.1.2)
(at which point they are no longer particles); similarly, waves can behave like particles (but then they are no longer waves); this duality cannot be probed simultaneously in a single experiment. This “equivalent wavelength” inspired Schrödinger to invent his wave equation.
3. The observation of a quantum-mechanical system involves the disturbance of the state being observed; the Heisenberg⁶ uncertainty principle [5] dictates that the uncertainty Δx in position x and the uncertainty Δp_x in momentum p_x in the x direction (or in y or in z, or the uncertainty in any two “canonically conjugate” variables, e.g., energy E and time t, or angular momentum L and phase φ, i.e., variables whose
3 Niels Henrik David Bohr (1885–1962).
4 Max Planck (1858–1947).
5 Louis-Victor-Pierre-Raymond, 7th duc de Broglie (1892–1987).
6 Werner Heisenberg (1901–1976).
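product has dimensions of action = energy × time) obeys uncertainty relations:
$\Delta x\,\Delta p_x \ge \hbar/2, \quad \Delta y\,\Delta p_y \ge \hbar/2, \quad \Delta z\,\Delta p_z \ge \hbar/2$    (3.1.3)
$\Delta L\,\Delta\phi \ge \hbar/2$    (3.1.4)
$\Delta E\,\Delta t \ge \hbar/2$    (3.1.5)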
Equation (3.1.4) helps us understand Larmor⁷ precession, while Eq. (3.1.5) is used to define the lifetime of very short-lived species.
4. All “conservative holonomic” systems must satisfy (at nonrelativistic speeds) the time-dependent Schrödinger equation:
$\hat{H}\Psi(x, y, z; t) = +i\hbar\,(\partial\Psi/\partial t)$    (3.1.6)
where $\hat{H}$ is the Hamiltonian⁸ operator (a Hermitian⁹ operator for energy):
$\hat{H} = \hat{T} + \hat{U}$    (3.1.7)
Ψ(x, y, z; t) is the wavefunction, or state function, and $i \equiv (-1)^{1/2}$. The rules for forming a Hamiltonian operator in real space are formulated in Cartesian¹⁰ space:
coordinates $x, y, z \to$ operators $\hat{x}, \hat{y}, \hat{z}$    (3.1.8)
momenta $p_x \to -i\hbar(\partial/\partial x), \quad p_y \to -i\hbar(\partial/\partial y), \quad p_z \to -i\hbar(\partial/\partial z)$    (3.1.9)
(Symmetrically opposite recipes are valid for a Hamiltonian operator in momentum space.) When magnetic fields are present, the momentum vector receives an additional term that involves the vector potential A. At relativistic speeds the Dirac equation must be used. For a one-dimensional one-particle harmonic oscillator the Hamiltonian is
$\hat{H} = -(\hbar^2/2m)(d^2/dx^2) + (k_H/2)x^2$    (3.1.10)
7 Sir Joseph Larmor (1857–1942).
8 Sir William Rowan Hamilton (1805–1865).
9 Charles Hermite (1822–1901).
10 René Descartes (1596–1650).
where k_H is the classical Hooke’s¹¹ law force constant (N m⁻¹ or J m⁻²). For the one-electron atom the Hamiltonian is (in SI units):
$\hat{H} = -(\hbar^2/2\mu)\nabla^2 - Ze^2/4\pi\varepsilon_0 r$    (3.1.11)
where the nucleus has charge Ze and mass M, while the electron has charge −e and mass m_e, the electron–nucleus distance is r, and the reduced mass for the atom is $\mu = 1/(M^{-1} + m_e^{-1})$.
One can also state seven postulates for quantum mechanics [6]:
1. The state of a system is described by a single-valued, continuous, and bounded function Ψ (wavefunction, or state function) of coordinates and time.
2. To every physical observable O corresponds a linear Hermitian operator $\hat{O}$, whose expression is derived from Cartesian coordinates and momenta.
3. The only possible results from measurements of a physical observable O are a set of eigenvalues o_i of the eigenvalue equation:
$\hat{O}\varphi_i = o_i\varphi_i$    (3.1.12)
where $\hat{O}$ is the Hermitian operator corresponding to the observable O. The eigenfunctions φ_i are well-behaved (satisfy the boundary conditions), and the φ_i form a complete set.
4. If Ψ(q; t) is a normalized wavefunction of a system at time t, then the average value of any physical observable O at time t is the integral over all space coordinates:
$\langle O \rangle = \int \Psi^*(q; t)\,\hat{O}\,\Psi(q; t)\,dq$    (3.1.13)
5. The time development of the state of an isolated system is given by the time-dependent Schrödinger equation:
$\hat{H}\Psi(x, y, z; t) = i\hbar\,(\partial\Psi/\partial t)$    ((3.1.6))
where $\hat{H}$ is the Hamiltonian (i.e., energy) operator for the system.
6. At the limit of large quantum numbers, the results of quantum mechanics must agree with classical mechanics (Bohr correspondence principle).
7. Electrons have spin 1/2 and are fermions.
11 Robert Hooke (1635–1703).
The time-independent Schrödinger equation can be found by assuming that Ψ(x, y, z; t) is factorable into a time dependence separate from the space dependence:
$\Psi(x, y, z; t) = \exp(-iEt/\hbar)\,\psi(x, y, z)$    (3.1.14)
The result is:
$\hat{H}\psi(x, y, z) = E\psi(x, y, z)$    (3.1.15)
where the separation constant is the energy E of the system. The complex wavefunction ψ(x, y, z) is a probability amplitude; its absolute square |ψ(x, y, z)|² is a probability density function, while in three dimensions the quantity |ψ(x, y, z)|² dx dy dz is a probability, that is, a nonnegative quantity with values ranging from 0 to 1. Its integral over all space is certainty (i.e., probability = 1). This is the Copenhagen statistical interpretation of the wavefunction, first proposed by Born¹² and championed particularly by Bohr and Heisenberg, after Einstein¹³ reluctantly withdrew his reservations about the Heisenberg uncertainty principle. Quantum-mechanical particles add probability amplitudes by adding ψ(x, y, z)’s; we can only measure their absolute squares and thus lose the information about the phase of the wavefunction. The other consequence is that a measurement will alter the state of what is being measured.
Sideline. Presumably during the controversy, Einstein mused “God does not play dice.” Supposedly Bohr answered: “Einstein, stop telling God what to do.”
The Dirac “bra” and “ket” notation is an adaptation of matrix mechanics and is used to abbreviate the wavefunction by emphasizing its eigenvalues. All wavefunctions must be quadratically integrable (at least over a finite domain) and belong to Lebesgue¹⁴ class L². The world of Hermitian operators, their eigenfunctions, and their eigenvalues is also called Hilbert¹⁵ space (this fancy name was given by von Neumann¹⁶). The postulates and consequences of quantum mechanics have survived unscathed into the twenty-first century.
PROBLEM 3.1.1. In one dimension “derive,” or give a plausibility argument for, the Schrödinger equation by combining the one-dimensional classical wave equation
$(\partial^2 u/\partial x^2) - v^{-2}(\partial^2 u/\partial t^2) = 0$    (3.1.16)
with the de Broglie relationship, Eq. (3.1.2).
12 Max Born (1882–1970).
13 Albert Einstein (1879–1955).
14 Henri Léon Lebesgue (1875–1941).
15 David Hilbert (1862–1943).
16 John von Neumann (1903–1957).
PROBLEM 3.1.2. Bohr’s 1913 derivation of the energy of the hydrogen atom (nuclear charge = e, electron charge = −e, reduced mass of the electron–nucleus couple = μ, electron–nucleus distance = r, linear momentum = p) is based on the classical energy
$E = T + V = p^2/2\mu - e^2/4\pi\varepsilon_0 r$    (3.1.17)
plus the assumption that the orbital angular momentum μvr is quantized by a postulate inspired by Planck:
$\mu v r = n\hbar$, where n is a nonzero integer
Complete Bohr’s derivation, to obtain
$E_n = -\mu e^4/(8\varepsilon_0^2 h^2 n^2)$    (3.1.18)
[Bohr had the right result for the wrong reason; in 1926, Schrödinger found instead that $(\mu v r)^2 = l(l+1)\hbar^2$.]
PROBLEM 3.1.3. The energy of a one-electron atom (nuclear charge Z|e|, electron charge −|e|, reduced mass of the electron–nucleus couple μ) is obtained by solving the Schrödinger equation for the one-electron atom:
$E_n = -\mu e^4 Z^2/(8\varepsilon_0^2 h^2 n^2)$    (3.1.19)
$E_n = -e^2 Z^2/(8\pi\varepsilon_0 a_0 n^2)$    (3.1.20)
where, for the H one-electron atom, to within 1 part in about 1800, μ ≈ electron rest mass m_e, n = principal quantum number, and a_0 = Bohr radius:
$a_0 = h^2\varepsilon_0/\pi m_e e^2 = 4\pi\varepsilon_0\hbar^2/m_e e^2$ (SI); $a_0 = \hbar^2/m_e e^2$ (cgs)    (3.1.21)
Result (3.1.19) or (3.1.20) is the Bohr energy for the hydrogen atom, except that Bohr had written the equation using a quantized orbital angular momentum l (it was discovered later by Schrödinger that for the H atom the lowest value for l is 0, while the principal quantum number n = 1 is the correct one to use). Verify that a_0 = 0.529 Å and that $E_n = -2.18 \times 10^{-18}\,Z^2 n^{-2}$ J atom⁻¹ $= -(1/2)Z^2 n^{-2}$ hartree, where 1 hartree ≡ 4.359 × 10⁻¹⁸ J atom⁻¹.
PROBLEM 3.1.4. The energy of an electron in a hydrogen atom is
$E = p^2/2m_e - e^2/4\pi\varepsilon_0 r$
By Heisenberg’s uncertainty principle, if the electron is at a distance r from the nucleus, its momentum p is at least ħ/r. Find the minimum values of r and E.
PROBLEM 3.1.5. “Proof” of the Heisenberg uncertainty principle, also called Heisenberg’s gamma-ray microscope (Fig. 3.1). Consider a light microscope of Rayleigh resolving power (λ/2 sin ε), where ε is the angular aperture. A photon of wavelength λ and frequency ν undergoes Compton
[Figure 3.1, image not reproduced. Recoverable labels: human eye above the microscope objective (lens) on the y axis; electron on the x axis; incident light of wavelength λ; photon before collision, p = hν/c; photon after collision, p′ = hν′/c, scattered through angle α; electron after collision, p = m_e v.]
scattering; the formerly stationary electron assumes a momentum m_e v, scatters through an angle β, and assumes a new kinetic energy (1/2)m_e v². The photon loses some of its energy, scatters through an angle α, enters (we hope) the microscope, and is seen by the eye of the observer. Derive the uncertainty relations by considering (a) the conservation of momentum in the x and y directions, and (b) the condition whereby the Compton-scattered photon has entered the microscope and has been seen by the human observer [7].
PROBLEM 3.1.6. While reviewing certain splittings in atomic spectra and to explain the “anomalous Zeeman¹⁷ effect,” Goudsmit¹⁸ and Uhlenbeck¹⁹ proposed in 1925 that electrons in atoms have not only orbital angular momentum L, but also an “intrinsic” spin angular momentum S. They sent their proposal to Pauli,²⁰ who wrote them that (1) Kronig²¹ had also reached the same conclusion, and (2) the magnetic moment due to spin was experimentally twice too small to be consistent with the beam deviations seen in the 1922 Stern²²–Gerlach²³ experiment. Explain why the electron intrinsic spin angular momentum must be ħ/2. [This discouraged Goudsmit and Uhlenbeck from
17 Pieter Zeeman (1865–1943).
18 Samuel Abraham Goudsmit (1902–1978).
19 George Eugene Uhlenbeck (1900–1988).
20 Wolfgang Ernst Pauli (1900–1958).
21 Ralph de Laer Kronig (1904–1995).
22 Otto Stern (1888–1969).
23 Walther Gerlach (1889–1979).
Heisenberg’s Gedankenmikroskop: approximate explanation of the Uncertainty Principle [7].
publishing their ideas and therefore prevented them from getting a Nobel Prize! The missing “factor of two” (g = 2) was found by Thomas²⁴ in 1926 (Thomas precession). In a later letter to Goudsmit, Thomas wrote that “the presumed omniscience of the Almighty does not extend to his self-appointed vicars on earth.”]
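The numerical values quoted in Problem 3.1.3 can be checked with a few lines of arithmetic. The sketch below is only illustrative: it evaluates Eqs. (3.1.19) and (3.1.21) with rounded SI constants and with the electron rest mass in place of the reduced mass (the approximation mentioned in that problem). The constant names and function names are hypothetical.

```python
import math

# Rounded SI values of the physical constants (illustrative inputs)
H = 6.62607015e-34           # Planck constant, J s
HBAR = H / (2.0 * math.pi)   # reduced Planck constant, J s
M_E = 9.1093837e-31          # electron rest mass, kg (used in place of mu)
E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m

def bohr_radius():
    """a0 = 4 pi eps0 hbar^2 / (m_e e^2), Eq. (3.1.21), in meters."""
    return 4.0 * math.pi * EPS0 * HBAR**2 / (M_E * E_CHARGE**2)

def energy(n, z=1):
    """E_n = -m_e e^4 Z^2 / (8 eps0^2 h^2 n^2), Eq. (3.1.19), in joules."""
    return -M_E * E_CHARGE**4 * z**2 / (8.0 * EPS0**2 * H**2 * n**2)

a0 = bohr_radius()
hartree = E_CHARGE**2 / (4.0 * math.pi * EPS0 * a0)   # 1 hartree in joules

print("a0 / angstrom  =", a0 * 1e10)
print("E_1 / J        =", energy(1))
print("E_1 / hartree  =", energy(1) / hartree)
```

Using the reduced mass μ instead of m_e would shift these values by roughly one part in 1800 for hydrogen.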
3.2 QUANTUM MECHANICS OF THE FREE ELECTRON
The time-independent Schrödinger equation in one dimension for the free particle reads
$\hat{H}\psi(x) = -\dfrac{\hbar^2}{2m}\dfrac{d^2\psi(x)}{dx^2} = E\psi(x)$    (3.2.1)
Its solutions are free waves:
$\psi(x) = A\exp(ikx) + B\exp(-ikx)$    (3.2.2)
with any arbitrary nonnegative eigenenergy (E ≥ 0); the momentum operator
$\hat{p} = (\hbar/i)\,\mathbf{e}_x\,(d/dx)$    (3.2.3)
has eigenvalues
$p = +(2mE)^{1/2}$    (3.2.4)
for the particle moving to the “right” (B = 0), and
$p = -(2mE)^{1/2}$    (3.2.5)
for the particle moving to the left (A = 0). We can define the wavevector
$k \equiv \hbar^{-1}(2mE)^{1/2}$    (3.2.6)
so that the momentum becomes
$p = \hbar k$    (3.2.7)
and the energy becomes
$E_k = \hbar^2 k^2/2m$    (3.2.8)
In other words, the energy of a free electron is quadratic in the wavevector k (Fig. 3.2).
24 Llewellyn Hilleth Thomas (1903–1992).
FIGURE 3.2 The energy $E_k = \hbar^2 k^2/2m$ for a free particle has a quadratic dependence on the wavevector k. [Plot not reproduced; vertical axis E(k), horizontal axis k.]
3.3 THE PARTICLE IN A BOX
The simplest quantum-mechanical problem is the “particle in a box” (Fig. 3.3). In one dimension, the particle (electron) of mass m is “free” in a region 0 ≤ x ≤ L (region I), where V = 0, and cannot cross into the region x < 0 (region II), where V = ∞, or into the region x > L (region III), where V = ∞ (not even quantum particles can surmount an infinite energy barrier). The Schrödinger equation is
$\hat{H}\psi(x) = -\dfrac{\hbar^2}{2m}\dfrac{d^2\psi(x)}{dx^2} = E\psi(x)$    (3.3.1)
which immediately suggests sines, cosines, or complex exponentials as solutions:
$\psi(x) = A\sin(kx) + B\cos(kx) = C\exp(ikx) + D\exp(-ikx)$    (3.3.2)
FIGURE 3.3 Wavefunctions for a particle in a 1-D box, displaced to fit the energy levels. [Plot not reproduced. Vertical axis: Energy / (h²/8mL²), with levels n = 1, 2, 3, 4; horizontal axis: 180·(x/L) in degrees, with x = 0 and x = L marked. Region I lies between regions II and III, which are labeled “V = infinity (no particles here!!!)”.]
We will concentrate on sine–cosine solutions without loss of generality. Substitution into Eq. (3.3.1) suggests, as for the truly free particle, the following:
$\hbar^2 k^2/2m = E$    (3.3.3)
Since the wavefunction must be strictly zero in regions II and III, and the wavefunctions must match at the boundaries, we obtain ψ(x = 0) = 0 and ψ(x = L) = 0. The restriction ψ(x = 0) = 0 requires B = 0, leaving the (odd) sine term of Eq. (3.3.2):
$\psi(x) = A\sin(kx) = A\sin[(2mE)^{1/2}x/\hbar]$    (3.3.4)
The restriction ψ(L) = 0 requires
$0 = \psi(L) = A\sin[(2mE)^{1/2}L/\hbar] = A\sin(n\pi)$    (3.3.5)
whence the quantum condition emerges:
$E_n = n^2\pi^2\hbar^2/2mL^2, \quad n = 1, 2, 3, \ldots$    (3.3.6)
The energies are positive definite and depend quadratically on the newly minted quantum number n. After normalization the eigenfunctions become
$\psi_n(x) = (2/L)^{1/2}\sin(n\pi x/L), \quad 0 \le x \le L; \; n = 1, 2, 3, \ldots$    (3.3.7)
These functions form an orthonormal set: they are normalized, and sine functions of different argument are orthogonal:
$\int_{x=0}^{x=L} \sqrt{\dfrac{2}{L}}\,\sin\!\left(\dfrac{n\pi x}{L}\right)\sqrt{\dfrac{2}{L}}\,\sin\!\left(\dfrac{m\pi x}{L}\right)dx = \delta_{nm}$    (3.3.8)
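The orthonormality relation, Eq. (3.3.8), can also be checked numerically. The following is a minimal sketch, not a proof: it integrates products of the eigenfunctions of Eq. (3.3.7) with a composite Simpson rule. The function names `psi` and `overlap`, the box length L = 1, and the number of integration steps are illustrative assumptions.

```python
import math

def psi(n, x, L=1.0):
    """Normalized particle-in-a-box eigenfunction, Eq. (3.3.7)."""
    return math.sqrt(2.0 / L) * math.sin(n * math.pi * x / L)

def overlap(n, m, L=1.0, steps=2000):
    """Integral of psi_n * psi_m over [0, L] by composite Simpson's rule.

    steps must be even; the value 2000 is an arbitrary, illustrative choice.
    """
    h = L / steps
    total = psi(n, 0.0, L) * psi(m, 0.0, L) + psi(n, L, L) * psi(m, L, L)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2          # Simpson weights 4, 2, 4, ...
        x = i * h
        total += weight * psi(n, x, L) * psi(m, x, L)
    return total * h / 3.0

print(overlap(1, 1))   # close to 1 (normalization)
print(overlap(1, 2))   # close to 0 (orthogonality)
```

Because the eigenfunctions are smooth, a modest number of Simpson intervals already reproduces the Kronecker delta to many decimal places.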
If, instead, the box is chosen symmetrically about the origin, then the eigenfunctions become the (even) cosine functions for n odd:
$\psi(x) = (2/L)^{1/2}\cos(n\pi x/L), \quad -L/2 \le x \le L/2; \; n = 1, 3, 5, \ldots$    (3.3.9)
and become sine functions for n even. The particle in the box can be adjusted [8] to explain the almost inverse-square dependence of the first optical absorption maximum of conjugated linear polyenes on their conjugation length L:
$\Delta E = E_2 - E_1 = 3\pi^2\hbar^2/2mL^2$    (3.3.10)
but, experimentally, this energy difference does not go to zero for the infinite polymer (L = ∞, e.g., for polyacetylene), because of departures from linearity
of the longer oligomers. In other words, the “effective conjugation length” derived from the optical spectrum is shorter than the molecular length.
For a particle of mass M constrained to a three-dimensional rectangular box of sides A, B, and C, with V = 0 inside the box (0 ≤ x ≤ A, 0 ≤ y ≤ B, 0 ≤ z ≤ C) and V = ∞ outside the box:
$\hat{H}\psi(x, y, z) = -(\hbar^2/2M)\left[(\partial^2/\partial x^2) + (\partial^2/\partial y^2) + (\partial^2/\partial z^2)\right]\psi(x, y, z) = E\psi(x, y, z)$    (3.3.11)
the problem can be factored by assuming a product eigenfunction ψ(x, y, z) = X(x)Y(y)Z(z), and the solution is trivially
$\psi(x, y, z) = (8/ABC)^{1/2}\sin(l\pi x/A)\sin(m\pi y/B)\sin(n\pi z/C)$, with $0 \le x \le A$, $0 \le y \le B$, $0 \le z \le C$; $l, m, n = 1, 2, 3, \ldots$    (3.3.12)
and the energy is
$E_{lmn} = (\pi^2\hbar^2/2M)(l^2A^{-2} + m^2B^{-2} + n^2C^{-2}), \quad l, m, n = 1, 2, 3, \ldots$    (3.3.13)
If A = B = C, then degeneracies appear: the lowest energy level is nondegenerate (l = 1, m = 1, n = 1); the next two levels are triply degenerate: first the triad (l = 2, m = 1, n = 1), (l = 1, m = 2, n = 1), and (l = 1, m = 1, n = 2), then the triad (l = 2, m = 2, n = 1), (l = 2, m = 1, n = 2), and (l = 1, m = 2, n = 2), and so forth. These particle-in-a-box considerations are very useful for (i) electrons in quantum dots (extra electrons trapped on GaAs or CdS or on Au nanocrystals, 5–10 nm in diameter), (ii) electrons solvated and trapped in liquid ammonia (obtained by dissolving Na in liquid NH3: the solution is blue!), or (iii) the excitation spectrum of a linear polyene.
When the potential is not infinite, a quantum-mechanical particle has a finite probability of being found in that region. Consider three regions of different potential energy V (Fig. 3.4). In regions I and III, V = 0. In region II, of width 2L (−L ≤ x ≤ L), the potential energy is positive but finite: V = V0 > 0. In regions I and III we have a free particle, and Eq. (3.2.2) holds:
$\psi_I(x) = A\exp(ikx) + B\exp(-ikx)$    (3.3.14)
$\psi_{III}(x) = F\exp(ikx) + G\exp(-ikx)$    (3.3.15)
where
$k \equiv (2mE)^{1/2}\hbar^{-1}$    (3.3.16)
Remember that A ≠ 0 means that a free particle travels in region I from left to right, while B ≠ 0 means that a free particle travels in region I from right to left. In region II, if the energy is less than the potential step V0, then the
wavefunction has an imaginary wavevector; that is, it is exponentially attenuated:
$\psi_{II}(x) = C\exp(-\alpha x) + D\exp(\alpha x)$    (3.3.17)
where
$\alpha \equiv [2m(V_0 - E)]^{1/2}\hbar^{-1} > 0$    (3.3.18)
As before, the wavefunctions and their first derivatives must match at the walls:
$\psi_I(-L) = \psi_{II}(-L)$    (3.3.19)
$\psi_{III}(L) = \psi_{II}(L)$    (3.3.20)
$(d\psi_I/dx)(x = -L) = (d\psi_{II}/dx)(x = -L)$    (3.3.21)
$(d\psi_{III}/dx)(x = L) = (d\psi_{II}/dx)(x = L)$    (3.3.22)
These matching conditions at x = −L yield
$2A = (1 + i\alpha/k)\exp(\alpha L + ikL)\,C + (1 - i\alpha/k)\exp(-\alpha L + ikL)\,D$    (3.3.23)
$2B = (1 - i\alpha/k)\exp(\alpha L - ikL)\,C + (1 + i\alpha/k)\exp(-\alpha L - ikL)\,D$    (3.3.24)
and at x = L yield
$2C = (1 - ik/\alpha)\exp(\alpha L + ikL)\,F + (1 + ik/\alpha)\exp(\alpha L - ikL)\,G$    (3.3.25)
$2D = (1 + ik/\alpha)\exp(-\alpha L + ikL)\,F + (1 - ik/\alpha)\exp(-\alpha L - ikL)\,G$    (3.3.26)
When Eqs. (3.3.25) and (3.3.26) are substituted into Eqs. (3.3.23) and (3.3.24), one obtains two messy equations:
$A = \left[\cosh(2\alpha L) + \tfrac{i}{2}(\alpha/k - k/\alpha)\sinh(2\alpha L)\right]\exp(2ikL)\,F + \tfrac{i}{2}(\alpha/k + k/\alpha)\sinh(2\alpha L)\,G$
$B = -\tfrac{i}{2}(\alpha/k + k/\alpha)\sinh(2\alpha L)\,F + \left[\cosh(2\alpha L) - \tfrac{i}{2}(\alpha/k - k/\alpha)\sinh(2\alpha L)\right]\exp(-2ikL)\,G$
FIGURE 3.4 Potential energy profile and approximate wavefunctions for the tunneling problem (T = 0.04): ψ_I(x) = 2 cos[3(x + 0.563)] for −5 ≤ x ≤ −1; ψ_II(x) = 0.516 exp[−0.15(x + 1)] for −1 ≤ x ≤ +1; ψ_III(x) = 0.400 cos[3(x − 0.9)] for 1 ≤ x ≤ 5. For simplicity, the wavefunction derivatives were not matched at x = ±1. [Plots not reproduced. Upper panel: potential energy V versus x, with region I (V = 0) for x < −L, region II (V = V0) for −L ≤ x ≤ +L, and region III (V = 0) for x > +L. Lower panel: ψ(x) versus x.]
These equations become simpler for the physical situation in which the particle is “incident” from the left of Fig. 3.4, and not from the right, so that one can assume G = 0. Then the ratio F/A can be calculated easily:
$F/A = \exp(-2ikL)\big/\left[\cosh(2\alpha L) + \tfrac{i}{2}(\alpha/k - k/\alpha)\sinh(2\alpha L)\right]$    (3.3.27)
The squared modulus of this ratio
$T \equiv |F|^2/|A|^2$    (3.3.28)
is the transmission coefficient from left to right. T is very simple if αL ≫ 1 (and αL is proportional to the area under the tunneling barrier):
$T \approx 16\exp(-4\alpha L)\,(\alpha k)^2/(k^2 + \alpha^2)^2$    (3.3.29)
This means that quantum-mechanical waves can “tunnel” under a potential barrier, but decay exponentially within it. For quantum particles the classically “forbidden” region (what a Teutonic expression!) is somewhat penetrable, but is impenetrable if the barrier is infinitely high.
PROBLEM 3.3.1. Prove the orthonormality of the eigenfunctions of the particle in a one-dimensional box, Eq. (3.3.8).
PROBLEM 3.3.2. Prove Eq. (3.3.27).
PROBLEM 3.3.3. Estimate T [Eq. (3.3.29)] for an electron of mass m = 9.1 × 10⁻³¹ kg and energy E = 2 eV, considering tunneling through a barrier of width 2L = 2 nm = 2 × 10⁻⁹ m and height V0 = 5 eV.
PROBLEM 3.3.4. For the particle in a one-dimensional box of size L, evaluate the “transition moment” integral ⟨n|x|m⟩.
Kuhn²⁵ analyzed [8] the lowest-energy optical absorption band of long linear polyenes, or oligomers of conjugated linear polymers, by “free electron molecular orbital theory” (FEMO). E_max, the energy of maximum absorbance, can be related to the number of π electrons n_π, to the length of the linear “box” L, and to the “effective length” L_O of the π electron chain in the oligomer, defined so that L ≡ n_π L_O:
$E_{max} \approx h^2 n_\pi/8mL^2 = h^2/8mL_O^2 n_\pi$    (3.3.30)
Thus a plot of E_max versus 1/n_π is linear for oligomeric substrands of known conducting polymers and should extrapolate to E_max = 0 for the perfectly degenerate, conjugated, infinite, linear, and “metallic” polymer. For instance, E_max = 0 for graphite and for (SN)x. For all other conducting polymers, this zero is not reached, because the polymer has finite strand length or because of conformational distortions or other defects.
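The tunneling formulas of this section lend themselves to a quick numerical comparison. The sketch below evaluates the transmission coefficient both from the modulus squared of Eq. (3.3.27) and from the thick-barrier approximation, Eq. (3.3.29), using the parameters stated in Problem 3.3.3 as illustrative inputs. The function names and the rounded constants are assumptions made for this example, and the code is valid only for E < V0.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s (rounded)
EV = 1.602176634e-19     # joules per electron volt

def wavevectors(mass, energy_ev, barrier_ev):
    """Return k (Eq. 3.3.16) and alpha (Eq. 3.3.18) in m^-1; requires E < V0."""
    e_joule = energy_ev * EV
    v0_joule = barrier_ev * EV
    k = math.sqrt(2.0 * mass * e_joule) / HBAR
    alpha = math.sqrt(2.0 * mass * (v0_joule - e_joule)) / HBAR
    return k, alpha

def t_exact(k, alpha, half_width):
    """T = |F/A|^2 from Eq. (3.3.27); the barrier extends from -L to +L."""
    c = math.cosh(2.0 * alpha * half_width)
    s = math.sinh(2.0 * alpha * half_width)
    return 1.0 / (c * c + 0.25 * (alpha / k - k / alpha) ** 2 * s * s)

def t_approx(k, alpha, half_width):
    """Thick-barrier limit, Eq. (3.3.29), valid when alpha * L >> 1."""
    return (16.0 * math.exp(-4.0 * alpha * half_width)
            * (alpha * k) ** 2 / (k * k + alpha * alpha) ** 2)

# Inputs from Problem 3.3.3: m = 9.1e-31 kg, E = 2 eV, V0 = 5 eV, 2L = 2 nm
k, alpha = wavevectors(9.1e-31, 2.0, 5.0)
L = 1.0e-9
print("alpha * L =", alpha * L)
print("T (exact) =", t_exact(k, alpha, L))
print("T (approx) =", t_approx(k, alpha, L))
```

For these inputs αL is well above 1, so the two expressions should agree closely; for thin or low barriers the approximation degrades and the exact form is preferable.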
3.4 THE HARMONIC OSCILLATOR
The harmonic oscillator in one dimension, with Hooke’s law constant k_H, obeys the Schrödinger equation:
$\hat{H}\psi(x) = -\dfrac{\hbar^2}{2m}\dfrac{d^2\psi(x)}{dx^2} + \dfrac{1}{2}k_H x^2\psi(x) = E\psi(x)$    (3.4.1)
or equivalently:
$(\hat{p}_x^2/2m + 2\pi^2\nu^2 m\hat{x}^2)\,\psi(x) = E\psi(x)$    (3.4.2)
where the fundamental frequency ν (Hz) and the angular frequency ω (radians per second) are defined, as in the classical oscillator, by using the force constant k_H and the mass m:
$\nu \equiv \omega/2\pi \equiv \dfrac{1}{2\pi}\sqrt{\dfrac{k_H}{m}}$    (3.4.3)
Sideline. The word oscillator derives from the Latin words “os” (mouth) and “oscillum” (little mouth); in the Italian forests of 1000 BC to 500 BC, little effigies with mouths (happy faces?) were hung from trees to propitiate the
25 Hans Kuhn (1919–).
forest spirits or other deities, hence the word oscillate. The verb osculate (to kiss) also comes from the Latin “os.”
Equation (3.4.1) now becomes the differential equation:
$(d^2\psi/dx^2) + 2m\hbar^{-2}\left[E - \tfrac{1}{2}k_H x^2\right]\psi(x) = 0$    (3.4.4)
A plain power-series solution is not practical; the asymptotic behavior of ψ(x) at large x requires instead a power series with a prefactor exp[−(mω/ħ)x²/2]:
$\psi(x) = \exp[-m\omega\hbar^{-1}x^2/2]\sum_{v=0}^{\infty} c_v x^v$    (3.4.5)
which, after substitution into Eq. (3.4.1), yields a recursion relation:
$c_{v+2} = c_v\,(m\omega\hbar^{-1} + 2m\omega\hbar^{-1}v - 2mE\hbar^{-2})/[(v + 1)(v + 2)]$    (3.4.6)
The potential divergence of the power series is eliminated by requiring that the numerator in Eq. (3.4.6) equal zero at some value of v, thus terminating the series; this v becomes a new quantum number. Energy is quantized as E = E_v, and the fundamental frequency ν (Hz) or angular frequency ω (radians per second) is multiplied by the integer quantum numbers v displaced by 1/2:
$E_v = (v + 1/2)h\nu = (v + 1/2)\hbar\omega = (v + 1/2)\hbar k_H^{1/2} m^{-1/2} \quad (v = 0, 1, 2, 3, \ldots)$    (3.4.7)
The term 1/2 is important: for v = 0 a residual “zero-point vibration” exists for all oscillators, even at absolute zero temperature T = 0 K; otherwise the uncertainty principle would be violated. The eigenfunctions ψ_v(x) are a Gaussian²⁶ multiplied by Hermite polynomials:
$\psi_v(x) = (2^v v!)^{-1/2}(\omega m/\pi\hbar)^{1/4}\exp[-(\omega m/\hbar)(x^2/2)]\,H_v((\omega m/\hbar)^{1/2}x)$    (3.4.8)
where the Hermite polynomials H_v(z) are given by the Rodrigues²⁷ formula:
$H_v(z) = (-1)^v\exp(z^2)\,(d^v/dz^v)\exp(-z^2)$    (3.4.9)
The first few Hermite polynomials are given in Table 3.1; their recursion relation is:
$H_{v+1}(z) = 2zH_v(z) - 2vH_{v-1}(z)$    (3.4.10)
Figure 3.5 shows the first four eigenfunctions: ψ0(x) and ψ2(x) are “even” functions of x, while ψ1(x) and ψ3(x) are odd with respect to the operation x → −x.
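The recursion relation, Eq. (3.4.10), is easy to iterate by machine. As a small illustrative sketch (the function name `hermite_coeffs` and the list-of-coefficients representation are choices made for this example), the code below builds the Hermite polynomials from H0 = 1 and H1 = 2z and prints their coefficients, which can be compared with Table 3.1.

```python
def hermite_coeffs(vmax):
    """Coefficient lists (lowest power first) of H_0 ... H_vmax.

    Built from H_0(z) = 1, H_1(z) = 2z, and the recursion of Eq. (3.4.10):
    H_{v+1}(z) = 2 z H_v(z) - 2 v H_{v-1}(z).
    """
    polys = [[1]]
    if vmax >= 1:
        polys.append([0, 2])
    for v in range(1, vmax):
        prev, cur = polys[v - 1], polys[v]
        nxt = [0] + [2 * c for c in cur]       # multiply H_v by 2z
        for i, c in enumerate(prev):
            nxt[i] -= 2 * v * c                # subtract 2v H_{v-1}
        polys.append(nxt)
    return polys

for v, coeffs in enumerate(hermite_coeffs(4)):
    print(v, coeffs)
```

Each list alternates between zero and nonzero entries, which reflects the even or odd parity of the corresponding eigenfunction noted above.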
26 Johann Carl Friedrich Gauss (1777–1855).
27 Benjamin Olinde Rodrigues (1795–1851).
Table 3.1 Eigenfunctions $\psi_v(x)$ of the Hamiltonian Operator for the Harmonic Oscillator, $\hat{H} = -(\hbar^2/2m)(d^2/dx^2) + (1/2)k_H x^2$

v    (2^v v!)^{1/2}     H_v(y)
0    1                  1
1    2^{1/2}            2y
2    8^{1/2}            4y² − 2
3    48^{1/2}           8y³ − 12y
4    384^{1/2}          16y⁴ − 48y² + 12
5    3840^{1/2}         32y⁵ − 160y³ + 120y
6    46080^{1/2}        64y⁶ − 480y⁴ + 720y² − 120
7    645120^{1/2}       128y⁷ − 1344y⁵ + 3360y³ − 1680y
8    10321920^{1/2}     256y⁸ − 3584y⁶ + 13440y⁴ − 13440y² + 1680

Note: The eigenfunctions are $\psi_v(x) = (2^v v!)^{-1/2}(\omega m/\pi\hbar)^{1/4}\exp[-(\omega m/\hbar)(x^2/2)]\,H_v((\omega m/\hbar)^{1/2}x) = (2^v v!)^{-1/2}(4\pi\nu m/h)^{1/4}\exp(-y^2/2)\,H_v(y)$, where $y \equiv 2\pi(\nu m/h)^{1/2}x$. Useful are the Rodrigues formula, $H_v(y) = (-1)^v\exp(y^2)\,(d^v/dy^v)\exp(-y^2)$; the recursion relation, $H_{v+1}(y) = 2yH_v(y) - 2vH_{v-1}(y)$; and the orthonormality, $\int_{-\infty}^{+\infty} dy\,\exp(-y^2)\,[\pi\,2^{m+n}\,m!\,n!]^{-1/2}H_m(y)H_n(y) = \delta_{mn}$.
The harmonic oscillator provides equally spaced eigenenergies; in molecular vibrations, additional anharmonic contributions (V = bx³ + cx⁴ + ...), computed numerically, spread the higher-energy vibrational eigenvalues further apart if b > 0, or closer together if b < 0.
Occupation Number Representation of the Harmonic Oscillator. The Hamiltonian $\hat{H}$ for the harmonic oscillator, Eq. (3.4.1), can be rewritten in terms of ladder operators $\hat{a}_+$ and $\hat{a}_-$, which resemble the angular momentum ladder operators [6]. Substituting Eq. (3.4.2) into Eq. (3.4.1), $\hat{H}$ can be rewritten in terms of the momentum operator $\hat{p}$ (in the x direction) and the position operator $\hat{x}$:
$\hat{H} = \hat{p}^2/2m + 2\pi^2\nu^2 m\hat{x}^2 = (1/2m)\hat{p}^2 + (1/2)\omega^2 m\hat{x}^2$    (3.4.11)
FIGURE 3.5 Plot of first four harmonic oscillator eigenfunctions (v = 0, 1, 2, 3) [e.g., ψ0(x) ∝ exp(−πνmx²/ħ)], overlaid on evenly spaced energy eigenvalues and on a plot of the potential energy function V = (1/2)k_H x². The v = 0 and v = 2 eigenfunctions are even; the v = 1 and v = 3 eigenfunctions are odd (i.e., change sign as you move from +x to −x). [Plot not reproduced. Vertical axis: Energy / (k/m)^{1/2}, with levels v = 0, 1, 2, 3 and the curve V = (k/2)x²; horizontal axis: x.]
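Dimensionless raising and lowering (ladder) operators can be defined by

$$\hat{a}_\pm \equiv 2^{-1/2} m^{-1/2} (h\nu)^{-1/2}\,[\hat{p} \pm 2\pi i\nu m\hat{x}] \tag{3.4.12}$$

One sees at once that

$$h\nu\,\hat{a}_+\hat{a}_- = \hat{H} - (1/2)h\nu \tag{3.4.13}$$

and also that

$$h\nu\,\hat{a}_-\hat{a}_+ = \hat{H} + (1/2)h\nu \tag{3.4.14}$$

whence the commutators are

$$[\hat{H}, \hat{a}_\pm] = \hat{H}\hat{a}_\pm - \hat{a}_\pm\hat{H} = \pm h\nu\,\hat{a}_\pm \tag{3.4.15}$$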
and Eq. (3.4.12) follows, after it is established that there is a lower bound to the eigenvalues. One may consider [6] these ladder operators $\hat{a}_+$ and $\hat{a}_-$ as creation and annihilation operators for bosons (quanta of vibration that obey Bose^28–Einstein statistics). In detail, assuming a set of eigenfunctions (eigenkets) $|v\rangle$ of $\hat{H}$, one may baptize $\hat{a}_-$ as the annihilation operator, also written as $\hat{a}$, which works on ket $|v\rangle$ by producing ket $|v-1\rangle$ (i.e., undoing the boson state $v$):

$$\hat{a}_-|v\rangle = \hat{a}|v\rangle = v^{1/2}|v-1\rangle \tag{3.4.16}$$

Similarly, $\hat{a}_+$ can be re-baptized as the creation operator $\hat{a}^\dagger$, which creates the boson state $v + 1$:

$$\hat{a}_+|v\rangle = \hat{a}^\dagger|v\rangle = (v+1)^{1/2}|v+1\rangle \tag{3.4.17}$$

Note that $\hat{a}$ and $\hat{a}^\dagger$ do not commute; their commutator is equal to the identity operator:

$$[\hat{a}, \hat{a}^\dagger] = \hat{a}\hat{a}^\dagger - \hat{a}^\dagger\hat{a} = 1 \tag{3.4.18}$$

$$\hat{H} = (1/2)h\nu(\hat{a}\hat{a}^\dagger + \hat{a}^\dagger\hat{a}) = (1/2)\hbar\omega(\hat{a}\hat{a}^\dagger + \hat{a}^\dagger\hat{a}) \tag{3.4.19}$$

After using Eqs. (3.4.12) to (3.4.19), the final result is gratifyingly simple:

$$\hat{H}|v\rangle = h\nu(v + 1/2)|v\rangle = \hbar\omega(v + 1/2)|v\rangle \tag{3.4.20}$$

28 Satyendra Nath Bose (1894–1974).
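The occupation-number relations can be checked with small matrices. The following is a minimal sketch under stated assumptions: the infinite basis $|0\rangle, |1\rangle, \ldots$ is truncated to `SIZE` states (an arbitrary choice), energies are in units of $h\nu$, and the position operator is taken as $\hat{x} = i\,2^{-1/2}(\hat{a} - \hat{a}^\dagger)$ in units of $(h\nu/k)^{1/2}$, which is what Eq. (3.4.12) implies. All function names are illustrative.

```python
import math


def lowering_matrix(size):
    """Matrix of a in the basis |0>, ..., |size-1>: <v-1|a|v> = sqrt(v)."""
    a = [[0.0] * size for _ in range(size)]
    for v in range(1, size):
        a[v - 1][v] = math.sqrt(v)
    return a


def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(p, q):
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def combine(p, q, cp, cq):
    """Return cp*p + cq*q for square matrices p and q."""
    n = len(p)
    return [[cp * p[i][j] + cq * q[i][j] for j in range(n)] for i in range(n)]


SIZE = 8                       # truncation of the infinite basis (assumption)
a = lowering_matrix(SIZE)
a_dag = transpose(a)           # a is real, so its adjoint is its transpose

commutator = combine(matmul(a, a_dag), matmul(a_dag, a), 1.0, -1.0)
hamiltonian = combine(matmul(a, a_dag), matmul(a_dag, a), 0.5, 0.5)

# x^4 = (i / sqrt(2))^4 (a - a_dag)^4 = (a - a_dag)^4 / 4
d = combine(a, a_dag, 1.0, -1.0)
d2 = matmul(d, d)
x4_ground = matmul(d2, d2)[0][0] / 4.0

# The last basis state is distorted by the truncation, so it is left out.
levels = [hamiltonian[v][v] for v in range(SIZE - 1)]
print(levels)       # close to 0.5, 1.5, 2.5, ...
print(x4_ground)    # close to 3/4, in units of (h nu / k)^2
```

The truncation spoils only the highest retained state: there the commutator diagonal is no longer 1, which is a reminder that the relation $[\hat{a}, \hat{a}^\dagger] = 1$ cannot hold exactly in any finite-dimensional space.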
PROBLEM 3.4.1. Verify the commutator Eq. (3.4.18).

PROBLEM 3.4.2. Obtain from Eq. (3.4.12) explicit expressions for $\hat{p}$ and $\hat{x}$ in terms of the sums and differences of the raising and lowering operators.

PROBLEM 3.4.3. [6]. Using for the position operator $\hat{x}$ the representation

$$\hat{x} \equiv i\,2^{-1/2}k^{-1/2}(h\nu)^{1/2}(\hat{a} - \hat{a}^\dagger) \tag{3.4.21}$$

use raising and lowering operators to prove that $\langle 0|x^4|0\rangle = (3/4)(h\nu/k)^2$.

PROBLEM 3.4.4. Compute the transition moment integral for the harmonic oscillator [9]:

$$\langle n|x|m\rangle = \int_{-\infty}^{+\infty} dx\,\psi_n^*(x)\,x\,\psi_m(x)$$

PROBLEM 3.4.5. Compute the expectation value of $x^2$ for the harmonic oscillator:

$$\langle n|x^2|n\rangle = \int_{-\infty}^{+\infty} dx\,\psi_n^*(x)\,x^2\,\psi_n(x)$$

PROBLEM 3.4.6. By using the energy $E = (\Delta p)^2/2m + 2\pi^2 m\nu^2(\Delta x)^2$ and the uncertainty principle, show that $E_{min} = (1/2)h\nu$.

PROBLEM 3.4.7. (i) Compute the classical energy for the harmonic oscillator of mass $m$, Hooke's law force constant $k_H$, frequency $\nu = (1/2\pi)(k_H/m)^{1/2}$, maximum oscillation amplitude $a_0$, and displacement $x$. (ii) Next, compute the classical probability that the displacement is between $x$ and $x + dx$. (iii) Compare this result with the quantum-mechanical probability for the harmonic oscillator of the same frequency $\nu$.
3.5 THE HAMILTONIAN FOR THE ONE-ELECTRON ATOM IN A CENTRAL FIELD

The hydrogen or one-electron atom Hamiltonian $\hat{H}$ (relative motion of a particle in a central Coulomb^29 field) is, after eliminating the motion of the center of mass [1],

$$\hat{H} = -\frac{\hbar^2}{2\mu}\nabla^2 - \frac{Ze^2}{4\pi\varepsilon_0 r}\ \ (\mathrm{SI}); \qquad \hat{H} = -\frac{\hbar^2}{2\mu}\nabla^2 - \frac{Ze^2}{r}\ \ (\mathrm{cgs}) \tag{3.5.1}$$

where $\mu$ is the reduced mass of the system:

$$1/\mu = 1/(\text{electron rest mass}) + 1/(\text{nuclear rest mass}) \tag{3.5.2}$$

29 Charles-Augustin de Coulomb (1736–1806).
$Z$ is the nuclear charge in units of $|e|$ (or the number of protons in the nucleus), $|e|$ is the charge on the electron, $\varepsilon_0$ is the permittivity of vacuum, $r$ is the distance from the electron to the center of mass, and $\nabla^2$ is the Laplacian^30 operator. The first term of Eq. (3.5.1) is the kinetic energy operator; the second is the potential energy operator (Coulomb field). We will show below that the solution to the time-independent Schrödinger equation

$$\hat{H}\psi = E\psi \tag{3.5.3}$$

for the bound electronic states is the eigenfunction:

$$\psi(r, \theta, \varphi) = R_{nl}(r)\,Y_{lm}(\theta, \varphi) \tag{3.5.4}$$

where $R_{nl}(r)$ is the radial eigenfunction (an associated Laguerre^31 polynomial times $\exp(-\zeta r)$) and $Y_{lm}(\theta, \varphi)$ is the angular eigenfunction or surface spherical harmonic (associated Legendre^32 polynomial times $\exp(im\varphi)$). The problem is usually solved by starting out in the Cartesian $(x, y, z)$ system:

$$\{-(\hbar^2/2\mu)[(\partial^2/\partial x^2) + (\partial^2/\partial y^2) + (\partial^2/\partial z^2)] - Ze^2(x^2 + y^2 + z^2)^{-1/2}/4\pi\varepsilon_0\}\,\psi(x, y, z) = E\,\psi(x, y, z)$$

but then exploiting the spherical symmetry of the potential, and therefore transforming this central-field problem into the spherical polar coordinate system ($r, \theta, \varphi$, where $x = r\sin\theta\cos\varphi$, $y = r\sin\theta\sin\varphi$, $z = r\cos\theta$, and $0 \le r \le \infty$, $0 \le \theta \le \pi$, $0 \le \varphi \le 2\pi$, see Section 2.7), in which the Schrödinger equation becomes

$$\{-(\hbar^2/2\mu)[(\partial^2/\partial r^2) + 2r^{-1}(\partial/\partial r) + r^{-2}(\partial^2/\partial\theta^2) + r^{-2}\cot\theta\,(\partial/\partial\theta) + (1/r^2\sin^2\theta)(\partial^2/\partial\varphi^2)] - Ze^2/4\pi\varepsilon_0 r\}\,\psi(r, \theta, \varphi) = E\,\psi(r, \theta, \varphi) \tag{3.5.5}$$

This formidable-looking equation can be solved by separation of variables. Assume that the solution is a product of two independent functions of the three variables:

$$\psi(r, \theta, \varphi) = R(r)\,Y(\theta, \varphi) \tag{3.5.6}$$

30 Pierre Simon marquis de Laplace (1749–1827).
31 Edmond Nicolas Laguerre (1834–1888).
32 Adrien-Marie Legendre (1752–1833).
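By substituting this product into the Schrödinger equation and multiplying both sides of the equation by $r^2/RY$, we find

$$\{-(\hbar^2/2\mu)[r^2R^{-1}(\partial^2R/\partial r^2) + 2rR^{-1}(\partial R/\partial r)] - Ze^2r/4\pi\varepsilon_0 - Er^2\} + \{-(\hbar^2/2\mu)[Y^{-1}(\partial^2Y/\partial\theta^2) + \cot\theta\,Y^{-1}(\partial Y/\partial\theta) + (1/Y\sin^2\theta)(\partial^2Y/\partial\varphi^2)]\} = 0 \tag{3.5.7}$$

Of the two terms in braces on the left-hand side, the first involves only $r$ and the second involves only $\theta$ and $\varphi$. Thus the two terms must equal constants whose sum vanishes; let us call the value of the second term the "separation" constant $A$, so that the first term equals $-A$:

$$-(\hbar^2/2\mu R)[r^2(d^2R/dr^2) + 2r(dR/dr)] - Ze^2r/4\pi\varepsilon_0 - Er^2 = -A \tag{3.5.8}$$

$$-(\hbar^2/2\mu Y)[(\partial^2Y/\partial\theta^2) + \cot\theta\,(\partial Y/\partial\theta) + (1/\sin^2\theta)(\partial^2Y/\partial\varphi^2)] = A \tag{3.5.9}$$

Rewriting, we obtain

$$(\hbar^2/2\mu)[(\partial^2R/\partial r^2) + 2r^{-1}(\partial R/\partial r)] + [Ze^2/4\pi\varepsilon_0 r - A\,r^{-2} + E]R = 0 \tag{3.5.10}$$

$$-(\hbar^2/2\mu Y)[(\partial^2Y/\partial\theta^2) + \cot\theta\,(\partial Y/\partial\theta) + (1/\sin^2\theta)(\partial^2Y/\partial\varphi^2)] = A \tag{3.5.11}$$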
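Let us recall the definition of angular momentum components in the coordinate representation:

$$\hat{L}_x = -i\hbar[y(\partial/\partial z) - z(\partial/\partial y)] \tag{3.5.12}$$

$$\hat{L}_y = -i\hbar[z(\partial/\partial x) - x(\partial/\partial z)] \tag{3.5.13}$$

$$\hat{L}_z = -i\hbar[x(\partial/\partial y) - y(\partial/\partial x)] \tag{3.5.14}$$

The angular momentum operator squared $\hat{L}^2$ is

$$\hat{L}^2 = \hat{L}_x^2 + \hat{L}_y^2 + \hat{L}_z^2 = \{-i\hbar[y(\partial/\partial z) - z(\partial/\partial y)]\}^2 + \{-i\hbar[z(\partial/\partial x) - x(\partial/\partial z)]\}^2 + \{-i\hbar[x(\partial/\partial y) - y(\partial/\partial x)]\}^2$$

which, expressed in spherical polar coordinates, becomes

$$\hat{L}^2 = -\hbar^2[(\partial^2/\partial\theta^2) + \cot\theta\,(\partial/\partial\theta) + (1/\sin^2\theta)(\partial^2/\partial\varphi^2)] \tag{3.5.15}$$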
This means that the Schrödinger equation for the one-electron atom or ion can be recast as

$$\{-(\hbar^2/2\mu)[(\partial^2/\partial r^2) + 2r^{-1}(\partial/\partial r)] + (2\mu r^2)^{-1}\hat{L}^2 - Ze^2/4\pi\varepsilon_0 r\}\,\psi(r, \theta, \varphi) = E\,\psi(r, \theta, \varphi) \tag{3.5.16}$$

and, since the one-electron Hamiltonian $\hat{H}$ commutes with the square of the angular momentum operator $\hat{L}^2$:

$$[\hat{H}, \hat{L}^2] = 0 \tag{3.5.17}$$
one can find common eigenfunctions of these commuting operators. In other words, once the rigid rotor (angular momentum) problem is solved, its eigenfunctions can be used for the angular dependence of the one-electron atom problem. Returning to Eq. (3.5.11), we further assume that $Y(\theta, \varphi)$ can be factored into separate functions of $\theta$ and $\varphi$:

$$Y(\theta, \varphi) \equiv \Theta(\theta)\,\Phi(\varphi) \tag{3.5.18}$$

Substituting into Eq. (3.5.11) and multiplying by $-\sin^2\theta$ gives

$$(\hbar^2/2\mu)[\Theta^{-1}\sin^2\theta\,(d^2\Theta/d\theta^2) + \Theta^{-1}\sin\theta\cos\theta\,(d\Theta/d\theta) + \Phi^{-1}(d^2\Phi/d\varphi^2)] + A\sin^2\theta = 0 \tag{3.5.19}$$

This means that the ordinary differential equations are, again, equal to constants:

$$(\hbar^2/2\mu)[\Theta^{-1}\sin^2\theta\,(d^2\Theta/d\theta^2) + \Theta^{-1}\sin^2\theta\cot\theta\,(d\Theta/d\theta)] + A\sin^2\theta = B \tag{3.5.20}$$

$$-(\hbar^2/2\mu)[\Phi^{-1}(d^2\Phi/d\varphi^2)] = B \tag{3.5.21}$$

The latter equation is immediately solved and normalized:

$$\Phi(\varphi) = (2\pi)^{-1/2}\exp(im\varphi), \qquad m = 0, \pm 1, \pm 2, \ldots \tag{3.5.22}$$

whence $(\hbar^2/2\mu)m^2 = B$. We now turn our attention to the former equation, which after division by $\hbar^2/2\mu$ reads

$$\Theta^{-1}\sin^2\theta\,(d^2\Theta/d\theta^2) + \Theta^{-1}\sin^2\theta\cot\theta\,(d\Theta/d\theta) + (2\mu A/\hbar^2)\sin^2\theta = m^2 \tag{3.5.23}$$

Multiplying both sides by $\Theta$ and dividing by $\sin^2\theta$ yields

$$(d^2\Theta/d\theta^2) + \cot\theta\,(d\Theta/d\theta) + (2\mu A/\hbar^2)\,\Theta - m^2\Theta/\sin^2\theta = 0 \tag{3.5.24}$$

Now change variables: $u \equiv \cos\theta$, and $\Theta(\theta) = G(u)$. The transformed equation is

$$(1 - u^2)\,d^2G/du^2 - 2u\,(dG/du) + G\,[2\mu A/\hbar^2 - m^2/(1 - u^2)] = 0 \tag{3.5.25}$$

The trial solution is the modified power series

$$G(u) = (1 - u^2)^{|m|/2}\sum_{j=0}^{\infty} a_j u^j \tag{3.5.26}$$

which yields the recursion relation

$$a_{j+2} = a_j\,[(j + |m|)(j + |m| + 1) - 2\mu A/\hbar^2]/[(j + 1)(j + 2)] \tag{3.5.27}$$

The series must terminate, or else the power series will diverge. The value $k$ for which the series terminates is given by $(k + |m|)(k + |m| + 1) = 2\mu A/\hbar^2$. This can be shown to lead to

$$l \equiv k + |m|, \qquad l = 0, 1, 2, \ldots \tag{3.5.28}$$

and to an angular momentum that is quantized by

$$|L| = \hbar\,[l(l + 1)]^{1/2}, \qquad l = 0, 1, 2, \ldots \tag{3.5.29}$$

and to its z-component, which is quantized by

$$L_z = m\hbar, \qquad m = -l, -l + 1, \ldots, -1, 0, 1, 2, \ldots, (l - 1), l \tag{3.5.30}$$
It is very important and significant that $|L|$ is equal to $\hbar[l(l + 1)]^{1/2}$, and not equal to $\hbar l$; in the old days this was called space quantization. This means that, even in the absence of external fields, the angular momentum vector $\mathbf{L}$ makes an angle of $\cos^{-1}(m\,[l(l + 1)]^{-1/2})$ with the z axis (polar axis). The simultaneous eigenfunctions of $\hat{L}^2$ and $\hat{L}_z$ are the complex spherical harmonics:

$$Y_{lm}(\theta, \varphi) \equiv (-1)^m\,[(2l + 1)(l - |m|)!/4\pi(l + |m|)!]^{1/2}\,P_l^{|m|}(\cos\theta)\,\exp(im\varphi) \tag{3.5.31}$$

where the $P_l^{|m|}(\cos\theta)$ are the associated Legendre polynomials of the first kind, which are obtained from the Legendre polynomials $P_l(\cos\theta)$ by

$$P_l^{|m|}(\cos\theta) = (\sin\theta)^{|m|}\,(d^{|m|}/d(\cos\theta)^{|m|})\,P_l(\cos\theta) = (-1)^l\,[2^l l!]^{-1}\sin^{|m|}\theta\,(d^{l+|m|}/d(\cos\theta)^{l+|m|})\,\sin^{2l}\theta \tag{3.5.32}$$
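The space-quantization angle quoted above is easy to tabulate. This is a small illustrative sketch (the function name `orientation_angles` is an assumption); it evaluates $\cos^{-1}(m/[l(l+1)]^{1/2})$ in degrees for every allowed $m$ of a given $l \geq 1$.

```python
import math


def orientation_angles(l):
    """Angles (degrees) between L and the z axis for m = -l ... l, l >= 1."""
    if l < 1:
        raise ValueError("the angle is undefined for l = 0, where |L| = 0")
    magnitude = math.sqrt(l * (l + 1))            # |L| in units of hbar
    return {m: math.degrees(math.acos(m / magnitude)) for m in range(-l, l + 1)}


for m, angle in orientation_angles(2).items():
    print(m, round(angle, 2))
```

Because $|m| \le l < [l(l+1)]^{1/2}$, the angle never reaches 0 or 180 degrees: the vector $\mathbf{L}$ can never lie exactly along the polar axis.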
There are some small differences between conventions for spherical harmonics $Y_{lm}(\theta, \varphi)$ in different texts (following most chemists, we use the so-called "Condon and Shortley" [10] convention). Note: These functions are also the solutions to the Rayleigh problem of the normal modes of waves on a flooded planet (s, p, d, f functions), and they also occur in the study of earthquakes.

Let us return to the r-dependence of the one-electron atom or ion, Eq. (3.5.8):

$$-(\hbar^2/2\mu R)[r^2(d^2R/dr^2) + 2r(dR/dr)] - Ze^2r/4\pi\varepsilon_0 - Er^2 = -A \tag{3.5.8}$$

We use the value of

$$A = l(l + 1)\hbar^2/2\mu \tag{3.5.33}$$

already found for the angular dependence, define the Bohr radius (for the reduced mass)

$$a_0 \equiv 4\pi\varepsilon_0\hbar^2/\mu e^2\ \ (\mathrm{SI}); \qquad a_0 \equiv \hbar^2/\mu e^2\ \ (\mathrm{cgs}) \tag{3.5.34}$$

divide both sides of Eq. (3.5.8) by $-(\hbar^2/2\mu R)r^2$, and get

$$[(d^2R/dr^2) + 2r^{-1}(dR/dr)] + [2Z/a_0 r + 2\mu E/\hbar^2 - l(l + 1)/r^2]\,R(r) = 0$$

A modified power series works:

$$R(r) = r^l\exp(-cr)\sum_{j=0}^{\infty} b_j r^j \tag{3.5.35}$$

with $c^2 = -2\mu E/\hbar^2$ fixed by the behavior at large $r$, which yields a recursion relation:

$$b_{j+1} = b_j\,[2c + 2cl + 2cj - 2Z/a_0]/[j(j + 1) + 2(l + 1)(j + 1)] \tag{3.5.36}$$

which must terminate (at $j = k$) if the radial eigenfunction is to be well-behaved:

$$2c(l + k + 1) = 2Z/a_0 \tag{3.5.37}$$
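whence the "principal" quantum number is

$$n = k + l + 1, \qquad k = 0, 1, 2, \ldots; \qquad l = 0, 1, 2, \ldots \tag{3.5.38}$$

and the energy becomes, for a one-electron atom:

$$E = E_n = -Z^2\mu e^4/(2n^2\hbar^2)\ \ (\mathrm{cgs}) \tag{3.5.39}$$

$$= -Z^2n^{-2}\,(13.6\ \mathrm{eV}) \tag{3.5.40}$$

$$= -Z^2n^{-2}\,(2.179908 \times 10^{-18}\ \mathrm{J}) \tag{3.5.41}$$

$$= -Z^2n^{-2}\,hcR_H \tag{3.5.42}$$

$$= -(1/2)Z^2e^2n^{-2}a_0^{-1}\ \ (\mathrm{cgs}), \qquad n = 1, 2, \ldots, \infty \tag{3.5.43}$$

where $R_H$ is the Rydberg^33 constant for hydrogen, $R_H \equiv \mu e^4/8\varepsilon_0^2h^3c = 1.09677576 \times 10^5\ \mathrm{cm}^{-1}$; assuming infinite mass for the nucleus ($\mu \to m_e$) gives instead $R_\infty = 1.09737316 \times 10^5\ \mathrm{cm}^{-1}$.

33 Johannes Robert Rydberg (1854–1919).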
and the radial eigenfunctions are given by

$$R_{nl}(r) = -\sqrt{\frac{4Z^3(n - l - 1)!}{n^4a_0^3\,[(n + l)!]^3}}\left(\frac{2Zr}{na_0}\right)^l\exp\left(-\frac{Zr}{na_0}\right)L_{n+l}^{2l+1}\left(\frac{2Zr}{na_0}\right) \tag{3.5.44}$$

where $L_n^m(x)$ is the associated Laguerre polynomial of degree $n - m$ and order $m$, which can be defined by a Rodrigues-type formula [9]:

$$L_n^m(x) \equiv (d^m/dx^m)\,L_n(x) = (d^m/dx^m)\,[\exp(+x)\,(d^n/dx^n)\,x^n\exp(-x)] \tag{3.5.45}$$

while the Laguerre polynomial of order $n$ is defined by its own Rodrigues formula:

$$L_n(x) \equiv \exp(+x)\,(d^n/dx^n)\,x^n\exp(-x) \tag{3.5.46}$$

Some texts [11] define the Laguerre polynomial somewhat differently, as $L_n(x) \equiv (n!)^{-1}[\exp(+x)\,(d^n/dx^n)\,x^n\exp(-x)]$; this definition is not used here. Some other texts define a "generalized" Laguerre polynomial $L_n^{(m)}(x) \equiv x^{-m}(n!)^{-1}[\exp(+x)\,(d^n/dx^n)\,x^{n+m}\exp(-x)]$.

As anticipated in Eq. (3.5.4), the normalized eigenfunctions for the one-electron atom are, apart from an overall phase factor, the product of Eqs. (3.5.31) and (3.5.44):

$$\psi_{nlm}(r, \theta, \varphi) = \sqrt{\frac{4Z^3(n - l - 1)!}{n^4a_0^3\,[(n + l)!]^3}}\sqrt{\frac{(2l + 1)(l - |m|)!}{4\pi(l + |m|)!}}\left(\frac{2Zr}{na_0}\right)^l\exp\left(-\frac{Zr}{na_0}\right)L_{n+l}^{2l+1}\left(\frac{2Zr}{na_0}\right)P_l^{|m|}(\cos\theta)\exp(im\varphi) \tag{3.5.47}$$

where $n = 1, 2, \ldots$; $l = 0, 1, 2, \ldots, (n - 1)$; $m = -l, -l + 1, \ldots, -1, 0, 1, \ldots, (l - 1), l$.
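Some features of the Legendre and Laguerre polynomials are discussed next. The Rodrigues formula for associated Legendre polynomials is

$$P_l^{|m|}(\cos\theta) = (\sin\theta)^{|m|}\left(\frac{d^{|m|}}{d(\cos\theta)^{|m|}}\right)P_l(\cos\theta) = (\sin\theta)^{|m|}\left(\frac{d^{|m|}}{d(\cos\theta)^{|m|}}\right)\left(\frac{1}{2^l\,l!}\,\frac{d^l(\cos^2\theta - 1)^l}{d(\cos\theta)^l}\right) \tag{3.5.48}$$

The first few unnormalized associated Legendre polynomials (constant factors omitted) are

$$P_0^0(\cos\theta) = 1 \qquad \text{"s"} \tag{3.5.49}$$

$$P_1^0(\cos\theta) = \cos\theta = z/r \qquad \text{"}p_z\text{"} \tag{3.5.50}$$

$$P_1^1(\cos\theta) = \sin\theta = (x^2 + y^2)^{1/2}/r \qquad \text{"}p_x \text{ and } p_y\text{"} \tag{3.5.51}$$

$$P_2^0(\cos\theta) = (3\cos^2\theta - 1) = (3z^2 - r^2)/r^2 \qquad \text{"}d_{z^2}\text{"} \tag{3.5.52}$$

$$P_2^1(\cos\theta) = \sin\theta\cos\theta = (x^2 + y^2)^{1/2}z/r^2 \qquad \text{"}d_{xz} \text{ and } d_{yz}\text{"} \tag{3.5.53}$$

$$P_2^2(\cos\theta) = \sin^2\theta = (x^2 + y^2)/r^2 \qquad \text{"}d_{xy} \text{ and } d_{x^2-y^2}\text{"} \tag{3.5.54}$$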
The connection between these associated Legendre polynomials and the angular shape of certain real atomic orbitals $p_x$, $p_y$, $p_z$, $d_{xy}$, $d_{xz}$, $d_{yz}$, $d_{x^2-y^2}$, and $d_{z^2}$ is explicitly made between quotes; the connection is exact for $m = 0$, but inexact for $m \neq 0$, where the $\varphi$-dependence of $\exp(\pm im\varphi)$ gets involved. From the generating function

$$T_{|m|}(z, t) \equiv \sum_{l=|m|}^{\infty} P_l^{|m|}(z)\,t^l = (2|m|)!\,(1 - z^2)^{|m|/2}\,t^{|m|}\,2^{-|m|}\,[|m|!]^{-1}\,(1 - 2zt + t^2)^{-|m|-1/2} \tag{3.5.55}$$

after calculating $\partial T/\partial z$, and collecting terms with like powers of $t$ (e.g., $t^l$), one can get the recursion relations [9]:

$$\sin\theta\,P_n^{m-1}(\cos\theta) = [1/(2n + 1)]\,[P_{n+1}^m(\cos\theta) - P_{n-1}^m(\cos\theta)] \tag{3.5.56}$$

$$\cos\theta\,P_n^m(\cos\theta) = [1/(2n + 1)]\,[(n - m + 1)P_{n+1}^m(\cos\theta) + (n + m)P_{n-1}^m(\cos\theta)] \tag{3.5.57}$$

$$P_n^m(\cos\theta) = \cos\theta\,P_{n-1}^m(\cos\theta) + (n + m - 1)\sin\theta\,P_{n-1}^{m-1}(\cos\theta) \tag{3.5.58}$$

$$P_n^m(\cos\theta) = 2(m - 1)\cot\theta\,P_n^{m-1}(\cos\theta) - (n + m - 1)(n - m + 2)\,P_n^{m-2}(\cos\theta) \tag{3.5.59}$$

$$P_n^m(\cos\theta) = [1/(n - m)]\,[(2n - 1)\cos\theta\,P_{n-1}^m(\cos\theta) - (n + m - 1)P_{n-2}^m(\cos\theta)] \tag{3.5.60}$$

The functions with negative order $-m$ are related to those of positive order $m$ by

$$P_n^{-m}(\cos\theta) = (-1)^m\,[(n - m)!/(n + m)!]\,P_n^m(\cos\theta) \tag{3.5.61}$$

For negative arguments one gets

$$P_n^m(-\cos\theta) = (-1)^{n-m}P_n^m(\cos\theta) \tag{3.5.62}$$

The normalization for associated Legendre polynomials is

$$\int_{\theta=0}^{\theta=\pi} P_n^m(\cos\theta)\,P_{n'}^m(\cos\theta)\sin\theta\,d\theta = [2/(2n + 1)]\,[(n + m)!/(n - m)!]\,\delta_{nn'} \tag{3.5.63}$$

$$\int_{\theta=0}^{\theta=\pi} P_n^m(\cos\theta)\,P_{n'}^{-m}(\cos\theta)\sin\theta\,d\theta = (-1)^m\,[2/(2n + 1)]\,\delta_{nn'} \tag{3.5.64}$$

$$\int_{\theta=0}^{\theta=\pi} P_n^m(\cos\theta)\,P_n^{m'}(\cos\theta)\,(\sin\theta)^{-1}d\theta = [1/m]\,[(n + m)!/(n - m)!]\,\delta_{mm'} \tag{3.5.65}$$

$$\int_{\theta=0}^{\theta=\pi} P_n^m(\cos\theta)\,P_n^{-m'}(\cos\theta)\,(\sin\theta)^{-1}d\theta = (-1)^m\,[1/m]\,\delta_{mm'} \tag{3.5.66}$$

The orthonormalization for spherical harmonics reads

$$\int_{\theta=0}^{\theta=\pi}\sin\theta\,d\theta\int_{\varphi=0}^{\varphi=2\pi}d\varphi\,Y_{lm}^*(\theta, \varphi)\,Y_{l'm'}(\theta, \varphi) = \delta_{ll'}\delta_{mm'} \tag{3.5.67}$$
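Here are some normalized spherical harmonics (complex):

$$Y_{0,0}(\theta, \varphi) = (4\pi)^{-1/2} \tag{3.5.68}$$

$$Y_{1,0}(\theta, \varphi) = (3/4\pi)^{1/2}\cos\theta \tag{3.5.69}$$

$$Y_{1,1}(\theta, \varphi) = -(3/8\pi)^{1/2}\sin\theta\,e^{i\varphi} \tag{3.5.70}$$

$$Y_{1,-1}(\theta, \varphi) = (3/8\pi)^{1/2}\sin\theta\,e^{-i\varphi} \tag{3.5.71}$$

$$Y_{2,0}(\theta, \varphi) = (5/16\pi)^{1/2}(3\cos^2\theta - 1) \tag{3.5.72}$$

$$Y_{2,1}(\theta, \varphi) = -(15/8\pi)^{1/2}\sin\theta\cos\theta\,e^{i\varphi} \tag{3.5.73}$$

$$Y_{2,-1}(\theta, \varphi) = (15/8\pi)^{1/2}\sin\theta\cos\theta\,e^{-i\varphi} \tag{3.5.74}$$

$$Y_{2,2}(\theta, \varphi) = (15/32\pi)^{1/2}\sin^2\theta\,e^{2i\varphi} \tag{3.5.75}$$

$$Y_{2,-2}(\theta, \varphi) = (15/32\pi)^{1/2}\sin^2\theta\,e^{-2i\varphi} \tag{3.5.76}$$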
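The orthonormality condition of Eq. (3.5.67) can be verified numerically for the low-order spherical harmonics. The sketch below is illustrative only: it hard-codes the standard closed forms for $l \le 2$ (Condon and Shortley phase) and integrates over the sphere with a simple midpoint rule; the function names and grid sizes are assumptions.

```python
import cmath
import math


def spherical_harmonic(l, m, theta, phi):
    """Y_lm for l <= 2 (Condon-Shortley phase), from standard closed forms."""
    s, c = math.sin(theta), math.cos(theta)
    pi = math.pi
    polar = {
        (0, 0): math.sqrt(1 / (4 * pi)),
        (1, 0): math.sqrt(3 / (4 * pi)) * c,
        (1, 1): -math.sqrt(3 / (8 * pi)) * s,
        (1, -1): math.sqrt(3 / (8 * pi)) * s,
        (2, 0): math.sqrt(5 / (16 * pi)) * (3 * c * c - 1),
        (2, 1): -math.sqrt(15 / (8 * pi)) * s * c,
        (2, -1): math.sqrt(15 / (8 * pi)) * s * c,
        (2, 2): math.sqrt(15 / (32 * pi)) * s * s,
        (2, -2): math.sqrt(15 / (32 * pi)) * s * s,
    }
    return polar[(l, m)] * cmath.exp(1j * m * phi)


def overlap(l1, m1, l2, m2, n_theta=400, n_phi=32):
    """Midpoint-rule estimate of the integral of conj(Y_l1m1) Y_l2m2 over the sphere."""
    d_theta = math.pi / n_theta
    d_phi = 2 * math.pi / n_phi
    total = 0j
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            y1 = spherical_harmonic(l1, m1, theta, phi)
            y2 = spherical_harmonic(l2, m2, theta, phi)
            total += y1.conjugate() * y2 * math.sin(theta)
    return total * d_theta * d_phi


print(abs(overlap(2, 1, 2, 1)))    # close to 1
print(abs(overlap(1, 1, 1, -1)))   # close to 0
```

A wrong angular factor (for example $\cos^2\theta$ in place of $\sin^2\theta$ in $Y_{2,\pm 2}$) shows up immediately as a norm different from 1, so this kind of check is a cheap guard against transcription errors.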
The Laguerre polynomials $L_n(\rho)$ become orthonormal polynomials only after they are multiplied by a weighting function $(n!)^{-1}\exp(-\rho)$ [11]:

$$\int_{\rho=0}^{\rho=\infty}\frac{1}{m!\,n!}\,L_m(\rho)\,L_n(\rho)\exp(-\rho)\,d\rho = \delta_{nm} \tag{3.5.77}$$

A similar weighting factor transforms the associated Laguerre polynomial $L_n^m(\rho)$ into the orthonormal function $[(n - m)!\,(n!)^{-3}]^{1/2}\exp(-\rho/2)\,\rho^{m/2}L_n^m(\rho)$ [7], [12], whence the orthonormalization condition becomes

$$\sqrt{\frac{(n - m)!\,(n' - m')!}{(n!)^3\,(n'!)^3}}\int_{\rho=0}^{\rho=\infty}L_n^m(\rho)\,L_{n'}^{m'}(\rho)\,\rho^{(m+m')/2}\exp(-\rho)\,d\rho = \delta_{nn'}\delta_{mm'} \tag{3.5.78}$$

From the generating function for associated Laguerre polynomials

$$U_m(\rho, t) \equiv \sum_{n=m}^{\infty}L_n^m(\rho)\,\frac{t^n}{n!} = \frac{(-t)^m\exp[-\rho t/(1 - t)]}{(1 - t)^{m+1}} \tag{3.5.79}$$

one can calculate $\partial U/\partial t$, collect terms with like powers of $t$ (e.g., $t^n$), and get the following recursion relation for associated Laguerre polynomials:

$$(n - m)\,L_n^m(\rho) = n(2n - m - 1 - \rho)\,L_{n-1}^m(\rho) - n(n - 1)^2\,L_{n-2}^m(\rho) \tag{3.5.80}$$

A useful expression for the associated Laguerre polynomials [10] is

$$L_{n+l}^{2l+1}(\rho) = -[(n + l)!]^2\sum_{p=0}^{p=n-l-1}\frac{(-\rho)^p}{p!\,(n - l - p - 1)!\,(2l + p + 1)!} \tag{3.5.81}$$
The first few associated Laguerre polynomials are

$$L_1^0(x) = -x + 1 \tag{3.5.82}$$

$$L_1^1(x) = -1 \tag{3.5.83}$$

$$L_2^0(x) = x^2 - 4x + 2 \tag{3.5.84}$$

$$L_2^1(x) = 2x - 4 \tag{3.5.85}$$

$$L_2^2(x) = 2 \tag{3.5.86}$$

$$L_3^0(x) = -x^3 + 9x^2 - 18x + 6 \tag{3.5.87}$$

$$L_3^1(x) = -3x^2 + 18x - 18 \tag{3.5.88}$$

$$L_3^2(x) = -6x + 18 \tag{3.5.89}$$

$$L_3^3(x) = -6 \tag{3.5.90}$$
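The listed polynomials can be regenerated from the definitions of Eqs. (3.5.45) and (3.5.46). The sketch below is one possible check, not the method of any cited reference: it assumes the explicit expansion $L_n(x) = \sum_k (-1)^k (n!)^2 x^k/[(k!)^2 (n-k)!]$, which follows from applying the Leibniz rule to the Rodrigues formula (3.5.46), and then differentiates $m$ times. Function names are illustrative.

```python
from math import factorial


def laguerre(n):
    """Coefficients (index = power of x) of L_n(x) = e^x d^n/dx^n (x^n e^-x)."""
    return [(-1) ** k * factorial(n) ** 2 // (factorial(k) ** 2 * factorial(n - k))
            for k in range(n + 1)]


def derivative(coeffs):
    """Differentiate a polynomial given as a coefficient list."""
    return [k * c for k, c in enumerate(coeffs)][1:] or [0]


def associated_laguerre(n, m):
    """Coefficients of L_n^m(x) = d^m/dx^m L_n(x), the unnormalized convention."""
    coeffs = laguerre(n)
    for _ in range(m):
        coeffs = derivative(coeffs)
    return coeffs


print(associated_laguerre(3, 1))   # [-18, 18, -3], i.e. -3x^2 + 18x - 18
```

Note that this convention differs by a factor of $n!$ (and, for the associated polynomials, by a sign and an index shift) from the generalized Laguerre polynomials provided by most numerical libraries, so library values should not be compared with the list above without conversion.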
Table 3.2 gathers some hydrogen atom wavefunctions. Figure 3.6 shows, as functions of $r/a_0$, the distance from the center of mass (in units of the Bohr radius $a_0$ = 0.529177 Å), three functions:

1. The hydrogen "1s" wavefunction $\psi_{100}(r, \theta, \varphi) = (1/\pi a_0^3)^{1/2}\exp(-r/a_0)$, which is a probability amplitude.
2. Its square $\psi_{100}^2 = (1/\pi a_0^3)\exp(-2r/a_0)$, which is a probability density.
3. The radial distribution function $4\pi r^2\psi_{100}^2 = (4r^2/a_0^3)\exp(-2r/a_0)$, which, when multiplied by $dr$, is a probability that the electron is in a spherical shell between $r$ and $r + dr$. This last function peaks at the Bohr radius $a_0$.
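The statement that the radial distribution function peaks at the Bohr radius can be confirmed on a grid. This is a minimal sketch, assuming lengths measured in units of $a_0$ and a simple Riemann sum for the integrals; the function name and the grid spacing are illustrative choices.

```python
import math


def radial_distribution_1s(r, z=1.0):
    """4 pi r^2 |psi_100|^2 for nuclear charge z, with r in units of a0."""
    return 4.0 * z ** 3 * r ** 2 * math.exp(-2.0 * z * r)


dr = 0.001
grid = [i * dr for i in range(1, 20001)]            # 0 < r <= 20 a0
values = [radial_distribution_1s(r) for r in grid]

r_peak = grid[values.index(max(values))]            # most probable radius
norm = sum(values) * dr                             # total probability
mean_r = sum(r * p for r, p in zip(grid, values)) * dr

print(r_peak, norm, mean_r)
```

The most probable radius comes out at $a_0$, while the mean radius is $1.5\,a_0$: the long exponential tail pulls the average outward, which is why the two measures of atomic "size" differ.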
Table 3.2 Normalized Eigenfunctions $\psi_{nlm}(r, \theta, \varphi) = R_{nl}(r)\,Y_{lm}(\theta, \varphi)$ for the One-Electron Atom^a

n   l   m    Name                          $\psi_{nlm}$                                                                                                   Angular dependence
1   0   0    1s                            $\pi^{-1/2}(Z/a_0)^{3/2}\exp(-Zr/a_0)$                                                                         1
2   0   0    2s                            $32^{-1/2}\pi^{-1/2}(Z/a_0)^{3/2}(2 - Zr/a_0)\exp(-Zr/2a_0)$                                                   1
2   1   0    2p_z                          $32^{-1/2}\pi^{-1/2}(Z/a_0)^{3/2}(Zr/a_0)\exp(-Zr/2a_0)\cos\theta$                                             $z/r$
2   1   ±1   2p_x ± i2p_y                  $8^{-1}\pi^{-1/2}(Z/a_0)^{3/2}(Zr/a_0)\exp(-Zr/2a_0)\sin\theta\exp(\pm i\varphi)$                              $(x \pm iy)/r$
3   0   0    3s                            $19{,}683^{-1/2}\pi^{-1/2}(Z/a_0)^{3/2}(27 - 18Zr/a_0 + 2Z^2r^2a_0^{-2})\exp(-Zr/3a_0)$                        1
3   1   0    3p_z                          $2^{1/2}\,6561^{-1/2}\pi^{-1/2}(Z/a_0)^{3/2}(6Zr/a_0 - Z^2r^2a_0^{-2})\exp(-Zr/3a_0)\cos\theta$                $z/r$
3   1   ±1   3p_x ± i3p_y                  $6561^{-1/2}\pi^{-1/2}(Z/a_0)^{3/2}(6Zr/a_0 - Z^2r^2a_0^{-2})\exp(-Zr/3a_0)\sin\theta\exp(\pm i\varphi)$       $(x \pm iy)/r$
3   2   0    3d_{z^2}                      $39{,}366^{-1/2}\pi^{-1/2}(Z/a_0)^{3/2}(Z^2r^2a_0^{-2})\exp(-Zr/3a_0)(3\cos^2\theta - 1)$                      $(3z^2 - r^2)/r^2$
3   2   ±1   3d_{xz} ± i3d_{yz}            $81^{-1}\pi^{-1/2}(Z/a_0)^{3/2}(Z^2r^2a_0^{-2})\exp(-Zr/3a_0)\sin\theta\cos\theta\exp(\pm i\varphi)$           $(xz \pm iyz)/r^2$
3   2   ±2   3d_{x^2-y^2} ± i3d_{xy}       $162^{-1}\pi^{-1/2}(Z/a_0)^{3/2}(Z^2r^2a_0^{-2})\exp(-Zr/3a_0)\sin^2\theta\exp(\pm 2i\varphi)$                 $(x^2 - y^2 \pm 2ixy)/r^2$

a The radial part is $R_{nl}(r) = -[4Z^3(n - l - 1)!\,n^{-4}a_0^{-3}\,[(n + l)!]^{-3}]^{1/2}(2Zr/na_0)^l\exp(-Zr/na_0)\,L_{n+l}^{2l+1}(2Zr/na_0)$. The angular part is $Y_{lm}(\theta, \varphi) = (-1)^m[(2l + 1)(l - |m|)!/4\pi(l + |m|)!]^{1/2}P_l^{|m|}(\cos\theta)\exp(im\varphi)$; so finally, apart from an overall phase factor, $\psi_{nlm}(r, \theta, \varphi) = [Z^3(n - l - 1)!\,(2l + 1)(l - |m|)!/\pi a_0^3n^4\,[(n + l)!]^3\,(l + |m|)!]^{1/2}(2Zr/na_0)^l\exp(-Zr/na_0)\,L_{n+l}^{2l+1}(2Zr/na_0)\,P_l^{|m|}(\cos\theta)\exp(im\varphi)$. "Name" is the chemical conventional name.
FIGURE 3.6 Plots of the hydrogen 1s wavefunction (probability amplitude) $\psi_{100}(r, \theta, \varphi) = (1/\pi a_0^3)^{1/2}\exp(-r/a_0)$, of its square (probability density) $\psi_{100}^2 = (1/\pi a_0^3)\exp(-2r/a_0)$, and of the radial distribution function $4\pi r^2\psi_{100}^2 = (4r^2/a_0^3)\exp(-2r/a_0)$ as functions of $r$, the distance from the center of mass (in units of the Bohr radius $a_0$ = 0.529177 Å). [Horizontal axis: $r$ (bohrs), from 0 to 5; vertical axis from 0 to 1.2; curves labeled $\psi_{1s}(r)$, $[\psi_{1s}(r)]^2$, and $4\pi r^2[\psi_{1s}(r)]^2$.]
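PROBLEM 3.5.1. Prove the expansion $\hat{L}^2 = \hat{L}_x^2 + \hat{L}_y^2 + \hat{L}_z^2 = [-i\hbar y(\partial/\partial z) + i\hbar z(\partial/\partial y)]^2 + [-i\hbar z(\partial/\partial x) + i\hbar x(\partial/\partial z)]^2 + [-i\hbar x(\partial/\partial y) + i\hbar y(\partial/\partial x)]^2$.

PROBLEM 3.5.2. Show that the commutator is $[\hat{L}_x, \hat{L}_y] = \hat{L}_x\hat{L}_y - \hat{L}_y\hat{L}_x = i\hbar\hat{L}_z$.

PROBLEM 3.5.3. Show that the commutators are $[\hat{L}_y, \hat{L}_z] = i\hbar\hat{L}_x$ and $[\hat{L}_z, \hat{L}_x] = i\hbar\hat{L}_y$.

PROBLEM 3.5.4. Verify $[\hat{L}^2, \hat{L}_z] = 0$.

PROBLEM 3.5.5. Show $[\hat{L}^2, \hat{L}_x] = 0$ and also $[\hat{L}^2, \hat{L}_y] = 0$.

PROBLEM 3.5.6. One can define linear ladder operators for angular momentum (orbital or spin): the raising operator $\hat{L}_+ \equiv \hat{L}_x + i\hat{L}_y$ and the lowering operator $\hat{L}_- \equiv \hat{L}_x - i\hat{L}_y$. (a) Verify that brute-force expansion yields $\hat{L}_+\hat{L}_- = \hat{L}^2 - \hat{L}_z^2 + \hbar\hat{L}_z$ and similarly $\hat{L}_-\hat{L}_+ = \hat{L}^2 - \hat{L}_z^2 - \hbar\hat{L}_z$; therefore, $\hat{L}_+\hat{L}_- - \hat{L}_-\hat{L}_+ = 2\hbar\hat{L}_z$. (b) Also verify that $\hat{L}_+\hat{L}_z = \hat{L}_z\hat{L}_+ - \hbar\hat{L}_+$ and further $\hat{L}_-\hat{L}_z = \hat{L}_z\hat{L}_- + \hbar\hat{L}_-$. (c) Show that $[\hat{L}^2, \hat{L}_\pm] = 0$. (d) Assume that there are simultaneous eigenfunctions $Y$ of $\hat{L}^2$ and $\hat{L}_z$ such that $\hat{L}^2Y = aY$ and $\hat{L}_zY = bY$. By using ladder operators show that the eigenvalues are finite in number and are bounded both above and below.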
PROBLEM 3.5.7. Prove that the eigenfunctions of the hydrogen atom form an orthonormal basis.

PROBLEM 3.5.8. Evaluate the first moment integrals for the one-electron atom: $\langle n'l'm'|x|nlm\rangle$, $\langle n'l'm'|y|nlm\rangle$, and $\langle n'l'm'|z|nlm\rangle$. The case $n = n'$, $l = l'$, $m = m'$ is the first moment of the electron distribution; when multiplied by the electronic charge $|e|$, the integral $\boldsymbol{\mu} = |e|\langle nlm|\mathbf{r}|nlm\rangle$ is the static permanent electric dipole moment. The off-diagonal integrals, again multiplied by $|e|$, are the transition moments from eigenstate $\{n, l, m\}$ to eigenstate $\{n', l', m'\}$. The term "moment" is usually reserved for diagonal terms; nevertheless, by tradition, transition "moments" are given that name, even though they are actually off-diagonal matrix elements.

PROBLEM 3.5.9. Evaluate the second moment integrals for the one-electron atom: $\langle n'l'm'|x^2|nlm\rangle$, $\langle n'l'm'|y^2|nlm\rangle$, $\langle n'l'm'|z^2|nlm\rangle$.
3.6 THE DIRAC EQUATION

The Dirac equation [13, 14] is a relativistically correct version of the Schrödinger formalism, valid for spin-1/2 particles (or antiparticles, as we shall see). One can start by writing the Hamiltonian for a particle of charge $|e|$, momentum $\mathbf{p}$, and rest mass $m_0$ in an electromagnetic field with vector potential $\mathbf{A}$ and scalar potential $\varphi$ as

$$\hat{H} = c\,[(\mathbf{p} - |e|\mathbf{A})^2 + m_0^2c^2]^{1/2} + |e|\varphi \tag{3.6.1}$$

which can be formally squared as

$$(\hat{H} - |e|\varphi)^2c^{-2} - (\mathbf{p} - |e|\mathbf{A})^2 = m_0^2c^2 \tag{3.6.2}$$

Extending the Schrödinger formalism to a relativistically correct form requires that one use relativistic four-vectors both for the space–time coordinates of the electron:

$$x_\mu \equiv (x, y, z, ict) = (\mathbf{r}, ict) \qquad (\mu = 1, 2, 3, 4) \tag{3.6.3}$$

where $i = (-1)^{1/2}$ and $c$ = speed of light in vacuo, and also for its four-momentum:

$$p_\mu \equiv -i\hbar\,(\partial/\partial x_\mu) = (-i\hbar\nabla, -\hbar c^{-1}(\partial/\partial t)) = (\mathbf{p}, ic^{-1}E) \qquad (\mu = 1, 2, 3, 4) \tag{3.6.4}$$

where $E$ is the total relativistic energy, and for the four-potential we have

$$A_\mu \equiv (\mathbf{A}, i\varphi) \qquad (\mu = 1, 2, 3, 4) \tag{3.6.5}$$

where $\mathbf{A}$ is the magnetic vector potential and $\varphi$ is the electric scalar potential. The wavefunction $\psi$ must then be a $4 \times 1$ column vector with four components:

$$\psi = (\psi_1, \psi_2, \psi_3, \psi_4) = \begin{pmatrix}\psi_1 \\ \psi_2 \\ \psi_3 \\ \psi_4\end{pmatrix} \tag{3.6.6}$$
One might replace, as in the Schrödinger formalism, $\mathbf{p}$ by $(\hbar/i)\nabla$ and the Hamiltonian operator $\hat{H}$ by $-(\hbar/i)(\partial/\partial t)$; this yields the Klein^34–Gordon^35 equation:

$$\{[i\hbar(\partial/\partial t) - |e|\varphi]^2c^{-2} - [-i\hbar\nabla - |e|\mathbf{A}]^2\}\,\psi = \{m_0^2c^2\}\,\psi \tag{3.6.7}$$

which is relativistically correct, but valid only for particles with no intrinsic angular momentum. The Klein–Gordon equation for the free particle ($\varphi = \mathbf{A} = 0$) reduces to

$$\nabla^2\psi - c^{-2}(\partial^2\psi/\partial t^2) = m_0^2c^2\hbar^{-2}\psi \tag{3.6.8}$$

which, by the usual separation-of-variables trick of assuming $\psi(\mathbf{r}, ict) = A\exp[i\hbar^{-1}(\mathbf{p}\cdot\mathbf{r} - Wt)]$, yields a total energy $W$:

$$W = \pm(p^2c^2 + m_0^2c^4)^{1/2} \tag{3.6.9}$$

which can be both positive and negative: but negative energies sounded horrid in 1928! Dirac improved on matters by assuming that Eq. (3.6.1) must be made symmetrical and linear in the momenta: He defined two new quantities, $\boldsymbol{\alpha}$ and $\beta$, with (roughly) $\boldsymbol{\alpha}$ being a vector and $\beta$ a scalar:

$$[(\mathbf{p} - |e|\mathbf{A})^2 + m_0^2c^2]^{1/2} \equiv \boldsymbol{\alpha}\cdot(\mathbf{p} - |e|\mathbf{A}) + \beta m_0c \tag{3.6.10}$$
For scalar $\beta$ and vector $\boldsymbol{\alpha}$, however, serious contradictions arise (Problem 3.6.1), which were "fixed" by defining $\alpha_x$, $\alpha_y$, $\alpha_z$, and $\beta$ as anticommuting operators, representable by the following traceless $4 \times 4$ matrices:

$$\alpha_x \equiv \begin{pmatrix}0 & 0 & 0 & 1\\ 0 & 0 & 1 & 0\\ 0 & 1 & 0 & 0\\ 1 & 0 & 0 & 0\end{pmatrix},\quad \alpha_y \equiv \begin{pmatrix}0 & 0 & 0 & -i\\ 0 & 0 & i & 0\\ 0 & -i & 0 & 0\\ i & 0 & 0 & 0\end{pmatrix},\quad \alpha_z \equiv \begin{pmatrix}0 & 0 & 1 & 0\\ 0 & 0 & 0 & -1\\ 1 & 0 & 0 & 0\\ 0 & -1 & 0 & 0\end{pmatrix},\quad \beta \equiv \begin{pmatrix}1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & -1 & 0\\ 0 & 0 & 0 & -1\end{pmatrix} \tag{3.6.11}$$

34 Oskar Benjamin Klein (1894–1977).
35 Walter Gordon (1893–1939).
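which are block combinations of the three $2 \times 2$ Pauli spin matrices $\sigma_P$ and a $2 \times 2$ identity matrix $I$:

$$\sigma_{Px} \equiv \begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix},\quad \sigma_{Py} \equiv \begin{pmatrix}0 & -i\\ i & 0\end{pmatrix},\quad \sigma_{Pz} \equiv \begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix},\quad I \equiv \begin{pmatrix}1 & 0\\ 0 & 1\end{pmatrix} \tag{3.6.12}$$

Substituting Eq. (3.6.11) in Eq. (3.6.10), the relativistically correct and useful Dirac equation, valid for spin-1/2 particles but written here in noncovariant notation, is obtained:

$$\hat{H}_D\psi = [c\,\boldsymbol{\alpha}\cdot(-i\hbar\nabla - |e|\mathbf{A}) + \beta m_0c^2 + |e|\varphi]\,\psi = i\hbar(\partial\psi/\partial t) = E\psi \tag{3.6.13}$$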
The four equations abbreviated in Eq. (3.6.12) can be written out as: fihð@=@tÞ þ jejj þ m0 c2 gc1 þ cfihð@=@xÞ jejAx gc3 þ cfihð@=@xÞ jejAx i½ihð@=@yÞ jejAy gc4 ¼ 0 fihð@=@tÞ þ jejj þ m0 c2 gc2 þ cfihð@=@xÞ jejAx þ i½ihð@=@yÞ jejAy gc3 cfihð@=@zÞ jejAzgc4 ¼ 0 cfihð@=@zÞ jejAz gc1 þ cfihð@=@xÞ jejAx i½ihð@=@yÞ jejAy gc2 þfihð@=@tÞ þ jejj þ m0 c2 gc3 ¼ 0 cfihð@=@xÞ jejAx þ i½ihð@=@yÞ jejAy gc1 cfihð@=@zÞ jejAz gc2 þfihð@=@tÞ þ jejj þ m0 c2 gc4 ¼ 0
ð3:6:14Þ
For a free particle ($\mathbf{A} = \varphi = 0$), Eq. (3.6.13) reduces to Eq. (3.6.8); we now must, alas, accept the possibility of a set of negative-energy solutions. Do particles with negative energies exist? Yes, they are the so-called antiparticles. In other words, the existence of the positron was predicted by the Dirac equation.

PROBLEM 3.6.1. By squaring both sides of Eq. (3.6.13) and equating the coefficients of similar terms, show that using ordinary scalar $\beta$ and vector $\boldsymbol{\alpha}$ leads to nonsense.

PROBLEM 3.6.2. Verify that Eq. (3.6.14) follows from Eq. (3.6.13).

PROBLEM 3.6.3. Verify that Eq. (3.6.13) leads to Eq. (3.6.8) for the case $\mathbf{A} = \varphi = 0$.

PROBLEM 3.6.4. Verify the anticommutator rule for $\boldsymbol{\alpha}$ and $\beta \equiv \alpha_4$: $\alpha_i\alpha_k + \alpha_k\alpha_i = 2\delta_{ik}$ ($i, k$ = 1, 2, 3, 4), where $\delta_{ik}$ is the Kronecker delta.
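The anticommutator rule of Problem 3.6.4 can be checked by direct matrix multiplication. The sketch below assumes the standard Dirac representation, in which each $\alpha$ matrix has a Pauli matrix in its two off-diagonal $2 \times 2$ blocks and $\beta$ has $I$ and $-I$ on the diagonal; the helper names are illustrative, and a numerical check is of course no substitute for the algebraic proof the problem asks for.

```python
ZERO = [[0, 0], [0, 0]]
IDENT = [[1, 0], [0, 1]]
PAULI = {
    "x": [[0, 1], [1, 0]],
    "y": [[0, -1j], [1j, 0]],
    "z": [[1, 0], [0, -1]],
}


def block(a, b, c, d):
    """Assemble a 4x4 matrix from 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def negate(m):
    return [[-x for x in row] for row in m]


def matmul(p, q):
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def anticommutator(p, q):
    pq, qp = matmul(p, q), matmul(q, p)
    return [[pq[i][j] + qp[i][j] for j in range(4)] for i in range(4)]


# alpha_x, alpha_y, alpha_z, and beta (= alpha_4), assumed Dirac representation
matrices = [block(ZERO, s, s, ZERO) for s in PAULI.values()]
matrices.append(block(IDENT, ZERO, ZERO, negate(IDENT)))


def satisfies_rule(mats, tol=1e-12):
    """True if alpha_i alpha_k + alpha_k alpha_i = 2 delta_ik for all pairs."""
    for i, p in enumerate(mats):
        for k, q in enumerate(mats):
            result = anticommutator(p, q)
            for r in range(4):
                for c in range(4):
                    expected = 2 if (i == k and r == c) else 0
                    if abs(result[r][c] - expected) > tol:
                        return False
    return True


print(satisfies_rule(matrices))   # True
```

Replacing any of the four matrices by an ordinary number breaks the check at once, which is the content of Problem 3.6.1: no set of four commuting scalars can satisfy all ten conditions.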
PROBLEM 3.6.5. Define the 4 4 Dirac spin operator s from the three 2 2 Pauli spin operators s P as sP 0 s¼ 0 sP Verify that sj ¼ i akal, where j, k, l ¼ cyclic permutation of 1, 2, 3. PROBLEM 3.6.6. Verify that sjai ¼ aisj and also that sj sk ak aj ¼ 2iJz L^z þ h2 sz a1 where j, k, l ¼ cyclic permutation of 1, 2, 3. Antiparticles have been detected experimentally, but annihilate rather rapidly if they encounter their counterpart particle. For instance, in pair annihilation, a positron (e þ ) encountering an electron (e): eþ þ e ! ðe . . . eþ Þ ! 2hn
ð3:6:15Þ
will vanish with it, and two photons of energy $h\nu$ = 0.511 MeV each are emitted (two, not one, because momentum must be conserved). Therefore the negative-energy solutions for the Dirac equation are not a mathematical fiction: In principle, each fundamental particle does have its corresponding antiparticle (which has the opposite electrical charge, but the same spin and the same nonnegative mass). Equation (3.6.15) also shows the formation of a transient Coulomb-bound electron–positron pair ("positronium"), whose decay into two photons is more rapid if the total spin is $S = 0$ than if it is $S = 1$, and is dependent on the medium. The reaction of Eq. (3.6.15) is also possible in the reverse direction, even if relatively infrequent; this is particle–antiparticle pair creation. This possibility is what underlies the idea of vacuum polarization and small effects, like the Lamb shift in atomic spectra. Positrons are not that rare: Many radioactive nuclei decay by positron emission, for instance, sodium-22:

$${}_{11}\mathrm{Na}^{22} \rightarrow {}_{10}\mathrm{Ne}^{22} + e^+ + \nu_e \tag{3.6.16}$$

Hence Eq. (3.6.15) is not merely an academic exercise: Indeed, positron emission tomography (PET) is a known analytical technique used in medicine (the annihilation rate is subtly spin-dependent and varies, depending on the type of human body tissue traversed). When matter and antimatter collide in the universe, they annihilate each other in a cosmic version of Eq. (3.6.15). One important question remains: How many antiparticles exist in the universe? In the 1930s it was fashionable to guess an equal number of antiparticles and particles (and to think of antiparticles as "holes" in the filled vacuum), but in 2010 the educated guess by cosmologists is that only 10% of the matter in the universe is antimatter.

It can be shown that the Hamiltonian $\hat{H}$ in Eq. (3.6.13) and the x component of angular momentum $\hat{L}_x \equiv -i\hbar[y(\partial/\partial z) - z(\partial/\partial y)]$ do not commute, but that $\hat{L}_x + (\hbar/2)\sigma_x$ does commute with $\hat{H}$. The same is true for $\hat{L}_y + (\hbar/2)\sigma_y$ and for $\hat{L}_z + (\hbar/2)\sigma_z$. Then we are led to define a new operator

$$J_z \equiv \hat{L}_z + (\hbar/2)\sigma_z \tag{3.6.17}$$
which commutes nicely with $\hat{H}$. Thus the Dirac equation leads naturally to the addition of spin angular momentum to orbital angular momentum to create a total angular momentum

$$\mathbf{J} = \mathbf{L} + \mathbf{S} \tag{3.6.18}$$

Half-integral spin thus emerges naturally from the Dirac equation (with $g = 2$, as explained below). To put the Dirac equation into elegant covariant notation, it is useful to define

$$\pi_\mu \equiv p_\mu + |e|c^{-1}A_\mu \qquad (\mu = 1, 2, 3, 4) \tag{3.6.19}$$

and a four-vector $\gamma_\mu$ defined by

$$\gamma_\mu \equiv (-i\beta\boldsymbol{\alpha}, \beta) \qquad (\mu = 1, 2, 3, 4) \tag{3.6.20}$$

so that

$$\gamma_1 \equiv \begin{pmatrix}0 & 0 & 0 & -i\\ 0 & 0 & -i & 0\\ 0 & i & 0 & 0\\ i & 0 & 0 & 0\end{pmatrix},\quad \gamma_2 \equiv \begin{pmatrix}0 & 0 & 0 & -1\\ 0 & 0 & 1 & 0\\ 0 & 1 & 0 & 0\\ -1 & 0 & 0 & 0\end{pmatrix},\quad \gamma_3 \equiv \begin{pmatrix}0 & 0 & -i & 0\\ 0 & 0 & 0 & i\\ i & 0 & 0 & 0\\ 0 & -i & 0 & 0\end{pmatrix},\quad \gamma_4 \equiv \begin{pmatrix}1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & -1 & 0\\ 0 & 0 & 0 & -1\end{pmatrix} \tag{3.6.21}$$

Using all this, the Dirac equation becomes, very succinctly,

$$\left[\sum_{\mu=1}^{4}(\pi_\mu\gamma_\mu) - im_0c\right]\psi = 0 \tag{3.6.22}$$
where $\psi$ is the four-component eigenfunction given in Eq. (3.6.6). It is also convenient to convert the Dirac equation into a second-order partial differential equation, by multiplying both sides of Eq. (3.6.22) by $[\sum_{\mu=1}^{4}(\pi_\mu\gamma_\mu) + im_0c]$. After some travail, using quantities that are more familiar, the result is

$$[\{W + |e|\varphi + (\hbar^2/2m_0)\nabla^2\} + (2m_0c^2)^{-1}(W + |e|\varphi)^2 + i(e\hbar/m_0c)(\mathbf{A}\cdot\nabla) - (e^2/m_0c^2)A^2 - (e\hbar/2m_0c)(\boldsymbol{\sigma}\cdot\mathbf{H}) + i(e\hbar/2m_0c)(\boldsymbol{\alpha}\cdot\mathbf{E})]\,\psi = 0 \tag{3.6.23}$$
where $W$ is the eigenenergy, and $\mathbf{E}$ and $\mathbf{H}$ are the external electric and magnetic fields. The first three terms of Eq. (3.6.23) (in braces) are the ordinary Schrödinger equation; the next three terms are relativistic extensions of the Schrödinger theory, to wit: the fourth term $(2m_0c^2)^{-1}(W + |e|\varphi)^2$ represents the effect of velocity on mass; the fifth and sixth terms show the effects of the external magnetic vector potential; the seventh and eighth terms,
involving the matrices a and s defined above, are new: the seventh term accounts for the coupling of the spin magnetic dipole moment (represented by the Pauli spin matrices) with the external magnetic field H; and the eighth represents the interaction of the external electric field E with an electric dipole moment þ i(eh/2m0c)a. One-Electron Atom Solution. The eigenfunctions of the Dirac equations in a central Coulomb field (i.e., for the “Kepler36 problem,” the case A ¼ 0, j ¼ Zj ej r1, or the hydrogen atom or any other one-electron ion) can be found, in analogy to the solutions to the Schr€ odinger equation, as a product of an angular eigenfunction and a radial eigenfunction. Angular Eigenfunctions. There are two cases, j ¼ l þ 1/2 and j ¼ l 1/2. For j ¼ l þ 1/2 the solutions in spherical polar coordinates are c1 ðr; y; f; ictÞ ¼
gðrÞ ½ðl þ m þ 1=2Þ=ð2l þ 1Þ1=2 Yl;m1=2 ðy; fÞ
c2 ðr; y; f; ictÞ ¼ gðrÞ ½ðl m þ 1=2Þ=ð2l þ 1Þ1=2 Yl;mþ1=2 ðy; fÞ c3 ðr; y; f; ictÞ ¼ if ðrÞ ½ðl m þ 3=2Þ=ð2l þ 3Þ1=2 Ylþ1;m1=2 ðy; fÞ
ð3:6:24Þ
c4 ðr; y; f; ictÞ ¼ if ðrÞ ½ðl þ m þ 3=2Þ=ð2l þ 3Þ1=2 Ylþ1;mþ1=2 ðy; fÞ where the Yl,m(y, f) are the usual spherical haramonics, and f(r) and g(r) are radial eigenfunctions to be determined below; for the case j ¼ l 1/2 the solutions are slightly different: c1 ðr; y; f; ictÞ ¼
gðrÞ ½ðl m þ 1=2Þð2l þ 1Þ1 1=2 Yl;m1=2 ðy; fÞ
c2 ðr; y; f; ictÞ ¼ gðrÞ ½ðl þ m þ 1=2Þð2l þ 1Þ1 1=2 Yl;mþ1=2 ðy; fÞ c3 ðr; y; f; ictÞ ¼ if ðrÞ ½ðl þ m 1=2Þð2l 1Þ1 1=2 Yl1;m1=2 ðy; fÞ
ð3:6:25Þ
c4 ðr; y; f; ictÞ ¼ if ðrÞ ½ðl m þ 1=2Þð2l 1Þ1 1=2 Yl1;mþ1=2 ðy; fÞ and the radial eigenfunctions f(r) and g(r) to be found are the solutions to two coupled differential equations h1 c1 ðW þ Ze2 =r þ E0 Þf ½ðdg=drÞ þ ð1 þ kÞg=r ¼ 0 h1 c1 ðW þ Ze2 =r E0 Þg ½ðdf =drÞ þ ð1 kÞf =r ¼ 0
ð3:6:26Þ
where, as usual, E0 m0c2, and k is a positive or negative integer, defined as k ð j þ 1=2Þ ¼ ðl þ 1Þ when j ¼ l þ 1=2 k þð j þ 1=2Þ ¼ l
when j ¼ l 1=2
ð3:6:27Þ
so that for each k there are 2|k| eigenfunctions, with magnetic quantum numbers m ¼ (|kj 1/2), (j kj 3/2), . . ., j kj 3/2, j kj 1/2.
36
Johannes Kepler (1571–1630).
Eigenenergy. The eigenvalue for the one-electron atom is given by

$$W = E_0\,[1 + \alpha^2Z^2\,[n - (j + 1/2) + ((j + 1/2)^2 - \alpha^2Z^2)^{1/2}]^{-2}]^{-1/2} \tag{3.6.28}$$

where

$$\alpha \equiv e^2\hbar^{-1}c^{-1} = 1/137.035999\ \ (\mathrm{cgs}); \qquad \alpha \equiv e^2/2\varepsilon_0ch = 1/137.035999\ \ (\mathrm{SI}) \tag{3.6.29}$$
is the Sommerfeld$^{37}$ fine-structure constant. For light atoms (low Z) this energy is only slightly less than the rest energy of the electron (no surprise here: the rest energy of an electron is 0.511 MeV, while the binding energy for the H atom in the Schrödinger solution is only 13.6 eV, that is, 0.0000136 MeV). As before, n is the principal quantum number. For each n, there are $2n^2$ linearly independent eigenstates, that is, twice as many as for the Schrödinger equation, because of the two values of the spin orientation.

Radial Eigenfunctions. After much grief, one gets the normalized nodeless radial Dirac eigenfunctions

$$
\begin{aligned}
f(r) = {}&-[\Gamma(2\gamma+n'+1)]^{1/2}\,[\Gamma(2\gamma+1)]^{-1}\,[\Gamma(n'+1)]^{-1/2}\,[1-\varepsilon]^{1/2}\,[4N(N-\kappa)]^{-1/2}\,(2Z/Na_0)^{3/2}\\
&\times \exp(-Zr/Na_0)\,(2Zr/Na_0)^{\gamma-1}\,[\,n'F(-n'+1,\,2\gamma+1,\,2Zr/Na_0) + (N-\kappa)\,F(-n',\,2\gamma+1,\,2Zr/Na_0)\,]\\
g(r) = {}&[\Gamma(2\gamma+n'+1)]^{1/2}\,[\Gamma(2\gamma+1)]^{-1}\,[\Gamma(n'+1)]^{-1/2}\,[1+\varepsilon]^{1/2}\,[4N(N-\kappa)]^{-1/2}\,(2Z/Na_0)^{3/2}\\
&\times \exp(-Zr/Na_0)\,(2Zr/Na_0)^{\gamma-1}\,[\,-n'F(-n'+1,\,2\gamma+1,\,2Zr/Na_0) + (N-\kappa)\,F(-n',\,2\gamma+1,\,2Zr/Na_0)\,]
\end{aligned}
\tag{3.6.30}
$$

where $\Gamma(x)$ is the gamma function, $F(a, b, c)$ is a confluent hypergeometric function, $N$ is the “apparent principal quantum number,” and the other constants are either defined above or given below:

$$
\begin{aligned}
N &\equiv \{n^2 - 2n'[\,|\kappa| - (\kappa^2 - \alpha^2Z^2)^{1/2}\,]\}^{1/2}\\
\varepsilon &\equiv W/E_0\\
n' &\equiv \alpha Z\varepsilon\,[1-\varepsilon^2]^{-1/2} - \gamma\\
\gamma &\equiv (\kappa^2 - \alpha^2Z^2)^{1/2}\\
a_0 &\equiv \text{Bohr radius} \equiv \hbar^2/me^2 \quad \text{(cgs)}
\end{aligned}
\tag{3.4.34}
$$
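The eigenvalue formula, Eq. (3.6.28), is easy to check numerically. The sketch below is an illustration only: the rest energy of 510998.95 eV is an assumed input value, and the function names are hypothetical. It evaluates the binding energy $E_0 - W$ and the splitting between the $j = 1/2$ and $j = 3/2$ levels for $n = 2$.

```python
import math

ALPHA = 1.0 / 137.035999   # fine-structure constant, Eq. (3.6.29)
E0_EV = 510998.95          # electron rest energy m0*c^2 in eV (assumed value)

def dirac_energy(n, j, Z=1):
    """Total energy W of Eq. (3.6.28) in eV (rest energy included)."""
    k = j + 0.5
    gamma = math.sqrt(k * k - (ALPHA * Z) ** 2)
    return E0_EV / math.sqrt(1.0 + (ALPHA * Z / (n - k + gamma)) ** 2)

def binding_energy(n, j, Z=1):
    """Energy below the rest energy, E0 - W, in eV."""
    return E0_EV - dirac_energy(n, j, Z)

if __name__ == "__main__":
    print("1s binding energy / eV:", binding_energy(1, 0.5))
    print("n = 2 fine-structure splitting / eV:",
          binding_energy(2, 0.5) - binding_energy(2, 1.5))
```

The ground-state value should come out close to the Schrödinger result of 13.6 eV, and the $n = 2$ levels with different $j$ should differ by a few times $10^{-5}$ eV.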
The confluent hypergeometric function (or Kummer’s function of the first kind) $F(a, b, x) \equiv {}_1F_1(a, b, x) \equiv M(a, b, x)$ satisfies the recurrence relation

$$
xF(a+1,\,b+1,\,x) = bF(a+1,\,b,\,x) - bF(a,\,b,\,x)
\tag{3.6.31}
$$

and can be represented by the power series

$$
F(a, b, x) = \sum_{n=0}^{\infty} \frac{(a)_n\,x^n}{(b)_n\,n!}
\tag{3.6.32}
$$

where $(a)_n \equiv a(a+1)(a+2)\cdots(a+n-1)$, $(a)_0 \equiv 1$, and $(a)_1 \equiv a$. The Kummer function becomes a Laguerre polynomial for the following special case:

$$
F(-n,\,1,\,x) = L_n(x)
\tag{3.6.33}
$$

$F(a, b, x)$ is infinite if $b$ = negative integer; $F(a, b, x) = 1$ for $a = 0$ or $x = 0$; $F(a, b, x) = (1 - x/b)$ for $a = -1$; $F(a, b, x) = \exp(x)$ for $a = b$; $F(a, b, x) = (1 + x/b)\exp(x)$ for $a = b + 1$; $F(a, b, x) = [\exp(x) - 1]/x$ for $a = 1$ and $b = 2$. If $a$ is a negative integer, Eq. (3.6.31) is used.

37 Arnold Sommerfeld (1868–1951).
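The power series of Eq. (3.6.32) and the special cases listed above can be verified with a few lines of code. This is a minimal sketch (the function name `kummer` and the fixed number of terms are choices made here, not part of the text); it is adequate for moderate $|x|$ only.

```python
import math

def kummer(a, b, x, terms=200):
    """Partial sum of Eq. (3.6.32): sum over n of (a)_n x^n / ((b)_n n!)."""
    total = 0.0
    term = 1.0                      # the n = 0 term
    for n in range(terms):
        total += term
        if term == 0.0:             # series has terminated (a = 0, -1, -2, ...)
            break
        # ratio of consecutive terms: (a + n) x / ((b + n)(n + 1))
        term *= (a + n) * x / ((b + n) * (n + 1))
    return total

if __name__ == "__main__":
    x = 1.5
    print(kummer(1, 2, x), (math.exp(x) - 1.0) / x)   # a = 1, b = 2
    print(kummer(-2, 1, x), 1 - 2 * x + x * x / 2)    # Laguerre L_2(x)
```

For a negative integer $a$ the series terminates after $|a| + 1$ terms, which is how the Laguerre polynomial of Eq. (3.6.33) arises.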
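3.7 THE HAMILTONIAN AND EIGENFUNCTIONS FOR THE N-ELECTRON ATOM OR MOLECULE

We now move to the many-electron atom or molecule. Within the Born–Oppenheimer$^{38}$ approximation (i.e., neglect of nuclear motion) the Hamiltonian $\hat{H}$ becomes

$$
\hat{H} = -\frac{\hbar^2}{2m}\sum_{i=1}^{N}\nabla_i^2
- \frac{e^2}{4\pi\varepsilon_0}\sum_{i=1}^{N}\sum_{A=1}^{M}\frac{Z_A}{r_{iA}}
+ \frac{e^2}{4\pi\varepsilon_0}\sum_{i=2}^{N}\sum_{j=1}^{i-1}\frac{1}{r_{ij}}
+ \frac{e^2}{4\pi\varepsilon_0}\sum_{A=2}^{M}\sum_{B=1}^{A-1}\frac{Z_AZ_B}{r_{AB}}
\quad \text{(SI)}
\tag{3.7.1}
$$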
where the sums over i and j (with i > j) are the sums over the N electrons, while the sums over A and B are the sums over the M nuclei of charge $Z_A|e|$ in the atom (M = 1) or in the molecule (M > 1). The first term on the right-hand side is the easily computed kinetic energy of the electron motion (the kinetic energy of the nuclear motion is neglected here; it will return, in a reverse Born–Oppenheimer approximation, in the theory of the vibrational spectra of molecules). The second term is the nucleus–electron attraction (fairly trivial to compute). These first two terms are just the addition of N one-electron atoms, except for the attraction to several fixed nuclear centers. The fourth term, the nucleus–nucleus repulsion, is also trivially computed, since the nuclei are assumed to be motionless. The third term, however, is the “headache”: the electron–electron repulsion. While the three-body problem is also insoluble in classical mechanics, at least the stars and planets have many different masses or distances, so often many of the celestial bodies can be ignored, in first approximation, and the problem is solved by successive approximations. In contrast, all electrons in a molecule have the same mass; this is what makes applied quantum chemistry challenging.

38 J. Robert Oppenheimer (1904–1967).
39 Enrico Fermi (1901–1954).

Since electrons are fermions$^{39}$ (spin-1/2 particles), one must add one important consequence of the Pauli exclusion principle: for half-integer spin particles the solution $\Psi(1, 2, \ldots, N) = \Psi(r_1, \theta_1, \varphi_1;\ r_2, \theta_2, \varphi_2;\ \ldots;\ r_i, \theta_i, \varphi_i;\ \ldots;\ r_N, \theta_N, \varphi_N)$ to the time-independent Schrödinger equation, Eq. (3.1.15), using the Hamiltonian of Eq. (3.7.1), must be antisymmetric to the exchange of labels of any two particles. This leads naturally to the N × N Slater$^{40}$ determinant of spin orbitals (here written out explicitly for N = 4):

$$
\Psi(1, 2, \ldots, N) = \frac{1}{\sqrt{N!}}
\begin{vmatrix}
\psi_1(1)\alpha(1) & \psi_1(1)\beta(1) & \psi_2(1)\alpha(1) & \psi_2(1)\beta(1)\\
\psi_1(2)\alpha(2) & \psi_1(2)\beta(2) & \psi_2(2)\alpha(2) & \psi_2(2)\beta(2)\\
\psi_1(3)\alpha(3) & \psi_1(3)\beta(3) & \psi_2(3)\alpha(3) & \psi_2(3)\beta(3)\\
\psi_1(4)\alpha(4) & \psi_1(4)\beta(4) & \psi_2(4)\alpha(4) & \psi_2(4)\beta(4)
\end{vmatrix}
\tag{3.7.2}
$$
where the subscript label 1 is for the first one-electron spatial wavefunction ($\psi$), the label (1) in parentheses is for electron 1, and the symbols $\alpha$ and $\beta$ are the spin functions ($\alpha$ = spin-up, $\beta$ = spin-down); the second row of the Slater determinant has electron labels (2), the third row has (3), and so on. The properties of determinants ensure that the overall wavefunction is antisymmetric, as required. In practice, the full Slater determinant rarely appears explicitly in computer codes, since orbital orthogonality “kills” most of the terms that arise when one determinant is multiplied by another. If one has a muonic atom (one $\mu^-$ replaces one $e^-$), then the Pauli exclusion principle does not apply to that lepton, which is now quite distinguishable from the other electrons by its mass.

For one- or two-electron wavefunctions the space and spin parts can be factored. Assume that one of the electrons is in electronic state m, the other in electronic state n. Then one can write antisymmetrized wavefunctions of the type

$$
\psi_{S=0,\,M_S=0}(1, 2) = (1/2)\,[\psi_m(r_1)\psi_n(r_2) + \psi_n(r_1)\psi_m(r_2)]\,\{\alpha(1)\beta(2) - \beta(1)\alpha(2)\}
\tag{3.7.3}
$$

$$
\psi_{S=1,\,M_S=1}(1, 2) = (1/2)^{1/2}\,[\psi_m(r_1)\psi_n(r_2) - \psi_n(r_1)\psi_m(r_2)]\,\{\alpha(1)\alpha(2)\}
\tag{3.7.4}
$$

$$
\psi_{S=1,\,M_S=0}(1, 2) = (1/2)\,[\psi_m(r_1)\psi_n(r_2) - \psi_n(r_1)\psi_m(r_2)]\,\{\alpha(1)\beta(2) + \beta(1)\alpha(2)\}
\tag{3.7.5}
$$

$$
\psi_{S=1,\,M_S=-1}(1, 2) = (1/2)^{1/2}\,[\psi_m(r_1)\psi_n(r_2) - \psi_n(r_1)\psi_m(r_2)]\,\{\beta(1)\beta(2)\}
\tag{3.7.6}
$$

where the space part (in square brackets) is symmetric to 1 ↔ 2 particle interchange for the singlet (S = 0) eigenfunction, Eq. (3.7.3); the space part is antisymmetric for the triplet (S = 1) eigenfunctions, Eqs. (3.7.4) to (3.7.6). The reverse is true for the spin functions, enclosed in braces. When $r_1 = r_2$ the triplet product functions, Eqs. (3.7.4)–(3.7.6), vanish: the two electrons cannot be localized in the same spot; this is called a “Fermi hole.” When $r_1 = r_2$, the singlet spatial wavefunction, Eq. (3.7.3), is finite and nonzero. This consequence of Fermi–Dirac statistics is called “spin pairing”: two electrons in the singlet state behave as if they attract each other (despite the classical Coulomb repulsion), while two electrons in a triplet state behave as if they repel each other.
40
John Clarke Slater (1900–1976).
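The antisymmetry of the determinant of Eq. (3.7.2), and the vanishing of the wavefunction when two same-spin electrons coincide, can be illustrated numerically. The sketch below makes several assumptions that are not in the text: the two spatial orbitals are arbitrary one-dimensional functions chosen only for illustration, each electron is described by a pair (position, spin) with spin = +1 or −1, and all names are hypothetical.

```python
import math
import numpy as np

# Hypothetical one-dimensional spatial orbitals (illustration only).
def psi1(x):
    return math.exp(-x * x)

def psi2(x):
    return x * math.exp(-x * x)

# Spin functions: alpha is "up" (s = +1), beta is "down" (s = -1).
def alpha(s):
    return 1.0 if s > 0 else 0.0

def beta(s):
    return 1.0 if s < 0 else 0.0

# Column order as in Eq. (3.7.2): psi1*alpha, psi1*beta, psi2*alpha, psi2*beta.
SPIN_ORBITALS = [
    lambda x, s: psi1(x) * alpha(s),
    lambda x, s: psi1(x) * beta(s),
    lambda x, s: psi2(x) * alpha(s),
    lambda x, s: psi2(x) * beta(s),
]

def slater(electrons, spin_orbitals=SPIN_ORBITALS):
    """Normalized Slater determinant; row k holds all spin orbitals of electron k."""
    n = len(electrons)
    matrix = np.array([[chi(x, s) for chi in spin_orbitals]
                       for (x, s) in electrons])
    return np.linalg.det(matrix) / math.sqrt(math.factorial(n))

if __name__ == "__main__":
    e = [(0.3, +1), (-0.5, -1), (1.1, +1), (0.7, -1)]
    print(slater(e))                           # some nonzero value
    print(slater([e[2], e[1], e[0], e[3]]))    # electrons 1 and 3 exchanged
```

Exchanging two electrons exchanges two rows of the matrix and so changes the sign of the determinant; placing two same-spin electrons at the same point makes two rows identical, and the determinant vanishes.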
3.8 THE HARTREE–FOCK METHOD

There is no analytical solution to the N-particle problem in quantum mechanics (neither is there one in, say, celestial mechanics or classical electrodynamics). The Hartree$^{41}$–Fock$^{42}$ approximation to the problem is to treat each electron individually, one at a time, in the average electrical field of all the other electrons and fixed nuclei. It yields the effective Hamiltonians (or Fock Hamiltonians):

$$
\hat{F}^{\mathrm{eff}}(1) = -\frac{\hbar^2}{2m}\nabla_1^2 - \frac{e^2}{4\pi\varepsilon_0}\sum_{A=1}^{M}\frac{Z_A}{r_{1A}} + \sum_{j=1}^{N}\left[2\hat{J}_j(1) - \hat{K}_j(1)\right]
\quad \text{(SI)}
\tag{3.8.1}
$$

where the [direct] Coulomb operator $\hat{J}_j(1)$ is given by

$$
\hat{J}_j(1)\,\phi_i(1) = \frac{e^2}{4\pi\varepsilon_0}\,\phi_i(1)\int \frac{\phi_j^*(2)\,\phi_j(2)}{r_{12}}\,dV(2)
\quad (i = 1, 2, \ldots, N)
\tag{3.8.2}
$$

and the exchange [Coulomb] operator $\hat{K}_j(1)$ is given by

$$
\hat{K}_j(1)\,\phi_i(1) = \frac{e^2}{4\pi\varepsilon_0}\,\phi_j(1)\int \frac{\phi_j^*(2)\,\phi_i(2)}{r_{12}}\,dV(2)
\quad (i = 1, 2, \ldots, N)
\tag{3.8.3}
$$

where the integrations are over the coordinates of electron 2 and the volume element dV(2) of electron 2. Note that the one-electron function standing to the left of the integral sign carries a different suffix in the two cases ($\phi_i$ in Eq. (3.8.2), $\phi_j$ in Eq. (3.8.3)); also note that inside the integral of Eq. (3.8.3), electron 2 is exchanged between two different eigenstates i and j. We must solve the one-electron Schrödinger-like equation:

$$
\hat{F}^{\mathrm{eff}}(1)\,\phi_i(1) = \varepsilon_i\,\phi_i(1)
\quad (i = 1, 2, \ldots, N)
\tag{3.8.4}
$$

iteratively and cyclically, to improve the wavefunctions successively, until the molecular energy E is minimized (the variational theorem ensures that every trial energy is an upper bound to the true energy, so that a lower energy signals a better wavefunction). The orbital energy $\varepsilon_i$ is the energy of one electron in the ith energy level, embedded in the averaged electric field of all the other electrons (and nuclei).
3.9 THE ROOTHAAN–HALL MATRIX FORMULATION OF THE HARTREE–FOCK PROBLEM

In the early 1950s, Roothaan$^{43}$ [15] and Hall$^{44}$ [16] independently suggested that one expand the N molecular orbital (MO) wavefunctions $\phi_i$ in the chosen basis set as linear combinations of a complete set of B atomic basis functions ($\phi_\nu$, $\nu$ = 1, 2, . . ., B, B ≥ N), in particular, as linear combinations of B atomic orbitals (LCAO):

$$
\phi_i = \sum_{\nu=1}^{B} c_{i\nu}\,\phi_\nu
\quad (i = 1, 2, \ldots, N)
\tag{3.9.1}
$$

41 Douglas Rayner Hartree (1897–1958).
42 Vladimir Aleksandrovich Fock (1898–1974).
43 Clemens C. J. Roothaan (1918– ).
44 George Garfield Hall (1925– ).
(We shall use the convention that Greek subscripts deal with atomic orbitals, while Roman subscripts deal with molecular orbitals.) It will be assumed, by typical “Aufbau” arguments, that, of the molecular orbitals, the ones with the lowest energies (with proper concern for spin pairing if necessary) will be occupied for the ground-state configuration of the atom or molecule. If the system (atom or molecule) is in a spin singlet state, then two electrons (one with spin eigenfunction $\alpha$, the other with spin eigenfunction $\beta$) will share each of the N/2 lowest spatial molecular orbitals.

The atomic orbitals used in Eq. (3.9.1) can be: (i) Slater-type orbitals [STO: nodeless analogs of one-electron atom wavefunctions, but with an adjustable orbital exponent $\zeta$ in the factor $\exp(-\zeta r)$ set by Slater’s rules evolved for the wavefunctions of many-electron atoms, e.g., $\phi_\nu = C(xy/r^2)\exp(-\zeta r)$ for a “nodeless” $3d_{xy}$ orbital; the integrals involving these STOs over several atomic centers must be solved numerically]; or (ii) the Gaussian-type orbitals proposed by Boys$^{45}$ [GTO: the exponential is $\exp(-\zeta r^2)$; this makes all integrals evaluable analytically, but reproduces rather badly the cusp in the s wavefunction near the origin]; or (iii) piecewise polynomials, or other, more exotic functions. At present the popular ab initio program packages (GAUSSIAN, ALCHEMY, HONDO) use GTOs; the older package POLYATOM used STOs.

The LCAO assumption and the Fock equations lead to the Roothaan–Hall matrix equation, or system of B homogeneous equations in B unknowns:

$$
\sum_{\nu=1}^{B} c_{i\nu}\,(F_{\mu\nu} - \varepsilon_i S_{\mu\nu}) = 0
\quad (\mu = 1, 2, \ldots, B)
\tag{3.9.2}
$$

which are subject to the N × N orthonormalization conditions:

$$
\sum_{\mu=1}^{B}\sum_{\nu=1}^{B} c_{i\mu}\,c_{j\nu}\,S_{\mu\nu} = \delta_{ij}
\quad (i, j = 1, 2, \ldots, N)
\tag{3.9.3}
$$
Here $\delta_{ij}$ is the Kronecker delta. Since the B equations (3.9.2) are nonlinear (the Fock matrix itself depends on the coefficients), some iterative solution algorithm must be found. The $\varepsilon_i$ are the desired scalar and real eigenenergies, while the coefficients $c_{i\nu}$ are the desired (possibly complex) multipliers that yield the eigenfunctions through Eq. (3.9.1). A nontrivial solution of Eq. (3.9.2), that is, a set of multipliers $c_{i\nu}$ that are not all zero, exists if and only if the determinant of the coefficients vanishes, that is, if:

$$
\det|F_{\mu\nu} - \varepsilon_i S_{\mu\nu}| = 0
\tag{3.9.4}
$$

45 Samuel F. Boys (1911–1972).
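Equation (3.9.4) is the secular equation. In Eqs. (3.9.2)–(3.9.4) the matrix elements $S_{\mu\nu}$ are the trivially computed overlap integrals:

$$
S_{\mu\nu} = \int \phi_\mu^*(1)\,\phi_\nu(1)\,dV(1)
\quad (\mu, \nu = 1, 2, \ldots, B)
\tag{3.9.5}
$$

The Fock matrix elements $F_{\mu\nu}$ are defined as follows:

$$
F_{\mu\nu} = H_{\mu\nu} + \sum_{\lambda=1}^{B}\sum_{\sigma=1}^{B} P_{\lambda\sigma}\left[(\mu\nu|\lambda\sigma) - (1/2)(\mu\lambda|\nu\sigma)\right]
\tag{3.9.6}
$$

where $P_{\lambda\sigma}$ is the one-electron density matrix obtained from the LCAO coefficients at every step of the iteration:

$$
P_{\lambda\sigma} = 2\sum_{i=1}^{occ} c_{i\lambda}^*\,c_{i\sigma}
\quad (\lambda, \sigma = 1, 2, \ldots, B)
\tag{3.9.7}
$$

where occ = N/2 for a spin singlet ground state (N electrons occupying the N/2 lowest molecular orbitals; the factor of 2 counts the two electrons in each occupied orbital). In Eq. (3.9.6) the core Hamiltonian matrix elements $H_{\mu\nu}$ are given by the one-electron integrals:

$$
H_{\mu\nu} = \int \phi_\mu^*(1)\left[-\frac{\hbar^2}{2m}\nabla_1^2 - \frac{e^2}{4\pi\varepsilon_0}\sum_{A=1}^{M}\frac{Z_A}{r_{1A}}\right]\phi_\nu(1)\,dV(1)
\quad (\mu, \nu = 1, 2, \ldots, B)
\tag{3.9.8}
$$

The $H_{\mu\nu}$ are relatively few and are computed only once. The $(\mu\nu|\lambda\sigma)$ are the very numerous and onerous two-electron 1-, 2-, 3-, and 4-center repulsion integrals:

$$
(\mu\nu|\lambda\sigma) = \frac{e^2}{4\pi\varepsilon_0}\iint \frac{\phi_\mu^*(1)\,\phi_\nu(1)\,\phi_\lambda^*(2)\,\phi_\sigma(2)}{r_{12}}\,dV(1)\,dV(2)
\quad (\mu, \nu, \lambda, \sigma = 1, 2, \ldots, B)
\tag{3.9.9}
$$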
which are computed only once, but must be stored, to be used over and over again in each iteration cycle to construct a new Fock matrix for the next iteration. However, if the computer disk seek time is too long for these stored integrals, it may be more economical in time to recompute them in each cycle. Of the core Hamiltonian matrix elements $H_{\mu\nu}$, the terms $-(\hbar^2/2m)\int \phi_\mu^*(1)\,\nabla_1^2\,\phi_\nu(1)\,dV(1)$ are called the electron kinetic energy integrals, while the terms $-(e^2/4\pi\varepsilon_0)\sum_A^M Z_A\int \phi_\mu^*(1)\,r_{1A}^{-1}\,\phi_\nu(1)\,dV(1)$ are called the nucleus–electron attraction integrals.

The Roothaan–Hall equations, Eq. (3.9.2), can be rewritten in matrix form:

$$
\mathbf{F}\mathbf{C} = \mathbf{S}\mathbf{C}\mathbf{E}
\tag{3.9.10}
$$

where F, C, and S are B × B matrices (the coefficients of the ith molecular orbital forming the ith column of C), and E is the B × B diagonal matrix of the orbital energies. Equation (3.9.10) does not look like a “canonical” (i.e., normal and ordinary) eigenvalue problem, but it can be reduced to canonical form by a matrix similarity transformation that uses the Löwdin$^{46}$ orthogonalization matrix $\mathbf{S}^{1/2}$ and its inverse, $\mathbf{S}^{-1/2}$:

$$
\mathbf{F}_{can} = \mathbf{S}^{-1/2}\,\mathbf{F}\,\mathbf{S}^{-1/2}
\tag{3.9.11}
$$

and by

$$
\mathbf{C}_{can} = \mathbf{S}^{1/2}\,\mathbf{C}
\tag{3.9.12}
$$

from which one obtains the standard eigenvalue form:

$$
\mathbf{F}_{can}\,\mathbf{C}_{can} = \mathbf{C}_{can}\,\mathbf{E}
\tag{3.9.13}
$$
which can be diagonalized by several well-known diagonalization algorithms, for example, by the Givens$^{47}$–Householder$^{48}$ algorithm. The electronic energy $E_{ee}$ of the molecule is given by either a sum over atomic basis functions (centers):

$$
E_{ee} = \frac{1}{2}\sum_{\mu=1}^{B}\sum_{\nu=1}^{B} P_{\mu\nu}\,(F_{\mu\nu} + H_{\mu\nu})
\tag{3.9.14}
$$

or by a sum over orbitals (eigenstates):

$$
E_{ee} = 2\sum_{i=1}^{occ}\varepsilon_i - \sum_{i=1}^{occ}\sum_{j=1}^{occ}(2J_{ij} - K_{ij})
\tag{3.9.15}
$$

where it becomes obvious that the sum of the orbital energies is not the overall energy, but that it must be adjusted for the direct Coulomb energies (or Coulomb energies) $J_{ij}$:

$$
J_{ij} = \frac{e^2}{4\pi\varepsilon_0}\iint \frac{\phi_i^*(1)\,\phi_i(1)\,\phi_j^*(2)\,\phi_j(2)}{r_{12}}\,dV(1)\,dV(2)
\quad (i, j = 1, 2, \ldots, occ)
\tag{3.9.16}
$$

and the exchange Coulomb energies (or exchange energies) $K_{ij}$:

$$
K_{ij} = \frac{e^2}{4\pi\varepsilon_0}\iint \frac{\phi_i^*(1)\,\phi_j(1)\,\phi_j^*(2)\,\phi_i(2)}{r_{12}}\,dV(1)\,dV(2)
\quad (i, j = 1, 2, \ldots, occ)
\tag{3.9.17}
$$

To get the Hartree–Fock energy of the molecule $E_{HF}$, one must add to $E_{ee}$ the trivially obtained (classical Coulomb) nuclear repulsion energy $E_{NN}$:

$$
E_{HF} = E_{ee} + E_{NN}
\tag{3.9.18}
$$

where, as in the fourth term of Eq. (3.7.1), each pair of nuclei is counted once:

$$
E_{NN} = \frac{e^2}{4\pi\varepsilon_0}\sum_{A=2}^{M}\sum_{A'=1}^{A-1}\frac{Z_AZ_{A'}}{r_{AA'}}
\tag{3.9.19}
$$

46 Per-Olov Löwdin (1916–2000).
47 James Wallace Givens (1910–1993).
48 Alston Scott Householder (1904–1993).
Koopmans’$^{49}$ 1933 theorem states [17] that the negative of the highest occupied molecular orbital energy, $-\varepsilon_{occ}$ [see Eq. (3.9.15)], is approximately equal to the first ionization energy of the system, while the negative of the lowest unoccupied orbital energy, $-\varepsilon_{occ+1}$, approximates the electron affinity of the system. Koopmans’ theorem is only approximately correct; a better estimate of ionization energy and electron affinity is obtained as the difference between two independently calculated Hartree–Fock energies $E_{HF}$, that of the parent system and that of the product system, where the system has either gained or lost one electron; the reason is that the other orbital energies must readjust when one electron is added or removed. [Note: After his single enduring contribution to quantum chemistry, Koopmans started a long and illustrious career in economics, crowned by the Sveriges Riksbank Prize in Economic Sciences in Memory of Alfred Nobel in 1975.]

The dipole moment and also the other components of what is called the Mulliken$^{50}$ population analysis (atom-in-molecule charges, bond orders, etc.) are obtained from the final one-electron density matrix $P_{\mu\nu}$, Eq. (3.9.7). Atom-in-molecule partial charges are not quantum-mechanical “observables” that can be measured directly, but they have great importance in chemistry, because they influence (a) the “chemical shifts” measured in nuclear magnetic resonance, X-ray photoelectron spectroscopy, Mössbauer$^{51}$ and nuclear quadrupole resonance spectroscopy and (b) chemical intuition about reactivities, and so on. The electrostatic potential at some distance from a molecule is a quantum-mechanical observable, and can be evaluated at several points in space. Then a classical set of partial charges, localized on atoms, on lone pairs, or along chemical bonds, can be optimized to reproduce these electrostatic potentials, and yield “potential-derived charges”; these differ considerably from the charges obtained by a Mulliken population analysis.
49 Tjalling Charles Koopmans (1910–1985).
50 Robert Sanderson Mulliken (1896–1986).
51 Rudolf Ludwig Mössbauer (1929–2011).

3.10 PRACTICAL IMPLEMENTATIONS OF THE HARTREE–FOCK METHOD

In practice, the present GTO ab initio programs work as follows:

1. User chooses a basis set of B Gaussians (STO-3G means that a prechosen linear combination of 3 Gaussians is used to represent each core or valence STO function; 6-21G means that 6 Gaussians are for the “core” (i.e., inner-shell) electrons, 2 Gaussians are for the “valence” electrons, and 1 Gaussian is for a more diffuse valence electron wavefunction, etc.).
2. User inputs an initial molecular geometry.
3. All integrals [Eqs. (3.9.5), (3.9.8), and (3.9.9)] are computed and stored.
4. An initial guess is made for some LCAO coefficients $c_{i\nu}$, and an initial Fock matrix is constructed from that guess; the transformations Eqs. (3.9.11) and (3.9.12) are carried out, and the matrix $\mathbf{F}_{can}$ is diagonalized.
5. The new eigenvectors and eigenenergies are used to construct a new and better Fock matrix, and the program returns to step 4 in successive cycles (the stored integrals of step 3 are reused), until both the electronic energy $E_{ee}$ and the density matrix $P_{\mu\nu}$ have converged to satisfaction (usually $10^{-5}$ eV or so for the energy). The variational theorem guarantees that the computed energy never falls below the exact energy, so that a lower energy signals an improved wavefunction; in well-behaved cases the energy becomes more negative in each cycle of iteration. The energy is said to have fully converged when the monotonic decrease in energy with every cycle is followed by a “fibrillation” of energy shifts (both positive and negative) acceptably smaller than a preset quantity.
6. The final energy is computed, and a dipole moment and Mulliken population analysis is performed using the final one-electron density matrix.

For odd-electron atoms or molecules (“open-shell systems”), one has two choices. The first choice is to treat the last electron only in a half-filled MO; this “half-electron method” yields “worse” results. The second choice is to let the “spin-up” or $\alpha$ electrons be treated separately from the “spin-down” or $\beta$ electrons: this is the unrestricted Hartree–Fock, or UHF, method. In UHF, there are two Fock matrices, two eigenvalue matrices, and so on, and the energy levels are all singly occupied or empty; the $\alpha$ and $\beta$ electron energies will be different even for the core and lower orbitals. The UHF eigenfunctions will be eigenfunctions of $\hat{S}_z$ but not of $\hat{S}^2$; for example, for spin-1/2 (S = 1/2) systems, the eigenvalue of $\hat{S}_z$ will be $\hbar/2$, but the eigenfunction will also be contaminated by contributions from the $S_z$ = 1/2 substates of S = 3/2, of S = 5/2, and so on.
Geometry optimization is customary at the end of most HF calculations (including the refinements described below): One wants to get the best possible “theoretical geometry” for a molecule, which hopefully corresponds closely to the “experimental geometry,” if known. This requires a series of many SCF calculations, followed by small artificially imposed incremental geometry changes, the computation of the energy gradient along the change axis, and the decision about which changes to abandon, curtail, or accept.
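The orthogonalization and diagonalization of step 4 [Eqs. (3.9.11)–(3.9.13)] can be sketched in a few lines. The example below is an illustration under stated assumptions, not a piece of any actual ab initio package: it assumes real symmetric F and S, stores the MO coefficients as columns of C, and uses a made-up 2 × 2 Fock and overlap matrix of roughly the size met for a two-orbital problem in atomic units.

```python
import numpy as np

def lowdin_solve(F, S):
    """Solve F C = S C E through the Loewdin matrix S^(-1/2)."""
    s_vals, U = np.linalg.eigh(S)                    # S = U diag(s) U^T
    S_inv_half = U @ np.diag(s_vals ** -0.5) @ U.T   # S^(-1/2)
    F_can = S_inv_half @ F @ S_inv_half              # Eq. (3.9.11)
    eps, C_can = np.linalg.eigh(F_can)               # Eq. (3.9.13)
    C = S_inv_half @ C_can                           # inverse of Eq. (3.9.12)
    return eps, C

if __name__ == "__main__":
    # Illustrative numbers only (hypothetical Fock and overlap matrices).
    F = np.array([[-0.3655, -0.5939],
                  [-0.5939, -0.3655]])
    S = np.array([[1.0000, 0.6593],
                  [0.6593, 1.0000]])
    eps, C = lowdin_solve(F, S)
    print("orbital energies:", eps)
    print("C^T S C =", C.T @ S @ C)    # should be the unit matrix, cf. Eq. (3.9.3)
```

In a real SCF program this routine would sit inside the iteration loop, with F rebuilt from the new density matrix after every call.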
3.11 MOLECULAR MECHANICS At the very beginning of a study, it is very convenient to perform a purely classical molecular mechanics (MM) energy and geometry minimization procedure: MM replaces the Hamiltonians by purely classical potential energies for (i) formal electrostatics for charged atoms in molecules, (ii) parameters for Hooke’s law classical vibrations of chemical bonds, (iii) parameters for bond angle changes, and (iv) parameters for twist (dihedral) angle changes. These MM programs (e.g., MM3) compute reasonable geometries very quickly, which are then valid input to more serious quantum chemistry calculations (ab initio HF, semiempirical, or density functional). However, the computed MM3 energies and dipole moments are worthless numbers, not to be taken at all seriously.
3.12 CONFIGURATION INTERACTION [18]

Since the problem of correlated motion of the N electrons cannot be dealt with adequately by the Hartree–Fock method, even in the limit of an infinite basis set (i.e., B → ∞), it is expected that the ground-state electronic energy and the estimates of the first few excited-state energy levels of even small molecules can be several electron volts away from thermodynamic or spectroscopic reality; the difference $E_{exp} - E_{HF}$ is often called the correlation energy $E_{corr}$:

$$
E_{corr} = E_{exp} - E_{HF}
\tag{3.12.1}
$$

To reduce this error, or difference, to a tolerably small number, one starts from a Hartree–Fock calculation of the ground state, which yields the B ground-state eigenfunctions ($|\psi_i\rangle$, i = 1, 2, . . ., B) for the Hamiltonian of Eq. (3.7.1) as $\psi_1(1)\alpha(1)$, $\psi_2(2)\alpha(2)$, and so on; these solutions can be written as the ground-state Slater determinant $\Psi_0$ [Eq. (3.7.2)]

$$
\Psi_0(1, 2, \ldots, N) = (1/N!)^{1/2}\,|\varphi_1\,\varphi_2\,\varphi_3 \ldots \varphi_N|
\tag{3.12.2}
$$
(electron labels and spin functions have been omitted here for simplicity and are absorbed into the N occupied spin orbitals $\varphi_i$ for the system of N electrons). Of course, the basis set size B must be at least as large as N (B ≥ N). The Hartree–Fock solution creates not only the occupied molecular orbitals ($\varphi_i$, i = 1, 2, . . ., N), but also a large number of virtual or unoccupied or excited-state molecular orbitals ($\varphi_j$; j = N + 1, N + 2, . . ., B) which have been handled “unsymmetrically” in the Hartree–Fock problem, since they do not contribute to the density matrix P, except through their orthogonality to the occupied orbitals.

The configuration interaction (CI) method consists of considering new Slater determinants, in which the electron in the ith occupied MO $\varphi_i$ is promoted into the ath virtual MO $\varphi_a$ (the orbital $\varphi_i$ is replaced by the orbital $\varphi_a$ in the density matrix), the jth MO $\varphi_j$ is replaced by the bth virtual MO $\varphi_b$, and so on, to construct excited-state Slater determinants written as $\Psi_{ijk\ldots}^{abc\ldots}$ (the restriction i < j < k < . . . and a < b < c < . . . ensures that each unique excited-state configuration is counted only once). One can then write the “ultimate” wavefunction as

$$
\Psi(1, 2, \ldots, N) = a_0\Psi_0 + \sum_{s=1}^{N} a_s\Psi_s
\tag{3.12.3}
$$
where $\Psi_1$ stands for all the $N^2/4$ singly substituted excited-state Slater determinants $\Psi_i^a$, and $\Psi_2$ stands for all the doubly substituted excited-state Slater determinants $\Psi_{ij}^{ab}$, and so on, all the way up to N-substituted Slater determinants $\Psi_{ijk\ldots N}^{abc\ldots N}$. Then the goal of the CI calculation is to determine how large or small are the coefficients $a_s$ which mix the excited-state MOs with the occupied MOs. Once again, one determines these coefficients by the linear variation method, which leads to the matrix equation

$$
\sum_{s=1}^{N} a_{st}\,(H_{st} - E_t\,\delta_{st}) = 0
\quad (t = 0, 1, 2, \ldots, N)
\tag{3.12.4}
$$
or

$$
\mathbf{H}\mathbf{A} = E\mathbf{A}
\tag{3.12.5}
$$

where $H_{st}$ is the configurational matrix element:

$$
H_{st} = \iint\cdots\int \Psi_s^*\,\hat{H}\,\Psi_t\,dV(1)\,dV(2)\ldots dV(N)
\tag{3.12.6}
$$

and $\hat{H}$ is the full Hamiltonian, Eq. (3.7.1). The Kronecker$^{52}$ delta $\delta_{st}$ is used, since the $\Psi_s$ are mutually orthogonal. The lowest root of Eq. (3.12.4) is the corrected ground-state energy. By Brillouin’s$^{53}$ theorem, $H_{0s} = 0$ for the singly substituted determinants. The total number of determinants of the type $\Psi_s$ is, alas, $(2B)!/[N!\,(2B - N)!]$, that is, a very large number, so that configuration interaction singles (CIS), doubles (CID), or singles and doubles (CISD)

$$
\Psi_{CISD} = a_0\Psi_0 + \sum_{i=1}^{occ}\sum_{a=1}^{virt} a_i^a\,\Psi_i^a + \sum_{i=1}^{occ}\sum_{a=1}^{virt}\sum_{j=1}^{occ}\sum_{b=1}^{virt} a_{ij}^{ab}\,\Psi_{ij}^{ab}
\tag{3.12.7}
$$

is all that is practical within anybody’s computer budget.
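How quickly the number of determinants grows is easy to see by direct counting. The sketch below is a back-of-the-envelope illustration with hypothetical function names: it counts determinants built from N electrons in 2B spin orbitals, ignoring any reduction by spin or spatial symmetry, and compares the full count with the singles and doubles retained in CISD.

```python
import math

def n_full_ci(N, B):
    """All ways of placing N electrons in 2B spin orbitals: (2B)!/[N!(2B-N)!]."""
    return math.comb(2 * B, N)

def n_singles(N, B):
    """One of N occupied spin orbitals replaced by one of 2B-N virtual ones."""
    return N * (2 * B - N)

def n_doubles(N, B):
    """Two occupied spin orbitals replaced by two virtual ones."""
    return math.comb(N, 2) * math.comb(2 * B - N, 2)

if __name__ == "__main__":
    N, B = 10, 25     # a hypothetical 10-electron molecule in 25 basis functions
    print("full CI:", n_full_ci(N, B))
    print("CISD   :", 1 + n_singles(N, B) + n_doubles(N, B))
```

For a two-electron system in two basis functions the three classes (reference, singles, doubles) already exhaust the full list of six determinants; for larger systems CISD is a tiny fraction of the full count.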
3.13 MØLLER–PLESSET (MP) TIME-INDEPENDENT PERTURBATION THEORY

Another approach to the problem of computing the electron correlation energy is the Møller$^{54}$–Plesset$^{55}$ (MP) perturbation theory (which is philosophically akin to the many-body perturbation theory of solid-state physics). The mechanics are the conventional Rayleigh–Schrödinger perturbation theory: One introduces a generalized electronic Hamiltonian $\hat{H}_\lambda$, where

$$
\hat{H}_\lambda = \hat{H}^{(0)} + \lambda\hat{V}
\tag{3.13.1}
$$

or, in more general terms:

$$
\hat{H}_\lambda = \hat{H}^{(0)} + \lambda\hat{H}^{(1)} + \lambda^2\hat{H}^{(2)} + \lambda^3\hat{H}^{(3)} + \cdots
\tag{3.13.2}
$$

where $\hat{H}^{(0)}$ is taken to be the simple sum of the one-electron Fock operators, Eqs. (3.8.1)–(3.8.3), while the perturbation $\hat{V}$ is the difference between the “correct” Hamiltonian $\hat{H}$ and $\hat{H}^{(0)}$; thus if $\lambda = 0$, then $\hat{H}_\lambda = \hat{H}^{(0)}$, while if $\lambda = 1$, then the interaction is fully “turned on”: $\hat{H}_\lambda = \hat{H}^{(0)} + \hat{H}^{(1)} + \hat{H}^{(2)} + \hat{H}^{(3)} + \cdots = \hat{H}^{(0)} + \hat{V}$.

52 Leopold Kronecker (1823–1891).
53 Leon Brillouin (1889–1969).
54 Christian Møller (1904–1980).
55 Milton Spinoza Plesset (1908–1991).

As in the CI problem, the eigenfunctions of $\hat{H}^{(0)}$ are the Slater determinants $\Psi_s$, with eigenvalues $E_s$, which are the simple sums of those one-electron Hartree–Fock energies $\varepsilon_i$ which are occupied in the state described by the wavefunction $\Psi_s$. Of course, s = 0 represents the ground state, while s > 0 represents all singly excited states, just as in the singly excited configuration interaction case. These “zeroth-order functions” $\Psi_s$ and energies $E_s$ are relabeled, for simplicity in the perturbation expansion, as $\Psi_s^{(0)}$ and $E^{(0)}$, respectively:

$$
E^{(0)} = \sum_{i=1}^{occ}\varepsilon_i
\tag{3.13.3}
$$

Then the full perturbation solution is written as

$$
\Psi_\lambda = \Psi^{(0)} + \lambda\Psi^{(1)} + \lambda^2\Psi^{(2)} + \lambda^3\Psi^{(3)} + \cdots
\tag{3.13.4}
$$

$$
E_\lambda = E^{(0)} + \lambda E^{(1)} + \lambda^2E^{(2)} + \lambda^3E^{(3)} + \cdots
\tag{3.13.5}
$$
and like powers of $\lambda$ are collected together; the result must obviously hold for arbitrary values of $\lambda$, whence one obtains first-order, second-order, and so on, equations. In practice, computations are truncated to second order (MP2) or fourth order (MP4). One can show to first order in energy that

$$
E^{(0)} + E^{(1)} = \iint\cdots\int \Psi^{(0)*}\,\hat{H}\,\Psi^{(0)}\,dV(1)\,dV(2)\ldots dV(N)
\tag{3.13.6}
$$

where $\Psi^{(0)}$ is the simple one-determinant Hartree–Fock wavefunction. Therefore the MP1 or “first-order” energy [Eq. (3.13.5) truncated for $\lambda$ exponents higher than one] is just the Hartree–Fock energy: the energy $E^{(0)}$ of the unperturbed state plus the energy correction

$$
E^{(1)} = \iint\cdots\int \Psi^{(0)*}\,\hat{V}\,\Psi^{(0)}\,dV(1)\,dV(2)\ldots dV(N)
\tag{3.13.7}
$$

The first-order contribution to the wavefunction is

$$
\Psi^{(1)} = \sum_{s>0} V_{0s}\,\Psi_s/(E_0 - E_s)
\tag{3.13.8}
$$

where the matrix elements $V_{0s}$ mix the Hartree–Fock ground-state function $\Psi^{(0)}$ with the excited determinants $\Psi_s$:

$$
V_{0s} = \iint\cdots\int \Psi^{(0)*}\,\hat{V}\,\Psi_s\,dV(1)\,dV(2)\ldots dV(N)
\tag{3.13.9}
$$

The second-order (MP2) correction to the energy is

$$
E^{(2)} = \sum_{s\neq 0}^{D} |V_{0s}|^2/(E_0 - E_s)
\tag{3.13.10}
$$

MP2 and MP4 computations are more economical than the corresponding CI calculation.
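The structure of Eqs. (3.13.7) and (3.13.10) can be tried out on a small matrix model. The sketch below is generic Rayleigh–Schrödinger perturbation theory on a hypothetical 3 × 3 Hamiltonian, not a Møller–Plesset calculation on a molecule: the unperturbed energies and the perturbation matrix are invented numbers, and the "exact" answer is simply the lowest eigenvalue of the full matrix.

```python
import numpy as np

def rs_corrections(E_unpert, V):
    """Zeroth-, first-, and second-order energies of state 0.

    E_unpert : unperturbed energies E_s (state 0 is the ground state)
    V        : perturbation matrix with elements V_st in the unperturbed basis
    """
    E0 = E_unpert[0]
    E1 = V[0, 0]                                        # cf. Eq. (3.13.7)
    E2 = sum(abs(V[0, s]) ** 2 / (E0 - E_unpert[s])     # cf. Eq. (3.13.10)
             for s in range(1, len(E_unpert)))
    return E0, E1, E2

# Hypothetical model: three nondegenerate levels and a weak symmetric perturbation.
E_UNPERT = np.array([-1.0, 0.5, 1.2])
V_MODEL = 0.05 * np.array([[0.2, 1.0, 0.5],
                           [1.0, -0.1, 0.3],
                           [0.5, 0.3, 0.4]])

if __name__ == "__main__":
    E0, E1, E2 = rs_corrections(E_UNPERT, V_MODEL)
    exact = np.linalg.eigvalsh(np.diag(E_UNPERT) + V_MODEL)[0]
    print("through first order :", E0 + E1)
    print("through second order:", E0 + E1 + E2)
    print("exact lowest root   :", exact)
```

Because every denominator $E_0 - E_s$ is negative for the ground state, the second-order correction always lowers the energy; in this weak-coupling model it removes most of the first-order error.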
When the unperturbed energy levels are n-fold degenerate (e.g., n valence-state unperturbed atomic C wavefunctions for the n C atoms in an alkane molecule $\mathrm{C}_n\mathrm{H}_{2n+2}$), then caution must be used to avoid divisions by zero in equations such as Eq. (3.13.8). The “fix” is to choose the degenerate wavefunctions more carefully, adding a suffix i for the ith member of an n-fold degenerate set:

$$
\hat{H}^{0}\,\Psi_i^{0} = E^{(0)}\,\Psi_i^{0}
\quad (i = 1, 2, \ldots, n)
\tag{3.13.11}
$$

where the perturbation might break the degeneracy. We rewrite the zero-order wavefunction as a linear combination of these “correct” functions, but with unknown scalar coefficients $c_{ik}$ which will be determined later:

$$
\Psi_i^{(0)} = \sum_{k=1}^{n} c_{ik}\,\Psi_k^{0}
\tag{3.13.12}
$$

which merge into the perturbed wavefunctions:

$$
\Psi_i = \Psi_i^{(0)} + \lambda\Psi_i^{(1)} + \lambda^2\Psi_i^{(2)} + \cdots
\tag{3.13.13}
$$

Collecting the terms linear in $\lambda$ gives

$$
[\hat{H}^{(0)} - E^{(0)}]\,\Psi_i^{(1)} = [E_i^{(1)} - \hat{H}^{(1)}]\,\Psi_i^{(0)}
\tag{3.13.14}
$$
We assume that the first-order wavefunction is equal to some sum over the degenerate zeroth-order wavefunctions plus a sum over the new set of wavefunctions:

$$
\Psi_i^{(1)} = \Psi_i^{(0)} + \sum_n a_n\Psi_n
\tag{3.13.15}
$$

This expansion, substituted into Eq. (3.13.14), then premultiplied by $\Psi_l^{(0)*}$ and integrated, produces, thanks to the orthogonality of $\Psi_l^{(0)}$ and $\Psi_n$, the secular equations:

$$
\sum_{k=1}^{n} c_{ik}\,[H_{kl}^{(1)} - E_i^{(1)}S_{kl}] = 0
\quad (l = 1, 2, \ldots, n)
\tag{3.13.16}
$$

where the overlap integral is

$$
S_{kl} = \int \Psi_k^{(0)*}\,\Psi_l^{(0)}\,dV(1)
\quad (k, l = 1, 2, \ldots, n)
\tag{3.13.17}
$$

and the energy integral is

$$
H_{kl}^{(1)} = \int \Psi_k^{(0)*}\,\hat{H}^{(1)}\,\Psi_l^{(0)}\,dV(1)
\quad (k, l = 1, 2, \ldots, n)
$$

As usual, the condition for a nontrivial solution to these secular equations is the vanishing of the secular determinant:

$$
\det|H_{kl}^{(1)} - E_i^{(1)}S_{kl}| = 0
\tag{3.13.18}
$$
When computing a molecular energy, or other energy requiring high levels of complexity, we need some sweet assurance that the calculations will converge monotonically to the “correct result.” This assurance is provided by the variational theorem, which says that the Rayleigh$^{56}$ ratio:

$$
E_{trial} = \int \Psi_{trial}^*\,\hat{H}\,\Psi_{trial}\,dV \Big/ \int \Psi_{trial}^*\,\Psi_{trial}\,dV
\tag{3.13.19}
$$

is such that

$$
E_{trial} \geq E_{true} = \text{the “true” result}
\tag{3.13.20}
$$

that is, the correct energy is always approached (asymptotically) from above. The Rayleigh–Ritz$^{57}$ variational method [19,20] uses an expansion:

$$
\Psi_{trial} = \sum_n a_n\Psi_n
\tag{3.13.21}
$$

and seeks the best coefficients $a_n$ consistent with the variational theorem. This assumes $\delta E_{trial} = 0$, which will be true if and only if $(\partial E_{trial}/\partial a_n) = 0$ for all n = 1, 2, . . ., N; in turn, this is true if and only if the secular equation holds:

$$
\sum_{k=1}^{N} a_k\,[H_{kl} - ES_{kl}] = 0
\quad (l = 1, 2, \ldots, N)
\tag{3.13.22}
$$

which, finally, has a nontrivial solution if the secular determinant vanishes:

$$
\det|H_{kl} - ES_{kl}| = 0
\tag{3.13.23}
$$

The N “roots” $E_i$, i = 1, 2, . . ., N, for the secular determinant (3.13.23) are substituted, one by one, into the system of secular equations (3.13.22), to find the coefficients $a_k$ in Eq. (3.13.21). This yields the MP result. One further level of perturbation theory yields MP2; two more yield MP4.

PROBLEM 3.13.1. Prove Eq. (3.13.20) by expanding the trial wavefunction $\Psi_{trial}$ in terms of a complete orthonormal basis set of the true eigenfunctions, $\Psi_{trial} = \sum_n a_n\Psi_n$, where $\hat{H}\Psi_n = E_n\Psi_n$.

PROBLEM 3.13.2. Prove Eq. (3.13.22) by substituting Eq. (3.13.21) into Eq. (3.13.19) [11].
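The content of Eqs. (3.13.19)–(3.13.23) can be explored numerically on a finite matrix model. In the sketch below (an illustration with hypothetical names, in which a random real symmetric matrix stands in for the Hamiltonian and random vectors stand in for trial functions), the Rayleigh ratio of any trial vector lies above the lowest eigenvalue, and the lowest root of the secular determinant in a nonorthogonal basis approaches it from above as the basis grows.

```python
import numpy as np

def rayleigh_ratio(H, psi):
    """Eq. (3.13.19) for a real trial vector psi."""
    return (psi @ H @ psi) / (psi @ psi)

def ritz_roots(H, basis):
    """Roots E of det|H_kl - E S_kl| = 0, Eq. (3.13.23).

    The columns of `basis` are the (generally nonorthogonal) expansion functions.
    """
    H_sub = basis.T @ H @ basis            # H_kl
    S_sub = basis.T @ basis                # S_kl
    s_vals, U = np.linalg.eigh(S_sub)
    X = U @ np.diag(s_vals ** -0.5) @ U.T  # S^(-1/2), as in Section 3.9
    return np.linalg.eigvalsh(X @ H_sub @ X)

# Hypothetical 6 x 6 model Hamiltonian (random, real, symmetric).
rng = np.random.default_rng(0)
A = rng.normal(size=(6, 6))
H_MODEL = (A + A.T) / 2
E_TRUE = np.linalg.eigvalsh(H_MODEL)[0]

if __name__ == "__main__":
    print("true lowest eigenvalue:", E_TRUE)
    for size in (1, 3, 6):
        basis = rng.normal(size=(6, size))
        print(size, "basis functions:", ritz_roots(H_MODEL, basis)[0])
```

With a single basis function the Ritz root is just the Rayleigh ratio; with as many independent functions as the dimension of the space, the true eigenvalue is recovered.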
3.14 THE COUPLED CLUSTER METHOD Coupled cluster theory uses a fundamental equation: ^ 0 c ¼ expðTÞF
56 57
John William Strutt, third baron Rayleigh (1842–1919). Walther Ritz (1878–1909).
ð3:14:1Þ
170
3
QUA NT UM M ECH AN ICS
where $\psi$ is the desired exact nonrelativistic ground-state molecular electronic wavefunction, $\Phi_0$ is the ground-state Hartree–Fock wavefunction, and the exponential operator $\exp(\hat{T})$ is used as a Taylor expansion:
$$\exp(\hat{T}) = 1 + \hat{T} + (1/2!)\,\hat{T}^2 + (1/3!)\,\hat{T}^3 + \cdots = \sum_{k=0}^{\infty} (1/k!)\,\hat{T}^k \tag{3.14.2}$$
where, in turn, $\hat{T}$ is the cluster operator (not the kinetic energy), defined in detail as follows:
$$\hat{T} = \hat{T}_1 + \hat{T}_2 + \cdots + \hat{T}_n \tag{3.14.3}$$
where $\hat{T}_1$ is the one-particle excitation operator:
$$\hat{T}_1 \Phi_0 = \sum_{a=n+1}^{\infty} \sum_{i=1}^{n} t_i^a\,\Phi_i^a \tag{3.14.4}$$
involving all modified Slater determinants $\Phi_i^a$ where the occupied orbital i is replaced by unoccupied orbital a, and the coefficients (or amplitudes) $t_i^a$ must now be sought; similarly, $\hat{T}_2$ is the two-particle excitation operator:
$$\hat{T}_2 \Phi_0 = \sum_{b=a+1}^{\infty} \sum_{a=n+1}^{\infty} \sum_{j=i+1}^{n} \sum_{i=1}^{n-1} t_{ij}^{ab}\,\Phi_{ij}^{ab} \tag{3.14.5}$$
involving all modified Slater determinants $\Phi_{ij}^{ab}$ where the occupied orbitals i and j are replaced by unoccupied orbitals a and b, and the coefficients $t_{ij}^{ab}$ must now be sought. To find all these amplitudes, one must form Dirac brackets and proceed.
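To see what the exponential form buys, it may help to write out the expansion for the common truncation $\hat{T} \approx \hat{T}_1 + \hat{T}_2$ (an assumption made here for illustration; the text above does not single out a truncation level). Because excitation operators commute with one another, the Taylor series of Eq. (3.14.2) can be grouped by excitation level:

$$e^{\hat{T}_1 + \hat{T}_2}\,\Phi_0 = \Big[\,1 + \hat{T}_1 + \big(\hat{T}_2 + \tfrac{1}{2}\hat{T}_1^2\big) + \big(\hat{T}_1\hat{T}_2 + \tfrac{1}{6}\hat{T}_1^3\big) + \big(\tfrac{1}{2}\hat{T}_2^2 + \tfrac{1}{2}\hat{T}_1^2\hat{T}_2 + \tfrac{1}{24}\hat{T}_1^4\big) + \cdots\Big]\Phi_0$$

The successive brackets collect single, double, triple, and quadruple excitations. Even with only $t_i^a$ and $t_{ij}^{ab}$ as unknowns, products such as $\tfrac{1}{2}\hat{T}_2^2$ bring in higher excitations without new amplitudes.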
3.15 THE HÜCKEL PROBLEM, OR SIMPLE HÜCKEL MOLECULAR ORBITAL THEORY (SHMO)
In 1931 Hückel58 proposed [21] an ad hoc MO theory, valid for the N "π" electrons of planar conjugated aromatic hydrocarbons, which is disarmingly effective, considering its drastically simple assumptions:
1. Consider only one $2p_z$ AO per aromatic atom. (Ignore all hydrogens, core electrons, "sigma bond" electrons, lone pairs, etc.)
2. Assume that π electrons on the same site (= atom) repel each other with energy α (the one-site Coulomb repulsion integral, usually positive).
3. Assume that π electrons on different sites interact if and only if the sites are adjacent, that is, if they are separated by a single covalent or "σ" bond. In that case they interact by a "bond integral" or "Mulliken resonance integral" β (usually negative).
4. Assume that all π electron wavefunctions are orthonormal, that is, that the overlap integrals involved in the secular equation Eq. (3.8.4) are replaced by Kronecker deltas.
58 Erich Armand Arthur Joseph Hückel (1896–1980).
These four approximations lead to the simplified matrix equation:
$$\sum_{n=1}^{N} c_{in}\,[(\alpha_{mm'} + \beta\,\delta_{mm'}) - \varepsilon_i\,\delta_{mm'}] = 0 \qquad (m = 1, 2, \ldots, N;\ m' = \text{atom bonded to } m) \tag{3.15.1}$$
which has a nontrivial solution if the corresponding secular determinant vanishes. For instance, for butadiene the secular equation is
$$\begin{vmatrix} x & 1 & 0 & 0 \\ 1 & x & 1 & 0 \\ 0 & 1 & x & 1 \\ 0 & 0 & 1 & x \end{vmatrix} = 0 \tag{3.15.2}$$
where x is defined as
$$x \equiv (\alpha - \varepsilon_i)/\beta \tag{3.15.3}$$
This secular determinant is a continuant,
$$\prod_{j=1}^{4} \big(x - 2\cos(j\pi/5)\big) \qquad (j = 1, 2, 3, 4) \tag{3.15.4}$$
whose solutions are
$$x = -1.618,\ -0.618,\ 0.618,\ 1.618 \tag{3.15.5}$$
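This type of solution is valid for all linear polyenes (see below) and is very similar to the solution to the tight-binding Hamiltonian for a linear chain (see Section 8.6). The LCAO coefficients $c_{in}$ for butadiene are defined by
$$x c_{i1} + c_{i2} = 0; \quad c_{i1} + x c_{i2} + c_{i3} = 0; \quad c_{i2} + x c_{i3} + c_{i4} = 0; \quad c_{i3} + x c_{i4} = 0 \tag{3.15.6}$$
The solutions x to Eq. (3.15.4) are then substituted, one at a time, to obtain the four wavefunctions $\varphi_i$ for butadiene in terms of the four $2p_z$ atomic orbitals $\phi_1$ through $\phi_4$ centered on carbon atoms 1 through 4:
$$\begin{aligned}
\varphi_1 &= 0.372\phi_1 + 0.602\phi_2 + 0.602\phi_3 + 0.372\phi_4 && \text{for } \varepsilon_1 = \alpha + 1.618\beta \\
\varphi_2 &= 0.602\phi_1 + 0.372\phi_2 - 0.372\phi_3 - 0.602\phi_4 && \text{for } \varepsilon_2 = \alpha + 0.618\beta \ \text{(HOMO)} \\
\varphi_3 &= 0.602\phi_1 - 0.372\phi_2 - 0.372\phi_3 + 0.602\phi_4 && \text{for } \varepsilon_3 = \alpha - 0.618\beta \ \text{(LUMO)} \\
\varphi_4 &= 0.372\phi_1 - 0.602\phi_2 + 0.602\phi_3 - 0.372\phi_4 && \text{for } \varepsilon_4 = \alpha - 1.618\beta
\end{aligned} \tag{3.15.7}$$
For the general linear polyene with n carbon atoms, the eigenenergies are
$$\varepsilon_j = \alpha + 2\beta\cos\big(j\pi/(n+1)\big) \qquad (j = 1, 2, 3, \ldots, n) \tag{3.15.8}$$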
and the HMO coefficients are
$$c_{jr} = \big(2/(n+1)\big)^{1/2} \sin\big(jr\pi/(n+1)\big) \tag{3.15.9}$$
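The closed forms (3.15.8) and (3.15.9) can be checked by direct diagonalization. The sketch below is an illustration under stated assumptions: α = 0 and β = −1 are arbitrary units chosen for convenience, and the function names `huckel_linear` and `closed_form_energies` are hypothetical.

```python
import numpy as np

def huckel_linear(n, alpha=0.0, beta=-1.0):
    """Diagonalize the Hueckel matrix of a linear polyene with n carbon atoms."""
    # alpha on the diagonal, beta between bonded (adjacent) atoms, zero elsewhere
    H = alpha * np.eye(n) + beta * (np.eye(n, k=1) + np.eye(n, k=-1))
    energies, coeffs = np.linalg.eigh(H)  # ascending energies; columns are MOs
    return energies, coeffs

def closed_form_energies(n, alpha=0.0, beta=-1.0):
    """Eq. (3.15.8): eps_j = alpha + 2 beta cos(j pi / (n + 1))."""
    j = np.arange(1, n + 1)
    return alpha + 2.0 * beta * np.cos(j * np.pi / (n + 1))

# Butadiene, n = 4
energies, coeffs = huckel_linear(4)
x = (0.0 - energies) / (-1.0)       # x = (alpha - eps) / beta, Eq. (3.15.3)
lowest_mo = np.abs(coeffs[:, 0])    # overall sign of an eigenvector is arbitrary
```

With these units `x` reproduces the four roots of Eq. (3.15.5), and `lowest_mo` reproduces the magnitudes 0.372, 0.602, 0.602, 0.372 of the lowest orbital in Eq. (3.15.7).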
A single value for β for all hydrocarbons is too much to hope from a simple theory. Experimental estimates of β vary from −2.72 eV (to fit energies for the benzenoid hydrocarbons) to −3.48 eV (derived from the experimental absorption spectrum of naphthalene). The on-site Coulomb repulsion energy α is positive, but its value is not obtained directly from experiment, since it does not enter into the spacings of Hückel eigenenergies. For compounds containing X = N, O, or S in place of C, some trivial extensions of Hückel theory define the on-site and off-site integrals $\alpha_X$ and $\beta_{XY}$ in terms of the corresponding integrals $\alpha_C$ and $\beta_{CC}$ for carbon (called α and β further above), through numerical "fudge" (i.e., adjustable) parameters $h_X$ and $k_{XY}$ (both typically between 0.5 and 2):
$$\alpha_X \equiv \alpha_C + h_X\,\beta_{CC} \tag{3.15.10}$$
$$\beta_{XY} \equiv k_{XY}\,\beta_{CC} \tag{3.15.11}$$
These stratagems represent trivial additions to any simple computer program for SHMO. For a circular polyene (a planar monocyclic compound) such as benzene or cyclobutadiene the secular equation is a circulant; for benzene:
$$\begin{vmatrix} x & 1 & 0 & 0 & 0 & 1 \\ 1 & x & 1 & 0 & 0 & 0 \\ 0 & 1 & x & 1 & 0 & 0 \\ 0 & 0 & 1 & x & 1 & 0 \\ 0 & 0 & 0 & 1 & x & 1 \\ 1 & 0 & 0 & 0 & 1 & x \end{vmatrix} = 0 \tag{3.15.12}$$
for which the solution is
$$x_k = -2\cos(2\pi k/6) \qquad (k = 1, 2, \ldots, 6) \tag{3.15.13}$$
For the circular aromatic or anti-aromatic polyene with n carbon atoms ($C_nH_n$) the eigenenergies are given by
$$x_k = -2\cos(2\pi k/n) \qquad (k = 1, 2, \ldots, n) \tag{3.15.14}$$
These eigenenergies can also be obtained graphically from the Rumer59 diagrams, by artfully enclosing the polygon in a circle! (Fig. 3.7)
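The ring eigenvalues can also be obtained numerically, as a cross-check on the values printed in Fig. 3.7. This is a minimal sketch with a hypothetical function name; it uses the fact that the circulant secular matrix is x times the unit matrix plus the ring adjacency matrix, so its roots are the negatives of the adjacency eigenvalues.

```python
import numpy as np

def huckel_ring_x(n):
    """Roots x of the Hueckel secular determinant for a planar ring C_nH_n."""
    # Adjacency matrix: each atom r is bonded to atom r+1, closing the ring
    A = np.zeros((n, n))
    for r in range(n):
        A[r, (r + 1) % n] = 1.0
        A[(r + 1) % n, r] = 1.0
    # det(x*I + A) = 0  ->  x = -(eigenvalue of A)
    return np.sort(-np.linalg.eigvalsh(A))

for n in range(3, 8):
    print(n, np.round(huckel_ring_x(n), 3))
```

Since $x \equiv (\alpha - \varepsilon)/\beta$, each root corresponds to an energy $\varepsilon = \alpha - x\beta$; with β negative, the root x = −2 is the lowest level in every ring.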
59
Yurii Borisovich “Georg” Rumer (1901–1985).
[Figure 3.7 panels, roots x of the circulant for each ring:
C3H3: x = −2.000, 1.000, 1.000
C4H4: x = −2.000, 0.000, 0.000 (NBO), 2.000
C5H5: x = −2.000, −0.618, −0.618, 1.618, 1.618
C6H6: x = −2.000, −1.000, −1.000, 1.000, 1.000, 2.000
C7H7: x = −2.000, −1.247, −1.247, 0.445, 0.445, 1.802, 1.802]
FIGURE 3.7 Rumer diagram for circulants C3H3 through C7H7. Only C6H6 is stabilized as an aromatic system (Hückel 4n + 2 rule).
3.16 EXTENDED HÜCKEL THEORY
The Wolfsberg60–Helmholz61–Hoffmann62 extended Hückel theory considers only the valence electrons, for which the effective Hamiltonian $\hat{H}_{val}$ is the sum of one-electron Hamiltonians which are not specified explicitly:
$$\hat{H}_{val} = \sum_i \hat{H}_{eff}(i) \tag{3.16.1}$$
60 Max Wolfsberg (1928– ).
61 Lindsay J. Helmholz (ca. 1930– ).
62 Roald [Safra] Hoffmann (1937– ).
As in Eq. (3.8.1), the MOs $\varphi_i$ are linear combinations of the atomic valence functions $\phi_n$. As in the Hückel problem, the secular equation is
$$\det|H^{eff}_{jk} - \varepsilon\,S_{jk}| = 0 \tag{3.16.2}$$
All the overlap integrals $S_{jk}$ are computed explicitly. For j = H 1s, C 2s, and C 2p, the diagonal matrix elements $H^{eff}_{jj}$ are defined semiempirically: $H^{eff}_{jj}$ = −13.6 eV, −20.8 eV, and −11.3 eV, respectively. The off-diagonal elements are given by a weighted scalar average
$$H^{eff}_{jk} = (K/2)\,(H^{eff}_{jj} + H^{eff}_{kk})\,S_{jk} \tag{3.16.3}$$
with K = 1 to 3 being used. The method is indicative, but not quantitative.
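A toy calculation shows how Eqs. (3.16.2) and (3.16.3) fit together. The sketch below is illustrative only: the two-orbital system, the overlap value 0.6, and the choice K = 1.75 (inside the quoted range of 1 to 3) are assumptions, and the function names are hypothetical. Only the diagonal value −13.6 eV for H 1s comes from the text.

```python
import numpy as np

def eht_hamiltonian(h_diag, S, K=1.75):
    """Build H_eff from diagonal energies and overlaps using Eq. (3.16.3)."""
    h = np.asarray(h_diag, dtype=float)
    H = 0.5 * K * (h[:, None] + h[None, :]) * S   # off-diagonal rule
    np.fill_diagonal(H, h)                        # diagonal: semiempirical values
    return H

def eht_levels(H, S):
    """Roots of det|H - eps S| = 0, found as the eigenvalues of S^-1 H."""
    return np.sort(np.linalg.eigvals(np.linalg.solve(S, H)).real)

# Two H 1s orbitals with an assumed overlap of 0.6
S = np.array([[1.0, 0.6], [0.6, 1.0]])
H = eht_hamiltonian([-13.6, -13.6], S)
levels = eht_levels(H, S)   # energies in eV
```

For two equivalent orbitals the roots are $(H_{11} \pm H_{12})/(1 \pm S_{12})$, so the explicit overlap makes the antibonding level rise more than the bonding level falls.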
3.17 PARISER–PARR–POPLE THEORY
In the PPP method, due to Pariser63, Parr64 and Pople65, the assumption of zero differential overlap (ZDO) consists of setting the AO products to zero unless they involve the same orbital on the same center (atom):
$$\phi_\mu \phi_\nu = \delta_{\mu\nu}\,\phi_\mu^2 \tag{3.17.1}$$
where $\delta_{\mu\nu}$ is the Kronecker delta. This drastic assumption simplifies several integrals immediately: the overlap integral is diagonal:
$$S_{rs} = \delta_{rs} \tag{3.17.2}$$
For bonded atoms r and s the bond integral is an empirical parameter:
$$\beta_{rs} = \langle \phi_r | H_{core} | \phi_s \rangle \tag{3.17.3}$$
The core integrals:
$$\alpha_r = \langle \phi_r | H_{core} | \phi_r \rangle \tag{3.17.4}$$
are allowed to be different for different atoms and are evaluated approximately. The direct Coulomb and exchange Coulomb electron–electron repulsion integrals are simplified by the ZDO approximation
$$(rs|tu) = \iint \phi_r^*(1)\,\phi_s(1)\,(e^2/4\pi\varepsilon_0 r_{12})\,\phi_t^*(2)\,\phi_u(2)\,dV(1)\,dV(2) = \delta_{rs}\,\delta_{tu}\,\gamma_{rt} \tag{3.17.5}$$
which wipes out all three- and four-center integrals and leaves only one-center integrals $\gamma_{rr}$ and two-center integrals $\gamma_{rs}$, which are treated as semiempirical parameters.
63 Rudolph Pariser (1923– ).
64 Robert Ghormley Parr (1921– ).
65 Sir John Anthony Pople (1925–2004).
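The bookkeeping effect of Eq. (3.17.5) is easy to see by counting. The following sketch builds the full four-index array from a small γ matrix; the numerical γ values are hypothetical placeholders, not fitted PPP parameters, and the function name is an assumption.

```python
import numpy as np

def zdo_two_electron_integrals(gamma):
    """Dense array of (rs|tu) = delta_rs * delta_tu * gamma_rt, as in Eq. (3.17.5)."""
    n = gamma.shape[0]
    eye = np.eye(n)
    # Index order r, s, t, u: eye[r, s] is delta_rs, eye[t, u] is delta_tu
    return np.einsum("rs,tu,rt->rstu", eye, eye, gamma)

# Hypothetical repulsion parameters (eV) for a three-site pi system
gamma = np.array([[11.0, 5.0, 3.0],
                  [5.0, 11.0, 5.0],
                  [3.0, 5.0, 11.0]])
eri = zdo_two_electron_integrals(gamma)
n_total = eri.size                  # all N^4 integrals
n_nonzero = np.count_nonzero(eri)   # only N^2 of them survive
```

For N sites, only N² of the N⁴ integrals remain nonzero, which is the main computational saving of the approximation.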
3.18 NEGLECT OF DIFFERENTIAL OVERLAP (NDO) METHODS
To simplify the Hartree–Fock problem, Pople introduced CNDO/1 (1965), then CNDO/2 (1967), and then INDO (1967) to yield computer programs that mimic ab initio programs with a minimum of fuss. Jaffe66 improved CNDO to fit spectroscopic absorptions (with a minimum of configuration interaction, CI); this was CNDO/S (1968). Later, Dewar67 introduced MINDO/3 (1975), then MNDO (1977), AM1 (1985), and PM3 (1989). For transition metals, Zerner68 introduced ZINDO (1984). These were progressive improvements on INDO, but parameterized to fit thermochemical data, dipole moments, absorption spectra, and so on, to the fitful extent that they are available from experiment. The differential overlap for atomic orbitals $\phi_\mu(i)$ and $\phi_\nu(i)$ of electron i is defined to be the integrand $\phi_\mu(i)\,\phi_\nu(i)\,dV(i)$, which gives the extent to which the electron shares in two different AO states within the same volume; as in Eq. (3.17.1), zero differential overlap (ZDO) consists of setting the AO products to zero unless they involve the same orbital on the same center (atom):
$$\phi_\mu(i)\,\phi_\nu(i)\,dV(i) = \delta_{\mu\nu}\,\phi_\nu(i)^2\,dV(i) \tag{3.18.1}$$
where $\delta_{\mu\nu}$ is the Kronecker delta [22–25]. This drastic assumption simplifies several integrals immediately; the overlap integral becomes diagonal:
$$S_{\mu\nu} = \delta_{\mu\nu} \tag{3.18.2}$$
The nucleus–electron attraction integrals simplify as well:
$$\int \phi_\mu^*(1)\,(Z_A e^2/4\pi\varepsilon_0 r_{1A})\,\phi_\nu(1)\,dV(1) = 0 \quad \text{unless } \mu = \nu \tag{3.18.3}$$
and there are also major simplifications to the direct Coulomb and exchange Coulomb electron–electron repulsion integrals:
$$(\lambda\mu|\nu\sigma) = \iint \phi_\lambda^*(1)\,\phi_\mu(1)\,(e^2/4\pi\varepsilon_0 r_{12})\,\phi_\nu^*(2)\,\phi_\sigma(2)\,dV(1)\,dV(2) = \delta_{\lambda\mu}\,\delta_{\nu\sigma}\,\gamma_{\mu\nu} \tag{3.18.4}$$
which "wipe out" (set to zero) all three- and four-center integrals and leave only one-center integrals $\gamma_{\mu\mu}$ and two-center integrals $\gamma_{\mu\nu}$. The different ways of implementing the neglect of differential overlap (NDO) are:
1. complete, version 2 (version 1 was coordinate-system dependent): CNDO/2 (Pople);
2. complete, for accounting for the first excited spectroscopic state: CNDO/S (Jaffe);
3. intermediate (for first- and second-row elements only): INDO (Pople);
4. modified intermediate, version 3 (using heats of formation): MINDO/3 (Dewar);
5. modified neglect (using heats of formation): MNDO (Dewar);
6. Austin Model 1 (using heats of formation): AM1 (Dewar);
7. Parametric Method 3 (using heats of formation): PM3 (Dewar and Stewart69);
8. partial retention of diatomic differential overlap: PRDDO (Lipscomb70);
9. ZINDO, or INDO for transition metals (Zerner).
66 Hans J. Jaffe (1919–1989).
67 Michael James Steuart Dewar (1918–1997).
68 Michael Zerner (1940–2000).
These differ in the extent to which the drastic consequences [Eqs. (3.18.1) and (3.18.4)] of ZDO are implemented, and they depend on how much semiempirical correction and refinement is added to the theory, so that it can reproduce experiment adequately. In general, CNDO/2, INDO, and PRDDO mimic ab initio results by artful and compensating approximations and semiempirical parameters, and they yield reasonable dipole moments and charge distributions. INDO and ZINDO parameterizations are available for relatively many elements in the periodic table, but their predictions can deviate considerably from experiment. The Dewar group of programs (MINDO/3, MNDO, AM1, PM3) use thermochemical data wherever available for parameterization, so as to yield reasonably correct conformations, gas-phase heats of formation, and spectra. They work well and provide geometry-minimized ground states for compounds of C, H, N, and O; parameters were added for B, P, Cl, Br, Se, and so on, but not all combinations of all these elements in a new compound are available (because of lack of thermochemistry data). In the discussion below, adapted from [22], one can see, for some of the NDO methods, how the core Hamiltonian matrix elements [$H_{\mu\mu}$ (diagonal) and $H_{\mu\nu}$ (off-diagonal)] and electron–electron repulsion integrals $\gamma_{\mu\nu}$ are given by several possible combinations of:
(a) the experiment-derived atomic valence-state ionization potentials $IP_\mu$ or atomic electron affinities $EA_\mu$,
(b) effective nuclear charges $Z^*_A$,
(c) ad hoc "resonance integrals" $\beta_A$ for atom A (the Mulliken approximation):
$$H_{\mu\nu} \approx S_{\mu\nu}\,\beta_{\mu\nu}, \qquad \beta_{\mu\nu} \propto (\beta_A + \beta_B) \tag{3.18.5}$$
(d) exactly calculated electron repulsion integrals which, whether they are given as one-center $\langle s_A s_A | s_A s_A \rangle$ or as two-center $\langle s_A s_A | s_B s_B \rangle$, are treated in the "s-orbital approximation" as repulsion integrals involving "s-electrons", even when they usually are used for p electrons;
(e) certain repulsion integrals are given in CNDO/S as integrals between fully charged spheres;
(f) adjustable empirical parameters K, $k_A$, $k_B$, and so on, show up in certain methods;
(g) the STO orbital exponents are taken from Slater's empirical rules for atoms, except that for hydrogen the orbital exponent is taken as 1.2, not 1.0.
69 James J. P. Stewart (ca. 1950– ).
70 William Nunn Lipscomb, Jr. (1919– ).
In particular, the CNDO Fock matrix F becomes, for the diagonal elements:
$$F_{\mu\mu} = \Big[U^A_{\mu\mu} - \sum_{B \neq A} V^B_{\mu\mu}\Big] + \sum_{\nu} P_{\nu\nu}\,\gamma_{\mu\nu} - (1/2)\,P_{\mu\mu}\,\gamma_{\mu\mu} \tag{3.18.6}$$
and for the off-diagonal ones:
$$F_{\mu\nu} = H_{\mu\nu} - (1/2)\,P_{\mu\nu}\,\gamma_{\mu\nu} \tag{3.18.7}$$
The (M)INDO Fock matrix is more complicated: A Fumm ¼ ½Umm þ ðPmm Pumm Þgmm þ
PA
ex m$n ðPmv gmn Pmm gmm Þ
þ
P
B$A ðPBB gAB
VAB Þ
ð3:18:8Þ Fumn ¼ Hmn ð1=2ÞPumn GAB
ð3:18:9Þ
In Eq. (3.18.6), the term in square brackets [ ] is the core Hamiltonian. The term $U^A_{\mu\mu}$ is for the atomic orbital μ on atom A:
$$U^A_{\mu\mu} = -(IP_\mu + EA_\mu)/2 - (Z^*_A - 1/2)\,\gamma_{\mu\nu} \qquad \text{(CNDO/2; CNDO/S)} \tag{3.18.10}$$
$$U^A_{\mu\mu} = -(IP_\mu + EA_\mu)/2 + E_R \qquad \text{(INDO)} \tag{3.18.11}$$
$$U^A_{\mu\mu} = -IP_\mu - (Z^*_A - 1) + E_R \qquad \text{(MINDO/3)} \tag{3.18.12}$$
where $E_R$ is the total electron repulsion term (discussed below). In Eq. (3.18.6) the terms $V^B_{\mu\mu}$ refer to all atoms B other than the atom A on which the orbital μ is centered:
$$V^B_{\mu\mu} = Z^*_B\,\gamma_{\mu\mu} \qquad \text{(CNDO/2; CNDO/S; INDO; MINDO/3)} \tag{3.18.13}$$
In Eq. (3.18.7), the off-diagonal core Hamiltonian terms $H_{\mu\nu}$ are given by the Mulliken approximation
$$H_{\mu\nu} = S_{\mu\nu}\,(\beta^0_A + \beta^0_B)/2 \qquad \text{(CNDO/2; INDO)} \tag{3.18.14}$$
or by subtle variations thereof:
$$H_{\mu\nu} = K\,S_{\mu\nu}\,(\beta^0_A + \beta^0_B)/2 \qquad \text{(CNDO/S)} \tag{3.18.15}$$
$$H_{\mu\nu} = B\,S'_{\mu\nu}\,(IP_\mu + EA_\nu)/2 \qquad \text{(MINDO/3)} \tag{3.18.16}$$
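In Eq. (3.18.16), $S'_{\mu\nu}$ is the overlap integral for modified STO atomic orbitals with parametric atomic orbital exponents ζ, rather than those given by Slater's rules. The one- and two-center integrals γ in Eqs. (3.18.8), (3.18.10), and (3.18.13), for the case that atomic orbitals μ and ν are both on the same atom A, are given by
$$\gamma_{\mu\mu} = \gamma_{\mu\nu} = \langle s_A s_A | s_A s_A \rangle \qquad \text{(CNDO/2)} \tag{3.18.17}$$
$$\gamma_{\mu\mu} = \gamma_{\mu\nu} = IP_A - EA_A \qquad \text{(CNDO/S)} \tag{3.18.18}$$
while if atomic orbitals μ and ν are on different atoms A and B, then the two-center integrals become
$$\gamma_{\mu\nu} = \langle s_A s_A | s_B s_B \rangle \qquad \text{(CNDO/2)} \tag{3.18.19}$$
$$\gamma_{\mu\nu} = \text{uniformly charged sphere} \qquad \text{(CNDO/S)} \tag{3.18.20}$$
For INDO and MINDO/3, the two-electron integrals are more complicated:
$$\gamma_{\mu\nu} = \langle ss|ss \rangle,\ \langle ss|pp \rangle = \langle s_A s_A | s_B s_B \rangle \qquad \text{(INDO; MINDO/3)} \tag{3.18.21}$$
$$\gamma_{\mu\nu} = \langle p_x p_x | p_y p_y \rangle = G_{\mu\mu} - (2/25)\,F^2 \qquad \text{(INDO; MINDO/3: Oleari)} \tag{3.18.22}$$
$$\gamma_{\mu\nu} = \langle p_x p_x | p_x p_x \rangle = G_{\mu\mu} + (4/25)\,F^2 \qquad \text{(INDO; MINDO/3: Oleari)} \tag{3.18.23}$$
$$\gamma^{ex}_{\mu\nu} = \langle s p_x | s p_x \rangle = (1/3)\,G^1 \qquad \text{(INDO; MINDO/3: Oleari)} \tag{3.18.24}$$
$$\gamma^{ex}_{\mu\nu} = \langle p_x p_y | p_x p_y \rangle = (3/25)\,F^2 \qquad \text{(INDO; MINDO/3: Oleari)} \tag{3.18.25}$$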
where $F^2$ and $G^1$ are the atomic STO Slater–Condon parameters [22]. The key to such moderately successful and useful semiempirical techniques was the choice of "reasonable" parameters derived either from experiment or from ab initio theories. There have been further parameterizations: MINDO/3, then MNDO, then AM1 and finally PM3 were incremental further developments by Dewar's group, to cover as many combinations of elements as the limited availability of experimental thermochemical data allows. The Dewar programs allow for geometry optimization (as do most ab initio program packages): one starts with a chemically reasonable initial molecular geometry, does one set of fixed-geometry SCF refinement cycles for it, then carries out small variations in atom positions, calculates a new SCF result, and decides the next guess for the geometry from the derivatives of the energy with respect to the displacements, and so on, until the geometry and the SCF energy settle on a final result. Also, these programs have a predilection for flat structures and against zwitterions. One critic has said that the success and usefulness of the semi-empirical methods (good geometry, dipole moment, heat of formation, first excited state, and even polarizability) are really a triumph of parametrization, rather than a validation of the NDO assumption.
3.19 DENSITY FUNCTIONAL THEORY (DFT)
Density functional theory (DFT) is dramatically different: it starts from the presumption that some ideal "good" calculation for a molecule (e.g. HF in the Born–Oppenheimer approximation, but also excluding nuclei–nuclei repulsion) used the Hamiltonian $\hat{H}$ of the form of Eq. (3.5.1) and gave a ground-state wavefunction:
$$\Psi_0(1, 2, \ldots, N) = (1/N!)^{1/2}\,|\psi_1(1)\alpha(1)\ \psi_1(1)\beta(1)\ \psi_2(1)\alpha(1)\ \psi_2(1)\beta(1) \ldots| \tag{3.19.1}$$
and thus the ground-state electron density function:
$$\rho_0(x, y, z) = \rho_0(\mathbf{r}) = N \sum \int \cdots \int |\Psi_0(\mathbf{r}, \mathbf{r}_2, \ldots, \mathbf{r}_N)|^2\,dv(\mathbf{r}_2) \ldots dv(\mathbf{r}_N) \tag{3.19.2}$$
(where Σ denotes the sum over all occupied spin states). The Hohenberg71–Kohn72 theorem of 1964 then assumes that the ground-state energy $E_0$ is some unknown functional $E_0[\rho_0]$ of that density:
$$E_0 = E_0[\rho_0] = \langle T \rangle[\rho_0] + \langle V_{eN} \rangle[\rho_0] + \langle V_{ee} \rangle[\rho_0] \tag{3.19.3}$$
where
$$\begin{aligned}
\langle T \rangle &\equiv \langle \Psi_0(1, 2, \ldots, N)|\,{-(\hbar^2/2m)} \sum_{i=1}^{N} \nabla_i^2\,|\Psi_0(1, 2, \ldots, N) \rangle \\
\langle V_{eN} \rangle &\equiv \langle \Psi_0(1, 2, \ldots, N)|\,{-(e^2/4\pi\varepsilon_0)} \sum_{i=1}^{N} \sum_{A=1}^{M} (Z_A/r_{iA})\,|\Psi_0(1, 2, \ldots, N) \rangle \\
\langle V_{ee} \rangle &\equiv \langle \Psi_0(1, 2, \ldots, N)|\,(e^2/4\pi\varepsilon_0) \sum_{i=1}^{N} \sum_{j=1\,(i \neq j)}^{N} (1/r_{ij})\,|\Psi_0(1, 2, \ldots, N) \rangle
\end{aligned} \tag{3.19.4}$$
are the kinetic energy, electron–nucleus attraction, and electron–electron repulsion energies, respectively. The term "functional" means here a rule that assigns a number to a whole function, in this case to the density $\rho_0(\mathbf{r})$; the exact form of $E_0[\rho_0]$ is unknown. Of these three terms, the second can be shown to be
$$\langle V_{eN} \rangle = \int \rho_0(\mathbf{r})\,v_{NA}(\mathbf{r})\,dv(\mathbf{r}) \tag{3.19.5}$$
where
$$v_{NA}(\mathbf{r}) \equiv -\sum_{A=1}^{M} (Z_A/r_{iA}) \tag{3.19.6}$$
71 Pierre C. Hohenberg (1934– ).
72 Walter Kohn (1923– ).
is called the external potential (the nuclei are seen as being "external" to the "electron gas"). The theorem also showed that any trial electron density function $\rho_{tr}$, different from $\rho_0$, would yield an energy more positive than $E_0$, thus legitimizing the eventual energy minimization. The DFT method was inspired by the Thomas73–Fermi energy for an atom (in a.u.):
$$E_{TF}[\rho] = 2.871 \int \rho(\mathbf{r})^{5/3}\,dv(\mathbf{r}) - Z \int r^{-1} \rho(\mathbf{r})\,dv(\mathbf{r}) + (1/2) \iint \rho(\mathbf{r}_1)\,\rho(\mathbf{r}_2)\,|\mathbf{r}_1 - \mathbf{r}_2|^{-1}\,dv(\mathbf{r}_1)\,dv(\mathbf{r}_2) \tag{3.19.7}$$
where $\rho(\mathbf{r})$ is subject to the constraint $N = \int \rho(\mathbf{r})\,dv(\mathbf{r})$. How does one construct such a trial electron density function? Kohn and Sham74 introduced a fictitious system of N noninteracting electrons, which produce a corresponding fictitious charge distribution $\rho_s$ with a fictitious external potential $\hat{v}_s(\mathbf{r})$ so constructed that, if all goes well, $\rho_s$ becomes equal to the real $\rho_0$. Then the total energy can be rewritten by first defining $E_{xc}[\rho_0]$, the exchange-correlation functional:
$$E_{xc}[\rho_0] \equiv \langle T \rangle[\rho_0] - \langle T_s \rangle[\rho_0] + \langle V_{ee} \rangle[\rho_0] - (1/2) \iint \rho_0(\mathbf{r}_1)\,\rho_0(\mathbf{r}_2)\,|\mathbf{r}_1 - \mathbf{r}_2|^{-1}\,dv(\mathbf{r}_1)\,dv(\mathbf{r}_2) \tag{3.19.8}$$
which yields the ground-state energy:
$$E_0 = E_0[\rho_0] = \Big\{ \langle T_s \rangle[\rho_0] + \int \rho_0(\mathbf{r})\,\hat{v}_s(\mathbf{r})\,dv(\mathbf{r}) + (1/2) \iint \rho_0(\mathbf{r}_1)\,\rho_0(\mathbf{r}_2)\,|\mathbf{r}_1 - \mathbf{r}_2|^{-1}\,dv(\mathbf{r}_1)\,dv(\mathbf{r}_2) \Big\} + E_{xc}[\rho_0] \tag{3.19.9}$$
where the first three terms, enclosed in braces, are relatively large and easy to compute, while the fourth term, $E_{xc}[\rho_0]$, is small, crucial, and more difficult to compute. The insight was to lump all the delicate inter-electron interactions into $E_{xc}[\rho_0]$ and thus focus on what must be computed well in order to obtain the properties of an N-electron atom or molecule. All this suggests the Kohn–Sham Hamiltonian:
$$\hat{H}_s = \sum_{i=1}^{N} \{ -(\hbar^2/2m)\,\nabla_i^2 + \hat{v}_s(\mathbf{r}_i) \} = \sum_{i=1}^{N} \hat{H}_i^{KS} \tag{3.19.10}$$
73 Llewellyn Hilleth Thomas (1903–1992).
74 Lu Jeu Sham (1938– ).
and a search for one-electron eigenfunctions $\theta_i^{KS}$ found from the eigenvalue equation:
$$\hat{H}_i^{KS}\,\theta_i^{KS} = \varepsilon_i^{KS}\,\theta_i^{KS} \tag{3.19.11}$$
with a basis set of atomic orbitals:
$$\theta_i^{KS} = \sum_{r=1}^{B} c_{ri}\,\chi_r \tag{3.19.12}$$
by solving a set of B simultaneous equations:
$$\sum_{s=1}^{B} c_{si}\,(H_{rs}^{KS} - \varepsilon_i^{KS} S_{rs}) = 0 \qquad (r = 1, 2, \ldots, B) \tag{3.19.13}$$
for which solutions are found in an iterative procedure (similar to HF SCF procedures). The big question is: how are the Kohn–Sham orbitals $\theta_i^{KS}$ picked? In the Local Density Approximation (LDA), if the charge density $\rho_0$ varies only slowly with position, then a formal expression for $E_{xc}$ is
$$E_{xc}[\rho_0] = \int \rho_0(\mathbf{r})\,\varepsilon_{xc}(\rho_0)\,dv(\mathbf{r}) \tag{3.19.14}$$
where $\varepsilon_{xc}(\rho_0)$ is the exchange Coulomb plus correlation energy per electron in "jellium," a hypothetical electron gas with electron density $\rho_0$. The LDA approximation and its successive refinements (local-spin-density approximation, generalized-gradient functionals, etc.) can yield a surprisingly good representation of molecular energies at relatively low computing cost. It does not produce a wavefunction and does not work for excited states. One simple but not very successful expression for $E_{xc}$ was Slater's "Xα" approximation:
$$E_{xc}[\rho_0] \approx -(9/8)\,(3/\pi)^{1/3}\,\alpha \int [\rho_0(\mathbf{r})]^{4/3}\,dv(\mathbf{r}) \tag{3.19.15}$$
where α was an ad hoc parameter with values between 0.66 and 1.00. A very popular functional, optimized to reproduce experimental atomization energies, is the "hybrid" exchange-correlation B3LYP functional (Becke75, three-parameter, Lee76, Yang77, and Parr):
$$E_{xc}^{B3LYP} = E_{xc}^{LDA} + 0.20\,(E_x^{HF} - E_x^{LDA}) + 0.72\,(E_x^{GGA} - E_x^{LDA}) + 0.81\,(E_c^{GGA} - E_c^{LDA}) \tag{3.19.16}$$
where the superscripts GGA, HF, and LDA are for generalized gradient, Hartree–Fock, and local density approximations, respectively, and the subscripts c, x, and xc refer to correlation, exchange, and exchange-correlation, respectively. In detail:
$$E_{xc}^{LDA} = \int \rho(\mathbf{r})\,\varepsilon_{xc}(\rho)\,dv(\mathbf{r}) \tag{3.19.17}$$
$$E_x^{HF} \equiv -(1/4) \sum_{i=1}^{n} \sum_{j=1}^{n} \langle \theta_i^{KS}(1)\,\theta_j^{KS}(2)\,|\,r_{12}^{-1}\,|\,\theta_j^{KS}(1)\,\theta_i^{KS}(2) \rangle \tag{3.19.18}$$
$$E_x^{LDA} = -(3/4)\,(3/\pi)^{1/3} \int [\rho(\mathbf{r})]^{4/3}\,dv(\mathbf{r}) \tag{3.19.19}$$
$$E_c^{LDA} = \int \rho(\mathbf{r})\,\varepsilon_c(\rho)\,dv(\mathbf{r}) \tag{3.19.20}$$
$$E_{xc}^{GGA} = \int f(\rho(\mathbf{r}), \nabla\rho(\mathbf{r}))\,dv(\mathbf{r}) \tag{3.19.21}$$
$$E_x^{GGA} = -(3/4)\,(6/\pi)^{1/3} \int [\rho(\mathbf{r})]^{4/3}\,dv(\mathbf{r}) \tag{3.19.22}$$
75 Axel D. Becke (1953– ).
76 Chengteh Lee (ca. 1958– ).
77 Weitao Yang (1961– ).
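To give a feeling for the size of a local exchange term, the Xα expression of Eq. (3.19.15) can be integrated numerically for a density that is known in closed form. The sketch below makes several assumptions that are not in the text: atomic units, a spin-unpolarized treatment, the hydrogen 1s density $\rho = e^{-2r}/\pi$ as the test case, α = 2/3, and hypothetical function names.

```python
import numpy as np

def radial_integral(f, r):
    """Trapezoidal integral of f(r) * 4 pi r^2 dr for a spherical function f."""
    integrand = f * 4.0 * np.pi * r ** 2
    return np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(r))

def xalpha_exchange_energy(rho, r, alpha=2.0 / 3.0):
    """Eq. (3.19.15) in hartree for a spherical density rho sampled on grid r (bohr)."""
    prefactor = -(9.0 / 8.0) * (3.0 / np.pi) ** (1.0 / 3.0) * alpha
    return prefactor * radial_integral(rho ** (4.0 / 3.0), r)

r = np.linspace(0.0, 30.0, 30001)           # radial grid in bohr
rho_1s = np.exp(-2.0 * r) / np.pi           # hydrogen 1s density, atomic units
n_electrons = radial_integral(rho_1s, r)    # should integrate to one electron
e_x = xalpha_exchange_energy(rho_1s, r)     # about -0.213 hartree
```

With α = 2/3 the prefactor reduces to $-(3/4)(3/\pi)^{1/3}$. For this density the integral can also be done by hand, giving $-(3/4)(3/\pi)^{1/3} \times (27/64)\,\pi^{-1/3} \approx -0.2127$ hartree, which the grid result should match closely.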
3.20 ENERGIES IN MAGNETIC FIELDS, AND SPIN-ORBIT COUPLING
When an electron of mass m and charge −|e| moves with tangential velocity $\mathbf{v}$ and angular velocity $\boldsymbol{\omega}$, in a circular "Bohr" orbit of radius r around a nucleus of charge Z|e|, where
$$\boldsymbol{\omega} = \mathbf{r} \times \mathbf{v}\,r^{-2} \tag*{((2.4.78))}$$
then this electron has an orbital angular momentum $\mathbf{L}$:
$$\mathbf{L} = m\,\mathbf{r} \times \mathbf{v} = m r^2 \boldsymbol{\omega} = I \boldsymbol{\omega} \tag*{((2.4.77))}$$
where I is the scalar moment of inertia:
$$I = m r^2 \tag*{((2.4.76))}$$
By its circular motion, this electron creates an electrical current j at the center of mass:
$$j = |e|v/2\pi r \ \text{(SI)}; \qquad j = |e|v/2\pi r c \ \text{(cgs)} \tag{3.20.1}$$
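[Figure 3.8, panels (a)–(c): (a) orbiting charge −|e| with velocity v, radius r, angular momentum L, and the equivalent magnetic dipole μ_L built from poles +q (N) and −q (S) separated by δ; (b) dipole μ_L at angle θ to the field H between the South and North poles of a magnet, with the torque T perpendicular to the paper; (c) precession of L about H at ω_L, showing dL and L sin θ.]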
FIGURE 3.8 (a) Orbital angular momentum vector $\mathbf{L}$ for a negative charge −|e| rotating with tangential velocity $\mathbf{v}$ in a circular orbit of radius r, and the equivalent magnetic dipole moment ($\boldsymbol{\mu}_L$) due to equivalent magnetic "charges" +q ("North pole" of dipole) and −q ("South pole" of dipole). (b) If a magnetic dipole $\boldsymbol{\mu}_L$ is placed in an external magnetic field $\mathbf{B}$, it experiences a torque $\mathbf{T}$ perpendicular to $\boldsymbol{\mu}_L$ and perpendicular to $\mathbf{B}$. (c) Larmor precession of magnetic dipole $\boldsymbol{\mu}_L$ in a magnetic field $\mathbf{B}$ at angular frequency $\omega_{Larmor}$.
By Ampère's78 law, j can be related to an orbital magnetic dipole moment $\mu_L$:
$$\mu_L = jA = j\pi r^2 \quad \text{(SI and cgs)} \tag{3.20.2}$$
where $A = \pi r^2$ is the circular area traced out by the current (Fig. 3.8). The SI units of magnetic moment are amperes m² (the cgs-esu units are statamperes cm²; cgs-emu units are abamperes cm²).
78 André-Marie Ampère (1775–1836).
Combining Eqs. (3.20.1) and (3.20.2), the magnetic moment of the electron in the circular orbit becomes
$$\mu_L = |e|vr/2 \ \text{(SI)}; \qquad \mu_L = |e|vr/2c \ \text{(cgs)} \tag{3.20.3}$$
Even though magnetic monopoles do not exist, this magnetic dipole moment vector $\boldsymbol{\mu}_L$ can also be considered conceptually (Fig. 3.8a) as consisting of two magnetic monopoles, the "North pole" +q and the "South pole" −q, linked by the vector $\boldsymbol{\delta}$:
$$\boldsymbol{\mu}_L = q\boldsymbol{\delta} \quad \text{(SI and cgs)} \tag{3.20.4}$$
Since the electron carries a negative charge, the orbital magnetic moment vector $\boldsymbol{\mu}_L$ points in the direction opposite to that of the angular momentum vector $\mathbf{L}$. The ratio of the magnetic moment $\boldsymbol{\mu}_L$ to the orbital angular momentum $\mathbf{L}$ is:
$$\boldsymbol{\mu}_L/\mathbf{L} = -|e|/(2m) \ \text{(SI)}; \qquad \boldsymbol{\mu}_L/\mathbf{L} = -|e|/(2mc) \ \text{(cgs)} \tag{3.20.5}$$
which, after multiplying and dividing by ħ, can be rewritten as
$$\boldsymbol{\mu}_L = -(g_L \beta_e/\hbar)\,\mathbf{L} \quad \text{(SI and cgs)} \tag{3.20.6}$$
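or else as
$$\boldsymbol{\mu}_L = -\gamma_e \mathbf{L} \quad \text{(SI and cgs)} \tag{3.20.7}$$
where $\beta_e$ is the electronic Bohr magneton:
$$\beta_e \equiv |e|\hbar/2m_e = 9.2740154 \times 10^{-24}\ \text{J T}^{-1} \ \text{(SI)}; \qquad \beta_e \equiv |e|\hbar/2m_e c = 9.2740154 \times 10^{-21}\ \text{erg G}^{-1} \ \text{(cgs)} \tag{3.20.8}$$
and the "orbital g-factor" $g_L$ is
$$g_L \equiv 1 \quad \text{(SI and cgs)} \tag{3.20.9}$$
while the magnetogyric or gyromagnetic ratio $\gamma_e$ is given by
$$\gamma_e \equiv |e|/2m_e = 8.794 \times 10^{10}\ \text{C kg}^{-1} \ \text{(SI)}; \qquad \gamma_e \equiv |e|/2m_e c = 8.794 \times 10^{6}\ \text{G}^{-1}\,\text{s}^{-1} \ \text{(cgs)} \tag{3.20.10}$$
where C kg⁻¹ = radians second⁻¹ tesla⁻¹. From the point of view of the electron rotating with tangential velocity $\mathbf{v}$ in a circular Bohr orbit of radius r around a nucleus (or the center of mass), the nucleus appears to rotate around the electron with a tangential velocity $-\mathbf{v}$ and to generate a current $j_{int}$:
$$j_{int} = Z|e|v/2\pi r \ \text{(SI)}; \qquad j_{int} = Z|e|v/2\pi r c \ \text{(cgs)} \tag{3.20.11}$$
By Ampère's law this current $j_{int}$ due to the "rotating" nucleus produces, at the electron, a magnetic field $\mathbf{B}_{int}$:
$$\mathbf{B}_{int} = \mu_0\,\mathbf{j}_{int} \times \mathbf{r}/4\pi r^3 = \mu_0 Z|e|\,\mathbf{v} \times \mathbf{r}/8\pi^2 r^2 \ \text{(SI)}; \qquad \mathbf{B}_{int} = \mathbf{j}_{int} \times \mathbf{r}\,r^{-3} = Z|e|\,\mathbf{v} \times \mathbf{r}/2\pi c r^2 \ \text{(cgs)} \tag{3.20.12}$$
The nucleus is also the source for an electric field $\mathbf{E}_{int}$, measured at the electron:
$$\mathbf{E}_{int} = Z|e|\,\mathbf{r}/4\pi\varepsilon_0 r^3 \ \text{(SI)}; \qquad \mathbf{E}_{int} = Z|e|\,\mathbf{r}\,r^{-3} \ \text{(cgs)} \tag{3.20.13}$$
The magnetic field at the electron, due to the nucleus, can be rewritten as
$$\mathbf{B}_{int} = -\mathbf{v} \times \mathbf{E}_{int}\,c^{-2} \ \text{(SI)}; \qquad \mathbf{B}_{int} = -\mathbf{v} \times \mathbf{E}_{int}\,c^{-1} \ \text{(cgs)} \tag{3.20.14}$$
Similarly, a nucleus of nuclear spin angular momentum quantum number I and vector $\mathbf{I}$ has spin angular momentum $\mathbf{L}_N$:
$$\mathbf{L}_N = \hbar\,\mathbf{I} \quad \text{(SI and cgs)} \tag{3.20.15}$$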
Further, a many-electron atom or molecule of total angular momentum quantum number J and vector $\mathbf{J}$ has angular momentum $\mathbf{L}_e$:
$$\mathbf{L}_e = \hbar\,\mathbf{J} \quad \text{(SI and cgs)} \tag{3.20.16}$$
Since the nucleus (always) and the atom or molecule (often) carry a charge, the angular momentum has associated with it a magnetic moment, which for the positively charged nucleus is $\boldsymbol{\mu}_N$:
$$\boldsymbol{\mu}_N = \gamma_N \hbar\,\mathbf{I} = g_N \beta_N\,\mathbf{I} \quad \text{(SI and cgs)} \tag{3.20.17}$$
and for the negatively charged electron it is $\boldsymbol{\mu}_e$:
$$\boldsymbol{\mu}_e = -\gamma_e \hbar\,\mathbf{J} = -g_e \beta_e\,\mathbf{J} \quad \text{(SI and cgs)} \tag{3.20.18}$$
Here $\gamma_N$ is the nuclear magnetogyric ratio, $g_N$ is the nuclear gyromagnetic ratio, and $\beta_N$ is the nuclear magneton:
$$\beta_N \equiv |e|\hbar/2M \ \text{(SI)}; \qquad \beta_N \equiv |e|\hbar/2Mc \ \text{(cgs)} \tag{3.20.19}$$
where e is the nuclear charge, and M is the mass of the proton (kg). The data for a proton (¹H) are $g_N$ = 5.585, $\gamma_N = 2\pi \times 4.25788 \times 10^{3}$ gauss⁻¹ s⁻¹, and $\beta_N = 5.05038 \times 10^{-24}$ erg gauss⁻¹. For other nuclei, see Table 3.3. The data for a paramagnetic many-electron atom with quantum numbers S, L, and J are
$$g_e = 1.00116\,\{1 + [J(J+1) + S(S+1) - L(L+1)]/2J(J+1)\} \tag{3.20.20}$$
and $\beta_e = 9.2731 \times 10^{-21}$ erg gauss⁻¹ as in Eq. (3.20.8) (the factor 1.00116 is not exactly 1, as explained below, because of well-understood vacuum polarization effects).
Larmor Precession. When a nucleus with magnetic moment $\boldsymbol{\mu}_N$ is placed in an external magnetic field $\mathbf{B}_0$, and the angle between the two vectors is θ, the orientational potential energy is
$$E = -\boldsymbol{\mu}_N \cdot \mathbf{B}_0 = -\mu_N B_0 \cos\theta \quad \text{(SI and cgs)} \tag{3.20.21}$$
Similarly, when a paramagnetic atom or molecule with magnetic moment $\boldsymbol{\mu}_e$ is placed in an external magnetic field $\mathbf{B}_0$, the orientational potential energy is:
$$E = -\boldsymbol{\mu}_e \cdot \mathbf{B}_0 = -\mu_e B_0 \cos\theta \quad \text{(SI and cgs)} \tag{3.20.22}$$
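Equation (3.20.20) and the orientational energy can be combined into a short numerical illustration. The sketch below is hedged in two ways: the function names are hypothetical, and the level formula $E = g_e \beta_e M_J B_0$ is an assumed sign convention for a negatively charged particle (moment antiparallel to $\mathbf{J}$), chosen so that $M_J = +J$ is the upper level.

```python
def lande_g(J, S, L, prefactor=1.00116):
    """g_e from Eq. (3.20.20); J must be nonzero."""
    return prefactor * (1.0 + (J * (J + 1) + S * (S + 1) - L * (L + 1)) / (2.0 * J * (J + 1)))

def zeeman_levels(J, g, B_gauss, beta_e=9.2740154e-21):
    """Energies (erg) of the 2J + 1 orientations, M_J = -J ... +J."""
    n_levels = int(round(2 * J)) + 1
    return [g * beta_e * (-J + k) * B_gauss for k in range(n_levels)]

g_spin_only = lande_g(J=0.5, S=0.5, L=0)       # pure spin: L = 0, J = S = 1/2
g_orbital_only = lande_g(J=1, S=0, L=1)        # pure orbital: S = 0, J = L = 1
levels = zeeman_levels(J=1.5, g=lande_g(1.5, 0.5, 1), B_gauss=3400.0)
```

The pure-spin case returns 2.00232, the free-electron value quoted later in this section, and the pure-orbital case returns 1.00116 because the prefactor multiplies the whole expression in Eq. (3.20.20).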
Table 3.3 Nuclear Spin I, Nuclear Gyromagnetic Ratio g_N, Isotopic Abundance, Nuclear Electric Quadrupole Moment Q, and NMR Resonance Frequency ν at 1 Tesla (10,000 Gauss) for Nuclei of Interest to NMR and NQR (a)

Nucleus  Z   Spin I  g_N    Isotopic abundance  Q (10^-24 cm^2)  ν (MHz) at H = 1 Tesla
1n       0   1/2     3.826  0                   0                (not listed)
1H       1   1/2     5.585  0.99985             0                42.577
2H       1   1       0.857  0.00015             0.00274          6.537
7Li      3   3/2     2.171  0.925               0.02             16.549
12C      6   0       0      0                   0                0
13C      6   1/2     1.405  0.011               0                10.705
14N      7   1       0.403  0.9963              0.02             3.078
15N      7   1/2     0.567  0.0037              0                4.316
16O      8   0       0      0.996               0                0
17O      8   5/2     0.757  0.004               0.0265           5.774
19F      9   1/2     5.257  1                   0                40.055
23Na     11  3/2     1.478  1                   1.00 or 0.836    11.270
28Si     14  0       0      0.9223              0                0
29Si     14  1/2     1.111  0.0467              0                8.466
31P      15  1/2     2.263  1                   0                17.235
32S      16  0       0      0.9502              0                0
33S      16  3/2     0.429  0.0075              0.064            3.269
35Cl     17  3/2     0.548  0.7577              0.079            4.176
37Cl     17  3/2     0.456  0.2423              0.062            3.476
39K      19  3/2     0.261  0.932581            0.113            1.989
40Ca     20  0       0      0.96941             0                0

(a) If I < 1, then Q = 0. Nuclei with spin I ≥ 1 also have an electric quadrupole moment Q: Q > 0 if the nucleus is a prolate spheroid, while Q < 0 when the nucleus is an oblate spheroid. Five isotopes with I = 0 are also listed for pedagogical emphasis [26].

[Figure 3.9, two panels, each showing the field B between the South and North poles of a magnet with moments at 45° and 135° to B. Case (A), positively charged particle (e.g., proton with I = 1/2): μ_N^low = g_N β_N I for m_I = +1/2 (torque out of paper, clockwise precession); μ_N^high = g_N β_N I for m_I = −1/2 (torque into paper, counterclockwise precession); energy E = −μ_N·B = −μ_N B cos θ; torque T = μ × B. Case (B), negatively charged particle (e.g., electron with S = 1/2): μ_e^high = −g_e|β_e|S for m_S = +1/2 (torque out of paper, clockwise precession); μ_e^low = −g_e|β_e|S for m_S = −1/2 (torque into paper, counterclockwise precession); energy E = −μ_e·B = +|μ_e| B cos θ; torque T = μ × B.]
FIGURE 3.9 Direction of Larmor precession in lower and upper magnetic states. Case A: for a nucleus of spin I = 1/2 in an external field B. Case B: for an electron of spin S = 1/2 in an external field B. The transition implies an interchange of Larmor precession direction from counterclockwise to clockwise.

It is very important to realize that space quantization is a fundamental quantum effect that limits the allowed orientations of $\boldsymbol{\mu}_N$ or $\boldsymbol{\mu}_e$ with respect to an external field $\mathbf{B}_0$. In particular, there can be no more than 2I + 1 orientations of $\boldsymbol{\mu}_N$ with respect to $\mathbf{B}_0$, and no more than 2J + 1 (or 2S + 1 if the paramagnetism is quenched) equally spaced orientations of $\boldsymbol{\mu}_e$ with respect to $\mathbf{B}_0$. For instance, if I = 1/2, then the nuclear spin can only have two orientations, $M_I$ = −1/2 (upper energy) and $M_I$ = +1/2 (lower energy) relative to the external field. For a paramagnetic ion or molecule with L = 0, S = 1/2, the electron spin can only have two orientations, $M_S$ = +1/2 (upper energy) and $M_S$ = −1/2 (lower energy). The absorption of a photon (spin I = 1) allows the transition of a ¹H proton from the lower-energy state ($\mu_N^{low}$) with $M_I$ = +1/2 to the upper-energy state ($\mu_N^{high}$) with $M_I$ = −1/2 (Fig. 3.9), or the transition of an electron from the lower-energy state ($\mu_e^{low}$) with $M_S$ = −1/2 to the upper-energy state ($\mu_e^{high}$) with $M_S$ = +1/2. A nucleus with a well-defined quantized total angular momentum $\mathbf{I}$ cannot minimize Eq. (3.20.21) by setting θ = 90°, that is, by realigning and becoming normal to the external field direction $\mathbf{B}_0$. Similarly, an orbiting atom with a well-defined quantized total angular momentum $\mathbf{J}$ cannot minimize Eq. (3.20.22) by setting θ = 90°, that is, by realigning and becoming normal to the field direction $\mathbf{B}_0$, but maintains the magnitude of $\mathbf{J}$ ("space quantization"), and the magnitude of $\boldsymbol{\mu}_e$; instead of moving into alignment with $\mathbf{B}_0$, the electron experiences a torque $\mathbf{T}$ which will tend to turn the moment in a direction normal to $\mathbf{B}_0$:
$$\mathbf{T} = \boldsymbol{\mu}_e \times \mathbf{B}_0 \quad \text{(SI and cgs)} \tag{3.20.23}$$
This can be shown to cause a Larmor precession with angular frequency $\omega_L$ (Problem 3.20.2):
$$\omega_L = g_e \mu_B \hbar^{-1} B_0 = g_e\,(e/2m_e)\,B_0 \ \text{(SI)}; \qquad \omega_L = g_e \mu_B \hbar^{-1} B_0 = g_e\,(e/2m_e c)\,B_0 \ \text{(cgs)} \tag{3.20.24}$$
This Larmor precession can be understood, if one considers the classical torque $\mathbf{T}$ experienced by a body with angular momentum $\mathbf{L}$ affected by a force $\mathbf{F}$ perpendicular to $\mathbf{L}$:
$$\mathbf{T} = \mathbf{r} \times \mathbf{F} = d\mathbf{L}/dt \quad \text{(SI and cgs)} \tag*{(2.4.80)}$$
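This is the angular analog to Newton's second law. Since from the above we have
$$\boldsymbol{\mu}_N = \gamma_N \hbar\,\mathbf{I} \quad \text{(SI and cgs)} \tag{3.20.25}$$
$$\boldsymbol{\mu}_e = -\gamma_e \hbar\,\mathbf{J} \quad \text{(SI and cgs)} \tag{3.20.26}$$
we obtain
$$d\boldsymbol{\mu}_e/dt = -\gamma_e\,\boldsymbol{\mu}_e \times \mathbf{B}_0 \quad \text{(SI and cgs)} \tag{3.20.27}$$
$$d\boldsymbol{\mu}_N/dt = \gamma_N\,\boldsymbol{\mu}_N \times \mathbf{B}_0 \quad \text{(SI and cgs)} \tag{3.20.28}$$
A solution to this equation is the precession of the vector $\mathbf{L}$ (and the associated magnetic moments $\boldsymbol{\mu}_e$ and $\boldsymbol{\mu}_N$) with a Larmor frequency $\omega_L$ given by $\omega_L = \omega_e$ for an electron and $\omega_L = \omega_N$ for a nucleus:
$$\omega_N = \gamma_N B_0 \quad \text{(SI and cgs)} \tag{3.20.29}$$
$$\omega_e = \gamma_e B_0 \quad \text{(SI and cgs)} \tag{3.20.30}$$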
For free electrons ($g_e = 2.00232$) the Larmor precession frequency $\nu_e$ is 2.80 MHz gauss$^{-1}$ (28.0 GHz T$^{-1}$), which for a field of 3400 gauss (0.34 tesla) corresponds to a frequency $\nu_e = \omega_e/2\pi = 9.52$ GHz (angular frequency $\omega_e = 5.98 \times 10^{10}$ radians s$^{-1}$), and to a wavelength of 3.15 cm (microwave region). For a proton, the Larmor precession frequency $\nu_N$ corresponds to 4.26 kHz gauss$^{-1}$ (42.6 MHz T$^{-1}$), which for a field of 14,000 gauss (1.4 T) corresponds to a frequency of about 60 MHz. In analogy with Eq. (3.20.14), this external field $\mathbf{B}_0$ can also be associated with an external electric field $\mathbf{E}_0$:

$$\mathbf{B}_0 = -\mathbf{v} \times \mathbf{E}_0\, c^{-2} \quad \text{(SI)}; \qquad \mathbf{B}_0 = -\mathbf{v} \times \mathbf{E}_0\, c^{-1} \quad \text{(cgs)} \tag{3.20.31}$$
so that

$$\boldsymbol{\omega}_L = -(|e|\, m_e^{-1} c^{-2})\, \mathbf{v} \times \mathbf{E}_0 \qquad \text{(SI and cgs)} \tag{3.20.32}$$

or

$$\omega_L = (g_L \beta_e/\hbar) B_0 \qquad \text{(SI and cgs)} \tag{3.20.33}$$
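The Larmor frequencies quoted above can be checked numerically. The following is a minimal sketch, not part of the original text: the constants are hard-coded approximate CODATA values, and the function names `electron_larmor_hz` and `proton_larmor_hz` are illustrative choices.

```python
import math

# Approximate CODATA values in SI units (hard-coded assumptions of this sketch)
E_CHARGE = 1.602176634e-19      # elementary charge, C
M_ELECTRON = 9.1093837015e-31   # electron mass, kg
C_LIGHT = 2.99792458e8          # speed of light, m/s
G_ELECTRON = 2.00232            # free-electron g-factor, as quoted in the text
GAMMA_PROTON = 2.6752218744e8   # proton gyromagnetic ratio, rad s^-1 T^-1


def electron_larmor_hz(b_tesla):
    """Electron Larmor frequency nu_e = g_e (e / 2 m_e) B0 / (2 pi), SI form of Eq. (3.20.24)."""
    omega = G_ELECTRON * E_CHARGE / (2.0 * M_ELECTRON) * b_tesla
    return omega / (2.0 * math.pi)


def proton_larmor_hz(b_tesla):
    """Proton Larmor frequency nu_N = gamma_N B0 / (2 pi), Eq. (3.20.29)."""
    return GAMMA_PROTON * b_tesla / (2.0 * math.pi)


if __name__ == "__main__":
    nu_e = electron_larmor_hz(0.34)        # 3400 gauss
    print(f"electron: {nu_e / 1e9:.2f} GHz, wavelength {100 * C_LIGHT / nu_e:.2f} cm")
    print(f"proton at 1.4 T: {proton_larmor_hz(1.4) / 1e6:.1f} MHz")
```

With these constants the electron frequency at 0.34 T comes out near 9.5 GHz and the proton frequency at 1.4 T near 60 MHz, in line with the values in the text.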
The g = 2 Puzzle. When an atomic vapor passes through a magnetic field, the "normal" Zeeman effect splits an optical absorption or emission line into an odd number of lines: for example, in a P state the normal Zeeman effect splits the optical spectrum of an atom into three lines, which argues for $M_L = -1, 0, +1$ states, whence $L = 1$, and $g_L = 1$. In contrast, electron spin caused an early conundrum, the "anomalous" Zeeman effect: Atoms with an odd number of electrons, placed in a magnetic field, showed a more complicated pattern of lines. If $L = 0$, two lines were seen, which argued for $M_S = -1/2$ and $+1/2$ and for $S = 1/2$. However, the precession for electron spin was twice that expected for $S = 1/2$. The "fix" was that $S = 1/2$ but $g_S = 2$, as will be explained below. In the Stern–Gerlach experiment, a single beam of hot silver atoms, placed in an inhomogeneous magnetic field, was split into two beamlets.

Spin–Orbit Interaction. If an electron of mass $m$ and orbital angular momentum $\mathbf{L}$ moves with a velocity $\mathbf{v}$ in an electric field (internal to the atom or molecule) $\mathbf{E}_{int}$, or in an electric potential $\phi_{int}(r)$ (due to the nucleus), then it experiences, in addition to $\mathbf{E}_{int}$, a magnetic field $\mathbf{B}_{int}$. If the potential is spherically symmetric, then the electric field is simply

$$\mathbf{E}_{int} = |e|^{-1} \nabla\phi_{int} = (1/|e| r)\, \mathbf{r}\, (d\phi_{int}/dr) \qquad \text{(SI and cgs)} \tag{3.20.34}$$
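Then $\mathbf{B}_{int}$ is given by

$$\mathbf{B}_{int} = (1/|e| c^2 r)(d\phi_{int}/dr)\, \mathbf{r} \times \mathbf{v} \quad \text{(SI)}; \qquad \mathbf{B}_{int} = (1/|e| c r)(d\phi_{int}/dr)\, \mathbf{r} \times \mathbf{v} \quad \text{(cgs)} \tag{3.20.35}$$

Using the definition of angular momentum, Eq. (2.4.77), this field becomes

$$\mathbf{B}_{int} = (1/|e| m_e c^2 r)(d\phi_{int}/dr)\, \mathbf{L} \quad \text{(SI)}; \qquad \mathbf{B}_{int} = (1/|e| m_e c r)(d\phi_{int}/dr)\, \mathbf{L} \quad \text{(cgs)} \tag{3.20.36}$$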
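In the simple case of the electron interacting with a proton in a one-electron atom, the electric field at the electron due to the proton is

$$\mathbf{E}_{int} = |e|\, \mathbf{r}/4\pi\varepsilon_0 r^3 \quad \text{(SI)}; \qquad \mathbf{E}_{int} = |e|\, \mathbf{r}/r^3 \quad \text{(cgs)} \tag{3.20.37}$$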
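and the magnetic field is

$$\mathbf{B}_{int} = c^{-2}\, \mathbf{E}_{int} \times \mathbf{v} \quad \text{(SI)}; \qquad \mathbf{B}_{int} = c^{-1}\, \mathbf{E}_{int} \times \mathbf{v} \quad \text{(cgs)} \tag{3.20.38}$$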
Let us assume, in analogy to Eq. (3.20.6), that the conversion factor between the electron spin angular momentum $\mathbf{S}$ and the concomitant spin magnetic moment $\boldsymbol{\mu}_S$ is

$$\boldsymbol{\mu}_S = -(g_S \beta_e/\hbar)\, \mathbf{S} \qquad \text{(SI and cgs)} \tag{3.20.39}$$

where the gyromagnetic ratio for electron spin $g_S$ will be proven below to be (except for quantum electrodynamics corrections):

$$g_S = 2 \qquad \text{(SI and cgs)} \tag{3.20.40}$$
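We next derive the spin-orbit coupling energy. The spin $\mathbf{S}$ will interact with the magnetic field $\mathbf{B}_{int}$ due to the orbital angular momentum $\mathbf{L}$, and generate an interaction energy (called the spin-orbit interaction $E_{SO}$); by combining Eqs. (3.20.36) and (3.20.39) we get

$$\Delta E = -\boldsymbol{\mu}_S \cdot \mathbf{B}_{int} = (g_S \beta_e/|e| m_e c^2 r \hbar)(d\phi_{int}/dr)\, \mathbf{L} \cdot \mathbf{S} \quad \text{(SI)}$$

$$\Delta E = -\boldsymbol{\mu}_S \cdot \mathbf{B}_{int} = (g_S \beta_e/|e| m_e c r \hbar)(d\phi_{int}/dr)\, \mathbf{L} \cdot \mathbf{S} \quad \text{(cgs)} \tag{3.20.41}$$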
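Using Eq. (3.20.8), the quantum-mechanical $\hat{L} \cdot \hat{S}$ Hamiltonian operator for spin–orbit coupling is

$$\hat{H}_{SO} = (g_S |e|/2m_e^2 c^2 r)(d\phi_{int}/dr)\, \hat{L} \cdot \hat{S} \quad \text{(SI)}; \qquad \hat{H}_{SO} = (g_S |e|/2m_e^2 c r)(d\phi_{int}/dr)\, \hat{L} \cdot \hat{S} \quad \text{(cgs)} \tag{3.20.42}$$

For a one-electron atom with a Coulomb potential

$$\phi_{int} = Z|e|/4\pi\varepsilon_0 r \quad \text{(SI)}; \qquad \phi_{int} = Z|e|/r \quad \text{(cgs)} \tag{3.20.43}$$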
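and eigenfunctions $\psi_{nlm} = R_{nl}(r) Y_{lm}(\theta, \phi)$, the spin–orbit energy $E_{SO}$ is (Problem 3.20.3)

$$E_{SO} = \frac{hc\,\alpha^2 E_n Z^2}{n\, l (2l + 1)}\, \langle \mathbf{L} \cdot \mathbf{S} \rangle \quad \text{(SI)}; \qquad E_{SO} = \frac{hc\,\alpha^2 R_H Z^4}{n^3\, l (l + 1/2)(l + 1)}\, \langle \mathbf{L} \cdot \mathbf{S} \rangle \quad \text{(SI)} \tag{3.20.44}$$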
where $R_H$ is the Rydberg constant for hydrogen, Eq. (3.5.42), $E_n$ is the energy for the one-electron atom, Eq. (3.5.43), and $\alpha$ is the fine-structure constant, Eq. (3.6.29). For hydrogen, $E_{SO}$ is about $(1/137)^2$ times the Rydberg energy (Problem 3.20.5), but, with its dependence on $Z^4$, $E_{SO}$ becomes quite important for heavy nuclei, where $Z$ is large.

Thomas Precession. The magnetic field $\mathbf{B}_N$ due to the nucleus will cause the spin magnetic moment $\boldsymbol{\mu}_S$ of the electron to precess around it with a Larmor frequency $\omega_N$:

$$\omega_N = (g_S \beta_e/\hbar) B_N \quad \text{(SI)} \tag{3.20.46}$$

Given this magnetic field, the orientation of the spin magnetic moment $\boldsymbol{\mu}_S$ in it will produce an orientational potential energy:

$$\Delta E = -\boldsymbol{\mu}_S \cdot \mathbf{B}_N = (g_S \beta_e/\hbar)\, \mathbf{B}_N \cdot \mathbf{S} \tag{3.20.47}$$
However, a special relativistic, or kinematical, correction is necessary: it is the Thomas precession. Because the electron orbits the nucleus with speed $v$ (where $v$ is a reasonably large fraction of the speed of light $c$), the period of one full rotation around the nucleus is $T$ in the fast-moving electron rest frame, but a longer time $T'$ (time dilatation) in the stationary rest frame of the nucleus [see Eq. (2.13.11)]:

$$T' = T\gamma = T(1 - v^2/c^2)^{-1/2} \approx T/(1 - v^2/2c^2) \tag{3.20.48}$$

The Thomas precession frequency $\omega_T$ is defined as the difference between $2\pi/T$ and $2\pi/T'$, where we keep only the first two terms of a Maclaurin series:

$$\omega_T \equiv 2\pi/T - 2\pi/T' = (2\pi/T')(\gamma - 1) \approx (2\pi/T')(v^2/2c^2) \tag{3.20.49}$$
Use $(2\pi/T') = \omega_{net}$, $L = \omega_{net} m_e r^2$ (with $L = m_e r v$), and the centripetal acceleration $a = v^2/r$ to get

$$\omega_T = \omega_{net}\, v^2/2c^2 = r v r^{-2}\, v^2/2c^2 = v^3/2rc^2 = v (r^{-1} v^2)/2c^2 = av/2c^2 \tag{3.20.50}$$

or, in vector form, using $\mathbf{a} = -|e|\, \mathbf{E}_N/m_e$:

$$\boldsymbol{\omega}_T = -(\mathbf{v} \times \mathbf{a}/2c^2) = +(|e|/2m_e c^2)\, \mathbf{v} \times \mathbf{E}_N \tag{3.20.51}$$
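Since, in analogy to the orbital angular momentum case, Eqs. (3.20.32) and (3.20.33) teach us that for the spin angular momentum

$$\boldsymbol{\omega}_e = (g_S \beta_e/\hbar)\, \mathbf{B}_N = -(g_S \beta_e/\hbar c^2)\, \mathbf{v} \times \mathbf{E}_N \quad \text{(SI)}; \qquad \boldsymbol{\omega}_e = (g_S \beta_e/\hbar)\, \mathbf{B}_N = -(g_S \beta_e/\hbar c)\, \mathbf{v} \times \mathbf{E}_N \quad \text{(cgs)} \tag{3.20.52}$$

$$\boldsymbol{\omega}_e = -(g_S |e|/2m_e c^2)\, \mathbf{v} \times \mathbf{E}_N \qquad \text{(SI and cgs)} \tag{3.20.53}$$

the net angular frequency of precession $\boldsymbol{\omega}_{net}$, combining the classical Larmor precession $\boldsymbol{\omega}_e$ with the special relativistic Thomas precession $\boldsymbol{\omega}_T$, is given by

$$\boldsymbol{\omega}_{net} = \boldsymbol{\omega}_e + \boldsymbol{\omega}_T = -(g_S |e|/2m_e c^2)\, \mathbf{v} \times \mathbf{E}_N + (|e|/2m_e c^2)\, \mathbf{v} \times \mathbf{E}_N \tag{3.20.54}$$

$$\boldsymbol{\omega}_{net} = (-g_S + 1)(|e|/2m_e c^2)\, \mathbf{v} \times \mathbf{E}_N \tag{3.20.55}$$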
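or, equivalently, using Eq. (3.20.52),

$$\boldsymbol{\omega}_{net} = (g_S - 1)(|e|/2m_e)\, \mathbf{B}_N \quad \text{(SI)}; \qquad \boldsymbol{\omega}_{net} = (g_S - 1)(|e|/2m_e c)\, \mathbf{B}_N \quad \text{(cgs)} \tag{3.20.56}$$

From Eq. (3.20.52) and the definition of the Bohr magneton, Eq. (3.20.8), we obtain

$$\boldsymbol{\omega}_e = (g_S |e|/2m_e)\, \mathbf{B}_N \quad \text{(SI)}; \qquad \boldsymbol{\omega}_e = (g_S |e|/2m_e c)\, \mathbf{B}_N \quad \text{(cgs)} \tag{3.20.57}$$

If, by omitting the factor $g_S$ in Eq. (3.20.46), we assume that $\boldsymbol{\omega}_e$ becomes $\boldsymbol{\omega}_{net}$, then

$$\boldsymbol{\omega}_{net} = (|e|/2m_e)\, \mathbf{B}_N \quad \text{(SI)}; \qquad \boldsymbol{\omega}_{net} = (|e|/2m_e c)\, \mathbf{B}_N \quad \text{(cgs)} \tag{3.20.58}$$

thus, equating Eqs. (3.20.56) and (3.20.58), we finally confirm

$$g_S = 2 \tag{3.20.40}$$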
There is a further relativistic correction to the energy of the one-electron atom, which competes in magnitude and importance with the spin–orbit coupling. It can be analyzed directly using the special relativistically correct Dirac equation, but an approximate perturbation treatment of the Schrödinger equation, due to Sommerfeld, gives the same result: In the energy $E = T + V$, the kinetic energy $T_{nonrel} = p^2/2m_e$ should be replaced by a power series approximation:

$$T_{rel} = (c^2 p^2 + m_e^2 c^4)^{1/2} - m_e c^2 = m_e c^2 [(1 + p^2 m_e^{-2} c^{-2})^{1/2} - 1] = m_e c^2 [2^{-1} m_e^{-2} c^{-2} p^2 - 8^{-1} m_e^{-4} c^{-4} p^4 + \ldots] \approx p^2/2m_e - p^4\, 8^{-1} m_e^{-3} c^{-2} \tag{3.20.59}$$

and then the corrected energy becomes, after some labor,

$$E_{rel} = -[Z^2 m e^4/(2n^2 \hbar^2)]\{1 + (Z^2 \alpha^2/n)[2/(2l + 1) - 3/4n]\} \tag{3.20.60}$$

which returns the Bohr–Schrödinger answer as the first term; the second term is the relativistic correction, which is of the same order of magnitude as the spin–orbit energy.

PROBLEM 3.20.1. Prove Eq. (3.20.23) from the definition of torque, Eq. (2.4.80).
PROBLEM 3.20.2. Prove Eq. (3.20.24) [27].
PROBLEM 3.20.3. Prove Eq. (3.20.44) [28].
PROBLEM 3.20.4. From the definition of magnetic field, Eq. (2.5.4), derive Eq. (3.20.14): $\mathbf{B}_0 = -c^{-2}\, \mathbf{v} \times \mathbf{E}_0$.
PROBLEM 3.20.5. Show that for hydrogen in the 2p state the spin–orbit coupling energy $E_{SO}$ [Eq. (3.20.44)] is of the order of $2.9 \times 10^{-23}$ J.
PROBLEM 3.20.6. From the spin–orbit coupling energy, Eq. (3.20.45), and from Problem 3.20.5, show that the local magnetic field for the 2p state of hydrogen is of the order of 3 tesla.
3.21 TERMS OF THE HAMILTONIAN OPERATOR FOR A MANY-ELECTRON ATOM OR MOLECULE [14]
For the N-electron atom, we have seen (Section 3.7) several terms in the Hamiltonian operator. We collect here some more terms, to come to a "final list," within the Born–Oppenheimer approximation of a fixed nucleus:

1. Kinetic energy of the electrons:

$$\hat{T} = -\frac{\hbar^2}{2m_e} \sum_{i=1}^{N} \nabla_i^2 \tag{3.21.1}$$

2. Electrostatic electron–nucleus attractive Coulomb interaction energy:

$$\hat{V}_{en} = -\frac{Ze^2}{4\pi\varepsilon_0} \sum_{A=1}^{M} \sum_{i=1}^{N} \frac{1}{r_{iA}} \tag{3.21.2}$$

3. Electrostatic electron–electron repulsive Coulomb interaction energy (direct + exchange):

$$\hat{V}_{ee} = \frac{e^2}{4\pi\varepsilon_0} \sum_{j=2}^{N} \sum_{i=1}^{j-1} \frac{1}{r_{ij}} \tag{3.21.3}$$

4. Electron spin–orbit energy:

$$\hat{V}_{SO} = \frac{1}{m^2 c^2 \varepsilon_0} \sum_{i=1}^{N} \frac{\mathbf{l}_i \cdot \mathbf{s}_i}{r_i} \frac{dV}{dr_i} \tag{3.21.4}$$

$$\hat{V}_{SO} \approx \mathbf{L} \cdot \mathbf{S} \sum_{i=1}^{N} F(r_i) \qquad \text{(for Russell⁷⁹–Saunders⁸⁰ coupling)} \tag{3.21.5}$$

Russell–Saunders (RS) coupling [29] occurs, for light atoms, when the individual electron orbital angular momenta $\mathbf{l}_i$ and electron spin angular momenta $\mathbf{s}_i$ add, to form "good" (i.e., valid) vectors $\mathbf{L} = \sum_{i=1}^{N} \mathbf{l}_i$ and $\mathbf{S} = \sum_{i=1}^{N} \mathbf{s}_i$. These two vectors then couple to form the total electronic angular momentum vector $\mathbf{J} = \mathbf{L} + \mathbf{S}$. For heavy elements, j–j coupling predominates: $L$ and $S$ no longer exist as "good" quantum numbers, but an individual electron angular momentum vector $\mathbf{j}_i = \mathbf{l}_i + \mathbf{s}_i$ forms; then these $\mathbf{j}_i$ add up to form $\mathbf{J} = \sum_{i=1}^{N} \mathbf{j}_i$ (it seems the same in the end, but it is not).

5. (Electron spin magnetic moment)–(electron spin magnetic moment) dipolar interaction energy:

$$\hat{V}_{ss} = \frac{\mu_0}{4\pi} \sum_{i=2}^{N} \sum_{j=1}^{i-1} \left[ \frac{\mathbf{s}_i \cdot \mathbf{s}_j}{r_{ij}^3} - 3\, \frac{(\mathbf{s}_i \cdot \mathbf{r}_{ij})(\mathbf{s}_j \cdot \mathbf{r}_{ij})}{r_{ij}^5} \right] \quad \text{(SI)} \tag{3.21.6}$$
This interaction leads to "fine-structure" splittings in the spectra of atoms and molecules. For atoms and molecules in the $S = 1$ triplet state, the electron spin–electron spin dipolar interaction leads to the "D and E" fine-structure Hamiltonian:

$$\hat{H}_{fs} = D(\hat{S}_z^2 - \hat{S}^2) + E(\hat{S}_x^2 - \hat{S}_y^2)$$

where $D$ and $E$ are energy parameters that can be obtained from experiment: $D$ describes the "spherical size" of the magnetic interaction, while $E$ represents the departure from spherical symmetry of this interaction; this description is valid in some local xyz coordinate system, frozen with respect to the atomic or molecular orientation, where these systems are "diagonal" (see Section 2.4).

79 Henry Norris Russell (1877–1957).
80 Frederick Albert Saunders (1875–1963).

6. (Electron orbital magnetic moment)–(electron orbital magnetic moment) interaction energy:

$$\hat{V}_{oo} = \sum_{i=2}^{N} \sum_{j=1}^{i-1} C_{ij}\, \mathbf{l}_i \cdot \mathbf{l}_j \tag{3.21.7}$$

7. (Electron spin magnetic moment)–(nuclear spin magnetic moment) dipolar interaction energy:

$$\hat{V}_{esns} = \frac{\mu_0}{4\pi} \sum_{A=1}^{M} \sum_{i=1}^{N} \left[ \frac{\mathbf{s}_i \cdot \boldsymbol{\mu}_{nucl,A}}{r_{iA}^3} - 3\, \frac{(\mathbf{s}_i \cdot \mathbf{r}_{iA})(\boldsymbol{\mu}_{nucl,A} \cdot \mathbf{r}_{iA})}{r_{iA}^5} \right] \tag{3.21.8}$$

This interaction leads to hyperfine splittings in atomic and molecular spectra. One particular term for this interaction is the Fermi contact term, which dominates chemical shifts in nuclear magnetic spectra and splittings in electron paramagnetic spectra:

$$\hat{H} = a\, \hat{I} \cdot \hat{S}$$

8. Electron orbital moment–nuclear spin interaction energy:

$$\hat{V}_{eons} = \frac{\varepsilon_0 \mu_0}{8\pi^2 m} \sum_{A=1}^{M} \sum_{i=1}^{N} \frac{\mathbf{l}_i \cdot \boldsymbol{\mu}_{nucl,A}}{r_{iA}^3} \tag{3.21.9}$$

9. Nuclear electric quadrupole moment interaction with electric field gradient
10. Nucleus size effects
11. Electron spin–other electron orbit interactions
12. Relativistic effects [the operator form of the $p^4$ term of Eq. (3.20.59)]:

$$\hat{R} = -\frac{\hbar^4}{8m^3 c^2} \sum_{i=1}^{N} \nabla_i^4 \tag{3.21.10}$$
The exchange Coulomb electron–electron repulsion is very large: this causes an "exchange correlation", whereby the spins tend to align antiparallel to each other. The Landé⁸¹ g-factor in its full glory is [3.10]:

$$g = g_L\, \frac{j(j + 1) - s(s + 1) + l(l + 1)}{2j(j + 1)} + g_S\, \frac{j(j + 1) + s(s + 1) - l(l + 1)}{2j(j + 1)} \tag{3.21.11}$$

where $g_L = 1.0$, $g_S(\text{exp.}) = 2 \times (1.001165 \pm 0.000011)$, $g_S(\text{theory}) = 2 \times [1 + \alpha/2\pi - 0.328\, (\alpha/\pi)^2 + \ldots] = 2 \times (1.0011596)$, and $\alpha$ is the fine-structure constant: the additional terms are radiative corrections (quantum electrodynamic "polarization of vacuum" or "spin–spin interactions under zero-point motion").

81 Alfred Landé (1888–1976).
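As an illustration of Eq. (3.21.11), the sketch below evaluates the Landé g-factor for a few one-electron levels. It is only a sketch: the function name `lande_g` and its default arguments are choices made here, and $g_S$ is set to exactly 2 unless another value is passed in.

```python
def lande_g(j, l, s, g_l=1.0, g_s=2.0):
    """Lande g-factor of Eq. (3.21.11) for quantum numbers j, l, s (j must be nonzero)."""
    if j == 0:
        raise ValueError("the Lande g-factor is undefined for j = 0")
    jj = j * (j + 1)
    orbital_part = (jj - s * (s + 1) + l * (l + 1)) / (2 * jj)
    spin_part = (jj + s * (s + 1) - l * (l + 1)) / (2 * jj)
    return g_l * orbital_part + g_s * spin_part


if __name__ == "__main__":
    # (label, j, l, s) for three familiar doublet levels
    for label, j, l, s in [("2S1/2", 0.5, 0, 0.5), ("2P1/2", 0.5, 1, 0.5), ("2P3/2", 1.5, 1, 0.5)]:
        print(label, round(lande_g(j, l, s), 4))
```

With $g_S = 2$ the three levels give g = 2, 2/3, and 4/3, respectively; passing the quantum-electrodynamic value of $g_S$ shifts these slightly.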
3.22 "VAN DER WAALS" INTERACTIONS
The Dutch scientist van der Waals,⁸² in his improvement of the equation of state of ideal gases, modified for the interactions between gas molecules, talked about weak intermolecular energies that decay with the inverse sixth power of the intermolecular distance. The term "van der Waals potential" is a portmanteau term for several weak potentials with different physical origins:

1. Dipolar, or permanent-dipole–permanent-dipole energy $E_{dd}$ for molecules of permanent electric dipole moments $\boldsymbol{\mu}_i$ and $\boldsymbol{\mu}_j$ firmly oriented in space and separated by a distance $r_{ij}$:

$$E_{dd} = \sum_{i=2}^{N} \sum_{j=1}^{i-1} \left[ \frac{\boldsymbol{\mu}_i \cdot \boldsymbol{\mu}_j}{r_{ij}^3} - 3\, \frac{(\boldsymbol{\mu}_i \cdot \mathbf{r}_{ij})(\boldsymbol{\mu}_j \cdot \mathbf{r}_{ij})}{r_{ij}^5} \right] \tag{3.22.1}$$

which was first described for molecules by Keesom⁸³ in 1921. This potential is strictly dependent on the inverse third power of the intermolecular distance $r_{ij}$. However, when thermal motion averages over all dipole orientations with a Maxwell⁸⁴–Boltzmann⁸⁵ distribution, in which an arrangement with a weak potential energy $E$ has a probability proportional to $\exp(-E/k_B T)$, then

$$\langle E_{dd} \rangle = -\sum_{i=2}^{N} \sum_{j=1}^{i-1} \frac{2 \mu_i^2 \mu_j^2\, r_{ij}^{-6}}{3 k_B T} \tag{3.22.2}$$

and an inverse-sixth power dependence on intermolecular distance is found.

2. Induction or permanent dipole-induced dipole energy or Debye⁸⁶ energy:

$$E_{id} = -\sum_{i=2}^{N} \sum_{j=1}^{i-1} \mu_i^2 \alpha_j\, |\mathbf{r}_{ij}|^{-6} \tag{3.22.3}$$

This potential involves the permanent dipole $\mu_i$ on molecule $i$ inducing a dipole moment, proportional to $\mu_i \alpha_j$, in the second molecule $j$, due to the polarizability $\alpha_j$ of that molecule.

82 Johannes Diderik van der Waals (1837–1923).
83 Willem Hendrik Keesom (1876–1956).
84 James Clerk Maxwell (1831–1879).
85 Ludwig Boltzmann (1844–1906).
86 Peter Joseph William Debye (1884–1966).
3. Dispersion or London⁸⁷ energy $E_{ii}$ is due to the fact that all bonds have a zero-point motion, which allows an instantaneous fluctuating induced dipole on one molecule to induce a second instantaneous induced dipole on a second molecule: this is an additive potential, whose form is

$$E_{ii} = -\sum_{i=2}^{N} \sum_{j=1}^{i-1} \frac{3}{4}\, h\nu_0\, e^4 k_H^{-2}\, |\mathbf{r}_{ij}|^{-6} \tag{3.22.4}$$

where $h\nu_0$ is the zero-point vibrational energy, $k_H$ is the Hooke's law force constant, and $|e|$ is the electronic charge. London's second-order perturbation expression for this energy is

$$E_{ii} = -\frac{2}{3}\, e^4 \sum_{i=2}^{N} \sum_{j=1}^{i-1} \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} \frac{\langle r_{i0m} \rangle^2 \langle r_{j0n} \rangle^2}{\langle r_{ij} \rangle^6\, |E_{i0} - E_{im} + E_{j0} - E_{jn}|} \tag{3.22.5}$$

where the first transition moment of the molecular wavefunction for the ground state $\psi_{i0}$ and the excited state $\psi_{im}$ (with energies $E_{i0}$ and $E_{im}$) is given by

$$\langle \mathbf{r}_{i0m} \rangle = \int \psi_{i0}^{*}\, \mathbf{r}_i\, \psi_{im}\, dV \tag{3.22.6}$$

Using the polarizability of atom $i$ in a static field:

$$\alpha_i = (2e^2/3) \sum_{m=1}^{\infty} \langle r_{i0m} \rangle^2\, [E_{im} - E_{i0}]^{-1} \tag{3.22.7}$$

and $I_i$ and $I_j$ as the ionization energies of atoms $i$ and $j$, the London dispersion energy becomes:

$$E_{ii} = -\frac{3}{2} \sum_{i=2}^{N} \sum_{j=1}^{i-1} \alpha_i \alpha_j\, \langle r_{ij} \rangle^{-6}\, I_i I_j (I_i + I_j)^{-1} \tag{3.22.8}$$

Dispersion forces are present in all systems, polar or nonpolar, electrically charged or neutral; they dominate the biochemical processes of forming alpha-helices, and bind the A, T, G, and C parts of the two strands of deoxyribonucleic acid (DNA).

4. "Hydrogen bonds" are not due to a separate potential; they involve the attraction between an H atom that is covalently bonded to molecule 1 and electronegative atoms (O, N, etc.) in molecule 2 that are between 0.15 nm and 0.25 nm from the H atom. This hydrogen bond interaction is a combination of Keesom, Debye, and London interactions.

Van der Waals and similar interactions are discussed again in Section 8.10.
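To give a feeling for the size of the London energy, Eq. (3.22.8) can be evaluated for a single pair of atoms. The sketch below is an illustration under stated assumptions: the function name `london_pair_energy` is a choice made here, the formula is used in its Gaussian form (polarizability volumes in Å³, ionization energies in eV, distance in Å, so the result is in eV), and the argon inputs are approximate literature values rather than numbers taken from this text.

```python
def london_pair_energy(alpha_i, alpha_j, ion_i, ion_j, r):
    """One pair term of Eq. (3.22.8): -(3/2) a_i a_j r^-6 I_i I_j / (I_i + I_j).

    alpha_* : polarizability volumes (Angstrom^3)
    ion_*   : ionization energies (eV)
    r       : interatomic distance (Angstrom)
    Returns the dispersion energy in eV.
    """
    return -1.5 * alpha_i * alpha_j * ion_i * ion_j / ((ion_i + ion_j) * r ** 6)


if __name__ == "__main__":
    # Approximate literature values for argon (assumptions of this sketch)
    alpha_ar = 1.64    # Angstrom^3
    ion_ar = 15.76     # eV
    for r in (3.8, 5.0, 7.6):
        e_pair = london_pair_energy(alpha_ar, alpha_ar, ion_ar, ion_ar, r)
        print(f"r = {r:.1f} A: E = {1000 * e_pair:.3f} meV")
```

At a separation near 3.8 Å the pair energy is of the order of 10 meV, and doubling the distance reduces it by a factor of 64, which is the inverse-sixth-power dependence discussed above.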
87 Fritz Wolfgang London (1900–1954).
3.23 MANY-ELECTRON ATOMS
For many-electron light atoms, the Russell–Saunders coupling rules prevail: One combines the orbital angular momenta $\mathbf{l}_i$ of each electron, treated as a vector, to form the total orbital angular momentum quantum number (and vector) $\mathbf{L} = \sum_{i=1}^{N} \mathbf{l}_i$; one similarly couples the spin angular momentum quantum numbers $\mathbf{s}_i$ into a total spin angular momentum quantum number $\mathbf{S} = \sum_{i=1}^{N} \mathbf{s}_i$; then one adds $\mathbf{L}$ and $\mathbf{S}$ to get the total angular momentum vector $\mathbf{J} = \mathbf{L} + \mathbf{S}$. For heavy elements (Pb to U), the jj coupling predominates: $\mathbf{j}_i = \mathbf{l}_i + \mathbf{s}_i$, and $\mathbf{J} = \sum_{i=1}^{N} \mathbf{j}_i$, but neither $L$ nor $S$ is a good quantum number. Figure 3.10 shows two energy diagrams that illustrate the consequences of these two extreme forms of coupling. For atoms in the "middle" of the periodic table, intermediate cases between LS and jj coupling become possible (Fig. 3.11).
FIGURE 3.10 Schematic energy level diagrams for LS and jj coupling in atomic spectra. The energy hierarchy from coarsest to finest is "configuration" (e.g., 4s 4p) → "term" (e.g., ³P) → "level" (e.g., ³P₁) → "state." The symbol specifies the level: ³P₁ means (3 = spin multiplicity) {P: orbital angular momentum L = 1} [1 = J: level]. Sometimes the M_J quantum number must also be specified and is given as an additional superscript, and then the state is ³P₁¹; this state has then quantum numbers S = 1, L = 1, J = 1, M_J = 1.
[Diagram, two panels, both for the configuration 4p 4d (n1 = 4, n2 = 4, l1 = 1, l2 = 2, s1 = 1/2, s2 = 1/2). Upper panel, Russell–Saunders or LS coupling: starting from the unperturbed configuration, the electron–electron exchange Coulomb repulsion splits S into singlets (S = 0) and triplets (S = 1); the direct Coulomb repulsion splits L into the terms ¹P, ¹D, ¹F and ³P, ³D, ³F (L = 2 − 1, 2 + 0, 2 + 1); the spin–orbit energy splits J into the levels J = 1, 2, 3 for ¹P, ¹D, ¹F and J = 0,1,2; 1,2,3; 2,3,4 for ³P, ³D, ³F. Lower panel, jj coupling: starting from the unperturbed configuration, the spin–orbit energy defines (j1, j2) = (3/2, 5/2), (3/2, 3/2), (1/2, 5/2), (1/2, 3/2); the electron–electron direct and exchange Coulomb interactions then define J, giving the levels (3/2, 5/2) J = 1, 2, 3, 4; (3/2, 3/2) J = 0, 1, 2, 3; (1/2, 5/2) J = 2, 3; (1/2, 3/2) J = 1, 2.]
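FIGURE 3.11 Qualitative sketched transition from LS to jj coupling. From Atkins [30].
[Diagram: the pure LS levels ¹S₀, ¹D₂, and ³P₀,₁,₂ at left correlate with the pure jj levels (3/2, 3/2), (3/2, 1/2), and (1/2, 1/2) at right; element labels along the horizontal axis: H, Si, Ge, Sn, Pb.]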
The terms with L = 0, 1, 2, 3 are called S, P, D, F; these letters were adapted from the pre-1900 labeling of spectroscopic transitions as giving "sharp," "principal," "diffuse", and "fundamental" lines.

A problem that is worthy of some attention in LS coupling is the configuration p² = (np)(np), where two electrons with l = 1 share the same principal quantum number n. In that case the Pauli exclusion principle requires that the microstates with all quantum numbers the same be excluded. If we consider $l_1 = 1$, $l_2 = 1$, $s_1 = 1/2$, $s_2 = 1/2$, we are tempted to think that $L = |l_1 + l_2|, \ldots, |l_1 - l_2| = 2, 1, 0$ is possible, along with $S = |s_1 + s_2|, \ldots, |s_1 - s_2| = 1, 0$, giving rise to the candidate terms ³D, ³P, ³S, ¹D, ¹P, ¹S. But this is incorrect, as we can see from Table 3.4: Only the term symbols ¹D, ³P, and ¹S remain.

Nothing so far tells us how these terms are ordered in energy. For this, three empirical Hund⁸⁸ rules apply:
Rule 1: The term of maximum (spin S) multiplicity is lowest in energy.
Rule 2: For a given multiplicity, the term with largest L lies lowest.
Rule 3: For atoms with less than a half-filled shell, the level with lowest J is lowest.
So, for p², the ordering in energies is (Rule 1) ³P < {¹D, ¹S}, (Rule 2) ¹D < ¹S, and (Rule 3) within the term ³P, the level ³P₀ is lowest. For equivalent electrons (i.e., electrons with the same quantum numbers n and l) the term symbols given in Table 3.5 are possible.

For spectroscopic transitions in many-electron atoms, the selection rules under LS-coupling are similar to those for the one-electron atom:

$$\Delta L = 0, \pm 1; \qquad \Delta S = 0; \qquad \Delta J = \pm 1, 0 \ \ (\text{but } J = 0 \not\to J = 0); \qquad \Delta M_J = \pm 1, 0 \ \ (\text{but } M_J = 0 \not\to M_J = 0 \text{ if } \Delta J = 0)$$

88 Friedrich Hermann Hund (1896–1997).
Table 3.4 Quantum Numbers m_l and m_s for Two Electrons in Configuration p², Microstate Number, and Term Symbol Assignment (a)

Microstate      No.   M_L   M_S   Term Symbol   Notes
(1⁺, 1⁺)        --     2     1    None          ³D not allowed
(1⁺, 1⁻)         1     2     0    ¹D            (1⁻, 1⁺) is not a new state
(1⁺, 0⁺)         2     1     1    ³P            (0⁺, 1⁺) is not a new state
(1⁺, 0⁻)         3     1     0    (¹D)          (0⁻, 1⁺) is not a new state
(1⁻, 0⁺)         4     1     0    (³P)          (0⁺, 1⁻) is not a new state
(1⁻, 0⁻)         5     1    −1    (³P)          (0⁻, 1⁻) is not a new state
(1⁺, −1⁺)        6     0     1    (³P)          (−1⁺, 1⁺) is not a new state
(1⁺, −1⁻)        7     0     0    (¹D)          (−1⁻, 1⁺) is not a new state
(1⁻, −1⁺)        8     0     0    (³P)          (−1⁺, 1⁻) is not a new state
(1⁻, −1⁻)        9     0    −1    (³P)          (−1⁻, 1⁻) is not a new state
(0⁺, 0⁺)        --     0     1    None
(0⁺, 0⁻)        10     0     0    ¹S            (0⁻, 0⁺) is not a new state
(0⁻, 0⁻)        --     0    −1    None
(0⁺, −1⁺)       11    −1     1    (³P)          (−1⁺, 0⁺) is not a new state
(0⁺, −1⁻)       12    −1     0    (¹D)          (−1⁻, 0⁺) is not a new state
(0⁻, −1⁺)       13    −1     0    (³P)          (−1⁺, 0⁻) is not a new state
(0⁻, −1⁻)       14    −1    −1    (³P)          (−1⁻, 0⁻) is not a new state
(−1⁺, −1⁺)      --    −2     1    None
(−1⁺, −1⁻)      15    −2     0    (¹D)          (−1⁻, −1⁺) is not a new state
(−1⁻, −1⁻)      --    −2    −1    None

a By Pauli's exclusion principle, the two kets must have different quantum numbers, since they share n and l. Shorthand: (1⁺, 0⁻) means the ket |n 1 +1 +1/2> for electron 1 and the ket |n 1 0 −1/2> for electron 2. The term symbol assignments are also made. Note that ¹D needs five microstates (M_L = 2, 1, 0, −1, −2), and the term ³P needs nine microstates (M_L = 1, 0, −1; M_S = 1, 0, −1), and so on. However, a given microstate with M_L = 0, M_S = 0 may be assigned to several terms; the symbol shown is just for enumeration. The term symbols in parentheses mean that these are part of the manifold of a term assigned previously. Given L_max = 2, S_max = 1, one can consider the following possibilities: ³D, ¹D, ³P, ¹P, and finally ¹S. ³D is disallowed by Pauli's principle; starting with ¹D and its five microstates, then with ³P and its nine microstates, one is left with only one microstate, which must be ¹S. Therefore ¹P is not present. The allowed terms are ¹D, ¹S, and ³P. This analysis will not tell us which is lowest in energy.
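The bookkeeping of Table 3.4 can be automated. The sketch below is an illustration, not the author's procedure as such: it enumerates the Pauli-allowed microstates of a configuration of equivalent electrons, tabulates them by (M_L, M_S), and then repeatedly removes the term with the largest remaining M_L (and, for that M_L, the largest M_S), which is the same peeling-off argument used in the table footnote. The function names `microstates` and `ls_terms` are choices made here.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

LETTERS = "SPDFGHIKLMNOQ"  # spectroscopic letters for L = 0, 1, 2, ... (J is skipped)


def microstates(l, n_electrons):
    """All Pauli-allowed microstates: unordered sets of distinct (m_l, m_s) spin-orbitals."""
    half = Fraction(1, 2)
    spin_orbitals = [(ml, ms) for ml in range(l, -l - 1, -1) for ms in (half, -half)]
    return list(combinations(spin_orbitals, n_electrons))


def ls_terms(l, n_electrons):
    """LS terms of the configuration l^n, returned as (multiplicity, letter) pairs."""
    counts = Counter()
    for state in microstates(l, n_electrons):
        m_l = sum(ml for ml, _ in state)
        m_s = sum(ms for _, ms in state)
        counts[(m_l, m_s)] += 1
    terms = []
    while True:
        remaining = [key for key, n in counts.items() if n > 0]
        if not remaining:
            break
        big_l, big_s = max(remaining)      # highest M_L, then highest M_S
        terms.append((int(2 * big_s + 1), LETTERS[big_l]))
        # Remove one microstate for every (M_L, M_S) cell belonging to this term
        for m_l in range(-big_l, big_l + 1):
            for k in range(int(2 * big_s) + 1):
                counts[(m_l, -big_s + k)] -= 1
    return terms


if __name__ == "__main__":
    print(len(microstates(1, 2)), ls_terms(1, 2))
```

For p² this finds 15 microstates and the terms ¹D, ³P, and ¹S, in agreement with Table 3.4; the same function applied to d² gives ¹G, ³F, ¹D, ³P, and ¹S.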
Table 3.5 Term Symbols for Equivalent Electrons (a)

Number of States   Configuration   Term Symbols (LS Coupling)
1                  s²              ¹S
6                  p¹, p⁵          ²P
15                 p², p⁴          ¹(SD), ³P
20                 p³              ²(PD), ⁴S
10                 d¹, d⁹          ²D
45                 d², d⁸          ¹(SDG), ³(PF)
120                d³, d⁷          ²D, ²(PDFGH), ⁴(PF)
210                d⁴, d⁶          ¹(SDG), ³(PF), ¹(SDFGI), ³(PDFGH), ⁵D
252                d⁵              ²D, ²(PDFGH), ⁴(PF), ²(SDFGI), ⁴(DG), ⁶S
14                 f¹, f¹³         ²F
91                 f², f¹²         ¹(SDGI), ³(PFH)
364                f³, f¹¹         ²(PD₂F₂G₂H₂IKL), ⁴(SDFGI)
1001               f⁴, f¹⁰         ¹(S₂D₄FG₄H₂I₃KL₂N), ³(P₃D₂F₄G₃H₄I₂K₂LM), ⁵(SDFGI)
2002               f⁵, f⁹          ²(P₄D₅F₇G₆H₇I₅K₅L₃M₂NO), ⁴(SP₂D₃F₄G₄H₃I₃K₂LM), ⁶(PFH)
3003               f⁶, f⁸          ¹(S₄PD₆F₄G₈H₄I₇K₃L₄M₂N₂Q), ³(P₆D₅F₉G₇H₉I₆K₆L₃M₃NO), ⁵(SPD₃F₂G₃H₂I₂KL), ⁷F
3432               f⁷              ²(S₂P₅D₇F₁₀G₁₀H₉I₉K₇L₅M₄N₂OQ), ⁴(S₂P₂D₆F₅G₇H₅I₅K₃L₃MN), ⁶(PDFGHI), ⁸S

a For example, for p² Table 3.4 listed the 15 allowed independent microstates and the ¹S, ¹D, and ³P term symbols.
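The "Number of States" column of Table 3.5 is the number of ways of placing k electrons in the 2(2l + 1) spin-orbitals of a subshell, that is, a binomial coefficient. A minimal sketch follows (the function name `number_of_states` is a choice made here):

```python
from math import comb


def number_of_states(l, k):
    """Number of Pauli-allowed microstates for k equivalent electrons with orbital quantum number l."""
    return comb(2 * (2 * l + 1), k)


if __name__ == "__main__":
    for label, l, max_k in (("p", 1, 3), ("d", 2, 5), ("f", 3, 7)):
        print(label, [number_of_states(l, k) for k in range(1, max_k + 1)])
```

Because the binomial coefficient is symmetric, a subshell with k electrons and one with k holes have the same number of states, which is why the table pairs p² with p⁴, d³ with d⁷, and so on.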
FIGURE 3.12 "Grotrian" energy diagram for the neutral carbon atom (called "C I" by the atomic spectroscopists); the configuration is 1s²2s²2p², with the triplet state as the lowest energy by Hund's rules. The singlet (S = 0) energy levels are at left, the triplet (S = 1) levels are at right; the relatively weaker "intersystem crossing" transitions are shown with dashed lines; the stronger electric-dipole-allowed transitions are given with solid lines.
[Diagram: vertical axis in wavenumbers ν/cm⁻¹, with ticks from 0 to 100,000, running from the ground state (0 V) to the top of the diagram (11.217 V). Columns of singlet terms (¹S₀, ¹P₁, ¹D₂, ¹F₃ and their odd-parity counterparts) at left and triplet terms (³S₁, ³P₀,₁,₂, ³D₁,₂,₃, ³F₂,₃,₄ and their odd-parity counterparts) at right, with levels labeled by the excited electron (3s, 4s, 5s; 3p, 4p; 3d, 4d, 5d) or by the configuration 2p³. Legend: "C I", neutral C atom; ground state 1s²2s²2p² ³P₀; excited states 1s²2s¹2p³ and 1s²2s²2p¹ ms, mp, md.]
However, only one electron can jump at a time; the l-value of the jumping electron must change by $\Delta l = \pm 1$ (the parity of the wavefunction must change in an allowed electric dipole transition). When the atom obeys jj-coupling, then $\Delta j = 0, \pm 1$ for the jumping electron, $\Delta j = 0$ for all other electrons, and for the whole atom we have $\Delta J = 0, \pm 1$ (but $J = 0 \not\to J = 0$); $\Delta M_J = \pm 1, 0$ (but $M_J = 0 \not\to M_J = 0$ if $\Delta J = 0$). Figure 3.12 shows the energy levels (the Grotrian⁸⁹ diagram) for neutral carbon. A new wrinkle in term symbols is the superscript o, which indicates "odd parity" in the electron configuration. As seen above, the electron configuration 1s²2s²2p² (which has even parity) splits into two singlet terms ¹S₀ and ¹D₂ and into a triplet term ³P₀,₁,₂. The triplet term ³P₀,₁,₂ is lowest. The so-called "intersystem crossings" from triplets to singlets are "forbidden" by electric dipole selection rules: they are possible by other mechanisms, but are considerably weaker in intensity than the "allowed" transitions (note the old-fashioned Mitteleuropa police-state language!).

89 Walter Grotrian (1890–1954).
3.24 ABSORPTION AND EMISSION OF LIGHT
The scattering and absorption or emission of light by an atom or molecule can be of several types:
(i) Elastic: Thomson⁹⁰
(ii) Elastic: Rayleigh & Mie⁹¹
(iii) Inelastic: Raman⁹²
(iv) Inelastic: Brillouin
(v) Inelastic: Compton⁹³
(vi) Inelastic or elastic: X-ray scattering

Scattering involves a change in direction by the incoming particle when it collides with a stationary particle. The same goes for light particles, or electromagnetic radiation. When a light beam of wavelength $\lambda$ hits a particle of characteristic dimension $r$, the photons are either absorbed or scattered by some scattering angle $\theta$. The important ratio that dictates the type of scattering is

$$x \equiv 2\pi r/\lambda \tag{3.24.1}$$

If $x \ll 1$, that is, if the particles are much smaller than the wavelength of the photon, then elastic Rayleigh scattering occurs; the wavelength of the light does not change, but its direction does, by an angle $\theta$ (see below). In general, an incoming beam of intensity $I$ may lose intensity to scattering processes as it traverses a length $x$ of the target; if the process is uniform, one can presume that a first-order differential equation is operative:

$$dI/dx = -QI \tag{3.24.2}$$

where $Q$ is an "interaction coefficient" and $dx$ is the distance traveled in the target. This can be integrated to yield various equivalent forms:

$$I = I_0 \exp(-Q\Delta x) = I_0 \exp(-\Delta x/\lambda) = I_0 \exp(-\sigma\eta\Delta x) = I_0 \exp(-\rho\Delta x/\tau) \tag{3.24.3}$$

where $I_0$ is the initial flux and the path length is $\Delta x \equiv x - x_0$. The second equality defines an interaction mean free path $\lambda$. The third uses the number of targets per unit volume $\eta$ to define an area cross section $\sigma$, and the last uses the target mass density $\rho$ to define a density mean free path $\tau$. In electromagnetic absorption spectroscopy the interaction coefficient (e.g., $Q$ in cm$^{-1}$) is called opacity, absorption coefficient, or attenuation coefficient. In nuclear physics, area cross sections (e.g., $\sigma$ in barns = $10^{-24}$ cm²), density mean free path (e.g., $\tau$ in g cm$^{-2}$), and its reciprocal, the mass attenuation coefficient (e.g., in cm² g$^{-1}$) or area per nucleon, are all popular, while in electron microscopy the inelastic mean free path (e.g., $\lambda$ in nm) is used.

Sideline: Enrico Fermi coined the term 1 barn ≡ 1.0 × 10$^{-24}$ cm² ≡ 1.0 × 10$^{-28}$ m² to describe, humorously, the likelihood of hitting the side of a Midwestern barn with a baseball (or a high-energy proton); for a baseball batter the cross section would hopefully be at least a barn, while nuclei, being so small, are easily missed.

90 Sir Joseph John Thomson (1856–1940).
91 Gustav Adolf Feodor Wilhelm Ludwig Mie (1869–1957).
92 Sir Chandrasekhara Venkata Raman (1888–1970).
93 Arthur Holly Compton (1892–1962).
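The equivalent forms of Eq. (3.24.3) amount to simple conversions between the interaction coefficient Q, the mean free path, the cross section, and the density mean free path. The sketch below illustrates them; the function names and the numerical inputs are hypothetical values chosen only for the example, not data from the text.

```python
import math


def transmitted_fraction(q, dx):
    """I / I0 = exp(-Q dx), the first form of Eq. (3.24.3)."""
    return math.exp(-q * dx)


def mean_free_path(q):
    """Interaction mean free path: lambda = 1 / Q."""
    return 1.0 / q


def cross_section(q, number_density):
    """Area cross section: sigma = Q / eta."""
    return q / number_density


def density_mean_free_path(q, mass_density):
    """Density mean free path: tau = rho / Q."""
    return mass_density / q


if __name__ == "__main__":
    q = 0.2          # interaction coefficient, cm^-1 (hypothetical)
    eta = 5.0e22     # targets per cm^3 (hypothetical)
    rho = 2.0        # mass density, g cm^-3 (hypothetical)
    dx = 3.0         # path length, cm

    sigma = cross_section(q, eta)
    print(f"I/I0 = {transmitted_fraction(q, dx):.4f}")
    print(f"mean free path = {mean_free_path(q):.2f} cm")
    print(f"cross section = {sigma:.2e} cm^2 = {sigma / 1e-24:.2f} barn")
    print(f"density mean free path = {density_mean_free_path(q, rho):.1f} g cm^-2")
```

All four exponents in Eq. (3.24.3) are the same number written in different units, so any one of the four quantities fixes the others once the number density and mass density of the target are known.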
3.25 THOMSON SCATTERING (ELASTIC)
Thomson scattering is the elastic scattering of electromagnetic radiation by a charged particle (e.g., an electron). The electric and magnetic components of the incident wave accelerate the particle. As it accelerates, it in turn emits radiation with no shift in energy, and thus the wave is scattered. If the particle velocity $v$ is nonrelativistic (i.e., $v \ll c$), the main cause of the acceleration of the particle will be the electric field component of the incident wave. The particle will move in the direction of the oscillating electric field, resulting in electromagnetic dipole radiation: First the incident wave of frequency $\nu$ will force the electron to vibrate at $\nu$; next, the electron will very quickly emit a second wave with frequency $\nu$.

If the wavelength of the electromagnetic radiation is very short (as in cosmic rays), then the Compton effect becomes appreciable: The electron receives an appreciable momentum, and the scattered photon is red-shifted appreciably. For visible light, the Compton effect is minimal, and the only thing the electron does is to vibrate in resonance with the visible light and either re-emit the photon in the "forward direction" with no polarization or re-emit the photon with (i) an unchanged wavelength, (ii) a polarization that is maximized at $\pi/2$ radians (90 degrees) from the direction of the input photon beam, and (iii) a very small phase shift.

The moving particle radiates most strongly in a direction perpendicular to its motion, and that radiation will be polarized along the direction of its motion. Therefore, depending on where an observer is located, the light scattered from a small volume element may appear to be more or less polarized. The electric fields of the incoming and observed beam can be divided up into those components lying in the plane of observation (formed by the incoming and scattered beams) and those components perpendicular to that plane. Those components lying in this plane are referred to as "radial"; those perpendicular to it are "tangential". The amplitude of the scattered radial wave will be proportional to the cosine of $\theta$, the angle between the incident and observed beam (the "scattering angle"); its intensity, which is the square of the amplitude, will then be diminished by a factor of $\cos^2\theta$. The tangential components will not be affected in this way.

The derivation of Thomson scattering from first principles is a bit involved. Given an electron of charge $|e|$ with velocity $\mathbf{v} = c\boldsymbol{\beta}(t)$ at the source point $\mathbf{x}(t)$, where $c$ is the speed of light in vacuum, the scalar and vector potentials $\phi(\mathbf{r}, t)$ and $\mathbf{A}(\mathbf{r}, t)$, respectively, at the field point $\mathbf{r}(t)$, are given by the Liénard⁹⁴–Wiechert⁹⁵ expressions:

$$\phi(\mathbf{r}, t) = |e| \left[ (1 - \boldsymbol{\beta} \cdot (\mathbf{x} - \mathbf{r}) |\mathbf{x} - \mathbf{r}|^{-1})^{-1}\, |\mathbf{x} - \mathbf{r}|^{-1} \right]_{t'} \quad \text{(cgs)} \tag{3.25.1}$$

$$\mathbf{A}(\mathbf{r}, t) = |e| \left[ \boldsymbol{\beta}\, (1 - \boldsymbol{\beta} \cdot (\mathbf{x} - \mathbf{r}) |\mathbf{x} - \mathbf{r}|^{-1})^{-1}\, |\mathbf{x} - \mathbf{r}|^{-1} \right]_{t'} \quad \text{(cgs)} \tag{3.25.2}$$

94 Alfred-Marie Liénard (1869–1958).
95 Emil Johann Wiechert (1861–1928).
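where the expressions are evaluated at the retarded time $t' = t - |\mathbf{x} - \mathbf{r}(t')|\, c^{-1}$. After considerable mathematical labor, the electric field $\mathbf{E}(\mathbf{r}, t)$ becomes the sum of two terms: (a) the velocity field $\mathbf{E}_{vel}$ that does not depend on the acceleration of the electron and (b) an acceleration field $\mathbf{E}_{acc}$ that does:

$$\mathbf{E}(\mathbf{r}, t) = \mathbf{E}_{vel}(\mathbf{r}, t) + \mathbf{E}_{acc}(\mathbf{r}, t) \quad \text{(cgs)} \tag{3.25.3}$$

$$\mathbf{E}_{vel}(\mathbf{r}, t) = |e| \left[ (R^{-1}\mathbf{R} - \boldsymbol{\beta})(1 - \boldsymbol{\beta} \cdot \boldsymbol{\beta})\, R^{-2}\, (1 - R^{-1}\mathbf{R} \cdot \boldsymbol{\beta})^{-3} \right]_{t'} \quad \text{(cgs)} \tag{3.25.4}$$

$$\mathbf{E}_{acc}(\mathbf{r}, t) = |e| c^{-1} \left[ R^{-1}\mathbf{R} \times \{ (R^{-1}\mathbf{R} - \boldsymbol{\beta}) \times (d\boldsymbol{\beta}/dt) \}\, R^{-1}\, (1 - R^{-1}\mathbf{R} \cdot \boldsymbol{\beta})^{-3} \right]_{t'} \quad \text{(cgs)} \tag{3.25.5}$$

The magnetic induction for that electron is given simply by

$$\mathbf{B} = \mathbf{R} R^{-1} \times \mathbf{E} \quad \text{(cgs)} \tag{3.25.6}$$

The acceleration field is the only term that involves emitting radiation. If the reference frame in which the electron is observed is moving slowly relative to the speed of light, $\dot{R} \ll c$, then $\mathbf{E}_{acc}(\mathbf{r}, t)$ simplifies to

$$\mathbf{E}_{acc}(\mathbf{r}, t) = |e| c^{-1} \left[ R^{-2}\, \mathbf{R} \times (\mathbf{R} \times (d\boldsymbol{\beta}/dt))\, R^{-1} \right]_{t'} \quad \text{(cgs)} \tag{3.25.7}$$

and the instantaneous flux is given by the Poynting vector $\mathbf{S}$:

$$\mathbf{S} = (c/4\pi)\, \mathbf{E}_{acc}(\mathbf{r}, t) \times \mathbf{B} = (c/4\pi)\, |\mathbf{E}_{acc}(\mathbf{r}, t) \cdot \mathbf{E}_{acc}(\mathbf{r}, t)|\, R^{-1}\mathbf{R} \quad \text{(cgs)} \tag{3.25.8}$$

whence the power radiated per unit solid angle ($dP/d\Omega$) becomes

$$(dP/d\Omega) = (c/4\pi)\, |R\, \mathbf{E}_{acc}(\mathbf{r}, t)|^2 = (e^2/4\pi c) \left| \left[ R^{-2}\, \mathbf{R} \times (\mathbf{R} \times (d\boldsymbol{\beta}/dt)) \right]_{t'} \right|^2 \tag{3.25.9}$$

and if $\Theta$ is the angle between the acceleration ($d\mathbf{v}/dt$) and $\mathbf{R}$, then

$$(dP/d\Omega) = (e^2/4\pi c^3)(dv/dt)^2 \sin^2\Theta \quad \text{(cgs)} \tag{3.25.10}$$

and the total instantaneous power emission per electron is then given by the Larmor result:

$$P = (2e^2/3c^3)(dv/dt)^2 \quad \text{(cgs)} \tag{3.25.11}$$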
Given an incoming electromagnetic wave with a polarization vector $\boldsymbol{\psi}$:

$$\mathbf{E}(\mathbf{x},t) = \boldsymbol{\psi}E_0\cos(\mathbf{k}\cdot\mathbf{x} - 2\pi\nu t) \tag{3.25.12}$$

using the Lorentz force equation from Section 2.7, the acceleration becomes

$$d\mathbf{v}/dt = \boldsymbol{\psi}\,(|e|/m)\,E_0\cos(\mathbf{k}\cdot\mathbf{x} - 2\pi\nu t) \tag{3.25.13}$$

If the charge moves a negligible part of a wavelength during one oscillation, then the time average of $(dv/dt)^2$ is $(1/2)\,|(d\mathbf{v}/dt)\cdot(d\mathbf{v}^*/dt)|$, that is, half the squared amplitude, so

$$\langle dP/d\Omega\rangle = (c/8\pi)\,E_0^2\,(e^2/mc^2)^2\sin^2\Theta \quad \text{(cgs)} \tag{3.25.14}$$

and the scattering cross section, defined by $(d\sigma/d\Omega) \equiv$ (energy radiated per unit time and unit solid angle)/(incident energy per unit area and unit time), becomes

$$d\sigma/d\Omega = (e^2/mc^2)^2\sin^2\Theta \quad \text{(cgs)} \tag{3.25.15}$$
If in spherical polar coordinates the polarization vector $\boldsymbol{\psi}$ makes an angle $\psi$ with the x axis, and the field vector $\mathbf{R}$ makes a spherical co-latitude angle $\theta$ with the z axis and an azimuthal angle $\phi$ with the x axis, then

$$d\sigma/d\Omega = (e^2/mc^2)^2\,(1 - \sin^2\theta\cos^2(\phi - \psi)) \quad \text{(cgs)} \tag{3.25.16}$$

For unpolarized radiation the cross section is given by averaging over the angle $\psi$:

$$d\sigma/d\Omega = (e^2/mc^2)^2\,(1/2)(1 + \cos^2\theta) \quad \text{(cgs)} \tag{3.25.17}$$
This is the Thomson formula for scattering from a free charge. The differential Thomson cross section $\sigma$ is given by the angle-independent part of Eq. (3.25.17):

$$\sigma = (e^2/4\pi\varepsilon_0 mc^2)^2 \ \ \text{(SI)}; \qquad \sigma \equiv (e^2/mc^2)^2 \ \ \text{(cgs)} \tag{3.25.18}$$

which for a single electron is $\sigma = 7.9407875 \times 10^{-26}$ cm²/steradian. It is independent of wavelength. After integrating over all angles, the total Thomson cross section is

$$\sigma_T = (8\pi/3)(e^2/4\pi\varepsilon_0 mc^2)^2 \ \ \text{(SI)}; \qquad \sigma_T \equiv (8\pi/3)(e^2/mc^2)^2 \ \ \text{(cgs)} \tag{3.25.19}$$

which for a single electron is $\sigma_T = 6.6524586 \times 10^{-25}$ cm² $= 0.66524586$ barns. The classical electron radius $r_0$ is given by

$$r_0 \equiv e^2/4\pi\varepsilon_0 mc^2 \ \ \text{(SI)}; \qquad r_0 \equiv e^2/mc^2 \ \ \text{(cgs)} \tag{3.25.20}$$

where $r_0 = 2.81794 \times 10^{-15}$ m $= 2.81794$ fm.
This same result was derived quite simply in Problem 2.11.1. This result is also valid in quantum theory, provided relativistic effects can be neglected.

Alternatively, Thomson scattering can be described by an emission coefficient $\eta$, where $\eta\,dt\,dV\,d\Omega\,d\lambda$ is the energy scattered by a volume element $dV$ in time $dt$ into solid angle $d\Omega$ between wavelengths $\lambda$ and $\lambda + d\lambda$. From the point of view of an observer, there are two emission coefficients, $\eta_{rad}$ for radially polarized light and $\eta_{tan}$ for tangentially polarized light. Since only the radial component is diminished by the factor $\cos^2\theta$, for unpolarized incident light these are

$$\eta_{rad} = \pi\sigma_T I_0 n\,(1/2)\cos^2\theta \tag{3.25.21}$$

$$\eta_{tan} = \pi\sigma_T I_0 n\,(1/2) \tag{3.25.22}$$
where n is the density of charged particles at the scattering point and I0 is incident flux (i.e., energy/time/area/wavelength). The rest goes as above. PROBLEM 3.25.1. Derive Eq. (3.25.19) from Eq. (3.25.17).
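The numerical values quoted for Eqs. (3.25.18) to (3.25.20) can be checked with a short script. The sketch below is only illustrative: it assumes CODATA 2018 values for the constants (they are not listed in the text), and it integrates Eq. (3.25.17) over the sphere with a simple midpoint rule as a numerical check of Eq. (3.25.19); the function names are hypothetical.

```python
import math

# CODATA 2018 constants in SI units (assumed here; not quoted in the text)
E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
M_E = 9.1093837015e-31       # electron mass, kg
C = 299792458.0              # speed of light, m/s


def classical_electron_radius():
    """r0 = e^2 / (4 pi eps0 m c^2), Eq. (3.25.20) in SI form, in meters."""
    return E_CHARGE**2 / (4.0 * math.pi * EPS0 * M_E * C**2)


def thomson_differential(theta, r0):
    """Unpolarized differential cross section, Eq. (3.25.17), in m^2/sr."""
    return r0**2 * 0.5 * (1.0 + math.cos(theta)**2)


def thomson_total_numeric(r0, steps=20000):
    """Integrate Eq. (3.25.17) over the sphere, with dOmega = 2 pi sin(theta) dtheta."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h  # midpoint of the i-th interval
        total += thomson_differential(theta, r0) * 2.0 * math.pi * math.sin(theta) * h
    return total


r0 = classical_electron_radius()
sigma = r0**2                               # Eq. (3.25.18), m^2 per steradian
sigma_t = (8.0 * math.pi / 3.0) * r0**2     # Eq. (3.25.19), m^2

print(f"r0      = {r0:.5e} m")
print(f"sigma   = {sigma * 1e4:.5e} cm^2/sr")
print(f"sigma_T = {sigma_t * 1e4:.5e} cm^2 = {sigma_t / 1e-28:.5f} barn")
print(f"numeric integral / analytic = {thomson_total_numeric(r0) / sigma_t:.8f}")
```

The ratio printed on the last line should be very close to 1, which is the content of Problem 3.25.1 in numerical form.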
3.26 COMPTON SCATTERING (INELASTIC)

Compton scattering, or the Compton effect, is the decrease in energy of an X-ray or gamma-ray photon when it interacts with matter. The amount by which the wavelength changes is called the Compton shift. Although nuclear Compton scattering exists, Compton scattering usually refers to the interaction involving only the electrons of an atom. Inverse Compton scattering also exists, where the photon gains energy (decreasing in wavelength) upon interaction with matter. The Compton effect (Fig. 3.13) demonstrates that light cannot be explained purely as a wave phenomenon; in this experiment, light behaves as a stream of particles called photons, whose energy is proportional to the frequency. If the photon is of lower energy, but still has sufficient energy (in general a few eV, right around the energy of visible light), it can eject an electron from its host atom entirely (photoelectric effect), instead of undergoing Compton scattering. Higher-energy photons (in the MeV range) may be able to bombard the nucleus and cause an electron and a positron to be formed (pair production).

FIGURE 3.13 The Compton effect. [Diagram: an incident photon ($E_1 = h\nu$, $p_1 = h\nu/c$) strikes an electron; the scattered photon ($E_2 = h\nu'$, $p_2 = h\nu'/c$) leaves at angle $\theta$, and the scattered electron ($E_e$, $p_e$) leaves at angle $\phi$.]

Compton used three results, namely, (i) light as a particle, (ii) special theory of relativity, and (iii) law of cosines, to yield the Compton scattering equation

$$\lambda' - \lambda = (h/m_e c)(1 - \cos\theta) \tag{3.26.1}$$
where $h$ is Planck’s constant, $m_e$ is the electron mass, $c$ is the speed of light, $\lambda$ is the wavelength of the incident photon, $\lambda'$ is the (longer) wavelength of the Compton-scattered photon, and $\theta$ is the scattering angle. Here $(h/m_e c) = 2.426 \times 10^{-12}$ m $= 0.02426$ Å $= 0.002426$ nm is the Compton wavelength, which is half the maximum change of wavelength for the photon (the maximum, $2h/m_e c$, occurs when the photon turns around and scatters to the left in Fig. 3.13: $\theta = \pi$). For visible light (e.g., $\lambda = 500$ nm), the Compton effect is too small to be measurable. The Klein–Nishina96 formulas give the total cross section for Compton scattering as

$$\sigma_{KN} = (e^2/mc^2)^2\{(8\pi/3)(1 - 2h\nu/mc^2 + \cdots)\} \quad \text{(cgs)} \tag{3.26.2}$$

for the case $h\nu \ll mc^2$ and as

$$\sigma_{KN} = (e^2/mc^2)^2\{(\pi mc^2/h\nu)[\ln(2h\nu/mc^2) + (1/2)]\} \quad \text{(cgs)} \tag{3.26.3}$$

for the case $h\nu \gg mc^2$.

PROBLEM 3.26.1. Derive Eq. (3.26.1) (see Fig. 3.13), by using conservation of energy and momentum and the relativistic result $E_e^2 - p_e^2c^2 = m_e^2c^4$.
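As a quick numerical illustration of Eq. (3.26.1), the sketch below evaluates the Compton shift at a few scattering angles and compares it with the incident wavelength. The constants are CODATA 2018 values and the 0.0709 nm (Mo Kα-like) X-ray wavelength is an assumed example, not taken from the text; the function names are hypothetical.

```python
import math

# CODATA 2018 constants in SI units (assumed; not quoted in the text)
H = 6.62607015e-34       # Planck constant, J s
M_E = 9.1093837015e-31   # electron mass, kg
C = 299792458.0          # speed of light, m/s

COMPTON_WAVELENGTH = H / (M_E * C)  # h / (m_e c), in meters


def compton_shift(theta):
    """Wavelength increase lambda' - lambda of Eq. (3.26.1), in meters."""
    return COMPTON_WAVELENGTH * (1.0 - math.cos(theta))


def scattered_wavelength(wavelength, theta):
    """Wavelength of the scattered photon; always >= the incident wavelength."""
    return wavelength + compton_shift(theta)


if __name__ == "__main__":
    print(f"h/(m_e c) = {COMPTON_WAVELENGTH:.4e} m")
    # Illustrative wavelengths: green visible light and an X-ray line (assumed values)
    for label, lam in (("visible, 500 nm", 500e-9), ("X-ray, 0.0709 nm", 0.0709e-9)):
        shift = compton_shift(math.pi)  # backscattering gives the maximum shift
        print(f"{label}: max relative shift = {shift / lam:.2e}")
```

For visible light the maximum relative shift is of order $10^{-5}$, which is why the effect is negligible there, while for X rays it reaches several percent.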
96 Yoshio Nishina (1890–1951).

3.27 RAYLEIGH AND MIE SCATTERING (ELASTIC)

Rayleigh scattering occurs in transparent solids and liquids, but is most prominently seen in gases. Rayleigh scattering of sunlight on a cloudless day at high noon is why the sky is so blue: Rayleigh and cloud-mediated scattering contribute to diffuse light (direct light being sunrays). At sunset, however, you look at a grazing angle through a lot of the atmosphere, and the sky is red. When the scattering is due to particles of sizes similar to or larger than a wavelength ($x \gtrsim \lambda$), then Mie scattering occurs. The intensity $I$ of light scattered from an incident unpolarized beam of wavelength $\lambda$ and incident intensity $I_0$ by a single small particle of diameter $d$ and refractive index $n$ by the elastic, coherent Rayleigh process is given by

$$I = I_0\,(\pi^4/8)\,\lambda^{-4}\,(n^2 - 1)^2\,(n^2 + 2)^{-2}\,d^6\,(1 + \cos^2\theta)\,R^{-2} \tag{3.27.1}$$

where $R$ is the distance from the particle to the point where $I$ is measured, and $\theta$ is the scattering angle. The angular distribution of Rayleigh scattering, governed by the $(1 + \cos^2\theta)$ term, is symmetric about the plane normal to the incident direction of the light; the forward scattering intensity ($0 \le \theta \le \pi/2$) equals the backward scattering intensity ($\pi/2 \le \theta \le \pi$). Integrating over the whole sphere surrounding the particle gives the total Rayleigh scattering cross section $\sigma_R$:

$$\sigma_R = (2/3)\,\pi^5\,d^6\,\lambda^{-4}\,(n^2 - 1)^2\,(n^2 + 2)^{-2} \tag{3.27.2}$$

PROBLEM 3.27.1. Prove Eq. (3.27.2).

The Rayleigh scattering cross section for a dilute assembly of $N$ scattering particles per unit volume is $N$ times the cross section per particle. Note that there seems little in common between the formulas for Thomson and Rayleigh scattering. A 5-mW green laser pointer is visible at night, due to Rayleigh scattering and airborne dust. An individual molecule does not have a well-defined refractive index $n$ or diameter $d$, but does have a measurable polarizability $\alpha$, which describes how much the electrical charges on the molecule will move in an electric field. Then, by setting $\alpha = [(n^2 - 1)/(n^2 + 2)](d/2)^3$, that is, by replacing $(n^2 - 1)^2(n^2 + 2)^{-2}d^6$ by $64\alpha^2$, the Rayleigh scattering intensity for a single molecule becomes

$$I = I_0\,8\pi^4\,\alpha^2\,\lambda^{-4}\,R^{-2}\,(1 + \cos^2\theta) \tag{3.27.3}$$

where $I_0$ is the incoming light intensity, and $I$ is the intensity scattered at a scattering angle $\theta$ at a distance $R$ from the scattering center. The amount of Rayleigh scattering from a single molecule can also be expressed as a total cross section $\sigma_R$:

$$\sigma_R = (128/3)\,\pi^5\,\alpha^2\,\lambda^{-4} \tag{3.27.4}$$
For example, the major constituent of the atmosphere, the nitrogen molecule, N2, has a Rayleigh cross section $\sigma_R = 5.1 \times 10^{-31}$ m² at a wavelength $\lambda = 532$ nm (green light). This means that at atmospheric pressure, a fraction of about $10^{-5}$ of the light will be scattered for every meter of travel. Since the scattering has a $\lambda^{-4}$ dependence, blue light is scattered much more than red light. In the atmosphere, this results in blue wavelengths being scattered to a greater extent than longer (red) wavelengths, and so one sees blue light coming from all regions of the sky.

Mie theory, or Lorenz97–Mie theory or Lorenz–Mie–Debye theory, is a complete analytical solution of Maxwell’s equations for the scattering of electromagnetic radiation by spherical particles with local dipoles (also called Mie scattering). As an improvement to Rayleigh’s treatment, the Lorenz–Mie–Debye solution to the scattering problem is valid for all possible $x$ [Eq. (3.19.1)], although the technique results in numerical summation of infinite sums. The incident plane wave and the internal and scattering fields are expanded into radiating spherical vector wavefunctions.
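The estimate of a scattered fraction of about $10^{-5}$ per meter can be reproduced from the quoted N2 cross section and the ideal-gas number density. The sketch below assumes a temperature of 293.15 K and a pressure of 101325 Pa, treats the air as pure N2, and compares 450 nm (blue) with 700 nm (red) light; all of these inputs and the function names are illustrative assumptions, not values from the text.

```python
K_B = 1.380649e-23  # Boltzmann constant, J/K (SI exact value)


def number_density(pressure_pa, temperature_k):
    """Ideal-gas number density N = p / (k_B T), in molecules per m^3."""
    return pressure_pa / (K_B * temperature_k)


def scattered_fraction_per_meter(sigma_m2, density_m3):
    """Fraction of light scattered per meter, sigma * N (valid while it is << 1)."""
    return sigma_m2 * density_m3


def rayleigh_ratio(short_wavelength, long_wavelength):
    """Ratio of scattering strengths from the lambda^-4 dependence."""
    return (long_wavelength / short_wavelength) ** 4


SIGMA_N2_532NM = 5.1e-31  # m^2, value quoted in the text for N2 at 532 nm

density = number_density(101325.0, 293.15)        # assumed ambient conditions
fraction = scattered_fraction_per_meter(SIGMA_N2_532NM, density)
blue_over_red = rayleigh_ratio(450e-9, 700e-9)    # assumed representative wavelengths

print(f"N = {density:.3e} m^-3")
print(f"scattered fraction per meter = {fraction:.2e}")
print(f"blue (450 nm) / red (700 nm) scattering = {blue_over_red:.2f}")
```

With these assumptions, blue light is scattered roughly six times more strongly than red light.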
97 Ludvig Valentin Lorenz (1829–1891).
3.28 RAMAN SCATTERING (INELASTIC)

The Raman scattering of photons, discovered in 1928 by Raman and Krishnan98 in liquids, and by Landsberg99 and Mandelstam100 in crystals, is an inelastic process that depends on the static polarizability of molecules. The Germans call it the Smekal101–Raman effect, honoring the previous work of Smekal. It consists of the absorption of an incident photon of wavelength $\lambda_0$ and frequency $\nu_0 = c/\lambda_0$, and the almost immediate emission of either (i) a photon of longer wavelength (and smaller energy) $\lambda_S = c/\nu_S$ (Stokes102 line), because the rest of the energy was retained by the molecule in a molecular vibration $\nu_{vib}$, or (ii) a photon of shorter wavelength $\lambda_{AS} = c/\nu_{AS}$, because a molecular vibration quantum $\nu_{vib}$ was added to the photon energy (anti-Stokes line). Thus:

$$\nu_S = \nu_0 - \nu_{vib} \tag{3.28.1}$$

$$\nu_{AS} = \nu_0 + \nu_{vib} \tag{3.28.2}$$

The shifts by plus or minus $\nu_{vib}$ are collectively known as Stokes shifts; the anti-Stokes intensities are weaker than the Stokes intensities (because for an anti-Stokes line the molecule must first be in an excited vibrational level, and this brings in a Boltzmann factor). The Raman effect depends on the rate of change of polarizability of the molecule (or molecules) with bond length change. Its selection rules are often complementary to the electric-dipole selection rules for the absorption of infrared light. The Raman light source is usually in the visible range and must be intense, because the Raman process is relatively weak; since the 1960s, laser light sources are most often used. The Raman effect differs from fluorescence: In fluorescence, the incident light is absorbed under resonance conditions, then the molecule is excited, and finally the molecule is de-excited by various mechanisms, including fluorescent emission after a certain resonance lifetime. In contrast, the Raman effect is immediate, nonresonant, and inelastic. The quantitative treatment requires quantum mechanics.
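Equations (3.28.1) and (3.28.2) are often applied in wavenumber units, where they read $\tilde{\nu}_S = \tilde{\nu}_0 - \tilde{\nu}_{vib}$ and $\tilde{\nu}_{AS} = \tilde{\nu}_0 + \tilde{\nu}_{vib}$. The sketch below uses a hypothetical 532 nm laser and a hypothetical 1000 cm⁻¹ vibration (neither value is from the text) to locate the two lines and to evaluate the Boltzmann population factor of the excited vibrational level, which is only the population part of the anti-Stokes to Stokes intensity ratio.

```python
import math

# CODATA 2018 / SI exact constants (assumed; not quoted in the text)
H = 6.62607015e-34   # Planck constant, J s
C = 299792458.0      # speed of light, m/s
K_B = 1.380649e-23   # Boltzmann constant, J/K


def raman_lines_nm(laser_nm, shift_cm1):
    """Return (Stokes, anti-Stokes) wavelengths in nm for a given Raman shift."""
    laser_cm1 = 1.0e7 / laser_nm                 # laser wavenumber in cm^-1
    stokes_nm = 1.0e7 / (laser_cm1 - shift_cm1)  # Eq. (3.28.1) in wavenumbers
    anti_stokes_nm = 1.0e7 / (laser_cm1 + shift_cm1)  # Eq. (3.28.2) in wavenumbers
    return stokes_nm, anti_stokes_nm


def vibrational_boltzmann_factor(shift_cm1, temperature_k):
    """exp(-h c nu~ / k_B T): relative population of the first excited level."""
    energy_j = H * C * (shift_cm1 * 100.0)       # convert cm^-1 to m^-1
    return math.exp(-energy_j / (K_B * temperature_k))


# Hypothetical example values
stokes, anti_stokes = raman_lines_nm(532.0, 1000.0)
factor = vibrational_boltzmann_factor(1000.0, 300.0)

print(f"Stokes line:      {stokes:.2f} nm")
print(f"anti-Stokes line: {anti_stokes:.2f} nm")
print(f"Boltzmann factor at 300 K: {factor:.4f}")
```

At room temperature the population factor for a 1000 cm⁻¹ mode is below 1%, consistent with the weakness of anti-Stokes lines.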
3.29 BRILLOUIN SCATTERING (INELASTIC)

Brillouin scattering occurs when light traversing a medium (such as water or a crystal) interacts with time-dependent changes in density, which changes the frequency and path of the scattered photon. The density variations may be due to acoustic modes (e.g., phonons), or magnetic modes (e.g., magnons), or temperature gradients. As described in classical physics, when the medium is compressed, its index of refraction changes and the light’s path necessarily bends. Stokes shifts exist here too, but are called Brillouin shifts. Brillouin scattering is inelastic and nonresonant and is conceptually similar to Raman scattering, except that the emphasis is on macroscopic changes in the medium, rather than nanoscopic changes in individual molecules. Experimentally, the frequency ranges for Brillouin scattering (GHz) are lower than those for Raman scattering (THz), because they engage phonons or magnons of relatively smaller energy.

98 Karlamanikkam Srinavasa Krishnan (1898–1961).
99 Grigory Landsberg (1890–1957).
100 Leonid Isaakovich Mandelstam (1879–1944).
101 Adolf Smekal (1895–1959).
102 George Gabriel Stokes (1819–1903).
3.30 X-RAY SCATTERING (ELASTIC AND INELASTIC)

When the photon source has the wavelength of X rays (0.05 nm to 5 nm), several processes can occur: diffraction, absorption (by atoms), and scattering (Bragg’s103 law or other). The general formula for scattered intensities (away from X-ray absorption edges) in the Thomson formula is

$$I(R,\phi) = I_0\,\{e^4(4\pi\varepsilon_0)^{-2}m^{-2}c^{-4}\}\,R^{-2}\,(N/2)\,\{1 + \cos^2\phi\}\,L(\phi)\,\exp[-2B\lambda^{-2}\sin^2\phi]\,I_{hkl}(\phi) \quad \text{(SI)} \tag{3.30.1}$$
where $m$ is the electron mass, $c$ is the speed of light, and the factor $\{e^4(4\pi\varepsilon_0)^{-2}m^{-2}c^{-4}\} = 7.9407875 \times 10^{-30}$ m² is the differential Thomson scattering cross section for one electron, Eq. (3.25.18), derived earlier. Furthermore, $N$ is the number of electrons in the sample (equal to Avogadro’s104 number for a crystal containing one mole of hydrogen atoms), $\phi$ is the scattering angle (twice the Bragg angle), $\lambda$ is the X-ray wavelength, $(1/2)(1 + \cos^2\phi)$ is the polarization factor, $L(\phi)$ is the Lorentz factor (a simple trigonometric expression that corrects for the varying time during which a reciprocal lattice point is measured, and depends on the geometry of the data collection), $B = 8\pi^2\langle u^2\rangle$ is the Debye–Waller105 factor (thermal broadening of each peak due to vibrations of the contributing atoms, of mean-square displacement $\langle u^2\rangle$, around their mean positions), and $I_{hkl}(\phi)$ is defined below. The scattering angle $\phi$ ranges from 0 to $\pi$ (backscattering). If the scattering angle $\phi$ is twice a Bragg scattering angle $\theta_{hkl}$:

$$n\lambda = 2d_{hkl}\sin\theta_{hkl} \tag{3.30.2}$$

usually (but not always) $n$ is set to $n = 1$, and the distance between (imaginary) Bragg planes containing diffracting material is given by

$$d_{hkl} = |h\mathbf{a}^* + k\mathbf{b}^* + l\mathbf{c}^*|^{-1} \tag{3.30.3}$$

where $\mathbf{a}^*$, $\mathbf{b}^*$, and $\mathbf{c}^*$ are the reciprocal lattice vectors, and the Miller106 indices $h$, $k$, and $l$ are integers (positive, negative, or zero).
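Equations (3.30.2) and (3.30.3) can be combined into a small calculation of the Bragg angle for a given reflection. The sketch below assumes a hypothetical cubic cell with $a = 4.0$ Å, for which the reciprocal lattice vectors are orthogonal with length $1/a$, and Cu Kα radiation ($\lambda = 1.5418$ Å); the cell, the wavelength, and the function names are illustrative assumptions, not values from the text.

```python
import math


def d_spacing(h, k, l, a_star, b_star, c_star):
    """d_hkl = 1 / |h a* + k b* + l c*|, Eq. (3.30.3); vectors are 3-tuples."""
    g = [h * a_star[i] + k * b_star[i] + l * c_star[i] for i in range(3)]
    return 1.0 / math.sqrt(sum(component * component for component in g))


def bragg_angle(d, wavelength, order=1):
    """Bragg angle theta (radians) from n lambda = 2 d sin(theta), Eq. (3.30.2)."""
    sine = order * wavelength / (2.0 * d)
    if sine > 1.0:
        raise ValueError("reflection not observable at this wavelength")
    return math.asin(sine)


# Hypothetical cubic cell, a = 4.0 Angstrom; reciprocal vectors in 1/Angstrom
A_CELL = 4.0
a_star = (1.0 / A_CELL, 0.0, 0.0)
b_star = (0.0, 1.0 / A_CELL, 0.0)
c_star = (0.0, 0.0, 1.0 / A_CELL)

CU_K_ALPHA = 1.5418  # Angstrom, mean Cu K-alpha wavelength (assumed source)

d_111 = d_spacing(1, 1, 1, a_star, b_star, c_star)
theta_111 = bragg_angle(d_111, CU_K_ALPHA)

print(f"d(111) = {d_111:.4f} Angstrom")
print(f"theta  = {math.degrees(theta_111):.2f} deg, "
      f"scattering angle = {2 * math.degrees(theta_111):.2f} deg")
```

For a cubic cell this reduces to the familiar $d_{hkl} = a/\sqrt{h^2 + k^2 + l^2}$, but the function accepts any set of reciprocal lattice vectors.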
103 Sir William Lawrence Bragg (1890–1971).
104 Lorenzo Romano Amedeo Carlo Bernadette Avogadro, conte di Quaregna e Cerreto (1776–1856).
105 Ivar Waller (1898–1991).
106 William Hallowes Miller (1801–1880).
Finally, $I_{hkl}$ is the absolute square of the all-important structure factor $F_{hkl}(\theta)$:

$$I_{hkl}(\theta) = |F_{hkl}(\theta)|^2 \tag{3.30.4}$$

$$F_{hkl}(\theta) = \sum_{j=1}^{n} f_j \exp[2\pi i(hx_j + ky_j + lz_j)] \tag{3.30.5}$$
which consists of the atomic scattering factor $f_j$ (amplitude of scattering) of atom $j$ in the primitive unit cell, and atom $j$ is at position $\{(x_j, y_j, z_j),\ j = 1, 2, \ldots, n\}$ in the primitive unit cell (with axes $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$). The atomic scattering factor $f_j$ is obtained (by Rayleigh–Schrödinger perturbation methods or other methods) from the atomic wavefunction; at $\theta = 0$ it is equal to the number of electrons for that atom, and it decays in an almost Gaussian fashion as the scattering angle increases.

In practice, a data collection of several thousand observed intensities $I_{obs}(R,\phi)$ (on an arbitrary scale) is first corrected for Lorentz and polarization effects and then put on an “absolute scale” by means of a Wilson plot of $\ln[I_{obs}(R,\phi)/\sum_{j=1}^{n} f_j^2]$ versus $[\sin\theta/\lambda]^2$, that is, of ratios of observed (random-phased) intensities to expected scattering factors of the atoms known to be in the unit cell. Its intercept at zero angle provides the necessary scale factor, and thus the $I_{obs}(R,\phi)$ are scaled to become $I_{hkl}(\theta)$.

The absolute square in Eq. (3.30.4) implies that the diffraction intensity $I_{hkl}(\theta)$ does not have an explicit phase and therefore masks the atom positions $\{(x_j, y_j, z_j),\ j = 1, 2, \ldots, n\}$, the main goal of X-ray structure determination. This “phase problem” frustrated crystallographers for decades. However, when one compares the experimental data (thousands of different diffraction intensities $I_{hkl}$) with the goal (a few hundred atomic positions and their thermal ellipsoid parameters $B$), one sees that this is a mathematically overdetermined problem. Therefore, by first guessing the relative phases of some of the most intense low-order reflections, one can systematically exploit mutual relationships between intensities that share certain Miller indices, to build a list of many more, statistically likely mutual phases.
Finally, a likely and chemically reasonable trial structure is obtained, whose correctness is proven by least-squares refinement. This has made large-angle X-ray structure determination easy for maybe 90% of the data sets collected.
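The structure factor sum of Eq. (3.30.5) and the loss of phase in Eq. (3.30.4) can be illustrated with a toy calculation. The sketch below assumes a hypothetical body-centered cell with two identical atoms and an angle-independent scattering factor $f_j = 1$ (a simplification; real $f_j$ decay with scattering angle), and the function names are illustrative.

```python
import cmath
import math


def structure_factor(hkl, atoms):
    """F_hkl = sum_j f_j exp[2 pi i (h x_j + k y_j + l z_j)], Eq. (3.30.5).

    atoms is a list of (f_j, (x_j, y_j, z_j)) with fractional coordinates.
    """
    h, k, l = hkl
    return sum(
        f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
        for f, (x, y, z) in atoms
    )


def intensity(hkl, atoms):
    """I_hkl = |F_hkl|^2, Eq. (3.30.4); the phase of F_hkl is lost here."""
    return abs(structure_factor(hkl, atoms)) ** 2


# Hypothetical body-centered cell: identical atoms at the corner and the body center
bcc_atoms = [
    (1.0, (0.0, 0.0, 0.0)),
    (1.0, (0.5, 0.5, 0.5)),
]

for hkl in [(1, 0, 0), (1, 1, 0), (1, 1, 1), (2, 0, 0)]:
    print(hkl, f"I = {intensity(hkl, bcc_atoms):.3f}")
```

Reflections with $h + k + l$ odd vanish for this cell (a systematic absence), while the others have intensity $4f^2$; the intensities alone do not reveal the sign or phase of each $F_{hkl}$, which is the phase problem described above.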
3.31 BEER–BOUGUER–LAMBERT LAW, OR BEER’S LAW

The interaction between electromagnetic radiation and atoms or molecules is now discussed by empirical methods, then by semiclassical arguments, and finally by quantum theory. Quantitative data about the intensity of absorption of energy from a radiation field were discussed by Bouguer107 in 1729, Lambert108 in 1760, and Beer109 in 1852.

FIGURE 3.14 Light beam of intensity $I(\nu)$ and the Bouguer–Lambert–Beer law, or Beer’s law. [Diagram: a beam of incident intensity $I_0(\nu)$ enters a cell of cross-sectional area $A$ at $x = 0$ and leaves with intensity $I_B(\nu)$ at $x = B$; $I(x)$ is the intensity within a thin slab $dx$.]

107 Pierre Bouguer (1698–1758).
108 Johann Heinrich Lambert (1728–1777).
109 August Beer (1825–1863).
In Fig. 3.14, we assume that a beam of light intensity $I(\nu)$ (cgs: erg cm⁻² s⁻¹; SI: J m⁻² s⁻¹) passes through a surface $A$ of unit area (cgs: 1 cm²; SI: 1 m²) in 1 s. Since the speed of light is $c$, the light that passes through the area $A$ in 1 s will traverse a distance $B$; therefore the energy $I(\nu)\,d\nu$ will occupy a volume $AB$, and the density of radiation $\rho(\nu)$ (cgs: erg cm⁻³; SI: J m⁻³) will be

$$\rho(\nu) = I(\nu)/B \tag{3.31.1}$$

The intensity decrease $dI_x$ through a narrow sliver $dx$ of solution is given by the energy:

$$dI_x(\nu)\,d\nu = -I_x(\nu)\,d\nu\,N_l\,a(\nu)\,dx \tag{3.31.2}$$

where $N_l$ is the number of molecules in the lower energy state, and $a(\nu)$ is the absorption coefficient per atom, or absorption cross section (units: cm² molecule⁻¹), for absorption of light of frequency $\nu$. In Eq. (3.31.2), $I_x(\nu)$ is the intensity of light, that is, the light energy passing (Fig. 3.14) through a surface $A$ of unit area per second, in the frequency range between $\nu$ and $\nu + d\nu$. After dividing by the common factor $d\nu$, Eq. (3.31.2) can be integrated to yield

$$\int_{I_x = I_0}^{I_x = I_B} dI_x/I_x = \ln_e(I_B/I_0) = -\int_{x=0}^{x=B} N_l\,a\,dx = -N_l\,a\,B \tag{3.31.3}$$

$$\ln_e(I_B/I_0) = -N_l\,a\,B \tag{3.31.4}$$

$$I_B = I_0\exp[-N_l\,a\,B] \tag{3.31.5}$$
If one introduces a uniform concentration $c$ (moles per liter, or mol dm⁻³) of the absorbing solute within the cell of length $B$ (cm), then one defines the frequency-dependent molar extinction coefficient $\varepsilon_{l\to u}(\nu)$ (dm³ mol⁻¹ cm⁻¹) for a transition from an initial lower state $l$ to a final upper state $u$, where

$$N_l\,a = N_l\,a_{l\to u}(\nu) = c\,\varepsilon_{l\to u}(\nu) \tag{3.31.6}$$

and obtains the Bouguer–Lambert–Beer law:

$$I_B = I_0\exp(-\varepsilon_{l\to u}(\nu)\,c\,B) \tag{3.31.7}$$

This law can also be given in decadic form, using the decadic molar extinction coefficient $\varepsilon'$:

$$I_B = I_0\,10^{-\varepsilon' cB} \tag{3.31.8}$$
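The two forms of the law describe the same attenuation, so the natural and decadic coefficients differ only by a constant factor. Equating Eqs. (3.31.7) and (3.31.8) for the same $c$ and $B$ gives, as a short worked step:

```latex
\begin{aligned}
I_0 \exp(-\varepsilon c B) &= I_0\, 10^{-\varepsilon' c B}
   = I_0 \exp(-\varepsilon' c B \ln 10) \\
\Rightarrow\quad \varepsilon' &= \frac{\varepsilon}{\ln 10} \approx \frac{\varepsilon}{2.303}
\end{aligned}
```

Here $\varepsilon$ stands for $\varepsilon_{l\to u}(\nu)$ of Eq. (3.31.7); this is the origin of the factor 2.303 that appears whenever natural-logarithm and decadic absorption coefficients are interconverted.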
We next re-derive the law, giving attention to the absorption of the photon by the molecules. Let

$$E_{mn} = E_m - E_n = h\nu_{mn} = \hbar\omega_{mn} \tag{3.31.9}$$

be the energy difference between the ground state of a molecule or atom (with quantum number $n$) and the excited state (with quantum number $m$), for a transition $m \leftarrow n$ (as traditional spectroscopists like to write it, “backwards”) or $n \to m$ (as the rest of us like to write it). Let a parallel beam of light, with intensity $I(\nu)$, impinge normally on a surface $A$ of unit area. The amount of light (light intensity) that passes through the surface of area $A$ in 1 s in the frequency range between $\nu$ and $\nu + d\nu$ is $I(\nu)\,d\nu$. If $c$ is the speed of the light, then a volume $Act$ is traversed by the beam within a time $t$. Next, define $a_{nm}$ as the absorption coefficient per atom or molecule (units: cm² molecule⁻¹), or as the cross section per molecule (other convenient units: m² molecule⁻¹), as the atom in the ground state $n$ absorbs the photon and is promoted to the excited state $m$ ($n \to m$). Let $N_g$ atoms (or molecules) in the ground state lie in the region between $x$ and $x + dx$. The amount of light absorbed per second within a thickness $dx$ is then

$$-dI(\nu)\,d\nu = I(\nu)\,d\nu\,N_g\,a_{nm}\,dx \tag{3.31.10}$$

If we omit the frequency interval factor $d\nu$, we get

$$-dI(\nu) = I(\nu)\,N_g\,a_{nm}\,dx \tag{3.31.11}$$

which, when integrated between the thickness limits 0 (where the intensity is $I_0$) and $B$ (units: cm), yields again the Bouguer–Lambert–Beer law, or Beer’s law, for the intensity $I_B(\nu)$:

$$I_B(\nu) = I_0\exp(-N_g\,a_{nm}(\nu)\,B) \tag{3.31.12}$$

We can rewrite this in powers of 10, using $c$ = the concentration in mol L⁻¹, $B$ in cm, and the molar absorptivity, or decadic molar extinction coefficient, $\varepsilon'(\nu)$, in L mol⁻¹ cm⁻¹:

$$I_0 = I_B(\nu)\,10^{c\,\varepsilon'(\nu)\,B} \tag{3.31.13}$$

so that the relation between $\varepsilon(\nu)$ and $a(\nu)$ becomes:

$$\varepsilon(\nu) = N_A\,a_{nm}(\nu)/(2.303 \times 10^3) = 2.614 \times 10^{20}\,a_{nm}(\nu) \tag{3.31.14}$$
where $N_A$ is Avogadro’s number. The absorptivity $a(\nu)$ (units: L g⁻¹ cm⁻¹) is defined by:

$$a(\nu) = (1/cB)\log_{10}[I_0/I_B(\nu)] \tag{3.31.15}$$

The (dimensionless) absorbance $A$ is defined by

$$A \equiv \log_{10}[I_0/I_B(\nu)] = -\log_{10}[I_B(\nu)/I_0] = c\,\varepsilon(\nu)\,B \tag{3.31.16}$$

The percent transmittance is defined by

$$\%T = 100\,[I_B(\nu)/I_0] \tag{3.31.17}$$
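A short numerical sketch shows how absorbance and percent transmittance follow from the decadic form of Beer’s law. It uses the sample values of Problem 3.31.1 ($\varepsilon = 20{,}000$ L mol⁻¹ cm⁻¹, $c = 4 \times 10^{-6}$ mol L⁻¹, $B = 3$ cm) and also checks the numerical factor of Eq. (3.31.14); the function names are illustrative, and the value of Avogadro’s number is the SI exact value, which the text does not quote.

```python
import math

N_A = 6.02214076e23  # Avogadro constant, mol^-1 (SI exact value)


def absorbance(epsilon, concentration, path_cm):
    """A = epsilon * c * B for a decadic molar extinction coefficient, Eq. (3.31.16)."""
    return epsilon * concentration * path_cm


def percent_transmittance(a):
    """%T = 100 * I_B / I_0 = 100 * 10^(-A), Eq. (3.31.17)."""
    return 100.0 * 10.0 ** (-a)


def epsilon_from_cross_section(a_cm2):
    """Decadic epsilon (L mol^-1 cm^-1) from a cross section in cm^2, Eq. (3.31.14)."""
    return N_A * a_cm2 / (math.log(10.0) * 1.0e3)


# Values from Problem 3.31.1
A = absorbance(20000.0, 4.0e-6, 3.0)
T = percent_transmittance(A)

print(f"A  = {A:.3f}")
print(f"%T = {T:.1f}  (so about {100.0 - T:.1f}% of the light is absorbed)")
print(f"epsilon / a = {epsilon_from_cross_section(1.0):.3e} L mol^-1 cm^-1 per cm^2")
```

The factor $10^3$ in the conversion arises because 1 L = $10^3$ cm³, and $\ln 10 \approx 2.303$ converts between the natural and decadic forms.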
Equation (3.31.16) shows that $A$ is linear with $c$. However, at high solute concentrations $c$, deviations from Beer’s law can occur, due to (i) chemical reactions which modify the effective concentration of the solute, (ii) clusters which modify the capacity of each molecule to absorb independently of the other molecules, (iii) monochromator band-pass, particularly when the absorbance changes rapidly and nonlinearly with frequency $\nu$, or (iv) at high concentrations, the absorptivity $a(\nu)$ is no longer independent of concentration, because

$$a(\nu) = a_{true}\,\eta\,(\eta^2 + 2)^{-2} \tag{3.31.18}$$

where $\eta$ is the refractive index of the solvent.

PROBLEM 3.31.1. Using Beer’s law, Eq. (3.31.12), estimate the percent transmission of light ($= 100\,I_B/I_0$) for $\varepsilon = 20{,}000$ L mol⁻¹ cm⁻¹ (a typical value for an allowed electronic transition), $c = 4 \times 10^{-6}$ mol L⁻¹, and $B = 3$ cm.

The spectra (absorption or emission) of atoms are much sharper than those of molecules, because every electronic energy level in a molecule has a rich complement of vibronic levels and rotational sublevels (Fig. 3.15). In the late nineteenth century these smaller features could not be resolved in visible–ultraviolet spectroscopy, so, in ignorance of all the quantum effects explained decades later, the sharper spectra of atoms were called “line spectra,” while the broadened spectra of molecules were called “band spectra.” Cooling the molecules to 77 K or 4.2 K does resolve some of the vibronic substructure, even in visible–ultraviolet absorption spectroscopy.
3.32 ABSORPTION OF LIGHT BY A MOLECULE: JABLONSKI DIAGRAM

Light absorbed by an atom or molecule excites it from the initial ground (or excited) state to a higher-energy excited state; for low-intensity light, this occurs provided that the various applicable quantum rules for the transition are satisfied (electric-dipole “allowed” transitions). If quantum rules “forbid” a transition, then the transition is either absent (“strongly forbidden transition”) or very weak (“weakly allowed transition”). The “Jablonski”110 diagram (Fig. 3.16) depicts various forms of absorption and emission from an atom or molecule. Figure 3.16 shows three allowed absorptions, one fluorescent emission from the lowest excited singlet state (the fact that upper states do not fluoresce, but fluorescence occurs preferably from the lowest excited state, is known as Kasha’s111 rule), two (very rapid) radiationless internal conversions within the singlet manifold, one radiationless intersystem crossing from the singlet manifold to the lowest excited triplet state, and one (slow) phosphorescent emission back to the ground state. The absorption process occurs within $10^{-15}$ s. The excited state S1 usually lasts between $10^{-7}$ s and $10^{-9}$ s, while the phosphorescent excited state T1 lasts much longer ($10^{-4}$ s to hours). The electronic upper singlet states S3 and S2 can also decay to S1 by emitting a vibrational photon (infrared, usually); this occurs faster than $10^{-12}$ s.

For molecules, the spectroscopic nomenclature for molecular energy levels and their vibronic and rotational sublevels is messy and very specialized. Already for homonuclear or heteronuclear diatomic molecules a new quantum number shows up, which quantifies the angular momentum along the internuclear axis, but the reader need not be burdened with the associated nomenclature. More important is the identification, in chromophores, of spectroscopic electronic transitions as $\pi \to \pi^*$ (pi to pi star) or as $n \to \pi^*$ (n to pi star); the $n$ designates a nonbonding orbital, such as a lone pair, while $\pi$ designates the ground state and $\pi^*$ an excited state of a pi electron system. Note that the $n \to \pi^*$ transitions are “forbidden,” while the $\pi \to \pi^*$ transitions are allowed.

A more general, and rational, labelling of molecular electronic states uses the point group of the molecule, and irreducible representations of the point group, which will yield the symmetries of ground- or excited-state wavefunctions. For instance, as we shall see in Section 7.1, benzene (C6H6) belongs to the point group $D_{6h}$ and has a weak “benzenoid” band at 260 nm, a strong band at 200 nm, and another band at 185 nm. The ground state has symmetry $^1A_{1g}$; the accessible excited states are $^1E_{1u}$ and $^1A_{2u}$. The transition $^1A_{1g} \to {}^1E_{1u}$ is allowed and is the band centered at 185 nm; the forbidden “benzenoid” transition at 260 nm is $^1A_{1g} \to {}^1B_{2u}$; the band at 200 nm is maybe $^1A_{1g} \to {}^1B_{1u}$.

For very intense light (e.g., in a laser beam), sometimes double-quantum transitions, proportional to the square of the light intensity, use a “virtual state” halfway between the initial state and the final state and two photons (thus the square-law dependence) [31].

FIGURE 3.15 Energy levels in atoms (left) and molecules (right): The latter have vibronic levels and rotational sublevels. The S (and T) levels are spin-singlet $S = 0$ (and spin-triplet $S = 1$) states. [Diagram: ground state S0 and excited states S1, S2, S3 and T1, T2, T3 for the atom and for the molecule; each molecular electronic level carries vibrational sublevels v = 1, 2, 3.]

FIGURE 3.16 “Jablonski” diagram, showing, for a molecule in the ground (spin) singlet state S0, the (induced) absorptions, a double-quantum transition, (spontaneous) fluorescence, (spontaneous) phosphorescence, internal conversion, and intersystem crossing between the singlet manifold of states S0, S1, S2, and S3, and the lowest excited triplet state T1.

110 Alexander Jablonski (1898–1980).
111 Michael Kasha (1920– ).
The detection of an absorption is much less sensitive than detecting fluorescent or phosphorescent emission, because in absorption (at the same wavelength and in the same direction as the source) the detector must measure both the primary beam and the small change in its intensity due to absorption, without “saturating” the detector, while fluorescence or phosphorescence is usually detected at a different wavelength than the excitation and/or also at an angle away from the primary excitation beam (typically at 90° from it), so single-quantum detectors can be used, with relatively little fear of “saturating” the detector with the intense excitation beam. Indeed, fluorescence can be detected by “single-photon” counting: The detector is sensitive enough to respond to a single photon and, by a photomultiplier electron cascade, can emit an electrical signal in response. All such single-event detectors must then “reset” for the next counting event, by dispersing the large electronically amplified charge; thus there is a “dead-time,” during which any other incident photon cannot be counted. This dead-time limitation is shared also with Geiger112–Müller113 radiation detectors, for instance. Usually, the spectral shape of an absorption band (particularly the vibrational sub-bands) is repeated in emission, so that a fluorescence spectrum is often a mirror image of the absorption, but shifted toward lower frequencies (“Stokes shift”).

112 Johannes Wilhelm [Hans] Geiger (1882–1945).
113 Walther Müller (1905–1979).
FIGURE 3.17 Schematic diagram for a typical electronic potential energy U as a function of some significant interatomic bond distance R. Also shown are the first four vibrational sublevels of the electronic energy, with vibrational quantum numbers v = 0, 1, 2, 3. [Plot: potential energy U / eV (from −10 to 10) versus R (Ångstroms, from 0 to 6).]
Figure 3.17 shows an idealized diagram of the electronic (potential) energy U for a diatomic molecule, as a function of the distance R between the bonded atoms: the vibronic substructure is shown as horizontal levels (with vibrational quantum numbers v = 0, 1, 2, etc.). Figure 3.17 could also be valid for a larger molecule, where the attention is focused on one significant chemical bond within the molecule. The width of the horizontal lines, which increases with v, attempts to depict the range of interatomic distances (bond lengths) accessible for that chemical bond within the harmonic (Hooke’s law) approximation. Figure 3.18 depicts a fundamental aspect of spectroscopy: Light absorption (arrow upwards) or emission (arrow downwards) to a new state is very fast, but its probability requires that some vibronic level be available at the same bond length as in the initial state; this is described by the Franck114–Condon115 factor, FC: if FC = 0, then there is no overlap, and absorption or emission cannot occur. If FC = 1, then the vibronic structures of initial and final state are ideally aligned, and the transition will occur. Small nonzero FC values require that some time elapse until at some instant the final state reaches the same “geometry” as the initial state. Note also that the potential energy minimum of the excited state is drawn at a larger value of R than the ground-state minimum: this implies that the emission processes tend to occur at lower energies (“red-shifted”) than the absorption processes.
3.33 EINSTEIN A AND B COEFFICIENTS [32] Einstein obtained coefficients for induced absorption Bl ! u, induced emission Bu ! l, and spontaneous emission Au ! l of light by the following thermodynamic arguments, based on Arrhenius’116 law. Take a two-level system, whose upper level u with energy Eu is higher than the lower level l with energy El (see Fig. 3.19). Assume Nl molecules
114
James Franck (1882–1964). Edward Uhler Condon (1902–1974). 116 Svante August Arrhenius (1859–1927). 115
3.33
21 7
E I N S T E I N A AN D B CO E F F I C I E N T S
20
15
Energy / eV
v' = 3 v' = 2
10
ELECTRONIC EXCITED STATE
v' = 1 v' = 0
5 ABSORPTION
0
EMISSION ELECTRONIC GROUND STATE
v=3 v=2 v=1
−5 −10
v=0
0
1
2
3
4
5
R(Ångstroms)
FIGURE 3.18 Potential energy U for two energy levels of a hypothetical molecule: the ground electronic state and an excited electronic state, as a function of some “effective interatomic distance” R, with absorption and fluorescence processes shown as arrows. The energy minima for each state occur at different R; the potential energy well is narrower for the excited state than for the ground state. The transitions are “vertical,” for both absorption and fluorescence and require overlap between the vibrational states: this is a consequence of the Franck–Condon principle, which requires that an electronic transition can only take place when the molecule in the ground state has an instantaneous geometry that equals that of the target excited state. The resultant Franck–Condon factor (FC) equals 1.0 if this geometry difference is vanishingly small (i.e., if the two energy curves are vertically above each other), is zero if there is no overlap (or the potential energy diagrams do not overlap), and is small if the overlap is small. Because the excited state has an energy minimum at a longer distance R, the vectors depicting absorption are longer than those depicting emission; emission is said to be “red-shifted” relative to absorption.
(SI: molecules m$^{-3}$; cgs: molecules cm$^{-3}$) in the lower state and $N_u$ molecules (SI: molecules m$^{-3}$; cgs: molecules cm$^{-3}$) in the upper state, and assume that their relative populations at thermodynamic equilibrium at the temperature T are controlled by the Arrhenius law:
$$N_u/N_l = \exp(-E_u/k_B T)/\exp(-E_l/k_B T) = \exp[-(E_u - E_l)/k_B T] = \exp(-h\nu_{l \to u}/k_B T) \qquad (3.33.1)$$
where $k_B$ is the Boltzmann constant.
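FIGURE 3.19 Einstein coefficients for a two-level system: lower level l (energy $E_l$, occupancy $N_l$) and upper level u (energy $E_u$, occupancy $N_u$), separated by $\Delta E_{ul} = h\nu = (h/2\pi)\omega$, with the (induced) absorption coefficient $B_{l \to u}$, the induced emission coefficient $B_{u \to l}$, and the spontaneous emission coefficient $A_{u \to l}$.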
The intensity of absorption $I_a$ (energy per unit volume per unit time; SI: J m$^{-3}$ s$^{-1}$; cgs: erg cm$^{-3}$ s$^{-1}$) of light or other energy of frequency $\nu_{l \to u}$ (Hz) by the sample is given by
$$I_a = (h\nu_{l \to u})\,\rho(\nu_{l \to u})\,N_l\,B_{l \to u} \qquad (3.33.2)$$
where $\rho(\nu_{l \to u})$ is the radiation density, or energy per unit volume of the energy source (SI: J m$^{-3}$; cgs: erg cm$^{-3}$), and $B_{l \to u}$ is the Einstein coefficient of induced absorption, in rather peculiar units (SI: m s kg$^{-1}$ molecule$^{-1}$; cgs: cm s g$^{-1}$ molecule$^{-1}$). This coefficient is characteristic of the absorbing species (atom or molecule). Once the atoms or molecules are in the excited state, they can again interact with the light or the radiation field and can relax to the lower state by returning to it a photon of energy $h\nu_{u \to l}$; the intensity of induced emission is
$$I_{ie} = (h\nu_{l \to u})\,\rho(\nu_{l \to u})\,N_u\,B_{u \to l} \qquad (3.33.3)$$
By the principle of microscopic reversibility, we obtain
$$B_{l \to u} = B_{u \to l} \qquad (3.33.4)$$
This induced emission has a classical analog: in classical mechanics, an oscillator of frequency $\nu$ can either absorb energy from, or add energy to, a radiation field, depending on the phase of the vibration with respect to the phase of the oscillating radiation field. A second process of energy relaxation is spontaneous emission, which occurs “whenever the excited state feels like it” (rapidly or slowly), and its corresponding intensity $I_{se}$ is given by
$$I_{se} = (h\nu_{l \to u})\,N_u\,A_{u \to l} \qquad (3.33.5)$$
The units of $A_{u \to l}$ are s$^{-1}$. This spontaneous emission, which is independent of the radiation density, is called fluorescence or phosphorescence, according to some arbitrary time division (e.g., phosphorescence if the relaxation time is longer than 1 ms). The term $[\int A_{u \to l}(\nu_{l \to u})\,d\nu_{l \to u}]^{-1}$ is called the “radiative lifetime of excited state u.” At thermal equilibrium, the amount of energy absorbed by state l and the energy emitted (by spontaneous or induced processes) by state u must be equal:
$$I_a = I_{ie} + I_{se} \qquad (3.33.6)$$
$$\rho(\nu_{l \to u})\,N_l\,B_{l \to u} = \rho(\nu_{l \to u})\,N_u\,B_{u \to l} + N_u\,A_{u \to l} \qquad (3.33.7)$$
which, when solved for the ratio $N_u/N_l$, yields
$$N_u/N_l = B_{l \to u}\,\rho(\nu_{l \to u})/[A_{u \to l} + B_{u \to l}\,\rho(\nu_{l \to u})] \qquad (3.33.8)$$
Equating Eqs. (3.33.1) and (3.33.8), we get
$$\exp[-h\nu_{l \to u}/k_B T] = B_{l \to u}\,\rho(\nu_{l \to u})/[A_{u \to l} + B_{u \to l}\,\rho(\nu_{l \to u})] \qquad (3.33.9)$$
Using the Planck blackbody radiation formula (Section 5.6) for $\rho(\nu_{l \to u})$,
$$\rho(\nu_{l \to u}) = 8\pi h c^{-3}(\nu_{l \to u})^3\,[\exp(h\nu_{l \to u}/k_B T) - 1]^{-1} \qquad (3.33.10)$$
we finally get Einstein’s result:
$$A_{u \to l} = 8\pi h c^{-3}(\nu_{l \to u})^3\,B_{u \to l} \qquad (3.33.11)$$
Another way of stating these results is to write [32]
$$(\text{Induced emission}/\text{Spontaneous emission}) = [\exp(h\nu_{l \to u}/k_B T) - 1]^{-1} \qquad (3.33.12)$$
If the effective “temperature” T of the radiation field is so low that $h\nu_{l \to u} \gg k_B T$, then induced emission is less important than spontaneous emission; if, instead, $h\nu_{l \to u} \ll k_B T$, then induced emission dominates over spontaneous emission. For a transition in the mid-visible region ($\lambda$ = 500 nm, or $\nu = 6 \times 10^{14}$ Hz) the condition $h\nu/k_B T = 1$ requires T = 30,000 K (not available, except inside a star), so spontaneous emission will rule. For infrared spectra ($1/\lambda$ = 2000 cm$^{-1}$, $\nu = 6 \times 10^{13}$ Hz) both processes can be important. In microwave spectroscopy ($\nu$ = 10 GHz) or for NMR experiments ($\nu$ = 10 MHz), spontaneous emission at reasonable temperatures is unimportant [32]. Laser action (light amplification by stimulated emission of radiation), with its tremendous phase coherence (Rayleigh scattering), was discovered much later, in 1960 (Section 10.10), but provision for it already existed in Einstein’s analysis! After quantizing the electromagnetic field, the Einstein result will be rederived below without recourse to thermodynamics.
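The orders of magnitude quoted above can be checked with a few lines of arithmetic. The sketch below evaluates Eq. (3.33.12) and the temperature at which hν = k_B T for the four frequencies mentioned in the text; the function names and the choice of 300 K as a “reasonable temperature” are assumptions made for illustration only.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
KB = 1.380649e-23    # Boltzmann constant, J/K

def induced_to_spontaneous(nu_hz, temperature_k):
    """Eq. (3.33.12): ratio of induced to spontaneous emission."""
    return 1.0 / math.expm1(H * nu_hz / (KB * temperature_k))

def crossover_temperature(nu_hz):
    """Temperature at which h*nu equals kB*T."""
    return H * nu_hz / KB

# Frequencies quoted in the text; the 300 K temperature is an assumption.
frequencies = {"visible (500 nm)": 6.0e14, "infrared (2000 cm^-1)": 6.0e13,
               "microwave (10 GHz)": 1.0e10, "NMR (10 MHz)": 1.0e7}
for label, nu in frequencies.items():
    print(f"{label:24s} T(h nu = kB T) = {crossover_temperature(nu):10.3e} K, "
          f"ratio at 300 K = {induced_to_spontaneous(nu, 300.0):10.3e}")
```

The crossover temperature falls by an order of magnitude for every decade in frequency, which is why induced processes dominate in the radiofrequency and microwave regions.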
3.34 TIME-DEPENDENT PERTURBATION THEORY: THE RABI FORMULA
When the Hamiltonian itself is time-dependent, one first needs to discover how to evaluate the changes. The Hellmann¹¹⁷–Feynman¹¹⁸ theorem says that if the Hamiltonian $\hat{H}$ and its eigenfunction $\psi$ both depend on some parameter P (time, distance, field, etc.), then the derivative of the system energy with respect to that P is just the expectation value of $\partial\hat{H}/\partial P$:
$$dE/dP = \int \psi^*(P)\,(\partial\hat{H}/\partial P)\,\psi(P)\,dV \qquad (3.34.1)$$
¹¹⁷ Hans Gustav Adolf Hellmann (1903–1938). ¹¹⁸ Richard Phillips Feynman (1918–1988).
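A quick numerical illustration of Eq. (3.34.1) may help. The sketch below uses a hypothetical 2 × 2 real symmetric model Hamiltonian H(P) = H0 + P·H1 (the matrices are arbitrary and not taken from the text) and compares the expectation value of dH/dP in the ground state with a central finite difference of the ground-state energy.

```python
import numpy as np

def hamiltonian(p):
    """Hypothetical 2x2 model: returns H(P) = H0 + P*H1 and dH/dP = H1."""
    h0 = np.array([[0.0, 0.2], [0.2, 1.0]])
    h1 = np.array([[1.0, 0.5], [0.5, -1.0]])
    return h0 + p * h1, h1

def ground_state(p):
    """Lowest eigenvalue and its normalized eigenvector at parameter value p."""
    h, _ = hamiltonian(p)
    energies, vectors = np.linalg.eigh(h)
    return energies[0], vectors[:, 0]

p, step = 0.3, 1e-5
_, dh_dp = hamiltonian(p)
_, psi = ground_state(p)
expectation = psi @ dh_dp @ psi  # <psi| dH/dP |psi>
finite_difference = (ground_state(p + step)[0] - ground_state(p - step)[0]) / (2 * step)
print(expectation, finite_difference)
```

The two numbers should agree to within the finite-difference error, even though the eigenvector itself changes with P.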
Consider again a system with two energy levels $E_l$ and $E_u$ (l = lower, u = upper) with a Hamiltonian:
$$\hat{H} = \hat{H}^{(0)} + \hat{H}^{(1)}(t) \qquad (3.34.2)$$
where the time dependence might be $\hat{H}^{(1)}(t) = A\sin(\omega t)$. The time-dependent Schrödinger equation is
$$\hat{H}\Psi = i\hbar(\partial\Psi/\partial t) \qquad (3.34.3)$$
The zeroth-order time-independent, orthonormalized eigenfunctions for the two levels satisfy
$$\hat{H}^{(0)}\psi_l = E_l\psi_l \qquad (3.34.4)$$
$$\hat{H}^{(0)}\psi_u = E_u\psi_u \qquad (3.34.5)$$
where $E_u$ is the higher energy; the matrix elements of the perturbation are
$$\langle H^{(1)}_{ij}(t)\rangle = \int \psi_i^*\,\hat{H}^{(1)}\,\psi_j\,dV \quad (i, j = l, u) \qquad (3.34.6)$$
Often the two off-diagonal terms of Eq. (3.34.6) are nonzero and are equal to each other:
$$\langle H^{(1)}_{lu}(t)\rangle = \langle H^{(1)}_{ul}(t)\rangle \qquad (3.34.7)$$
while the diagonal terms vanish:
$$\langle H^{(1)}_{ll}(t)\rangle = \langle H^{(1)}_{uu}(t)\rangle = 0 \qquad (3.34.8)$$
If the time-dependent state of the system is given by
$$\Psi(t) = a_l(t)\Psi_l(t) + a_u(t)\Psi_u(t) \qquad (3.34.9)$$
$$= a_l(t)\psi_l\exp(-iE_l t/\hbar) + a_u(t)\psi_u\exp(-iE_u t/\hbar) \qquad (3.34.10)$$
then routine algebra (see Problem 3.34.2) shows that the time evolution is given by two coupled differential equations:
$$da_l/dt = [i\hbar]^{-1}\,a_u(t)\,\langle H^{(1)}_{lu}(t)\rangle\exp[-i\hbar^{-1}(E_u - E_l)t] \qquad (3.34.11)$$
$$da_u/dt = [i\hbar]^{-1}\,a_l(t)\,\langle H^{(1)}_{lu}(t)\rangle\exp[i\hbar^{-1}(E_u - E_l)t] \qquad (3.34.12)$$
(With three energy levels instead of two, we would have three coupled equations.) The system can be solved by differentiating Eq. (3.34.12) again with respect to time, using Eq. (3.34.11), and rearranging:
$$(d^2a_u/dt^2) - i\hbar^{-1}(E_u - E_l)(da_u/dt) + \hbar^{-2}\langle H^{(1)}_{lu}(t)\rangle\langle H^{(1)}_{ul}(t)\rangle\,a_u = 0 \qquad (3.34.13)$$
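To help matters along, three auxiliary variables are defined:
$$\hbar\omega_{ul} \equiv E_u - E_l \qquad (3.34.14)$$
$$4\Omega^2 \equiv \omega_{ul}^2 + 4\hbar^{-2}\langle H^{(1)}_{lu}(t)\rangle\langle H^{(1)}_{ul}(t)\rangle \qquad (3.34.15)$$
$$V^2 \equiv \hbar^{-2}\langle H^{(1)}_{lu}(t)\rangle\langle H^{(1)}_{ul}(t)\rangle \qquad (3.34.16)$$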
For the initial conditions $a_l(0) = 1$, $a_u(0) = 0$, that is, when the system starts in the lower energy state, we get, after some algebra,
$$a_l(t) = [\cos(\Omega t) + i(\omega_{ul}/2\Omega)\sin(\Omega t)]\exp(-i\omega_{ul}t/2) \qquad (3.34.17)$$
$$a_u(t) = -i(V/\Omega)\sin(\Omega t)\exp(i\omega_{ul}t/2) \qquad (3.34.18)$$
and the Rabi¹¹⁹ formula:
$$P_u = |a_u(t)|^2 = [4V^2/(\omega_{ul}^2 + 4V^2)]\,\sin^2[(\omega_{ul}^2 + 4V^2)^{1/2}\,t/2] \qquad (3.34.19)$$
which in the case $\omega_{ul} = 0$ reduces to
$$P_u = \sin^2(Vt) \qquad (3.34.20)$$
or in the case $\omega_{ul} \gg 2V$ becomes
$$P_u \approx [2V/\omega_{ul}]^2\,\sin^2[\omega_{ul}t/2]$$
In the former case ($\omega_{ul} = 0$), it is easy to establish a 50–50 mixture of states l and u ($P_u$ = 1/2), since they have the same energy. This is routinely done in multiple-pulse methods to even out, for example, nuclear magnetic resonance spin states. In the latter case ($\omega_{ul} \gg 2V$), the perturbation V is much smaller than the energy level separation $\omega_{ul}$, and the maximum probability that state u can be reached is $(2V/\omega_{ul})^2 \ll 1$; that is, the level u is essentially never reached. Figure 3.20 reminds us what the function $y = (\sin x/x)^2$ looks like.
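The Rabi formula can be checked against a direct numerical integration of the two coupled equations for the coefficients. The sketch below assumes a time-independent coupling V = ⟨H_lu⟩/ħ, adopts the common phase convention in which the lower-state equation carries exp(−iω_ul t) and the upper-state equation exp(+iω_ul t), and uses a hand-written fourth-order Runge–Kutta stepper; the numerical values of ω_ul, V, and t are arbitrary illustrations.

```python
import numpy as np

def rabi_formula(t, omega_ul, v):
    """Eq. (3.34.19): probability of finding the system in the upper state."""
    w2 = omega_ul**2 + 4.0 * v**2
    return (4.0 * v**2 / w2) * np.sin(np.sqrt(w2) * t / 2.0) ** 2

def derivatives(t, a, omega_ul, v):
    """Right-hand sides of the coupled equations, with V = <H_lu>/hbar constant."""
    a_l, a_u = a
    return np.array([-1j * v * a_u * np.exp(-1j * omega_ul * t),
                     -1j * v * a_l * np.exp(+1j * omega_ul * t)])

def integrate(omega_ul, v, t_end, n_steps=20000):
    """Fixed-step fourth-order Runge-Kutta integration from a_l = 1, a_u = 0."""
    a = np.array([1.0 + 0j, 0.0 + 0j])
    dt = t_end / n_steps
    t = 0.0
    for _ in range(n_steps):
        k1 = derivatives(t, a, omega_ul, v)
        k2 = derivatives(t + dt / 2, a + dt * k1 / 2, omega_ul, v)
        k3 = derivatives(t + dt / 2, a + dt * k2 / 2, omega_ul, v)
        k4 = derivatives(t + dt, a + dt * k3, omega_ul, v)
        a = a + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += dt
    return a

omega_ul, v, t_end = 3.0, 1.0, 5.0  # arbitrary illustrative values
a_l, a_u = integrate(omega_ul, v, t_end)
p_numeric = abs(a_u) ** 2
p_formula = rabi_formula(t_end, omega_ul, v)
print(p_numeric, p_formula)
```

The upper-state probability does not depend on the phase convention chosen for the coefficients, so the comparison is insensitive to that assumption.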
¹¹⁹ Isidor Isaac Rabi (1898–1988).
FIGURE 3.20 Plot of $y = x^{-2}\sin^2 x$ for $-2\pi \le x \le 2\pi$. Its peak, at x = 0, is y = 1.0; the width at half height (y = 0.5) is 2.783 radians. Its integral is $\pi$ (Problem 3.34.4).
PROBLEM 3.34.1. Prove the Hellmann–Feynman theorem, Eq. (3.34.1), assuming $\int \psi^*\psi\,dV = 1$ and $E = \int \psi^*\hat{H}\psi\,dV$.
PROBLEM 3.34.2. Prove the coupled differential equations, Eqs. (3.34.11) and (3.34.12), by substituting Eq. (3.34.9) into Eqs. (3.34.3) and (3.34.2) and then using Eq. (3.34.6) with the assumption $H^{(1)}_{ll} = H^{(1)}_{uu} = 0$, Eq. (3.34.8).
PROBLEM 3.34.3. Prove Eq. (3.34.20).
PROBLEM 3.34.4. Prove $\int_{x=-\infty}^{x=\infty} dx\,(\sin x/x)^2 = \pi$ (Fig. 3.20).
3.35 FERMI’S (SECOND) GOLDEN RULE
An alternate treatment, due to Dirac, uses the variation of the coefficients to discuss a system of many levels, not just two levels. In analogy to the above, use the perturbation expansion, Eq. (3.34.2), and the zeroth-order eigenfunctions $\Psi_n(r, t)$ as follows:
$$\hat{H} = \hat{H}^{(0)} + \hat{H}^{(1)}(t) \qquad ((3.34.2))$$
$$\hat{H}^{(0)}\Psi_n(t) = i\hbar(\partial\Psi_n(t)/\partial t) \qquad (3.35.1)$$
$$\Psi_n(r, t) = \psi_n(r)\exp[-iE_n t/\hbar] \qquad (3.35.2)$$
$$\Psi(r, t) = \sum_n a_n(t)\,\psi_n(r)\exp[-iE_n t/\hbar] = \sum_n a_n(t)\,\Psi_n(r, t) \qquad (3.35.3)$$
$$\hat{H}\Psi(r, t) = i\hbar(\partial\Psi(r, t)/\partial t) \qquad (3.35.4)$$
Substituting Eq. (3.35.3) into Eq. (3.35.4), and using Eqs. (3.35.1) and (3.35.2), yields
$$\sum_n a_n(t)\,\hat{H}^{(1)}(t)\,\psi_n(r)\exp(-iE_n t/\hbar) = i\hbar\sum_n (da_n/dt)\,\psi_n(r)\exp(-iE_n t/\hbar) \qquad (3.35.5)$$
By premultiplying by $\psi_f^*$ and integrating, because the {$\Psi_n(t)$, n = 1, ...} form an orthonormal basis, we get (Problem 3.35.1)
$$\sum_n a_n(t)\,H^{(1)}_{fn}(t)\exp(-iE_n t/\hbar) = i\hbar(da_f/dt)\exp(-iE_f t/\hbar) \qquad (3.35.6)$$
Now define
$$\hbar\omega_{un} \equiv E_u - E_n \qquad (3.35.7)$$
and obtain
$$(da_u/dt) = (i\hbar)^{-1}\sum_n a_n(t)\,H^{(1)}_{un}(t)\exp(i\omega_{un}t) \qquad (3.35.8)$$
Unfortunately, this maneuver links $a_u(t)$ to all the other coefficients $a_n(t)$. For simplicity, then, assume a two-level problem, with only states i (initial: lower) and f (final: upper) to worry about:
$$a_i(t = 0) = 1; \quad \text{all other } a_n(t = 0) = 0; \quad \text{and only } a_f(t \ne 0) \ne 0 \qquad (3.35.9)$$
Then
$$a_f(t) \approx (i\hbar)^{-1}\int_{t=0}^{t=t} dt\,H^{(1)}_{i \to f}(t)\exp(i\omega_{fi}t) \qquad (3.35.10)$$
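If now we adopt an oscillating perturbation $\hat{H}^{(1)}(t)$ of the form
$$\hat{H}^{(1)}(t) = 2\hat{H}^{(1)}\cos(\omega t) = \hat{H}^{(1)}[\exp(i\omega t) + \exp(-i\omega t)] \qquad (3.35.11)$$
then Eq. (3.35.10) becomes
$$a_f(t) \approx -(H^{(1)}_{i \to f}/\hbar)\{[\omega_{fi} + \omega]^{-1}[\exp(i\omega_{fi}t + i\omega t) - 1] + [\omega_{fi} - \omega]^{-1}[\exp(i\omega_{fi}t - i\omega t) - 1]\} \qquad (3.35.12)$$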
The first term within the braces is generally small, while the second term within the braces, with its denominator $[\omega_{fi} - \omega]$, becomes very large, particularly close to resonance ($\omega = \omega_{fi}$). Therefore we will neglect the first term, keep the second, and calculate the probability $P_f(t) = |a_f|^2$ of finding the system in state f at a time t > 0, when irradiated at a frequency $\omega$, after it started in state i at t = 0:
$$P_f(t) = |a_f|^2 = \{4H^{(1)}_{f \to i}H^{(1)}_{i \to f}/\hbar^2[\omega_{fi} - \omega]^2\}\,\sin^2[(\omega_{fi} - \omega)t/2] \qquad (3.35.13)$$
which resembles Eq. (3.34.19), except for the spectral shift by $\omega$. Since
$$\lim_{x \to 0}(\sin x/x)^2 = 1 \qquad (3.35.14)$$
therefore at resonance we have
$$\lim_{(\omega_{fi} - \omega) \to 0} P_f(t) = \hbar^{-2}H^{(1)}_{f \to i}H^{(1)}_{i \to f}\,t^2 \qquad (3.35.15)$$
which is a probability quadratic in time. This is a silly result; the transition rate (or transition probability per unit time) $W_{i \to f}$ is given by
$$W_{i \to f} = dP_f/dt = 2\hbar^{-2}H^{(1)}_{f \to i}H^{(1)}_{i \to f}\,t \qquad (3.35.16)$$
At resonance ($\omega_{fi} = \omega$), $W_{i \to f}$ is linear in time: one more silly result! What will avoid this silliness is to assume a single initial lower state with energy $E_i$, but a manifold of closely spaced final excited states $E_f$, with a density of states $\rho(E)$, such that the number of energy states between $E_f$ and $E_f + dE_f$ is $\rho(E_f)\,dE_f$, with a density at the center of the band $\rho(E_f)$. This replaces Eq. (3.35.13) by a sum and, eventually, by an integral over the manifold of excited states:
$$P(t) = \sum_f P_f(t) = \int \{4H^{(1)}_{f \to i}H^{(1)}_{i \to f}\,\hbar^{-2}[\omega_{fi} - \omega]^{-2}\}\,\sin^2[(\omega_{fi} - \omega)t/2]\,\rho(E_f)\,dE_f$$
Changing variables to $E_f = \hbar\omega_{fi}$, assuming that $\rho(E_f)$ can be replaced by its value at the center of the band, $\rho(E_f)$, and pulling it, along with $H^{(1)}_{f \to i}H^{(1)}_{i \to f}$, out of the integral, one obtains
$$P(t) = \sum_f P_f(t) = 2\pi\hbar^{-1}H^{(1)}_{f \to i}H^{(1)}_{i \to f}\,\rho(E_f)\,t \qquad (3.35.17)$$
Therefore the transition rate $W_{i \to f}$ becomes independent of time (hallelujah!): this is Fermi’s golden rule:
$$W_{i \to f} = 2\pi\hbar^{-1}|H^{(1)}_{f \to i}|^2\,\rho(E_f) \qquad (3.35.18)$$
This result is also called the “second golden rule.” The ancient Greek admonition to “do all things in moderation” is the world’s “first golden rule.”
PROBLEM 3.35.1. Prove Eq. (3.35.5) by substituting Eqs. (3.35.2) and (3.35.3) into Eq. (3.35.4).
PROBLEM 3.35.2. Prove Eq. (3.35.15).
PROBLEM 3.35.3. Equation (3.35.11) can be obtained more simply [5] by assuming immediately that an electromagnetic wave, producing an electric field with vector $E^0_x$ along the x axis, and propagating along z with wavelength $\lambda$ and frequency $\nu$, interacts with charges $q_i$ localized at x coordinates $x_i$: the resulting perturbation Hamiltonian is
$$\hat{H}(t) = E^0_x\sum_i q_i x_i\sin(2\pi\nu t - 2\pi z_i/\lambda)$$
Show that the coefficient in Eq. (3.35.10) can be written as
$$a_f(t) = \delta_{fi} + (E^0_x/2\hbar i)\,\langle\psi_i|\sum_i q_i x_i|\psi_f\rangle\,\{[\omega_{fi} + \omega]^{-1}[\exp(i\omega_{fi}t + i\omega t) - 1] + [\omega_{fi} - \omega]^{-1}[\exp(i\omega_{fi}t - i\omega t) - 1]\}$$
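Returning to the golden rule of Eq. (3.35.18): the passage from a probability quadratic in time for a single final state to a probability linear in time for a band of final states can be seen numerically. The sketch below integrates the single-state probability of Eq. (3.35.13) over a flat density of final states and compares the result with 2πħ⁻¹|H|²ρt. The units (ħ = 1), the coupling, the density of states, and the finite band half-width are all assumptions chosen for illustration.

```python
import numpy as np

HBAR = 1.0  # work in units where hbar = 1 (assumption for illustration)

def total_probability(t, coupling, density, half_width=200.0, n_points=400001):
    """Integrate Eq. (3.35.13) over a flat band of final states centred on resonance."""
    delta = np.linspace(-half_width, half_width, n_points)  # omega_fi - omega
    # sin^2(delta*t/2)/delta^2 written with np.sinc to avoid 0/0 at delta = 0;
    # np.sinc(x) = sin(pi*x)/(pi*x), so x = delta*t/(2*pi)
    kernel = (t / 2.0) ** 2 * np.sinc(delta * t / (2.0 * np.pi)) ** 2
    # dE_f = hbar * d(delta)
    integrand = 4.0 * coupling**2 / HBAR**2 * kernel * density * HBAR
    return np.sum(integrand) * (delta[1] - delta[0])

coupling, density = 0.01, 5.0  # hypothetical |H_fi| and rho(E_f)
golden_rule_rate = 2.0 * np.pi * coupling**2 * density / HBAR
times = np.array([5.0, 10.0, 20.0])
probabilities = np.array([total_probability(t, coupling, density) for t in times])
print(probabilities / times, golden_rule_rate)
```

Doubling the time doubles the accumulated probability, so the rate is constant; the small residual difference comes from truncating the band at a finite width.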
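3.36 LIGHT WAVE–MOLECULE INTERACTION: THE HAMILTONIAN
From the Lorentz¹²⁰ force for a charge e:
$$\mathbf{F} = e\mathbf{E} + e\mathbf{v} \times \mathbf{B} \ \text{(SI)}; \qquad \mathbf{F} = e\mathbf{E} + (e/c)\mathbf{v} \times \mathbf{B} \ \text{(cgs)} \qquad ((2.7.24))$$
and from the definition of scalar and vector potentials, $\phi$ and $\mathbf{A}$, respectively:
$$\mathbf{E}(\mathbf{r}) = -\nabla\phi(\mathbf{r}) - \partial\mathbf{A}/\partial t \ \text{(SI)}; \qquad \mathbf{E}(\mathbf{r}) = -\nabla\phi(\mathbf{r}) - (1/c)\,\partial\mathbf{A}/\partial t \ \text{(cgs)} \qquad ((2.7.54))$$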
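Newton’s¹²¹ second law yields
$$\begin{aligned} F_x &= m(d^2x/dt^2) = e[-(\partial\phi/\partial x) - (\partial A_x/\partial t) + B_z(dy/dt) - B_y(dz/dt)] \\ F_y &= m(d^2y/dt^2) = e[-(\partial\phi/\partial y) - (\partial A_y/\partial t) + B_x(dz/dt) - B_z(dx/dt)] \\ F_z &= m(d^2z/dt^2) = e[-(\partial\phi/\partial z) - (\partial A_z/\partial t) + B_y(dx/dt) - B_x(dy/dt)] \end{aligned} \quad \text{(SI)}$$
$$\begin{aligned} F_x &= m(d^2x/dt^2) = -e(\partial\phi/\partial x) - (e/c)(\partial A_x/\partial t) + (e/c)[B_z(dy/dt) - B_y(dz/dt)] \\ F_y &= m(d^2y/dt^2) = -e(\partial\phi/\partial y) - (e/c)(\partial A_y/\partial t) + (e/c)[B_x(dz/dt) - B_z(dx/dt)] \\ F_z &= m(d^2z/dt^2) = -e(\partial\phi/\partial z) - (e/c)(\partial A_z/\partial t) + (e/c)[B_y(dx/dt) - B_x(dy/dt)] \end{aligned} \quad \text{(cgs)} \qquad (3.36.1)$$
Such equations can also be derived conveniently by using the Lagrangian¹²² function L:
$$L(r_{ij}, dr_{ij}/dt) = T(dr_{ij}/dt) - U(r_{ij}) \quad (i = 1, 2, \ldots, N;\ j = 1, 2, 3) \qquad ((2.6.2))$$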
which in this case is
$$L = (m/2)[(dx/dt)^2 + (dy/dt)^2 + (dz/dt)^2] + e[A_x(dx/dt) + A_y(dy/dt) + A_z(dz/dt)] - e\phi \quad \text{(SI)} \qquad (3.36.2)$$
$$L = (m/2)\sum_i (dx_i/dt)^2 + (e/c)\sum_i A_i(dx_i/dt) - e\phi \quad \text{(cgs)} \qquad (3.36.3)$$
or, written out more explicitly:
$$L = (m/2)[(dx/dt)^2 + (dy/dt)^2 + (dz/dt)^2] + (e/c)[A_x(dx/dt) + A_y(dy/dt) + A_z(dz/dt)] - e\phi \quad \text{(cgs)} \qquad (3.36.4)$$
¹²⁰ Hendrik Antoon Lorentz (1853–1928). ¹²¹ Sir Isaac Newton (1642–1727). ¹²² Joseph Louis Lagrange (1736–1813).
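From L one can get the generalized momentum p conjugate to the generalized coordinate q:
$$p_i = \partial L/\partial(dq_i/dt) \quad (i = 1, 2, 3) \qquad (3.36.5)$$
(here q = r); thus the conjugate momenta are
$$\mathbf{p} = m(d\mathbf{r}/dt) + e\mathbf{A} \ \text{(SI)}; \qquad \mathbf{p} = m(d\mathbf{r}/dt) + (e/c)\mathbf{A} \ \text{(cgs)} \qquad (3.36.6)$$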
Using these results, the classical Hamilton’s function can be defined by
$$H \equiv \mathbf{p}\cdot(d\mathbf{r}/dt) - L = (m/2)[(dx/dt)^2 + (dy/dt)^2 + (dz/dt)^2] + e\phi \quad \text{(SI)} \qquad (3.36.7)$$
or in terms of coordinates and momenta:
$$H = (1/2m)[(p_x - eA_x)^2 + (p_y - eA_y)^2 + (p_z - eA_z)^2] + e\phi \quad \text{(SI)} \qquad (3.36.8)$$
$$H = (1/2m)[(p_x - ec^{-1}A_x)^2 + (p_y - ec^{-1}A_y)^2 + (p_z - ec^{-1}A_z)^2] + e\phi \quad \text{(cgs)} \qquad (3.36.9)$$
After using the classical-to-quantum correspondence $p_x = -i\hbar(\partial/\partial x)$, and so on, and fully expanding the quadratic forms, the quantum-mechanical Hamiltonian operator $\hat{H}$ becomes
$$\hat{H} = (1/2m)[-\hbar^2\nabla^2 + i\hbar e\,\nabla\cdot\mathbf{A} + 2i\hbar e\,\mathbf{A}\cdot\nabla + e^2\mathbf{A}\cdot\mathbf{A}] + e\phi \quad \text{(SI)}$$
$$\hat{H} = (1/2m)[-\hbar^2\nabla^2 + i\hbar ec^{-1}\,\nabla\cdot\mathbf{A} + 2i\hbar(e/c)\,\mathbf{A}\cdot\nabla + e^2c^{-2}\mathbf{A}\cdot\mathbf{A}] + e\phi \quad \text{(cgs)} \qquad (3.36.10)$$
The customary Coulomb gauge ($\nabla\cdot\mathbf{A} = 0$) eliminates the second term inside the square brackets; furthermore, in an electromagnetic wave there is no source of charges, so the electrostatic potential also vanishes, $\phi = 0$. Finally, the term $e^2\mathbf{A}\cdot\mathbf{A}$ (or $e^2c^{-2}\mathbf{A}\cdot\mathbf{A}$) is important only in “double-quantum transitions” in strong electric fields, where the first light quantum propels the electron into a very short-lived “virtual” state and the second quantum takes this excited electron to the upper level. These double-quantum transitions are weak and can be identified by their quadratic dependence on the light intensity; they were discovered by Göppert-Mayer¹²³. Here we ignore these double-quantum transitions and set the relevant term to zero. Therefore, what is left in $\hat{H}$ is a sum over all M electrons in the atom or molecule:
$$\hat{H} = -\frac{\hbar^2}{2m}\sum_{m=1}^{M}\nabla_m^2 + \hat{V} + \frac{ie\hbar}{m}\sum_{m=1}^{M}\mathbf{A}_m\cdot\nabla_m \quad \text{(SI)} \qquad (3.36.11)$$
¹²³ Maria Göppert-Mayer (1906–1972).
$$\hat{H} = -\frac{\hbar^2}{2m}\sum_{m=1}^{M}\nabla_m^2 + \hat{V} + \frac{ie\hbar}{mc}\sum_{m=1}^{M}\mathbf{A}_m\cdot\nabla_m \quad \text{(cgs)} \qquad (3.36.12)$$
where $\hat{V}$ is the potential energy for the atom or molecule in the absence of the external electromagnetic field (but including, for example, the internal Coulomb field due to the electrostatic attraction between electrons and nuclei). Thus, the time-dependent perturbation Hamiltonian operator is rather simply:
$$\hat{H}^{(1)}(t) = \frac{ie\hbar}{m}\sum_{m=1}^{M}\mathbf{A}_m\cdot\nabla_m \ \text{(SI)}; \qquad \hat{H}^{(1)}(t) = \frac{ie\hbar}{mc}\sum_{m=1}^{M}\mathbf{A}_m\cdot\nabla_m \ \text{(cgs)} \qquad (3.36.13)$$
To obtain the matrix elements $\langle H^{(1)}_{i \to f}(t)\rangle$, we take, not the time-independent wavefunction $\psi_i(\mathbf{r})$ of Eq. (3.34.6), but its time-dependent form $\Psi_i(\mathbf{r}; t)$:
$$\Psi_i(\mathbf{r}, t) = \exp(-iE_i t\hbar^{-1})\,\psi_i(\mathbf{r}) \qquad (3.36.14)$$
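so that the coupling matrix element becomes
$$\langle H^{(1)}_{i \to f}(t)\rangle = \int \Psi_i^*(\mathbf{r}, t)\,\hat{H}^{(1)}(t)\,\Psi_f(\mathbf{r}, t)\,dV(\mathbf{r}) \qquad (3.36.15)$$
which, using Eq. (3.36.13), reads
$$\begin{aligned} \langle H^{(1)}_{i \to f}(t)\rangle &= i(e\hbar/m)\sum_{m=1}^{M}\int \exp(i\hbar^{-1}E_i t)\,\psi_i^*(\mathbf{r})\,\mathbf{A}_m\cdot\nabla_m\,\exp(-i\hbar^{-1}E_f t)\,\psi_f(\mathbf{r})\,dV(\mathbf{r}) \\ &= i(e\hbar/m)\exp[i\hbar^{-1}(E_i - E_f)t]\sum_{m=1}^{M}\int \psi_i^*(\mathbf{r})\,\mathbf{A}_m\cdot\nabla_m\,\psi_f(\mathbf{r})\,dV(\mathbf{r}) \quad \text{(SI)} \end{aligned}$$
$$\langle H^{(1)}_{i \to f}(t)\rangle = i(e\hbar/mc)\exp[i\hbar^{-1}(E_i - E_f)t]\sum_{m=1}^{M}\int \psi_i^*(\mathbf{r})\,\mathbf{A}_m\cdot\nabla_m\,\psi_f(\mathbf{r})\,dV(\mathbf{r}) \quad \text{(cgs)}$$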
If $\mathbf{A}_m$ does not vary much over the volume of the molecule (because photons of wavelength 500 nm are much “larger” than a molecule of diameter 1 nm), then we can take $\mathbf{A}_m$ out of the integral, and the above expression becomes
$$\langle \hat{H}^{(1)}_{i \to f}(t)\rangle = i(e\hbar/mc)\exp[i\hbar^{-1}(E_i - E_f)t]\sum_{m=1}^{M}\mathbf{A}_m\cdot\int \psi_i^*(\mathbf{r})\,\nabla_m\,\psi_f(\mathbf{r})\,dV(\mathbf{r}) \quad \text{(cgs)}$$
For a plane-polarized light wave (defined by $A_x \ne 0$, $A_y = A_z = 0$) the matrix element becomes
$$\langle \hat{H}^{(1)}_{i \to f}(t)\rangle = i(e\hbar/mc)\exp[i\hbar^{-1}(E_i - E_f)t]\sum_{m=1}^{M}A_x\int \psi_i^*(\mathbf{r})\,(\partial/\partial x)\,\psi_f(\mathbf{r})\,dV(\mathbf{r})$$
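If the wavefunction depends only on x, rather than on y or z, we get, after some manipulation,
$$\langle \hat{H}^{(1)}_{i \to f}(t)\rangle = i[A_x/\hbar c](E_f - E_i)\exp[i\hbar^{-1}(E_i - E_f)t]\,\langle\mu_{x, i \to f}\rangle \qquad (3.36.16)$$
where the integral
$$\langle\mu_{x, i \to f}\rangle \equiv e\sum_{m=1}^{M}\int \psi_i^*(\mathbf{r})\,x\,\psi_f(\mathbf{r})\,dV(\mathbf{r}) \qquad (3.36.17)$$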
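is the transition moment in the x direction between the lower level i and the upper level f. Similar transition moment components can be defined in the y and z directions:
$$\langle\mu_{y, i \to f}\rangle \equiv e\sum_{m=1}^{M}\int \psi_i^*(\mathbf{r})\,y\,\psi_f(\mathbf{r})\,dV(\mathbf{r})$$
$$\langle\mu_{z, i \to f}\rangle \equiv e\sum_{m=1}^{M}\int \psi_i^*(\mathbf{r})\,z\,\psi_f(\mathbf{r})\,dV(\mathbf{r})$$
so, finally, the static electric dipole transition moment vector $\boldsymbol{\mu}_{i \to f}$ is defined by
$$\langle\boldsymbol{\mu}_{i \to f}\rangle \equiv e\sum_{m=1}^{M}\int \psi_i^*(\mathbf{r})\,\mathbf{r}\,\psi_f(\mathbf{r})\,dV(\mathbf{r}) \qquad (3.36.18)$$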
Put differently, the probability that the system, having started in state i at t = 0, will be in state f at time t is given by [7]
$$P_f(t) = a_f^*a_f = \pi^2\nu_{i \to f}^2\,c^{-2}\hbar^{-2}\,|A^0(\nu_{i \to f})|^2\,|\langle\mu_{i \to f}\rangle|^2\,t \qquad (3.36.19)$$
which agrees with the spirit of Eq. (3.35.16). We can also relate the transition moment to the simple perturbation term [33]:
$$\int \psi_i^*\,\hat{H}^{(1)}\,\psi_f\,dV = E_x(\nu, t)\,\langle\mu_{i \to f}\rangle \qquad (3.36.20)$$
PROBLEM 3.36.1. Show that a plane-polarized light wave is defined by $A_x \ne 0$, $A_y = A_z = 0$.
PROBLEM 3.36.2. Prove Eq. (3.36.18).
3.37 TRANSITION MOMENT AND EINSTEIN COEFFICIENTS
We next connect the time-varying vector potential A with the electric field E. The vector potential A depends on time as follows:
$$A(\nu) = (1/2)A^0[\exp(2\pi i\nu t) + \exp(-2\pi i\nu t)] = A^0\cos(2\pi\nu t) \qquad (3.37.1)$$
and the electric field can be written as
$$E(\nu) = -(1/c)(\partial A(\nu)/\partial t) = (2\pi\nu/c)\,A^0\sin(2\pi\nu t) \qquad (3.37.2)$$
Since $\langle\sin^2(2\pi\nu t)\rangle = 1/2$, the mean-square field is
$$\langle E(\nu_{i \to f})^2\rangle = (2\pi^2\nu_{i \to f}^2\,c^{-2})\,|A^0(\nu_{i \to f})|^2 \qquad (3.37.3)$$
For the radiation density $\rho(\nu_{i \to f})$, instead of the Planck formula, we use the energy density of the electromagnetic field:
$$\rho(\nu_{i \to f}) = (1/8\pi)(\langle E^2\rangle + \langle H^2\rangle) \quad \text{(cgs)} \qquad (3.37.4)$$
Since the magnetic field intensity H and the electric field intensity E have the same magnitude and frequency (provided that E is measured in cgs-esu and H in cgs-emu), the above reduces to
$$\rho(\nu_{i \to f}) = (1/4\pi)\langle E^2\rangle \quad \text{(cgs-esu)} \qquad (3.37.5)$$
In terms of the vector potential this becomes
$$\rho(\nu_{i \to f}) = (3\pi\nu_{i \to f}^2/2c^2)\,|A^0(\nu_{i \to f})|^2 \quad \text{(cgs-esu)} \qquad (3.37.6)$$
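We can now rewrite Eq. (3.36.19) as follows:
$$P_f(t) = a_f^*a_f = (\pi^2\hbar^{-2}c^{-2})\,\nu_{i \to f}^2\,\rho(\nu_{i \to f})\,(2c^2/3\pi)\,\nu_{i \to f}^{-2}\,|\langle\mu_{i \to f}\rangle|^2\,t$$
or:
$$P_f(t) = a_f^*a_f = (2\pi/3\hbar^2)\,|\langle\mu_{i \to f}\rangle|^2\,\rho(\nu_{i \to f})\,t \qquad (3.37.7)$$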
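We now finally define the Einstein coefficient for induced absorption as
$$B_{i \to f} \equiv [2\pi/3\hbar^2]\,|\langle\mu_{i \to f}\rangle|^2 = 1.883 \times 10^{54}\,|\langle\mu_{i \to f}\rangle|^2 \quad \text{(cgs)} \qquad (3.37.8)$$
Therefore the Einstein coefficient for spontaneous emission becomes, using Eq. (3.33.11),
$$A_{f \to i} = [32\pi^3/3\hbar c^3]\,(\nu_{i \to f})^3\,|\langle\mu_{i \to f}\rangle|^2 = 1.161 \times 10^{-2}\,(\nu_{i \to f})^3\,|\langle\mu_{i \to f}\rangle|^2 \quad \text{(cgs)} \qquad (3.37.9)$$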
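To get a feeling for the magnitudes involved, the short sketch below evaluates the induced-absorption coefficient of Eq. (3.37.8) and then the spontaneous-emission coefficient through Eq. (3.33.11), in cgs units. The transition moment of 1 debye and the use of the mid-visible frequency from Section 3.33 are hypothetical inputs chosen only for illustration; the function names are likewise not from the text.

```python
import math

HBAR = 1.054571817e-27  # reduced Planck constant, erg s
H = 6.62607015e-27      # Planck constant, erg s
C = 2.99792458e10       # speed of light, cm/s
DEBYE = 1.0e-18         # 1 debye in esu cm

def einstein_b(mu_esu_cm):
    """Eq. (3.37.8): B = (2*pi / 3*hbar^2) * |mu|^2, cgs units."""
    return 2.0 * math.pi / (3.0 * HBAR**2) * mu_esu_cm**2

def einstein_a(mu_esu_cm, nu_hz):
    """Eq. (3.33.11): A = (8*pi*h*nu^3 / c^3) * B."""
    return 8.0 * math.pi * H * nu_hz**3 / C**3 * einstein_b(mu_esu_cm)

mu = 1.0 * DEBYE  # hypothetical transition moment
nu = 6.0e14       # Hz, mid-visible example
b_coeff = einstein_b(mu)
a_coeff = einstein_a(mu, nu)
radiative_lifetime = 1.0 / a_coeff  # s, if spontaneous emission were the only decay channel
print(f"B = {b_coeff:.3e} (cgs), A = {a_coeff:.3e} s^-1, lifetime = {radiative_lifetime:.3e} s")
```

Because A scales as ν³, the same transition moment placed in the infrared or microwave region would give a radiative lifetime longer by many orders of magnitude.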
Dipole strength is another practical quantity, defined as
$$D_{i \to f} \equiv G\,e^{-2}\,|\langle\mu_{i \to f}\rangle|^2 \qquad (3.37.10)$$
where G is the ratio of the quantum-mechanical degeneracy of the final state f to the degeneracy of the initial state i, and $D_{i \to f}$ is in units of cm$^2$ (cgs); then the Einstein coefficient for induced absorption $B_{i \to f}$ becomes
$$B_{i \to f} = [2\pi e^2/3c\hbar^2]\,D_{i \to f} = 1.450 \times 10^{25}\,D_{i \to f} \quad \text{(cgs)} \qquad (3.37.11)$$
Oscillator Strength. Another useful quantity is the oscillator strength $f_{i \to f}$, or “f-number.” It is defined as the “effective number of electrons that can oscillate.” In classical electromagnetic theory the intensity of absorption is given by
$$I_a = f_{i \to f}\,N_i\,(\pi e^2/mc)\,\rho(\nu_{i \to f}) \qquad (3.37.12)$$
where $N_i$ is the number of electrons in the state i, e is the electronic charge, c is the speed of light, and m is the electron mass. An electron is taken as an oscillator with its own characteristic frequency, which can be excited by light of the same frequency, at resonance. Then $f_{i \to f}$ = 1 for a three-dimensional harmonic oscillator, 1/3 for a one-dimensional oscillator, and 2/3 for a two-dimensional oscillator. Finally, $f_{i \to f}$ is the effective number of electrons that contribute to a given absorption band. The sum of all f values for a system should equal the number of electrons (Kuhn–Thomas sum rule). In terms of the quantities defined above:
$$f_{i \to f} = (4\pi m/3\hbar e^2)\,G\,\nu_{i \to f}\,|\langle\mu_{i \to f}\rangle|^2 \qquad (3.37.13)$$
The oscillator strength can be related to the Einstein coefficient of induced absorption:
$$f_{i \to f} = (2hmc/e^2)\,\nu_{i \to f}\,B_{i \to f} = 7.483 \times 10^{-15}\,\nu_{i \to f}\,B_{i \to f} \qquad (3.37.14)$$
3.38 QUANTUM ELECTRODYNAMICS [14]
The emission or absorption of electromagnetic radiation by matter could not be left in a classical framework when matter was being treated by quantum mechanics. This imbalance led to quantum electrodynamics (QED), where the radiation field itself was also quantized. This procedure had some mathematical difficulties, since certain definite integrals diverged when the limits reached infinity; the so-called “box normalization” or “renormalization,” restricting the integration to a finite range, solved the divergence. A key concept in QED is that matter can interact at a distance by using an interchange of “virtual photons” as messengers for the interaction, traveling at the speed of light. Photons are bosons (spin = 1 quantities) with zero rest mass but nonzero relativistic mass. Real photons carry energy; virtual photons do not. Virtual photons (or other virtual particles) exist within the framework of the uncertainty principle, for lifetimes $\Delta t$ below the uncertainty principle limit $\Delta E\,\Delta t \ge \hbar/2$; their brief existence does not create a net flux and does not violate the principle of conservation of mass–energy.
3.39 NORMAL MODES OF A CONTINUOUS ELASTIC SYSTEM [14]
We want to learn how to quantize the radiation field. As a first step, consider a continuous elastic system. Any classical continuous elastic system in one dimension can be treated by a normal-mode analysis. Consider an elastic string of length a [m], tied at both ends to some fixed objects, with density per unit length $\rho$ [kg m$^{-1}$], and tension, or Hooke’s law force constant $k_H$ [N m$^{-1}$]. The transverse displacements of the string along the x axis can be described by a transverse stretch y(x, t) at any point x along the string and at a time t. One can describe y(x, t) as a Fourier sine series in x:
$$y(x, t) = \sum_{s=1}^{\infty}\phi_s(t)\sin(s\pi x/a) \qquad (3.39.1)$$
where the $\phi_s(t)$ can be shown to be the normal-mode coordinates of the system. Indeed, the kinetic energy can be written as
$$T = (\rho/2)\int_{x=0}^{x=a}(\partial y/\partial t)^2\,dx \qquad (3.39.2)$$
$$= (\rho a/4)\sum_{s=1}^{\infty}(d\phi_s(t)/dt)^2 \qquad (3.39.3)$$
while the potential energy can be written as
$$V = (k_H/2)\int_{x=0}^{x=a}(\partial y/\partial x)^2\,dx \qquad (3.39.4)$$
$$= (k_H/4a)\sum_{s=1}^{\infty}s^2\pi^2[\phi_s(t)]^2 \qquad (3.39.5)$$
This allows us to obtain the Lagrangian function:
$$L = T - V = \sum_{s=1}^{\infty}\{(\rho a/4)(d\phi_s(t)/dt)^2 - (k_H s^2\pi^2/4a)[\phi_s(t)]^2\} \qquad (3.39.6)$$
which allows us to find the equations of motion for the normal coordinates:
$$d^2\phi_s(t)/dt^2 + (s^2\pi^2k_H/a^2\rho)\,\phi_s(t) = 0 \qquad (3.39.7)$$
with the usual solutions:
$$\phi_s(t) = A\exp[i(\pi s/a)(k_H/\rho)^{1/2}t] + B\exp[-i(\pi s/a)(k_H/\rho)^{1/2}t] \qquad (3.39.8)$$
If we define $\omega_s^2 \equiv s^2\pi^2k_H/a^2\rho$, then Eq. (3.39.7) is equivalent to
$$d^2\phi_s(t)/dt^2 + \omega_s^2\,\phi_s(t) = 0 \qquad (3.39.9)$$
Then from the Lagrangian L we can find the momenta $\Pi_s$ conjugate to the $\phi_s(t)$:
$$\Pi_s = \partial L/\partial(d\phi_s/dt) = (\rho a/2)(d\phi_s/dt) \qquad (3.39.10)$$
and finally construct the classical Hamilton’s function for the string:
$$H = \sum_{s=1}^{\infty}\{(1/a\rho)(\Pi_s)^2 + (k_H s^2\pi^2/4a)[\phi_s(t)]^2\} \qquad (3.39.11)$$
and then, by replacing the classical conjugate momentum $\Pi_s$ by its quantum equivalent, $(\hbar/i)(\partial/\partial\phi_s)$, construct the quantum-mechanical Hamiltonian operator:
$$\hat{H} = \sum_{s=1}^{\infty}\{-(\hbar^2/a\rho)(\partial/\partial\phi_s)^2 + (k_H s^2\pi^2/4a)[\phi_s]^2\} \qquad (3.39.12)$$
The terms of the Hamiltonian $\hat{H}$ that belong to different normal modes are not linked. Therefore one can find product eigenfunctions for the string, with each factor belonging to a different classical normal mode:
$$\Psi(\phi_s, t) = \prod_{s=1}^{\infty}\psi_{n_s}(\phi_s, t) \qquad (3.39.13)$$
where $n_s$ is some quantum number for the sth mode (which we will proceed to find). The relevant time-dependent Schrödinger equation is now
$$\{-(\hbar^2/a\rho)(\partial/\partial\phi_s)^2 + (k_H s^2\pi^2/4a)[\phi_s]^2\}\,\psi_{n_s}(\phi_s, t) = i\hbar(\partial\psi_{n_s}(\phi_s, t)/\partial t) \qquad (3.39.14)$$
By assuming factorability of the space and time dependences:
$$\psi_{n_s}(\phi_s, t) = u_{n_s}(\phi_s)\exp(-i\hbar^{-1}E_{n_s}t) \qquad (3.39.15)$$
one gets a time-independent Schrödinger equation:
$$-(\hbar^2/a\rho)(d^2u_{n_s}/d\phi_s^2) + (k_H s^2\pi^2/4a)\,\phi_s^2\,u_{n_s} = E_{n_s}u_{n_s} \qquad (3.39.16)$$
which is the harmonic oscillator problem! The eigenfunctions are well known:
$$u_{n_s}(\phi_s) = N_{n_s}H_{n_s}(\alpha_s\phi_s)\exp(-\alpha_s^2\phi_s^2/2) \qquad (3.39.17)$$
where $N_n$ is a normalization factor, $H_n(x)$ is the nth-order Hermite polynomial, and
$$\alpha_s^2 = m\omega_s/\hbar = (s\pi/2\hbar)(k_H\rho)^{1/2} \qquad (3.39.18)$$
The “equivalent mass” is $m = \rho a/2$, and the eigenenergies are
$$E_{n_s} = (n_s + 1/2)\hbar\omega_s = (n_s + 1/2)(s\pi\hbar/a)(k_H/\rho)^{1/2} \quad (n_s = 0, 1, 2, \ldots) \qquad (3.39.19)$$
This means that a continuous string can be represented as existing in any of an infinite number of evenly spaced eigenstates.
PROBLEM 3.39.1. Prove Eq. (3.39.3).
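The decoupling of the string energy into independent normal modes can be verified numerically. The sketch below builds a displacement from three modes with hypothetical amplitudes and velocities, integrates the energy densities directly over the string, and compares the results with the mode sums of Eqs. (3.39.3) and (3.39.5). It assumes a kinetic energy density (ρ/2)(∂y/∂t)² and a potential energy density (k_H/2)(∂y/∂x)²; all numerical values are arbitrary.

```python
import numpy as np

a, rho, k_h = 2.0, 0.5, 3.0            # hypothetical length, linear density, tension constant
phi = np.array([0.3, -0.1, 0.05])      # mode amplitudes phi_s for s = 1, 2, 3
phi_dot = np.array([0.2, 0.4, -0.3])   # mode velocities d(phi_s)/dt
s = np.arange(1, phi.size + 1)

x = np.linspace(0.0, a, 20001)
modes = np.sin(np.outer(s, x) * np.pi / a)                              # sin(s*pi*x/a)
slopes = (s[:, None] * np.pi / a) * np.cos(np.outer(s, x) * np.pi / a)  # d/dx of each mode

def integrate(f):
    """Trapezoidal rule on the grid x."""
    return np.sum((f[1:] + f[:-1]) * np.diff(x)) / 2.0

kinetic_direct = 0.5 * rho * integrate((phi_dot @ modes) ** 2)
potential_direct = 0.5 * k_h * integrate((phi @ slopes) ** 2)
kinetic_modes = rho * a / 4.0 * np.sum(phi_dot**2)
potential_modes = k_h / (4.0 * a) * np.sum(s**2 * np.pi**2 * phi**2)
omega_s = (s * np.pi / a) * np.sqrt(k_h / rho)  # mode angular frequencies
print(kinetic_direct, kinetic_modes, potential_direct, potential_modes)
```

The cross terms between different modes integrate to zero over the length of the string, which is why each mode can later be quantized as an independent harmonic oscillator with frequency ω_s proportional to s.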
3.40 QUANTIZATION OF THE ELECTROMAGNETIC FIELD [14]
Now we will introduce quantum electrodynamics. Just as we quantized the atoms and molecules, we must also quantize the electromagnetic radiation field, to deal with field–molecule interactions properly [14,34]. The electromagnetic field can be represented (in the Coulomb gauge, with scalar potential set to zero: $\phi = 0$) as due solely to a vector potential A; a periodic three-dimensional wavefront A(r, t) propagating along a propagation wavevector K can be expanded in a Fourier series within a cube of side a:
$$\mathbf{A}(\mathbf{r}, t) = \sum_{\mathbf{K}}\{(\mathbf{e}_1 q_{K1} + \mathbf{e}_2 q_{K2})\exp(i\mathbf{K}\cdot\mathbf{r}) + (\mathbf{e}_1 q_{K1}^* + \mathbf{e}_2 q_{K2}^*)\exp(-i\mathbf{K}\cdot\mathbf{r})\} \qquad (3.40.1)$$
where $\mathbf{e}_1$ and $\mathbf{e}_2$ are mutually orthogonal unit vectors, which define the polarization components of A, while $q_{K1}$ and $q_{K2}$ are independent normal coordinates (complex quantities in general) which include within them the time dependence. Since A must be periodic along all three mutually orthogonal coordinate axes x, y, and z, K is restricted to values
$$\mathbf{K} = (2\pi/a)(h\mathbf{e}_x + k\mathbf{e}_y + l\mathbf{e}_z) \qquad (3.40.2)$$
where h, k, and l are integers, and $\mathbf{e}_x$, $\mathbf{e}_y$, and $\mathbf{e}_z$ are unit vectors. For any given K, the wavelength is
$$\lambda = 2\pi/|\mathbf{K}| \qquad (3.40.3)$$
Classically, each of the normal coordinates $q_K$ for the vector potential satisfies an equation similar to Eq. (3.39.9):
$$d^2q_K/dt^2 + \omega_K^2\,q_K = 0 \qquad (3.40.4)$$
and there is also the traditional connection (dispersion relation) between the angular frequency $\omega_K$ and the wavevector K:
$$\omega_K^2 = c^2K^2 \qquad (3.40.5)$$
where c is the speed of light. So, A consists of superpositions of plane waves of all polarizations and phases, traveling with speed c parallel to each K vector. Each plane wave can be written as
$$\mathbf{A}_K = \mathbf{e}_K A_K\exp[i(\mathbf{K}\cdot\mathbf{r} - \omega_K t)] \qquad (3.40.6)$$
This must now be translated into a quantum-mechanical formalism. What we have learned above permits us to write a Schrödinger equation similar to Eq. (3.39.14), whose solutions will be of the harmonic oscillator type:
$$\psi_n(q_K, t) = u_n(q_K)\exp(-i\hbar^{-1}E_n t) \qquad (3.40.7)$$
$$= N_n H_n(\alpha_K q_K)\exp[-(\alpha_K^2q_K^2/2) - i(n + 1/2)\omega_K t] \qquad (3.40.8)$$
where $H_n$ is again the nth-order Hermite polynomial, the quantum number n depends on the wavevector K (but this is not shown here for typographic simplicity), $N_n$ is a normalization constant, and the convenient lumped constant $\alpha_K$ is given by
$$\alpha_K^2 = \varepsilon_0 a^3\hbar^{-1}\omega_K \qquad (3.40.9)$$
where $\varepsilon_0$ is the permittivity of vacuum. The following picture emerges: The radiation field, represented classically by a vector potential (that is, by a superposition of plane waves, as before, with transverse electrical and magnetic fields) is now, in quantum electrodynamics, a system with quantized energies. The general eigenfunction is then a product eigenfunction of the type
$$\psi_n(q_K, t) = \prod_K u_{n_K}(q_K)\exp[-i(n_K + 1/2)\omega_K t] \qquad (3.40.10)$$
$$\psi_n(q_K, t) = U_{n_{K1} \ldots n_{K2} \ldots}\exp[-it\sum_K (n_K + 1/2)\omega_K] \qquad (3.40.11)$$
Elementary quantum mechanics showed that a plane wave $\exp(i\mathbf{K}\cdot\mathbf{r})$ has the same dependence on space and time as the wavefunction of a particle with momentum $\hbar\mathbf{K}$. This, plus the quantization of the normal modes of vibration of the electromagnetic radiation field (just demonstrated), form, together, the quantum-mechanical basis for the wave-particle duality: A wave can become a particle, and vice versa, but you can never make a simultaneous experiment to test both the wave and the particle nature of the same system.
PROBLEM 3.40.1. Show that the electric field $\mathbf{E}_K$ corresponding to $\mathbf{A} = \mathbf{e}_1 q_1\exp(i\mathbf{K}\cdot\mathbf{r})$ is $\mathbf{E}_K = -\mathbf{e}_1(\partial q_1/\partial t)\exp(i\mathbf{K}\cdot\mathbf{r})$ [14].
PROBLEM 3.40.2. Show that the magnetic induction $\mathbf{B}_K$ corresponding to $\mathbf{A} = \mathbf{e}_1 q_1\exp(i\mathbf{K}\cdot\mathbf{r})$ is $\mathbf{B}_K = i\mathbf{K} \times \mathbf{e}_1 q_1\exp(i\mathbf{K}\cdot\mathbf{r})$ [14].
PROBLEM 3.40.3. For a transverse electromagnetic wave, the vectors $\mathbf{E}_K$, $\mathbf{B}_K$, and K must be mutually perpendicular. Prove that, for this to be true, $\mathbf{E}_K\cdot\mathbf{K} = 0$ and $\mathbf{B}_K\cdot\mathbf{K} = 0$ are required (same as Problem 2.7.5).
PROBLEM 3.40.4. Show that A, E, and B are real [14].
3.41 TRANSITIONS IN THE RADIATION FIELD
We return to the definition of the Lagrangian function, Eq. (3.36.2), for a particle with mass m and electrical charge e subjected to a magnetic vector potential A and to a scalar potential $\phi$:
$$L = (m/2)\sum_i (dx_i/dt)^2 + e\sum_i A_i(dx_i/dt) - e\phi \quad \text{(SI)} \qquad ((3.36.2))$$
from which the momenta $p_i$ canonically conjugate to the coordinates $x_i$ are
$$p_i = \partial L/\partial(dx_i/dt) = m(dx_i/dt) + eA_i \qquad (3.41.1)$$
and the classical Hamilton’s function becomes
$$H = (1/2m)\sum_i (p_i - eA_i)^2 + e\phi \qquad (3.41.2)$$
which can be expanded into a form that is essentially the same as Eq. (3.36.8), namely,
$$H = \Big[(1/2m)\sum_i p_i^2 + e\phi\Big] - (e/m)\sum_i p_iA_i + \Big\{(e^2/2m)\sum_i A_i^2\Big\} \qquad (3.41.3)$$
Usually the third term, in braces, is small; it is involved in two-photon processes (so-called double-quantum transitions). The first term, in square brackets, is the usual Hamilton’s function for atoms and molecules in the electrostatic field of the other electrons and nuclei. The second term, $-(e/m)\sum_i p_iA_i$, is therefore the main term, representing the interaction of the atom or molecule with the electromagnetic field:
$$V = -(e/m)\sum_i p_iA_i \qquad (3.41.4)$$
This is the explicit form of the interaction potential $\hat{H}^{(1)}(t)$ introduced in Eq. (3.34.2). We must now evaluate the matrix element of Eq. (3.41.4), using the eigenfunctions $u_n(q_i)$ (n = 1, 2, ..., N) of the system (atom or molecule) and also the eigenfunctions $U_m(q_K)$ (m = 1, 2, ..., $\infty$) of the radiation field, Eq. (3.40.11). The integral given below looks formidable, but we will see that it becomes quite simple. We must integrate over the coordinates $q_i$ of the N particles of the system, as well as over the infinite number of normal modes $q_K$ of the radiation field:
$$\langle b\eta|V|an\rangle = -(e/m)\sum_i\int_{q_1}\cdots\int_{q_N}u_b^*\,p_i\left[\int_{q_1}\cdots\int_{q_\infty}U_\eta^*\,A_i\,U_n\,d^\infty q_K\right]u_a\,d^Nq_i \qquad (3.41.5)$$
The part of Eq. (3.41.5) enclosed in square brackets, is of interest: ð ð hhjAi jni ¼ eK expðiK rÞ . .. Uh *Ai Un d1 qK q1
ð3:41:6Þ
q1
By writing the magnetic vector potential as in Eq. (3.40.6), we find that Eq. (3.41.6) is a sum of a number of terms, each for a plane wave of different propagation direction K, frequency, and polarization direction; the vector potential component in the integrand is replaced by a qK, and the Uh are replaced by Eq. (3.40.7). Since the set of eigenfunctions forms an orthonormal set, the integrals involving the extra qK are unity, except for the single integral: ð Uh *qK Un dqK ¼ ðNh Nn =aK Þ
ð x¼þ1 x¼1
dx Hh ðxÞxHn ðxÞexpðx2 Þ
ð3:41:7Þ
which, by using the recurrence formula for Hermite polynomials, Eq. (3.4.10), reduces to
\[ \int U_h^{*}\, q_K\, U_n\, dq_K = (N_h N_n/2a_K) \int_{x=-\infty}^{x=+\infty} dx\, H_h(x)\,[H_{n+1}(x) + 2nH_{n-1}(x)] \exp(-x^2) = (N_h N_n/2N_h^2 a_K)(\delta_{h,n+1} + 2n\,\delta_{h,n-1}) \tag{3.41.8} \]
with the following two possible values:
\[ = N_n/2N_{n+1}a_K = (n+1)^{1/2}\, 2^{-1/2}\, a_K^{-1} \quad \text{for } h = n+1 \text{ (system emits photon)} \tag{3.41.9} \]
\[ = nN_n/N_{n-1}a_K = n^{1/2}\, 2^{-1/2}\, a_K^{-1} \quad \text{for } h = n-1 \text{ (system absorbs photon)} \tag{3.41.10} \]
Thus the nonzero matrix element is only either
\[ \langle n+1|A_i|n\rangle \tag{3.41.11} \]
when the radiation field gains one quantum of vibration from the system, or
\[ \langle n-1|A_i|n\rangle \tag{3.41.12} \]
when the radiation field loses one quantum of vibration to the system. These last two results, Eqs. (3.41.11) and (3.41.12), are of primary importance. The first-order transitions, due to the magnetic vector potential, will only occur either
1. if the radiation field gains one photon from the molecule, or
2. if the radiation field loses one photon and gives it to the system.
The final expression for the probability of absorption, as in Eq. (3.37.8), will contain the square of terms like Eqs. (3.41.11) or (3.41.12). The selection rule for the radiation field is $\Delta n = \Delta n_K = \pm 1$. The probability of absorption by the system is proportional to n, while the probability of emission from the system is proportional to (n + 1). Here we have a combination of spontaneous emission ($\propto 1$), which will occur even in the dark, and induced emission ($\propto n$), which occurs when light stimulates even more light emission. Einstein had predicted this in a thermodynamic analysis of blackbody radiation.
PROBLEM 3.41.1. Derive Eq. (3.41.3) from Eqs. (3.41.1) and (3.41.2).
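3.42 GENERAL RADIATIVE TRANSITIONS
Now that we know what the field eigenfunctions can do, we turn to the rest of the matrix element Eq. (3.41.5):
\[ \langle bh|V|an\rangle = -(e/m)\sum_i \int_{q_1}\!\cdots\!\int_{q_N} u_b^{*}\,p_i \left[\int_{q_1}\!\cdots\!\int_{q_\infty} U_h^{*} A_i U_n\, d^{\infty}q_K\right] u_a\, d^{N}q_i \tag{3.41.5} \]
In particular, using Eq. (3.41.6):
\[ \langle b\,n+1|V|a\,n\rangle = -(e/m)(n+1)^{1/2}\, 2^{-1/2}\, a_K^{-1} \sum_i \int_{q_1}\!\cdots\!\int_{q_N} u_b^{*}\, p_i\, \mathbf{e}_K \exp(i\mathbf{K}\cdot\mathbf{r})\, u_a\, d^{N}q_i \]
\[ \langle b\,n-1|V|a\,n\rangle = -(e/m)\, n^{1/2}\, 2^{-1/2}\, a_K^{-1} \sum_i \int_{q_1}\!\cdots\!\int_{q_N} u_b^{*}\, p_i\, \mathbf{e}_K \exp(i\mathbf{K}\cdot\mathbf{r})\, u_a\, d^{N}q_i \]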
Now we can expand $\exp(i\mathbf{K}\cdot\mathbf{r})$ in a Taylor series, reminding ourselves that for atoms, in any event, the wavelength of a photon of visible light is of the scale of very many atomic diameters, so that $\mathbf{K}\cdot\mathbf{r}$ is of the order of the Sommerfeld fine structure constant $\alpha$ (Problem 3.42.1 below); thus we are justified in keeping the leading term only:
\[ \exp(i\mathbf{K}\cdot\mathbf{r}) = 1 + i\mathbf{K}\cdot\mathbf{r} - (1/2)(\mathbf{K}\cdot\mathbf{r})^2 - i(1/6)(\mathbf{K}\cdot\mathbf{r})^3 + \cdots \approx 1 \tag{3.42.1} \]
Then we are left with $p_i \mathbf{e}_K$; for it we use the commutator:
\[ p_i = i[m/\hbar](H r_i - r_i H) \tag{3.42.2} \]
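so the integral becomes
\[ I_x \equiv \iiint u_b^{*}\, p_x\, \mathbf{e}_K \exp(i\mathbf{K}\cdot\mathbf{r})\, u_a\, dx\,dy\,dz \approx i[m/\hbar] \iiint u_b^{*}\, \mathbf{e}_K (Hx - xH)\, u_a\, dx\,dy\,dz = i[m/\hbar](E_b - E_a) \iiint u_b^{*}\, x\, u_a\, dx\,dy\,dz \tag{3.42.3} \]
so finally we obtain
\[ \langle b\,n+1|V|a\,n\rangle = -ie\,(n+1)^{1/2}\, 2^{-1/2}\, \hbar^{-1} a_K^{-1} (E_b - E_a) \sum_i \iiint u_b^{*}\, r_i\, u_a\, dx\,dy\,dz \tag{3.42.4} \]
\[ \langle b\,n-1|V|a\,n\rangle = -ie\, n^{1/2}\, 2^{-1/2}\, \hbar^{-1} a_K^{-1} (E_b - E_a) \sum_i \iiint u_b^{*}\, r_i\, u_a\, dx\,dy\,dz \tag{3.42.5} \]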
This links the transition matrix element to the transition moment integrals $\langle b|r_i|a\rangle$ (first moments of the electron distribution) along the direction of the electric field of the emitted or absorbed photon:
\[ e\langle b|r_i|a\rangle = e\iiint u_b^{*}\, r_i\, u_a\, dx\,dy\,dz \tag{3.42.6} \]
for the atom or molecule. One can abbreviate the result using the Kronecker delta:
\[ \langle b\,n\pm 1|V|a\,n\rangle = -i\,(n + \delta_{n+1,\,n\pm 1})^{1/2}\, 2^{-1/2}\, \hbar^{-1} a_K^{-1} (E_b - E_a)\, e\langle b|r_i|a\rangle \tag{3.42.7} \]
The above result is for a single normal mode of the radiation field. A real molecular system is coupled to all modes of the radiation field, so more mathematical labor is required. If at time t = 0 we start in states $U_n$ for the radiation field and $u_a$ for the system, then the probability that at time t the radiation field will be in state $U_{n\pm 1}$ and the system will be in state $u_b$ is, in analogy to the Rabi formula of Eq. (3.34.19),
\[ W_{a,b;\,n,n\pm 1} = 4\left\{ |\langle b\,n\pm 1|V|a\,n\rangle| \,/\, [E_b - E_a \pm \hbar\omega_K] \right\}^2 \sin^2\!\left[(E_b - E_a \pm \hbar\omega_K)\,t/\hbar\right] \tag{3.42.8} \]
This will be largest when the denominator
\[ \Delta E = E_b - E_a \pm \hbar\omega_K \tag{3.42.9} \]
approaches zero, that is, for the case where energy is conserved (no surprise!). Of course, some virtual transitions will appear not to conserve energy, but, in the long run (for large times t), energy indeed must be conserved. We consider all possible frequencies by integrating over all of them, but must multiply the integrand by a density of states $\rho(\omega)$ per unit frequency interval:
\[ W_{a,b} = 4\int_{\omega=-\infty}^{\omega=+\infty} d\omega\, \rho(\omega)\, |\langle b\,n\pm 1|V|a\,n\rangle|^2 \sin^2[\Delta E\, t/\hbar]\, \Delta E^{-2} \tag{3.42.10} \]
If only the states near to resonance matter, that is, near to $\hbar\omega_0 \approx E_b - E_a$, then certain things fall out of the integrand, to yield
\[ W_{a,b} = 4\pi\hbar^{-2}\, |\langle b\,n\pm 1|V|a\,n\rangle|^2 \rho(\omega_0)\, t \tag{3.42.11} \]
Now for the density of states at resonance, $\rho(\omega_0)$. From classical electromagnetic theory, the density of normal frequencies of a cubical box of side a is
\[ dN = \rho(\omega)\,d\omega = a^3 \omega^2 \pi^{-2} c^{-3}\, d\omega \tag{3.42.12} \]
Then
\[ \rho(\omega_0) = a^3 \omega_0^2 \pi^{-2} c^{-3} \tag{3.42.13} \]
Inserting Eqs. (3.42.7) and (3.42.13) for the case $\omega_K = \omega_0$ into Eq. (3.42.11), one gets
\[ W_{a,b} = 2\,(n + \delta_{n+1,\,n\pm 1})\, e^2 \omega_0^3\, \pi^{-1} \varepsilon_0^{-1} \hbar^{-1} c^{-3}\, \langle b|r_i|a\rangle^2\, t \tag{3.42.14} \]
Therefore, the transition probability $W_{a,b}$ between states a and b is proportional to the time t, to the square of the electric transition moment $\langle b|r_i|a\rangle$ between states a and b, and to either (n + 1) for emission of radiation from the system (atom or molecule) or n for absorption. Next, let us compute the quantum-mechanical average power $P_{QM}$ radiated by the system by spontaneous emission (n = 0), that is, the photon energy times the transition rate between states a and b:
\[ P_{QM} = \hbar\omega_0\,[dW_{a,b}/dt] = 2\pi^{-1} \varepsilon_0^{-1} \omega_0^4 c^{-3} e^2\, \langle b|r_i|a\rangle^2 \tag{3.42.15} \]

Table 3.6 Relative Oscillator Strengths f and Extinction Coefficients ε [30]

Transition type                   f        ε/(cm^-1 L mol^-1)
Electric-dipole-allowed           1        10^4–10^5
Magnetic-dipole-allowed           10^-5    10^-2–10
Electric-quadrupole-allowed       10^-5    10^-4–10^-1
Singlet–triplet spin-forbidden    10^-5    10^-2–10
Parity-forbidden                  10^-1    10^3
If we average the transition moment over all three Cartesian directions, we get
\[ P_{QM} = \hbar\omega_0\,[dW_{a,b}/dt] = (2/3)\,\pi^{-1} \varepsilon_0^{-1} \omega_0^4 c^{-3} e^2 \sum_{i=1}^{3} \langle b|r_i|a\rangle^2 \tag{3.42.16} \]
Consider the classical power $P_{CL}$ emitted by an oscillating electric dipole $\mu = \mu_0 \sin(\omega_0 t)$:
\[ P_{CL} = 3^{-1} \pi^{-1} \varepsilon_0^{-1} \omega_0^4 c^{-3} \mu_0^2 \tag{3.42.17} \]
Hence a quantum-mechanical system radiates energy spontaneously at the same rate as a classical oscillating static electric-dipole transition moment of strength:
\[ \mu_0^2 = 2e^2 \sum_{i=1}^{3} \langle b|r_i|a\rangle^2 \tag{3.42.18} \]
Now we call $\mu_0$ the oscillator strength of the transition a ↔ b. Table 3.6 gives some approximate values of relative oscillator strengths.
PROBLEM 3.42.1. Show that $\mathbf{K}\cdot\mathbf{r}$ is typically of the order of magnitude of the dimensionless fine structure constant $\alpha = e^2/4\pi\varepsilon_0 c\hbar = 1/137.036$.
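As a rough numerical companion to Problem 3.42.1, the sketch below evaluates the fine structure constant and the product K r for two illustrative cases. The choice of the Bohr radius for r, of a photon energy of one Rydberg, and of a 500 nm visible photon are assumptions made here for illustration, as are the CODATA constants; none of them is taken from the text.

```python
import math

# CODATA 2018 values in SI units (assumed here; not quoted in the text)
E_CHARGE = 1.602176634e-19      # elementary charge, C
EPS0 = 8.8541878128e-12         # vacuum permittivity, F/m
HBAR = 1.054571817e-34          # reduced Planck constant, J s
C_LIGHT = 299792458.0           # speed of light, m/s
M_ELECTRON = 9.1093837015e-31   # electron mass, kg


def fine_structure_constant():
    """alpha = e^2 / (4 pi eps0 hbar c), dimensionless."""
    return E_CHARGE**2 / (4.0 * math.pi * EPS0 * HBAR * C_LIGHT)


def bohr_radius():
    """a0 = 4 pi eps0 hbar^2 / (m e^2), in metres."""
    return 4.0 * math.pi * EPS0 * HBAR**2 / (M_ELECTRON * E_CHARGE**2)


def rydberg_energy():
    """Ry = m e^4 / (2 (4 pi eps0)^2 hbar^2), in joules."""
    return M_ELECTRON * E_CHARGE**4 / (2.0 * (4.0 * math.pi * EPS0)**2 * HBAR**2)


def k_times_r(photon_energy, r):
    """K r with K = omega / c = E / (hbar c)."""
    return photon_energy / (HBAR * C_LIGHT) * r


alpha = fine_structure_constant()
# Photon of one Rydberg at the Bohr radius: K a0 works out to alpha / 2.
kr_rydberg = k_times_r(rydberg_energy(), bohr_radius())
# Visible photon (500 nm, an illustrative wavelength) at the Bohr radius.
kr_visible = k_times_r(2.0 * math.pi * HBAR * C_LIGHT / 500e-9, bohr_radius())

print("1/alpha      =", 1.0 / alpha)
print("K a0 (1 Ry)  =", kr_rydberg)
print("K a0 (500nm) =", kr_visible)
```

In both cases K r is far below unity, which is what licenses truncating the expansion of exp(iK·r) after its leading term.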
3.43 STATIC VERSUS RESONANT DETECTION [36]
Spectroscopy is traditionally performed in a static (nonresonant) fashion: A transition (absorption or emission) is observed as a function of the frequency or wavelength of light. However, if one uses an electromagnetic wave, particularly an alternating-current (AC) radio-frequency (RF) source, and electrical resonance is measured in a tuned detection circuit, then the output signal is much stronger, and we have resonant detection.
3.44 STATIC ELECTRIC-DIPOLE SELECTION RULES FOR THE ONE-ELECTRON ATOM
Now let us compute, for the one-electron atom, the three relevant “first moment” integrals $\langle n'l'm'|x|nlm\rangle$, $\langle n'l'm'|y|nlm\rangle$, and $\langle n'l'm'|z|nlm\rangle$.
\[ \langle n'l'm'|x|nlm\rangle = \int_{r=0}^{r=\infty} r^3\,dr \int_{\theta=0}^{\theta=\pi} \sin^2\theta\, d\theta \int_{\varphi=0}^{\varphi=2\pi} d\varphi\, \cos\varphi\; N_{nlm} N_{n'l'm'} R_{n'l'}(r) R_{nl}(r) P_{l'}^{m'}(\cos\theta) P_{l}^{m}(\cos\theta) \exp(-im'\varphi) \exp(im\varphi) \tag{3.44.1} \]
where the $N_{nlm}$ are normalization constants, the $P_l^m(\cos\theta)$ are associated Legendre polynomials of the first kind, and the $R_{nl}(r)$ are the radial factors, $\exp(-\alpha r)$ times associated Laguerre polynomials. See Problem 3.5.8 for further details. We shall not compute this integral for the general case but will determine when it is zero. The $\varphi$ integral is
\[ \int_{\varphi=0}^{\varphi=2\pi} d\varphi\, \cos\varphi\, \exp(-im'\varphi) \exp(im\varphi) = \pi\,\delta_{m',\,m\pm 1} \tag{3.44.2} \]
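The $\theta$ integral is, after using Eq. (3.5.55):
\[ \int_{\theta=0}^{\theta=\pi} \sin^2\theta\, d\theta\, P_{l'}^{m\pm 1}(\cos\theta)\, P_{l}^{m}(\cos\theta) = [1/(2l+1)]\,[\delta_{l',\,l+1} + \delta_{l',\,l-1}] \tag{3.44.3} \]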
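Thus the integral reduces to
\[ \langle n'\, l\pm 1\, m\pm 1|x|nlm\rangle = [\pi/(2l+1)]\, N_{nlm} N_{n',\,l\pm 1,\,m\pm 1} \int_{r=0}^{r=\infty} r^3\,dr\, R_{n',\,l\pm 1}(r) R_{nl}(r) \tag{3.44.4} \]
The other two integrals reduce to
\[ \langle n'l'm'|y|nlm\rangle = 0 \text{ unless } m' = m \pm 1 \text{ and } l' = l \pm 1 \tag{3.44.5} \]
\[ \langle n'l'm'|z|nlm\rangle = 0 \text{ unless } m' = m \text{ and } l' = l \pm 1 \tag{3.44.6} \]
Therefore the selection rules $\Delta m = 0, \pm 1$ and $\Delta l = \pm 1$ emerge (the z integrand contains no factor of $\cos\varphi$ or $\sin\varphi$, so its $\varphi$ integral requires m' = m). The integral over r gives no more selection rules. If one evaluates the general integrals with care, the relative intensities of spectral lines can be calculated. These are details important only for frenetic spectroscopists and astronomers.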
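The azimuthal integral of Eq. (3.44.2) is easy to check numerically. The sketch below is only an illustration: the function name and the number of grid points are choices made here. It uses a uniform Riemann sum over one full period, which is exact (to rounding) for trigonometric polynomials of degree lower than the number of points.

```python
import cmath
import math


def phi_integral(m_prime, m, n_points=256):
    """Riemann sum of cos(phi) exp(-i m' phi) exp(i m phi) over [0, 2 pi)."""
    h = 2.0 * math.pi / n_points
    total = 0j
    for k in range(n_points):
        phi = k * h
        total += math.cos(phi) * cmath.exp(1j * (m - m_prime) * phi)
    return total * h


# Scan m' for fixed m = 0: only m' = -1 and m' = +1 should survive.
for m_prime in range(-2, 3):
    value = phi_integral(m_prime, 0)
    print(m_prime, abs(value))
```

Only the cases m' = m + 1 and m' = m - 1 return a magnitude of π; every other combination integrates to zero, which is the content of the Kronecker delta in Eq. (3.44.2).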
Note that we have ignored spin. If we include spin, then the following rules become a complete set (at least for Russell–Saunders coupling for the one-electron atom):
\[ \Delta m = 0, \pm 1; \quad \Delta s = 0; \quad \Delta j = \pm 1, 0 \ (\text{but } j = 0 \nrightarrow 0); \quad \Delta m_j = \pm 1, 0 \ (\text{but } m_j = 0 \nrightarrow 0 \text{ if } \Delta j = 0) \]
PROBLEM 3.44.1. Prove Eq. (3.44.2).
PROBLEM 3.44.2. Prove Eq. (3.44.3) using one of two possible recursion relations:
\[ \sin\theta\, P_l^{m-1}(\cos\theta) = [1/(2l+1)]\,[P_{l+1}^{m}(\cos\theta) - P_{l-1}^{m}(\cos\theta)] \]
\[ \cos\theta\, P_l^{m}(\cos\theta) = [1/(2l+1)]\,[(l-m+1)\,P_{l+1}^{m}(\cos\theta) + (l+m)\,P_{l-1}^{m}(\cos\theta)] \]
PROBLEM 3.44.3. Prove Eqs. (3.44.5) and (3.44.6) using the recursion relation:
\[ \cos\theta\, P_l^{m} = [1/(2l+1)]\,[(l-m+1)\,P_{l+1}^{m} + (l+m)\,P_{l-1}^{m}] \]
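3.45 STATIC ELECTRIC-DIPOLE SELECTION RULES FOR THE HARMONIC OSCILLATOR
Next, let us compute, for the one-dimensional harmonic oscillator, the “first moment” integral $\langle n'|x|n\rangle$. First, remember that the eigenfunctions of the harmonic oscillator are orthonormal:
\[ \int_{x=-\infty}^{x=\infty} dx\, (2^m m!)^{-1/2} (\alpha/\pi)^{1/4} \exp(-\alpha x^2/2)\, H_m(\alpha^{1/2}x)\, (2^n n!)^{-1/2} (\alpha/\pi)^{1/4} \exp(-\alpha x^2/2)\, H_n(\alpha^{1/2}x) = \delta_{mn} \]
Next, set up the first moment integral:
\[ \langle n'|x|n\rangle = \int_{x=-\infty}^{x=\infty} x\,dx\, (2^{n'} n'!)^{-1/2} (\alpha/\pi)^{1/4} \exp(-\alpha x^2/2)\, H_{n'}(\alpha^{1/2}x)\, (2^n n!)^{-1/2} (\alpha/\pi)^{1/4} \exp(-\alpha x^2/2)\, H_n(\alpha^{1/2}x) \]
\[ = (2^{n'} n'!)^{-1/2} (2^n n!)^{-1/2} (\alpha/\pi)^{1/2} \int_{x=-\infty}^{x=\infty} x\,dx\, \exp(-\alpha x^2)\, H_{n'}(\alpha^{1/2}x)\, H_n(\alpha^{1/2}x) \]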
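and use the recursion formula Eq. (3.4.10) to get rid of the integrand factor x:
\[ \langle n'|x|n\rangle = n'\,(2^{n'} n'!)^{-1/2} (2^n n!)^{-1/2} \pi^{-1/2} \int_{x=-\infty}^{x=\infty} dx\, \exp(-\alpha x^2)\, H_{n'-1}(\alpha^{1/2}x)\, H_n(\alpha^{1/2}x) + (1/2)\,(2^{n'} n'!)^{-1/2} (2^n n!)^{-1/2} \pi^{-1/2} \int_{x=-\infty}^{x=\infty} dx\, \exp(-\alpha x^2)\, H_{n'+1}(\alpha^{1/2}x)\, H_n(\alpha^{1/2}x) \]
\[ = (n'/2\alpha)^{1/2}\, \delta_{n'-1,\,n} + [(n'+1)/2\alpha]^{1/2}\, \delta_{n'+1,\,n} \]
Therefore the only allowed transitions are $\Delta n = \pm 1$.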
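The selection rule Δn = ±1 can also be seen by brute-force quadrature. The sketch below assumes the standard normalized eigenfunctions, (2^n n!)^(-1/2) (α/π)^(1/4) exp(-αx²/2) H_n(α^(1/2) x); the function names, the value of α, and the grid parameters are illustrative choices made here, not part of the text.

```python
import math


def hermite(n, xi):
    """Physicists' Hermite polynomial H_n(xi) by upward recurrence."""
    h_prev, h = 0.0, 1.0
    for k in range(n):
        # H_{k+1} = 2 xi H_k - 2 k H_{k-1}
        h_prev, h = h, 2.0 * xi * h - 2.0 * k * h_prev
    return h


def psi(n, x, alpha):
    """Normalized harmonic-oscillator eigenfunction (standard form, assumed)."""
    norm = (2.0**n * math.factorial(n)) ** -0.5 * (alpha / math.pi) ** 0.25
    return norm * math.exp(-0.5 * alpha * x * x) * hermite(n, math.sqrt(alpha) * x)


def x_element(n_prime, n, alpha=1.0, half_width=12.0, n_steps=4000):
    """Trapezoid estimate of <n'|x|n>; the grid parameters are illustrative."""
    h = 2.0 * half_width / n_steps
    total = 0.0
    for k in range(n_steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, n_steps) else 1.0
        total += weight * psi(n_prime, x, alpha) * x * psi(n, x, alpha)
    return total * h


alpha = 2.0
for n_prime in range(5):
    print(n_prime, x_element(n_prime, 2, alpha))
```

For n = 2 only n' = 1 and n' = 3 give nonzero values, and these agree with (n/2α)^(1/2) and ((n + 1)/2α)^(1/2), respectively.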
3.46 LIFETIMES FROM RESONANCE LINESHAPES [30]
In the nonrelativistic Schrödinger equation, time is a parameter, not a coordinate. Therefore the typical uncertainty relation relating the lifetime $\Delta t$ and the half-width $\Delta E$ of the transition:
\[ \Delta E\, \Delta t \ge \hbar/2 \tag{(3.1.5)} \]
must be interpreted a bit differently [30] than in the case of other canonically conjugate variables, such as position and momentum, or angular momentum and phase. In particular, the time-dependent Schrödinger equation must be considered:
\[ \Psi(x,t) = \psi(x) \exp(-iEt/\hbar) \exp(-t/\tau) = \psi(x) \int_{E'=0}^{E'=E} G(E') \exp(-iE't/\hbar)\, dE' \tag{3.46.1} \]
where G(E') is the Fourier transform:
\[ G(E') = (\hbar/2\tau)^2 / [(E - E')^2 + (\hbar/2\tau)^2] \int_{t=0}^{t=(h/2\pi)/2\tau} \exp[i(E - E')t/\hbar]\, dt \tag{3.46.2} \]
so that the width at half-height $\Delta E$ is indeed given by Eq. (3.1.5).
REFERENCES
1. E. Schrödinger, Quantisierung als Eigenwertproblem. I, Ann. Physik 79:361–376 (1926); Quantisierung als Eigenwertproblem. II, Ann. Physik 79:489–527 (1926); Über das Verhältnis der Heisenberg-Born-Jordanschen Quantenmechanik zu der meinen, Ann. Physik 734–756 (1926); Quantisierung als Eigenwertproblem. III, Ann. Physik 80:437–490 (1926); E. Schrödinger, Quantisierung als Eigenwertproblem. IV, Ann. Physik 81:109–139 (1926); Naturwissen. 14:664 (1926); An undulatory theory of the mechanics of atoms and molecules, Phys. Rev. 28:1049–1070 (1926).
2. P. A. M. Dirac, A Quantum theory of the electron, Proc. R. Soc. London A117:610 (1928); P. A. M. Dirac, A theory of electrons and protons, Proc. R. Soc. London A126:360 (1928).
3. N. Bohr, On the constitution of atoms and molecules. Part I, Philos. Mag. 26:1–24 (1913).
4. L. de Broglie, Ondes et quanta, Compt. Rend. Acad. Sci. Paris 177:507–510 (1923).
5. W. Heisenberg, Über den anschaulichen Inhalt der quantentheoretischen Kinematik und Mechanik, Z. Physik 43:172–198 (1927).
6. I. N. Levine, Quantum Chemistry, 6th edition, Pearson Prentice-Hall, Upper Saddle River, NJ, 2009.
7. H. Eyring, J. Walter, and G. E. Kimball, Quantum Chemistry, Wiley, New York, 1944.
8. H. Kuhn, A quantum mechanical theory of light absorption of organic dyes and similar compounds, J. Chem. Phys. 17:1198–1212 (1949).
9. L. Pauling and E. B. Wilson, Jr., Introduction to Quantum Mechanics, McGraw-Hill, New York, 1935.
10. E. U. Condon and G. H. Shortley, The Theory of Atomic Spectra, Cambridge University Press, Cambridge, UK, 1963.
11. S. Zhang and J. Jin, Computation of Special Functions, Wiley, New York, 1996.
12. A. Messiah, Quantum Mechanics, Volume 1, North-Holland, Amsterdam, 1961.
13. H. A. Bethe and E. E. Salpeter, Quantum Mechanics of One- and Two-Electron Atoms, Springer, Berlin, 1957.
14. R. B. Leighton, Principles of Modern Physics, McGraw-Hill, New York, 1959.
15. C. C. J. Roothaan, New Developments in molecular orbital theory, Rev. Mod. Phys. 43:69–89 (1951).
16. G. G. Hall, The molecular orbital theory of chemical valency. VIII. A method of calculating ionization potentials, Proc. R. Soc. London A205:541–552 (1951).
17. T. C. Koopmans, Über die Zuordnung von Wellenfunktionen und Eigenwerten zu den Einzelnen Elektronen Eines Atoms, Physica 1:104–113 (1933).
18. W. J. Hehre, L. Radom, P. v. R. Schleyer, and J. A. Pople, Ab Initio Molecular Orbital Theory, Wiley, New York, 1986.
19. J. W. Rayleigh, In finding the correction for the open end of an organ-pipe, Philos. Trans. 161:77 (1870).
20. W. Ritz, Über eine neue Methode zur Lösung gewisser Variationsprobleme der mathematischen Physik, J. Reine Angew. Math. 135:1–61 (1909).
21. E. Hückel, Quantentheoretische Beiträge zum Benzolproblem. I. Die Elektronenkonfiguration des Benzols und verwandter Verbindungen, Z. f Phys. 70:204–286 (1931).
22. G. Klopman and R. C. Evans, in Gerald A. Segal, ed., Semiempirical Methods of Electronic Structure Calculation. Part A: Techniques, Plenum, New York, 1977.
23. J. A. Pople and D. L. Beveridge, Approximate Molecular Orbital Theory, Wiley, New York, 1969.
24. S. P. McGlynn, L. Van Quickenborne, and M. Kinoshita, Introduction to Applied Quantum Chemistry, Holt, Rinehart, and Winston, New York, 1971.
25. M. J. S. Dewar, The Molecular Orbital Theory of Organic Chemistry, McGraw-Hill, New York, 1969.
26. G. E. Pake, Paramagnetic Resonance, W. A. Benjamin, New York, 1962.
27. G. Kortum and M. T. Seiler, Angew. Chem. 32:687 (1939).
28. R. Eisberg and R. Resnick, Quantum Physics of Atoms, Molecules, Solids, Nuclei, and Particles, 2nd edition, Wiley, New York, 1985.
29. H. N. Russell and F. A. Saunders, New regularities in the spectrum of the alkaline earths, Astrophys. J. 61:38–69 (1925).
30. P. W. Atkins, Molecular Quantum Mechanics, 2nd edition, Oxford University Press, Oxford, UK, 1983.
31. M. Göppert-Mayer, Über Elementarakte mit zwei Quantensprüngen, Ann. Phys. 9:273–294 (1931).
32. N. Davidson, Statistical Mechanics, McGraw-Hill, New York, 1962.
33. C. Sandorfy, Electronic Spectra and Quantum Chemistry, Prentice-Hall, New York, 1964.
34. W. Heitler, The Quantum Theory of Radiation, 3rd edition, Oxford University Press, Oxford, UK, 1954.
35. M. Abramowitz and I. Stegun, Handbook of Mathematical Functions, National Bureau of Standards, Washington, DC, 1964.
36. A. Abragam, The Principles of Nuclear Magnetism, Clarendon Press, Oxford, UK, 1961.
CHAPTER 4
Thermodynamics

“There is no free lunch.” Milton Friedman (1912–2006)

4.1 REVIEW OF THERMODYNAMICS
When we deal with macroscopic ensembles of particles, the laws of thermodynamics must be discussed; their definitions and uses are reviewed below.
4.2 ZEROTH LAW OF THERMODYNAMICS (TRANSITIVITY)
Let macroscopic bodies A, B, and C be at temperatures $T_A$, $T_B$, and $T_C$, respectively. If body A is in intimate contact and thermal equilibrium with body B so that $T_A = T_B$, and if body B is in intimate thermal contact with body C so that $T_B = T_C$, then $T_A = T_C$. This law implicitly introduces intuitive notions of temperature, thermal equilibrium, and heat flow and emphasizes the transitivity of temperature equalization.
The Physical Chemist’s Toolbox, Robert M. Metzger. © 2012 John Wiley & Sons, Inc. Published 2012 by John Wiley & Sons, Inc.

4.3 FIRST LAW OF THERMODYNAMICS (CONSERVATION OF ENERGY – “YOU CAN’T WIN”)
In a cyclic process, a macroscopic system cannot convert all its internal energy U into useful work W: Some of this energy is dissipated as heat q, as shown in the equation
\[ \Delta U = U_1 - U_2 = q - \Delta W \tag{4.3.1} \]
and for the particular case that the system is a perfect gas, which obeys the state function PV = nRT, where P is the pressure, V is the volume, n is the number of moles, R is the gas constant, and T is the absolute temperature, the differential becomes
\[ dU = d_{irr}q - P\,dV \tag{4.3.2} \]
In this case, the increments in U and W can be written as the perfect differentials dU and dW (which can be made independent of path, and integrated), but $d_{irr}q$ cannot be integrated uniquely, since it is very much path-dependent: this causes major difficulties. The principle of conservation of energy can be attributed to Mayer¹ in 1842, although the first inklings of this principle were expressed by Rumford² in 1798. Note that modern usage and Eq. (4.3.1) use $-\Delta W$ as the work gained by the system. Some older textbooks had put $+\Delta W$ as the work gained by the surroundings; such “double-think” was confusing.
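The path dependence of heat can be made concrete with a perfect gas taken between the same two states along two different reversible routes. The sketch below is an illustration only: the function names, the amount of gas, and the chosen states are assumptions made here, and the sign convention is ΔU = q - w with w the work done by the gas.

```python
import math

R_GAS = 8.314462618  # gas constant, J/(mol K)


def isothermal_path(n_mol, temperature, v1, v2):
    """Reversible isothermal expansion: returns (dU, q, w_by_gas) in joules."""
    w_by_gas = n_mol * R_GAS * temperature * math.log(v2 / v1)
    # U of a perfect gas depends on T only, so dU = 0 and q = w.
    return 0.0, w_by_gas, w_by_gas


def two_step_path(n_mol, temperature, v1, v2):
    """Cool at constant V1 until P = P2, then expand at constant P2 back to T."""
    p2 = n_mol * R_GAS * temperature / v2
    w_by_gas = p2 * (v2 - v1)  # only the isobaric step does P dV work
    # Same end states as the isothermal path, hence the same dU = 0.
    return 0.0, w_by_gas, w_by_gas


# One mole at 300 K doubling its volume from 10 L to 20 L (illustrative values).
du_a, q_a, w_a = isothermal_path(1.0, 300.0, 0.010, 0.020)
du_b, q_b, w_b = two_step_path(1.0, 300.0, 0.010, 0.020)
print("isothermal: dU =", du_a, " q =", q_a)
print("two-step:   dU =", du_b, " q =", q_b)
```

Both routes give the same ΔU, as a state function must, while the heat absorbed differs between them.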
4.4 SECOND LAW OF THERMODYNAMICS (“YOU CANNOT EVEN BREAK EVEN”)
In a cyclic reversible (infinitely slow) process, connecting an infinite number of intermediate steps that are assumed to be in mutual equilibrium, the amount of heat q is minimized, and the quantity
\[ dS = \frac{d_{rev}q}{T} \tag{4.4.1} \]
where T is the absolute temperature, becomes a perfect differential, whose integral is path-independent (but the path must be reversible); the factor 1/T is called an “integrating factor.” The function S is the entropy. The concept of a reversible path seems to be fiction, but it is very real for phase transitions (e.g., solid-to-liquid, liquid-to-gas) involving a large number of particles, for example, Avogadro’s³ number ($N_A$) of particles; this large system of particles achieves reversibility in the large number of interactions $[N_A(N_A - 1)/2]$, which keeps the two phases in mutual coexistence. Now, we can write dU as a perfect differential in terms of natural variables (and state functions) S and V:
\[ dU(S,V) = T\,dS - P\,dV \tag{4.4.2} \]
The first statement of the second law of thermodynamics was by Clausius,⁴ although early ideas came from Carnot⁵ in 1824.
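The role of 1/T as an integrating factor can be checked numerically with the cross-derivative test of Eq. (4.7.2). The sketch below assumes one mole of a perfect gas with a constant heat capacity of (3/2)R, so that the reversible heat is C_V dT + (RT/V) dV; the helper name, the state point, and the step size are illustrative choices made here.

```python
R_GAS = 8.314462618     # gas constant, J/(mol K)
CV_MOLAR = 1.5 * R_GAS  # assumed constant C_V of a monatomic perfect gas


def cross_derivative_gap(m_func, n_func, x, y, step=1e-4):
    """For the form M dx + N dy, return dM/dy - dN/dx by central differences."""
    dm_dy = (m_func(x, y + step) - m_func(x, y - step)) / (2.0 * step)
    dn_dx = (n_func(x + step, y) - n_func(x - step, y)) / (2.0 * step)
    return dm_dy - dn_dx


# Heat: dq = C_V dT + (R T / V) dV, with x = T and y = V.
gap_heat = cross_derivative_gap(
    lambda t, v: CV_MOLAR,
    lambda t, v: R_GAS * t / v,
    300.0, 0.025,
)

# Entropy: dS = dq / T = (C_V / T) dT + (R / V) dV.
gap_entropy = cross_derivative_gap(
    lambda t, v: CV_MOLAR / t,
    lambda t, v: R_GAS / v,
    300.0, 0.025,
)

print("gap for dq:", gap_heat)     # nonzero, so dq is not a perfect differential
print("gap for dS:", gap_entropy)  # zero, so dS is a perfect differential
```

The gap for the heat equals -R/V, while dividing by T removes it entirely.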
¹ Julius Robert von Mayer (1814–1878).
² Benjamin Thompson, Count Rumford (1753–1814).
³ Lorenzo Romano Amedeo Carlo Bernadette Avogadro, Conte di Quaregna e Cerreto (1776–1856).
⁴ Rudolf Julius Emanuel Gottlieb Clausius (1822–1888).
⁵ Nicolas Leonard Sadi Carnot (1796–1832).
4.5 THIRD LAW OF THERMODYNAMICS
Boltzmann⁶ proposed that at the temperature T = 0, all thermal motion stops (except for zero-point vibration), and the entropy function S can be evaluated by a statistical function W, called the thermodynamic probability W (or, as we will learn in Section 5.2, the partition function $\Omega$ for a microcanonical ensemble):
\[ S \equiv k_B \ln W \tag{4.5.1} \]
where $k_B$ is Boltzmann’s constant. Remember that $k_B \equiv R/N_A$, where $N_A$ is Avogadro’s number. The Nernst⁷–Simon⁸ formulation of this law states that for any isothermal process involving pure substances at equilibrium, the entropy change $\Delta S$ goes to zero as the absolute temperature T goes to zero:
\[ \lim_{T \to 0} \Delta S = 0 \tag{4.5.2} \]
Caratheodory’s⁹ principle derives the three laws of thermodynamics using differential geometry, from certain limits on the possible paths between adjacent differential surfaces.
4.6 USEFUL AUXILIARY FUNCTIONS: ENTHALPY, HELMHOLTZ FREE ENERGY, AND GIBBS FREE ENERGY
Additional state functions that are very useful in practice are: enthalpy H, Helmholtz¹⁰ free energy A, and Gibbs¹¹ free energy G. The term “free” is used to denote what is available as net “usable” energy, with due allowances for entropy. These auxiliary functions are defined as follows:
\[ H \equiv U + PV \tag{4.6.1} \]
\[ A \equiv U - TS \tag{4.6.2} \]
\[ G \equiv H - TS \tag{4.6.3} \]
so that the perfect differentials (for reversible processes) are
\[ dU(S,V) = T\,dS - P\,dV \tag{(4.4.2)} \]
\[ dH(S,P) = T\,dS + V\,dP \tag{4.6.4} \]
\[ dA(T,V) = -S\,dT - P\,dV \tag{4.6.5} \]
\[ dG(T,P) = -S\,dT + V\,dP \tag{4.6.6} \]
PROBLEM 4.6.1. Prove Eqs. (4.6.4) to (4.6.6) by using Eqs. (4.4.2) and (4.6.1) to (4.6.3).

⁶ Ludwig Eduard Boltzmann (1844–1906).
⁷ Walther Hermann Nernst (1864–1941).
⁸ Franz Simon (1893–1956).
⁹ Constantin Caratheodory (1873–1950).
¹⁰ Hermann von Helmholtz (1821–1894).
¹¹ Josiah Willard Gibbs, Jr. (1839–1903).
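4.7 PERFECT DIFFERENTIALS (TWO-FORMS)
A differential dz (or “two-form”) of a continuous function z = z(x, y) of two independent variables x and y is defined as:
\[ dz = \left(\frac{\partial z}{\partial x}\right)_y dx + \left(\frac{\partial z}{\partial y}\right)_x dy; \tag{4.7.1} \]
dz is a perfect differential if it satisfies the Euler¹² relations, that is, the equalities of the second “cross” derivatives:
\[ \frac{\partial^2 z}{\partial y\,\partial x} = \left[\frac{\partial}{\partial y}\left(\frac{\partial z}{\partial x}\right)_y\right]_x = \left[\frac{\partial}{\partial x}\left(\frac{\partial z}{\partial y}\right)_x\right]_y = \frac{\partial^2 z}{\partial x\,\partial y} \tag{4.7.2} \]
Two other useful relationships are the “inverter”:
\[ \left(\frac{\partial z}{\partial y}\right)_x = \frac{1}{\left(\partial y/\partial z\right)_x} \tag{4.7.3} \]
and the cyclic expression:
\[ \left(\frac{\partial z}{\partial x}\right)_y \left(\frac{\partial x}{\partial y}\right)_z \left(\frac{\partial y}{\partial z}\right)_x = -1 \tag{4.7.4} \]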
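Therefore Eq. (4.7.1), applied to Eqs. (4.4.2) and (4.6.4)–(4.6.6), will yield
\[ \left(\frac{\partial U}{\partial S}\right)_V = T \tag{4.7.5} \]
\[ \left(\frac{\partial U}{\partial V}\right)_S = -P \tag{4.7.6} \]
\[ \left(\frac{\partial H}{\partial S}\right)_P = T \tag{4.7.7} \]
\[ \left(\frac{\partial H}{\partial P}\right)_S = V \tag{4.7.8} \]
\[ \left(\frac{\partial A}{\partial T}\right)_V = -S \tag{4.7.9} \]
\[ \left(\frac{\partial A}{\partial V}\right)_T = -P \tag{4.7.10} \]
\[ \left(\frac{\partial G}{\partial T}\right)_P = -S \tag{4.7.11} \]
\[ \left(\frac{\partial G}{\partial P}\right)_T = V \tag{4.7.12} \]

¹² Leonhard Euler (1707–1783).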
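Finally the Euler relations, Eq. (4.7.2), applied to Eqs. (4.4.2) and (4.6.4)–(4.6.6), will yield automatically
\[ \left(\frac{\partial T}{\partial V}\right)_S = -\left(\frac{\partial P}{\partial S}\right)_V \tag{4.7.13} \]
\[ \left(\frac{\partial T}{\partial P}\right)_S = \left(\frac{\partial V}{\partial S}\right)_P \tag{4.7.14} \]
\[ \left(\frac{\partial S}{\partial V}\right)_T = \left(\frac{\partial P}{\partial T}\right)_V \tag{4.7.15} \]
\[ \left(\frac{\partial S}{\partial P}\right)_T = -\left(\frac{\partial V}{\partial T}\right)_P \tag{4.7.16} \]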
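As a numerical illustration of Eq. (4.7.15), the sketch below compares the two partial derivatives for one mole of a van der Waals gas. Everything specific here is an assumption made for illustration: the van der Waals constants (roughly those of CO2), a constant molar C_V, the resulting entropy expression (valid only up to an additive constant), the state point, and the step sizes.

```python
import math

R_GAS = 8.314462618     # gas constant, J/(mol K)
A_VDW = 0.364           # Pa m^6 mol^-2 (illustrative, roughly CO2)
B_VDW = 4.27e-5         # m^3 mol^-1 (illustrative, roughly CO2)
CV_MOLAR = 3.5 * R_GAS  # assumed constant molar heat capacity


def pressure(temperature, volume):
    """van der Waals pressure for one mole, in Pa."""
    return R_GAS * temperature / (volume - B_VDW) - A_VDW / volume**2


def entropy(temperature, volume):
    """Molar entropy up to an additive constant, assuming constant C_V."""
    return CV_MOLAR * math.log(temperature) + R_GAS * math.log(volume - B_VDW)


def maxwell_pair(temperature, volume, dt=1e-3, dv=1e-7):
    """Central-difference estimates of (dS/dV)_T and (dP/dT)_V."""
    ds_dv = (entropy(temperature, volume + dv) - entropy(temperature, volume - dv)) / (2.0 * dv)
    dp_dt = (pressure(temperature + dt, volume) - pressure(temperature - dt, volume)) / (2.0 * dt)
    return ds_dv, dp_dt


ds_dv, dp_dt = maxwell_pair(300.0, 1.0e-3)
print("(dS/dV)_T =", ds_dv)
print("(dP/dT)_V =", dp_dt)
```

The two numbers agree to within the finite-difference error, and both equal R/(V - b) for this model.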
4.8 USEFUL MEASURABLES: THERMAL EXPANSIVITY, HEAT CAPACITY, JOULE–THOMSON COEFFICIENT
We should remember that certain partial derivatives are quantities that can be measured conveniently: the isobaric thermal expansivity (also known as the coefficient of thermal expansion):
\[ \alpha \equiv \frac{1}{V}\left(\frac{\partial V}{\partial T}\right)_P \tag{4.8.1} \]
the isothermal compressibility:
\[ \kappa_T \equiv -\frac{1}{V}\left(\frac{\partial V}{\partial P}\right)_T \tag{4.8.2} \]
the heat capacity at constant volume:
\[ C_V \equiv \left(\frac{\partial U}{\partial T}\right)_V \tag{4.8.3} \]
the heat capacity at constant pressure:
\[ C_P \equiv \left(\frac{\partial H}{\partial T}\right)_P \tag{4.8.4} \]
and the Joule¹³–Thomson¹⁴ coefficient (important for refrigerators and air conditioners):
\[ \mu \equiv \left(\frac{\partial T}{\partial P}\right)_H \tag{4.8.5} \]
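For a perfect gas the first two measurables have closed forms, α = 1/T and κ_T = 1/P, which makes a convenient check of the definitions. The sketch below evaluates Eqs. (4.8.1) and (4.8.2) by central differences; the function names, the state point, and the step sizes are illustrative choices made here.

```python
R_GAS = 8.314462618  # gas constant, J/(mol K)


def volume(temperature, pressure, n_mol=1.0):
    """Perfect-gas volume in m^3 from PV = nRT."""
    return n_mol * R_GAS * temperature / pressure


def expansivity(temperature, pressure, step=1e-3):
    """alpha = (1/V)(dV/dT)_P by a central difference in T."""
    dv_dt = (volume(temperature + step, pressure)
             - volume(temperature - step, pressure)) / (2.0 * step)
    return dv_dt / volume(temperature, pressure)


def compressibility(temperature, pressure, step=1.0):
    """kappa_T = -(1/V)(dV/dP)_T by a central difference in P."""
    dv_dp = (volume(temperature, pressure + step)
             - volume(temperature, pressure - step)) / (2.0 * step)
    return -dv_dp / volume(temperature, pressure)


T, P = 298.15, 101325.0  # room temperature and one atmosphere
alpha = expansivity(T, P)
kappa = compressibility(T, P)
print("alpha * T =", alpha * T)  # close to 1 for a perfect gas
print("kappa * P =", kappa * P)  # close to 1 for a perfect gas
```

Replacing `volume` with a tabulated or fitted equation of state would give the same two measurables for a real substance.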
For systems of many components, where the number of particles in any of the components can change, for example, by a chemical reaction, one must generalize the differential relations Eqs. (4.4.2) and (4.6.4)–(4.6.6) by defining the chemical potential $\mu_i$ as follows:
\[ \mu_i \equiv \left(\frac{\partial U}{\partial N_i}\right)_{S,V,\text{all } N_j \ne N_i} = \left(\frac{\partial H}{\partial N_i}\right)_{S,P,\text{all } N_j \ne N_i} = \left(\frac{\partial A}{\partial N_i}\right)_{T,V,\text{all } N_j \ne N_i} = \left(\frac{\partial G}{\partial N_i}\right)_{T,P,\text{all } N_j \ne N_i} \tag{4.8.6} \]
\[ dU = T\,dS - P\,dV + \sum_{i=1}^{i=c} \mu_i\,dN_i \tag{4.8.7} \]
\[ dH = T\,dS + V\,dP + \sum_{i=1}^{i=c} \mu_i\,dN_i \tag{4.8.8} \]
\[ dA = -S\,dT - P\,dV + \sum_{i=1}^{i=c} \mu_i\,dN_i \tag{4.8.9} \]
\[ dG = -S\,dT + V\,dP + \sum_{i=1}^{i=c} \mu_i\,dN_i \tag{4.8.10} \]
(where c = number of components), from which the Gibbs–Duhem¹⁵ relations follow:
\[ \sum_{i=1}^{i=c} N_i\,d\mu_i = 0 \tag{4.8.11} \]

FIGURE 4.1 The Maxwell “box,” a mnemonic diagram for thermodynamic variables. (Labels in the diagram: V, A, T, U, G, S, H, P.)

PROBLEM 4.8.1. Show that you can derive the most important relationships by learning the way to interpret the diagram in Fig. 4.1 (called the Maxwell¹⁶ box).

¹³ James Prescott Joule (1818–1889).
¹⁴ William Thomson, first baron Kelvin (1824–1907).
4.9 GIBBS PHASE RULE
Chemical equilibria between p phases will be established when the chemical potentials of all the phases are equal:
\[ \mu_1 = \mu_2 = \cdots = \mu_p \tag{4.9.1} \]
From this the Gibbs phase rule can be constructed:
\[ f = c - p + 2 \tag{4.9.2} \]
where c is the number of components and f is the number of degrees of freedom, or number of independent variables. With the advent of nanotechnology, an interesting issue is: When do you really have a homogeneous phase (versus a mixture of two interpenetrating phases that look macroscopically homogeneous)? Here are a few known stable phases:
(a) solid: (a1) crystalline; (a2) amorphous; (a3) glass; (a4) plastic crystal; (a5) superconductor; (a6) ferromagnet; (a7) antiferromagnet; (a8) electret
(b) liquid: (b1) normal; (b2) liquid crystal: (b2.1) smectic, (b2.2) nematic, (b2.3) cholesteric; (b3) ionic liquid; (b4) superfluid ($_2\mathrm{He}^4$ below the lambda point)
(c) gas or vapor
In addition, there are phases that are short-term stable but long-term metastable: supercritical CO2, supercooled water, and so on.
The Ehrenfest¹⁷ classification of phase transitions (first-order, second-order, and lambda point) assumes that at a first-order phase transition temperature there are finite changes $\Delta V \ne 0$, $\Delta H \ne 0$, $\Delta S \ne 0$, and $\Delta C_P \ne 0$, but $\mu_{i,\text{lower }T} = \mu_{i,\text{higher }T}$ and changes in slope of the chemical potential $\mu_i$ with respect to temperature (in other words, $(\partial\mu_i/\partial T)_{\text{lower }T} \ne (\partial\mu_i/\partial T)_{\text{higher }T}$). At a second-order phase transition $\Delta V = 0$, $\Delta H = 0$, $\Delta S = 0$, and $\Delta C_P = 0$, but there are discontinuous slopes in $(\partial V/\partial T)$, $(\partial H/\partial T)$, $(\partial S/\partial T)$, a saddle point in $\mu_i(T)$, and a discontinuity in $C_P$. A lambda point exhibits a delta-function discontinuity in $C_P$. A very important expression originates from dealing with partial pressures in mixtures of perfect gases, but is used “everywhere”:

¹⁵ Pierre Maurice Marie Duhem (1861–1916).
¹⁶ James Clerk Maxwell (1831–1879).
\[ \mu_i \equiv \mu_i^{\ominus} + RT \ln c_i \tag{4.9.3} \]
where $c_i$ is the molar concentration (mol/L) of species i, and $\mu_i^{\ominus}$ is the value of $\mu_i$ in a reference “standard” state, defined by some convention. Thermodynamicists love Eq. (4.9.3) so much that even when the equation should no longer work (e.g., when $c_i$ is large, or when molecules aggregate and no longer act independently), they invent activity coefficients $\gamma_i$ to “force it to work”:
\[ \mu_i = \mu_i^{\ominus} + RT \ln(\gamma_i c_i) \tag{4.9.4} \]
The price for this simplicity is extensive tables of concentration-dependent activity coefficients.
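A short numerical sketch of Eqs. (4.9.3) and (4.9.4) follows. The function name and the activity coefficient of 0.8 are hypothetical, chosen only for illustration, and a standard concentration of 1 mol/L is assumed here so that the argument of the logarithm is dimensionless.

```python
import math

R_GAS = 8.314462618  # gas constant, J/(mol K)


def chemical_potential_shift(conc, temperature=298.15, gamma=1.0, c_standard=1.0):
    """Return mu_i minus its standard-state value, in J/mol.

    conc and c_standard are in mol/L; gamma is the activity coefficient,
    with gamma = 1 recovering the ideal form of Eq. (4.9.3).
    """
    return R_GAS * temperature * math.log(gamma * conc / c_standard)


ideal = chemical_potential_shift(0.1)
nonideal = chemical_potential_shift(0.1, gamma=0.8)  # hypothetical gamma
print("ideal shift    (J/mol):", ideal)
print("nonideal shift (J/mol):", nonideal)
```

A tenfold dilution at room temperature lowers the ideal chemical potential by RT ln 10, about 5.7 kJ/mol, and an activity coefficient below unity lowers it further.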
4.10 CRYSTALLINE SOLID
A crystal, or crystalline solid, is defined by a fixed volume and shape and by long-range translational order: Atoms or molecules in the “primitive unit cell,” a parallelepiped with unit cell axes a, b, c, are repeated with almost perfect translational symmetry by displacement vectors of the type
\[ \mathbf{r}_{ijk} = i\mathbf{a} + j\mathbf{b} + k\mathbf{c} \tag{4.10.1} \]
(where i, j, k are integers) that repeat the contents of the primitive unit cell in three-dimensional “direct” space. This will be discussed again in Section 7.1. Note, however, that the perfection is limited by the existence of “domains”: A perfect crystallite may have dimensions of 1–3 μm in any of the three directions, and then the next crystallite will start with a small misorientation, typically of the order of one-tenth to one-fifth of a degree of arc. Thus a crystal will have the short-range order of the contents of the primitive unit cell repeating into its next nearest neighbors. The unit cell axes are simply a recognition of the ordering and symmetry of the crystal; the origin and the choice of axis directions are defined by convention. These solids are bound by covalent forces (e.g., diamond, silicon, graphite within their graphene sheets), ionic forces (salts like sodium chloride), or van der Waals forces (crystals of solid benzene, anthracene, neon; inter-sheet forces in graphite, molybdenum disulfide, etc.), or combinations of these forces. X-ray, neutron, and electron diffraction peaks exhibit the periodicities of the crystalline lattice. Most crystals expand by 1% of their volume from 0 °C to 1000 °C, and they compress by about 1% per kbar of applied pressure.

¹⁷ Paul Ehrenfest (1880–1933).
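The displacement vectors of Eq. (4.10.1) are simple to generate. In the sketch below the function name and the two sets of cell axes (an orthorhombic cell and an oblique one, in arbitrary length units) are hypothetical examples chosen here for illustration.

```python
def lattice_vector(i, j, k, a, b, c):
    """Return r_ijk = i*a + j*b + k*c for Cartesian axis vectors a, b, c."""
    return tuple(i * ax + j * bx + k * cx for ax, bx, cx in zip(a, b, c))


# Hypothetical orthorhombic cell (lengths in arbitrary units).
a = (4.0, 0.0, 0.0)
b = (0.0, 5.0, 0.0)
c = (0.0, 0.0, 6.0)
print(lattice_vector(1, 2, 3, a, b, c))

# Hypothetical oblique cell: c is tilted toward a, as in a monoclinic lattice.
c_tilted = (-1.0, 0.0, 6.0)
print(lattice_vector(1, 0, 2, a, b, c_tilted))

# All translations within one cell of the origin along each axis.
neighbors = [lattice_vector(i, j, k, a, b, c)
             for i in (-1, 0, 1) for j in (-1, 0, 1) for k in (-1, 0, 1)]
print(len(neighbors))
```

Because i, j, and k run over all integers, the same three axis vectors generate every copy of the primitive unit cell in the crystal.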
4.11 AMORPHOUS SOLID
An amorphous solid has some amount of short-range order but no long-range order. It has an approximately definite volume but no defined shape, as well as low diffusional mobility within the structure. Solid polymers are typically amorphous solids, with short-range order along the covalently bound chain, but no order between polymer strands. Thus, amorphous “carbon black,” polyethylene, DNA, and so on, form amorphous solids. The very few diffraction maxima will exhibit only the short-range order of these structures. Parenthetically, DNA is a fibrous sodium salt of double-stranded ribose-polyphosphate covalently bonded to precise sequences of the four nucleotides adenine (A), thymine (T), guanine (G), and cytosine (C), with internucleotide hydrogen bonds connecting the two strands, and which, in trios, form one bit of the genetic code.
4.12 GLASS
Glasses are also amorphous solids, which again have definite volume but no set shape. Glasses are sometimes described as supercooled liquids, although there is a gradual and smooth transition in density, called “glass transition,” from the glassy state to the supercooled liquid state. Glasses have great mechanical strength under compression, but negligible strength under expansion, and hence a high tendency to fracture or shatter. Examples are fused quartz glass (pure SiO2), sodium or soda-lime or soft glass (a mixture of SiO2, Na2O, CaO), borosilicate or Schott¹⁸ glass (a mixture of B2O3 and SiO2; trade names Duran®, Pyrex®, and Kimax®), and electrically “conducting glass” (ITO: glass covered by a very thin, rough, yellowish-to-gray surface layer of typically 90% In2O3 and 10% SnO2). The diffraction characteristics of glasses are the same as those for any amorphous solid.
4.13 PLASTIC CRYSTAL
Plastic crystals are almost crystalline solids, except that the molecular constituents in the primitive unit cell rotate freely in place; this confers on them a certain degree of plasticity. Examples are certain cage-shaped boranes (e.g., B10H14), carboranes (e.g., B10C2H12), and buckminsterfullerene (C60) at room temperature. These plastic crystals usually order into normal crystalline solids at low temperatures.
4.14 MAGNETIC SOLIDS Solids can also be subdivided by their magnetic properties. The preponderant fraction of solids (crystalline or amorphous) are diamagnetic. If the individual components (atoms, ions, or molecules) have a net magnetic dipole moment, these magnetic solids can be classified according to how these moments add, or cancel, or are enhanced by nearest-neighbor interactions. Paramagnetic solids have a net magnetization that is the sum of individual moments, except for their thermal orientational disorder. Superparamagnetic solids have a net magnetization that enhances the sum of individual moments, all of which can easily reoriented by a small external magnetic field. Ferromagnetic solids have a net magnetization that is larger than the sum of the individual moments, and they have considerable resistance to reorientation of these moments. Examples are a-Fe, Fe2O3, CrO2, Co, Ni, alloys of these with Pt, and rare earth borides, for example, Nd2Fe14B. Above a substance-specific critical temperature, called the Curie19 temperature (TC), the ferromagnetic domains lose their coherence, and the whole crystal turns into a paramagnet. The essential difference between superparamagnetic and ferromagnetic solids is that in the former the domains are too small for them to retain a net magnetization, while in the latter the domains easily retain the magnetization and resist (up to a limit) the “demagnetizing” fields. Antiferromagnetic solids have a net magnetization that tends to cancel, or orient in opposite direct ions, equal individual moments, and have considerable resistance to reorientation of these moments. An example is MnF2. Above a substance-specific critical temperature, called the Neel20
18 Otto Schott (1851–1935). 19 Pierre Curie (1859–1906). 20 Louis Eugène Félix Néel (1904–2000).
temperature (TN), the anti-ferromagnetic domains lose their coherence, and the whole crystal turns into a paramagnet. Ferrimagnetic solids have a net magnetization that tends to cancel individual moments, and they have considerable resistance to reorientation of these moments; however, the individual moments are usually of two kinds, one large, one small, so that the antiparallel ordering of moments yields some net overall magnetization.
4.15 ELECTRET Solids can also be subdivided by their electrical polarization properties. The preponderant fraction of solids (crystalline or amorphous) are dielectric: They have no net electrical polarization. If the individual components (molecules or clusters of ions) do have a net electric dipole moment, and these add nonlinearly, then one has electrets. There are also nanoferroelectrics.
4.16 LIQUID Liquids have a definite volume, but no definite shape. They are somewhat compressible, and typically expand with increasing temperature. They have at most short-range order, of the order of one to three atomic, ionic, or molecular volumes. Most liquids have a freezing temperature (loosely called “freezing point”) when a liquid becomes a solid, and they have a boiling temperature (loosely called “boiling point”) where the vapor pressure above the liquid matches the barometric pressure. Between the freezing temperature and the boiling temperature, most liquids have finite vapor pressures above the liquid that rise exponentially with temperature, with the following exception: “Ionic liquids” have a negligible vapor pressure, since long-range Coulomb forces greatly retard the vaporization of individual ions. A few compounds dispense with the liquid state altogether, and the solid phase turns directly into the vapor phase (sublimation temperature). In liquids, heat transfer is by convection and diffusion. In liquids or gases the molecules have high translational motion; this is regulated by Fick’s21 first and second laws of diffusion (1855):
$$ J = -D\,(\partial C/\partial x) \qquad (4.16.1) $$
$$ (\partial C/\partial t) = D\,(\partial^2 C/\partial x^2) \qquad (4.16.2) $$
where J is the flow of molecules across an imaginary plane in the container, C is the concentration, D is the diffusion coefficient, or diffusivity, and x is the coordinate normal to the imaginary plane. D is proportional to the square of the molecular velocity and, by the Stokes–Einstein relation, inversely proportional to the viscosity $\eta$ and to the particle size.
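As a rough numerical illustration of Eqs. (4.16.1) and (4.16.2), the sketch below integrates the one-dimensional diffusion equation with an explicit finite-difference scheme. The grid spacing, time step, diffusion coefficient, and the function names `diffuse_1d` and `flux` are illustrative assumptions, not values or methods taken from the text.

```python
def diffuse_1d(c, D, dx, dt, steps):
    """Explicit finite-difference integration of dC/dt = D d2C/dx2 (zero-flux ends)."""
    r = D * dt / dx**2
    if r > 0.5:
        raise ValueError("unstable: the explicit scheme needs D*dt/dx**2 <= 0.5")
    c = list(c)
    for _ in range(steps):
        # Mirror the end values so that no material crosses the boundaries.
        padded = [c[0]] + c + [c[-1]]
        c = [padded[i] + r * (padded[i + 1] - 2.0 * padded[i] + padded[i - 1])
             for i in range(1, len(padded) - 1)]
    return c


def flux(c, D, dx):
    """Fick's first law, J = -D dC/dx, between neighbouring grid points."""
    return [-D * (c[i + 1] - c[i]) / dx for i in range(len(c) - 1)]


# Assumed, illustrative inputs: a sharp concentration step that relaxes by diffusion.
profile = [1.0] * 10 + [0.0] * 10
later = diffuse_1d(profile, D=1.0e-9, dx=1.0e-4, dt=2.0, steps=500)
```

The flux points from high to low concentration (positive J where C decreases with x), and the total amount of material is conserved because the ends are closed.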
21 Adolf Eugen Fick (1829–1901).
Liquid volumes are limited by their surface tension $\Pi$, a force per unit length, or energy per unit area, that arises because the surface atoms or molecules do not have that complete set of nearest neighbors in all three dimensions that the atoms or molecules in the bulk have. Typical surface tensions are: C6H6, 28.88 mN/m; Hg, 472 mN/m; CH3OH, 22.6 mN/m; H2O, 72.75 mN/m. The resultant minimization of surface area makes rain droplets (almost) spherical (not teardrop-shaped!). The Laplace22 equation explains this:
$$ P_{int} - P_{ext} = 2\Pi/r \qquad (4.16.3) $$
where $P_{int}$ is the pressure (N m$^{-2}$) in the concave interior of a curved surface of radius r, and $P_{ext}$ is the pressure on the outer convex surface.
PROBLEM 4.16.1. Prove Eq. (4.16.3).
Capillary action occurs when a liquid of density $\rho$ and surface tension $\Pi$ in a thin tube of inner radius r “wets” the inner surface of the tube, and it rises at the inner periphery of the tube, by a height h in its center, against the force of gravity g. The Laplace equation yields $\rho g h = 2\Pi/r$, whence one can measure
$$ \Pi = \rho g h r/2 \qquad (4.16.4) $$
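The capillary relation of Eq. (4.16.4) can be checked with a few lines of arithmetic. The sketch below assumes complete wetting (as the text does), a water density of 998 kg m^-3, and a tube radius of 0.5 mm; these inputs and the function names are illustrative assumptions.

```python
G = 9.80665  # standard acceleration of gravity, m s^-2


def laplace_pressure(surface_tension, radius):
    """Pressure difference P_int - P_ext = 2*Pi/r across a curved surface (Pa)."""
    return 2.0 * surface_tension / radius


def capillary_rise(surface_tension, density, tube_radius, g=G):
    """Height h from rho*g*h = 2*Pi/r, assuming the liquid wets the tube completely."""
    return 2.0 * surface_tension / (density * g * tube_radius)


def surface_tension_from_rise(density, height, tube_radius, g=G):
    """Eq. (4.16.4): Pi = rho*g*h*r/2."""
    return density * g * height * tube_radius / 2.0


# Water at room temperature: Pi = 72.75 mN/m (value quoted in the text);
# density and tube radius are assumed for illustration.
h = capillary_rise(72.75e-3, 998.0, 0.5e-3)            # roughly 3 cm
recovered = surface_tension_from_rise(998.0, h, 0.5e-3)
```

In practice the measurement runs the other way: h and r are measured, and the surface tension follows from Eq. (4.16.4).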
The normal modes of waves on a flooded planet were determined by Rayleigh23; they are the spherical harmonics $Y_l^m(\theta, \varphi)$ discussed in Section 3.5. Tides in the earth’s oceans (with their period of about 12 h 25.2 min between successive high tides) are caused by a combination of the earth’s rotation and the gravitational pull of the moon; their intensity and timing are affected somewhat by details of the depth of the ocean floor close to the coastline. Local waves in the ocean are stationary, but close to the beach become travelling waves. In addition, harbor waves (“tsunami” in Japanese) are giant solitary waves, caused by subsurface earthquakes, which move at high longitudinal speed and sometimes devastatingly high amplitude (10 m or higher) across the ocean; they resemble solitons. Solitons are self-reinforcing solitary waves, for which dispersion and nonlinearity effects cancel each other; they were first described by Russell24 in 1834. These solitons can be represented by nonlinear equations such as the Korteweg25–de Vries26 equation $(\partial\varphi/\partial t) + (\partial^3\varphi/\partial x^3) - 6\varphi(\partial\varphi/\partial x) = 0$, a nonlinear Schrödinger27 equation $(1/2)(\partial^2\varphi/\partial x^2) + \kappa\varphi|\varphi|^2 - i(\partial\varphi/\partial t) = 0$, or the sine-Gordon equation $(\partial^2\varphi/\partial t^2) - (\partial^2\varphi/\partial x^2) + \sin\varphi = 0$. Osmosis. Imagine a solution A separated by a semipermeable membrane from a second solution B (or pure solvent B); the membrane is assumed to be permeable to the solvent but not to the solute: this is easily understood if the solute is a large macromolecule and the solvent is pure water, a small
22 Pierre Simon, Marquis de Laplace (1749–1827). 23 John William Strutt, third Baron Rayleigh (1842–1919). 24 John Scott Russell (1808–1882). 25 Diederick Korteweg (1848–1941). 26 Gustav de Vries (1866–1934). 27 Erwin Rudolf Josef Alexander Schrödinger (1887–1961).
molecule. The osmotic pressure $\Pi$ is the pressure that must be applied on the more concentrated solution side to prevent the solvent from moving into it across the membrane. For dilute solutions, $\Pi$ for $n_B$ moles of solute in volume V at absolute temperature T is given by the van’t Hoff28 equation:
$$ \Pi = (n_B/V)RT \qquad (4.16.5) $$
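A quick numerical feel for Eq. (4.16.5): the sketch below evaluates the van't Hoff osmotic pressure in atmospheres. The 0.1 mol/L concentration and 298.15 K are assumed example inputs, and the function name is hypothetical.

```python
R_L_ATM = 0.082057  # gas constant, L atm mol^-1 K^-1


def osmotic_pressure_atm(n_solute_mol, volume_L, temperature_K):
    """van't Hoff equation, Pi = (n_B/V) R T, valid for dilute solutions only."""
    return (n_solute_mol / volume_L) * R_L_ATM * temperature_K


# Assumed example: 0.1 mol of a non-dissociating solute in 1 L at 298.15 K.
pi_atm = osmotic_pressure_atm(0.1, 1.0, 298.15)   # about 2.4 atm
```

Even a modest concentration difference thus corresponds to a pressure of a few atmospheres, which is why osmotic imbalance can rupture a cell.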
Osmotic pressure is vital in biology, where the cell contents have a different concentration of solutes than the surrounding medium: If too much medium moves into a cell, it bursts and dies (“lysis”); conversely, if too much medium moves out of a cell, it shrinks and dies; these movements are called passive transport. Active transport involves proteins on the cell wall, which promote movements of nutrients and waste products despite the osmotic pressure. Superfluid. Liquid helium (more precisely the $^4$He isotope) has a “lambda point” transition temperature of 2.17 K, below which it becomes a superfluid (“Helium-II”). This superfluid, or “quantum liquid,” stays liquid down to 0 K, has zero viscosity, and has transport properties that are dominated by quantized vortices; thus $^4$He never freezes at 1 bar. Above 25.2 bar the superfluid state ceases, and $^4$He can then freeze at 1 K. The other natural helium isotope, $^3$He, boils at 3.19 K and becomes a superfluid only below 0.002491 K. Another unusual state is the supercritical fluid, attained by clusters of molecules (e.g., CO2) which become polar, that is, probably order so as to have small net overall dipole moments, even though the individual molecules have zero dipole moments. Therefore supercritical CO2 (above 31.1 °C and above 72.9 atm) can be used in chemical separations when more normal polar solvents fail.
4.17 LIQUID CRYSTALS Certain rod-like organic molecules can form a state in which partial ordering of these molecules, particularly in electric fields, can be achieved; these are liquid crystals (LC), with phases like nematic, smectic, cholesteric, and a few more (see Fig. 4.2) that are intermediate between crystals and isotropic liquids. These liquid crystals are either rod-like (calamitic) or disk-shaped (discotic). Their optical properties are anisotropic. The calamitic liquid crystal molecules have small dipole moments, so their orientation, and therefore their optical polarization, can be changed by an electric field applied to patterned semitransparent ITO electrodes; this change affects the transmittance of plane-polarized light between crossed polarizers on a given “optical pixel.” Color is achieved by grafting onto the pixels an organized set of submillimeter-sized colored lenses. LCs are used in flat-panel LC displays, which by 2007 had largely displaced cathode-ray tube (CRT) displays for television screens and computer monitors.
28 Jacobus Henricus van’t Hoff (1852–1911).
FIGURE 4.2 Calamitic liquid crystal phases. [Panels: cholesteric, nematic, smectic.]
4.18 ARRHENIUS ASSUMPTION As was discussed in Section 3.33, Arrhenius29 assumed that, at a macroscopic temperature T, if a system has two states, namely (1) a ground state G with energy $U_G$ and a macroscopic particle occupation number $N_G$ and (2) an upper or excited state E, with energy $U_E$ and occupation number $N_E$, then the ratio of particles in the two states is given by
$$ N_E/N_G = \exp[-(U_E - U_G)/k_B T] \qquad (4.18.1) $$
This relationship works, for $U_E > U_G$, as long as the excited state is less populated than the lower state ($N_E < N_G$). If $U_E > U_G$ and $N_E = N_G$, then formally, if Eq. (4.18.1) holds, $T = \infty$. Going further, if population inversion occurs ($N_E > N_G$), as occurs with lasers before stimulated light emission, or with nuclear spins upon saturation of the excited state, then T must be “negative.” This absurd notion, which flies in the face of conventional thermodynamics, arises when Eq. (4.18.1) is forced to hold even under conditions for which it was never designed.
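The three regimes just described can be made concrete by inverting Eq. (4.18.1) for T. The sketch below is only an illustration; the energy gap of 1e-20 J and the function names are assumptions.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J K^-1


def population_ratio(delta_U, T):
    """N_E/N_G = exp(-(U_E - U_G)/(k_B T)), Eq. (4.18.1)."""
    return math.exp(-delta_U / (K_B * T))


def formal_temperature(delta_U, ratio):
    """Temperature that Eq. (4.18.1) would assign to a given N_E/N_G (delta_U > 0)."""
    if ratio == 1.0:
        return math.inf          # equal populations: formally infinite T
    return -delta_U / (K_B * math.log(ratio))


delta_U = 1.0e-20                                         # assumed energy gap, J
ratio_300 = population_ratio(delta_U, 300.0)              # normal case, N_E < N_G
t_back = formal_temperature(delta_U, ratio_300)           # recovers 300 K
t_equal = formal_temperature(delta_U, 1.0)                # infinite
t_inverted = formal_temperature(delta_U, 2.0)             # negative
```

A ratio below 1 maps to an ordinary positive temperature, a ratio of exactly 1 to an infinite one, and an inverted population to a formally negative one.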
29 Svante August Arrhenius (1859–1927).
4.19 PERFECT GAS LAW, VAN DER WAALS AND VIRIAL EQUATIONS As mentioned earlier, the perfect gas (or ideal gas) obeys the equation
$$ PV = nRT = nN_A k_B T \qquad (4.19.1) $$
at all pressures P, volumes V, and absolute temperatures T. R is the gas constant:
$$ R = 0.082057\ \mathrm{L\ atm\ mol^{-1}\ K^{-1}} \qquad (4.19.2) $$
$$ R = 8.31431\ \mathrm{J\ mol^{-1}\ K^{-1}} \qquad (4.19.3) $$
The gas constant R, divided by Avogadro’s number $N_A$, is the Boltzmann constant $k_B$:
$$ k_B = R/N_A = 1.3806578 \times 10^{-23}\ \mathrm{J\ K^{-1}} \qquad (4.19.4) $$
The perfect gas law is the 1834 merger, performed by Clapeyron,30 of Boyle’s31 1662 law (PV = constant), the 1787 law of Charles32 and Gay-Lussac33 ($V \propto T$), and Avogadro’s 1811 principle. One definition of a perfect gas is that its “internal pressure” vanishes:
$$ (\partial U/\partial V)_T = 0 \qquad (4.19.5) $$
All gases will liquefy at low enough temperatures, at their normal boiling temperature $T_b$ (at 1 atm pressure); and most liquids will solidify at their normal melting temperature $T_m$ (at 1 atm pressure) (see Table 4.1). The van der Waals34 equation of state accounts fairly well for the possibility that a gas must liquefy:
$$ (P + an^2/V^2)(V - nb) = nRT \qquad (4.19.6) $$
where the constant b deals with the excluded volume (i.e., the volume occupied by a molecule when compressed to touch the next molecule), and the constant a deals with the corrections to the pressure due to intermolecular interactions. The van der Waals equation also explains the existence of the critical point (with critical temperature Tc, critical pressure Pc, and critical volume Vc), a point above which there is no distinction between liquid and vapor, but a single “fluid” phase with no surface tension.
30 Paul Emile Clapeyron (1799–1864). 31 Robert Boyle (1627–1691). 32 Jacques Charles (1746–1823). 33 Joseph Louis Gay-Lussac (1776–1850). 34 Johannes Diderick van der Waals (1837–1923).
FIGURE 4.3 PVT surface for a substance that contracts on freezing. [Axes: pressure, volume, temperature. Labeled regions: solid, liquid, gas, vapor, solid–liquid, liquid–vapor, solid–vapor; the triple line and the critical point are marked.]
FIGURE 4.4 PVT surface for a substance (e.g., water) that expands on freezing. [Same axes and labels as Figure 4.3.]
This critical point can be defined in a PV diagram as the inflection point where
$$ (\partial P/\partial V)_T = 0 \quad \text{at } P_c, T_c, V_c \qquad (4.19.7) $$
$$ (\partial^2 P/\partial V^2)_T = 0 \quad \text{at } P_c, T_c, V_c \qquad (4.19.8) $$
The critical point is a point where a collective approach of molecules into a phase with close intermolecular proximity occurs all over the sample, and
critical opalescence is observed, due to large density fluctuations, as first explained by Smoluchowski. It has been proposed that there is a universal behavior in the approach of the relevant physical parameters to any critical point in ferromagnetism, ferroelectricity, fluid-superfluid transitions (lambda point) in liquid He, gas–liquid phase transitions, and so on.
PROBLEM 4.19.1. Show [1] by applying Eqs. (4.19.7) and (4.19.8) to the van der Waals equation, Eq. (4.19.6), that
$$ T_c = 8a/27bR \qquad (4.19.9) $$
$$ V_c = 3nb \qquad (4.19.10) $$
$$ P_c = a/27b^2 \qquad (4.19.11) $$
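Short of working Problem 4.19.1 analytically, the result can be checked numerically: at the critical constants of Eqs. (4.19.9) to (4.19.11), the slope and curvature of the van der Waals isotherm should both vanish. The sketch below uses the CO2 coefficients of Table 4.1 converted to L^2 atm mol^-2 and L mol^-1; the finite-difference step and the function names are assumptions made for illustration.

```python
R = 0.082057  # gas constant, L atm mol^-1 K^-1


def vdw_pressure(V, T, a, b, n=1.0):
    """Van der Waals equation, Eq. (4.19.6), solved for P (atm; V in L)."""
    return n * R * T / (V - n * b) - a * n**2 / V**2


def vdw_critical(a, b, n=1.0):
    """Critical constants from Eqs. (4.19.9)-(4.19.11)."""
    Tc = 8.0 * a / (27.0 * b * R)
    Vc = 3.0 * n * b
    Pc = a / (27.0 * b**2)
    return Tc, Vc, Pc


# CO2: a = 359.2e4 cm^6 atm mol^-2 = 3.592 L^2 atm mol^-2, b = 42.67 cm^3 mol^-1
a, b = 3.592, 0.04267
Tc, Vc, Pc = vdw_critical(a, b)

# Central finite differences along the critical isotherm (step size is an assumption).
h = 1.0e-4 * Vc
p_plus = vdw_pressure(Vc + h, Tc, a, b)
p_mid = vdw_pressure(Vc, Tc, a, b)
p_minus = vdw_pressure(Vc - h, Tc, a, b)
dP_dV = (p_plus - p_minus) / (2.0 * h)
d2P_dV2 = (p_plus - 2.0 * p_mid + p_minus) / h**2
```

Both derivatives come out negligibly small compared with the natural scales Pc/Vc and Pc/Vc^2, and the pressure on the isotherm at Vc reproduces Pc. The van der Waals Tc obtained this way (about 304 K) is close to the tabulated experimental value for CO2.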
It is not known experimentally whether there is a similar critical point for solid–liquid phase transitions; the experimentally available temperatures and pressures are insufficient to resolve this issue. The triple point (which really should be called a triple line) is a triple temperature ($T_t$) and a triple pressure ($P_t$) at which the three phases gas, liquid, and solid coexist, but with different volumes; this triple line for several compounds is used to define reliable and reproducible standard temperatures for the International Temperature Scale. Other empirical gas laws exist (Berthelot,35 Dieterici,36 Beattie37–Bridgman,38 etc.), but the search for a simple, yet generally valid, gas law for all gases at all conditions of temperature, pressure, and volume has failed. Engineers must thus rely on tabular data (e.g., steam tables) rather than on a master equation. One intuitively useful gas equation is Kamerlingh Onnes’39 virial equation (a fancy term for a power series):
$$ PV/nRT = 1 + (n/V)B(T) + (n/V)^2 C(T) + (n/V)^3 D(T) + \cdots \qquad (4.19.23) $$
where the second, third, and fourth virial coefficients B(T), C(T), and D(T), respectively, depend on temperature; B(T) is also listed in Table 4.1. There is a reduced equation of state that is followed by many gases; the critical point data are used to define a reduced pressure, volume, and temperature:
$$ T_R \equiv T/T_c; \quad V_R \equiv V/V_c; \quad P_R \equiv P/P_c \qquad (4.19.24) $$
35 Marcellin Pierre-Eugène Berthelot (1827–1907). 36 Conrad Dieterici (1858–1929). 37 James A. Beattie (1895–after 1965). 38 Percy Williams Bridgman (1882–1961). 39 Heike Kamerlingh Onnes (1863–1926).
Table 4.1 Normal Melting Temperature Tm, Normal Boiling Temperature (at 1 atm) Tb, Critical Temperature Tc, Critical Pressure Pc, Critical Volume Vc, Triple Point Temperature (Tt), Triple-Point Pressure (Pt), Van Der Waals Coefficients a and b, Second Virial Coefficients B(T) at 273 K, and Joule–Thomson Inversion Temperature Ti (at 1 atm) for Several Elements and Compounds [1–3]
Compound He Ne Ar Kr H2 N2 O2 F2 Cl2 Br2 I2 HF HCl HBr HI H2O CO CO2 CH4 CF4 CCl4 CBr4 CI4 C2H6 C2H4 C2H2 C6H6 NH3 N2H2 Li Na K Mg Ga Al In Fe Co Ni Cu Ag Au Hg Mo Pt Pb W
Tm
Tb
Tc
Pc
(K)
(K)
(K)
(atm)
— 24.48 84.0 116.6 13.81 63.29 54.75 53.53 172.17 266.0 386.6 190.1 158.35 184.7 222.3 273.15 74 194.7 91 123 258 363.2 444 89.9 104 192.3 278.7 195.5 275.2 453.69 370.96 336.4 922.0 302.93 933.52 429.76 1808 1768 1728 1356.6 1235.1 1337.58 234.28 2610 2045 600.65 3695
4.22 27.3 87.45 120.85 20.35 77.35 90.19 85.01 238.55 331.93 457.50 292.69 188.2 206.2 — 373.15 81.7 194.7 109 144 349.7 463 413 184.6 169.5 189.2 ?? 353.3 239.80 386.65 1615 1156.1 1033.15 1380 2676 2740 2373 3023 3143 2643 2840 2485 3081 629.73 5560 4100 2013 5933
5.19 44.40 150.8 209.4 33.3 126.2 154.3 144.3 417.2 588 818 461 324.7 363.0 423.2 647.4 134.0 305.2 190.6 227.5 556.3
0.227 27.2 48.00 54.3 12.8 33.5 49.7 51.47 76.1 99.7 116 64 81.5 84.0 80.8 218.3 34.6 72.8 45.6 36.96 45
283.0 282.9 308.7 562.7 1405.6
48.2 50.9 61.6 48.6 112.2
3223 2508.7 2220
680 253.0 161.8
1735.0
1036.0
Vc (cm3 mol1)
Tt (K)
2.27 75.35 69.7 90.0 74.1
13.92 54.34
124
81.0
55.3 90.1 94.0 98.7
273.16
Pt
a b B(T) (104 cm6 (cm3 @237 K (atm) atm mol1) mol1) (cm3 mol1) 3.41 21.07 134.5 231.8 24.4 139 136
23.7 17.09 32.19 39.78 26.6 39.1 31.8
649.3
56.22
366.7 445.1
40.81 44.31
546.4 148.5 359.2 228.3
30.49 39.85 42.67 42.78
2039
138.3
260 72.5
548.9 447 439 1824 417
63.80 57.1 51.36 115.4 37.1
40.1
809
17.0
127.5
12.0 10.4 21.7 62.9 13.7 10.3 22.0
142 53.6
Ti (K) 51.0 231 723 1090 202 621 764
1500 968
These reduced coordinates remove from the picture the individual differences between the mutual interactions of different molecules. For instance, the van der Waals equation, Eq. (4.19.6), becomes, in reduced variables:
$$ (P_R + 3/V_R^2)(V_R - 1/3) = (8/3)T_R \qquad (4.19.25) $$
The departure from ideality (perfection) is chronicled, inter alia, by how much the compressibility factor Z, defined by
$$ Z \equiv PV/nRT \qquad (4.19.26) $$
departs from unity. At the critical point the van der Waals equation yields Z = 3/8. Many gases (nitrogen, methane, ethane, ethylene, propane, n-butane, iso-pentane, n-heptane, carbon dioxide, water) obey Eq. (4.19.25) quite well in the reduced pressure range $P_R$ = 0 to 8 [7]. The adiabatic index $\gamma$:
$$ \gamma \equiv C_P/C_V \qquad (4.19.27) $$
is involved in the speed of sound v in perfect gases:
$$ v = (\gamma p/\rho)^{1/2} \qquad (4.19.28) $$
where p is the pressure and $\rho$ is the density (v = 343 m s$^{-1}$ in dry air at 293 K); for perfect gases the formula reduces to
$$ v = (\gamma k_B T/m)^{1/2} \qquad (4.19.29) $$
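The quoted value of 343 m s^-1 for dry air at 293 K can be reproduced from Eq. (4.19.29). The sketch below assumes an adiabatic index of 1.40 and a mean molar mass of 28.964 g/mol for dry air; both inputs and the function name are assumptions, not values from the text.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant, J K^-1
N_A = 6.02214076e23    # Avogadro constant, mol^-1


def sound_speed_perfect_gas(gamma, T, molar_mass_kg):
    """v = (gamma k_B T / m)^(1/2), with m the mass of one molecule in kg."""
    m = molar_mass_kg / N_A
    return math.sqrt(gamma * K_B * T / m)


# Assumed inputs for dry air: gamma = 1.40, mean molar mass 0.028964 kg/mol.
v_air = sound_speed_perfect_gas(1.40, 293.0, 0.028964)   # about 343 m/s
```

Because only T and the adiabatic index enter, the result does not depend on the pressure of the gas.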
where m is the mass of a molecule expressed in kilograms. Thus for perfect gases v depends only on the absolute temperature and the adiabatic index. Sound is the transmission of a small disturbance in a gas at constant entropy. In real gases, liquids, and solids, the speed of sound v depends on the ratio of the coefficient of stiffness K (or bulk modulus, i.e., resistance of the medium to deformation) over the density of the medium $\rho$:
$$ v = (K/\rho)^{1/2} \qquad (4.19.30) $$
The Mach40 number Ma is defined as the ratio of the speed $v_{obj}$ of an object in a medium to the speed of sound v in the same medium:
$$ Ma \equiv v_{obj}/v \qquad (4.19.31) $$
As the object surpasses the Ma = 1.0 boundary, a “sonic boom” is generated. For comparison, the speed of a satellite in low Earth orbit is Ma ≈ 25.2, and the speed of light in vacuum is Ma ≈ 880,000.
40 Ernst Mach (1838–1916).
The viscosity in gases is given by
$$ \eta = m\langle v\rangle\, 8^{-1/2}\, \pi^{-1}\, d^{-2} \qquad (4.19.32) $$
where $\langle v\rangle$ is the mean speed of the molecules of mass m and diameter d.
PROBLEM 4.19.2. [1] Show that
$$ T = (8a/27bR)\,T_R \qquad (4.19.33) $$
$$ V = 3nb\,V_R \qquad (4.19.34) $$
$$ P = (a/27b^2)\,P_R \qquad (4.19.35) $$
PROBLEM 4.19.3. Start from Eq. (4.19.6) and, using Eqs. (4.19.33) to (4.19.35), prove Eq. (4.19.25).
PROBLEM 4.19.4. Show that at the critical point, the van der Waals equation, expressed in reduced coordinates, yields a compressibility Z = 3/8 = 0.375.
PROBLEM 4.19.5. Compare the second virial coefficient B(T) to the Lennard-Jones 6–12 potential.
4.20 MAXWELL–BOLTZMANN DISTRIBUTION, COLLISION FREQUENCY, MEAN FREE PATH, AND GASEOUS EFFUSION In Section 5.2, we will derive the three-dimensional Maxwell–Boltzmann distribution n(v)dv of molecular speeds between v and v + dv in the gas phase:
$$ n(v)\,dv = 4\pi v^2 (m/2\pi k_B T)^{3/2} \exp(-mv^2/2k_B T)\,dv \qquad (4.20.1) $$
from statistical–mechanical considerations. Here we can derive it from equivalent simple assumptions. Assume first that the energy of any molecule of mass m in the gas phase is purely its kinetic energy:
$$ E = (1/2)mv^2 \qquad (4.20.2) $$
Assume next that the fraction of molecules dn with speed between $v_x$ and $v_x + dv_x$ is given in one dimension by an Arrhenius factor resembling Eq. (4.18.1):
$$ dn = A \exp[-(1/2)mv_x^2/k_B T]\,dv_x \qquad (4.20.3) $$
which is a Gaussian41 function symmetric about $v_x = 0$ (with symmetrically equal fractions for positive and negative $v_x$). Finally, assume that the motions in the x, y, and z directions are not correlated. Then in one dimension the constant A can be obtained from
$$ 1 = \int dn = A \int_{v_x=-\infty}^{v_x=+\infty} \exp(-mv_x^2/2k_B T)\,dv_x = A\,(2\pi k_B T/m)^{1/2} \qquad (4.20.4) $$
Finally,
$$ dn = (m/2\pi k_B T)^{1/2} \exp(-mv_x^2/2k_B T)\,dv_x \qquad (4.20.5) $$
The same situation will be repeated in the y and z directions. Since $v_x^2 + v_y^2 + v_z^2 = v^2$ and, after integration over all angles, the spherical polar volume element in three dimensions is $dV = 4\pi v^2 dv$, the Maxwell–Boltzmann distribution function n(v) becomes Eq. (4.20.1), QED. Note that n(v) = 0 at v = 0, but n(v) is a maximum at the most probable speed $v_{mp}$ (where dn(v)/dv = 0, see Problem 4.20.2):
$$ v_{mp} = (2k_B T/m)^{1/2} \qquad (4.20.6) $$
The average molecular speed (Problem 4.20.4) is obtained by evaluating $\int_{v=0}^{v=\infty} 4\pi v^3 (m/2\pi k_B T)^{3/2} \exp(-mv^2/2k_B T)\,dv$:
$$ \langle v\rangle = (8k_B T/\pi m)^{1/2} \qquad (4.20.7) $$
This average speed at 273.15 K is 1692, 566.5, 425.1, 454.2, 380.8, and 362.5 m s$^{-1}$ for H2, H2O(g), O2, N2, Ar, and CO2, respectively. The root-mean-square speed is
$$ v_{rms} \equiv (\langle v^2\rangle)^{1/2} = (3k_B T/m)^{1/2} \qquad (4.20.8) $$
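The three characteristic speeds, and the normalization of Eq. (4.20.1), can be verified numerically. The sketch below does so for N2 at 273.15 K with a simple trapezoid rule; the integration cutoff at ten times the root-mean-square speed, the number of grid points, and the helper names are assumptions chosen for illustration.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant, J K^-1
N_A = 6.02214076e23    # Avogadro constant, mol^-1


def mb_density(v, T, m):
    """n(v) of Eq. (4.20.1) for a molecule of mass m (kg) at temperature T (K)."""
    prefactor = 4.0 * math.pi * v**2 * (m / (2.0 * math.pi * K_B * T)) ** 1.5
    return prefactor * math.exp(-m * v**2 / (2.0 * K_B * T))


def characteristic_speeds(T, m):
    """Most probable, average, and root-mean-square speeds, Eqs. (4.20.6)-(4.20.8)."""
    v_mp = math.sqrt(2.0 * K_B * T / m)
    v_avg = math.sqrt(8.0 * K_B * T / (math.pi * m))
    v_rms = math.sqrt(3.0 * K_B * T / m)
    return v_mp, v_avg, v_rms


def trapezoid(f, lo, hi, n=20000):
    """Plain trapezoid rule on n equal intervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h


T = 273.15
m_N2 = 28.0134e-3 / N_A                      # mass of one N2 molecule, kg
v_mp, v_avg, v_rms = characteristic_speeds(T, m_N2)
v_max = 10.0 * v_rms                         # assumed cutoff; the tail beyond is negligible
norm = trapezoid(lambda v: mb_density(v, T, m_N2), 0.0, v_max)
mean = trapezoid(lambda v: v * mb_density(v, T, m_N2), 0.0, v_max)
```

The distribution integrates to unity, its numerical mean agrees with Eq. (4.20.7), and the ordering most probable < average < root-mean-square holds, as expected for a distribution skewed toward high speeds.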
The relative speed of one molecule of mass M, with respect to another of mass m is
$$ v_{rel} = (8k_B T/\pi\mu)^{1/2} \qquad (4.20.9) $$
where $\mu$ is the reduced mass, $\mu^{-1} \equiv m^{-1} + M^{-1}$. If the gas consists of identical molecules, this formula reduces to
$$ v_{rel} = 2^{1/2} (8k_B T/\pi m)^{1/2} = 2^{1/2} \langle v\rangle \qquad (4.20.10) $$
PROBLEM 4.20.1. Fill in the details needed to obtain Eq. (4.20.4).
41 Karl Friedrich Gauss (1777–1855).
PROBLEM 4.20.2. Show for a Maxwell–Boltzmann distribution of Eq. (4.20.1) that the most probable molecular speed $v_{mp}$ is given by Eq. (4.20.6).
PROBLEM 4.20.3. Prove the integral: $\int_{t=0}^{t=+\infty} \exp(-at^2)\,t^3\,dt = 1/2a^2$.
PROBLEM 4.20.4. Show for a Maxwell–Boltzmann distribution of Eq. (4.20.1) that the average molecular speed $\langle v\rangle$ is given by Eq. (4.20.7).
PROBLEM 4.20.5. Show for a Maxwell–Boltzmann distribution of Eq. (4.20.1) that the root-mean-square molecular speed $v_{rms}$ is given by Eq. (4.20.8).
PROBLEM 4.20.6. Derive the perfect gas law at a temperature T by considerations of momentum transfer with the walls of a cubical container of volume $L^3 = V$.
From the concept of a relative speed we can estimate the collision frequency z of successive elastic collisions between molecules in a gas and the mean free path $\lambda$. We assume an effective diameter d of two molecules (assumed to be hard spheres, so that each molecule will collide with another within an area $\pi d^2$); the collision frequency z is given by
$$ z = \pi d^2 v_{rel} (P/k_B T) \qquad (4.20.11) $$
The collision frequency with a wall (per unit area) is then given by
$$ Z = P/(2\pi m k_B T)^{1/2} \qquad (4.20.12) $$
while the mean free path $\lambda$ is given by
$$ \lambda \equiv v_{rel}/z = k_B T/(\pi d^2 P) \qquad (4.20.13) $$
The rate of effusion E from a hole of area A in the gas container is
$$ E = PA/(2\pi m k_B T)^{1/2} \qquad (4.20.14) $$
The dependence of E on $m^{-1/2}$ is Graham’s42 law of effusion.
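To get a sense of the magnitudes involved, the sketch below evaluates the collision frequency, the mean free path as defined in Eq. (4.20.13), the wall collision rate, and the effusion rate for N2 near room conditions. The hard-sphere diameter of 0.37 nm, the temperature and pressure, the hole area, and the function names are assumptions chosen for illustration.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant, J K^-1
N_A = 6.02214076e23    # Avogadro constant, mol^-1


def relative_speed_identical(T, m):
    """v_rel = 2^(1/2) <v> for identical molecules, Eq. (4.20.10)."""
    return math.sqrt(2.0) * math.sqrt(8.0 * K_B * T / (math.pi * m))


def collision_frequency(d, P, T, m):
    """z = pi d^2 v_rel (P / k_B T), collisions per second per molecule."""
    return math.pi * d**2 * relative_speed_identical(T, m) * P / (K_B * T)


def mean_free_path(d, P, T):
    """lambda = v_rel / z = k_B T / (pi d^2 P), as defined in Eq. (4.20.13)."""
    return K_B * T / (math.pi * d**2 * P)


def wall_collision_rate(P, T, m):
    """Z = P / (2 pi m k_B T)^(1/2), collisions per unit area per second."""
    return P / math.sqrt(2.0 * math.pi * m * K_B * T)


def effusion_rate(P, area, T, m):
    """E = P A / (2 pi m k_B T)^(1/2), molecules per second through a hole of area A."""
    return wall_collision_rate(P, T, m) * area


# Assumed inputs: N2 at 298.15 K and 1 atm, hard-sphere diameter 0.37 nm.
T, P, d = 298.15, 101325.0, 3.7e-10
m_N2 = 28.0134e-3 / N_A
z = collision_frequency(d, P, T, m_N2)      # of order 1e9 to 1e10 s^-1
lam = mean_free_path(d, P, T)               # of order 1e-7 m

# Graham's law: effusion rates scale as m^(-1/2) at equal P, T, and hole area.
m_H2, m_O2 = 2.016e-3 / N_A, 31.9988e-3 / N_A
graham_ratio = effusion_rate(P, 1.0e-12, T, m_H2) / effusion_rate(P, 1.0e-12, T, m_O2)
```

Under these assumptions a nitrogen molecule suffers several billion collisions per second and travels roughly a tenth of a micrometer between them, while hydrogen effuses about four times faster than oxygen.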
4.21 TWO-COMPONENT LIQUID-VAPOR PHASE DIAGRAMS If two pure liquids A and B with different boiling temperatures $T_A$ and $T_B$ (and therefore different vapor pressures $P_A$ and $P_B$ at any given common temperature T) are mixed, then since $T_A \neq T_B$, we must consider mole fractions in the liquid phase $X_A^l$ and $X_B^l \equiv 1 - X_A^l$ that will be different from
42 Thomas Graham (1805–1869).
FIGURE 4.5 PX phase diagram for an ideal solution of A in B (or B in A). The diagonal straight line (curved line) connecting $P_A^0$ to $P_B^0$ represents Raoult’s law (Dalton’s law of partial pressures). Any vertical line is called an isopleth, or line of constant mole fraction. For the constant-pressure horizontal line DEF connecting the liquid phase to the vapor phase, the lever-arm rule shows that if $n_A$ is the number of moles of A, and $n_B$ is the number of moles of B, and if $l_A$ is the length of the segment DE, and $l_B$ is the length of the segment EF, then for the isopleth through the point E $n_A l_A = n_B l_B$. [Plot: pressure versus mole fraction $X_B$, from 0.0 (pure A) to 1.0 (pure B). Labels: liquid, liquidus “curve” (Raoult's law, line), two-phase region, Dalton's law (curve), vapor, isopleth, $P_A^0$, $P_B^0$, and points D, E, F.]
the mole fractions in the vapor phase $X_A^v$ and $X_B^v \equiv 1 - X_A^v$. In the vapor phase, Dalton’s43 law of partial pressures will apply (“each molecule to itself!”):
$$ P_A = X_A^v P_{TOT} \quad \text{and} \quad P_B = X_B^v P_{TOT} \qquad (4.21.1) $$
while, if the liquid mixture is ideal, Raoult’s44 law will apply:
$$ P_A = X_A^l P_A^0 \quad \text{and} \quad P_B = X_B^l P_B^0 \qquad (4.21.2) $$
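Combining the two laws gives the vapor composition in equilibrium with a liquid of known composition: Raoult's law, applied to the liquid mole fractions, fixes the partial pressures, and Dalton's law converts them to vapor mole fractions. The sketch below illustrates this; the pure-component vapor pressures (approximate values for benzene and toluene at 25 °C, in torr) and the function name are assumptions used only as an example.

```python
def ideal_solution_vapor(x_A_liquid, P0_A, P0_B):
    """Total pressure and vapor mole fraction of A above an ideal binary liquid."""
    x_B_liquid = 1.0 - x_A_liquid
    P_A = x_A_liquid * P0_A          # Raoult's law for A
    P_B = x_B_liquid * P0_B          # Raoult's law for B
    P_total = P_A + P_B
    y_A_vapor = P_A / P_total        # Dalton's law rearranged: X_A(vapor) = P_A / P_TOT
    return P_total, y_A_vapor


# Assumed, approximate pure-component vapor pressures in torr (illustration only).
P0_A, P0_B = 95.1, 28.4
P_total, y_A = ideal_solution_vapor(0.5, P0_A, P0_B)
```

The vapor is richer than the liquid in the more volatile component (here a vapor mole fraction of about 0.77 above an equimolar liquid), which is the basis of fractional distillation.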
where $P_A^0$ and $P_B^0$ are the vapor pressures of pure A or pure B, respectively, at that temperature. Then the pressure versus composition (PX) diagram for an ideal solution is shown in Fig. 4.5, while the temperature versus composition (TX) phase diagram is shown in Fig. 4.6. Fig. 4.7 shows a temperature versus composition diagram for an ethanol–water mixture. It is slightly idealized to make a pedagogical point. If one starts with initial composition C1 in the liquid (point A in Fig. 4.7), the vapor mixture will have the composition B. If this vapor is condensed, the liquid mixture will have composition C2 at point C; the vapor phase will have the composition C3 at point D. Further cycles will achieve ever smaller increases in ethanol liquid content until the azeotrope (constant-boiling) composition is reached at 95.6 mass% and 78.2 °C.
43 John Dalton (1766–1844). 44 François-Marie Raoult (1830–1901).
[Plot for Figure 4.6: T/°C versus mole fraction $X_A$, from 0.0 (pure B) to 1.0 (pure A). Labels: vapor, liquid, two-phase region, condensation curve (Dalton's law), boiling-point curve (Raoult's law), isopleth, $T_A^0$, $T_B^0$, and points D, E, F, G, H.]
[Plot for Figure 4.7: T/°C versus mass % ethanol, from 0% (pure H2O) to 100% (pure ethanol). Labels: vapor composition, liquid composition, boiling temperature of pure water (100°), boiling temperature of pure ethanol (78.5°), azeotrope (78.2°, 95.6%), compositions C1, C2, C3, and points A, B, C, D.]
4.22 TWO-COMPONENT SOLID–LIQUID PHASE DIAGRAMS FOR SOLID–LIQUID EQUILIBRIA Temperature–composition phase diagrams for mixtures of solids and the liquids in equilibrium with them are very important in metallurgy and electronics. Figure 4.8 shows a simplified phase diagram for two phases that have a limited solubility for each other.
FIGURE 4.6 Liquid–vapor TX phase diagram for an ideal solution of A in B. An isopleth is shown. The line segment D → E → F → G → H represents the path of sequential fractional distillation steps through two and a half stages (or two and a half “theoretical plates”).
FIGURE 4.7 Non-ideal liquid–vapor T versus X diagram for ethanol and water solutions at 1 atm. At 89.5 mol% (95.6 mass%) ethanol and 78.1 °C, a constant-boiling or azeotropic mixture is achieved, whose boiling temperature is below the boiling points of either component (pure water boils at 100 °C, pure ethanol boils at 78.4 °C). Even an infinite number of fractional distillation steps will not remove the last 4.4 mass% of water from the mixture; a solid drying agent, or benzene, is needed to produce “absolute” 100% ethanol (these additives make absolute ethanol undrinkable!).
FIGURE 4.8 Nonideal solid–liquid TX diagram at 1 atm for Cu and Al (only about the left half of the diagram is shown). The two-phase regions are indicated. There is a very limited solubility of Cu in Al; this is phase α. There is similarly a limited solubility of Cu in the stoichiometric phase or intermetallic compound CuAl2 (called the θ phase). The liquid solution of Al in Cu freezes at the lowest possible temperature (540 °C) for 32 mass % Cu; this is the eutectic point (which is technologically useful in solders). [Plot: T/°C (200 to 700) versus mass per cent Cu, from 0 (pure Al) to 50 (θ = CuAl2). Labels: liquid, α, α + liq, liq + θ, θ, α + eutectic, eutectic + θ, eutectic point.]
4.23 TWO-DIMENSIONAL VERSION OF THE PERFECT GAS LAW The work of Langmuir45 and others allows us to discuss a two-dimensional equivalent of the perfect gas law:
$$ \Pi A = nRT \qquad (4.23.1) $$
which is obeyed by a single monolayer of amphiphilic molecules trapped at the air–solvent interface. These molecules [e.g., arachidic acid, or eicosanoic acid, CH3(CH2)18COOH, predissolved in a volatile solvent (e.g., chloroform)] are carefully dropped atop, for example, a very pure water surface; the solvent, which must not be miscible with water, has no choice but to evaporate rapidly, leaving amphiphilic molecules trapped at the air–water interface; the polar “head group” of the amphiphile would drag the molecule into aqueous solution, but the nonpolar “tail” prevents this from happening. If the molecules thus trapped occupy a much smaller area than the surface of the water, then the molecules are akin to a gas of molecules trapped in two dimensions. If the molecules are mechanically swept together to occupy a smaller area A (m2), and start touching each other, then they reduce the surface tension of
45 Irving Langmuir (1881–1957).
FIGURE 4.9 Nonideal solid–liquid TX diagram at 1 atm for Fe and C (only the extreme left half of the diagram is shown). The two-phase regions are indicated. Pure Fe has several phases: α-Fe or α-ferrite (body-centered cubic), stable up to 910 °C; above it is austenite, or γ-ferrite, a face-centered cubic structure; at 1401 °C a third phase, δ-ferrite, is formed (again a body-centered cubic structure). Of these three polymorphs, phases α and δ dissolve very little C, while γ can dissolve up to 2%. P is a “line” phase with zero width. The compound Fe3C or cementite is very important, since it is very brittle and hard; it adds mechanical strength to mixtures and gives the pearly structure to damascene swords. Pure Fe is hard but brittle. Fe alloys with less than 2 mass% C can be heated into austenite, then quenched down to room temperature, with resulting mixed phases that have desirable mechanical properties, especially if alloyed with V, Cr, and so on; these are the steels. The liquid solution of 4.2 mass% C in Fe is a eutectic that melts at about 1200 °C. [Plot: T/°C (200 to 1600) versus mass per cent C, from 0 (pure Fe) to 6 (Fe3C). Labels: liquid, liquid + δ, δ, liquid + γ, liquid + Fe3C, γ (austenite), eutectic, γ + eutectic, eutectic + Fe3C, α, P, P + eutectic.]
water, $\Pi_w$ (N m$^{-1}$ = J m$^{-2}$), to a smaller value $\Pi_{mol}$. By convention the sign is reversed so that $\Pi$ is positive:
$$ \Pi = \Pi_w - \Pi_{mol} \qquad (4.23.2) $$
Equation (4.23.1) can be derived from excess thermodynamic functions (Problem 4.23.1). Equation (4.23.1) can be modified, by analogy to the van der Waals equation for gases, to $(\Pi + n^2 a/A^2)(A - nb) = nRT$, where a represents the intermolecular attractions within the monolayer, and b represents the excluded area. The Gibbs treatment of molecules at interfaces starts from the excess internal energy $E^S$ and excess entropy $S^S$ at the interface of a two-component system, with $n_1^S$ moles of component 1 at the surface of area A, $n_2^S$ moles of component 2 at the surface, and an interfacial surface tension $\Pi$:
$$ E^S = TS^S + \mu_1 n_1^S + \mu_2 n_2^S + \Pi A \qquad (4.23.3) $$
After applying the Gibbs–Duhem relations to the differential of this equation one is left with:
$$ A\,d\Pi = -n_1^S\,d\mu_1 - n_2^S\,d\mu_2 \equiv -A\Gamma_1\,d\mu_1 - A\Gamma_2\,d\mu_2 \qquad (4.23.4) $$
where $\Gamma_1 \equiv n_1^S/A$ and $\Gamma_2 \equiv n_2^S/A$. If the interface is taken as the surface at which $n_1^S = 0$, then
$$ \Gamma_2 = -(\partial\Pi/\partial\mu_2)_{n_1^S = 0} \qquad (4.23.5) $$
but since $\mu_2 = RT \ln a_2$ (where $a_2$ is the activity of component 2) $= RT \ln C$ (in the dilute limit), we have $d\mu_2 = RT\,d\ln C = (RT/C)\,dC$, so finally the Gibbs equation becomes
$$ \Gamma_2 = -(C/RT)(d\Pi/dC) \qquad (4.23.6) $$
PROBLEM 4.23.1. Derive Eq. (4.23.1) from Eq. (4.23.6). The ΠA isotherm, the two-dimensional analog of a PV isotherm in three dimensions, would be a hyperbola if the monolayer were “perfect”; to describe nonperfect behavior, the van der Waals coefficients a and b (known for a specific three-dimensional gas) can be incorporated to modify Eq. (4.23.1), as mentioned above. If one overcompresses the monolayer, then the monolayer “collapses” and pieces of it start to overlap each other, like ice floes in the Arctic Ocean in a storm. The ΠA isotherms can be measured in a Langmuir trough, or film balance, or Pockels46–Langmuir–Adam47–Wilson48–McBain49 (PLAWM) trough, thus honoring most of the major scientists who developed it between 1920 and 1940. The isotherm varies considerably from molecule to molecule, and it is used to determine the area per molecule, as it sits at the air–water interface. A monolayer at the air–water interface compressed to minimum area may be called the Pockels–Langmuir (PL) monolayer. A more useful practical development is the transfer onto a solid support, as a Langmuir–Blodgett50 (LB) monolayer (Fig. 4.10) or, sequentially, as an LB multilayer (Fig. 4.11). If the monolayer at the air–water interface is very rigid, then the orientational “distortions” seen in Fig. 4.11 may not be feasible, so the vertical introduction of a substrate through a monolayer may not work; then the quasi-horizontal Langmuir–Schaefer51 transfer is used, where the substrate is brought “pancake down” atop the monolayer and then withdrawn. LB monolayers and Y-type bilayers lead us to bilayers, hemimicelles, and micelles (Fig. 4.12). A bilayer of phospholipid amphiphiles forms the cell wall, which surrounds each living cell (prokaryotic and eukaryotic); the ionic outer layers contact the bulk solution (blood, serum, etc.), while the
46 Agnes Luise Wilhelmine Pockels (1862–1935).
47 Neil Kensington Adam (1891–1973).
48 Donald A. Wilson ( ).
49 James William McBain (1882–1953).
50 Katharine Burr Blodgett (1898–1979).
51 Vincent J. Schaefer (1906–1993).
FIGURE 4.10 Langmuir–Blodgett (LB) transfer of a monolayer of an amphiphilic molecule (compressed in a Langmuir trough to fixed area and constant film pressure controlled by mechanical barriers, shown in projection) from the air–water interface onto a solid substrate (glass microscope slide) with a hydrophilic surface: hydrophilic end of molecule onto hydrophilic surface.
FIGURE 4.11 Sequential transfer of LB multilayers onto a surface: (left) X-type multilayer; (middle) Y-type multilayer; (right) Z-type multilayer. The Y-type (centrosymmetric) is the most frequently observed, but for certain amphiphiles X-type (acentric) and Z-type (acentric) are preferred. The choice is dominated by poorly understood intermolecular attractions between successive multilayers: The Y-type is most frequent because “like likes like” (in Latin, similes similibus facillime congregantur); a hydrophobic surface usually attracts a hydrophobic adsorbate.
FIGURE 4.12 Cross-sectional views of liposome, micelle, and bilayer sheet.
ionic inner layer touches the cytoplasm. Bacteria and viruses also have an additional rigid outer protecting capsule. The lipid interior of the bilayer protects the cell innards from immediate lysis caused by equilibration of the chemical potential across the wall by “passive transport”; thus for many nutrients the bilayer preserves a chemical potential difference (called the Gibbs–Donnan52 potential) between the inside and the outside of the cell. However, to help transfer important nutrients into the cell by active transport, against the chemical potential difference, many specialized surface proteins sit on the cell exterior and often penetrate the bilayer; these molecules pump nutrients into the cell by various chemical and even almost mechanical processes.

Molecules can bind to surfaces either (i) weakly by physical forces (van der Waals forces: physisorption) or (ii) strongly by forming chemical bonds to the surface (chemisorption, more recently rebaptized “self-assembly”: e.g., thiols R-SH, selenols R-SeH, thioacetyls R-SCOCH3, and disulfides R-S-S-R onto Au, Ag, Pt, or Pd by homolytic bond scission; carboxylic acids R-COOH onto Al, trichlorosilyls R-SiCl3 onto hydroxyl-covered Si; etc.). When molecules M are physisorbed onto some solid surface A (even if irregular), at equilibrium the adsorption and desorption rates are equal, and the Langmuir adsorption isotherm of 1916 is obtained:

$$\theta = (k_a/k_d)\,p \,/\, [1 + (k_a/k_d)\,p] \qquad (4.23.7)$$

where $p$ is the partial pressure of species M, $\theta$ is the fraction of the $N$ sites occupied ($0 \le \theta \le 1$), and $k_a$ and $k_d$ are the rate constants for adsorption and desorption, respectively. Full monolayer coverage corresponds to $\theta = 1$. (The langmuir is a unit of gas exposure, $10^{-6}$ Torr s, which deposits roughly one monolayer if every impinging molecule sticks to the surface.)

PROBLEM 4.23.2. Prove Eq. (4.23.7).

The Langmuir isotherm assumes that all adsorption sites are equivalent and independent of each other. When these assumptions fail, the Temkin [$\theta = a \ln(bp)$]53 and Freundlich54 ($\theta = c\,p^{d}$) isotherms are empirical improvements of limited value. The 1938 Brunauer55–Emmett56–Teller57 (BET) isotherm is a more practical isotherm, useful for computing monolayer and also multilayer coverage on surfaces:

$$[V(P^*/P - 1)]^{-1} = (c\,V_{mono})^{-1}\,[(c - 1)(P/P^*) + 1] \qquad (4.23.8)$$

52 Frederick George Donnan (1870–1956).
53 Menassii Isaakovich Temkin (1908–1991).
54 Herbert Max Finlay Freundlich (1880–1941).
55 Stephen Brunauer (1903–1986).
56 Paul Hugh Emmett (1900–1985).
57 Edward Teller (1908–2003).
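Because Eq. (4.23.8) is linear in $P/P^*$, the monolayer volume and the BET constant follow from a straight-line fit. The Python sketch below is only an illustration under stated assumptions: the function names `bet_volume` and `bet_fit` are hypothetical, and the "data" are synthetic values generated from Eq. (4.23.8) with assumed $V_{mono} = 12$ and $c = 80$ (arbitrary units), not measurements.

```python
# Sketch of the BET linearization of Eq. (4.23.8).
# All numerical values are illustrative assumptions, not measured data.

def bet_volume(x, v_mono, c):
    """Adsorbed volume V from Eq. (4.23.8), with x = P/P*."""
    # 1/[V (1/x - 1)] = [(c - 1) x + 1] / (c * v_mono)
    return c * v_mono / (((c - 1.0) * x + 1.0) * (1.0 / x - 1.0))

def bet_fit(xs, vs):
    """Least-squares line y = s*x + i, where y = 1/[V (1/x - 1)]."""
    ys = [1.0 / (v * (1.0 / x - 1.0)) for x, v in zip(xs, vs)]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    v_mono = 1.0 / (slope + intercept)   # V_mono = 1/(s + i)
    c = 1.0 + slope / intercept          # c = 1 + s/i
    return v_mono, c

# Synthetic isotherm points in the usual linear range of P/P*
xs = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35]
vs = [bet_volume(x, v_mono=12.0, c=80.0) for x in xs]
v_mono_fit, c_fit = bet_fit(xs, vs)
print(v_mono_fit, c_fit)
```

With real data the points scatter about the line, so the fit should be restricted to the range of $P/P^*$ over which the plot is actually linear.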
Table 4.2 Colloids

Dispersing   Dispersed Medium
Medium       Gas                             Liquid                          Solid
Gas          None (all gases are             Liquid aerosol (fog, mist,      Solid aerosol (smoke, cloud,
             infinitely miscible)            hair spray)                     air particulates)
Liquid       Foam (whipped cream)            Emulsion (milk, mayonnaise)     Sol (pigmented ink, blood)
Solid        Solid foam (Styrofoam,          Gel (agar, gelatin, opal,       Solid sol (cranberry glass)
             pumice)                         silica gel)

where $V$ is the volume of gas adsorbed as a multilayer, $V_{mono}$ is the volume of gas that was adsorbed as a single first full monolayer, $P$ is the equilibrium vapor pressure, $P^*$ is the “saturation pressure,” and $c$ is the BET constant defined by

$$c \equiv \exp[(\Delta H_{des} - \Delta H_{vap})/RT] \qquad (4.23.9)$$

where $\Delta H_{des}$ is the molar enthalpy of desorption, and $\Delta H_{vap}$ is the molar enthalpy of vaporization. In the pressure range $0.05 < P/P^* < 0.35$, the quantity $[V(P^*/P - 1)]^{-1}$ increases linearly with $P/P^*$: from the slope $s$ and intercept $i$, one obtains $V_{mono} = 1/(s + i)$ and $c = 1 + s/i$. Thus, the BET isotherm measures the effective specific surface of a particulate solid support; substances like silica gel, zeolite, and so on, can have a specific surface useful for heterogeneous catalysis, provided that it exceeds 200 m$^2$ g$^{-1}$.

Micelles and Liposomes. If in a solution amphiphiles (also known as surfactants) exceed a certain concentration called the critical micelle concentration, then they aggregate spontaneously into micelles or liposomes (Fig. 4.12). Hemi-micelles are extended half-cylindrical objects similar to micelles, except that the amphiphiles are aggregated only in a semicircle, instead of a full circle, and the hemi-micelles must rest on a solid support.

Colloids are metastable mixtures or dispersions of two immiscible phases, where a minority phase is suspended as aggregates within a majority phase; the mixture may exhibit surprising long-term kinetic (but not thermodynamic) stability. Table 4.2 shows some examples. Some colloids carry a high electrostatic charge. Nanoparticles with their protective covering (“spinach”) of either ionic species (e.g., gold citrate colloids) or hydrophobic species (e.g., gold with octanethiol coatings) can also be considered colloids, although present research emphasis is on their metallic properties. The physical properties of Au nanoparticle colloids transition from semiconducting aggregates (when very small) to metallic particles (when larger, with band structure, plasmons, etc.) (Fig. 4.13).

FIGURE 4.13 Transition of Au nanoparticles from insulating/semiconducting molecule-like ($n$ = 13 to 75) to metal-like ($n$ = 140, 225) to metallic ($n = \infty$). Adapted from Murray [6]. [Figure labels: clusters Au13, Au25, Au38, Au55, Au75, Au140, Au225, Au∞; regimes “molecule-like energy gap,” “metal-like quantized charging,” and “metal”; HOMO–LUMO gap energies in eV.]

One can attribute the relative stability of colloids, or dispersed particles, to a theoretical electrokinetic potential, or zeta potential, or $\zeta$ potential, which is defined as the potential difference between the bulk solvent and a very thin layer of the solvent (called the “slipping plane” and typically about 1 nm thick) that is tightly attached to the colloidal particle or nanoparticle. This $\zeta$ potential cannot be measured directly, but is a useful theoretical concept, which also helps to explain electro-osmosis, electromigration, and electrophoresis. For the colloidal nanoparticle, one assumes an electrical double layer on its surface, similar to, but not identical with, the Helmholtz electrical double layer (Section 6.20) assumed to form next to metal electrodes. One assumes that for $0 < |\zeta| < 5$ mV, the colloidal suspension is unstable, the particles attract each other, and rapid coagulation or flocculation occurs (the flocculation value is the minimal concentration of the colloid at which flocculation occurs). For $10 < |\zeta| < 30$ mV, the suspension can become unstable; for $30 < |\zeta| < 40$ mV, there is moderate stability; for $40 < |\zeta| < 60$ mV, there is good stability; and for $61 < |\zeta|$, there is excellent stability for the colloidal suspension. The $\zeta$ potential can be estimated using the 1903 theory of Smoluchowski58 [5] and the experimentally determined dynamic electrophoretic mobility. In electrophoresis, one applies a potential (typically 1 kV) across a colloidal suspension; the speed of colloids moving toward the electrode of opposite charge is proportional to $|\zeta|$; this speed can be monitored by “zeta meters,” which in reality measure the mobility and not $\zeta$. [Note: A purple Au colloidal solution prepared in 1857 by Faraday59 at the Royal Institution in London, by reducing KAuCl4 with phosphorus, was still stable in the Director’s office in 1997!]
58 Marian Ritter von Smolan Smoluchowski (1872–1917).
59 Michael Faraday (1791–1867).
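The estimate of $\zeta$ from a measured electrophoretic mobility can be sketched in a few lines. The Python example below assumes the thin-double-layer form of the Smoluchowski relation, $\zeta = \eta\,\mu_e/(\varepsilon_r \varepsilon_0)$, and water near 25 °C ($\eta \approx 0.89$ mPa s, $\varepsilon_r \approx 78.4$); these values, the sample mobility, and the function names are illustrative assumptions. The stability classes follow the ranges quoted above; values falling in the gaps between the quoted ranges are reported as unspecified, and the assignment of the boundary values themselves is an arbitrary choice made here.

```python
EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m

def zeta_from_mobility(mobility, viscosity=0.89e-3, rel_permittivity=78.4):
    """Smoluchowski estimate (thin double layer): zeta in volts.

    mobility in m^2/(V s), viscosity in Pa s. Defaults assume water near 25 C.
    """
    return viscosity * mobility / (rel_permittivity * EPS0)

def stability_class(zeta_mv):
    """Map |zeta| in mV onto the stability ranges quoted in the text."""
    z = abs(zeta_mv)
    if z < 5.0:
        return "unstable: rapid coagulation or flocculation"
    if 10.0 < z < 30.0:
        return "can become unstable"
    if 30.0 <= z < 40.0:
        return "moderate stability"
    if 40.0 <= z <= 60.0:
        return "good stability"
    if z > 61.0:
        return "excellent stability"
    return "range not specified in the text"

# Hypothetical measured mobility of 3.0e-8 m^2/(V s)
zeta_mv = 1000.0 * zeta_from_mobility(3.0e-8)
print(round(zeta_mv, 1), stability_class(zeta_mv))
```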
4.24 CONTACT ANGLE AND SURFACE TENSION MEASUREMENTS

Static contact angle measurement of the sessile drop. The contact angle, $\theta_C$, is the angle formed by a liquid drop at the three-phase boundary where a liquid, a gas, and a solid intersect. It depends on the interfacial surface tensions between gas and liquid $\Pi_{GL}$, solid and liquid $\Pi_{SL}$, and gas and solid $\Pi_{GS}$, as given by Young’s60 equation of 1805:

$$\Pi_{GS} = \Pi_{SL} + \Pi_{GL} \cos\theta_C \qquad (4.24.1)$$

Contact-angle goniometers measure $\theta_C$ by using tangent angles (see Fig. 4.14), thus assuming that the droplet is a sphere or an ellipsoid, or else that it fits the Young–Laplace equation $\Delta p = \Pi(1/R_1 + 1/R_2)$, where $R_1$ and $R_2$ are the principal radii of curvature of the interface between the two fluids (liquid and air). $\theta_C$ is a measure of the ratio of intraphase cohesion versus interphase adhesion: If $\theta_C \approx 0°$, that is, if the liquid droplet spreads completely on the solid surface and “wets” it, then adhesion dominates; if $\theta_C$ is large, then cohesion within the drop dominates. If the drop is H2O(l) and $\theta_C \approx 0°$, then the surface is hydrophilic; if $\theta_C > 90°$, then the surface is hydrophobic. To confirm this, one can also use a nonvolatile hydrophobic organic liquid drop, such as dodecane, for which the contact angles will be dramatically different. $\theta_C$ can also be measured dynamically:
FIGURE 4.14 Definition of a contact angle $\theta_C$ for a liquid (L) drop on a solid surface (S) in the presence of a gas (G). [Figure labels: $\theta_C$; interfacial tensions γLG, γSG, γSL; liquid drop; solid.]
1. The advancing contact angle is determined by pushing a droplet out of a pipette onto a solid: When the liquid initially meets the solid, it will form some contact angle; as the pipette injects more liquid, the droplet will increase in volume and the contact angle will increase, but its three-phase boundary will remain stationary, until it suddenly jumps outward; the $\theta_C$ of the droplet immediately before jumping outward is the advancing contact angle $\theta_{CA}$.
2. The receding contact angle is next measured by sucking the liquid back out of the droplet. The droplet will decrease in volume and the contact angle will decrease, but its three-phase boundary will remain stationary until it suddenly jumps inward. The $\theta_C$ of the droplet immediately before jumping inward is the receding contact angle $\theta_{CR}$.

Then $\theta_{CA} - \theta_{CR}$ is the contact angle hysteresis, a measure of surface heterogeneity, roughness, and mobility.

FIGURE 4.15 Wetting and contact angles, from $\theta_C$ = 95° (poor wetting) through $\theta_C$ = 15° (good wetting) to $\theta_C$ = 0° (complete wetting).

60 Thomas Young (1773–1829).
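Young’s equation (4.24.1) and the hysteresis defined above translate directly into a short calculation. The Python sketch below is illustrative only: the function names are hypothetical, the solid–gas and solid–liquid tensions are made-up numbers, and 72.8 mN/m is the approximate surface tension of water near 20 °C.

```python
import math

def young_contact_angle(pi_gs, pi_sl, pi_gl):
    """Contact angle in degrees from Eq. (4.24.1): cos(theta) = (Pi_GS - Pi_SL)/Pi_GL."""
    cos_theta = (pi_gs - pi_sl) / pi_gl
    if cos_theta >= 1.0:
        return 0.0      # complete wetting: the drop spreads
    if cos_theta <= -1.0:
        return 180.0    # complete non-wetting
    return math.degrees(math.acos(cos_theta))

def contact_angle_hysteresis(theta_advancing, theta_receding):
    """Hysteresis theta_CA - theta_CR, in the same units as the inputs."""
    return theta_advancing - theta_receding

# Hypothetical solid with Pi_GS = 40 and Pi_SL = 30 mN/m, wetted by water
theta = young_contact_angle(40.0, 30.0, 72.8)
print(round(theta, 1))
print(contact_angle_hysteresis(95.0, 80.0))
```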
FIGURE 4.16 Pendant drop method of measuring surface tension: Drop of liquid hangs from pipette; the z axis is vertical, the x axis is horizontal, and R is the (maximum) radius of the drop. To obtain $\beta$, many values of the radii S (S = 0 to R) are measured at heights z (z = 0 to R) above the apex of the drop; $\phi$ is the tangent angle at the radius S.
The pendant drop method (Fig. 4.16) measures the drop profile and the drop radius $R$ to determine the surface tension for any liquid surrounded by gas, or the interfacial tension between any two liquids: A drop is hung from a syringe tip in air, and the interfacial surface tension $\Pi_{GL}$ is

$$\Pi_{GL} = (\rho_{liquid} - \rho_{gas})\, g R^2 / \beta \qquad (4.24.2)$$

where $g$ = acceleration due to gravity, $\rho_{liquid}$ = bulk density of the drop, $\rho_{gas}$ = density of surrounding gas (or second liquid), $R$ = radius of drop at its apex, and $\beta$ = dimensionless shape factor of the drop, which is obtained by an iterative solution of the three simultaneous conditions $(dx/ds) = \cos\phi$, $(dz/ds) = \sin\phi$, and $(d\phi/ds) = 2 + \beta z - (1/x)\sin\phi$ (with all lengths expressed in units of $R$).

The Wilhelmy61 plate method employs a sensitive force meter to measure a force that can be translated into a value of the contact angle, or conversely to measure an interfacial surface tension (Fig. 4.17). A small plate-shaped sample (e.g., an 8 × 3 cm strip of filter paper), attached to the arm of a force meter, is vertically dipped into a pool of the probe liquid. The force is related to the contact angle $\theta$ by

$$\cos\theta = (F - F_b)/(I\,\Pi) \qquad (4.24.3)$$

where $F$ is the total measured force, $F_b$ is the force due to buoyancy (the solid sample displaces the liquid), $I$ is the wetted length, and $\Pi$ is the surface tension of the liquid. This method is fairly objective and yields data that are averaged over the wetted length. Strictly speaking, this is not a sessile drop technique, as we are using a small submerging pool, rather than a droplet. However, the calculations described in the following sections, which were derived for the relation of the sessile drop contact angle to the surface energy, apply just as well.

FIGURE 4.17 The Wilhelmy method. In the top picture a plate of the solid surface is lowered into a submerging liquid. The liquid pushes up on the solid sample with force due to the buoyancy and the surface tension, and these forces are measured by instruments attached to the arm above the sample and depend on the length d, surface tension $\Pi$, and wetted length I (the perimeter of the sample along the line of contact of the air, liquid, and solid). In the bottom picture the sample is being raised and the liquid exerts a downward force.

61 Ludwig Wilhelmy (1812–1864).
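Equation (4.24.3) can be used in either direction, as the following Python sketch suggests. It is an illustration under stated assumptions: the function names are hypothetical, the sign convention for the buoyancy force is simply that of Eq. (4.24.3), and the plate perimeter, buoyancy force, and surface tension are made-up SI values.

```python
import math

def wilhelmy_contact_angle(force_total, force_buoyancy, wetted_length, surface_tension):
    """Contact angle in degrees from Eq. (4.24.3); forces in N, length in m, tension in N/m."""
    cos_theta = (force_total - force_buoyancy) / (wetted_length * surface_tension)
    if not -1.0 <= cos_theta <= 1.0:
        raise ValueError("inputs give |cos(theta)| > 1; check forces and units")
    return math.degrees(math.acos(cos_theta))

def wilhelmy_surface_tension(force_total, force_buoyancy, wetted_length, theta_deg=0.0):
    """Surface tension in N/m from the same relation, for a known contact angle
    (theta = 0 is the usual assumption for a completely wetted plate)."""
    return (force_total - force_buoyancy) / (
        wetted_length * math.cos(math.radians(theta_deg)))

# Round trip with assumed values: 5 cm wetted perimeter, water-like tension, theta = 60 degrees
wetted = 0.05          # m
tension = 0.0728       # N/m
buoyancy = 1.0e-4      # N (hypothetical)
force = buoyancy + wetted * tension * math.cos(math.radians(60.0))
angle = wilhelmy_contact_angle(force, buoyancy, wetted, tension)
print(round(angle, 3))
```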
4.25 INTERNATIONAL STANDARDS FOR TIME, MASS, LENGTH, TEMPERATURE, AND BRIGHTNESS

After years of study by a committee appointed by Louis XVI62, in 1799 Bonaparte63 adopted for use throughout France the metric system of units: The unit of time was defined as 1/86,400 of a mean solar day; the meter was defined as one 40,000,000th of the circumference of the Earth measured along a meridian; and the gram was defined as the mass of 1 cubic centimeter of liquid water at the temperature of its maximum density (4 °C). This defined the cgs system. Napoleon’s armies spread the system throughout continental Europe, but the British (and consequently the Americans) clung to the old English inch-pound-second system. From cgs the meter-kilogram-second (mks) system evolved, and finally SI (Système International d’Unités) was born from mks with the addition of units of current (ampere), temperature (kelvin), and brightness (candela). The International Temperature Scale is defined by fixed points such as 13.8033 K (triple point of equilibrium H2), 24.5561 K (triple point of Ne), and 1234.93 K (freezing temperature of Ag). The United States uses metric quantities as primary standards; the inch and pound are merely secondary standards (that therefore differ very slightly from the British standard inches and pounds); the progress of “metrication” in the United States is regrettably slow, but unstoppable.
4.26 ADIABATIC AND DIATHERMAL WALLS, AND FIXED-TEMPERATURE BATHS

In practice, adiabatic walls around any system (container) are achieved by pulling a very good vacuum between two metal surfaces: (a) the inner surface enclosing the system and (b) the outer surface facing ambient conditions. The mass connecting the inner and outer surfaces, needed for mechanical stability, is kept to a minimum. To avoid radiative heat transfer (photons, i.e., thermal radiation) between the inner surface and the outer surface, mirror-metal coatings on glass are used (Dewar64 vessel or Thermos® bottle). Another way to provide moderate adiabaticity is to fill the space with Styrofoam® (air-filled porous polystyrene). Provided that the heat loading is not excessive, the individual air pockets in the foam bubbles delay heat transfer across the foam from system to surroundings and vice versa. The opposite of adiabatic is either diabatic or diathermal. The best way to provide diathermal walls is to connect the system (inner vessel) to the surroundings (outer vessel) with metal (an excellent heat conductor) or water (a good thermal conductor with very large specific heat capacity) or diamond (the best heat conductor and, simultaneously, the best electrical insulator). Thermostat vessels (fixed-temperature “baths” or “sinks”) are used to control the temperature of a large mass (water or metal) using an electronic feedback loop between an electrical resistance heater and a temperature monitor; the temperature of such a bath can be controlled routinely to within 0.01 K or, with special care, to within 0.001 K.

62 Louis-Auguste, King Louis XVI of France (1754–1793).
63 Napoleon[e] [di] B[u]onaparte (1769–1821).
4.27 THERMODYNAMIC EFFICIENCY: THE CARNOT, OTTO, DIESEL, AND RANKINE CYCLES

As a measure of how efficiently a system can yield energy with minimum heat loss, we define the thermodynamic efficiency as the ratio of useful work output (over a complete cycle) divided by heat input:

$$\eta \equiv (\text{work output})/(\text{heat investment}) = 1 - T_C/T_H \qquad (4.27.1)$$

The usual situation is the existence of a “heat engine” or motor connected to two heat “reservoirs” of infinite capacity, one at a “hot” absolute temperature $T_H$, providing the engine heat $Q_H$, another at a “cold” temperature $T_C$, receiving heat $-Q_C$ from the engine, with useful mechanical work output $-W$ from the engine. Running things “backwards” in a refrigerator, air conditioner, or heat pump, heat $Q_C$ is pulled into the system from the cold reservoir at $T_C$, work $W$ is given to the system, and heat $Q_H$ is expelled from the system to the hot reservoir at $T_H$. The efficiency is still given by Eq. (4.27.1). Equation (4.27.1) shows that the only way to get 100% energy output with no loss of waste heat would be if either $T_C = 0$ K or $T_H = \infty$! The Carnot65 cycle, using a perfect gas as the working fluid and reversible steps, will maximize $\eta$. The full cycle consists of four steps: (i) an adiabat (S = constant) followed by (ii) an isothermal expansion (constant $T_H$), then (iii) one more adiabat (S = constant), then (iv) a final isothermal compression (constant $T_C$). Other cycles are given in Table 4.3.

PROBLEM 4.27.1. Prove that for the Carnot cycle using a perfect gas as the working fluid in reversible steps, the thermodynamic efficiency is given by Eq. (4.27.1).
64 Sir James Dewar (1842–1923).
65 Nicolas Léonard Sadi Carnot (1796–1832).
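Equation (4.27.1) can be checked numerically by following one mole of perfect gas around the four Carnot steps listed above. The Python sketch below is a numerical illustration, not a proof: the function name, the starting volumes, and the choice of a monatomic gas ($\gamma = 5/3$) are assumptions made for the example.

```python
import math

R = 8.314462618  # gas constant, J/(mol K)

def carnot_efficiency_from_cycle(t_cold, t_hot, v1=1.0, v3=2.0, gamma=5.0 / 3.0):
    """Efficiency of a reversible Carnot cycle for 1 mol of perfect gas.

    Along an adiabat T * V**(gamma - 1) is constant; along an isotherm
    the heat absorbed is R * T * ln(V_final / V_initial).
    """
    exponent = 1.0 / (gamma - 1.0)
    # (i) adiabatic compression from (t_cold, v1) to (t_hot, v2)
    v2 = v1 * (t_cold / t_hot) ** exponent
    # (ii) isothermal expansion at t_hot from v2 to v3: heat taken in
    q_hot = R * t_hot * math.log(v3 / v2)
    # (iii) adiabatic expansion from (t_hot, v3) to (t_cold, v4)
    v4 = v3 * (t_hot / t_cold) ** exponent
    # (iv) isothermal compression at t_cold from v4 back to v1: heat rejected (negative)
    q_cold = R * t_cold * math.log(v1 / v4)
    # Over a full cycle the work output equals the net heat absorbed
    return (q_hot + q_cold) / q_hot

eta = carnot_efficiency_from_cycle(300.0, 500.0)
print(eta, 1.0 - 300.0 / 500.0)
```

The two volume ratios $V_3/V_2$ and $V_4/V_1$ come out equal, which is why the logarithms cancel and only the temperatures survive in the result.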
Table 4.3 Idealized Cycles with Thermodynamic Quantities Held Constant in Individual Steps a

Cycle                      Step 1  Step 2  Step 3  Step 4  Efficiency
(i) Power Cycles with External Combustion, or Heat Pumps
Carnot (1824)              S       T       S       T       $1 - T_C/T_H$
Joule                      S       P       S       P
Stoddard (1919)b           S       P       S       P
Ericsson I (1833)c         S       P       S       P
Brayton (jet turbines)d    S       P       S       P
Stirling (1816)e           T       V       T       V       $1 - T_C/T_H$
Ericsson II (1853)         T       P       T       P       $1 - T_C/T_H$
Sargent f                  S       V       S       P
Rankine g                  V       P       S       P
(ii) Power Cycles with Internal Combustion
Otto (1876: cars)h         S       V       S       V       $1 - r^{-\gamma+1}$
Diesel (1897: trucks)i     V       S       P       S       $1 - r^{-\gamma+1}(a^{\gamma} - 1)/[\gamma(a - 1)]$
Lenoir (pulse jet)j        P       V       S       P

a For example, the adiabatic steps are indicated as “S”. In the Otto and related cycles we have $r$ = compression ratio, $\gamma \equiv C_P/C_V$ [Eq. (4.19.27)], and $a$ = 1.32.
b Elliot J. Stoddard (1790–1878).
c John Ericsson (1803–1889).
d George Brayton (1830–1892).
e Robert Stirling (1790–1878).
f C. E. Sargent (fl. 1900–1916).
g William John Macquorn Rankine (1820–1872).
h Nicolaus August Otto (1832–1891).
i Rudolf Christian Karl Diesel (1858–1913).
j Jean Joseph Etienne Lenoir (1822–1900).
PROBLEM 4.27.2. Prove that for the Otto cycle using a perfect gas as the working fluid in reversible steps, the thermodynamic efficiency is given by $1 - r^{-\gamma+1}$, where $r$ is the compression ratio.

PROBLEM 4.27.3. Prove that for the Diesel cycle using a perfect gas as the working fluid in reversible steps, the thermodynamic efficiency is given by $1 - r^{-\gamma+1}(a^{\gamma} - 1)/[\gamma(a - 1)]$, where $r$ is the compression ratio and $a$ is the cutoff ratio.
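The two efficiency formulas in Problems 4.27.2 and 4.27.3 are easy to evaluate and compare. In the Python sketch below, the function names and the sample values (compression ratios 8 and 18, cutoff ratio 2, $\gamma = 1.4$ for air-like gases) are illustrative assumptions rather than data from the text.

```python
def otto_efficiency(r, gamma=1.4):
    """Ideal Otto cycle: 1 - r**(1 - gamma), with r the compression ratio."""
    return 1.0 - r ** (1.0 - gamma)

def diesel_efficiency(r, a, gamma=1.4):
    """Ideal Diesel cycle: 1 - r**(1 - gamma) * (a**gamma - 1) / (gamma * (a - 1)),
    with r the compression ratio and a > 1 the cutoff ratio."""
    return 1.0 - r ** (1.0 - gamma) * (a ** gamma - 1.0) / (gamma * (a - 1.0))

eta_otto = otto_efficiency(8.0)
eta_diesel = diesel_efficiency(18.0, 2.0)
print(round(eta_otto, 4), round(eta_diesel, 4))
```

At equal compression ratio the Diesel expression is always below the Otto value for $a > 1$, and it approaches the Otto value as $a \to 1$; Diesel engines gain in practice because they run at higher compression ratios.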
4.28 STANDARD STATES

Chemical reactions involve finite changes in E, H, S, A, and G: these are called $\Delta E$, $\Delta H$, $\Delta S$, $\Delta A$, and $\Delta G$. It would be wonderful if one could refer everything back to 0 K, and to separate atoms, so that the formation of a molecule would involve an enthalpy of formation $\Delta H_f$ which would be negative, since all molecules should be more stable than their independent constituent atoms. However, this sensible ideal is defeated by the fact that heats of atomization (needed to split a molecule, say ethanol C2H5OH, into 2 C atoms, 1 O atom, and 6 H atoms) are hard to measure, and 0 K is impossible to reach. Instead the two quantities (1) “standard internal energies of formation at 298.15 K (25 °C)” $\Delta E^{\ominus}_{f,298.15\,K}$ and (2) “standard enthalpies of formation at 298.15 K (25 °C)” $\Delta H^{\ominus}_{f,298.15\,K}$ are defined relative to internal energies or enthalpies of formation of the elements in their “standard” (“usual”) states at 298.15 K and 1 atm (denoted by “$\ominus$”) as being zero. In other words: $\Delta H^{\ominus}_{f,298.15\,K,1\,atm} \equiv 0$ for {H2(g), He(g), Li(s), Be(s), B(s), C(graphite, not diamond), N2(g), O2(g), F2(g), Ne(g), and so on}. Ditto goes for $\Delta E^{\ominus}_{f,298.15\,K}$(H2(g)) = 0, and so on. For entropy, one does assume S = 0 at T = 0 K for pure elements, so “absolute entropies” are used (ignoring zero-point motion at 0 K). Therefore the standard enthalpies of formation of molecules can be positive or negative!

Sideline. There was a committee-encouraged effort to replace $\Delta G^{\ominus}_f$ by $\Delta_f G^{\ominus}$, but, luckily, that bad idea has lost favor. Remember Eisenhower’s66 dictum that “a camel is a horse designed by a committee”.

Of course, then

$$\Delta G^{\ominus}_{f,298.15\,K}(\mathrm{C_2H_5OH(l)}) = \Delta H^{\ominus}_{f,298.15\,K}(\mathrm{C_2H_5OH(l)}) - T\,\Delta S^{\ominus}_{298.15\,K}(\mathrm{C_2H_5OH(l)})$$

The changes in E, H, A, or G as functions of temperature, pressure, phase change, or chemical reactions are vitally important for knowing the energy contents, reactivity, or thermodynamic feasibility of certain processes. For many molecules, $\Delta H_f$ is obtained by measuring the heat of combustion of the molecule, which is transformed to products of known stoichiometric composition and previously measured $\Delta H_f$. However, there are other ways of measuring $\Delta H$ indirectly, as in measuring the temperature dependence of vapor pressure using the Clausius–Clapeyron equation:

$$dP/dT = \Delta H^{\ominus}_{vap}/(T\,\Delta V) \qquad (4.28.1)$$

or by the temperature dependence of equilibrium constants $K_{eq}$ by the van’t Hoff equation:

$$(\partial \ln K_{eq}/\partial(1/T))_P = -\Delta H^{\ominus}/R \qquad (4.28.2)$$

or by

$$\Delta G^{\ominus} = -RT \ln K_{eq} \qquad (4.28.3)$$

which is derived from Eq. (4.28.2), or by the Gibbs–Helmholtz equation:

$$(\partial(\Delta G^{\ominus}/T)/\partial T)_P = -\Delta H^{\ominus}/T^2 \qquad (4.28.4)$$
4.29 ATTAINMENT OF HIGH AND LOW TEMPERATURES [7]

To achieve temperatures below room temperature in a finite body, thermal contact with a cold liquid or solid is one way (see Table 4.1): liquid He boils at 4.2 K, liquid H2 at 20.3 K, liquid N2 at 77.35 K, and liquid NH3 at 239.80 K, while dry ice (solid CO2) sublimes at 194.7 K. In between these fixed points, one uses a flow of gas (usually He, N2, or Ar) that is cooled around one of these liquids and then passed through an electrical resistance heater, which is controlled by a flow meter and thermocouple circuit. Cryogenic liquids can cause bad burns. In addition, one must be careful with liquid N2, at whose surface liquid O2 can condense (for example, in a vacuum trap); liquid O2 will ignite any combustible organic substance (which is why liquid N2 cold traps must be brought to room temperature carefully). Liquid H2 is explosive in the presence of oxygen and a catalyst or an electrical spark.

66 Dwight David Eisenhower (1890–1969).
4.30 LOWERING THE PRESSURE ABOVE LIQUID HELIUM

To reach temperatures below 4.2 K, one can partially evacuate a He reservoir using a high-capacity vacuum pump; this works down to the lambda point of liquid He (2.1768 K); below this temperature the $^4_2$He turns into a superfluid quantum liquid, which cannot be cooled any further. The minority isotope, $^3_2$He, remains a normal fluid down to 0.002491 K; this allows cooling down to about 1 K.
4.31 ADIABATIC DEMAGNETIZATION

To achieve temperatures below 2.18 K, Giauque67 invented adiabatic demagnetization. A paramagnetic rare earth salt (e.g., gadolinium sulfate) is cycled between its ordered, lower-entropy magnetized state and its disordered, higher-entropy unmagnetized state. To reach 0 K, however, an infinite number of such electronic magnetization/demagnetization cycles is needed. Adiabatic demagnetization of the paramagnetic nuclear spins of Cu metal has reached the record lowest temperature (20 nK).
4.32 LASER COOLING

Laser cooling uses light to cool atoms to a very low temperature. It was made practical by Chu,68 Cohen-Tannoudji,69 and Phillips.70 This technique works by tuning the frequency of light slightly below an electronic transition in the atom, to the “red” (i.e., at lower frequency) of the transition; the atoms will absorb more photons if they move toward the light source, due to their kinetic energy and their momentum in the direction toward the laser (Doppler71 effect). When they re-emit the radiation in a random direction, they will do so at an energy that is not Doppler-shifted; thus they lose kinetic energy.
67 William Francis Giauque (1895–1962).
68 Steven Chu (1948– ).
69 Claude Cohen-Tannoudji (1931– ).
70 William Daniel Phillips (1948– ).
71 Christian Andreas Doppler (1803–1853).
The result of the absorption and emission process is to reduce the speed of the atom, provided that its initial speed is larger than the recoil velocity from scattering a single photon. If the absorption and emission are repeated many times, the mean velocity, and therefore the kinetic energy, of the atom will be reduced, thus cooling the atoms. This works only for dilute concentrations of the atoms, to prevent the absorption of the photons into the gas in the form of heat due to atom–atom collisions. Only certain atoms and ions have optical transitions amenable to laser cooling, since it is extremely difficult to generate the amounts of laser power needed at wavelengths much shorter than 300 nm. The following is a partial list of atoms that have been laser-cooled: H, Li, Na, K, Rb, Cs, Fr, Be, Mg, Ca, Sr, Ba, Ra, Cr, Er, Fe, Cd, Ag, Hg (plus metastable Al, Yb, He, Ne, Ar, Kr), and some ions.
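The recoil velocity mentioned above is the speed change from one photon momentum kick, $v_{recoil} = h/(m\lambda)$. The Python sketch below evaluates it for sodium; the choice of the Na line near 589 nm, the initial speed of 500 m/s, and the function names are assumptions made for illustration.

```python
H_PLANCK = 6.62607015e-34   # Planck constant, J s
AMU = 1.66053906660e-27     # atomic mass unit, kg

def recoil_velocity(mass_amu, wavelength_m):
    """Speed change (m/s) of an atom on absorbing one photon: h / (m * lambda)."""
    return H_PLANCK / (mass_amu * AMU * wavelength_m)

def photons_to_stop(initial_speed, mass_amu, wavelength_m):
    """Rough number of head-on absorptions needed to bring the atom nearly to rest,
    ignoring the random kicks from spontaneous emission."""
    return initial_speed / recoil_velocity(mass_amu, wavelength_m)

v_rec = recoil_velocity(22.98977, 589.0e-9)           # Na, assumed 589 nm transition
n_photons = photons_to_stop(500.0, 22.98977, 589.0e-9)
print(round(v_rec, 4), round(n_photons))
```

The recoil velocity of about 3 cm/s means that tens of thousands of absorption and emission events are needed to slow a thermal atom, which is why the cycle must be repeated many times.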
4.33 ZERO KELVIN, THE UNREACHABLE GOAL

At 0 K, diatomic molecules will still have zero-point vibration.

4.34 HIGH TEMPERATURES

A tube furnace or muffle furnace heated by coiled W wires can easily attain 1200 °C. Above that, an induction furnace must be used. For small samples, a Peltier-effect heater/cooler can be used.
4.35 ATTAINMENT OF HIGH AND LOW PRESSURES [8]

High pressures require that the equipment have sufficient tensile strength not to deform: steel is most frequently used; if the pressure apparatus must be nonmagnetic, then a Be–Cu alloy is good up to about 10 kbar. Pressures from 1 bar to 1 kbar can be attained by using a hand-operated hydraulic piston, similar to what is used in an automobile repair shop. Above 1 kbar, pressure intensifiers can boost these pressures tenfold, reaching about 10 kbar. Both hydraulic pistons and pressure intensifiers require a hydraulic fluid (heavy oil at room temperature, n-pentane down to 77 K or so) which can be compressed isotropically; for some pressures, talcum powder can act as an almost isotropic pressure-transmitting medium. If higher pressures are needed, the demand for isotropic compression must be abandoned, and anisotropies creep in. Beyond 10 kbar, mechanical means must be used: diamond anvils. The diamond anvil method uses industrial diamonds, shaped to define a small volume, and tetrahedrally mounted pistons and anvils compress the two diamonds together to achieve very high ultimate pressures (200 kbar). Beyond such pressures, explosive charges can be used to achieve very high pressures for a very short time. The maximum pressures measured are around 1 Mbar.
4.36 ATTAINMENT OF LOW PRESSURES

Low pressures can be achieved in a mechanical or “roughing” pump, by using a fluid (water, mercury, oil) in a rotating-vane technique to adsorb molecules within the fluid when exposed to the container to be evacuated, then expelling these molecules to the laboratory when the vane has brought the fluid into contact with laboratory air. A water pump can reach pressures of 1 Torr. An oil vacuum pump can reach 20 mTorr. A turbomolecular pump can reach pressures of $10^{-10}$ Torr ($10^{-8}$ Pa). A sorption pump can reach pressures of $10^{-2}$ Torr by exposing the system to a porous zeolite cooled to liquid nitrogen temperature with a Dewar flask placed on the outside. To reach lower pressures, a secondary pump is used, such as a diffusion pump or a sublimation pump (both must remain connected to a primary or “roughing” pump). There are two kinds of diffusion pumps. A mercury diffusion pump can reach $10^{-6}$ Torr, but the toxicity of mercury vapor has decreased its use dramatically. A silicone oil diffusion pump can reach $10^{-7}$ Torr. For even lower pressures, a Ti sublimation pump is used: it can reach about $10^{-11}$ Torr. It is usually connected to a sorption primary pump. The ultimate low pressure attained in a laboratory on earth is about $10^{-13}$ Torr.
REFERENCES

1. W. J. Moore, Physical Chemistry, 4th edition, Prentice-Hall, Englewood Cliffs, NJ, 1972.
2. D. R. Lide, ed., CRC Handbook of Chemistry and Physics, 70th edition, CRC Press, Boca Raton, FL, 1989.
3. P. W. Atkins, Physical Chemistry, 6th edition, W. H. Freeman, New York, 1998.
4. G.-J. Su, Modified law of corresponding states, Ind. Eng. Chem. 38:803–806 (1946).
5. M. von Smoluchowski, Bull. Int. Acad. Sci. Cracoviae 184 (1903).
6. R. W. Murray, Nanoelectrochemistry: Metal nanoparticles, nanoelectrodes, and nanopores, Chem. Rev. 108:2688–2720 (2008).
7. J. M. Sturtevant, Temperature measurement, Chapter 1 in Physical Methods of Chemistry. Part 5. Determination of Thermodynamic and Surface Properties, Arnold Weissberger and Bryant W. Rossiter, eds., Wiley-Interscience, New York, 1973, pp. 1–22.
8. G. W. Thomson and D. R. Douslin, Determination of pressure and volume, Chapter 2 in Physical Methods of Chemistry. Part 5. Determination of Thermodynamic and Surface Properties, Arnold Weissberger and Bryant W. Rossiter, eds., Wiley-Interscience, New York, 1973, pp. 23–255.
CHAPTER 5
Statistical Mechanics
Tutto sommato. . . [all things added up. . .]
5.1 INTRODUCTION

The techniques of statistical mechanics, invented by Gibbs1 and Boltzmann,2 permit the evaluation of macroscopic quantities (pressure P, heat capacity $C_V$, entropy S, internal energy U, Helmholtz3 free energy A, Gibbs free energy G, and their partial molar form, the chemical potential $\mu$, etc.) starting from a nanoscopic model of the physics involved (classical mechanics, or, more often, quantum mechanics), continuing with the construction of the relevant partition function, or sum over states, and culminating with the phenomenally important results that macroscopic quantities are partial derivatives, logarithms, and so on, of this partition function.

Sideline. Gibbs, a professor at Yale University in New Haven, Connecticut, published his work in the Transactions of the Connecticut Academy of Science, whose offices were conveniently also in New Haven. Luckily, the University of St. Andrews in Scotland had a subscription to this obscure journal; the great and acclaimed Scottish physicist Maxwell4 read about Gibbs’ work and sent him a letter of praise. Through the Yale secretaries, Gibbs’ students heard about this letter. Gibbs’ lectures were in general unintelligible, and Gibbs was a shy and retiring bachelor, but students asked him whether he had gotten an important letter recently. Gibbs answered “Oh yes, a fellow from Scotland wrote.” Once his
¹ Josiah Willard Gibbs, Jr. (1839–1903).
² Ludwig Boltzmann (1844–1906).
³ Heinrich Ludwig Ferdinand von Helmholtz (1821–1894).
⁴ James Clerk Maxwell (1831–1879).
The Physical Chemist's Toolbox, Robert M. Metzger. © 2012 John Wiley & Sons, Inc. Published 2012 by John Wiley & Sons, Inc.
[Figure 5.1, panels: (A) μCE, microcanonical ensemble, partition function Ω(N, V, U), replicas labeled N, V, U and separated by rigid impermeable adiabatic walls; (B) CE, canonical ensemble, partition function Q(N, V, T), replicas labeled N, V, T and separated by rigid impermeable diathermal walls; (C) GCE, grand canonical ensemble, partition function Ξ(V, T, μ), replicas labeled V, T, μ and separated by rigid permeable diathermal walls; (D) IIE, isothermal-isobaric ensemble, partition function Δ(N, p, T), replicas labeled N, p, T and separated by flexible diathermal walls. Each ensemble as a whole is enclosed by a rigid impermeable adiabatic wall.]
contributions were translated into German and reprinted, Gibbs gained immense respect and a following in the European scientific community.

It is assumed that the evaluation of the relevant physical properties of the macroscopic system, which would involve a direct and painful summation over Avogadro's⁵ number's worth of particles, can be replaced by considering the individual system of interest, replicated ad nauseam, to form an ensemble (French for "together," or "togetherness") of replicas of the real system, and by replacing the direct sum over the particles by a sum over all the replicas, which can often be replaced by an integral, thus simplifying the calculation. What will we sum? A thing called the partition function, or Zustandssumme (sum over states). The Gibbs assumption is that, once the partition function is evaluated and the most probable energy, occupancy of available states, and so on, have been found, these most probable quantities are the values actually observed for the macroscopic system.

Depending on which system one wishes to evaluate, we consider isolated, closed, or open systems, and the four most useful ensembles should be considered (Fig. 5.1). The philosophical underpinning of the method is the principle of equal a priori probability: an isolated system (e.g., with N, V, U fixed for the microcanonical ensemble defined below) is equally likely to be in any of its possible quantum states. Then the ensemble average of any property is the average of that property over all members of the ensemble. The same argument will work for other types of ensembles. The search will be on for the most probable distribution. The validity of this "ergodic hypothesis," or of the related "H-theorem," is the object of much blather, but, by golly, it works!

For instance, the microcanonical ensemble (μCE) (Fig. 5.1(A)) consists of an infinite set of replicas of a system with fixed number of particles N, fixed
⁵ Lorenzo Romano Amedeo Carlo Bernadette Avogadro, conte di Quaregna e Cerreto (1776–1856).
FIGURE 5.1 Four ensembles: (A) microcanonical (μCE); (B) canonical (CE); (C) grand canonical (GCE); (D) isothermal-isobaric (IIE). They are all isolated from the surroundings, but differ among themselves in what can migrate between systems: nothing (μCE), energy (CE), energy and molecules (GCE), or energy and volume (IIE). Inspired by Moore [1].
overall volume V, and fixed total energy U; what must be derived is how the individual system achieves an equilibrium temperature. The walls between the replicas prevent exchange of particles (thus N is fixed, and the walls are impermeable), cannot transfer energy between replicas (thus U is fixed, and the walls are adiabatic), and cannot expand or contract (thus V is fixed, and the walls are rigid). The "natural variable" for the μCE will be the entropy S; other thermodynamic functions will be obtainable by differentiation of the optimized partition function. Similar appropriate arguments are made for the other three ensembles of Fig. 5.1.

Fermion Postulate. A system of identical Fermi⁶–Dirac⁷ (FD) particles ("fermions") 1, 2, 3, ..., N, all of which have the same half-integral spin quantum number (either electron S or nuclear I = 1/2, 3/2, 5/2, etc.), must have a many-particle wavefunction $\psi(1, 2, 3, \ldots, N)$ that is antisymmetric with respect to the interchange of any two of these particles:

$$\psi(1, 2, 3, \ldots, i, \ldots, j, \ldots, N) = -\psi(1, 2, 3, \ldots, j, \ldots, i, \ldots, N)$$

PROBLEM 5.1.1. Given the requirement of overall antisymmetry, show that for a system of identical fermions each fermion must have its unique set of quantum number values, different from the values adopted by any other fermion.

Boson Postulate. A system of identical Bose⁸–Einstein⁹ (BE) particles ("bosons") 1, 2, 3, ..., N, all of which have the same integral spin quantum number (electron S or nuclear I = 0, 1, 2, ...), must have a many-particle wavefunction $\psi(1, 2, 3, \ldots, N)$ that is symmetric with respect to the interchange of any two of these particles:

$$\psi(1, 2, 3, \ldots, i, \ldots, j, \ldots, N) = +\psi(1, 2, 3, \ldots, j, \ldots, i, \ldots, N)$$

PROBLEM 5.1.2. Show that, in a system of identical bosons, any particle can have the same set of quantum number values as any other particle.

PROBLEM 5.1.3. If fermions, such as the neutrons (I = 1/2) that make a hot neutron star, merge to form a massive Bose particle (cold black hole), then Bose–Einstein condensation occurs, the whole star loses energy massively, and a new minimum energy state is reached (cold black hole) (see also Problem 2.12.4). Bose condensation was observed on Earth in 1995 (Cornell¹³ and Wieman¹⁴) in collections of alkali atoms, laser-cooled (Chu,¹⁰ Cohen-Tannoudji,¹¹ and Phillips¹²) in ultra-high vacuum to very low temperatures.
⁶ Enrico Fermi (1901–1954).
⁷ Paul Adrien Maurice Dirac (1902–1984).
⁸ Satyendra Nath Bose (1894–1974).
⁹ Albert Einstein (1879–1955).
¹⁰ Steven Chu (1948– ).
¹¹ Claude Cohen-Tannoudji (1933– ).
¹² William Daniel Phillips (1948– ).
¹³ Eric Allin Cornell (1961– ).
¹⁴ Carl Edwin Wieman (1951– ).
Boltzon Postulate. Maxwell–Boltzmann (MB) statistics predict that all energies are a priori equally likely, and that all particles in the system are physically distinguishable (labeled by some number, or shirt patch, “color”, or whatever, or picked up by “tweezers”). These MB particles can be called boltzons. If, however, we remove this distinguishability, then we have indistinguishable “corrected boltzons (CB)” [2], whose statistics become very roughly comparable to the statistics of fermions or bosons (see Problem 5.3.10 below).
5.2 CB, FD, AND BE DISTRIBUTIONS, AND THE MICROCANONICAL ENSEMBLE

We want to find out all about one system, with a macroscopic number (say, Avogadro's number $N_A$) of constituents. Let us construct a "microcanonical ensemble" (μCE), described above (see Fig. 5.1). Within this μCE, we seek the statistically most likely distributions. Since we assume that the number of particles N and the energy U of the system are limited, we state that the restraints, or constraint equations, are

$$N = \sum_i N_i \tag{5.2.1}$$

$$U = \sum_i N_i u_i \tag{5.2.2}$$
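Before dealing with the ensemble, we shall establish that the numbers of ways $t_{FD}$, $t_{BE}$, and $t_{CB}$ in which a given FD, BE, or CB distribution can be realized are (Problems 5.2.1, 5.2.3, 5.2.6)

$$t_{FD} = \prod_{i=1}^{i=N} \frac{g_i!}{N_i!\,(g_i - N_i)!} \tag{5.2.3}$$

$$t_{BE} = \prod_{i=1}^{i=N} \frac{(g_i + N_i - 1)!}{N_i!\,(g_i - 1)!} \tag{5.2.4}$$

$$t_{CB} = \prod_{i=1}^{i=N} \frac{g_i^{N_i}}{N_i!} \tag{5.2.5}$$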
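As a sanity check on Eqs. (5.2.3) to (5.2.5), the factor contributed by a single level can be counted by brute force for a small case. The following is only a minimal Python sketch: the function names and the choice g = 5, N = 3 are illustrative assumptions, not part of the text.

```python
from itertools import combinations, combinations_with_replacement
from math import comb, factorial

def t_fd(g, n):
    """Single-level factor of Eq. (5.2.3): n fermions in g states, at most one per state."""
    return comb(g, n)

def t_be(g, n):
    """Single-level factor of Eq. (5.2.4): n bosons in g states, no occupancy limit."""
    return comb(g + n - 1, n)

def t_cb(g, n):
    """Single-level factor of Eq. (5.2.5): g**n / n! (need not be an integer)."""
    return g ** n / factorial(n)

g, n = 5, 3  # hypothetical small level: 5 states, 3 particles

# Brute force: list the occupied states explicitly.
brute_fd = sum(1 for _ in combinations(range(g), n))                   # no repeats
brute_be = sum(1 for _ in combinations_with_replacement(range(g), n))  # repeats allowed

print("FD:", t_fd(g, n), "brute force:", brute_fd)
print("BE:", t_be(g, n), "brute force:", brute_be)
print("CB:", t_cb(g, n))
```

For any level with more than one particle, the CB count falls between the FD and BE counts, which is why it can serve as a rough stand-in for both.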
Warning: These distributions are not yet the partition function! That will come later.

PROBLEM 5.2.1. FD statistics. Consider a system of N independent identical fermions, distributed among energy levels $u_1, u_2, u_3, \ldots, u_i, \ldots, u_k, \ldots$; these levels are $g_1, g_2, g_3, \ldots, g_i, \ldots, g_k, \ldots$-fold degenerate, respectively. Assume that there is a distribution of $N_1$ of these fermions in level 1, $N_2$ in level 2, $N_3$ in level 3, ..., $N_i$ in level i, ..., $N_k$ in level k, ..., respectively. Show that we can distribute these N fermions in $t_{FD}$ ways, as given in Eq. (5.2.3).

PROBLEM 5.2.2. Using Stirling's formula, Eq. (2.20.7), show that, for large $N_i$ and $g_i$, $t_{FD}$ becomes

$$\ln t_{FD} \approx \sum_i \left\{ -g_i \ln\left[\frac{g_i - N_i}{g_i}\right] + N_i \ln\left[\frac{g_i - N_i}{N_i}\right] \right\} \tag{5.2.6}$$
PROBLEM 5.2.3. BE statistics. Repeat Problem 5.2.1, but with bosons. Show that we can distribute these N bosons in $t_{BE}$ ways, as given in Eq. (5.2.4).

PROBLEM 5.2.4. By using Stirling's formula, show that, for large $N_i$ and $g_i$, $t_{BE}$ becomes

$$\ln t_{BE} \approx \sum_i \left\{ g_i \ln\left[\frac{g_i + N_i}{g_i}\right] + N_i \ln\left[\frac{g_i + N_i}{N_i}\right] \right\} \tag{5.2.7}$$
PROBLEM 5.2.5. Classical MB statistics. Same as in Problem 5.2.1, but now use macroscopic particles, which are distinguishable. Show that we can distribute these N boltzons in $t_B$ ways, where

$$t_B = N! \prod_{i=1}^{i=N} \frac{g_i^{N_i}}{N_i!} \tag{5.2.8}$$
PROBLEM 5.2.6. CB statistics in μCE. Same as in Problem 5.2.5, but now use macroscopic yet indistinguishable particles, and derive Eq. (5.2.5).

PROBLEM 5.2.7. Dilute systems. If $g_i \gg N_i$ for all i, then the system is dilute, and the boltzons can be replaced by CB, so that $t_{CB} = t_B/N! = \prod_i g_i^{N_i}/N_i!$. Then finally also:

$$t_{FD} \approx t_{BE} \approx t_{CB} = \prod_i \frac{g_i^{N_i}}{N_i!} \tag{5.2.9}$$
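The dilute limit of Eq. (5.2.9) can be illustrated numerically for a single level. This is a hedged sketch: the helper names and the test values (N = 10 particles in g = 20 or g = 10⁶ states) are assumptions chosen for illustration, and `math.lgamma` is used to evaluate logarithms of factorials without overflow.

```python
from math import lgamma, log

def ln_t_fd(g, n):
    """ln of g! / (n! (g - n)!), the single-level FD count."""
    return lgamma(g + 1) - lgamma(n + 1) - lgamma(g - n + 1)

def ln_t_be(g, n):
    """ln of (g + n - 1)! / (n! (g - 1)!), the single-level BE count."""
    return lgamma(g + n) - lgamma(n + 1) - lgamma(g)

def ln_t_cb(g, n):
    """ln of g**n / n!, the single-level CB count."""
    return n * log(g) - lgamma(n + 1)

n = 10
for g in (20, 10**6):  # a crowded level, then a dilute one
    print(f"g = {g:>7}: ln t_FD = {ln_t_fd(g, n):.5f}, "
          f"ln t_CB = {ln_t_cb(g, n):.5f}, ln t_BE = {ln_t_be(g, n):.5f}")
```

When g is only twice N the three logarithms differ by several units, whereas for g much larger than N they agree to many digits.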
Maximum Probability. For any given $t$, or $\ln t$, we do expect $dt = 0$ for the most likely distribution, but we may be tempted to use $dt = \sum_i (\partial t/\partial N_i)\,dN_i = 0$ and then state that $(\partial t/\partial N_i) = 0$ for all i; we cannot do that, because the $N_i$ are further linked by two independent and different constraint equations, Eq. (5.2.1) and Eq. (5.2.2)! So how do we find the "best" $t$? By Lagrange's¹⁵ method of undetermined multipliers we get

$$\frac{\partial(\ln t)}{\partial N_i} + \alpha\,\frac{\partial N}{\partial N_i} - \beta\,\frac{\partial U}{\partial N_i} = 0 \qquad \text{for all } i = 1, \ldots, N \tag{5.2.10}$$
¹⁵ Joseph Louis Lagrange = Giuseppe Lodovico Lagrangia (1736–1813).
using two (so far) undetermined multipliers, $\alpha$ and $\beta$. The most probable $t$ (or equivalently $\ln t$) must satisfy the condition

$$\ln[(g_i \pm N_i)/N_i] + \alpha - \beta u_i = 0 \qquad \{\text{BE: upper sign; FD: lower sign}\}$$

$$\ln[g_i/N_i] + \alpha - \beta u_i = 0 \qquad \{\text{CB}\}$$

where $N_i$ is the actual occupancy of energy level i, $u_i$ is the internal energy of that level, and $g_i$ is the degeneracy of that level (for fermions, its maximum possible occupancy). Rearranging, we obtain

$$N_i = g_i/[\exp(-\alpha)\exp(\beta u_i) + 1] \qquad \{\text{FD}\} \tag{5.2.11}$$

$$N_i = g_i/[\exp(-\alpha)\exp(\beta u_i) - 1] \qquad \{\text{BE}\} \tag{5.2.12}$$

$$N_i = g_i \exp(\alpha)\exp(-\beta u_i) \qquad \{\text{CB}\} \tag{5.2.13}$$

We are now ready to speak about the microcanonical ensemble, using CB as an example. The partition function for the μCE is defined as

$$\Omega(N, V, U) \equiv \sum_{i=1}^{n} g_i \exp\left(-\frac{u_i}{k_B T}\right) \tag{5.2.14}$$

This function is certainly valid for CB:

$$\exp(\alpha) = N/\Omega \qquad \{\text{CB}\} \tag{5.2.15}$$

$$N_i = (g_i N/\Omega)\exp(-\beta u_i) \qquad \{\text{CB}\} \tag{5.2.16}$$

This partition function is thus a sum of "Arrhenius¹⁶ factors," weighted by their degeneracies, summed over all possible states! It can be a very large quantity (as is Avogadro's number), but its logarithm is "reasonably" sized. It can be shown (most easily if the system is a perfect gas, Problem 5.2.8) that

$$\beta = 1/k_B T \qquad \{\text{CB}\} \tag{5.2.17}$$

In pedestrian terms, absolute temperature seems to creep out of a Lagrange multiplier! The other undetermined multiplier $\alpha$ can be evaluated when the dependence of the energy $u_i$ on the quantum numbers i has been established (this calculation is different for every physical problem).
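The three occupancy laws, Eqs. (5.2.11) to (5.2.13), differ only in the term added to the exponential, so they can be compared in a few lines. The sketch below is illustrative: the function name `occupancy` and the sample values of $\alpha$, $\beta$, and $u$ are assumptions, in reduced units where $\beta u$ and $\alpha$ are pure numbers.

```python
import math

def occupancy(u, alpha, beta, kind):
    """Mean occupancy per state, N_i / g_i, from Eqs. (5.2.11)-(5.2.13)."""
    x = beta * u - alpha  # exp(-alpha) * exp(beta * u) equals exp(x)
    if kind == "FD":
        return 1.0 / (math.exp(x) + 1.0)
    if kind == "BE":
        return 1.0 / (math.exp(x) - 1.0)  # requires x > 0
    if kind == "CB":
        return math.exp(-x)
    raise ValueError("kind must be 'FD', 'BE', or 'CB'")

alpha, beta = 0.0, 1.0  # hypothetical multipliers in reduced units
for u in (0.5, 2.0, 10.0):
    row = ", ".join(f"{k}: {occupancy(u, alpha, beta, k):.6g}" for k in ("FD", "CB", "BE"))
    print(f"beta*u - alpha = {beta * u - alpha:5.1f} -> {row}")
```

For large $\beta u_i - \alpha$ (sparsely occupied levels) the three results coincide, which is the dilute limit of Problem 5.2.7; they separate only when the occupancy per state approaches unity.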
¹⁶ Svante August Arrhenius (1859–1927).

We did not yet associate a partition function with FD or BE; the reasons will be apparent at the end of Section 5.3.

PROBLEM 5.2.8. Prove Eq. (5.2.17), assuming that a perfect gas ($PV = nRT = N k_B T$) is treated as a μCE of CB with quantum-mechanical particle-in-a-box energies.

PROBLEM 5.2.9. Assuming $\beta \equiv 1/k_B T$, link $(d/dT)$ with $(d/d\beta)$.
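The link requested in Problem 5.2.9 follows from the chain rule; a short worked sketch, using only the definition of $\beta$ given above:

```latex
\beta \equiv \frac{1}{k_B T}
\quad\Longrightarrow\quad
\frac{d\beta}{dT} = -\frac{1}{k_B T^{2}} = -k_B \beta^{2},
\qquad
\frac{d}{dT} = \frac{d\beta}{dT}\,\frac{d}{d\beta}
             = -\frac{1}{k_B T^{2}}\,\frac{d}{d\beta},
\qquad
\frac{d}{d\beta} = -k_B T^{2}\,\frac{d}{dT}.
```

Any temperature derivative of the logarithm of a partition function can therefore be traded for a derivative with respect to $\beta$, at the cost of a factor $-k_B T^2$.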
PROBLEM 5.2.10. Estimate the frequency, wavenumber, and energy of a photon equivalent to T = 298.15 K.

The microcanonical partition function $\Omega(N, V, U)$ is then finally

$$\Omega(N, V, U) = \sum_{i=1}^{i=n} g_i \exp\left(-\frac{u_i}{k_B T}\right) = \sum_U \exp\left(-\frac{U}{k_B T}\right) \tag{5.2.18}$$

where the sum is either over the energy levels (i = 1, 2, ..., n), including their degeneracies $g_i$, or over the energies themselves U, counted singly (the difference is purely formal). This microcanonical partition function $\Omega(N, V, U)$ was also called the thermodynamic probability function W in Eq. (4.5.1). $\Omega$ is not too easy to calculate in the general case [3]. The "natural" thermodynamic function obtained from the microcanonical ensemble is

$$S = k_B \ln \Omega(N, V, U) \tag{5.2.19}$$
The consequence of Eq. (5.2.19) is that if T → 0 K, as all systems become solids (except for the "quantum liquid" or superfluid ⁴He at 1 atm hydrostatic pressure), and if in the resulting solids there is perfect order (except for the quantum-mechanically mandated zero-point vibration for molecules), then $\Omega \to 1$ and $S \to 0$. This, again, is the Third Law of Thermodynamics.

Sideline. In Vienna's Zentralfriedhof, Ludwig Boltzmann (who died a suicide after a lifelong battle with depression) is commemorated with a cenotaph with the inscription "S = k ln W"; his suicide probably was not related to the slow acceptance of his discoveries.

The partition function $\Omega$ can be as large as the upper index n in Eq. (5.2.18); this could be of the order of magnitude of Avogadro's number, at the high-temperature limit $u_j \ll k_B T$, where all the terms of Eq. (5.2.18) individually reach exp(0) = 1. At the opposite extreme, the partition function $\Omega$ can be very small, for example at the low-temperature or high-energy limit $u_j \gg k_B T$, where each $\exp(-x)$ after the first term approaches zero. The differential form of S is

$$dS = -\frac{\mu}{T}\,dN + \frac{P}{T}\,dV + \frac{1}{T}\,dU \tag{5.2.20}$$
and its coefficients are, by the definition of a 3-form, the thermodynamic functions $\mu$, $P$, and $T$, which can therefore be calculated as follows:

$$\mu = -k_B T \left(\frac{\partial \ln \Omega}{\partial N}\right)_{V,U} \tag{5.2.21}$$

$$P = k_B T \left(\frac{\partial \ln \Omega}{\partial V}\right)_{N,U} \tag{5.2.22}$$

$$1 = k_B T \left(\frac{\partial \ln \Omega}{\partial U}\right)_{N,V} \tag{5.2.23}$$
The overall strategy is to calculate $\Omega$ for the system at hand, using sums, integrals, approximations, and so on, and then obtain measurable results for $S$, $\mu$, and $P$ from the appropriate logarithm or derivative of $\Omega$.

PROBLEM 5.2.11. Using the microcanonical partition function, Eq. (5.2.18), derive the classical MB distribution of molecular speeds n(v) for an ideal gas (Fig. 5.2):

$$n(v)\,dv = 4\pi v^2 \left(\frac{m}{2\pi k_B T}\right)^{3/2} \exp\left(-\frac{m v^2}{2 k_B T}\right) dv \tag{5.2.24}$$
FIGURE 5.2 The Maxwell–Boltzmann (MB) distribution of molecular speeds for nitrogen molecules in the gas phase, $n(v) = 4\pi v^2 [0.028/(6.022\times10^{23}\cdot 2\pi\cdot 1.381\times10^{-23}\,T)]^{3/2} \exp[-(0.028/(2\cdot 6.022\times10^{23}\cdot 1.381\times10^{-23}\,T))\,v^2]$, at T = 273, 1273, and 2273 K. [Plot: n(v), from 0 to 0.0025, versus speed v, from 0 to 3000 m/s; the curve broadens and its peak shifts to higher speed as T increases.]
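The curves of Fig. 5.2 can be regenerated from Eq. (5.2.24). The following minimal Python sketch assumes a molar mass of 0.028 kg/mol for nitrogen, as in the figure caption; the function names, the integration grid, and the cutoff speed are illustrative choices.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
N_A = 6.02214076e23  # Avogadro's number, 1/mol

def mb_speed_density(v, T, molar_mass=0.028):
    """n(v) of Eq. (5.2.24) in s/m, for speed v in m/s and temperature T in K."""
    m = molar_mass / N_A          # mass of one molecule, kg
    a = m / (2.0 * K_B * T)       # exponent coefficient, s^2/m^2
    return 4.0 * math.pi * v * v * (a / math.pi) ** 1.5 * math.exp(-a * v * v)

def most_probable_speed(T, molar_mass=0.028):
    """Speed at which n(v) peaks: sqrt(2 k_B T / m)."""
    return math.sqrt(2.0 * K_B * T * N_A / molar_mass)

def area_under_curve(T, v_max=6000.0, dv=1.0):
    """Trapezoidal integral of n(v) from 0 to v_max; should be close to 1."""
    steps = int(v_max / dv)
    total = 0.5 * (mb_speed_density(0.0, T) + mb_speed_density(v_max, T))
    total += sum(mb_speed_density(i * dv, T) for i in range(1, steps))
    return total * dv

for T in (273.0, 1273.0, 2273.0):
    vp = most_probable_speed(T)
    print(f"T = {T:6.0f} K: v_p = {vp:7.1f} m/s, "
          f"n(v_p) = {mb_speed_density(vp, T):.5f}, area = {area_under_curve(T):.6f}")
```

The normalization check (area close to 1) is a quick way to catch a dropped factor of T or of 2 in the prefactor or exponent.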
5.3 CANONICAL, GRAND CANONICAL, AND ISOTHERMAL–ISOBARIC ENSEMBLES [2, 3]

Similar results can be obtained for the other three ensembles of Fig. 5.1: canonical (CE), grand canonical (GCE), and isothermal–isobaric ensembles (IIE). We collect all the results and add a few lines about a fifth ensemble, the generalized ensemble (GE).

1. Microcanonical ensemble: μCE (each system has constant N, V, and U; the walls between systems are rigid, impermeable, and adiabatic; each system keeps its number of particles, volume, and energy, and it trades nothing with neighboring systems). The relevant partition function is the microcanonical partition function $\Omega(N, V, U)$:

$$\Omega(N, V, U) = \sum_U \exp(-U/k_B T) \tag{5.2.18}$$

$$S = k_B \ln \Omega(N, V, U) \tag{5.2.19}$$

$$dS = (1/T)\,dU + (P/T)\,dV - (\mu/T)\,dN \tag{5.2.20}$$

$$\mu = -k_B T\,(\partial \ln \Omega/\partial N)_{V,U} \tag{5.2.21}$$

$$P = k_B T\,(\partial \ln \Omega/\partial V)_{U,N} \tag{5.2.22}$$

$$1 = k_B T\,(\partial \ln \Omega/\partial U)_{V,N} \tag{5.2.23}$$
The μCE is good for discussing isolated systems.

2. Canonical ensemble: CE (each system has constant N, V, and T; the walls between systems are rigid, impermeable, and diathermal; each system keeps its number of particles, volume, and temperature, but it can trade energy, and only energy, with neighboring systems). The relevant partition function is the canonical partition function $Q(V, T, N)$:

$$Q(V, T, N) = \sum_j \exp[-U_j(N, V)/k_B T] \tag{5.3.1}$$

$$\phantom{Q(V, T, N)} = \sum_U \Omega(N, V, U)\exp[-U(N, V)/k_B T] \tag{5.3.2}$$

$$A = -k_B T \ln Q \tag{5.3.3}$$

$$dA = -S\,dT - P\,dV + \mu\,dN \tag{5.3.4}$$

$$S = k_B \ln Q + k_B T\,(\partial \ln Q/\partial T)_{V,N} \tag{5.3.5}$$

$$P = k_B T\,(\partial \ln Q/\partial V)_{T,N} \tag{5.3.6}$$

$$\mu = -k_B T\,(\partial \ln Q/\partial N)_{V,T} \tag{5.3.7}$$

$$U = k_B T^2\,(\partial \ln Q/\partial T)_{V,N} \tag{5.3.8}$$
The CE is useful for systems that must reach thermal equilibrium with their neighbors.

3. Grand canonical ensemble: GCE (each system has constant V, T, and $\mu$; the walls between systems are rigid, but permeable and diathermal; each system keeps its volume, temperature, and chemical potential, but can trade both energy and particles with neighboring systems). The relevant partition function is the grand canonical partition function $\Xi(V, T, \mu)$:

$$\Xi(V, T, \mu) = \sum_N \sum_j \exp[-U_{Nj}(V)/k_B T]\exp(N\mu/k_B T) \tag{5.3.9}$$

$$\phantom{\Xi(V, T, \mu)} = \sum_N Q(V, T, N)\exp(N\mu/k_B T) \tag{5.3.10}$$

$$PV = k_B T \ln \Xi(V, T, \mu) \tag{5.3.11}$$

$$d(PV) = S\,dT + N\,d\mu + P\,dV \tag{5.3.12}$$

$$S = k_B \ln \Xi + k_B T\,(\partial \ln \Xi/\partial T)_{V,\mu} \tag{5.3.13}$$

$$N = k_B T\,(\partial \ln \Xi/\partial \mu)_{V,T} \tag{5.3.14}$$

$$P = k_B T\,(\partial \ln \Xi/\partial V)_{T,\mu} = k_B T\,(\ln \Xi/V) \tag{5.3.15}$$
The GCE is useful for chemical reactions, where the number of particles of reactants decreases, and the number of particles of products increases, over time.

4. Isothermal–isobaric ensemble: IIE (each system has constant P, T, and N; the walls between systems are flexible and diathermal; each system keeps its number of particles, pressure, and temperature, but can trade both volume and energy with neighboring systems). The relevant partition function is the isothermal–isobaric partition function $\Delta(P, T, N)$:

$$\Delta(P, T, N) = \sum_U \sum_V \Omega(N, V, U)\exp[-U/k_B T]\exp(-PV/k_B T) \tag{5.3.16}$$

$$G = -k_B T \ln \Delta \tag{5.3.17}$$

$$dG = -S\,dT + V\,dP + \mu\,dN \tag{5.3.18}$$

$$S = k_B \ln \Delta + k_B T\,(\partial \ln \Delta/\partial T)_{N,P} \tag{5.3.19}$$

$$V = -k_B T\,(\partial \ln \Delta/\partial P)_{N,T} \tag{5.3.20}$$

$$\mu = -k_B T\,(\partial \ln \Delta/\partial N)_{T,P} \tag{5.3.21}$$
The IIE is an alternate vehicle for studying chemical reactions, with the useful external parameters P and T kept fixed.

5. Generalized ensemble: GE (each system has constant P, T, and $\mu$; the walls between systems are flexible, porous, and diathermal; each system can trade particles, energy, volume, and entropy with neighboring systems). The relevant partition function is the generalized partition function $Y(P, T, \mu)$:

$$Y(P, T, \mu) = \sum_N \sum_j \exp[(\mu N - PV - U_j)/k_B T] \tag{5.3.22}$$

$$0 = k_B T \ln Y \tag{5.3.23}$$

This ensemble, seemingly so "general," has not seen general use. Other ensembles can be invented, with other variables held constant, but they have not been used very much either. These five ensembles are summarized in Table 5.1.

PROBLEM 5.3.1. Derive an expression for A in the μCE.
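To see the canonical-ensemble relations at work, Eq. (5.3.8) can be checked numerically against a direct Boltzmann average for a toy system. The sketch below makes several assumptions that are not in the text: a single particle with two nondegenerate levels (energies 0 and 1), reduced units with $k_B = 1$, and a central finite difference for the temperature derivative.

```python
import math

def ln_q(T, levels):
    """ln of the canonical sum over (energy, degeneracy) pairs, with k_B = 1."""
    return math.log(sum(g * math.exp(-u / T) for u, g in levels))

def energy_from_derivative(T, levels, h=1e-5):
    """U = k_B T^2 (d ln Q / dT), Eq. (5.3.8), via a central finite difference."""
    return T * T * (ln_q(T + h, levels) - ln_q(T - h, levels)) / (2.0 * h)

def energy_direct(T, levels):
    """U as the Boltzmann-weighted average of the level energies."""
    weights = [(u, g * math.exp(-u / T)) for u, g in levels]
    q = sum(w for _, w in weights)
    return sum(u * w for u, w in weights) / q

levels = [(0.0, 1), (1.0, 1)]  # hypothetical two-level system
for T in (0.2, 1.0, 5.0):
    print(f"T = {T:3.1f}: U(derivative) = {energy_from_derivative(T, levels):.6f}, "
          f"U(direct) = {energy_direct(T, levels):.6f}")
```

The same pattern (build ln Q, then differentiate) applies to the other relations of the list, for example the pressure from the volume derivative once the levels depend on V.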
PROBLEM 5.3.2. Derive the following integral [3]:

$$I_0 = \int_{r=0}^{r=\infty} \exp(-a r^2)\,dr = \frac{1}{2}\sqrt{\frac{\pi}{a}} \tag{5.3.24}$$

Note that the same integral with a finite upper limit R is related to the error function: $\int_{r=0}^{r=R} \exp(-a r^2)\,dr = \frac{1}{2}\sqrt{\pi/a}\;\mathrm{erf}(\sqrt{a}\,R)$.

PROBLEM 5.3.3. Derive the following integral [3]:

$$I_1 = \int_{r=0}^{r=\infty} \exp(-a r^2)\,r\,dr = \frac{1}{2a} \tag{5.3.25}$$

PROBLEM 5.3.4. Derive the following integral [3]:

$$I_{2N-1} = \int_{r=0}^{r=\infty} \exp(-a r^2)\,r^{2N-1}\,dr = \frac{(N-1)!}{2a^N} \tag{5.3.26}$$

PROBLEM 5.3.5. Derive the following useful integral [3]:

$$I_{2N} = \int_{r=0}^{r=\infty} \exp(-a r^2)\,r^{2N}\,dr = \frac{(2N)!}{2^{2N+1}\,N!\,a^N}\sqrt{\frac{\pi}{a}} \tag{5.3.27}$$

Table 5.1 Summary of Five Ensembles [4]

| Name of Ensemble | Independent Variables | Types of Contact with Next System | Partition Function | Fundamental Thermodynamic Equation |
|---|---|---|---|---|
| Microcanonical (μCE) | N, V, U | None | $\Omega = \sum_j \exp[-U_j/k_B T]$ | $S = k_B \ln \Omega$ |
| Canonical (CE) | N, V, T | Thermal | $Q = \sum_j \exp[-U_j(N, V)/k_B T]$ | $A = -k_B T \ln Q$ |
| Grand canonical (GCE) | V, T, μ | Material, thermal | $\Xi = \sum_N \sum_j \exp[(\mu N - U_j)/k_B T]$ | $PV = k_B T \ln \Xi$ |
| Isothermal–isobaric (IIE) | N, T, P | Mechanical, thermal | $\Delta = \sum_V \sum_j \exp[-(U_j + PV)/k_B T]$ | $G = -k_B T \ln \Delta$ |
| Generalized (GE) | P, T, μ | Mechanical, material, thermal | $Y = \sum_N \sum_j \exp[(\mu N - PV - U_j)/k_B T]$ | $0 = k_B T \ln Y$ |
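The Gaussian integrals of Problems 5.3.2 to 5.3.5 recur throughout the chapter, so a numerical cross-check of the closed forms is useful. This is a hedged sketch: the function names, the midpoint rule, the cutoff r_max = 20, and the sample value a = 1.3 are all illustrative assumptions.

```python
import math

def gaussian_moment_closed(n, a):
    """Closed form of the integral of r**n * exp(-a r**2) from 0 to infinity."""
    if n % 2 == 1:                      # odd power: n = 2N - 1
        big_n = (n + 1) // 2
        return math.factorial(big_n - 1) / (2.0 * a ** big_n)
    big_n = n // 2                      # even power: n = 2N
    prefactor = math.factorial(2 * big_n) / (
        2 ** (2 * big_n + 1) * math.factorial(big_n) * a ** big_n)
    return prefactor * math.sqrt(math.pi / a)

def gaussian_moment_numeric(n, a, r_max=20.0, steps=200_000):
    """Midpoint-rule estimate of the same integral, truncated at r_max."""
    h = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        total += r ** n * math.exp(-a * r * r)
    return total * h

a = 1.3  # hypothetical exponent coefficient
for n in range(6):
    print(f"n = {n}: closed = {gaussian_moment_closed(n, a):.8f}, "
          f"numeric = {gaussian_moment_numeric(n, a):.8f}")
```

The cutoff is harmless here because the integrand at r = 20 is smaller than machine precision for any a of order 1.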
PROBLEM 5.3.6. Show that the volume of an N-dimensional sphere of radius a is [3]

$$V_N = \frac{\pi^{N/2}\,a^N}{\Gamma\!\left(\frac{N}{2} + 1\right)}$$

PROBLEM 5.3.7. Show that the partition function in any ensemble can be transformed into a product of "sub"-partition functions if the energies involved are additive:

$$E = E_1 + E_2 + \cdots + E_N \tag{5.3.28}$$
PROBLEM 5.3.8. For the canonical ensemble for distinguishable noninteracting molecules show that

$$Q(N, V, T) = [q_{trans}(V, T)\,q_{el}(V, T)\,q_{vib}(V, T)\,q_{rot}(V, T)]^N = [q(V, T)]^N$$

while for indistinguishable noninteracting molecules {CB} show that

$$Q(N, V, T) = [q(V, T)]^N/N! \tag{5.3.30}$$

PROBLEM 5.3.9. Show that in a CE the translational partition function for a single molecule in a volume V is

$$q_{trans} = V\,(2\pi m k_B T/h^2)^{3/2} \tag{5.3.31}$$

Estimate a typical size for $q_{trans}$.

PROBLEM 5.3.10. Show that in a CE the molecular harmonic (Hooke's law) vibrational partition function is

$$q_{vib} = [1 - \exp(-h\nu_0/k_B T)]^{-1} \tag{5.3.32}$$

Estimate a typical size for $q_{vib}$.

PROBLEM 5.3.11. Show that in a CE, for a nonsymmetrical linear molecule, the single-molecule rotational partition function is

$$q_{rot} = (k_B T/hcB) \tag{5.3.33}$$

where $B \equiv h/8\pi^2 I_e c$ and $I_e \equiv$ moment of inertia $= \sum_i m_i r_i^2$. Estimate a typical size for $q_{rot}$.
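The "typical sizes" asked for in Problems 5.3.9 to 5.3.11 can be estimated in a few lines. The sketch below uses carbon monoxide as a nonsymmetrical linear molecule; the molecular constants (molar mass 0.02801 kg/mol, vibrational wavenumber about 2170 cm⁻¹, rotational constant about 1.93 cm⁻¹), the volume of 1 L, and the temperature of 298.15 K are illustrative assumptions with approximate values, not data from the text.

```python
import math

H = 6.62607015e-34    # Planck constant, J s
K_B = 1.380649e-23    # Boltzmann constant, J/K
C_CM = 2.99792458e10  # speed of light, cm/s (pairs with wavenumbers in 1/cm)
N_A = 6.02214076e23   # Avogadro's number, 1/mol

def q_trans(volume, T, molar_mass):
    """Eq. (5.3.31): translational partition function, volume in m^3."""
    m = molar_mass / N_A
    return volume * (2.0 * math.pi * m * K_B * T / H ** 2) ** 1.5

def q_vib(T, wavenumber):
    """Eq. (5.3.32), with h*nu_0 written as h*c*wavenumber (wavenumber in 1/cm)."""
    return 1.0 / (1.0 - math.exp(-H * C_CM * wavenumber / (K_B * T)))

def q_rot(T, rot_const):
    """Eq. (5.3.33): k_B T / (h c B), with B in 1/cm."""
    return K_B * T / (H * C_CM * rot_const)

T = 298.15  # K
print(f"q_trans (1 L) = {q_trans(1.0e-3, T, 0.02801):.3e}")
print(f"q_vib         = {q_vib(T, 2170.0):.6f}")
print(f"q_rot         = {q_rot(T, 1.93):.1f}")
```

The orders of magnitude are the point: translation contributes an astronomically large number of accessible states, rotation of a small molecule about a hundred, and a stiff vibration barely more than one at room temperature.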
Rotation–vibration interactions, if present, make the calculation more difficult, because then the vibration and rotation partition functions are coupled and cannot be separate factors. For a nonlinear polyatomic molecule the rotation partition function becomes

$$q_{rot} = (\pi^{1/2}/\sigma)\,(8\pi^2 I_x k_B T/h^2)^{1/2}\,(8\pi^2 I_y k_B T/h^2)^{1/2}\,(8\pi^2 I_z k_B T/h^2)^{1/2} \tag{5.3.34}$$

where $\sigma$ = symmetry number (a small integer), and $I_x$, $I_y$, $I_z$ are the principal moments of inertia in an appropriate principal-axis system (x, y, z). The symmetry number is $\sigma$ = 1 for HD, $\sigma$ = 2 for H₂, $\sigma$ = 3 for CHCl₃, and $\sigma$ = 12 for CH₄.

PROBLEM 5.3.12. Show that in a CE, in the absence of degeneracy, the single-molecule electronic partition function is

$$q_{el} \approx 1 + \exp(-E_{el}/k_B T) \tag{5.3.35}$$
Give a typical size for $q_{el}$.

PROBLEM 5.3.13. Show that for a monoatomic ideal gas in the μCE the partition function is

$$\Omega(N, V, U) = [\Gamma(N+1)\,\Gamma(3N/2)]^{-1}\,(2\pi m a^2/h^2)^{3N/2}\,U^{(3N/2 - 1)}$$

where $\Gamma$ is the gamma function, m is the mass of the gas atom, and a is the macroscopic size of the box in which the gas of N molecules resides [3].

PROBLEM 5.3.14. Derive an expression for H in the CE.

PROBLEM 5.3.15. Derive an expression for H, G, and U in the GCE.

PROBLEM 5.3.16. Show that the CE partition function for an ideal monoatomic gas is

$$Q(N, V, T) = [\Gamma(N+1)]^{-1}\,(2\pi m k_B T/h^2)^{3N/2}\,V^N$$

PROBLEM 5.3.17. Derive $a_{Nj} = \exp(-\alpha)\exp[-\beta E_{Nj}(V)]\exp(\gamma N)$ for the GCE [3].

Now we can see how to include FD or BE statistics into partition functions. First, consider a system of FD or BE particles within a CE [5]:

$$Q(N, V, T) = \sum_j g_j \exp(-U_j/k_B T) \tag{5.3.36}$$
which, of course, is subject to the two conditions

$$U_j = \sum_k n_k \varepsilon_k \tag{5.3.37}$$

$$N = \sum_k n_k \tag{5.3.38}$$

Of these, the first, Eq. (5.3.37), is easy to apply, but the second, Eq. (5.3.38), must incorporate the FD or BE conditions. This places a restriction, indicated by an asterisk, on the sum:

$$Q(N, V, T) = \sum_{\{n_k\}}^{*} \exp\Big(-\sum_j n_j \varepsilon_j/k_B T\Big) \tag{5.3.39}$$

This restriction is, alas, difficult to apply within the CE. It is much easier to use instead the grand canonical ensemble and write

$$\Xi(V, T, \mu) = \sum_{N=0}^{\infty} \exp(\mu N/k_B T)\,Q(N, V, T)$$

$$\phantom{\Xi(V, T, \mu)} = \sum_{N=0}^{\infty} [\exp(\mu/k_B T)]^N \sum_{\{n_k\}}^{*} \exp\Big(-\sum_j n_j \varepsilon_j/k_B T\Big)$$

$$\phantom{\Xi(V, T, \mu)} = \sum_{N=0}^{\infty} \sum_{\{n_k\}}^{*} \prod_k \{\exp[(\mu - \varepsilon_k)/k_B T]\}^{n_k} \tag{5.3.40}$$
Now, since the sum is over all possible values of N, each $n_k$ ranges over all possible and allowed values, so the above sum can be rewritten as

$$\Xi(V, T, \mu) = \sum_{n_1=0}^{n_1^{max}} \sum_{n_2=0}^{n_2^{max}} \cdots \prod_k \{\exp[(\mu - \varepsilon_k)/k_B T]\}^{n_k}$$

which simplifies to

$$\Xi(V, T, \mu) = \prod_k \sum_{n_k=0}^{n_k^{max}} \{\exp[(\mu - \varepsilon_k)/k_B T]\}^{n_k} \tag{5.3.41}$$

For FD, $n_k$ is either 0 or 1, so $n_k^{max} = 1$, and therefore

$$\Xi_{FD}(V, T, \mu) = \prod_k \{1 + \exp[(\mu - \varepsilon_k)/k_B T]\} \tag{5.3.42}$$

For BE, $n_k$ ranges from 0 to $\infty$, whence $n_k^{max} = \infty$, and we can sum a geometric series exactly, to obtain

$$\Xi_{BE}(V, T, \mu) = \prod_k \{1 - \exp[(\mu - \varepsilon_k)/k_B T]\}^{-1} \tag{5.3.43}$$

The final result is (upper sign for FD, lower sign for BE)

$$\Xi_{FD/BE}(V, T, \mu) = \prod_k \{1 \pm \exp[(\mu - \varepsilon_k)/k_B T]\}^{\pm 1} \tag{5.3.44}$$
PROBLEM 5.3.18. Show that the average number of particles is given by

$$\langle N \rangle_{FD/BE} = \sum_k \exp[(\mu - \varepsilon_k)/k_B T]/\{1 \pm \exp[(\mu - \varepsilon_k)/k_B T]\} \tag{5.3.45}$$

and that the average number of particles in the kth quantum state is

$$\langle n_k \rangle_{FD/BE} = \exp[(\mu - \varepsilon_k)/k_B T]/\{1 \pm \exp[(\mu - \varepsilon_k)/k_B T]\} \tag{5.3.46}$$

PROBLEM 5.3.19. Show that the average energy is given by

$$\langle E \rangle_{FD/BE} = \sum_k \varepsilon_k \exp[(\mu - \varepsilon_k)/k_B T]/\{1 \pm \exp[(\mu - \varepsilon_k)/k_B T]\} \tag{5.3.47}$$

PROBLEM 5.3.20. Use Eq. (5.3.15) and get

$$PV = \pm k_B T \sum_k \ln\{1 \pm \exp[(\mu - \varepsilon_k)/k_B T]\} \tag{5.3.48}$$

Equations (5.3.44) through (5.3.48) are the fundamental formulas of Fermi–Dirac and Bose–Einstein statistics.
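The relation between Eq. (5.3.44) and Eq. (5.3.45) can be verified numerically: differentiating $\ln \Xi$ with respect to $\mu$, as in Eq. (5.3.14), should reproduce the sum of the single-state occupancies. The sketch below is illustrative only; the level energies, the value of $\mu$, the reduced units ($k_B T = 1$), and the finite-difference step are assumptions.

```python
import math

def ln_xi(mu, levels, kT, sign):
    """ln of Eq. (5.3.44); sign = +1 for FD, -1 for BE (BE needs mu below every level)."""
    return sign * sum(math.log(1.0 + sign * math.exp((mu - e) / kT)) for e in levels)

def mean_n_closed(mu, levels, kT, sign):
    """Eq. (5.3.45): sum of the single-state occupancies."""
    total = 0.0
    for e in levels:
        x = math.exp((mu - e) / kT)
        total += x / (1.0 + sign * x)
    return total

def mean_n_from_derivative(mu, levels, kT, sign, h=1e-5):
    """Eq. (5.3.14): N = k_B T (d ln Xi / d mu), by central finite difference."""
    return kT * (ln_xi(mu + h, levels, kT, sign) - ln_xi(mu - h, levels, kT, sign)) / (2.0 * h)

levels = [0.5, 1.0, 1.5, 2.0, 3.0]  # hypothetical single-particle energies
mu, kT = -0.5, 1.0
for name, sign in (("FD", +1), ("BE", -1)):
    print(f"{name}: <N> closed = {mean_n_closed(mu, levels, kT, sign):.6f}, "
          f"from ln Xi = {mean_n_from_derivative(mu, levels, kT, sign):.6f}")
```

For the same levels and chemical potential, the BE total is larger than the FD total, reflecting the tendency of bosons to share states.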
5.4 LINK BETWEEN THE PARTITION FUNCTIONS AND SOME OTHER THERMODYNAMIC FUNCTIONS

Each ensemble has "natural" thermodynamic variables, but the usual relationships between macroscopic thermodynamic functions will allow us to obtain the "other" thermodynamic state functions. In the μCE, the "natural" thermodynamic variable is $S = k_B \ln \Omega$, and we found explicit expressions in Section 5.3 for $\mu$, $P$, and $T$. Can we also find some new expressions for U, H, or A? This is addressed in Problem 5.4.1.

PROBLEM 5.4.1.
(a) Given $U = \sum_j N_j U_j = (N/\Omega)\sum_j U_j \exp(-U_j/k_B T)$, show that $U = -(N/\Omega)\,[\partial \Omega/\partial(1/k_B T)]_{U,V}$
(b) Show $H = -(N/\Omega)\,[\partial \Omega/\partial(1/k_B T)]_{U,V} + V k_B T\,[\partial \ln \Omega/\partial V]_{N,U}$
(c) Prove $A = -(N/\Omega)\,[\partial \Omega/\partial(1/k_B T)]_{U,V} - k_B T \ln \Omega$

Next, in the CE, the "natural" thermodynamic variable is $A = -k_B T \ln Q$, and expressions for $S$, $P$, $\mu$, and $U$ were found explicitly in Section 5.3. Can we find new expressions for the "other" thermodynamic functions of state, such as H and V?
PROBLEM 5.4.2. (a) Find V for CE. (b) Find H for CE.

In the GCE, the "natural" thermodynamic variable is $PV = k_B T \ln \Xi$, and expressions for S, N, and P were found explicitly in Section 5.3. Can we find new expressions for the "other" thermodynamic functions of state, such as U, A, and G?

PROBLEM 5.4.3. (a) Find U for GCE. (b) Find A for GCE. (c) Find G for GCE.
5.5 HEAT CAPACITIES

We want to calculate the heat capacity at constant volume, defined by

$$C_V \equiv (\partial U/\partial T)_V \tag{4.8.3}$$

even though the heat capacity at constant pressure:

$$C_P \equiv (\partial H/\partial T)_P \tag{4.8.4}$$

is easier to measure directly. We remember from thermodynamics that

$$C_P - C_V = [P + (\partial U/\partial V)_T]\,(\partial V/\partial T)_P \tag{5.5.1}$$

where $(\partial U/\partial V)_T$ is the "internal pressure." Furthermore, $C_P$ and $C_V$ are connected exactly by

$$C_P - C_V = \alpha^2 V T/\kappa_T \tag{5.5.2}$$

where $\alpha$ is the volume coefficient of thermal expansivity:

$$\alpha \equiv (1/V)\,(\partial V/\partial T)_P \tag{4.8.1}$$

and $\kappa_T$ is the isothermal compressibility

$$\kappa_T \equiv -(1/V)\,(\partial V/\partial P)_T \tag{4.8.2}$$

For a perfect gas ($PV = N k_B T$) things are super-easy, because $(\partial U/\partial V)_T = 0$ and, for one mole,

$$C_P - C_V = N k_B = R \tag{5.5.3}$$
There is also a semiempirical Grüneisen¹⁷ relationship:

$$\alpha \approx 2.0\,\kappa_T C_V/V \tag{5.5.4}$$

Using it, we obtain trivially a funny approximate dimensionless equation:

$$(C_P - C_V)/C_V \approx 4\,\kappa_T C_V T/V \tag{5.5.5}$$
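The step from the Grüneisen relationship to Eq. (5.5.5) is a one-line substitution of Eq. (5.5.4) into Eq. (5.5.2); spelled out as a short worked sketch:

```latex
C_P - C_V = \frac{\alpha^{2} V T}{\kappa_T}
\approx \frac{(2.0\,\kappa_T C_V / V)^{2}\, V T}{\kappa_T}
= \frac{4\,\kappa_T\, C_V^{2}\, T}{V}
\quad\Longrightarrow\quad
\frac{C_P - C_V}{C_V} \approx \frac{4\,\kappa_T\, C_V\, T}{V}.
```

The right-hand side is dimensionless because $\kappa_T C_V T / V$ has units of (1/Pa)(J/K)(K)/(m³) = J/(Pa m³) = 1.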
So much for old memories from classical thermodynamics. Now let us get more serious. The canonical partition function, using the degeneracies directly, is

$$Q(N, V, T) = \sum_j g_j \exp(-U_j/k_B T) \tag{5.3.36}$$

and differentiating Eq. (5.3.8), with $\ln Q = N \ln q$ + (terms independent of T) for N noninteracting molecules of single-molecule partition function q, we obtain

$$C_V \equiv (\partial U/\partial T)_V = N k_B T\,[\partial^2 (T \ln q)/\partial T^2]_{V,N} \tag{5.5.6}$$

As in Section 5.3, we assume that there are additive translational, electronic, vibrational, and rotational contributions to the heat capacity:

$$C_V = C_V^{trans} + C_V^{el} + C_V^{vibr} + C_V^{rot} \tag{5.5.7}$$

(there are other small contributions, for example from isotope distribution). Consider a one-component perfect gas, and let us look at all terms in Eq. (5.5.7).

Translation. From the well-known Equipartition Theorem, which assumes that $(1/2)k_B T$ of translational energy resides in each "normal mode" (here, each translational degree of freedom), we get

$$C_V^{trans} = (3/2)\,N k_B = (3/2)\,R \tag{5.5.8}$$

which works perfectly for a monoatomic "noble" gas, such as He, Ar, and so on.

Rotation. Next, consider the rotational contribution to the heat capacity for a linear molecule of symmetry number $\sigma$, for which the result of Problem 5.3.11 generalizes to $q_{rot} = k_B T/(\sigma hcB) = 8\pi^2 k_B T I_e/(\sigma h^2) = 0.0419\,I_e T/\sigma$. Since $q_{rot}$ is proportional to T, Eq. (5.5.6) gives a result in which the symmetry number drops out:

$$C_V^{rot} = N k_B = R \tag{5.5.9}$$

¹⁷ Eduard Grüneisen (1877–1949).
Nuclear Spin Effects on Rotation. There is an interesting effect on the rotational partition function, even for the hydrogen molecule, due to nuclear spin statistics. The Fermi postulate mandates that the overall wavefunction (including all sources of spin) be antisymmetric to all two-particle interchanges. A simple molecule like ¹H₂, made of two electrons (S = 1/2) and two protons (spin I = 1/2), exists in two forms: (i) ortho-hydrogen, which has overall nuclear spin I = 1, with the three symmetric nuclear spin states $\alpha\alpha$, $\beta\beta$, or $(\alpha\beta + \beta\alpha)/2^{1/2}$ [see Eqs. (3.7.3) to (3.7.6) for similar electronic spin states], for which the space part of the wavefunction must be antisymmetric (this only happens for odd values of the rotational quantum number J); (ii) para-hydrogen, which has overall nuclear spin I = 0, with only one antisymmetric spin state $(\alpha\beta - \beta\alpha)/2^{1/2}$, for which the space part of the wavefunction must be symmetric (this requires J = even). The energetics for the rotational levels of hydrogen are shown in Fig. 5.3. [To be exact, for para-hydrogen the I = 0 must be combined with electron spin S = 0 to yield a total spin quantum number T = 0; for ortho-hydrogen, I = 1 combines with S = 0 to yield T = 1.] As discussed below, a catalyst is needed to ensure that the spin statistics reach true thermodynamic equilibrium.
FIGURE 5.3 Rotational energy levels of para (I = 0) and ortho (I = 1) hydrogen molecule [2]. [Energy-level diagram: energy (cm⁻¹), from 0 to 4000, for J = 0 to 8; the even-J levels belong to para-hydrogen (I = 0, T = 0) and the odd-J levels to ortho-hydrogen (I = 1, T = 1); the values of kT at 300 K and at 1000 K are marked for comparison.]
This means that the sums of Problem 5.3.11 must be carried out separately:

$$q_{rot} = (1/4)\left[\sum_{J=0,\,even}^{J=\infty} + 3\sum_{J=1,\,odd}^{J=\infty}\right] (2J+1)\exp(-hcBJ(J+1)/k_B T) \tag{5.5.10}$$

For H₂, at temperatures below 1000 K, the rotational constant B = 59.4 cm⁻¹ is so large that integration is invalid, and one must sum directly the leading terms of the two sums. The same holds for D₂, where B = 29.9 cm⁻¹. For the heavier homonuclear diatomics, like O₂ (B = 1.437 cm⁻¹) or N₂, B is so small that the high-temperature integration of Problem 5.3.11 works well, and the ortho and para contributions become experimentally indistinguishable. At "reasonable temperatures" for hydrogen:

$$N_{ortho}/N_{para} = 3\sum_{J=1,\,odd}^{J=\infty} (2J+1)\exp(-hcBJ(J+1)/k_B T) \Big/ \sum_{J=0,\,even}^{J=\infty} (2J+1)\exp(-hcBJ(J+1)/k_B T) \tag{5.5.11}$$

and

$$C_V^{rot}(\text{high-}T) = (1/4)\,C_V^{rot}(\text{para-H}_2) + (3/4)\,C_V^{rot}(\text{ortho-H}_2) \tag{5.5.12}$$
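Equation (5.5.11) is easy to evaluate by direct summation, since only a few rotational levels of hydrogen are populated at ordinary temperatures. The following minimal sketch uses B = 59.4 cm⁻¹ from the text; the function names, the cutoff J = 60, and the second radiation constant hc/k_B = 1.438777 cm K are supplied here as assumptions for illustration.

```python
import math

HC_OVER_KB = 1.438777  # h c / k_B in cm K, so that B (1/cm) / T (K) becomes dimensionless

def rot_sum(T, rot_const, parity, j_max=60):
    """Sum of (2J+1) exp(-hcB J(J+1) / k_B T) over even (parity 0) or odd (parity 1) J."""
    x = HC_OVER_KB * rot_const / T
    return sum((2 * j + 1) * math.exp(-x * j * (j + 1))
               for j in range(parity, j_max + 1, 2))

def ortho_para_ratio(T, rot_const=59.4):
    """Equilibrium N_ortho / N_para of Eq. (5.5.11) for H2."""
    return 3.0 * rot_sum(T, rot_const, 1) / rot_sum(T, rot_const, 0)

def ortho_fraction(T, rot_const=59.4):
    """Equilibrium mole fraction of ortho-hydrogen."""
    r = ortho_para_ratio(T, rot_const)
    return r / (1.0 + r)

for T in (20.0, 77.0, 300.0, 1000.0):
    print(f"T = {T:6.1f} K: N_ortho/N_para = {ortho_para_ratio(T):.4f}, "
          f"ortho fraction = {ortho_fraction(T):.4f}")
```

These are equilibrium values, which in practice are reached only when a catalyst allows ortho and para molecules to interconvert.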
The equilibrium mixture is pure para (total I = 0) at 0 K, but 25% para and 75% ortho (total I = 1) at 300 K. In the absence of a catalyst, like charcoal, a room-temperature mixture (25% para) will preserve this distribution even when cooled. The catalyst, if present, will dissociate the molecule into atoms, which then recombine; this will establish the equilibrium predicted by statistical mechanics at all temperatures. In the high-temperature limit ($hcBJ(J+1) \ll k_B T$) of Problem 5.3.11, the two sums can be replaced by integrals; the two integrals yield, for H₂:

$$q_{rot} = (1/4)\left\{\int_{even\,J=0}^{J=\infty} dJ\,(2J+1)\exp(-hcBJ(J+1)/k_B T) + 3\int_{odd\,J=1}^{J=\infty} dJ\,(2J+1)\exp(-hcBJ(J+1)/k_B T)\right\}$$

$$\phantom{q_{rot}} = (1/4)\{(k_B T/hcB) + 3\,(k_B T/hcB)\} = \int_{J=0}^{J=\infty} dJ\,(2J+1)\exp(-hcBJ(J+1)/k_B T) = (k_B T/hcB) \tag{5.5.13}$$
PROBLEM 5.5.1. Calculate the relative ortho and para contributions to the rotational heat capacity for D₂, which has two nuclei, both with I = 1:

$$C_V^{rot}(\text{high-}T) = (2/3)\,C_V^{rot}(\text{ortho-D}_2) + (1/3)\,C_V^{rot}(\text{para-D}_2) \tag{5.5.14}$$

PROBLEM 5.5.2. Let the spin of each nucleus of a homonuclear diatomic molecule be I. Show that, if I = integer (boson), there are I(2I + 1) antisymmetric and (I + 1)(2I + 1) symmetric nuclear spin functions. If I = half-integer (fermion), then there are likewise I(2I + 1) antisymmetric and (I + 1)(2I + 1) symmetric nuclear spin functions.
Using Eq. (5.5.6) and Eq. (5.3.8), we get

\[ C_V^{vibr} = Nk_BT[\partial^2(T\ln Q_{vibr})/\partial T^2]_{V,N} \tag{5.5.15} \]

\[ C_V^{vibr} = Nk_B(h\nu/k_BT)^2\exp(-h\nu/k_BT)[1-\exp(-h\nu/k_BT)]^{-2} \tag{5.5.16} \]

which at infinite temperature becomes

\[ \lim_{T\to\infty} C_V^{vibr} = Nk_B = R \tag{5.5.17} \]
Electronic Excitation. Using degeneracies $g_0$ for the ground state and $g_1$ for the first excited state, we get from

\[ q_{el} \approx g_0 + g_1\exp(-E_{el}/k_BT) \tag{5.5.18} \]

by differentiation:

\[ C_V^{el} = Nk_BT[\partial^2(T\ln q_{el})/\partial T^2]_{V,N} = Nk_BT[\partial^2(T\ln[g_0 + g_1\exp(-E_{el}/k_BT)])/\partial T^2]_{V,N} \]
\[ = Nk_B\,g_0g_1(E_{el}/k_BT)^2\exp(-E_{el}/k_BT)[g_0 + g_1\exp(-E_{el}/k_BT)]^{-2} \tag{5.5.19} \]

For this two-level model the expression rises as $T$ approaches $\Theta_{el}$, passes through a maximum, and decays as $(E_{el}/k_BT)^2$ at still higher temperature. The successive onsets give Fig. 5.4, where the characteristic temperatures for onset of rotation, vibration, and electronic excitation are defined by $\Theta_{rot} \equiv h^2/8\pi^2I_ek_B$, $\Theta_{vib} \equiv h\nu/k_B$, and $\Theta_{el} \equiv E_{el}/k_B$.

FIGURE 5.4 Temperature trend of heat capacity for gas-phase molecules [6]. (Schematic plot of $C_V$ versus $T$: $3R/2$ for translation only; $5R/2$ for translation plus rotation, above $T \approx \Theta_{rot}$; $7R/2$ for translation plus rotation plus vibration, above $T \approx \Theta_{vibr}$; electronic excitation sets in near $T \approx \Theta_{el}$.)

Einstein Theory of Low-Temperature Heat Capacity of Solids [2]. When we consider the heat capacity of solids, we realize that they consist of vibrating atoms or molecules. Their vibrations are quantized, of course, and have the nice name of phonons. Einstein considered a single vibration of an oscillator, along with its partition function:

\[ q_{vib} = [1-\exp(-h\nu/k_BT)]^{-1} \tag{5.3.32} \]

and assumed that in a crystal of $N$ species (atoms or molecules) there are $3N$ of these. Then Einstein found:

\[ C_V^{xtal} = 3Nk_BT[\partial^2(T\ln q_{vib})/\partial T^2]_{V,N} = 3Nk_B(h\nu/k_BT)^2\exp(h\nu/k_BT)[\exp(h\nu/k_BT)-1]^{-2} \tag{5.5.20} \]

which at high temperatures approaches the old “law” of Dulong¹⁸ and Petit¹⁹ of 1819:

\[ \lim_{T\to\infty} C_V^{xtal} = 3Nk_B = 3R \tag{5.5.21} \]
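Equation (5.5.20) and its Dulong–Petit limit can be checked with a few lines of code. This is a minimal sketch: the Einstein temperature $\Theta_E \equiv h\nu/k_B$ is an assumed name for the characteristic temperature of the single oscillator frequency, and the value 200 K is purely illustrative.

```python
import math

R_GAS = 8.314462618  # molar gas constant, J K^-1 mol^-1


def einstein_cv(temperature_k, theta_e_k):
    """Molar heat capacity of an Einstein solid, Eq. (5.5.20).

    Written in terms of exp(-x), x = h nu / k_B T, which is algebraically
    identical to Eq. (5.5.20) but cannot overflow at low temperature.
    """
    x = theta_e_k / temperature_k
    em = math.exp(-x)
    one_minus_em = -math.expm1(-x)  # 1 - exp(-x), accurate for small x
    return 3.0 * R_GAS * x * x * em / one_minus_em ** 2


if __name__ == "__main__":
    theta_e = 200.0  # hypothetical Einstein temperature, K
    for t in (10.0, 50.0, 200.0, 1000.0, 1.0e5):
        print(f"T = {t:9.1f} K   C_V/3R = {einstein_cv(t, theta_e) / (3 * R_GAS):.6f}")
```

At $T = \Theta_E$ the heat capacity is already about 92% of $3R$, while at $T \ll \Theta_E$ it falls off exponentially, faster than the $T^3$ behavior seen experimentally.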
The Einstein theory is simple, but the normal vibrations of a real crystal span a wide spectrum. It would be wise to introduce a spectral density function $g(\nu)$, such that

\[ \int_{\nu=0}^{\infty} g(\nu)\,d\nu = 3N \tag{5.5.22} \]

Then for instance it seems simple to compute the heat capacity from

\[ C_V^{xtal} = k_B\int_{\nu=0}^{\infty} g(\nu)\,d\nu\,(h\nu/k_BT)^2\exp(h\nu/k_BT)[\exp(h\nu/k_BT)-1]^{-2} \]

but the normal mode analysis to obtain the correct $g(\nu)$ is painful. Debye²⁰ found a better way.
¹⁸ Pierre Louis Dulong (1785–1838).
¹⁹ Alexis Thérèse Petit (1791–1820).
²⁰ Peter Joseph William Debye = Petrus Josephus Wilhelmus Debye (1884–1966).
Debye Theory of the Heat Capacity of Solids. Debye assumed that a cubic crystal of side $L$ and volume $V = L^3$ can be taken as a cavity (German Hohlraum) that supports a set of standing waves, each with form

\[ \sin(\pi s_xx/L)\,\sin(\pi s_yy/L)\,\sin(\pi s_zz/L) \tag{5.5.23} \]

where $s_x$, $s_y$, and $s_z$ are positive integers, $s \equiv (s_x^2 + s_y^2 + s_z^2)^{1/2}$, the wavelength is $\lambda \equiv 2L/s$, and the frequency is $\nu \equiv cs/2L$. The number of allowed positive integers between $s$ and $s + ds$ is $(\pi s^2/2)\,ds$, and the number of allowed standing waves between $\nu$ and $\nu + d\nu$ is

\[ (\pi/2)(2L/c)^3\nu^2\,d\nu = 4\pi Vc^{-3}\nu^2\,d\nu \tag{5.5.24} \]

Of these waves, there are transverse waves, with velocity $c_{tr}$ and two possible directions of polarization, and longitudinal waves with velocity $c_{lo}$. Overall, therefore, the distribution function becomes finally

\[ g(\nu)\,d\nu = (2c_{tr}^{-3} + c_{lo}^{-3})\,4\pi V\nu^2\,d\nu \tag{5.5.25} \]

The average velocity $c$ is defined by $3c^{-3} \equiv 2c_{tr}^{-3} + c_{lo}^{-3}$, so finally

\[ g(\nu)\,d\nu = 12\pi Vc^{-3}\nu^2\,d\nu \tag{5.5.26} \]
If there is an experimental upper limit (Debye frequency) $\nu_D$ to the allowed frequencies:

\[ \int_{\nu=0}^{\nu_D} g(\nu)\,d\nu = 3N \tag{5.5.27} \]

then immediately after integration we have

\[ \nu_D^3 = (3/4\pi)(N/V)c^3 \tag{5.5.28} \]

and after defining a Debye temperature

\[ \Theta_D \equiv h\nu_D/k_B \tag{5.5.29} \]

and the auxiliary variables $u_D \equiv h\nu_D/k_BT$ and $u \equiv h\nu/k_BT$ we obtain

\[ C_V = 9Nk_B\,u_D^{-3}\int_{u=0}^{u_D} u^4\exp(u)[\exp(u)-1]^{-2}\,du \tag{5.5.30} \]
This result, which can only be integrated numerically, fits experiment extremely well. Furthermore, at low $T$:

\[ C_V \approx 77.93\,(3Nk_B)(T/\Theta_D)^3 \tag{5.5.31} \]

which, for $T < (\Theta_D/12)$, fits experiment to within 1%! This result is valid for a “normal” three-dimensional solid, in which vibrations are likely in all three orthogonal directions. For theoretical one- or two-dimensional bodies we have $C_V \propto T^a$, $a$ = 1 or 2. There are organic solids (e.g., quasi-one-dimensional metals) for which, at very low temperatures, $a \approx 1$ is possible, because anisotropic vibrations in only one direction (the stacking direction) are dominant.
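Since Eq. (5.5.30) must be integrated numerically, a short quadrature makes the two limits concrete. The sketch below uses composite Simpson integration from the standard library only; the function names and the number of panels `n` are illustrative assumptions. It returns the reduced quantity $C_V/3Nk_B$, so that the low-temperature law of Eq. (5.5.31) reads $77.93\,(T/\Theta_D)^3$.

```python
import math


def debye_integrand(u):
    """u^4 exp(u) / (exp(u) - 1)^2, rewritten with exp(-u) to avoid overflow."""
    if u < 1.0e-6:
        return u * u  # leading term of the small-u expansion
    em = math.exp(-u)
    return u ** 4 * em / (1.0 - em) ** 2


def debye_cv_reduced(t_over_theta, n=2000):
    """C_V / (3 N k_B) from Eq. (5.5.30) by composite Simpson quadrature.

    n is an assumed (even) number of panels.
    """
    u_d = 1.0 / t_over_theta
    h = u_d / n
    total = debye_integrand(0.0) + debye_integrand(u_d)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * debye_integrand(i * h)
    integral = total * h / 3.0
    return 3.0 * integral / u_d ** 3


if __name__ == "__main__":
    for ratio in (0.02, 0.05, 1.0 / 12.0, 0.5, 1.0, 20.0):
        exact = debye_cv_reduced(ratio)
        cubic = 77.93 * ratio ** 3
        print(f"T/Theta_D = {ratio:7.4f}   C_V/3Nk_B = {exact:.6f}   T^3 law = {cubic:.6f}")
```

The printed columns agree closely for $T < \Theta_D/12$ and diverge above that, while the exact result levels off at the Dulong–Petit value of 1 at high temperature.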
5.6 BLACK-BOX RADIATION AND THE BIRTH OF QUANTUM MECHANICS

A blackbody is a body that absorbs all incident radiation and reflects none. Experimentally, it is approximated by a “furry box” (a closed box of aluminum, whose interior walls are anodized to form a black surface, or a metal box painted with carbon black) with a small hole drilled in one face, to allow some radiation generated at any fixed temperature to escape the box. The puzzle in the late nineteenth century was to explain the experimentally observed wavelength dependence and temperature dependence of the radiation (Fig. 5.5). Partial explanations had been obtained by Rayleigh²¹ and Jeans²² and by Stefan²³ and Boltzmann, but the full, exact, correct, and truly revolutionary explanation was obtained in 1901 by Planck,²⁴ who thereby ushered in quantum mechanics. Rayleigh and Jeans had estimated in 1900–1905 that the number of allowed transverse electromagnetic waves of the type $[E_0\sin(2\pi lx/L)\sin(2\pi my/L)\sin(2\pi nz/L)]$ in the frequency range between $\nu$ and $\nu + d\nu$ in a cubical box of volume $V = L^3$ is [cf. Eqs. (5.5.23) to (5.5.26), ignoring the longitudinal waves]

\[ g(\nu)\,d\nu = 8\pi k_BTc^{-3}\nu^2\,d\nu \tag{5.6.1} \]

(Rayleigh found the $\nu^2$ dependence, Jeans later supplied the rest). Their distribution function $g(\nu)$ increases as $\nu^2$, with no provision for a fall-off to zero as the frequency and the energy go to infinity (“ultraviolet catastrophe”).
²¹ John William Strutt, third baron Rayleigh (1842–1919).
²² Sir James H. Jeans (1877–1946).
²³ Jozef Stefan (1835–1893).
²⁴ Max Planck (1858–1947).
FIGURE 5.5 Blackbody radiation at various temperatures $T$ (kelvin), plotted as relative energy density versus $\lambda$/μm; the range of visible wavelengths (350 nm to 750 nm) is marked. $T$ = 2000 K: $\lambda_{max}$ = 1.5 μm; $T$ = 1750 K: $\lambda_{max}$ = 1.7 μm; $T$ = 1500 K: $\lambda_{max}$ = 1.9 μm; $T$ = 1250 K: $\lambda_{max}$ = 2.3 μm. $T$ and $\lambda_{max}$ are linked by Wien’s²⁵ experimental law, $\lambda_{max}T = 0.002896$ (in SI units).
The Stefan–Boltzmann law gave the correct temperature dependence at high energies:

\[ g(\nu)\,d\nu = \sigma T^4 \tag{5.6.2} \]

In 1901 Planck finally explained the frequency and temperature dependence of blackbody radiation, and ushered in the age of quantum physics, by introducing the quantization of the oscillators that Rayleigh had discussed. (Planck assumed that these oscillators were in the walls of the Hohlraum and that the radiation was in equilibrium with them.) The energy density (energy per unit volume) $[u(\nu, T)/V]\,d\nu$ at the temperature $T$, in the frequency range between $\nu$ and $\nu + d\nu$, is given by

\[ [u(\nu,T)/V]\,d\nu = 8\pi h\nu^3c^{-3}[\exp(h\nu/k_BT)-1]^{-1}\,d\nu = 8\pi k_B^4T^4h^{-3}c^{-3}(h\nu/k_BT)^3[\exp(h\nu/k_BT)-1]^{-1}\,d(h\nu/k_BT) \tag{5.6.3} \]

or also in terms of wavelengths:

\[ \rho(\lambda,T)\,d\lambda = 8\pi hc\lambda^{-5}[\exp(hc/\lambda k_BT)-1]^{-1}\,d\lambda \tag{5.6.4} \]
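The way Eq. (5.6.3) removes the ultraviolet catastrophe can be seen by evaluating it next to the Rayleigh–Jeans form of Eq. (5.6.1). This is a minimal sketch with CODATA constants; the function names, the overflow guard, and the sample frequencies are illustrative assumptions.

```python
import math

H_PLANCK = 6.62607015e-34  # J s
K_B = 1.380649e-23         # J K^-1
C_LIGHT = 2.99792458e8     # m s^-1


def planck_energy_density(nu, temperature_k):
    """Spectral energy density u(nu, T)/V of Eq. (5.6.3), in J m^-3 Hz^-1."""
    x = H_PLANCK * nu / (K_B * temperature_k)
    if x > 700.0:  # exp(x) would overflow; the density is effectively zero
        return 0.0
    return 8.0 * math.pi * H_PLANCK * nu ** 3 / C_LIGHT ** 3 / math.expm1(x)


def rayleigh_jeans_energy_density(nu, temperature_k):
    """Classical form of Eq. (5.6.1), in J m^-3 Hz^-1."""
    return 8.0 * math.pi * K_B * temperature_k * nu ** 2 / C_LIGHT ** 3


if __name__ == "__main__":
    t = 2000.0
    for nu in (1.0e9, 1.0e12, 1.0e13, 1.0e14, 1.0e15):
        ratio = planck_energy_density(nu, t) / rayleigh_jeans_energy_density(nu, t)
        print(f"nu = {nu:8.1e} Hz   Planck / Rayleigh-Jeans = {ratio:.3e}")
```

The ratio is $x/(e^x - 1)$ with $x = h\nu/k_BT$: it is essentially 1 in the radio and microwave range and collapses exponentially once $h\nu$ exceeds a few $k_BT$.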
Equation (5.6.3) can be obtained from Eq. (5.6.1) by (i) taking a unit volume, (ii) choosing a frequency $\nu_j$, (iii) using a small incremental interval $d\nu$ and calling this interval between $\nu_j$ and $\nu_j + d\nu$ the $j$th interval, (iv) assuming from Eq. (5.6.1) that there are $g_j = 8\pi\nu_j^2c^{-3}\,d\nu$ states in it, (v) assuming (this is crucial) the energy of the photon to be $h\nu_j$, and (vi) assuming that there are $N_j$ such BE photons, with the usual constraint on the total energy $\sum_j N_jh\nu_j = U$; the BE distribution, Eq. (5.2.4), becomes, as seen earlier,

\[ \ln t = \sum_j \{ g_j\ln[(g_j + N_j)/g_j] + N_j\ln[(g_j + N_j)/N_j] \} \tag{5.2.7} \]

Take the differential $d\ln t$, use the Lagrange multiplier $\beta \equiv 1/k_BT$, and obtain the condition

\[ \partial(\ln t)/\partial N_j - h\nu_j/k_BT = 0 \tag{5.6.5} \]

hence, in analogy to Eq. (5.2.12) we have

\[ N_j = g_j/[\exp(h\nu_j/k_BT) - 1] \tag{5.6.6} \]

²⁵ Wilhelm Carl Werner Otto Fritz Wien (1864–1928).
And then the most probable distribution, Eq. (5.6.3), is obtained; the total energy density $U/V$ becomes finally

\[ U/V = \int_0^{\infty} d\nu\,u(\nu,T)/V = 8\pi h(k_BT/h)^4c^{-3}\int_0^{\infty} ds\,s^3[\exp(s)-1]^{-1} \tag{5.6.7} \]

\[ = 8\pi h(k_BT/h)^4c^{-3}(\pi^4/15) \tag{5.6.8} \]

\[ U/V = (8/15)\pi^5k_B^4T^4h^{-3}c^{-3} = (4\sigma/c)T^4 \tag{5.6.9} \]
This equation is exact, and correct. It involves an unusual integral (Problem 5.6.2); it incorporates the Rayleigh–Jeans result [Eq. (5.6.1)] at low frequencies, and the Stefan–Boltzmann result [Eq. (5.6.2)] at high frequencies, and also gets an explicit value for the constant $\sigma = 5.669 \times 10^{-8}\ \mathrm{W\,m^{-2}\,K^{-4}}$. Planck found that the thermal motion of the atoms in the walls of the black body does excite the oscillators of the electromagnetic field in the “cavity,” but only if the oscillators can acquire the requisite energy $h\nu$: the very-high-frequency oscillators require too much energy, cannot be excited by the walls, remain unexcited, and are not involved. This fixes the ultraviolet catastrophe.

Sideline. Planck was already a middle-aged physicist when he found the solution to the blackbody conundrum. However, in strolls on the Philosophenweg above Heidelberg, Germany, Planck confided to his son that he was fully aware of the controversial nature of his postulate of quanta of vibration and that this involved a true revolution in physics.
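The integral in Eq. (5.6.7) and the resulting constants can be checked numerically. The sketch below assumes CODATA constants and uses the standard conversion from the energy density inside the cavity to the flux emitted through a small hole (a factor $c/4$), which is how the coefficient $(8/15)\pi^5k_B^4h^{-3}c^{-3}$ of $T^4$ connects to the quoted value of $\sigma$ in W m⁻² K⁻⁴. The function name, the truncation point, and the panel count are illustrative.

```python
import math

H_PLANCK = 6.62607015e-34  # J s
K_B = 1.380649e-23         # J K^-1
C_LIGHT = 2.99792458e8     # m s^-1


def bose_integral(upper=60.0, n=6000):
    """Simpson estimate of the integral of s^3 / (exp(s) - 1) from 0 to infinity.

    The upper limit is truncated at `upper` (an assumed cutoff); the
    neglected tail is of order upper^3 * exp(-upper).
    """
    def f(s):
        return 0.0 if s == 0.0 else s ** 3 / math.expm1(s)

    h = upper / n
    total = f(0.0) + f(upper)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0


# Coefficient of T^4 in the energy density U/V, Eq. (5.6.9), in J m^-3 K^-4.
ENERGY_DENSITY_COEFF = 8.0 * math.pi ** 5 * K_B ** 4 / (15.0 * H_PLANCK ** 3 * C_LIGHT ** 3)

# Emitted flux per unit area is (c/4) times the energy density.
SIGMA = ENERGY_DENSITY_COEFF * C_LIGHT / 4.0

if __name__ == "__main__":
    print(f"numerical integral  = {bose_integral():.8f}")
    print(f"pi^4 / 15           = {math.pi ** 4 / 15.0:.8f}")
    print(f"U/(V T^4)           = {ENERGY_DENSITY_COEFF:.4e} J m^-3 K^-4")
    print(f"sigma               = {SIGMA:.4e} W m^-2 K^-4")
```

The numerical integral reproduces $\pi^4/15$, and the flux constant comes out within rounding of the value quoted in the text.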
PROBLEM 5.6.1. For radiation from the solar chromosphere at a temperature of 6000 K, estimate the wavelength maximum. Prove the Wien displacement law:

\[ \lambda_{max}T = hc/4.965k_B = 2.896 \times 10^{-3}\ \mathrm{K\,m} \tag{5.6.10} \]
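The constant 4.965 in Eq. (5.6.10) comes from maximizing Eq. (5.6.4) with respect to $\lambda$: with $x \equiv hc/\lambda k_BT$, setting the derivative of $x^5/(e^x - 1)$ to zero gives $x = 5(1 - e^{-x})$. The sketch below solves this by fixed-point iteration, which is one convenient choice among several; the function names and iteration count are illustrative.

```python
import math

H_PLANCK = 6.62607015e-34  # J s
K_B = 1.380649e-23         # J K^-1
C_LIGHT = 2.99792458e8     # m s^-1


def wien_root(iterations=50):
    """Nonzero solution of x = 5 (1 - exp(-x)) by fixed-point iteration."""
    x = 5.0  # starting guess near the expected root
    for _ in range(iterations):
        x = 5.0 * (1.0 - math.exp(-x))
    return x


def wien_lambda_max(temperature_k):
    """Wavelength (m) at which Eq. (5.6.4) peaks, from Eq. (5.6.10)."""
    return H_PLANCK * C_LIGHT / (wien_root() * K_B * temperature_k)


if __name__ == "__main__":
    x = wien_root()
    print(f"x_max              = {x:.6f}")
    print(f"lambda_max * T     = {H_PLANCK * C_LIGHT / (x * K_B):.6e} m K")
    print(f"lambda_max(6000 K) = {wien_lambda_max(6000.0) * 1e9:.1f} nm")
```

The iteration converges quickly because the slope of the right-hand side at the root, $5e^{-x}$, is only about 0.03. For 6000 K the maximum falls near 480 nm, in the visible range marked in Fig. 5.5.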
PROBLEM 5.6.2. Prove [7] that

\[ 8\pi h(k_BT/h)^4c^{-3}\int_0^{\infty} ds\,s^3[\exp(s)-1]^{-1} = 8\pi h(k_BT/h)^4c^{-3}(\pi^4/15) \]

using the special integral:

\[ \int_0^{\infty} ds\,s^3[\exp(s)-1]^{-1} = \Gamma(4)\sum_n n^{-4} = 6\zeta(4) = \pi^4/15 \tag{5.6.11} \]

where $\Gamma$ is the gamma function (here $\Gamma(4) = 3!$), and $\zeta$ is the Riemann²⁶ zeta-function.

In cosmology, the almost isotropic cosmic microwave background radiation (CMBR) fills the universe. To a traditional optical telescope, the space between stars and galaxies is quite black, but a radio telescope sensitive to microwaves finds a faint background glow, not associated with any star, galaxy, or other object. The CMBR is well explained by the Big Bang theory: When the universe was young, before the formation of stars and planets, it was smaller, much hotter, and filled with a uniform glow from its red-hot fog of hydrogen plasma. As the universe expanded, both the plasma and the radiation filling it grew cooler. When the universe got cool enough, stable atoms could form. These atoms could no longer absorb the thermal radiation, and the universe became transparent, instead of being an opaque fog. The photons that were around at that time have been propagating ever since, though growing fainter and less energetic, since the exact same photons fill a larger and larger universe. The CMBR has a thermal blackbody spectrum ($T = 2.725$ K, $\nu_{max} = 160.2$ GHz, $\lambda_{max} = 1.9$ mm). The CMBR was first measured by Penzias²⁷ and Wilson.²⁸ The glow is almost, but not quite, uniform in all directions; the anisotropies have been explained. Most, but not all, cosmologists accept CMBR as strong evidence of a Big Bang, which occurred, depending on the Hubble²⁹ constant and other current estimates, between 13.61 and 13.85 × 10⁹ years ago.
²⁶ Georg Friedrich Bernhard Riemann (1826–1866).
²⁷ Arno Allan Penzias (1933– ).
²⁸ Robert Woodrow Wilson (1936– ).
²⁹ Edwin Powell Hubble (1889–1953).

5.7 MECHANICS OF A ONE-DIMENSIONAL CHAIN OF PARTICLES

We collect here some simple treatments of arrays of atoms or particles in one dimension: these will be quite useful in later analogies with the band structure of solids. In particular, the notions of Brillouin³⁰ zone, of band edges, of acoustical and optical branches, and of forbidden regions in energy are developed using simple models of balls connected by elastic springs.

³⁰ Léon Brillouin (1889–1969).

The Hooke’s law problem described in Section 2.5 can be revisited. Consider longitudinal waves in a homogeneous line described by $x$, the position of a particular point on that line, and $u$, the longitudinal displacement of that point from its equilibrium position. It can be shown (see Problem 5.7.1) that the displacement $u$ obeys a one-dimensional mechanical wave equation:

\[ \partial^2u/\partial x^2 = v_0^{-2}\,\partial^2u/\partial t^2 \tag{5.7.1} \]

where $v_0$ is the phase velocity of the wave ($v_0$ is positive definite). This is a well-known second-order partial differential equation. Its solutions are of the form

\[ u(x,t) = A\exp[i(\omega t + kx)] + B\exp[i(\omega t - kx)] \tag{5.7.2} \]
where the angular frequency $\omega$ (radians s⁻¹), the frequency $\nu$ (Hz, or cycles s⁻¹), the wavevector $k$ (m⁻¹), and the wavelength $\lambda$ (meters cycle⁻¹) are related by

\[ v_0 = \nu\lambda = \omega\lambda/2\pi = \omega/k \tag{5.7.3} \]

The detailed form of the solution depends on the boundary conditions. The wavevector $k$ can serve as the independent variable in the direct (space) domain, while $\omega$ is the independent variable in the time domain. The dispersion relation $\omega(k)$ is the relation between the angular frequency $\omega$ (time-domain behavior) and the wavevector $k$ (space-domain behavior). Dispersion means that waves of different angular frequencies $\omega$ can travel at different speeds $v$. In the present simple case the dispersion relation is the linear relationship:

\[ \omega = v_0k \tag{5.7.4} \]
where $v_0$ is the phase velocity of the wave.

PROBLEM 5.7.1. Demonstrate the validity of the one-dimensional wave equation for longitudinal waves, Eq. (5.7.1).

PROBLEM 5.7.2. Construct the same argument, but for transverse waves on a string with tension $T$ and mass per unit length $\mu$: Using Fig. 5.6, show that $(\partial^2y/\partial x^2) = (\mu/T)(\partial^2y/\partial t^2)$, that is, that the wave velocity is $(T/\mu)^{1/2}$.

Longitudinal Elastic Waves on a 1-D Line of Equidistant Equal Atoms. Consider next the longitudinal motion of a one-dimensional array of $L$ equal atoms of mass $M$ (Fig. 5.7). These atoms at rest are equidistant, that is, spaced $a$ (meters) apart, and can interact via Hooke’s law with force constant $k_H$ (N m⁻¹), but only with their nearest neighbors. Let $u_n$ be the longitudinal displacement of atom $n$ from its equilibrium position. The net Hooke’s law force on atom $n$, due to the displacements $u_n$, $u_{n-1}$, and $u_{n+1}$, is

\[ F_n = k_H(u_{n+1} - u_n) - k_H(u_n - u_{n-1}) \qquad (n = 1, 2, \ldots, L) \tag{5.7.5} \]
FIGURE 5.6 Analysis of the forces for a transverse wave on a string. (Sketch of a string element between $x$ and $x + \Delta x$, with tension $T$ at angles $\theta$ and $\theta + \Delta\theta$, and transverse component $T_y = T\sin\theta \approx T(\partial y/\partial x)$.)

FIGURE 5.7 System of four masses $M$ connected by three springs, each with Hooke’s law constant $k_H$. The equilibrium distance between adjacent masses is $a$. The second particle has been displaced to the right by $u_2$.
so the net equation of motion becomes

\[ M(\partial^2u_n/\partial t^2) = k_H(u_{n+1} + u_{n-1} - 2u_n) \tag{5.7.6} \]

We assume that the displacement of the last atom ($n = L$) is equal to that of the first atom ($n = 1$): this is the Born³¹–von Kármán³² periodic boundary condition. We seek traveling or stationary wave solutions of the general form

\[ u_n = A\exp[i(\omega t + kna)] \tag{5.7.7} \]

where ($na$) is the variable used, in place of $x$, to denote the position along the chain. Note that $k$ is not the same as $k_H$. The substitution of Eq. (5.7.7) into Eq. (5.7.6) generates the condition

\[ -\omega^2M = k_H[\exp(ika) + \exp(-ika) - 2] = k_H[2\cos(ka) - 2] = 2k_H[\cos^2(ka/2) - \sin^2(ka/2) - 1] = -4k_H\sin^2(ka/2) \tag{5.7.8} \]

³¹ Max Born (1882–1970).
³² Theodore von Kármán = Szőllőskislaki Kármán Tódor (1881–1963).
FIGURE 5.8 Dispersion relation $\omega = (4k_H/M)^{1/2}\sin(ka/2)$ as a function of $k$, for a one-dimensional chain of masses $M$ linked by springs of Hooke’s law constant $k_H$, Eq. (5.7.9). The maxima $\omega_{max} = (4k_H/M)^{1/2}$ are at $k_{max}a = \pm\pi$ radians. (Plot of $\omega$ in units of $(4k_H/M)^{1/2}$ versus scaled wavevector $k$.)
which means that a solution of the type of Eq. (5.7.7) will exist if and only if $\omega$ satisfies the dispersion relation:

\[ \omega = (4k_H/M)^{1/2}\sin(ka/2) \tag{5.7.9} \]

This dispersion relation between the angular frequency $\omega$ and the wavevector $k$ shows a maximum $\omega_{max} = (4k_H/M)^{1/2}$ at the “magical” points $k = \pm\pi/a$. The negative solutions for $\omega$ are thrown away (see Fig. 5.8). The region inside $k = \pm(\pi/a)$ is called the first Brillouin zone (the Wigner³³–Seitz³⁴ cell of the reciprocal lattice). Within that region the wave can travel; substituting Eq. (5.7.9) into Eq. (5.7.7) yields

\[ u_n = A\exp[i((4k_H/M)^{1/2}\sin(ka/2)\,t + kna)] \tag{5.7.10} \]
At the “band edge” or critical value

\[ k = k_{edge} \equiv \pm\pi/a \tag{5.7.11} \]

the solution of Eq. (5.7.6) is no longer a traveling wave, but is rather a standing wave, in which the spatial factor $\exp(ikna)$ reduces to the alternating sign $(-1)^n$ and carries no phase that advances along $x$:

\[ u_n = A(-1)^n\exp[i(4k_H/M)^{1/2}t] \tag{5.7.12} \]

³³ Eugene Paul Wigner = Jenő Pál Wigner (1902–1995).
³⁴ Frederick Seitz (1911–2008).
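The algebra leading from Eq. (5.7.6) to Eq. (5.7.9) can be verified numerically by substituting the trial wave back into the equation of motion. The sketch below is a consistency check only; the spacing, force constant, and mass are hypothetical values chosen for illustration and do not come from the text.

```python
import cmath
import math


def omega_chain(k, a, k_h, mass):
    """Non-negative root of Eq. (5.7.9): omega = (4 k_H / M)^(1/2) |sin(k a / 2)|."""
    return math.sqrt(4.0 * k_h / mass) * abs(math.sin(0.5 * k * a))


def equation_of_motion_residual(k, a, k_h, mass):
    """|LHS - RHS| of Eq. (5.7.6) for the trial wave of Eq. (5.7.7).

    The common factor A exp[i(omega t + k n a)] has been divided out,
    which leaves exactly the two sides of Eq. (5.7.8).
    """
    omega = omega_chain(k, a, k_h, mass)
    lhs = -mass * omega ** 2
    rhs = k_h * (cmath.exp(1j * k * a) + cmath.exp(-1j * k * a) - 2.0)
    return abs(lhs - rhs)


# Hypothetical, illustrative parameters (not taken from the text).
A_SPACING = 3.0e-10   # m
K_HOOKE = 10.0        # N m^-1
MASS = 1.0e-25        # kg

if __name__ == "__main__":
    k_edge = math.pi / A_SPACING
    for fraction in (0.01, 0.25, 0.5, 0.75, 1.0):
        k = fraction * k_edge
        w = omega_chain(k, A_SPACING, K_HOOKE, MASS)
        res = equation_of_motion_residual(k, A_SPACING, K_HOOKE, MASS)
        print(f"k/k_edge = {fraction:4.2f}   omega = {w:.4e} rad/s   residual = {res:.1e}")
```

For small $k$ the ratio $\omega/k$ tends to $a(k_H/M)^{1/2}$, the sound velocity of the chain, so the linear dispersion of Eq. (5.7.4) is recovered in the long-wavelength limit.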
FIGURE 5.9 System of 4 small masses $m$ and 3 large masses $M$, alternating on a one-dimensional chain, with intermass equilibrium distance $a$. The even labels $2n$ refer to the larger masses, whereas the odd labels refer to the smaller masses.
The relation (5.7.11) is very similar to Bragg’s law for diffraction [$n\lambda = 2d\sin\theta$]; this can be seen by rewriting (5.7.11) as

\[ \lambda_{edge} \equiv 2\pi/k_{edge} = 2a \tag{5.7.13} \]
The values of $k$ larger than $k_{edge}$ are usually “folded” back into the first Brillouin zone, since no new physics arises from the extended-zone scheme.

One-Dimensional Chain with Two Kinds of Atoms: a Band Gap Appears. We discuss next the one-dimensional lattice with two kinds of atoms: atoms of mass $M$ occupy the even-numbered sites, and atoms of smaller mass $m$ occupy the odd-numbered sites. As before, there are $L$ atoms, and the equilibrium distances between adjacent atoms are equal to $a$. Allow for a Hooke’s law force with constant $k_H$ to act between nearest neighbors only. Let $u_n$ be again the longitudinal displacement of atom $n$ from equilibrium (Fig. 5.9). We again assume Born–von Kármán periodic boundary conditions for the motion: The $L$th atom has displacement equal to that of the zeroth atom (closed loop). Then the two equations of motion are

\[ M\,\partial^2u_{2n}/\partial t^2 = k_H(u_{2n+1} + u_{2n-1} - 2u_{2n}) \tag{5.7.14} \]

\[ m\,\partial^2u_{2n+1}/\partial t^2 = k_H(u_{2n+2} + u_{2n} - 2u_{2n+1}) \tag{5.7.15} \]

As before, we seek periodic solutions of the type

\[ u_{2n}(t) = A\exp[i\omega t + 2nika] \qquad \text{(even masses)} \tag{5.7.16} \]

\[ u_{2n+1}(t) = B\exp[i\omega t + (2n+1)ika] \qquad \text{(odd masses)} \tag{5.7.17} \]
for the even and odd masses, respectively. Again, $k$ is not the same as $k_H$. Thus $u_{2n+2}(t)$ will resemble $u_{2n}(t)$, while $u_{2n-1}(t)$ will resemble $u_{2n+1}(t)$. When these trial solutions are used in Eqs. (5.7.14) and (5.7.15) and the common factor $\exp[i\omega t + 2nika]$ is discarded, two coupled homogeneous equations in the two unknowns $A$, $B$ are obtained:

\[ 0 = A[\omega^2M - 2k_H] + B[k_H\exp(ika) + k_H\exp(-ika)] \tag{5.7.18} \]

\[ 0 = A[k_H\exp(ika) + k_H\exp(-ika)] + B[\omega^2m - 2k_H] \tag{5.7.19} \]

These have a nontrivial solution if and only if the determinant of their coefficients vanishes:

\[ \begin{vmatrix} \omega^2M - 2k_H & 2k_H\cos(ka) \\ 2k_H\cos(ka) & \omega^2m - 2k_H \end{vmatrix} = 0 \tag{5.7.20} \]

This determinantal equation, when solved for $\omega^2$, yields the dispersion relation:

\[ \omega^2 = k_H\left(\frac{1}{m} + \frac{1}{M}\right) \pm k_H\sqrt{\left(\frac{1}{m} + \frac{1}{M}\right)^2 - \frac{4\sin^2(ka)}{mM}} \tag{5.7.21} \]

FIGURE 5.10 Dispersion relation $\omega^2 = k_H(1/m + 1/M)\{1 \pm [1 - 4\sin^2(ka)(mM)^{-1}(1/m + 1/M)^{-2}]^{1/2}\}$ for a classical chain of $N$ atoms of two kinds of masses, smaller $m$ and larger $M$, Eq. (5.7.21). (Plot versus $ka$, showing the optical branch, the forbidden region, and the acoustical branch.)
When this dispersion relation is plotted as o versus k (Fig. 5.10), one finds (i) a lower-energy, or “acoustical,” branch (A ¼ B, atoms moving together whether even or odd in combined lower-frequency longitudinal motion); (ii) a higher-frequency, or “optical,” branch (even-numbered atoms vibrating together; odd-numbered atoms vibrating together, but separately from the even-numbered atoms). (iii) For intermediate values of o and k not on these curves, the value of k would be complex (forbidden gap frequency). Thus one sees for the first time, in a classical problem, an energy gap of forbidden motions, which presages similar, frequent behavior in quantum problems. The arguments given above for longitudinal waves may be also made, mutatis mutandis, for transverse waves.
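The two branches of Eq. (5.7.21) and the gap between them can be evaluated directly. This is a minimal sketch in dimensionless units ($k_H = 1$, $m = 1$, $M = 2$ are hypothetical values); the function names are illustrative. It also checks that both roots make the determinant of Eq. (5.7.20) vanish.

```python
import math


def diatomic_branches(ka, k_h, m_small, m_large):
    """Return (omega_acoustical, omega_optical) from Eq. (5.7.21)."""
    s = 1.0 / m_small + 1.0 / m_large
    disc = s * s - 4.0 * math.sin(ka) ** 2 / (m_small * m_large)
    root = math.sqrt(max(disc, 0.0))  # guard against tiny negative rounding
    w2_lower = k_h * (s - root)
    w2_upper = k_h * (s + root)
    return math.sqrt(max(w2_lower, 0.0)), math.sqrt(w2_upper)


def secular_determinant(omega, ka, k_h, m_small, m_large):
    """Determinant of Eq. (5.7.20); zero when omega lies on either branch."""
    diag = (omega ** 2 * m_large - 2.0 * k_h) * (omega ** 2 * m_small - 2.0 * k_h)
    return diag - (2.0 * k_h * math.cos(ka)) ** 2


# Hypothetical dimensionless parameters, for illustration only.
K_H, M_SMALL, M_LARGE = 1.0, 1.0, 2.0

if __name__ == "__main__":
    for ka in (0.0, 0.4, 0.8, 1.2, math.pi / 2.0):
        w_ac, w_opt = diatomic_branches(ka, K_H, M_SMALL, M_LARGE)
        print(f"ka = {ka:5.3f}   acoustical = {w_ac:.4f}   optical = {w_opt:.4f}")
```

At $ka = \pi/2$ the branches reach $\omega^2 = 2k_H/M$ (acoustical) and $\omega^2 = 2k_H/m$ (optical); no real $k$ gives a frequency between these two values, which is the forbidden region. The gap closes when $m = M$.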
5.8 ELECTRONIC HEAT CAPACITY: DRUDE VERSUS FERMI–DIRAC [3]

In the early 1900s, Drude³⁵ discussed metals by assuming that the $N_A$ (Avogadro’s number) electrons in a metal are totally free and behave as a gas, and thus he applied the kinetic theory of gases. He could explain Ohm’s³⁶ law, as well as the ratio of the thermal conductivity to the electrical conductivity. However, Drude’s model could not explain the heat capacity of the metal (for which it would predict, using the equipartition theorem, the very large result $C_V^{metal} = (3N_A/2)k_B$, much greater than experiment).

Fermi and Dirac realized that most of the $N_A$ electrons do not contribute to $C_V^{metal}$! We must, instead, consider the electron gas as an FD system, with most probable occupation, Eq. (5.2.11) with slight changes in notation, and ignoring the degeneracy index $g_i$:

\[ N_i = \exp\{(\mu - \varepsilon_i)/k_BT\}[1 + \exp\{(\mu - \varepsilon_i)/k_BT\}]^{-1} \tag{5.8.1} \]

\[ N_i = [1 + \exp\{(\varepsilon_i - \mu)/k_BT\}]^{-1} \tag{5.8.2} \]
where $\mu$ is the chemical potential. There are two limits: (a) a “weakly degenerate” ideal FD gas, where the quantum effects are weak, because the factor $\mu/k_BT$ is so small that a series expansion in $\mu/k_BT$ can converge; (b) a strongly degenerate ideal FD gas (which is where real metals exist), where the factor $\mu/k_BT$ is large, so no series expansion in $\mu/k_BT$ is advisable.

Limit (a): Weakly degenerate ideal FD gas. Let us write

\[ N = \sum_i [1 + \exp\{(\varepsilon_i - \mu)/k_BT\}]^{-1} \]
\[ PV = k_BT\sum_i \ln[1 + \exp\{(\mu - \varepsilon_i)/k_BT\}] \]

We use the particle-in-the-box energies for $\varepsilon_i$ (as we did earlier for the ideal CB gas):

\[ \varepsilon_i = (h^2/8mV^{2/3})(i^2 + j^2 + k^2) \qquad (i, j, k = 1, 2, \ldots) \]

then

\[ N = 2\pi(2m/h^2)^{3/2}V\int_{\varepsilon=0}^{\infty} \exp\{(\mu-\varepsilon)/k_BT\}[1 + \exp\{(\mu-\varepsilon)/k_BT\}]^{-1}\varepsilon^{1/2}\,d\varepsilon \tag{5.8.3} \]

and

\[ pV = 2\pi k_BT(2m/h^2)^{3/2}V\int_{\varepsilon=0}^{\infty} \ln[1 + \exp\{(\mu-\varepsilon)/k_BT\}]\,\varepsilon^{1/2}\,d\varepsilon \tag{5.8.4} \]

but both integrals must be evaluated numerically.

³⁵ Paul Karl Ludwig Drude (1863–1906).
³⁶ Georg Ohm (1789–1854).
FIGURE 5.11 Fermi–Dirac distribution $f(\varepsilon)$ at $T = 0$ (squares) and for finite temperature $T > 0$ (circles). (Plot of $f(\varepsilon) = [1 + \exp(10\,\varepsilon/\mu - 10)]^{-1}$ versus $\varepsilon/\mu$, from 0 to 2.)
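Limit (b): Strongly degenerate ideal FD gas. This more practical case is discussed next. At room temperature, $k_BT \approx 0.025$ eV. Experimentally, the chemical potential $\mu$ of most metals is of the order of 1 to 5 eV, so $\mu/k_BT$ is between 40 and 200. Now we take $\varepsilon_i$ as the continuous parameter $\varepsilon$ and write

\[ f(\varepsilon) = [1 + \exp\{(\varepsilon - \mu)/k_BT\}]^{-1} \tag{5.8.5} \]

where $f(\varepsilon)$ is the probability that the state $\varepsilon$ is occupied. We denote $\mu_0$ as the value of $\mu$ at 0 K. For states below $\mu_0$, $(\varepsilon - \mu_0) < 0$, so the exponential in Eq. (5.8.5) vanishes as $T \to 0$; for states above $\mu_0$ it diverges. Therefore at $T = 0$, $f(\varepsilon) = 1$ for $\varepsilon < \mu_0$, $f(\varepsilon) = 1/2$ at $\varepsilon = \mu_0$, and $f(\varepsilon) = 0$ for $\varepsilon > \mu_0$ (Fig. 5.11). At 0 K, all states below $\mu_0$ are occupied; all states above $\mu_0$ are empty. The number of states between $\varepsilon$ and $\varepsilon + d\varepsilon$ is given by

\[ W(\varepsilon)\,d\varepsilon = 4\pi(2m/h^2)^{3/2}V\varepsilon^{1/2}\,d\varepsilon \tag{5.8.6} \]

where the extra factor of 2 accounts for the two spin states, $m_s = +1/2$ and $m_s = -1/2$, for each electron. If $N$ is the total number of electrons, then

\[ N = \int_{\varepsilon=0}^{\mu_0} W(\varepsilon)\,d\varepsilon = 4\pi(2m/h^2)^{3/2}V\int_{\varepsilon=0}^{\mu_0}\varepsilon^{1/2}\,d\varepsilon = (8\pi/3)(2m/h^2)^{3/2}V\mu_0^{3/2} \]

So finally:

\[ \mu_0 = (h^2/2m)(3/8\pi)^{2/3}(N/V)^{2/3} \tag{5.8.7} \]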
This quantity $\mu_0$ is called the Fermi energy of the metal. For the metal Na, assuming that there is only one valence electron per atom, along with a molar volume $V = 23\ \mathrm{cm^3\,mol^{-1}}$, Eq. (5.8.7) yields $\mu_0 = 3.1$ eV. The internal energy, or zero-point energy $U_0$ at 0 K is given by

\[ U_0 = 4\pi(2m/h^2)^{3/2}V\int_{\varepsilon=0}^{\mu_0}\varepsilon^{3/2}\,d\varepsilon = (3/5)N\mu_0 \tag{5.8.8} \]
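The sodium estimate can be reproduced from Eqs. (5.8.7) and (5.8.8). The sketch below assumes CODATA constants, one conduction electron per atom, and the molar volume of 23 cm³ mol⁻¹ given above; the function and variable names are illustrative.

```python
import math

H_PLANCK = 6.62607015e-34     # J s
M_ELECTRON = 9.1093837015e-31  # kg
N_AVOGADRO = 6.02214076e23     # mol^-1
K_B = 1.380649e-23             # J K^-1
EV = 1.602176634e-19           # J per eV


def fermi_energy(number_density):
    """Fermi energy mu_0 (J) from Eq. (5.8.7); number_density = N/V in m^-3."""
    return (H_PLANCK ** 2 / (2.0 * M_ELECTRON)) * (
        3.0 * number_density / (8.0 * math.pi)
    ) ** (2.0 / 3.0)


# Sodium: one valence electron per atom, molar volume 23 cm^3 mol^-1.
N_DENSITY_NA = N_AVOGADRO / 23.0e-6          # electrons per m^3
MU0_NA = fermi_energy(N_DENSITY_NA)          # J
U0_NA_MOLAR = 0.6 * N_AVOGADRO * MU0_NA      # J mol^-1, Eq. (5.8.8)

if __name__ == "__main__":
    print(f"N/V          = {N_DENSITY_NA:.3e} m^-3")
    print(f"mu_0         = {MU0_NA / EV:.2f} eV")
    print(f"mu_0/k_B T   = {MU0_NA / (K_B * 300.0):.0f} at 300 K")
    print(f"U_0          = {U0_NA_MOLAR / 1000.0:.0f} kJ mol^-1")
```

With these inputs the Fermi energy comes out near 3.2 eV, close to the value quoted above, and $\mu_0/k_BT$ at room temperature is well inside the range of 40 to 200 that defines the strongly degenerate limit.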
Therefore the conduction electrons contribute nothing to the heat capacity at 0 K! The pressure at 0 K is given by Eq. (5.8.4), except for a new factor of two and a change in the upper limit of integration:

\[ P_0 = 4\pi k_BT(2m/h^2)^{3/2}\int_{\varepsilon=0}^{\mu_0} \ln[1 + \exp\{(\mu_0 - \varepsilon)/k_BT\}]\,\varepsilon^{1/2}\,d\varepsilon \]

which can be approximated by assuming $\exp\{(\mu_0 - \varepsilon)/k_BT\} \gg 1$; then 1 is ignored and the pressure at 0 K becomes

\[ P_0 \approx 4\pi(2m/h^2)^{3/2}\int_{\varepsilon=0}^{\mu_0} (\mu_0 - \varepsilon)\,\varepsilon^{1/2}\,d\varepsilon = (2/5)N\mu_0/V \tag{5.8.9} \]
which is tens of thousands to about a million atmospheres for typical metals! Since Planck’s constant is present, this is a quantum effect. Also, from $G_0 = N\mu_0 = U_0 - TS_0 + P_0V$ and the above, we get that the entropy vanishes at 0 K:

\[ S_0 = 0 \tag{5.8.10} \]

There is no disorder: All states are filled exactly up to the Fermi energy. Even at room temperature, one can make “corrections” as a power series in $(k_BT/\mu_0)$ and do fairly well. In particular, it can be shown after some pain that

\[ C_V^{metal} = \pi^2Nk_B^2T/2\mu_0 \tag{5.8.11} \]
which is of the order of $4.2 \times 10^{-4}\,T$ (in units of J K⁻¹ mol⁻¹). This $C_V^{metal}$ is rather small, in comparison to other heat capacity contributions (translation, vibration); the heat capacity problem for metals is finally solved.

We next sketch the BE case. Again, there are two limits: (a) a “weakly degenerate” ideal BE gas, where the quantum effects are weak, because the factor $\mu/k_BT$ is small enough that a series expansion in $\mu/k_BT$ can converge; (b) a strongly degenerate ideal BE gas, where the factor $\mu/k_BT$ is large ($T$ small, or $\mu$ large), so no series expansion in $\mu/k_BT$ is possible.

(a) Weakly degenerate ideal BE gas. Let us deal with case (a) first, by writing

\[ N = \sum_i [\exp\{(\varepsilon_i - \mu)/k_BT\} - 1]^{-1} \]
\[ PV = -k_BT\sum_i \ln[1 - \exp\{(\mu - \varepsilon_i)/k_BT\}] \]

We must now isolate the ground state and write

\[ N = \frac{1}{\exp\{(\varepsilon_0 - \mu)/k_BT\} - 1} + \sum_{i\neq0}\frac{1}{\exp\{(\varepsilon_i - \mu)/k_BT\} - 1} \]

Since $0 \le \exp\{\mu/k_BT\} < \exp\{\varepsilon_0/k_BT\}$ and $\varepsilon_i > \varepsilon_0$ for $i > 0$, one can integrate

\[ N = \frac{1}{\exp\{(\varepsilon_0 - \mu)/k_BT\} - 1} + 2\pi\left(\frac{2m}{h^2}\right)^{3/2}V\int_{\varepsilon>0}^{\infty}\frac{\exp\{(\mu-\varepsilon)/k_BT\}}{1 - \exp\{(\mu-\varepsilon)/k_BT\}}\,\varepsilon^{1/2}\,d\varepsilon \]

By using the particle-in-the-box energies (lowest level $\varepsilon = 3h^2/8mV^{2/3}$), choosing $\varepsilon_0 = 0$, and remembering that $0 \le \exp(\mu/k_BT) < 1$ (valid for BE, but not for FD!!), we get

\[ \frac{N}{V} = \frac{1}{V}\,\frac{\exp(\mu/k_BT)}{1 - \exp(\mu/k_BT)} + 2\pi\left(\frac{2m}{h^2}\right)^{3/2}\int_{\varepsilon>0}^{\infty}\frac{\exp\{(\mu-\varepsilon)/k_BT\}}{1 - \exp\{(\mu-\varepsilon)/k_BT\}}\,\varepsilon^{1/2}\,d\varepsilon \tag{5.8.12} \]

and similarly for $p$:

\[ p/k_BT = -V^{-1}\ln[1 - \exp(\mu/k_BT)] - 2\pi(2m/h^2)^{3/2}\int_{\varepsilon>\varepsilon_0}^{\infty}\ln[1 - \exp\{(\mu-\varepsilon)/k_BT\}]\,\varepsilon^{1/2}\,d\varepsilon \tag{5.8.13} \]

and these integrals can be evaluated as power series. Virial equations are then constructed.

Strongly Degenerate Ideal BE Gas. Equations (5.8.12) and (5.8.13) can be used as starting points. Virial equations can again be constructed, and BE condensation can be understood.
5.9 MAGNETIC SUSCEPTIBILITIES

General Phenomenology. Figure 5.12 shows schematically the dependence of the magnetization $M$ on the applied external field $H_0$, and Fig. 5.13 shows how $M$ (or the magnetic susceptibility $\chi$) depends on the absolute temperature $T$. The different forms of magnetism can be divided into: individual: (A) paramagnetic, (B) diamagnetic; and collective: (C) ferromagnetic, (D) antiferromagnetic, (E) ferrimagnetic, and (F) metamagnetic. The “individual” cases show what individual atomic, ionic, or molecular moments (or lack thereof) will do, when induced to reorient or respond to $H_0$, while the “collective” ones show how interactions between the magnetic moments can foster ordering into magnetized domains, thus greatly enhancing the magnetic response to $H_0$. The discussion will first focus on the individual (or “dilute”) phenomena of paramagnetism and diamagnetism.
FIGURE 5.12 Schematic dependence of the magnetization $M$ on the external magnetic field $H_0$: (A) paramagnet (the ultimate departure from linearity is due to the Brillouin function); (B) diamagnet; (C) ferromagnet: the initial state (O) can have zero magnetization (domain moments add to 0) and then achieves saturation magnetization $M_s$ in path OBC; thereafter the hysteresis loop CDEFGC is traced “forever”; $M_r$ is the remanence magnetization, $H_c$ the coercivity; (D) antiferromagnet; (E) ferrimagnet; (F) metamagnet.
Dilute Ensemble of Paramagnetic Ions. The magnetic energy for species $i$ in an external magnetic field $\mathbf{H}_0$ is given by

\[ E_i = -\boldsymbol{\mu}_i\cdot\mathbf{H}_0 \tag{5.9.1} \]

If the field $H_0$ is oriented along the laboratory $z$ axis, then

\[ E_i = g_e\beta_eH_0m_i \tag{5.9.2} \]

where $m_i$ is the $z$-component of the total angular momentum vector $\mathbf{J}$: ($m_i = -J, -J+1, \ldots, -1, 0, 1, \ldots, J-1, J$). At the absolute temperature $T$, the probability $p_i$ that the magnetic moment has energy $E_i$ is given by a Boltzmann factor

\[ p_i = D\exp(-E_i/k_BT) = D\exp(-g_e\beta_eH_0m_i/k_BT) \tag{5.9.3} \]
FIGURE 5.13 Temperature dependence of the magnetization $M$ and the magnetic susceptibility $\chi$ for paramagnets, ferromagnets, and antiferromagnets. (A) Paramagnet: Curie law, $\chi = C/T$. (B) Ferromagnetic coupling, Heisenberg $J > 0$ ($H = -2J\,\mathbf{S}_i\cdot\mathbf{S}_j$): ferromagnet with complex behavior below the Curie temperature $T_C$, paramagnet above $T_C$; Curie–Weiss law (for $T > T_C$): $\chi = C/(T - T_C)$. (C) Antiferromagnetic coupling, Heisenberg $J < 0$ ($H = -2J\,\mathbf{S}_i\cdot\mathbf{S}_j$): antiferromagnet below the Néel temperature $T_N$, paramagnet above $T_N$; Curie–Weiss law (for $T > T_N$): $\chi = C/(T - T_N)$, with the intercept marked at $-T_N$ on the $T$ axis.
where $D$ is a constant. Electrons are fermions, but MB statistics are allowed here, because the electrons in a macroscopic sample of interest here are assumed to be far apart and noninteracting. Therefore,

\[ p_i = \exp(-g_e\beta_eH_0m_i/k_BT)\Big/\sum_{m_i=-J}^{J}\exp(-g_e\beta_eH_0m_i/k_BT) \tag{5.9.4} \]

For a macroscopic sample of $N$ paramagnets in a volume $V$, with $\mu_{iz} = -g_e\beta_em_i$, the $z$-component of the magnetization will be

\[ M_z = \frac{1}{V}\sum_{i=1}^{N}\mu_{iz} = N\,\frac{\sum_{m_i=-J}^{J}(-g_e\beta_em_i)\exp(-g_e\beta_em_iH_0/k_BT)}{\sum_{m_i=-J}^{J}\exp(-g_e\beta_em_iH_0/k_BT)} \tag{5.9.5} \]

where $N$ in the final expression is understood per unit volume.
For a field $H_0 = 0.7$ tesla, the ratio $g_e\beta_eH_0/k_B$ is approximately 1 K; that is, this ratio is small compared to, say, room temperature. Under such conditions, one can sum Eq. (5.9.5) with some care, to obtain

\[ M_z = Ng_e\beta_eJB_J(x) \tag{5.9.6} \]

where

\[ x \equiv g_e\beta_eJH_0/k_BT \tag{5.9.7} \]

and $B_J(x)$ is the Brillouin function:

\[ B_J(x) = \frac{2J+1}{2J}\coth\!\left(\frac{(2J+1)x}{2J}\right) - \frac{1}{2J}\coth\!\left(\frac{x}{2J}\right) \tag{5.9.8} \]

where $\coth x \equiv [\exp(x) + \exp(-x)]/[\exp(x) - \exp(-x)]$. This Brillouin function differs somewhat from the Langevin function $L(x)$:

\[ L(x) \equiv \coth x - (1/x) \tag{5.9.9} \]
which is used, instead, when one applies the limit $J \to \infty$ to Eq. (5.9.5). If $x \ll 1$, then the magnetic moment $M_z$ can be approximated by

\[ M_z = Ng_e^2\beta_e^2H_0J(J+1)/3k_BT \qquad (g_e\beta_eH_0J \ll k_BT) \tag{5.9.10} \]

Since

\[ M_z \equiv \chi_0H_0 \tag{5.9.11} \]

the static magnetic susceptibility $\chi_0$ becomes, for $x \ll 1$,

\[ \chi_0 = \frac{Ng_e^2\beta_e^2J(J+1)}{3k_BT} \qquad (g_e\beta_eH_0J \ll k_BT) \tag{5.9.12} \]
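The step from the Boltzmann sum of Eq. (5.9.5) to the Brillouin function of Eq. (5.9.8), and the small-$x$ limit behind Eq. (5.9.12), can be checked numerically. In the sketch below the moment is expressed in units of $g_e\beta_e$ and the field enters only through $y \equiv g_e\beta_eH_0/k_BT$ (so $x = Jy$); the function names and the sample values of $J$ and $y$ are illustrative assumptions.

```python
import math


def coth(z):
    """Hyperbolic cotangent for z > 0."""
    return math.cosh(z) / math.sinh(z)


def brillouin(j, x):
    """Brillouin function B_J(x) of Eq. (5.9.8), for x > 0."""
    a = (2.0 * j + 1.0) / (2.0 * j)
    b = 1.0 / (2.0 * j)
    return a * coth(a * x) - b * coth(b * x)


def mean_moment_direct(j, y):
    """Thermal average of -m_i over m_i = -J..J, weights exp(-m_i y).

    This is Eq. (5.9.5) divided by N g_e beta_e, with y = g_e beta_e H_0 / k_B T.
    """
    n_levels = int(round(2.0 * j)) + 1
    ms = [-j + i for i in range(n_levels)]
    weights = [math.exp(-m * y) for m in ms]
    return sum(-m * w for m, w in zip(ms, weights)) / sum(weights)


def langevin(x):
    """Langevin function of Eq. (5.9.9), for x > 0."""
    return coth(x) - 1.0 / x


if __name__ == "__main__":
    j, y = 2.5, 0.3
    print(f"direct sum      : {mean_moment_direct(j, y):.10f}")
    print(f"J * B_J(J y)    : {j * brillouin(j, j * y):.10f}")
    print(f"B_J(x)/x, x->0  : {brillouin(j, 1e-3) / 1e-3:.6f}   (J+1)/3J = {(j + 1) / (3 * j):.6f}")
    print(f"B_J(2), J=1e4   : {brillouin(1.0e4, 2.0):.6f}   L(2) = {langevin(2.0):.6f}")
```

The direct sum and $JB_J(x)$ agree to rounding error; the initial slope $(J+1)/3J$ is what produces the factor $J(J+1)/3$ in Eq. (5.9.12), and for very large $J$ the Brillouin function merges into the Langevin function.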
Thus $M$ is linear in $H_0$, except when the approximation ($g_e\beta_eH_0J \ll k_BT$) fails; then saturation will set in, as shown in Fig. 5.14, and also sketched at higher $H_0$ in Fig. 5.12A. If one defines the “effective Bohr magneton number” as the dimensionless $\mu_{eff}$, we obtain

\[ \mu_{eff}(J) \equiv g_e[J(J+1)]^{1/2} \tag{5.9.13} \]

Then the static susceptibility reduces to the classical Curie³⁷ or Langevin³⁸ result:

\[ \chi_0 = \frac{N\beta_e^2\mu_{eff}^2}{3k_BT} = \frac{C}{T} \qquad (g_e\beta_eH_0J \ll k_BT) \tag{5.9.14} \]

³⁷ Pierre Curie (1859–1906).
³⁸ Paul Langevin (1872–1946).
FIGURE 5.14 Plot of Brillouin function $B_J(x)$, Eq. (5.9.8), versus $x$.
where $C \equiv N\beta_e^2\mu_{eff}^2/3k_B$ is the Curie constant. There is an excellent fit between Eq. (5.9.12) and experiment.

PROBLEM 5.9.1. Derive Eq. (5.9.6) from Eq. (5.9.5).

PROBLEM 5.9.2. Derive the approximation Eq. (5.9.12) from Eq. (5.9.5).

PROBLEM 5.9.3. Derive the Langevin equation, Eq. (5.9.14).

PROBLEM 5.9.4. Calculate the Langevin function $L(b)$ for small $b$.
In the case of LS or Russell³⁹–Saunders⁴⁰ coupling ($J$ is a good quantum number), and neglecting small corrections due to quantum electrodynamics, the Landé $g$-factor is given by

\[ g_e \equiv 1 + [J(J+1) + S(S+1) - L(L+1)][2J(J+1)]^{-1} \tag{3.20.20} \]
In Table 5.2 are listed several effective Bohr magneton numbers meff for iron-group transition ions and rare-earth ions. For the iron-group transition metal ions a second theoretical guess (“spin-only”) is also listed: meff ðSÞ ge ½SðS þ 1Þ1=2
ð5:9:15Þ
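The two estimates, Eqs. (5.9.13) and (5.9.15), are easy to evaluate for any ion once $L$, $S$, and $J$ are known. The sketch below is illustrative only: the function names are hypothetical, the spin-only value assumes $g_e = 2$, and the quantum numbers used for Gd3+ and Fe2+ are the standard Hund's-rule ground terms, supplied here as assumptions.

```python
import math

def lande_g(L, S, J):
    """Lande g-factor for LS coupling, Eq. (3.20.20). Returns 0 for J = 0,
    where the moment vanishes and g is undefined."""
    if J == 0:
        return 0.0
    return 1 + (J * (J + 1) + S * (S + 1) - L * (L + 1)) / (2 * J * (J + 1))

def mu_eff_J(L, S, J):
    """Effective Bohr magneton number from J, Eq. (5.9.13)."""
    return lande_g(L, S, J) * math.sqrt(J * (J + 1))

def mu_eff_S(S, g=2.0):
    """Spin-only effective Bohr magneton number, Eq. (5.9.15), assuming g = 2."""
    return g * math.sqrt(S * (S + 1))

# Gd3+ (assumed ground term 8S7/2: L = 0, S = 7/2, J = 7/2)
print(f"Gd3+  mu_eff(J) = {mu_eff_J(0, 3.5, 3.5):.2f}")
# Fe2+ (assumed ground term 5D4: L = 2, S = 2, J = 4)
print(f"Fe2+  mu_eff(J) = {mu_eff_J(2, 2, 4):.2f}, mu_eff(S) = {mu_eff_S(2):.2f}")
```

Under these assumptions the Gd3+ value comes out near 7.94 and the Fe2+ values near 6.71 and 4.90, in line with the corresponding entries of Table 5.2.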
It can be seen that for the iron-group ions, where the 3d electrons are "exposed," the LS coupling implied by Eq. (5.9.13) disagrees with the measured moment, while the modified Eq. (5.9.15) is closer to experiment; this equation assumes that the orbital angular momentum quantum number L has no effect, or is "quenched," and that the observed paramagnetism for the iron-group ions is "spin-only." In contrast, for the rare-earth ions, for which the 4f
39 Henry Norris Russell (1877–1957). 40 Frederick Albert Saunders (1875–1963).
Table 5.2 Effective Bohr Magneton Numbers $\mu_{eff}$ (Dimensionless) for Iron-Group Transition Metal and Rare-Earth Ions

| Ion | Configuration | Term Symbol | L | S | J | $\mu_{eff}(J)$ | $\mu_{eff}(S)$ | $\mu_{eff}$(exp) |
|---|---|---|---|---|---|---|---|---|
| Ti3+ | 3d¹ | $^2D_{3/2}$ | 2 | 1/2 | 3/2 | 1.55 | 1.73 | 1.8 |
| V4+ | 3d¹ | $^2D_{3/2}$ | 2 | 1/2 | 3/2 | 1.55 | 1.73 | 1.8 |
| V3+ | 3d² | $^3F_2$ | 3 | 1 | 2 | 1.63 | 2.83 | 2.8 |
| Cr3+ | 3d³ | $^4F_{3/2}$ | 3 | 3/2 | 3/2 | 0.77 | 3.87 | 3.8 |
| V2+ | 3d³ | $^4F_{3/2}$ | 3 | 3/2 | 3/2 | 0.77 | 3.87 | 3.8 |
| Mn3+ | 3d⁴ | $^5D_0$ | 2 | 2 | 0 | 0 | 4.90 | 4.9 |
| Cr2+ | 3d⁴ | $^5D_0$ | 2 | 2 | 0 | 0 | 4.90 | 4.9 |
| Fe3+ | 3d⁵ | $^6S_{5/2}$ | 0 | 5/2 | 5/2 | 5.92 | 5.92 | 5.9 |
| Mn2+ | 3d⁵ | $^6S_{5/2}$ | 0 | 5/2 | 5/2 | 5.92 | 5.92 | 5.9 |
| Fe2+ | 3d⁶ | $^5D_4$ | 2 | 2 | 4 | 6.70 | 4.90 | 5.4 |
| Co2+ | 3d⁷ | $^4F_{9/2}$ | 3 | 3/2 | 9/2 | 6.63 | 3.87 | 4.8 |
| Ni2+ | 3d⁸ | $^3F_4$ | 3 | 1 | 4 | 5.59 | 2.83 | 3.2 |
| Cu2+ | 3d⁹ | $^2D_{5/2}$ | 2 | 1/2 | 5/2 | 3.55 | 1.73 | 1.9 |
| Ce3+ | 4f¹5s²5p⁶ | $^2F_{5/2}$ | 3 | 1/2 | 5/2 | 2.54 | | 2.4 |
| Pr3+ | 4f²5s²5p⁶ | $^3H_4$ | 5 | 1 | 4 | 3.58 | | 3.5 |
| Nd3+ | 4f³5s²5p⁶ | $^4I_{9/2}$ | 6 | 3/2 | 9/2 | 3.62 | | 3.5 |
| Pm3+ | 4f⁴5s²5p⁶ | $^5I_4$ | 6 | 2 | 4 | 2.68 | | – |
| Sm3+ | 4f⁵5s²5p⁶ | $^6H_{5/2}$ | 5 | 5/2 | 5/2 | 0.84 | | 1.5 |
| Eu3+ | 4f⁶5s²5p⁶ | $^7F_0$ | 3 | 3 | 0 | 0 | | 3.6 |
| Gd3+ | 4f⁷5s²5p⁶ | $^8S_{7/2}$ | 0 | 7/2 | 7/2 | 7.94 | | 8.0 |
| Tb3+ | 4f⁸5s²5p⁶ | $^7F_6$ | 3 | 3 | 6 | 9.72 | | 9.5 |
| Dy3+ | 4f⁹5s²5p⁶ | $^6H_{15/2}$ | 5 | 5/2 | 15/2 | 10.63 | | – |
| Ho3+ | 4f¹⁰5s²5p⁶ | $^5I_8$ | 6 | 2 | 8 | 10.60 | | 10.4 |
| Er3+ | 4f¹¹5s²5p⁶ | $^4I_{15/2}$ | 6 | 3/2 | 15/2 | 9.59 | | 9.5 |
| Tm3+ | 4f¹²5s²5p⁶ | $^3H_6$ | 5 | 1 | 6 | 7.57 | | 7.3 |
| Yb3+ | 4f¹³5s²5p⁶ | $^2F_{7/2}$ | 3 | 1/2 | 7/2 | 4.54 | | 4.5 |

Note: $\mu_{eff}(J)$ [Eq. (5.9.13)] works better for rare-earth ions, while the spin-only value $\mu_{eff}(S)$ [Eq. (5.9.15)] works better for transition metal ions ("quenching") [10].
electrons are well "buried" below outer-shell electrons, Eq. (5.9.13) works well: The quantum number L makes a contribution, and, except for Eu3+, the Russell–Saunders LS coupling model is valid. The paramagnetism in transition metal salts is affected by the presence of symmetry-dependent crystalline electric fields, which can distort the electronic configuration of the bare ion significantly, by modifying the hybridization. This is treated by ligand-field theory.
Diamagnetism. All atoms and molecules have an intrinsic diamagnetism, due to currents induced by the field in the sample (or the sample holder). One simple derivation starts from the Larmor41 precession:
$$\omega_L = (g_L \mu_B/\hbar) H_0 = g_L (e/2m_e) H_0 \quad \text{(SI)}; \qquad \omega_L = (g_L \mu_B/\hbar) H_0 = g_L (e/2m_e c) H_0 \quad \text{(cgs)} \tag{3.20.24}$$
41 Sir Joseph Larmor (1857–1942).
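The Larmor precession produces a diamagnetic current $I$ induced by $H_0$:
$$I = -(Ze^2 H_0/2m_e)(1/2\pi) \quad \text{(SI)}; \qquad I = -(Ze^2 H_0/2m_e c)(1/2\pi c) \quad \text{(cgs)}$$
which, when multiplied by the area of the loop, produces the induced moment $\mu$:
$$\mu/H_0 = -(Ze^2/4m_e c^2)\langle \rho^2 \rangle \quad \text{(cgs)}$$
where $\langle \rho^2 \rangle$ is the average of the square of the perpendicular distance of the electron from the axis of the external field $H_0$: $\langle \rho^2 \rangle = \langle x^2 \rangle + \langle y^2 \rangle = (2/3)(\langle x^2 \rangle + \langle y^2 \rangle + \langle z^2 \rangle) = (2/3)\langle r^2 \rangle$. Hence
$$\chi = -N Z e^2 \langle r^2 \rangle / 6 m_e c^2 \quad \text{(cgs)} \tag{5.9.16}$$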
where $N$ is the number of atoms per unit volume. Thus the problem of computing the diamagnetism is reduced, at least formally, to computing $\langle r^2 \rangle$, the second moment of the electron distribution. This derivation has assumed implicitly that the $H_0$ axis is also an axis of symmetry of the atom or molecule. In general, this is not the case; Van Vleck42 provided a correction, enlisting second-order perturbation theory, to obtain the total molar susceptibility:
$$\chi_M = -\frac{N_A Z e^2 \langle r^2 \rangle}{6 m_e c^2} + 2 N_A \sum_n \frac{|\langle n|\mu_z|0\rangle|^2}{E_n - E_0} \tag{5.9.17}$$
where $N_A$ is Avogadro's number, $E_n$ is the nth molecular energy level, and $\langle n|\mu_z|0\rangle$ is the matrix element for the magnetic transition moment between the ground state and state $n$. This means that the overall $\chi_M < 0$ if the first term dominates, but $\chi_M > 0$ if the second term dominates. The diamagnetism for atoms can be obtained unambiguously from experiment (see Table 5.3). The diamagnetism for ions, also shown in Table 5.3, can be inferred from experiment by comparing salts with common ions (there is always a slight problem starting such correlations). For molecules, by simply assuming isotropic contributions, the diamagnetism can be estimated substituent group by substituent group, by using Pascal's43 empirical constants (listed in Table 5.3). However, there are very significant anisotropies, for example for aromatic compounds, where the ring currents contribute greatly to the diamagnetism (and also to NMR chemical shifts, as we will see later); then vectorial group contributions to $\chi$ must be used, and Pascal's constants do not apply. The discussion now shifts to collective behavior.
42 John Hasbrouck Van Vleck (1899–1980). 43 Paul Victor Henri Pascal (1880–1968).
Table 5.3 Experimental Atom Susceptibilities, Experiment-Based Ionic Contributions to Diamagnetic Susceptibilities, and Pascal's Group Constants for the Calculation of the Molecular Diamagnetic Susceptibilities $\chi$ (in units of $10^{-6}$ cm³ mol⁻¹) [11]

| Species | $\chi$ | Species | $\chi$ |
|---|---|---|---|
| Al3+ | 13.0 | Cs+ | 35.1 |
| Ar | 19.4 | F (aliphatic) | 6.3 |
| As | 21. | F⁻ | 9.4 |
| As3+ | 20.9 | Fe | 13 |
| B | 7.3 | H+ (ion) | 0 |
| B3+ | 7 | H (covalent) | 2.93 |
| Ba2+ | 29.0 | He | 1.9 |
| Bi5+ | 192 | Hg2+ | 41.5 |
| Br (aliphatic) | 30.6 | I (aliphatic) | 44.6 |
| Br (aromatic) | 26.5 | I (aromatic) | 40.5 |
| Br⁻ | 34.5 | I⁻ | 50.6 |
| Br corr.a in -CH2Br | 1.5 | K+ | 14.6 |
| Br corr. in -CHBr2 | 0.5 | Kr | 28.0 |
| Br corr. in -CBr3 | 10.6 | Li+ | 0.7 |
| C (covalent) | 6.00 | Mg2+ | 4.3 |
| C4+ | 6 | N (open-chain) | 5.55 |
| C: CH2 group | 11.36 | N5+ | 2.1 |
| C=C bond corr. | 5.5 | N (monoamide) | 1.54 |
| C≡C bond corr. | 0.8 | N (diamide) | 2.11 |
| C=C-C=C corr. | 10.6 | N (imide) | 2.11 |
| C=N bond corr. | 8.15 | N: N=N bond corr. | 1.85 |
| C≡N bond corr. | 0.8 | Na+ | 6.1 |
| C: Cyclopropane ring corr. | 3.4 | Ne | 7.2 |
| C: Cyclobutane ring corr. | 1.1 | Ni | 13 |
| C: Cyclopentane ring corr. | 0 | O (alcohol) | 4.60 |
| C: Cyclohexane ring corr. | 3.1 | O (ether) | 4.60 |
| C: Cyclohexene ring corr. | 7.2 | O (aldehyde) | 1.66 |
| C: Cyclohexadiene ring corr. | 10.7 | O (ketone) | 1.66 |
| C: Piperidine ring corr. | 3.6 | O (carboxyl) | 7.95 |
| C: Piperazine ring corr. | 7.5 | O2⁻ | 12 |
| C: Pyrazoline ring corr. | 8.3 | P | 16. |
| C: Glyoxaline ring corr. | 7.8 | P5+ | 26.2 |
| C: Benzene ring corr. | 1.4 | Pb2+ | 46 |
| C: Pyridine ring corr. | 0.5 | Rb+ | 22.0 |
| C: Triazine ring corr. | 1.4 | S | 15.2 |
| C: Furan ring corr. | 2.5 | S2⁻ | 15 |
| C: Pyrrole ring corr. | 3.5 | Se | 5.55 |
| Ca2+ | 10.7 | Se2⁻ | 23 |
| Cl (aliphatic) | 20.1 | Si | 13.0 |
| Cl (aromatic) | 17.2 | Si4+ | 20 |
| Cl corr. in CH2Cl | 0.3 | Sn4+ | 30 |
| Cl corr. in CHCl2 | 0.3 | Sr2+ | 18.0 |
| Cl corr. in CCl3 | 2.5 | Te2⁻ | 37.3 |
| Cl⁻ | 24.2 | Xe | 43.0 |
| Co | 13 | Zn | 13.5 |

a Corr. means that the values cited in Ref. [11] were corrections to previously accepted values.
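The additive use of Pascal's constants can be sketched in a few lines. This is an illustration only, under stated assumptions: the dictionary names are hypothetical, the covalent C and H magnitudes and the benzene ring correction are taken from Table 5.3, and all three are assumed to carry a negative (diamagnetic) sign, which is the usual convention for these particular entries.

```python
# Pascal-type additive estimate of a molar diamagnetic susceptibility,
# in units of 1e-6 cm^3/mol. Signs are an assumption (negative = diamagnetic).
PASCAL_ATOM = {"C": -6.00, "H": -2.93}
RING_CORRECTION = {"benzene": -1.4}

def pascal_chi(atom_counts, corrections=()):
    """Sum atomic constants times atom counts, then add structural corrections."""
    total = sum(PASCAL_ATOM[atom] * count for atom, count in atom_counts.items())
    return total + sum(corrections)

# Benzene, C6H6: six covalent C, six covalent H, one benzene-ring correction.
chi_benzene = pascal_chi({"C": 6, "H": 6}, [RING_CORRECTION["benzene"]])
print(f"chi(benzene) = {chi_benzene:.2f} x 1e-6 cm^3/mol")
```

With these inputs the estimate is about -55 in the table's units. As the text notes, such an isotropic sum says nothing about the large anisotropy of the ring-current contribution.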
Ferromagnetism. The iron-group elements (in group VIII, or groups 8 to 10 in the periodic table), and selected other compounds show a greatly enhanced magnetization. Figure 5.12C shows that the magnetization of a ferromagnet may start at zero, but rises as an S-shaped curve, then reaches a saturation magnetization $M_s$, and, when cycled repeatedly, will form an almost rectangular M–H loop with a remanence magnetization $M_r$ at $H_0 = 0$, a coercive force $H_c$, and so on. These effects were first attributed phenomenologically (and correctly) by Weiss44 to the growth of uniaxially ordered magnetic domains, then explained theoretically by Heisenberg45 as due to an exchange Hamiltonian $\hat{H}_{ex}$, which is the sum over the whole crystal of all pairwise interactions between spins $S_i$ and $S_j$ (the orbital angular momenta do not contribute), coupled by a pairwise exchange integral $J_{ij}$ (whose distance dependence is, alas, unknown):
$$\hat{H}_{ex} = -2 \sum_{i=2}^{N} \sum_{j=1}^{i-1} J_{ij}\, \hat{S}_i \cdot \hat{S}_j \tag{5.9.18}$$
Exact forms of a distance-dependent $J_{ij}$ are not available. Before going into details, Table 5.4 lists the saturation magnetization $M_s$, the effective number of Bohr magnetons per atom, and the ferromagnetic Curie temperature $T_C$ for selected ferromagnets. The Heisenberg model of spin exchange, Eq. (5.9.18), is valid if the magnetic species are far enough apart that the overlap of their electronic wavefunctions is small; if the orbital angular momentum L contributes to the susceptibility, then the coupling may depend on the absolute (i.e., crystal axes-dependent) spin orientations. In detail, the spin–spin interactions can be due to (A) direct exchange (due to small overlap between the magnetic species), (B) superexchange (when mediated by an intermediary set of nonmagnetic species with induced diamagnetism), or (C) indirect or itinerant exchange with conduction electrons (Fig. 5.15). Weiss attempted to explain ferromagnetism by introducing an exchange field, or molecular field $H_{ex}$, which he assumed to be proportional to the magnetization $M$: $H_{ex} \equiv \lambda M$ (this putative $H_{ex}$ is of the order of hundreds of teslas!). The overall field becomes $H_0 + H_{ex}$; using this in the Curie law:
$$M/(H_0 + H_{ex}) = C/T = M/(H_0 + \lambda M) \tag{5.9.19}$$
Weiss got
$$\chi = M/H_0 = C/(T - C\lambda) \tag{5.9.20}$$
44 Pierre-Ernest Weiss (1865–1940). 45 Werner Heisenberg (1901–1976).
Table 5.4 Data for Ferromagnets: Saturation Magnetizationa $M_s$, Coercivity $H_c$, Effective Number of Bohr Magnetons per Atom $n_{eff}$, and Ferromagnetic Curie Temperature $T_C$

| Compound | $M_s$ at 0 K (oersted) | $H_c$ (oersted) | $n_{eff}$ at 0 K (Bohr magnetons) | $T_C$ (K) |
|---|---|---|---|---|
| α-Fe | 1715 | 2 | 2.221 | 1043 |
| Co | 1434 | 20 | 1.716 | 1394 |
| Ni | 486 | 150 | 0.606 | 627 |
| SmCo5 | 1275 | 56,000 | | Mixture |
| Nd2Fe14B | 1273 | 10,000 | | Mixture |
| Gd | 2060 | | 7.10 | 293 |
| CrO2 | 512 | 650 | 2.03 | 387 |
| MnB | 675 | | 3.52 | 630 |
| 45 Permalloy (45% Ni, 55% Fe) | 1273 | 0.01 | | 673 |
| Alnico (10% Al, 20% Ni, 15% Co, 55% Fe) | 1100 | 1500 | | 750 |
| Fe48Pt52 | | >13,000 | | 670 |
| VxTCNE·0.5CH2Cl2 | | 60 | | >350 |

a Given in oersteds or cgs-emu; in gauss, it is 4π times the number listed.
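and by defining the ferromagnetic Curie temperature as
$$T_C \equiv C\lambda \tag{5.9.21}$$
he obtained the Curie–Weiss law:
$$\chi = \frac{M}{H_0} = \frac{C}{T - C\lambda} = \frac{C}{T - T_C} \tag{5.9.22}$$
If in Eq. (5.9.18) one assumes an exchange energy of about $-2\langle J \rangle z \langle S \rangle^2$ ($z$ being the number of nearest neighbors), then the Curie temperature $T_C$ can be linked, albeit approximately, to an average Heisenberg $\langle J \rangle$:
$$\langle J \rangle \approx 3 k_B T_C / 2 z S(S+1) \tag{5.9.23}$$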
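As a rough numerical illustration of Eqs. (5.9.22) and (5.9.23), the sketch below evaluates the Curie–Weiss susceptibility and the mean exchange integral. The function names are hypothetical, and the inputs for iron ($z = 8$ nearest neighbors in the bcc lattice and $S = 1$) are assumptions made here for the example; only $T_C = 1043$ K comes from Table 5.4.

```python
K_B = 1.380649e-23       # Boltzmann constant, J/K
EV = 1.602176634e-19     # joules per electron volt

def curie_weiss_chi(T, C, T_C):
    """Curie-Weiss susceptibility, Eq. (5.9.22); only meaningful for T > T_C."""
    if T <= T_C:
        raise ValueError("Curie-Weiss law applies only above the Curie temperature")
    return C / (T - T_C)

def mean_exchange_integral(T_C, z, S):
    """Average exchange integral <J> in joules, Eq. (5.9.23)."""
    return 3 * K_B * T_C / (2 * z * S * (S + 1))

# alpha-Fe: T_C from Table 5.4; z = 8 and S = 1 are assumed for illustration.
J_fe = mean_exchange_integral(1043.0, z=8, S=1.0)
print(f"<J> = {J_fe:.3e} J = {1e3 * J_fe / EV:.2f} meV")

# Susceptibility (in units of C) at two temperatures above T_C.
print(curie_weiss_chi(1100.0, C=1.0, T_C=1043.0),
      curie_weiss_chi(1500.0, C=1.0, T_C=1043.0))
```

With these assumed inputs $\langle J \rangle$ is of the order of 10 meV, and the susceptibility diverges as $T$ approaches $T_C$ from above.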
FIGURE 5.15 Cartoon for exchange interactions: (A) direct exchange; (B) superexchange; (C) indirect exchange with conduction electrons.
FIGURE 5.16 Ferromagnetic domains. (A) Single-domain magnet: the magnetic lines of force go from the N pole to the S pole. (B) Two-domain situation, with an intermediate region called the domain wall, or “Bloch wall.” This “splintering” into domains can continue until the energetics stop it. (C, F): “Domains of closure”: The magnetizations cancel, and the magnetic energy and the coercivity Hc are both zero. (D) Change from (C), where the external field H0 causes some domains to grow at the expense of others. (E) Change from (C), where the external field H0 causes the magnetizations of certain domains to rotate.
The magnetic domains can be larger or smaller than individual crystallites; the competing (if somewhat ad hoc) energies that dictate their size, growth, rotation, and so on, can be used to explain not only the M–H curve of Fig. 5.12C, but also the fundamental practical division of ferromagnets into permanent magnets (which need a high coercivity $H_c$ to keep their magnetization over time) and cores for electrical transformers (which need a relatively low coercivity $H_c$). Figure 5.16 depicts several orientations and growth patterns of domains. The domain formation and movement are dominated by three energies: (1) the exchange energy, Eq. (5.9.18); (2) the crystalline anisotropy energy, which favors the alignment of the magnetization along preferred crystalline axes (e.g., along the hexagonal axis for hexagonal cobalt), and sets the crystallographic "easy axes" of magnetization, along which $M$ rises steeply as the external field $H$ is increased, and the "hard axes," along which the magnetization changes only with some difficulty; and (3) the magnetic energy:
$$E_M = (8\pi)^{-1} \int H^2 \, dV \tag{5.9.24}$$
The experimental coercivities can vary greatly, from $H_c = 2$ T for a FePt magnet, to a tiny $H_c = 4 \times 10^{-7}$ T for supermalloy. Present Frontier of Flexible Magnetic Media. Since the 1950s, the density of magnetic recording (bits or bytes per unit area) has increased as dramatically as has the density of electronic components in integrated circuits. The recording "magnetic pigment" of choice has been α-Fe2O3, then CrO2, then finally α-Fe. Making nanoparticles of Fe (10 nm diameter or above) in the absence of oxygen is not difficult, but protecting them from air and moisture is
Table 5.5 Antiferromagnetic Néel Temperatures $T_N$

| Compound | $T_N$/K |
|---|---|
| MnO | 122 |
| FeO | 198 |
| α-Fe2O3 | 950 |
| CoO | 291 |
| Cr | 311 |
| NiO | 600 |
| KMnF3 | 88.3 |
| MnF2 | 67.24 |
non-trivial; in addition, spherical or equiaxial Fe particles have little anisotropy. The present starting material is α-goethite, or nonmagnetic α-FeOOH, which grows as monoclinic needles. When α-FeOOH is reduced in H2 at 1100 K, the structure collapses onto itself, but retains its acicular shape, yielding α-Fe particles with aspect ratios of about 3:1 ("rice-shaped particles"). These pyrophoric particles are immersed in toluene in a glove box (toluene has no dissolved water impurity); the suspension of α-Fe in toluene is then brought out to air, where the particles oxidize slowly and safely on their surface, forming a thin but impervious layer of oxide, after which the particles are no longer subject to corrosion by water and exfoliation. Present Frontier of Rigid Magnetic Media. The present gigabyte magnetic disks have FeCr or FePt.
Antiferromagnets. The Heisenberg exchange mechanism, Eq. (5.9.18), can also give rise to antiferromagnets (below the Néel46 temperature), when the coupling of nearest-neighbor spins is antiparallel; above the Néel temperature, paramagnetism sets in. Table 5.5 gives Néel temperature data for some antiferromagnets. Ferrimagnets. A modified form of antiferromagnetism (for example, for binary species with two different spins on nearest-neighbor species) has nearest-neighbor spins oriented antiparallel to each other, but, since they are different in size, this gives rise to a net magnetization, which is intermediate between that of antiferromagnets and ferromagnets. Three ferrimagnetic compounds are listed in Table 5.6.
5.10 ELECTRIC SUSCEPTIBILITIES [12, 13]
Given an ensemble of static electric dipole moments of magnitude $\mu$ and random orientation in an external static electric field $E$, we can use the canonical ensemble partition function to compute the average moment
46 Louis Eugène Félix Néel (1904–2000).
Table 5.6 Data for Ferrimagnets: Saturation Magnetization $M_s$ at 0 K, Effective Number of Bohr Magnetons per Atom $n_{eff}$, and Curie Temperature $T_C$

| Compound | $M_s$ (oersted) | $n_{eff}$ | $T_C$ (K) |
|---|---|---|---|
| CoFe2O4 | 475 | | 793 |
| FeO·Fe2O3 (magnetite) | 510 | 4.2 | 848 |
| Ba ferrite, BaFe2O4 | – | – | 733 |
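$\langle \mu \rangle$ at a temperature $T$, as an orientational average (quantum effects are small here, so that all orientations can be assumed to be equally likely):
$$\langle \mu \rangle = \frac{\displaystyle\int_{\theta=0}^{\theta=\pi} \sin\theta \, d\theta \int_{\varphi=0}^{\varphi=2\pi} d\varphi \, A \exp(-\Delta U/k_B T)\, \mu \cos\theta}{\displaystyle\int_{\theta=0}^{\theta=\pi} \sin\theta \, d\theta \int_{\varphi=0}^{\varphi=2\pi} d\varphi \, A \exp(-\Delta U/k_B T)}$$
or
$$\langle \mu \rangle \equiv \mu L(a) \approx \mu a/3 \tag{5.10.1}$$
where $k_B$ is Boltzmann's constant, $T$ is the absolute temperature, $L(x)$ is the Langevin function, Eq. (5.9.9), and $a$ is given by
$$a \equiv \mu E/k_B T \tag{5.10.2}$$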
The linear approximation $L(a) \approx a/3$ is valid up to electric fields $E$ of several hundred kV m⁻¹ (see Problem 5.10.2). Using this approximation, we obtain
$$\langle \mu \rangle \approx \mu a/3 = \mu^2 E/3 k_B T \tag{5.10.3}$$
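How quickly the linear approximation degrades can be seen by comparing $L(a)$ with $a/3$ directly. The following is a small sketch with illustrative function names, not a prescribed solution:

```python
import math

def langevin(a):
    """Langevin function L(a) = coth(a) - 1/a, for a > 0."""
    return 1.0 / math.tanh(a) - 1.0 / a

def relative_error_linear(a):
    """Relative error made by replacing L(a) with a/3."""
    exact = langevin(a)
    return (a / 3.0 - exact) / exact

for a in (0.1, 0.2, 0.5, 1.0, 2.0):
    print(f"a = {a:3.1f}: L(a) = {langevin(a):.5f}, a/3 = {a / 3:.5f}, "
          f"relative error = {100 * relative_error_linear(a):.2f}%")
```

The error grows roughly as $a^2/15$ for small $a$: it is well below 1% at $a = 0.1$, but reaches several percent at $a = 1$ and roughly a quarter at $a = 2$.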
PROBLEM 5.10.1. Prove Eq. (5.10.1).
PROBLEM 5.10.2. Check the validity of Eq. (5.10.3) for a = 0.1, 0.2, 0.5, 1.0, and 2.0.
The above $\langle \mu \rangle$ yields only the orientation polarization $P_{or}$:
$$P_{or} = (\mu^2/3 k_B T) E \tag{5.10.4}$$
due to the temperature-dependent distribution of orientations of the permanent molecular static electric dipole moments $\mu$ in an electric field $E$. Another contribution to the polarization is due to the molecular static
FIGURE 5.17 The Debye sphere. [Schematic: the test molecule at the center of a spherical cavity of radius a, with other polar molecules inside the cavity, embedded in a dielectric between capacitor plates; the labeled displacements are D1 = ε0E + P, D2 = −P, and D3.]
electric polarizability $\alpha$, which gives the induction polarization $P_{ind}$ (a term coined by Faraday):
$$P_{ind} = \alpha E \tag{5.10.5}$$
These results should be transformed from an individual-molecule basis to a molar basis (by multiplying everything by Avogadro's number $N_A$), but also need a valid or effective electric field $E$ at the molecule. If the molecule is not isolated, one must calculate the effect of the medium (gas, liquid, or solid) in which the test molecule finds itself. In other words, we seek the local or effective field $E_{eff}$. Consider (Fig. 5.17) the interaction between (a) a single molecule with static electric dipole polarizability $\alpha$ and static electric dipole moment $\mu$ placed at the origin of an imaginary spherical cavity (the Debye sphere, with radius $a$) with an external electric field $E$ (supplied by the two plates of a parallel-plate capacitor) and (b) a dielectric material (gaseous, liquid, or solid) with dielectric constant $\varepsilon$ placed between the capacitor plates, with the Debye sphere hollowed out within it. It is assumed that $a$ is either of the size of the molecule, or somewhat larger, to accommodate other polar molecules in the vicinity of the "test" molecule. The derivation that follows is valid for gases, for liquids, or for solids with cubic symmetry. The electric field due to the test molecule is assumed to be negligible. The test molecule will feel the sum of four electric displacements:
$$D_{eff} = D_1 + D_2 + D_3 + D_4 \tag{5.10.6}$$
Here $D_1$ is due to the capacitor, $D_2$ is due to the surface charges on the dielectric, $D_3$ is due to the surface charges on the interior of the spherical cavity, and $D_4$ is due to any other polar molecules with dipole moments $p$ and random orientation that happen to be in the same spherical cavity as our test molecule. In detail:
$$D_1 = \varepsilon_0 E + P \quad \text{(SI)}; \qquad D_1 = E + 4\pi P \quad \text{(cgs)} \tag{5.10.7}$$
$$D_2 = -P \quad \text{(SI)}; \qquad D_2 = -4\pi P \quad \text{(cgs)} \tag{5.10.8}$$
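Next: For $D_3$ the field at the center of the cavity due to the surface charge density $P \cos\theta$ on the surface of the cavity is given by
$$D_3 = \varepsilon_0 (4\pi\varepsilon_0)^{-1} \int_{\theta=0}^{\theta=\pi} 2\pi a^2 |P| \sin\theta \, d\theta \, \cos^2\theta / a^2 = P/3 \quad \text{(SI)} \tag{5.10.9}$$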
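In cgs units, the corresponding cavity contribution is $D_3 = 4\pi P/3$. For $D_4$ we must consider that at a distance $r$ from a dipole moment $p$ the electric potential is
$$\phi = (4\pi\varepsilon_0)^{-1}\, \mathbf{p} \cdot \mathbf{r}\, r^{-3}; \quad \text{whence} \quad \mathbf{E}_4 = -\nabla\phi = -(4\pi\varepsilon_0)^{-1} \left[ \mathbf{p}\, r^{-3} - 3 (\mathbf{p} \cdot \mathbf{r})\, \mathbf{r}\, r^{-5} \right] \tag{5.10.10}$$
The sum over all the contributions to the x-component of the electric field is
$$\sum_i E_{x_i} = -(4\pi\varepsilon_0)^{-1} \sum_i \left[ p_x r^{-3} - 3 \left( p_x x^2 + p_y xy + p_z xz \right) r^{-5} \right] \tag{5.10.11}$$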
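But if the crystal is cubic, then $\langle x^2 \rangle = \langle y^2 \rangle = \langle z^2 \rangle = r^2/3$ and $\langle xy \rangle = \langle yz \rangle = \langle zx \rangle = 0$, so the net contribution to $E_4$ vanishes, and thus $D_4 = 0$. If the crystal symmetry is lower than cubic, $D_4$ may be nonzero. In summary, we have
$$D_{eff} = D_1 + D_2 + D_3 + D_4 = \varepsilon_0 E + P - P + P/3 + 0 = \varepsilon_0 E + P/3 \quad \text{(SI)}$$
$$D_{eff} = D_1 + D_2 + D_3 + D_4 = E + 4\pi P - 4\pi P + 4\pi P/3 + 0 = E + (4\pi/3) P \quad \text{(cgs)} \tag{5.10.12}$$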
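The cancellation of the dipole sum for cubic symmetry can be verified numerically. The sketch below is an illustration under stated assumptions: identical dipoles $p$ are placed on the sites of a simple cubic lattice (unit spacing) inside a sphere centered on the test molecule, and the bracketed sum of Eq. (5.10.11) is evaluated without the constant prefactor. The function name and the chosen dipole components are hypothetical.

```python
import itertools

def dipole_sum_x(n_max, p):
    """Bracketed sum of Eq. (5.10.11) over simple-cubic sites with 0 < r <= n_max.

    p = (p_x, p_y, p_z) is the common dipole moment assumed for every site.
    """
    total = 0.0
    rng = range(-n_max, n_max + 1)
    for x, y, z in itertools.product(rng, repeat=3):
        r2 = x * x + y * y + z * z
        if r2 == 0 or r2 > n_max * n_max:
            continue  # skip the origin and sites outside the sphere
        r = r2 ** 0.5
        total += p[0] / r**3 - 3 * (p[0] * x * x + p[1] * x * y + p[2] * x * z) / r**5
    return total

total = dipole_sum_x(10, p=(0.3, -1.2, 0.7))
print(f"lattice sum = {total:.3e}")
```

The result is zero to within rounding error, because within a sphere the sums of $x^2 r^{-5}$, $y^2 r^{-5}$, and $z^2 r^{-5}$ are equal and the cross terms cancel pairwise, which is the content of the statement $D_4 = 0$.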
Another way of saying this is that the effective, local, Clausius47–Mossotti48 or Lorentz49 field at the test molecule is
$$E_{eff} = E + P/3\varepsilon_0 \quad \text{(SI)}; \qquad E_{eff} = E + (4\pi/3) P \quad \text{(cgs)} \tag{5.10.13}$$
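Consider initially only the induction polarization $P_{ind}$ of Eq. (5.10.5), suitably multiplied by Avogadro's number and the effective field $E_{eff}$ of Eq. (5.10.13):
$$P_{ind} = N_A \alpha E_{eff} \tag{5.10.14}$$
Dropping the suffix "ind" and using Eq. (5.10.13), we obtain
$$P = N_A \alpha (E + P/3\varepsilon_0) \quad \text{(SI)}; \qquad P = N_A \alpha (E + 4\pi P/3) \quad \text{(cgs)} \tag{5.10.15}$$
The scalar first-order electric susceptibility $\chi^{(1)}$ is defined by
$$P = \chi^{(1)} \varepsilon_0 E \quad \text{(SI)}; \qquad P = \chi^{(1)} E \quad \text{(cgs)} \tag{2.7.10}$$
Inserting Eq. (2.7.10) in Eq. (5.10.15) permits the elimination of $P$, then $E$:
$$\chi^{(1)} \varepsilon_0 = N_A \alpha \left[ 1 + \chi^{(1)}/3 \right] \quad \text{(SI)}; \qquad \chi^{(1)} = N_A \alpha \left[ 1 + 4\pi \chi^{(1)}/3 \right] \quad \text{(cgs)} \tag{5.10.16}$$
This can be solved for $N_A \alpha$ (now neglecting the superscript on $\chi$):
$$N_A \alpha/\varepsilon_0 = \chi/(1 + \chi/3) = 3\chi/(3 + \chi) \quad \text{(SI)}; \qquad N_A \alpha = 3\chi/(3 + 4\pi\chi) \quad \text{(cgs)} \tag{5.10.17}$$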
47 Rudolf Julius Emanuel Clausius = Rudolf Gottlieb (1822–1888). 48 Ottaviano Fabrizio Mossotti (1791–1863). 49 Hendrik Antoon Lorentz (1852–1928).
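Since the scalar dielectric constant (also called the specific inductive capacity) is defined by
$$\varepsilon \equiv 1 + \chi^{(1)} \quad \text{(SI)}; \qquad \varepsilon \equiv 1 + 4\pi \chi^{(1)} \quad \text{(cgs)} \tag{5.10.18}$$
where $\chi^{(1)}$ is the first-order static electric dipole susceptibility, Eq. (5.10.17) becomes the Mossotti–Clausius equation:
$$\frac{N_A \alpha}{3\varepsilon_0} = \frac{\varepsilon - 1}{\varepsilon + 2} \quad \text{(SI)}; \qquad \frac{4\pi N_A \alpha}{3} = \frac{\varepsilon - 1}{\varepsilon + 2} \quad \text{(cgs)} \tag{5.10.19}$$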
Sideline. In 1850 Mossotti and in 1879 Clausius showed that for any given substance the ratio $(\varepsilon - 1)/(\varepsilon + 2)$ should be (and indeed is) proportional to the density of the substance. Luckily, the static electric dipole polarizability $\alpha$ is very close to the atomic volume (or the molecular volume). Decades before the advent of X-ray structure determination, Eq. (5.10.19) allowed estimates of the size of atoms and molecules from measurements of the density and the dielectric constant. At optical frequencies, for magnetic permeability $\mu = 1$, $\varepsilon = n^2$ [see Eq. (2.7.42)], where $n$ is the refractive index; with this, Eq. (5.10.19) turns into the Lorentz–Lorenz50 equation:
$$\frac{N_A \alpha}{3\varepsilon_0} = \frac{n^2 - 1}{n^2 + 2} \quad \text{(SI)}; \qquad \frac{4\pi N_A \alpha}{3} = \frac{n^2 - 1}{n^2 + 2} \quad \text{(cgs)} \tag{5.10.20}$$
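The historical use of these equations to estimate molecular sizes can be reproduced with a short calculation. The sketch below uses the cgs Lorentz–Lorenz form, $4\pi N \alpha/3 = (n^2 - 1)/(n^2 + 2)$, with the number density taken as $N = N_A \rho / M$; this reading of the prefactor, the function name, and the approximate room-temperature values for liquid water ($n = 1.333$, $M = 18.015$ g mol⁻¹, $\rho = 0.997$ g cm⁻³) are assumptions made for illustration.

```python
import math

N_A = 6.02214076e23  # Avogadro's number, 1/mol

def polarizability_volume_cm3(n, molar_mass, density):
    """Polarizability (cgs, cm^3) from the Lorentz-Lorenz relation.

    n: refractive index; molar_mass in g/mol; density in g/cm^3.
    Assumes the number density is N_A * density / molar_mass.
    """
    number_density = N_A * density / molar_mass          # molecules per cm^3
    ratio = (n * n - 1.0) / (n * n + 2.0)
    return 3.0 * ratio / (4.0 * math.pi * number_density)

alpha_water = polarizability_volume_cm3(1.333, 18.015, 0.997)
print(f"alpha(water) = {alpha_water * 1e24:.2f} cubic angstroms")
```

The result is of the order of one cubic angstrom, comparable to a molecular volume, which is the point of the Sideline above.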
Using these same ideas for polar substances (i.e., including $P_{or}$ as well as $P_{ind}$), one obtains the Debye equation:
$$\frac{N_A (\alpha + \mu^2/3 k_B T)}{3\varepsilon_0} = \frac{\varepsilon - 1}{\varepsilon + 2} \quad \text{(SI)}; \qquad \frac{4\pi N_A (\alpha + \mu^2/3 k_B T)}{3} = \frac{\varepsilon - 1}{\varepsilon + 2} \quad \text{(cgs)} \tag{5.10.21}$$
which allows the determination of both the molecular dipole moment $\mu$ and the molecular polarizability $\alpha$ from measurements of the temperature-dependent dielectric constant $\varepsilon$ of a solution. Onsager51 modified some assumptions made by Debye, approximated the polarizability as an effective molecular volume, and obtained the Onsager equation:
$$\frac{4\pi N_A \mu^2}{9 k_B T} = \frac{(\varepsilon - n^2)(2\varepsilon + n^2)}{\varepsilon (n^2 + 2)^2} \quad \text{(cgs)} \tag{5.10.22}$$
Sideline. When Onsager died, he was buried in the Grove Street Cemetery in New Haven, CT, next to his colleague, Kirkwood.52 Gibbs, Eli Whitney, and Noah Webster are also buried in the same cemetery.
50 Ludvig Lorenz (1829–1891). 51 Lars Onsager (1903–1976). 52 John Gamble Kirkwood (1908–1959).
Onsager’s widow complained to the curators of the cemetery that Kirkwood’s gravestone listed a long series of minor awards that Kirkwood had earned in his lifetime, while her husband’s gravestone was bare. They honored her request by adding “Nobel Laureate, etc. . .” to Onsager’s gravestone. When she died, the loyal Mrs. Onsager was buried alongside her husband.
5.11 UNIVERSAL THEORY OF CRITICAL PHENOMENA
The existence of a critical point in the pressure–volume–temperature (PVT) diagram (actually, a point in the planar PV projection, but a critical line in a three-dimensional representation), a critical point (Curie temperature) in ferromagnetism, a critical point (Néel point) in antiferromagnetism, a critical temperature in superconductivity, and a critical point (lambda point) in liquid ⁴He are physical descriptions of the onset of a sudden macroscopic collective transition. If one approaches the critical point very closely, dimensionless parameters, defined to describe this approach, are common to all these disparate phenomena: the approach to criticality, or to a phase transition, is really the same in each case.
REFERENCES
1. W. J. Moore, Physical Chemistry, 4th edition, Prentice-Hall, Englewood Cliffs, NJ, 1972.
2. N. Davidson, Statistical Mechanics, McGraw-Hill, New York, 1962.
3. D. A. McQuarrie, Statistical Mechanics, Harper-Collins, New York, 1976.
4. T. L. Hill, Statistical Mechanics, McGraw-Hill, New York, 1956.
5. D. A. McQuarrie, Statistical Thermodynamics, Harper and Row, New York, 1973.
6. R. S. Berry, S. A. Rice, and J. Ross, Physical Chemistry, 2nd edition, Oxford University Press, New York, 2000.
7. C. Kittel, Elementary Statistical Physics, Wiley, New York, 1958, p. 104.
8. J. H. van Vleck, The Theory of Electric and Magnetic Susceptibilities, Oxford University Press, Oxford, UK, 1932.
9. N. W. Ashcroft and N. D. Mermin, Solid State Physics, Saunders, Philadelphia, 1976.
10. G. E. Pake, Paramagnetic Resonance, W. A. Benjamin, New York, 1962.
11. G. A. Bain and J. F. Berry, J. Chem. Educ. 85:532 (2008).
12. R. J. W. LeFevre, Dipole Moments, 2nd edition, Methuen, London, 1948.
13. C. J. F. Böttcher, Theory of Electric Polarization, Elsevier, Amsterdam, 1973.
CHAPTER 6
Kinetics, Equilibria, and Electrochemistry
Panta rhei [Everything flows]
Heraclitus (535–ca. 435 BC)
6.1 INTRODUCTION
This chapter deals with how chemical reactions proceed: their speed or rate, their mechanism, the state(s) accessed between reagents and products, the free energy profile, and theoretical calculations. It also deals with electrochemistry.
6.2 ENERGETICS, REACTION COORDINATE, TRANSITION STATES, INTERMEDIATES, AND CATALYSIS
Before plunging into the details of reaction mechanisms and the mathematics needed for analyzing reactions, we should spend a few minutes discussing Fig. 6.1, a plot of Gibbs1 free energy versus a vaguely defined variable called a reaction coordinate, which may be the lengthening or contracting of a crucial chemical bond, or some other observable, whose change can be used to follow the progress of a chemical reaction. In reality there may be several such coordinates per molecule! First of all, the reagent (or reagents) R will occupy some local minimum in the Gibbs free energy G; the products (or product) P will be found at
1 Josiah Willard Gibbs, Jr. (1839–1903).
FIGURE 6.1 Schematic change in Gibbs free energy (kJ/mol) as a function of the "reaction coordinate" x (arbitrary units). R denotes the reagent(s), P represents the product(s); T1 is the transition state (also known as the activated complex), I1 is the reaction intermediate (if it exists). T2 denotes the lower-energy transition state (or activated complex) for the catalyzed reaction.
another local minimum of G, further along the reaction coordinate. For the reaction
$$R \to P \tag{6.2.1}$$
the overall difference is
$$\Delta G = G(P) - G(R) \tag{6.2.2}$$
Next, the reaction will exhibit a free energy barrier, at the top of which may lie a very short-lived "transition state" (T1) or activated complex, with no local minimum in G, and a lifetime of the order of $10^{-15}$ s (the time needed for a single vibration), or an "intermediate" (I1) with a small minimum in G and a measurable lifetime $\tau_I \approx 10^{-12}$ s or longer. Transition state theory was developed in 1935 by Eyring2 and Polanyi.3 The energy barrier for the forward reaction is given by
$$\Delta G^{\ddagger}_{\rightarrow} = G(I1) - G(R) \tag{6.2.3}$$
while the energy barrier for the backward reaction
$$P \to R \tag{6.2.4}$$
is given by
$$\Delta G^{\ddagger}_{\leftarrow} = G(I1) - G(P) \tag{6.2.5}$$
Typically, the Gibbs free energy of activation of most chemical reactions consists of a large internal energy of activation $\Delta E^{\ddagger}$, along with much smaller contributions from pressure–volume effects $P \Delta V^{\ddagger}$ or from the entropy $\Delta S^{\ddagger}$:
$$\Delta G^{\ddagger} = \Delta E^{\ddagger} + P \Delta V^{\ddagger} - T \Delta S^{\ddagger} \tag{6.2.6}$$
2 Henry Eyring (1901–1981). 3 Michael Polanyi (1891–1976).
FIGURE 6.2 The Maxwell4–Boltzmann5 (MB) distribution of molecular speeds $v$ (in m/s) for N2 molecules in the gas phase, $n(v) = 4\pi v^2 (m/2\pi k_B T)^{3/2} \exp(-m v^2/2 k_B T)$, with $m = 0.028/(6.022 \times 10^{23})$ kg and $k_B = 1.381 \times 10^{-23}$ J K⁻¹, at T = 273, 1273, and 2273 K. The boxed area indicates the speeds that exceed 2,250 m s⁻¹ (8.24, 1.8, or 0.99 times T, respectively), where the high-energy end of the MB distribution may match a $\Delta E^{\ddagger}$ of several eV (1 eV ≈ 8000 K).
Of course, these energy barriers $\Delta G^{\ddagger}$ are important and necessary: If the $\Delta G^{\ddagger}$ for all possible chemical reactions on Planet Earth were zero, then the reactions would already have gone to completion, and life as we know it would have disappeared. Figure 6.2 shows the Maxwell–Boltzmann distribution of molecular speeds as a function of temperature T; this distribution has maxima (i.e., most likely kinetic energies) at energies far below those needed for chemical reactions. However, what makes chemical reactions possible is the relatively small fraction of molecules possessing kinetic energies at the high end of the spectrum (boxed region in Fig. 6.2). As a further detail, the Franck6–Condon7 principle states that, for a reaction to occur, the geometry of R (atom positions) must change to match the geometry of P; this may mean a lengthening or shortening of some crucial covalent bond lengths. This principle has become a calculable Franck–Condon factor. Catalysis occurs when a chemical compound, or a solid surface, or some other agent, makes another pathway possible for the reaction, involving a new TS (T2 in Fig. 6.1) or a new intermediate, call it I2, whose Gibbs free energy is much lower than T1 or I1. Catalysis is very important, if not essential, to most industrial chemical processes, in that it lowers the temperature needed to activate the reaction, and generally also speeds up the course of the reaction. Catalysts are usually quite specific to the chemical reactions they accelerate, and they were usually discovered by serendipity. Catalysts are divided into heterogeneous catalysts (nanoparticles or surfaces) and homogeneous catalysts (which are in the same phase as the reagents). In recent parlance, homogeneous transition metal catalysts have been called catalysts when they accelerate a chemical reaction, even though they are destroyed in the reaction.
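The size of the "relatively small fraction" of fast molecules can be estimated from the Maxwell–Boltzmann distribution itself. The sketch below is illustrative: the function names are hypothetical, N2 is treated as an ideal gas of molar mass 0.028 kg mol⁻¹, and the fraction above a threshold speed $v_0$ is obtained from the closed-form integral $\mathrm{erfc}(u_0) + (2/\sqrt{\pi})\, u_0 \exp(-u_0^2)$ with $u_0 = v_0 (m/2 k_B T)^{1/2}$.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
N_A = 6.02214076e23  # Avogadro's number, 1/mol

def mb_speed_density(v, T, molar_mass=0.028):
    """Maxwell-Boltzmann speed distribution n(v) in s/m, normalized to 1."""
    m = molar_mass / N_A
    prefactor = (m / (2 * math.pi * K_B * T)) ** 1.5
    return 4 * math.pi * v * v * prefactor * math.exp(-m * v * v / (2 * K_B * T))

def fraction_faster_than(v0, T, molar_mass=0.028):
    """Fraction of molecules with speed above v0 (integral of n(v) from v0 up)."""
    m = molar_mass / N_A
    u0 = v0 * math.sqrt(m / (2 * K_B * T))
    return math.erfc(u0) + 2 * u0 * math.exp(-u0 * u0) / math.sqrt(math.pi)

for T in (273.0, 1273.0, 2273.0):
    print(f"T = {T:6.0f} K: fraction with v > 2250 m/s = {fraction_faster_than(2250.0, T):.3e}")
```

The fraction above 2250 m/s is vanishingly small at 273 K and grows to a few percent at 2273 K, which illustrates why raising the temperature is so effective in overcoming an activation barrier.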
6 James Franck (1882–1964). 7 Edward Uhler Condon (1902–1974).
By the principle of microscopic reversibility, applicable in classical mechanics, for every forward reaction R → P there should also be a back reaction P → R. This principle does not allow for energy differences, however: Energy issues can affect the efficiency of the back reaction. If equilibrium becomes possible, then a double arrow is used:
$$R \rightleftharpoons P \tag{6.2.7}$$
Equilibrium will be established if the time rate of change for the forward reaction, $-d[R]/dt$, equals the time rate of change for the back reaction, $-d[P]/dt$. This is the principle of detailed balance. Hereinafter, square brackets are used, as in [A], to denote the concentration of compound A (usually in mol/L). The equilibrium constant is then written
$$K = [P]/[R] \tag{6.2.8}$$
For the chemical reaction
$$aA + bB + cC \rightleftharpoons dD + eE \tag{6.2.9}$$
the equilibrium constant is written as
$$K = \frac{[D]^d [E]^e}{[A]^a [B]^b [C]^c} \tag{6.2.10}$$
This is, formally, the Guldberg8–Waage9 law of mass action. The units of K are here [mol/L]$^{d+e-a-b-c}$. Since the forward reaction can be viewed as
$$\frac{1}{d} \frac{d[D]}{dt} = k_f [A]^a [B]^b [C]^c \tag{6.2.11}$$
and since the backward reaction can be viewed as
$$\frac{1}{a} \frac{d[A]}{dt} = k_b [D]^d [E]^e \tag{6.2.12}$$
equilibrium is achieved when these two rates are equal:
$$\frac{1}{a} \frac{d[A]}{dt} = \frac{1}{d} \frac{d[D]}{dt} \tag{6.2.13}$$
The equilibrium constant can be viewed as the ratio of the forward rate constant to the back rate constant:
$$K = \frac{k_f}{k_b} = \frac{[D]^d [E]^e}{[A]^a [B]^b [C]^c} \tag{6.2.14}$$
8 Cato Guldberg (1836–1902). 9 Peter Waage (1833–1900).
It is traditional to use small k for rate constants and capital K for equilibrium constants. The equilibrium constants, with the numerator always involving the product(s), and the denominator the reagent(s), are usually very strong functions of temperature. In computing free energy changes during a chemical reaction or in a chemical equilibrium, we should remember that "standard" internal energy, enthalpy, entropy, Helmholtz free energy, or Gibbs free energy changes ($\Delta E^{\ominus}_{rxn}$, $\Delta H^{\ominus}_{rxn}$, $\Delta S^{\ominus}_{rxn}$, $\Delta A^{\ominus}_{rxn}$, $\Delta G^{\ominus}_{rxn}$) are defined precisely and, by convention, as occurring at a standard temperature (usually taken as 298.15 K = 25 °C, thus $\Delta E^{\ominus,298.15}_{rxn}$, $\Delta H^{\ominus,298.15}_{rxn}$, $\Delta S^{\ominus,298.15}_{rxn}$, $\Delta A^{\ominus,298.15}_{rxn}$, $\Delta G^{\ominus,298.15}_{rxn}$) and involving one mole of each species in "standard states." However, we need a more general expression when the concentrations of reagents and products are not 1 mole each. We go to an ideal gas and consider
$$dG = -S \, dT + V \, dP = -S \, dT + (nRT/P) \, dP \tag{6.2.15}$$
which at constant T, after integration, becomes
$$\Delta G = \Delta G^{\ominus}(T) + RT \ln (P/P^{\ominus}) \tag{6.2.16}$$
This equation, in partial molar language, also applies to any component with mole fraction $x_i$ in a mixture of ideal gases:
$$\mu = \mu^{\ominus} + RT \ln x_i \tag{6.2.17}$$
This, by a stretch of the argument, leads to a very fundamental result for any equilibrium:
$$\Delta G(T) = \Delta G^{\ominus}(T) + RT \ln K = \Delta G^{\ominus}(T) + N_A k_B T \ln K \tag{6.2.18}$$
and in detail, for Eq. (6.2.14):
$$\Delta G = \Delta G^{\ominus}(T) + RT \ln \left[ \frac{[D]^d [E]^e}{[A]^a [B]^b [C]^c} \right] \tag{6.2.19}$$
Note that here using the natural logarithm requires that K be a pure number, while it usually has the units of some power of concentrations. Note also that, to "keep us honest," activities and not concentrations should enter into Eqs. (6.2.14) or (6.2.19). In the pious but understandable desire to keep equations simple when the reagents interact in nonideal fashion, activity coefficients γ can be used as premultipliers to [A], [B], etc. (e.g., γ_A, γ_B, etc.) to convert concentrations to activities; these empirical "fudge factors" are themselves dependent on temperature, on concentration, and on electrolyte strength, and hide within them the departure from ideal behavior.

Le Châtelier¹⁰ enunciated in 1885 a very important principle: If a chemical equilibrium is perturbed in any way (e.g., by changing concentrations, temperature, or pressure), then the equilibrium will shift to counteract that perturbation. Thus if the relative amount of a reagent is reduced, the equilibrium will shift "to the left," to restore if possible the amount of that reagent. This far-reaching idea has even been extended to economics!

10 Henry Louis Le Châtelier (1850–1936).
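As a numerical illustration of Eq. (6.2.19) and of Le Châtelier's principle, the following minimal sketch evaluates ΔG for a hypothetical reaction A + B → 2D. The standard free energy change of −5 kJ/mol and the concentrations are assumed values chosen only for illustration, and concentrations are treated as dimensionless ratios to 1 mol/L (activity coefficients set to 1). The bracketed ratio is evaluated at the current concentrations; the value of that ratio at which ΔG = 0 is the equilibrium value.

```python
import math

R = 8.314462618  # molar gas constant, J mol^-1 K^-1

def concentration_ratio(products, reagents):
    """Ratio of Eq. (6.2.10); each argument is a list of
    (concentration relative to 1 mol/L, stoichiometric coefficient) pairs."""
    q = 1.0
    for conc, coeff in products:
        q *= conc ** coeff
    for conc, coeff in reagents:
        q /= conc ** coeff
    return q

def delta_g(delta_g_std, ratio, temperature=298.15):
    """Eq. (6.2.19): Delta G in J/mol for a dimensionless concentration ratio."""
    return delta_g_std + R * temperature * math.log(ratio)

def equilibrium_ratio(delta_g_std, temperature=298.15):
    """Value of the concentration ratio at which Delta G vanishes."""
    return math.exp(-delta_g_std / (R * temperature))

# Hypothetical reaction A + B -> 2 D, assumed standard free energy change.
DG_STD = -5000.0  # J/mol (assumption)
ratio_now = concentration_ratio(products=[(0.10, 2)],
                                reagents=[(0.50, 1), (0.20, 1)])
print("ratio now:", ratio_now)
print("ratio at equilibrium:", equilibrium_ratio(DG_STD))
print("Delta G (J/mol):", delta_g(DG_STD, ratio_now))
```

A negative ΔG means the reaction still proceeds "to the right"; removing some reagent raises the ratio, raises ΔG, and so pushes the system back "to the left," as the principle states.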
6.3 CLASSIFICATION OF REACTION TYPES

Under experimental conditions, it may be found that the rate of formation of product is proportional to the first, second, or (rarely) third power of the concentration of the reagent; then the reaction is classified as first-, second-, or third-order with respect to that reagent. Thus, the order of a chemical reaction is strictly an empirical finding. Things can get quite complicated. For the gas-phase reaction between H2 and Br2 gases at high temperature (T > 773 K):

$$\mathrm{H_2(g) + Br_2(g) \rightarrow 2\,HBr(g)} \tag{6.3.1}$$

the experimental reaction rate was found to be [1]

$$\frac{d[\mathrm{HBr}]}{dt} = \frac{k_a [\mathrm{H_2}] \sqrt{[\mathrm{Br_2}]}}{k_b + [\mathrm{HBr}]/[\mathrm{Br_2}]} \tag{6.3.2}$$
A plausible mechanism for this reaction was found much later, as is explained further below. If, however, the true mechanism for the reaction is discovered and found to involve either a single molecule, or two molecules, or three molecules of the same kind, then the reaction is termed unimolecular, bimolecular, or termolecular, respectively, with respect to that reagent. Order and molecularity are terms used to discriminate between conjectural and proven mechanisms, respectively, but the mathematics is the same.

The progress of a chemical reaction is monitored by measuring the concentrations of reagents and/or products as a function of time. The results are then used to fit several possible candidate differential equations, or graphical plots that these equations suggest. In general, chemists love linear plots, so every effort is made to plot some function of the reagents that will confirm the sought-for linearity.
6.4 FIRST-ORDER AND UNIMOLECULAR REACTIONS

If a chemical reaction

$$\mathrm{A \rightarrow D} \tag{6.4.1}$$

is first-order in A, then the differential equation is

$$d[D]/dt = -d[A]/dt = k_1 [A] \tag{6.4.2}$$

Its integral is

$$[A] = [A]_0 \exp(-k_1 t) \tag{6.4.3}$$

and the linear expression as a function of time is

$$\ln \frac{[A]}{[A]_0} = -k_1 t = 2.30259 \log_{10} \frac{[A]}{[A]_0} \tag{6.4.4}$$
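The linear form of Eq. (6.4.4) is what is fitted in practice. The following is a minimal sketch, not a prescribed procedure: it generates noise-free synthetic data from Eq. (6.4.3) with an assumed rate constant and recovers k1 as the negative slope of a least-squares line through the origin. The function name, the data, and the value of k1 are all hypothetical.

```python
import math

def fit_first_order(times, concentrations):
    """Least-squares slope through the origin of ln([A]/[A]0) versus t.
    Assumes times[0] == 0, so that concentrations[0] is [A]0.
    Returns the first-order rate constant k1 = -slope."""
    a0 = concentrations[0]
    ys = [math.log(a / a0) for a in concentrations]
    slope = sum(t * y for t, y in zip(times, ys)) / sum(t * t for t in times)
    return -slope

# Synthetic data from Eq. (6.4.3) with an assumed rate constant.
K1_TRUE = 0.25                      # s^-1 (hypothetical)
times = [0.0, 1.0, 2.0, 4.0, 8.0]   # s
conc = [2.0 * math.exp(-K1_TRUE * t) for t in times]  # mol/L
k1_fit = fit_first_order(times, conc)
print("fitted k1 (s^-1):", k1_fit)
```

With real, noisy data the same plot shows curvature if the reaction is not first-order, which is the usual diagnostic use of the linearized form.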
PROBLEM 6.4.1. Prove Eq. (6.4.4).

PROBLEM 6.4.2. The half-life of a reaction, t1/2, is the time required for half the concentration of the relevant component to have disappeared. Obtain a relationship between t1/2 and k1.

PROBLEM 6.4.3. The first-order reaction rate law assumes mathematically that you only exhaust [A] at infinite time: is this reasonable?

Carbon-14 Dating. The earth is bombarded by an almost constant flux of cosmic rays, which in the stratosphere generate many particles, including neutrons, ${}^{1}_{0}n$, which react with ${}^{14}_{7}\mathrm{N}$ nuclei to form radioactive C-14 by the (n, p) reaction, that is, with the emission of a proton ${}^{1}_{1}p$:

$${}^{14}_{7}\mathrm{N} + {}^{1}_{0}n \rightarrow {}^{14}_{6}\mathrm{C}^{*} + {}^{1}_{1}p \tag{6.4.6}$$

C-14 dating was discovered by Libby¹¹ and co-workers [2]. The cosmic ray flux has been fairly constant over prehistoric and current time and provides a small but almost constant supply of ${}^{14}_{6}\mathrm{C}$, at a rate averaged over the whole atmosphere of about 2.2 atoms cm⁻² s⁻¹. The radioactive ${}^{14}_{6}\mathrm{C}$ will bind to oxygen in the atmosphere to form radioactive carbon dioxide, but will decay, with a half-life t1/2 = 5730 years, by emitting an electron (or "β ray") and an electron antineutrino:

$${}^{14}_{6}\mathrm{C} \rightarrow {}^{14}_{7}\mathrm{N} + \beta^{-} + \bar{\nu}_e \tag{6.4.7}$$
Since the atmospheric carbon from all sources averages 8.2 g cm⁻², about 0.27 electrons g⁻¹ s⁻¹ are produced, which can be counted if proper shielding can reduce the background. A living being that uses the carbon cycle will incorporate a constant amount of carbon-14 through its lifetime. After death, the amount of radioactivity due to ${}^{14}_{6}\mathrm{C}$ decreases exponentially, with a constant first-order rate constant, and objects as old as 60,000 years can be dated. The comparison with tree rings for trees up to 2000 years old is very satisfactory. Small fluctuations (1%) in the daily production of carbon-14 over the last 5000 years, due to variations in cosmic ray flux, were found. The age of radiocarbon samples is usually quoted as plus or minus the square root of the counts measured, so older dates, with far fewer counts, are less reliable than recent historical ones.

Other radioactive markers useful for archeology are tritium:

$${}^{3}_{1}\mathrm{H} \rightarrow [{}^{3}_{2}\mathrm{He}]^{+} + \beta^{-} + \bar{\nu}_e \tag{6.4.8}$$

(t1/2 = 12.33 y), useful for detecting young wine, and potassium:

$${}^{40}_{19}\mathrm{K} \rightarrow (10.9\%)\ {}^{40}_{18}\mathrm{Ar} + \beta^{+} + \nu_e \quad \text{and} \quad {}^{40}_{19}\mathrm{K} \rightarrow (89.1\%)\ {}^{40}_{20}\mathrm{Ca} + \beta^{-} + \bar{\nu}_e \tag{6.4.9}$$

(t1/2 = 1.248 × 10⁹ y): the embedded Ar is measured to date old rocks.

PROBLEM 6.4.5. An Egyptian mummy shows 75.5% of the electron decay counts expected from the ${}^{14}_{6}\mathrm{C}$ of a modern and recent human corpse. How old is the mummy?

11 Willard Frank Libby (1908–1980).
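The dating arithmetic described above follows directly from the first-order law, Eq. (6.4.3). The sketch below is one possible way to organize it; the function names are hypothetical, the half-life is the 5730 years quoted in the text, and the uncertainty estimate assumes that the only error is the square-root counting error on the sample (the reference count rate is taken as exact).

```python
import math

HALF_LIFE_C14 = 5730.0  # years, value quoted in the text

def decay_constant(half_life):
    """First-order rate constant, from setting [A] = [A]0/2 in Eq. (6.4.3)."""
    return math.log(2.0) / half_life

def radiocarbon_age(fraction_remaining, half_life=HALF_LIFE_C14):
    """Elapsed time (years) for the count rate to fall to the given fraction."""
    if not 0.0 < fraction_remaining <= 1.0:
        raise ValueError("fraction_remaining must lie in (0, 1]")
    return -math.log(fraction_remaining) / decay_constant(half_life)

def counting_uncertainty(counts, half_life=HALF_LIFE_C14):
    """Age uncertainty (years) when the relative count error is 1/sqrt(counts)."""
    return 1.0 / (decay_constant(half_life) * math.sqrt(counts))

# One half-life and two half-lives as sanity checks.
print(radiocarbon_age(0.5), radiocarbon_age(0.25))
# Fewer counts (older or smaller samples) give a larger uncertainty.
print(counting_uncertainty(10000), counting_uncertainty(100))
```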
6.5 SECOND-ORDER (UNMIXED) AND UNMIXED BIMOLECULAR REACTIONS

If a chemical reaction

$$\mathrm{A \rightarrow D} \tag{6.5.1}$$

is second-order in A, then the differential equation is

$$d[A]/dt = -k_2 [A]^2 \tag{6.5.2}$$

Its integral is

$$\frac{1}{[A]} - \frac{1}{[A]_0} = k_2 t \tag{6.5.3}$$

which is linear in time and linear in the reciprocal of the concentration [A].

PROBLEM 6.5.1. Prove Eq. (6.5.3).

PROBLEM 6.5.2. Find the half-life t1/2 for Eq. (6.5.3).
6.6 SECOND-ORDER (MIXED) AND MIXED BIMOLECULAR REACTIONS

If a chemical reaction

$$\mathrm{A + B \rightarrow D} \tag{6.6.1}$$

is first-order in A and first-order in B, and therefore second-order overall, then the differential equation is

$$d[A]/dt = -k_{2m} [A][B] \tag{6.6.2}$$

This cannot be immediately integrated, because B is involved, and a separation of variables is premature at this point. We need to somehow eliminate the role of B in Eq. (6.6.2). Assume that at t = 0 the initial concentrations of A and B are [A]0 and [B]0, respectively. Thereafter, at any time, A and B get depleted together, and one can define the "progress of the reaction" by a convenient new variable x:

$$x \equiv [B]_0 - [B] = [A]_0 - [A] \tag{6.6.3}$$

whence it can be shown, after using partial fractions for the integrand, that

$$\frac{1}{[A]_0 - [B]_0} \ln \frac{[B]_0 [A]}{[A]_0 [B]} = k_{2m} t \tag{6.6.4}$$

PROBLEM 6.6.1. Prove Eq. (6.6.4) from Eqs. (6.6.2) and (6.6.3).
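Without giving away the proof, the integrated law can be checked numerically. The sketch below integrates the rate law in the progress variable x, dx/dt = k2m([A]0 − x)([B]0 − x), with a standard fourth-order Runge–Kutta step and compares the result with the form (1/([A]0 − [B]0)) ln([B]0[A]/([A]0[B])) = k2m t. All numerical values are arbitrary assumptions, and the check requires [A]0 ≠ [B]0.

```python
import math

def integrate_mixed_second_order(a0, b0, k2m, t_end, steps=10000):
    """RK4 integration of dx/dt = k2m (a0 - x)(b0 - x), with x(0) = 0.
    Returns ([A], [B]) at time t_end."""
    def rate(x):
        return k2m * (a0 - x) * (b0 - x)
    x = 0.0
    h = t_end / steps
    for _ in range(steps):
        s1 = rate(x)
        s2 = rate(x + 0.5 * h * s1)
        s3 = rate(x + 0.5 * h * s2)
        s4 = rate(x + h * s3)
        x += h * (s1 + 2.0 * s2 + 2.0 * s3 + s4) / 6.0
    return a0 - x, b0 - x

def integrated_law_lhs(a, b, a0, b0):
    """Left-hand side of the integrated mixed second-order law (needs a0 != b0)."""
    return math.log(b0 * a / (a0 * b)) / (a0 - b0)

# Arbitrary illustrative values: concentrations in mol/L, k2m in L mol^-1 s^-1.
A0, B0, K2M, T_END = 1.0, 0.5, 2.0, 1.5
a_t, b_t = integrate_mixed_second_order(A0, B0, K2M, T_END)
print("numerical:", integrated_law_lhs(a_t, b_t, A0, B0), " k2m*t:", K2M * T_END)
```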
6.7 THIRD-ORDER (UNMIXED) AND UNMIXED TERMOLECULAR REACTIONS

If a chemical reaction

$$\mathrm{A \rightarrow D} \tag{6.7.1}$$

is third-order in A, the differential equation is

$$d[A]/dt = -k_3 [A]^3 \tag{6.7.2}$$

Its integral is

$$\frac{1}{2}\left[ \frac{1}{[A]^2} - \frac{1}{[A]_0^2} \right] = k_3 t \tag{6.7.3}$$

Termolecular reactions require that, for the reaction to occur, three bodies meet at a single point; not surprisingly, termolecular reactions are very rare.
6.8 REVERSIBLE REACTIONS

If a reaction is reversible and first-order in both the forward and the reverse directions

$$\mathrm{A} \underset{k_b}{\overset{k_f}{\rightleftharpoons}} \mathrm{B} \tag{6.8.1}$$

the coupled differential equations for this reaction are

$$d[A]/dt = -k_f [A] + k_b [B] \tag{6.8.2}$$

$$d[B]/dt = k_f [A] - k_b [B] \tag{6.8.3}$$

Assuming finite initial concentrations for both A and B, that is, [A]0 and [B]0, the law of conservation of mass provides the equation

$$[A]_0 - [A] = [B] - [B]_0 \tag{6.8.4}$$

After some pain, it can be shown that

$$\ln \frac{k_f [A] - k_b [B]}{k_f [A]_0 - k_b [B]_0} = -(k_f + k_b)\, t \tag{6.8.5}$$

PROBLEM 6.8.1. Prove Eq. (6.8.5) from Eqs. (6.8.2), (6.8.3), and (6.8.4).
6.9 CONSECUTIVE REACTIONS

Consider the case of two first-order reactions occurring in sequence:

$$\mathrm{A} \xrightarrow{k_1} \mathrm{B} \tag{6.9.1}$$

$$\mathrm{B} \xrightarrow{k_2} \mathrm{C} \tag{6.9.2}$$

with appropriate coupled differential equations:

$$d[A]/dt = -k_1 [A] \tag{6.9.3}$$

$$d[B]/dt = k_1 [A] - k_2 [B] \tag{6.9.4}$$

$$d[C]/dt = k_2 [B] \tag{6.9.5}$$

The first equation can be integrated immediately:

$$[A] = [A]_0 \exp(-k_1 t) \tag{6.9.6}$$

This result can be fed into Eq. (6.9.4):

$$d[B]/dt + k_2 [B] = k_1 [A]_0 \exp(-k_1 t) \tag{6.9.7}$$

After multiplying both sides by exp(k2 t), the left-hand side becomes d{[B] exp(k2 t)}/dt, and this can be integrated to yield

$$[B] = [B]_0 \exp(-k_2 t) + k_1 (k_2 - k_1)^{-1} [A]_0 \left[ \exp(-k_1 t) - \exp(-k_2 t) \right] \tag{6.9.8}$$

When the initial concentrations of B and C are zero, the conservation of mass yields

$$[A]_0 = [A] + [B] + [C] \tag{6.9.9}$$

So when [B]0 = 0, we can get finally

$$[C] = [A]_0 \left\{ 1 - \frac{k_2}{k_2 - k_1} \exp(-k_1 t) + \frac{k_1}{k_2 - k_1} \exp(-k_2 t) \right\} \tag{6.9.10}$$

PROBLEM 6.9.1. Prove Eq. (6.9.8).
6.10 THE STEADY-STATE APPROXIMATION AND THE RATE-DETERMINING STEP

In dealing with complicated reaction mechanisms, a simplification can often be introduced when the reaction has reached some kind of steady state (akin to an equilibrium, except that further reactions are possible beyond this equilibrium); hence the term steady-state approximation (SSA) is used. Mathematically, after the reaction has started, some intermediate product B has the condition d[B]/dt = 0. This is best illustrated by an example. The mechanism

$$\mathrm{A} \xrightarrow{k_1} \mathrm{B} \tag*{((6.9.1))}$$

$$\mathrm{B} \xrightarrow{k_2} \mathrm{C} \tag*{((6.9.2))}$$

was given above, with exact answers, valid if and only if at t = 0 [A] = [A]0, but [B] = [C] = 0:

$$[A] = [A]_0 \exp(-k_1 t) \tag*{((6.9.6))}$$

$$[B] = [A]_0 k_1 (k_2 - k_1)^{-1} \left[ \exp(-k_1 t) - \exp(-k_2 t) \right] \tag*{((6.9.8))}$$

$$[C] = [A]_0 \left\{ 1 - k_2 (k_2 - k_1)^{-1} \exp(-k_1 t) + k_1 (k_2 - k_1)^{-1} \exp(-k_2 t) \right\} \tag*{((6.9.10))}$$

Using the SSA, however, d[B]/dt = 0; this means from Eq. (6.9.4):

$$d[B]/dt = k_1 [A] - k_2 [B] \approx 0 \tag{6.10.1}$$

Combining Eqs. (6.10.1) and (6.9.6), one gets

$$[B]_{SSA} = (k_1/k_2)[A] = (k_1/k_2)[A]_0 \exp(-k_1 t) \tag{6.10.2}$$

which is actually the limiting case of Eq. (6.9.8) when k2 ≫ k1: the intermediate B must be so reactive that it cannot accumulate a large concentration (it disappears rapidly into the product C). Also, inserting Eq. (6.10.2) into Eq. (6.9.5) gives d[C]/dt = k2[B]SSA = k1[A]0 exp(−k1 t), which one integrates immediately, with [C] = 0 at t = 0:

$$[C]_{SSA} = [A]_0 \left\{ 1 - \exp(-k_1 t) \right\} \tag{6.10.3}$$

This is likewise the k2 ≫ k1 limit of Eq. (6.9.10). In complicated reactions, the SSA is usually followed by some step whose progress dominates the overall kinetics; that reaction is called the rate-determining step.
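How good the steady-state approximation is depends on the ratio k2/k1. The following minimal sketch compares the exact intermediate concentration of the consecutive mechanism A → B → C (with [B]0 = [C]0 = 0) with the SSA value (k1/k2)[A]; the rate constants and the time are arbitrary assumptions, and the function names are hypothetical.

```python
import math

def exact_consecutive(a0, k1, k2, t):
    """Exact [A], [B], [C] for A -> B -> C with [B]0 = [C]0 = 0 and k1 != k2."""
    a = a0 * math.exp(-k1 * t)
    b = a0 * k1 / (k2 - k1) * (math.exp(-k1 * t) - math.exp(-k2 * t))
    c = a0 - a - b  # conservation of mass
    return a, b, c

def ssa_intermediate(a0, k1, k2, t):
    """Steady-state estimate [B]_SSA = (k1/k2) [A]0 exp(-k1 t)."""
    return (k1 / k2) * a0 * math.exp(-k1 * t)

def ssa_to_exact_ratio(k1, k2, t, a0=1.0):
    """Ratio [B]_SSA / [B]_exact; close to 1 when the SSA is adequate."""
    _, b_exact, _ = exact_consecutive(a0, k1, k2, t)
    return ssa_intermediate(a0, k1, k2, t) / b_exact

# Reactive intermediate (k2 = 1000 k1) versus sluggish intermediate (k2 = 2 k1).
ratio_fast = ssa_to_exact_ratio(k1=1.0, k2=1000.0, t=1.0)
ratio_slow = ssa_to_exact_ratio(k1=1.0, k2=2.0, t=1.0)
print("k2 = 1000 k1:", ratio_fast)
print("k2 = 2 k1:   ", ratio_slow)
```

The first ratio is close to unity and the second is not, which is the numerical content of the condition k2 ≫ k1.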
6.11 APPROXIMATION METHODS: THE MICHAELIS–MENTEN EQUATION

A very important class of catalysts found in nature comprises the biological enzymes, compounds which act as classical catalysts, in that they are unchanged by the reactions they assist, and which make important life-sustaining reactions both possible and able to occur at or close to room temperature at reasonable rates (the ones that did not work were probably eliminated from life cycles by evolutionary forces). Often these enzymes help the reaction rate by controlling the steric environment to favor the reaction they assist. The enzyme E docks onto a substrate S, forming an enzyme–substrate complex ES, which then dissociates into products P and restores the enzyme E. [Note a terminological divergence here: Physicists label as "substrates" the inert objects on which reactions are carried out, while biochemists label as "substrates" the reagent that will be transformed into a product!] An example of an enzyme-assisted reaction is the hydrolysis of sucrose (= substrate S) into glucose and fructose (products P) by the enzyme invertase (E); most enzymes are "-ases" of some kind or other, depending on what reactions they assist. The reaction mechanism consists of the two reversible equations:

$$\mathrm{E + S} \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} \mathrm{ES} \tag{6.11.1}$$

$$\mathrm{ES} \underset{k_{-2}}{\overset{k_2}{\rightleftharpoons}} \mathrm{P + E} \tag{6.11.2}$$
In most treatments, the back reaction in Eq. (6.11.2) with rate constant $k_{-2}$ is neglected, for simplicity, and because this rate constant is usually negligibly small. The elementary rates of reaction are

$$d[ES]/dt = k_1 [E][S] - (k_{-1} + k_2)[ES] + k_{-2}[E][P] \tag{6.11.3}$$

$$d[S]/dt = -k_1 [E][S] + k_{-1}[ES] \tag{6.11.4}$$

$$d[P]/dt = k_2 [ES] - k_{-2}[P][E] \tag{6.11.5}$$

The conservation of mass equations are

$$[E]_0 = [E] + [ES] \tag{6.11.6}$$

$$[S]_0 = [S] + [ES] \tag{6.11.7}$$

If we apply the steady-state approximation (SSA) to Eq. (6.11.3) we get, for a steady-state (small and time-invariant) concentration $[ES]_{SSA}$:

$$0 \approx d[ES]/dt = -(k_{-1} + k_2)[ES]_{SSA} + k_1 [E][S] + k_{-2}[E][P] \tag{6.11.9}$$

which yields an explicit approximate value for $[ES]_{SSA}$:

$$[ES]_{SSA} \approx (k_{-1} + k_2)^{-1} [E] \{ k_1 [S] + k_{-2}[P] \} \tag{6.11.10}$$

which can be rewritten using a modification of Eq. (6.11.6), $[E]_0 \approx [E] + [ES]_{SSA}$:

$$[ES]_{SSA} \approx (k_{-1} + k_2)^{-1} \{ [E]_0 - [ES]_{SSA} \} \{ k_1 [S] + k_{-2}[P] \} \tag{6.11.11}$$
This can be solved anew for $[ES]_{SSA}$ as follows:

$$[ES]_{SSA} \approx (k_{-1} + k_2)^{-1} \{ k_1 [E]_0 [S] + k_{-2}[E]_0 [P] \} \{ 1 + (k_{-1} + k_2)^{-1} \{ k_1 [S] + k_{-2}[P] \} \}^{-1}$$

$$= \{ k_1 [E]_0 [S] + k_{-2}[E]_0 [P] \} \{ k_{-1} + k_2 + k_1 [S] + k_{-2}[P] \}^{-1} \tag{6.11.12}$$

In Eq. (6.11.4) we replace [ES] by $[ES]_{SSA}$ and [E] by $[E]_0 - [ES]_{SSA}$:

$$d[S]/dt \approx -k_1 [E]_0 [S] + k_1 [S][ES]_{SSA} + k_{-1}[ES]_{SSA} = -k_1 [E]_0 [S] + \{ k_1 [S] + k_{-1} \}[ES]_{SSA}$$

and insert Eq. (6.11.12):

$$d[S]/dt \approx -k_1 [E]_0 [S] + \{ k_1 [S] + k_{-1} \} \{ k_1 [E]_0 [S] + k_{-2}[E]_0 [P] \} \{ k_{-1} + k_2 + k_1 [S] + k_{-2}[P] \}^{-1}$$

which simplifies to

$$d[S]/dt \approx \{ -k_1 [E]_0 [S] \{ k_{-1} + k_2 + k_1 [S] + k_{-2}[P] \} + \{ k_1 [S] + k_{-1} \} \{ k_1 [E]_0 [S] + k_{-2}[E]_0 [P] \} \} \{ k_{-1} + k_2 + k_1 [S] + k_{-2}[P] \}^{-1}$$

and then to

$$d[S]/dt \approx \{ -k_1 k_{-1}[E]_0 [S] - k_1 k_2 [E]_0 [S] - k_1^2 [S]^2 [E]_0 - k_1 k_{-2}[E]_0 [S][P] + k_1^2 [E]_0 [S]^2 + k_1 k_{-1}[E]_0 [S] + k_1 k_{-2}[E]_0 [S][P] + k_{-1} k_{-2}[E]_0 [P] \} \{ k_{-1} + k_2 + k_1 [S] + k_{-2}[P] \}^{-1}$$

which after clean-up finally simplifies to

$$\frac{d[S]}{dt} \approx \frac{-k_1 k_2 [S] + k_{-1} k_{-2}[P]}{k_{-1} + k_2 + k_1 [S] + k_{-2}[P]}\, [E]_0 \tag{6.11.13}$$

This result is usually simplified one more time in one of two ways: Assume that either (1) the reverse of the second reaction does not occur (i.e., $k_{-2} = 0$) or (2) at the beginning of the reaction [P] is negligibly small; in either case, one gets that the speed of the reaction v is given by

$$v = -\frac{d[S]}{dt} \approx \frac{k_1 k_2 [S]}{k_{-1} + k_2 + k_1 [S]}\, [E]_0 \tag{6.11.14}$$
This is the Michaelis¹²–Menten¹³ equation. This equation is often abbreviated by defining a Michaelis constant $K_M$ (which is not really an equilibrium constant):

$$K_M \equiv (k_{-1} + k_2)/k_1 \tag{6.11.15}$$

and by the further definition $v_s \equiv k_2 [E]_0$; then Eq. (6.11.14) becomes

$$v \approx v_s \{ 1 + K_M/[S] \}^{-1} \tag{6.11.16}$$

FIGURE 6.3 Michaelis–Menten plot.

FIGURE 6.4 Lineweaver–Burk plot.

Since the reaction speed v depends nonlinearly on [S] and only reaches an asymptotic value at infinite substrate concentration (see Fig. 6.3), a Lineweaver–Burk plot (Fig. 6.4) is better:

$$v^{-1} = K_M v_s^{-1} [S]^{-1} + v_s^{-1} \tag{6.11.17}$$

and so is an Eadie–Hofstee plot, obtained by rearranging Eq. (6.11.16) to $v/[S] = v_s/K_M - v/K_M$ and dividing by $[E]_0$:

$$v [E]_0^{-1} [S]^{-1} = k_2 K_M^{-1} - v K_M^{-1} [E]_0^{-1} \tag{6.11.18}$$

12 Leonor Michaelis (1875–1949).
13 Maud Menten (1879–1960).
The mathematics for the Michaelis–Menten reaction resembles that of the Langmuir14 adsorption isotherm presented in Section 4.23.
14
Irving Langmuir (1881–1957).
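The double-reciprocal (Lineweaver–Burk) plot turns the Michaelis–Menten law into a straight line of 1/v against 1/[S], with intercept 1/v_s and slope K_M/v_s. The sketch below illustrates this with an ordinary least-squares line; the "data" are synthetic and noise-free, generated from assumed values of v_s and K_M, and the function names are hypothetical.

```python
def michaelis_menten_rate(s, v_s, k_m):
    """Reaction speed v = v_s / (1 + K_M/[S])."""
    return v_s / (1.0 + k_m / s)

def lineweaver_burk_fit(substrate, rates):
    """Ordinary least-squares line through (1/[S], 1/v).
    Returns (v_s, K_M) from intercept = 1/v_s and slope = K_M/v_s."""
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    v_s = 1.0 / intercept
    return v_s, slope * v_s

# Synthetic data from assumed parameters (arbitrary units).
V_S_TRUE, K_M_TRUE = 2.0, 0.5
substrate = [0.1, 0.2, 0.5, 1.0, 2.0, 5.0]
rates = [michaelis_menten_rate(s, V_S_TRUE, K_M_TRUE) for s in substrate]
v_s_fit, k_m_fit = lineweaver_burk_fit(substrate, rates)
print("v_s =", v_s_fit, " K_M =", k_m_fit)
```

With real data the reciprocal transformation amplifies the noise of the slowest rates, so in practice the points at low [S] dominate the fit; this sketch does not weight the points.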
6.12 CHAIN REACTIONS. THE REACTION OF HYDROGEN AND BROMINE AT HIGH TEMPERATURE

Many reactions start slowly at first and then speed up, as reagents are consumed and products are made. This is particularly true of chain reactions, in which products are made, and some reactive intermediate is regenerated to "keep the chain going." Polymerizations, explosions, and nuclear bombs are examples of chain reactions. These chain reactions have precise components that must be identified in a successful reaction mechanism: (1) chain initiation, (2) chain propagation, (3) chain termination. The propagation step in chemical reactions usually involves the formation of very reactive free radicals (odd-electron species), while the chain termination steps may involve radical–radical reactions, which shut off the supply of reactive intermediates. We return to the gaseous hydrogen–bromine reaction discussed above:

$$\mathrm{H_2(g) + Br_2(g) \rightarrow 2\,HBr(g)} \tag*{((6.3.1))}$$

whose mechanism was explained [3–5] as follows (a dot, •, indicates a free radical):

(1) Initiation:
$$\mathrm{Br_2} \xrightarrow{k_1} 2\,\mathrm{Br^{\bullet}} \tag{6.12.1}$$

(2) Propagation:
$$\mathrm{Br^{\bullet} + H_2} \xrightarrow{k_2} \mathrm{HBr + H^{\bullet}} \tag{6.12.2}$$
$$\mathrm{H^{\bullet} + Br_2} \xrightarrow{k_3} \mathrm{HBr + Br^{\bullet}} \tag{6.12.3}$$

(3) Inhibition:
$$\mathrm{H^{\bullet} + HBr} \xrightarrow{k_4} \mathrm{H_2 + Br^{\bullet}} \tag{6.12.4}$$

(4) Termination:
$$2\,\mathrm{Br^{\bullet} + M} \xrightarrow{k_5} \mathrm{Br_2 + M} \tag{6.12.5}$$
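where M is some inert metal or glass surface. It is natural to consider the homolytic scission of the Br–Br bond, Eq. (6.12.1) (bond dissociation energy ΔE = 190 kJ/mol), rather than the scission of the H–Br bond (ΔE = 360 kJ/mol) or the H–H bond (ΔE = 430 kJ/mol). We wish to evaluate d[HBr]/dt, for which the empirical result we must explain is

$$d[\mathrm{HBr}]/dt = k_a [\mathrm{H_2}][\mathrm{Br_2}]^{1/2} \{ k_b + [\mathrm{HBr}]/[\mathrm{Br_2}] \}^{-1} \tag*{((6.3.2))}$$

We start from Eqs. (6.12.1) to (6.12.5). The steady-state approximation (SSA) can be invoked for both H• and Br•:

$$d[\mathrm{Br^{\bullet}}]/dt = 2k_1 [\mathrm{Br_2}] - k_2 [\mathrm{Br^{\bullet}}][\mathrm{H_2}] + k_3 [\mathrm{H^{\bullet}}][\mathrm{Br_2}] + k_4 [\mathrm{H^{\bullet}}][\mathrm{HBr}] - 2k_5 [\mathrm{Br^{\bullet}}]^2 \approx 0 \tag{6.12.6}$$

$$d[\mathrm{H^{\bullet}}]/dt = k_2 [\mathrm{Br^{\bullet}}][\mathrm{H_2}] - k_3 [\mathrm{H^{\bullet}}][\mathrm{Br_2}] - k_4 [\mathrm{H^{\bullet}}][\mathrm{HBr}] \approx 0 \tag{6.12.7}$$

Note that Eq. (6.12.7) is the negative of the middle three terms of Eq. (6.12.6). Therefore these two SSA approximations imply a third SSA:

$$2k_1 [\mathrm{Br_2}] - 2k_5 [\mathrm{Br^{\bullet}}]^2 \approx 0 \tag{6.12.8}$$

Thus,

$$[\mathrm{Br^{\bullet}}]_{SSA} = k_1^{1/2} k_5^{-1/2} [\mathrm{Br_2}]^{1/2} \tag{6.12.9}$$

From Eq. (6.12.7), solved for [H•] and using Eq. (6.12.9), we obtain

$$[\mathrm{H^{\bullet}}]_{SSA} = k_2 k_1^{1/2} k_5^{-1/2} [\mathrm{H_2}][\mathrm{Br_2}]^{1/2} \{ k_3 [\mathrm{Br_2}] + k_4 [\mathrm{HBr}] \}^{-1} \tag{6.12.10}$$

We now write down the exact rate equation for the formation of HBr:

$$d[\mathrm{HBr}]/dt = k_2 [\mathrm{Br^{\bullet}}][\mathrm{H_2}] + k_3 [\mathrm{H^{\bullet}}][\mathrm{Br_2}] - k_4 [\mathrm{H^{\bullet}}][\mathrm{HBr}] \tag{6.12.11}$$

One could laboriously grind out the result using Eqs. (6.12.9) and (6.12.10), but instead one can look simply again at Eq. (6.12.7):

$$k_2 [\mathrm{Br^{\bullet}}][\mathrm{H_2}] \approx k_3 [\mathrm{H^{\bullet}}][\mathrm{Br_2}] + k_4 [\mathrm{H^{\bullet}}][\mathrm{HBr}] \tag{6.12.12}$$

which when inserted into Eq. (6.12.11) yields quite simply:

$$d[\mathrm{HBr}]/dt \approx 2k_3 [\mathrm{H^{\bullet}}]_{SSA} [\mathrm{Br_2}] \tag{6.12.13}$$

or

$$d[\mathrm{HBr}]/dt \approx 2 k_1^{1/2} k_2 k_3 k_5^{-1/2} [\mathrm{Br_2}][\mathrm{H_2}][\mathrm{Br_2}]^{1/2} \{ k_3 [\mathrm{Br_2}] + k_4 [\mathrm{HBr}] \}^{-1} \tag{6.12.14}$$

which can be massaged to compare successfully with the empirical finding:

$$\frac{d[\mathrm{HBr}]}{dt} \approx \frac{2\sqrt{k_1}\, k_2\, [\mathrm{H_2}] \sqrt{[\mathrm{Br_2}]}}{\sqrt{k_5} \left( 1 + \dfrac{k_4 [\mathrm{HBr}]}{k_3 [\mathrm{Br_2}]} \right)} \tag{6.12.15}$$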
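The algebra from the elementary steps to the closed-form rate law can be checked numerically. The sketch below evaluates d[HBr]/dt in two ways: from the three HBr-producing and HBr-consuming steps with the steady-state radical concentrations inserted, and from the closed form 2(k1/k5)^(1/2) k2 [H2][Br2]^(1/2) / (1 + k4[HBr]/(k3[Br2])). The rate constants and concentrations are arbitrary assumed numbers, not measured values.

```python
import math

def radicals_ssa(k, h2, br2, hbr):
    """Steady-state [Br.] and [H.] for rate constants k = (k1, k2, k3, k4, k5)."""
    k1, k2, k3, k4, k5 = k
    br = math.sqrt(k1 / k5) * math.sqrt(br2)
    h = k2 * br * h2 / (k3 * br2 + k4 * hbr)
    return br, h

def rate_from_steps(k, h2, br2, hbr):
    """d[HBr]/dt summed over the propagation and inhibition steps."""
    k1, k2, k3, k4, k5 = k
    br, h = radicals_ssa(k, h2, br2, hbr)
    return k2 * br * h2 + k3 * h * br2 - k4 * h * hbr

def rate_closed_form(k, h2, br2, hbr):
    """Closed-form steady-state rate law for d[HBr]/dt."""
    k1, k2, k3, k4, k5 = k
    numerator = 2.0 * math.sqrt(k1 / k5) * k2 * h2 * math.sqrt(br2)
    return numerator / (1.0 + k4 * hbr / (k3 * br2))

# Arbitrary illustrative rate constants and concentrations (assumptions).
K = (1.0e-3, 2.0, 5.0, 0.5, 1.0e3)
H2, BR2, HBR = 1.0, 0.5, 0.2
print(rate_from_steps(K, H2, BR2, HBR), rate_closed_form(K, H2, BR2, HBR))
```

Comparison with the empirical law then identifies k_a with 2 k2 (k1/k5)^(1/2) k3/k4 and k_b with k3/k4.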
Other chain reactions have been explained in a similar fashion. The most terrifying chain reactions are explosions and nuclear weapons. In particular, the first "A-bomb," used over Hiroshima on August 6, 1945, killed 140,000 civilians; it was a gun-type nuclear device with a critical mass of the radioactive "fissile" isotope ${}^{235}_{92}\mathrm{U}$ [enriched from its natural concentration of 0.72% in uranium ore to at least 20% (weapons-usable) or even 85% (weapons-grade)]. Natural ${}^{235}_{92}\mathrm{U}$ has a half-life of 7.04 × 10⁸ years, and it decays to ${}^{231}_{90}\mathrm{Th}$ by emitting ${}^{4}_{2}\mathrm{He}$. A "critical mass" of 52 kg of heavily enriched ${}^{235}_{92}\mathrm{U}$ also decays by fission, releasing two or three neutrons per nucleus (hence a chain reaction can ensue). If the neutrons are "thermalized" (that is, cooled by passage through either deuterium oxide, D2O or "heavy water," or graphite), then the capture cross-section per neutron increases from 1 barn to 1000 barns, and these slow neutrons are efficiently captured by other ${}^{235}_{92}\mathrm{U}$ nuclei, for further fission. The ultimate product of decay by ${}^{235}_{92}\mathrm{U}$ is a mixture of fission products of much lower atomic number (e.g., Rb or Sr) plus more thermal neutrons. Some of the daughter nuclides are also heavily radioactive, with long half-lives, so in Hiroshima many other Japanese died slowly of radiation poisoning days, weeks, and years later.

The average fission yield of a single ${}^{235}_{92}\mathrm{U}$ nucleus is 202.5 MeV = 3.244 × 10⁻¹¹ J/atom = 19.54 TJ/mol = 83.14 TJ/kg. The energy yield of this first, rather inefficient A-bomb was equivalent to that of 14 kilotons = 1.4 × 10⁷ kg of the chemical explosive trinitrotoluene (TNT). A second A-bomb, dropped over Nagasaki, Japan on August 9, 1945, killed 80,000 civilians; it was a plutonium bomb with ${}^{239}_{94}\mathrm{Pu}$ as the "active ingredient" (critical mass 10 kg). Later technical improvements were the "H-bomb" (a fission-driven fusion bomb that converts H to He and that has a massively increased destructive power, equivalent to 50 or more megatons of TNT) and a neutron bomb. Fortunately or miraculously, since 1945 these bombs have not been used in warfare. The long-lived and lethal radioactive products of nuclear bomb tests in the atmosphere and in the soil led to the world-wide 1963 Partial Nuclear Test Ban Treaty, banning atmospheric tests of all nuclear bombs. A complete nuclear test ban treaty has not yet been signed by all nations. As of 2010 (in alphabetic order) China, France, Great Britain, India, North Korea, Pakistan, Russia, the United States, and probably Israel have nuclear weapons, while Iran is developing that frightening capability: an all-out nuclear war would destroy all human life on earth and leave radiation-resistant cockroaches to rule the planet.

In peaceful uses of nuclear reactions, electrical power plants can be driven by a nuclear reactor very close to criticality, with careful control of neutron flux; excess heat from the well-shielded nuclear reactor is driven off by a liquid (H2O, Na, or Hg), which in a secondary cycle or a tertiary cycle generates electricity by turning induction turbines.
6.13 USING LAPLACE TRANSFORMS TO SOLVE KINETICS EQUATIONS

Most simple kinetics problems involve first-order differential equations, which can be integrated using Laplace¹⁵ transforms. As discussed in Section 2.16, for a function f(x), the Laplace transform F(k) is given by

$$\mathcal{L}\{f(x)\} \equiv F(k) \equiv \int_{x=0}^{x=\infty} \exp(-kx)\, f(x)\, dx \tag*{((2.16.22))}$$

Of course, k has dimensions such that the product kx is dimensionless. The Laplace transform has the following useful properties:

(a) The Laplace transform of a constant C is

$$\mathcal{L}\{C\} = F(k) = \int_{x=0}^{x=\infty} \exp(-kx)\, C\, dx = -(C/k)\left[\exp(-kx)\right]_{x=0}^{x=\infty} = C/k \tag{6.13.1}$$

(b) The Laplace transform of a sum of functions is the sum of the transforms of each term.

(c) The Laplace transform of the first derivative, f′(x) ≡ df(x)/dx, of a function f(x) is given by

$$\mathcal{L}\{df(x)/dx\} = k\,\mathcal{L}\{f(x)\} - f(0) \tag{6.13.2}$$

(d) The Laplace transform of a second derivative is given by

$$\mathcal{L}\{d^2 f(x)/dx^2\} = k\,\mathcal{L}\{df(x)/dx\} - df(0)/dx = k^2 \mathcal{L}\{f(x)\} - k f(0) - df(0)/dx \tag{6.13.3}$$

(e) The Laplace transform of the integral of a function is given by

$$\mathcal{L}\left\{ \int_{x'=0}^{x'=x} f(x')\, dx' \right\} = k^{-1} \mathcal{L}\{f(x)\} \tag{6.13.5}$$

(f) The inverse Laplace transform of F(k) is given by

$$\mathcal{L}^{-1}\{F(k)\} = f(x) \tag{6.13.6}$$

Formally, this inverse transform is given by

$$\mathcal{L}^{-1}\{F(k)\} = (2\pi i)^{-1} \int_{k=c-i\infty}^{k=c+i\infty} \exp(+kx)\, F(k)\, dk \tag{6.13.7}$$

15 Pierre Simon marquis de Laplace (1749–1827).
These properties allow one to solve the integration problem in transform space and then back-transform the result into real space. There are extensive tables of Laplace transforms, some of which are given in Table 2.9.

PROBLEM 6.13.1. Solve by Laplace transform methods the opposing first-order reaction problem of Section 6.8:

$$\mathrm{A} \underset{k_b}{\overset{k_f}{\rightleftharpoons}} \mathrm{B} \tag*{((6.8.1))}$$

The coupled differential equations for this reaction are

$$d[A]/dt = -k_f [A] + k_b [B] \tag*{((6.8.2))}$$

$$d[B]/dt = k_f [A] - k_b [B] \tag*{((6.8.3))}$$

with the initial conditions: [A] = [A]0 and [B] = 0 at t = 0.
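As a warm-up for the problem above, the transform method can be tried on the simplest case, the irreversible first-order reaction A → D of Section 6.4. The sketch below uses only the derivative property (c) and linearity (b); to avoid a clash with the rate constant $k_1$, the transform variable is written $k$ and the transform of $[A](t)$ is written $\bar{A}(k)$, a notation introduced here only for this illustration.

```latex
\begin{align*}
\frac{d[A]}{dt} &= -k_1 [A], \qquad [A](0) = [A]_0 \\
\mathcal{L}\left\{\frac{d[A]}{dt}\right\} &= k\,\bar{A}(k) - [A]_0 = -k_1 \bar{A}(k)
   && \text{(property (c) applied to both sides)} \\
\bar{A}(k) &= \frac{[A]_0}{k + k_1}
   && \text{(an algebraic equation in transform space)} \\
\mathcal{L}\{\exp(-k_1 t)\} &= \int_0^{\infty} \exp(-kt)\exp(-k_1 t)\,dt = \frac{1}{k + k_1}
   && \text{(directly from the definition, for } k > -k_1\text{)} \\
[A](t) &= \mathcal{L}^{-1}\left\{\frac{[A]_0}{k + k_1}\right\} = [A]_0 \exp(-k_1 t)
\end{align*}
```

The differential equation has become an algebraic one, and the back-transform recovers the integrated first-order law. For the reversible problem the same steps give two coupled algebraic equations for the two transforms, which are solved together before back-transforming.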
6.14 REACTION RATE THEORIES AND ENERGY SURFACES

Since the 1920s, efforts have been made to calculate the energetics and pathways for chemical reactions. Assume that reagents A and B form a transition state or activated complex AB‡, which then becomes some product P:

$$\mathrm{A + B} \underset{k_{-2}}{\overset{k_2}{\rightleftharpoons}} \mathrm{AB}^{\ddagger} \rightarrow \mathrm{P} \tag{6.14.1}$$

If, as shown, there is some equilibrium between A, B, and AB‡, then one can define an equilibrium constant:

$$K^{\ddagger} = [AB^{\ddagger}]/[A][B] \tag{6.14.2}$$

and in the activated-complex theory one can write

$$[AB^{\ddagger}] = K^{\ddagger}[A][B] = z^{\ddagger} \{ z_A z_B \}^{-1} \exp(-\Delta E^{\ddagger}/k_B T)\,[A][B] \tag{6.14.3}$$

where the z's are molecular partition functions, ΔE‡ is the internal energy of activation (which is usually close to the Gibbs free energy of activation ΔG‡), $k_B$ is Boltzmann's constant, and T is the absolute temperature. For the reaction of Eq. (6.14.1) one can write

$$-d[A]/dt = k_2 [A][B] = k_2 [AB^{\ddagger}]/K^{\ddagger} = [AB^{\ddagger}]\,\nu^{\ddagger} \tag{6.14.4}$$

where ν‡ is the frequency of passage of AB‡ over the transition T1 indicated in Fig. 6.1. This becomes a formal recipe for computing the rate constant k2:

$$k_2 = \nu^{\ddagger} z^{\ddagger} \{ z_A z_B \}^{-1} \exp(-\Delta E^{\ddagger}/k_B T) \tag{6.14.5}$$

If one considers the partition function for the activated complex to be simply that of a single vibrator, then

$$z^{\ddagger} = \{ 1 - \exp(-h\nu^{\ddagger}/k_B T) \}^{-1} \approx (k_B T/h\nu^{\ddagger}) \tag{6.14.6}$$

This result was obtained by expanding exp(−hν‡/k_BT) in a power series and keeping only the first two terms. Thus finally we obtain Eyring's equation for the rate constant:

$$k_2 = \kappa\,(k_B T/h)\, z^{\ddagger} \{ z_A z_B \}^{-1} \exp(-\Delta E^{\ddagger}/k_B T) \tag{6.14.7}$$

where an ad hoc factor κ was inserted into the recipe as a transmission coefficient, which should correct for the fact that in the mechanism of Eq. (6.14.1) the activated complex (or transition state) AB‡ will not always proceed towards the products, but may also go back to A and B. For many gas-phase reactions κ is between 0.5 and 1.0.

Another way of looking at transition states is to assume that for the equilibrium constant K‡ of Eq. (6.14.2) one can write the usual Gibbs free energy expression:

$$\Delta G^{\ddagger} = \Delta H^{\ddagger} - T\Delta S^{\ddagger} = -k_B T \ln K^{\ddagger} \tag{6.14.8}$$

whence the rate constant of Eq. (6.14.7) can be rewritten as

$$k_2 = (k_B T/h) \exp(\Delta S^{\ddagger}/k_B) \exp(-\Delta H^{\ddagger}/k_B T) \tag{6.14.9}$$

where the transmission factor is ignored, or more simply, if PΔV‡ terms can be ignored, as

$$k_2 = A^{\ddagger} \exp(-\Delta H^{\ddagger}/k_B T) \tag{6.14.10}$$
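To get a feeling for the magnitudes involved, the thermodynamic form of the Eyring expression, k2 = κ(k_BT/h) exp(ΔS‡/k_B) exp(−ΔH‡/k_BT), can be evaluated directly. The sketch below works with molar activation quantities (so that R = N_A k_B replaces k_B in the exponents); the activation enthalpy and entropy in the example are assumed values chosen only for illustration, and the result is in s⁻¹ per unit of whatever standard concentration the equilibrium constant refers to.

```python
import math

K_B = 1.380649e-23         # Boltzmann constant, J/K
H_PLANCK = 6.62607015e-34  # Planck constant, J s
R = 8.314462618            # molar gas constant, J mol^-1 K^-1

def eyring_rate_constant(delta_h, delta_s, temperature, kappa=1.0):
    """Eyring rate constant from molar activation enthalpy (J/mol) and
    molar activation entropy (J mol^-1 K^-1); kappa is the transmission
    coefficient, taken as 1 unless stated otherwise."""
    prefactor = kappa * K_B * temperature / H_PLANCK
    return (prefactor * math.exp(delta_s / R)
            * math.exp(-delta_h / (R * temperature)))

# The universal frequency factor k_B T / h at 298.15 K.
print("k_B T / h (s^-1):", eyring_rate_constant(0.0, 0.0, 298.15))

# Hypothetical activation parameters: 50 kJ/mol and -40 J mol^-1 K^-1.
for temperature in (298.15, 308.15):
    print(temperature, eyring_rate_constant(50.0e3, -40.0, temperature))
```

The prefactor alone is of the order of 10¹² to 10¹³ s⁻¹ near room temperature, so the activation enthalpy and entropy carry essentially all of the chemistry.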
FIGURE 6.5 Idealized potential energy surface for the reaction AB + C → A + BC, showing the reagent region (AB + C), the product region (A + BC), and the path XY between them (axis labels in the original figure: X_BC − X_A, X_B − X_A, and X_C − X_AB). Redrawn from Moore [6].
The activation entropy DSz can then be computed by the methods of statistical mechanics. These ideas, taken together, are often called the Rice16, Ramsperger,17 Kassel,18 and Marcus19 (RRKM) theory. But it is also important to know in detail how chemical bonds are broken and formed, and which favorable geometries must exist for a successful reaction. For this, one must tediously calculate the internal energy surfaces for reagents and products in their ground and excited states, to establish an energy contour map in which the chemical reaction may take place. Then one must use random-walk or Monte-Carlo techniques to estimate how likely it is for a reaction to proceed along the lowest energy hills while random vibrations, collisions, and the like, are taking place. Fig. 6.5 shows a schematic energy surface for an idealized reaction AB þ C ! A þ BC: the curve XY indicates the “hill” with stretched bonds A. . .B. . .C that must be crossed, in one way or the other. The techniques of molecular dynamics can also be used, but these are practical only for gas-phase reactions for the first few nanoseconds or microseconds.
6.15 MARCUS THEORY OF ELECTRON TRANSFER

The question addressed here is the electron transfer:

$$\mathrm{D + A \rightarrow D^{+} + A^{-}} \tag{6.15.1}$$

16 Oscar Knefler Rice (1903–1978).
17 Herman Carl Ramsperger (1896–1932).
18 Louis S. Kassel (1905–1973).
19 Rudolph Arthur Marcus (1923– ).
FIGURE 6.6 Simplified representation of three cases for Marcus electron transfer theory [panels: (a) normal, (b) ideal, (c) inverted; vertical axis: Gibbs free energy (arbitrary units); horizontal axis: "reaction coordinate" x (arbitrary units)]. The relevant Gibbs free energy surfaces are simply represented as a parabola centered around the equilibrium coordinate(s) of the reagent (DA) and as a displaced parabola of the same slope for the product (D⁺A⁻) after the transfer of one electron. In all three cases the standard Gibbs free energy of reaction ΔG° is assumed to be negative (exergonic process). The cases are: (a) "normal case," where the free energy of activation ΔG* (identical with ΔG‡) is positive, and the reorganization free energy λ is larger in absolute value than ΔG°: λ > −ΔG°; (b) "ideal case," where the free energy of activation is zero, ΔG* = 0, and the reorganization free energy λ is equal and opposite to the free energy of reaction ΔG°: λ = −ΔG°; (c) "inverted case," where the reorganization free energy λ is smaller in absolute value than the free energy of reaction: λ < −ΔG°. In cases (a) and (c), ΔG* = (λ + ΔG°)²/4λ.
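The three cases in the caption of Fig. 6.6 can be reproduced with a few lines of arithmetic. The following minimal sketch classifies a reaction from its standard free energy ΔG° and reorganization free energy λ, and evaluates the activation free energy ΔG* = (λ + ΔG°)²/4λ; the function names and the numerical values (in eV) are assumptions made for illustration.

```python
import math

def marcus_activation_free_energy(delta_g0, lam):
    """Activation free energy (lam + delta_g0)^2 / (4 lam), same units as inputs."""
    return (lam + delta_g0) ** 2 / (4.0 * lam)

def marcus_regime(delta_g0, lam):
    """Classify an exergonic electron transfer as normal, ideal, or inverted."""
    if delta_g0 >= 0.0 or lam <= 0.0:
        raise ValueError("expects an exergonic reaction and a positive lam")
    if math.isclose(lam, -delta_g0):
        return "ideal"
    return "normal" if lam > -delta_g0 else "inverted"

# Assumed reorganization free energy of 1.0 eV and three driving forces.
LAMBDA = 1.0
for dg0 in (-0.5, -1.0, -1.5):
    print(dg0, marcus_regime(dg0, LAMBDA),
          marcus_activation_free_energy(dg0, LAMBDA))
```

The barrier vanishes when the driving force matches λ and reappears when the reaction becomes even more exergonic, which is the origin of the inverted region.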
The theoretical work of Marcus [7,9] and its experimental confirmation [9,10] prove that for this reaction the electron transfer rate $k_{ET}$ is given by (remember the Fermi²⁰ "golden rule"?):

$$k_{ET} = 2\pi \hbar^{-1} |T_{DA}|^2 F_{DA} \tag{6.15.2}$$

where $|T_{DA}|^2$ is the electronic coupling between the electron donor moiety D and the electron acceptor moiety A, and $F_{DA}$ is the Franck–Condon rearrangement factor, or vibrational overlap integral, between an electron donor region D and an electron acceptor region A connected by a rigid group σ in a molecule D-σ-A:

$$F_{DA} = (4\pi\lambda k_B T)^{-1/2} \exp\left[ -(\Delta G^{\circ} + \lambda)^2 / 4\lambda k_B T \right] \tag{6.15.3}$$

where, in turn, λ is the nuclear (geometrical) reorganization energy and ΔG° is the standard free energy of reaction (ΔG° < 0 for exergonic reactions). There are three cases: "normal," "ideal," and "inverted," shown in Fig. 6.6. The free energy difference ΔG° contains inter alia the difference ($I_D - A_A$), where $I_D$ is approximately the ionization energy of the electron donor moiety D, and $A_A$ is the electron affinity of the acceptor moiety A (more precisely, these are energy levels of the whole molecule D-σ-A). As ($I_D - A_A$) increases from zero, $F_{DA}$ initially remains close to 1, so the reaction speeds up as |ΔG°| increases; however, if $I_D - A_A$ becomes too large, $F_{DA}$ becomes small (large geometry change, thus big Franck–Condon effect), so the rate slows down by several orders of magnitude. Figure 6.7 shows the experimental evidence for this "inverted case" or "inverted region" [9,10].

FIGURE 6.7 Intramolecular electron transfer rate constants k (s⁻¹) as a function of the free energy difference for the reaction Biphenyl(−)-androstane-A → Biphenyl-androstane-A(−), estimated from the electrochemical reduction potentials in 2-methyltetrahydrofuran: the inverted region for electron transfer rates is prominent. Redrawn from Miller et al. [10].

The carry-home messages are as follows: (1) The difference ($I_D - A_A$) is important and, to first order, should be minimized; (2) in a device and under bias, ($I_D - A_A$) becomes smaller than in the gas phase; (3) if ($I_D - A_A$) is too large, then the rate of electron transfer may become unacceptably slow because of the Franck–Condon factor becoming small: It is a waste of time to make super-small but super-slow unimolecular devices.

Photon-capture efficiency, charge separation, and asymmetric electron transfer (rather than charge recombination) are vital steps in photosynthesis. Therefore considerable theoretical attention has been dedicated for several decades to measuring and understanding the rate of electron transfer $k_{ET}$ in a molecule D-B-A from the primary electron donor D to the primary electron acceptor A, across an intervening "bridge" B of length $d_{DA}$ (consisting of either saturated σ or unsaturated π bonds). Incidentally, Mother Nature evolved a subtle trick to improve the photoelectric efficiency: Photosystem I and Photosystem II have not one, but three, electron acceptors in series, not bonded to each other but located in close proximity; the downhill tunneling from the first to the second and third acceptors suppresses the charge recombination rate, and the conversion of photons to separated radical pairs becomes very efficient.

A single molecule D-B-A (either D-σ-A or D-π-A), after electron transfer, ultimately becomes the biradical D⁺-B-A⁻. The rate of the electron transfer reaction $k_{ET}$ (for D-B-A → D⁺-B-A⁻) depends on which mechanism is operative:

20 Enrico Fermi (1901–1954).
(i) a thermally activated, diabatic, incoherent, “hopping” mechanism that creates real (if short-lived) excited bridge states (D⁺-B⁻-A or D-B⁺-A⁻) with some positive definite activation energy ΔE‡:

$$k_{ET,\mathrm{hop}} = F \exp(-\Delta E^{\ddagger}/k_B T) \tag{6.15.4}$$
where F is a constant, or (ii) an adiabatic, “superexchange”, or coherent tunneling mechanism [11], which uses virtual states along the bridge:

$$k_{ET,\mathrm{tun}} = G(T) \exp(-\beta d_{DA}) \tag{6.15.5}$$
where G(T) and β are constants. The temperature-dependent prefactor G(T) includes energies for molecular reorganization, vibrations, and the Franck–Condon factor. If the bridge consists of several identical repeating components (e.g., phenylene or methylene groups), then the bias-independent decay constant β can be estimated [11] from

$$\beta = (2/a)\,\ln(\Delta E_B/\Delta E_{DB}) \tag{6.15.6}$$
where a is the length of the repeating component in the bridge, ΔE_DB is the energy gap between the relevant D⁺-B⁻-A state and the initial D-B-A state (assumed to lie lower), and ΔE_B is the coupling energy between adjacent bridge components [11]. If β is very small, then the exponential dependence of k_ET,tun on d_DA is no longer obvious from experiments.

Simmons²¹ has shown that the current I, as a function of applied voltage V, that traverses a molecule considered simply as a rectangular barrier of energy Φ_B and width d, in the direct tunneling regime V < Φ_B e⁻¹ [12,13], is given by

$$I = e\,(2\pi h d^{2})^{-1}\Big\{(\Phi_B - eV/2)\exp\!\big[-4\pi(2m)^{1/2}h^{-1}\alpha(\Phi_B - eV/2)^{1/2}d\big] - (\Phi_B + eV/2)\exp\!\big[-4\pi(2m)^{1/2}h^{-1}\alpha(\Phi_B + eV/2)^{1/2}d\big]\Big\} \tag{6.15.7}$$
where e is the electronic charge, h is Planck’s constant, and the dimensionless constant α corrects for a possible nonrectangular barrier or for using the electron rest mass m in place of a somewhat smaller “effective mass” m* = αm. The bias-independent decay constant β can be linked to the constants used in Simmons’ formula:

$$\beta = 4\pi\,2^{1/2}\,m^{1/2}\,h^{-1}\,\alpha\,\Phi_B^{1/2} \tag{6.15.8}$$
The hopping and tunneling mechanisms probably occur in tandem, so one can write

$$k_{ET} = k_{ET,\mathrm{hop}} + k_{ET,\mathrm{tun}} \tag{6.15.9}$$

Other mechanisms, such as resonant tunneling or variable-range hopping, can also occur.
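As a rough numerical illustration of Eqs. (6.15.4), (6.15.5), (6.15.8), and (6.15.9), the following Python sketch evaluates the Simmons decay constant and adds a hopping and a tunneling rate. The barrier height, prefactors, activation energy, and bridge length are hypothetical values chosen only for illustration; they are not taken from the text.

```python
import math

# CODATA constants in SI units.
PLANCK_H = 6.62607015e-34            # J s
ELECTRON_MASS = 9.1093837015e-31     # kg
ELEMENTARY_CHARGE = 1.602176634e-19  # C
BOLTZMANN_K = 1.380649e-23           # J/K


def simmons_beta(barrier_ev, alpha=1.0):
    """Decay constant beta (in 1/m) from Eq. (6.15.8); barrier height in eV."""
    barrier_joule = barrier_ev * ELEMENTARY_CHARGE
    return (4.0 * math.pi * math.sqrt(2.0 * ELECTRON_MASS)
            * alpha * math.sqrt(barrier_joule) / PLANCK_H)


def k_hop(prefactor, activation_ev, temperature):
    """Thermally activated hopping rate, Eq. (6.15.4)."""
    exponent = activation_ev * ELEMENTARY_CHARGE / (BOLTZMANN_K * temperature)
    return prefactor * math.exp(-exponent)


def k_tun(prefactor, beta, distance):
    """Distance-dependent tunneling rate, Eq. (6.15.5)."""
    return prefactor * math.exp(-beta * distance)


# Hypothetical inputs (assumptions, for illustration only).
beta = simmons_beta(barrier_ev=1.0)       # 1 eV rectangular barrier, alpha = 1
beta_per_angstrom = beta * 1.0e-10        # convert 1/m to 1/angstrom
k_total = (k_hop(1.0e12, 0.3, 298.15)     # Eq. (6.15.9): both channels add
           + k_tun(1.0e12, beta, 1.0e-9))

print(f"beta = {beta_per_angstrom:.3f} per angstrom")
print(f"k_ET = {k_total:.3e} per second")
```

For a 1 eV barrier with α = 1, the decay constant comes out close to 1 Å⁻¹, which is why tunneling rates fall off so steeply with bridge length.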
21 John George Simmons (1931– ).
Experimentally, when conductivity through a molecule of known size and orientation is measured, a formula similar to Eq. (6.15.5) is often used:

$$\sigma = \sigma_0(T) \exp(-\beta d_{DA}) \tag{6.15.10}$$

which assumes that the conductivity occurs by tunneling or superexchange through the molecule. When, however, the conductivity is “ohmic,” then a thermally activated process and multiple incoherent scattering events through a set of N repeat units is assumed:

$$\sigma = L\,(1/N) \exp(-\Delta E_{DB}/k_B T) \tag{6.15.11}$$

where L is another constant. Experimentally, this “ohmic” behavior is seen as

$$\sigma \approx (L/d_{DA}) \tag{6.15.12}$$
6.16 EQUILIBRIA IN AQUEOUS SOLUTION. pH

In aqueous solution the dissociation of H2O is regulated by the equilibrium

$$\mathrm{H_2O(l)} \rightleftharpoons \mathrm{H^+(aq)} + \mathrm{OH^-(aq)} \tag{6.16.1}$$

with equilibrium constant

$$[\mathrm{H^+}][\mathrm{OH^-}] \equiv K_w = 1.00 \times 10^{-14} \tag{6.16.2}$$

at 25 °C. This value varies considerably with temperature. Note that the equilibrium constant is not completely in accordance with the Guldberg–Waage convention: it does not contain the overwhelmingly constant and large concentration [H2O(l)] = 55.58 mol L⁻¹. For convenience, Eq. (6.16.2) can be recast in logarithmic form:

$$\mathrm{p}K_w \equiv -\log_{10} K_w = \mathrm{pH} + \mathrm{pOH} \equiv -\log_{10}[\mathrm{H^+}] - \log_{10}[\mathrm{OH^-}] \tag{6.16.3}$$
The prefix “p” in pH, pOH, and so on, denotes the negative of the Briggsian (decimal)²² logarithm of whatever concentration (in mol/L) follows the “p”; pH was defined by Sørensen²³ in 1909 from pondus hydrogenii, or “mass of hydrogen.” This definition violates the mathematical requirement that the argument of a logarithm must be a dimensionless number: if [H⁺] = 3.4 × 10⁻⁶ mol/L, then pH is 5.47; the formal “repair” is to assume pH ≡ −log10{[H⁺]/(1 M)}.

In water, a solvent with a high dielectric constant, protons H⁺ are not “free” species: they are bound as H⁺(aq) to one to many molecules of H2O, forming “hydronium ions” written as H3O⁺, H5O2⁺, or H7O3⁺. For simplicity, we will use the symbol H⁺ when we really mean H⁺(aq) or H3O⁺(aq).

22 Henry Briggs (1561–1630).
23 Søren Peder Lauritz Sørensen (1868–1939).
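The arithmetic behind the “p” notation and Eq. (6.16.3) can be checked with a few lines of Python. This is only a sketch: the function name `p_value` and the use of a 1 mol/L standard concentration to make the logarithm's argument dimensionless are illustrative choices, and K_w is taken as 1.00 × 10⁻¹⁴.

```python
import math

KW = 1.00e-14  # ionic product of water, Eq. (6.16.2)


def p_value(concentration, standard=1.0):
    """Negative decimal logarithm of a concentration divided by 1 mol/L."""
    return -math.log10(concentration / standard)


h_conc = 3.4e-6              # mol/L, the example used in the text
ph = p_value(h_conc)
poh = p_value(KW / h_conc)   # [OH-] follows from Kw = [H+][OH-]

print(f"pH  = {ph:.2f}")
print(f"pOH = {poh:.2f}")
print(f"pH + pOH = {ph + poh:.2f}")
```

The sum pH + pOH reproduces pK_w = 14.00, as Eq. (6.16.3) requires.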
The pH of human blood is 7.4; the pH of human stomach acid is between 1 and 2; the pH of the surface of the oceans is slightly above 8. Solutions of strong acids (e.g., HCl, HNO3, H2SO4 (first dissociation)) have a low pH, close to zero or even negative; solutions of strong bases (e.g., NaOH or KOH) have a high pH, close to 14. In strong acid solutions (e.g., HCl(aq)), the concentration of undissociated acid, [HCl(aq)], is vanishingly small: [HCl(aq)] ≈ 0; similarly, in strong base (e.g., NaOH(aq)), [NaOH(aq)] ≈ 0. Thus in an aqueous solution of a strong acid, the hydrogen ion concentration [H⁺(aq)] corresponds to the formal molarity of the strong acid added to water; in a strong base the hydroxide ion concentration [OH⁻(aq)] equals the molarity of the base.

In a titration of a known concentration of a strong acid (e.g., HCl) with a strong base (e.g., NaOH), the reaction of Eq. (6.16.1) is driven toward the left, creating water from the neutralization of as much H⁺(aq) as possible by the addition of OH⁻(aq). The pH will stay low until all the H⁺(aq) derived from the acid is eliminated; at that point there will be a large pH change (e.g., from pH 2 to pH 8), as the solution turns basic with the excess OH⁻ added. Parenthetically, “conductivity water” with pH 7 has no extra solvated ions and has a resistivity of 18.3 MΩ cm (0.183 MΩ m), due exclusively to H⁺(aq) and OH⁻(aq). In nonaqueous solvents, the classification of strong and weak acids and the pH scales are dramatically different!

When the acid is weak (e.g., CH3COOH or HF), it dissociates only partially:

$$\mathrm{CH_3COOH(aq)} \rightleftharpoons \mathrm{H^+(aq)} + \mathrm{CH_3COO^-(aq)}$$

with a measurable equilibrium constant for acid dissociation:

$$K_a = [\mathrm{H^+(aq)}][\mathrm{CH_3COO^-(aq)}]/[\mathrm{CH_3COOH}] = 1.75 \times 10^{-5}\ \mathrm{mol/L} \tag{6.16.4}$$

This equilibrium constant can be rewritten in logarithmic form:

$$\mathrm{p}K_a = \mathrm{pH} - \log_{10}\{[\mathrm{CH_3COO^-(aq)}]/[\mathrm{CH_3COOH}]\}; \qquad \mathrm{p}K_a = 4.757 \tag{6.16.5}$$
Equation (6.16.5) is known to biologists, but not always to chemists, as the Henderson²⁴–Hasselbalch²⁵ equation. In the Brønsted²⁶–Lowry²⁷ [14,15] nomenclature of 1923, the acetate ion CH3COO⁻(aq) is the conjugate base to acetic acid, CH3COOH(aq). Similarly, for a weak base, ammonium hydroxide NH4OH, or, more appropriately, aqueous ammonia NH3(aq), the dissociation is

$$\mathrm{NH_3(aq)} + \mathrm{H_2O(l)} \rightleftharpoons \mathrm{NH_4^+(aq)} + \mathrm{OH^-(aq)} \tag{6.16.6}$$

24 Lawrence Joseph Henderson (1878–1942).
25 Karl Albert Hasselbalch (1874–1962).
26 Johannes Nikolaus Brønsted (1879–1947).
27 Thomas Martin Lowry (1874–1936).
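for which there is a base dissociation constant:

$$K_b = [\mathrm{NH_4^+(aq)}][\mathrm{OH^-(aq)}]/[\mathrm{NH_3(aq)}] = 1.75 \times 10^{-5}\ \mathrm{mol/L}; \qquad \mathrm{p}K_b = 4.756 \tag{6.16.7}$$

and also

$$\mathrm{p}K_b = \mathrm{pOH} - \log_{10}\{[\mathrm{NH_4^+(aq)}]/[\mathrm{NH_3(aq)}]\} \tag{6.16.8}$$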
Again, NH4⁺(aq) is the conjugate acid to NH3(aq). Reacting, or “titrating,” acetic acid CH3COOH with a strong base (e.g., NaOH) causes small pH changes until close to the end point; the titration reaction is

$$\mathrm{CH_3COOH(aq)} + \mathrm{OH^-(aq)} \rightarrow \mathrm{CH_3COO^-(aq)} + \mathrm{H_2O(l)} \tag{6.16.9}$$
When the titration is half-accomplished (that is, when about half the necessary base has been added to the acetic acid, and thus [CH3COO⁻(aq)]/[CH3COOH] ≈ 1), then we are in the middle of the “buffer region”:

$$K_a \approx [\mathrm{H^+(aq)}], \quad \text{or} \quad \mathrm{pH} \approx \mathrm{p}K_a \tag{6.16.10}$$
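A short Python sketch of Eq. (6.16.5), rearranged as pH = pK_a + log10([CH3COO⁻]/[CH3COOH]), shows how slowly the pH moves around the midpoint of Eq. (6.16.10). The function name `buffer_ph` and the sample ratios are illustrative assumptions, and the sketch takes the ratio of equilibrium concentrations as given.

```python
import math

PKA_ACETIC = 4.757  # pKa of acetic acid, Eq. (6.16.5)


def buffer_ph(base_to_acid_ratio, pka=PKA_ACETIC):
    """pH from Eq. (6.16.5) for a given ratio [conjugate base]/[acid]."""
    return pka + math.log10(base_to_acid_ratio)


# Sample ratios (assumed for illustration) on either side of the midpoint.
for ratio in (0.2, 0.5, 1.0, 2.0, 5.0):
    print(f"[base]/[acid] = {ratio:4.1f}  ->  pH = {buffer_ph(ratio):.3f}")
```

A 25-fold change in the ratio (from 0.2 to 5.0) shifts the pH by only about 1.4 units, centered on pK_a.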
In this buffer region (typically 2 pH units wide), which is essential in human physiology, relatively large changes in the ratio of conjugate base to acid (e.g., [CH3COO⁻(aq)]/[CH3COOH] from 0.2 to 5.0) will cause only small increases in pH. When the buffer capacity is exhausted, at the end of the titration of weak acid with strong base, a large pH change will be seen; at the “equivalence point” of the titration, all the CH3COOH acid will have been converted to conjugate base CH3COO⁻(aq), and the dominant new reaction will be its hydrolysis:

$$\mathrm{CH_3COO^-(aq)} + \mathrm{H_2O(l)} \rightarrow \mathrm{CH_3COOH(aq)} + \mathrm{OH^-(aq)} \tag{6.16.11}$$
with the equilibrium constant K_b for (conjugate) base dissociation:

$$K_b = [\mathrm{CH_3COOH}][\mathrm{OH^-}]/[\mathrm{CH_3COO^-}] = K_w/K_a = 1.0 \times 10^{-14}/1.75 \times 10^{-5} = 5.71 \times 10^{-10} \tag{6.16.12}$$

The students’ job is to detect an “end point” as close as possible to that equivalence point, by using either organic indicator acids, which change color in that pH range, or conductivity changes, or pH-sensitive voltage measurements.

The acetic acid problem must simultaneously satisfy two equilibria:

$$K_a = [\mathrm{H^+}][\mathrm{CH_3COO^-}]/[\mathrm{CH_3COOH}] = 1.75 \times 10^{-5}\ \mathrm{mol/L}; \qquad \mathrm{p}K_a = 4.757 \tag{6.16.4}$$

$$[\mathrm{H^+}][\mathrm{OH^-}] \equiv K_w = 1.00 \times 10^{-14}; \qquad \mathrm{p}K_w = 14.00 \tag{6.16.2}$$
plus two equations for the conservation of mass and charge:

$$[\mathrm{CH_3COO^-}] + [\mathrm{CH_3COOH}] = c_1 \tag{6.16.13}$$

$$[\mathrm{CH_3COO^-}] + [\mathrm{OH^-}] = [\mathrm{H^+}] \tag{6.16.14}$$

where c_1 is the total analytical concentration. Altogether, there are four equations in the four unknowns [H⁺], [CH3COO⁻], [CH3COOH], and [OH⁻]. In the Brønsted–Lowry nomenclature, one can conveniently redefine the acid concentration as [CH3COOH] ≡ [A] and redefine its conjugate base concentration as [CH3COO⁻] ≡ [B]. If the concentration of acid before reaction is C_A, and the concentration of conjugate base before reaction is C_B, then a convenient master equation is [16]:

$$K_a = \frac{[\mathrm{H^+}]\{C_B + [\mathrm{H^+}] - [\mathrm{OH^-}]\}}{\{C_A - [\mathrm{H^+}] + [\mathrm{OH^-}]\}} \tag{6.16.15}$$
Using [OH⁻] = K_w/[H⁺] in this master equation and solving for [H⁺] yields a cubic equation:

$$[\mathrm{H^+}]^3 + (C_B + K_a)[\mathrm{H^+}]^2 - (C_A K_a + K_w)[\mathrm{H^+}] - K_w K_a = 0 \tag{6.16.16}$$

which is not always convenient to solve. Once [H⁺] is known, the equilibrium concentrations of conjugate base and acid follow from

$$[\mathrm{CH_3COO^-}] = C_B + [\mathrm{H^+}] - [\mathrm{OH^-}] \tag{6.16.17}$$

$$[\mathrm{CH_3COOH}] = C_A - [\mathrm{H^+}] + [\mathrm{OH^-}] \tag{6.16.18}$$

This master equation requires for C_A and C_B not their initial values before the titration started, but their formal concentrations (“analytical concentrations”) computed by the analyst at any given point due to the addition of chemicals in the titration; for the ensuing reaction (dissociation or hydrolysis), the equation will calculate [H⁺], from which new “equilibrium concentration” values of [OH⁻], [CH3COO⁻], and [CH3COOH] are trivially obtained. If a salt of the conjugate base (e.g., sodium acetate) has been added to a solution of acetic acid, then C_B is increased accordingly. As the titration proceeds, values of C_A and/or C_B are computed at every step by the user. Past the equivalence point, C_A becomes negligibly small but C_B becomes large, because of conversion of weak acid to conjugate base, and any extra amounts of OH⁻ must be a separate term added to C_B. In most cases a cubic equation is not needed, because often one term (e.g., [OH⁻]) can be neglected, so that only quadratic equations must be solved. The titration of the weak acid CH3COOH with the strong base NaOH is discussed in detail in Problem 6.16.1 and shown in Fig. 6.8.

The buffer capacity is defined as

$$\pi \equiv \Delta b/\Delta(\mathrm{pH}) \equiv -\Delta a/\Delta(\mathrm{pH}) \tag{6.16.19}$$
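where Δa and Δb are the changes in acid or base concentration due to Δ(pH), the change in pH.

There are several “polybasic” acids, with successive equilibria:

$$\mathrm{H_2SO_4(aq)} \rightleftharpoons \mathrm{HSO_4^-(aq)} + \mathrm{H^+(aq)}; \qquad K_{a1} = \infty\ \mathrm{mol/L}; \quad \mathrm{p}K_{a1} = -\infty \tag{6.16.20}$$

$$\mathrm{HSO_4^-(aq)} \rightleftharpoons \mathrm{SO_4^{2-}} + \mathrm{H^+(aq)}; \qquad K_{a2} = 1.0 \times 10^{-2}\ \mathrm{mol/L}; \quad \mathrm{p}K_{a2} = 1.99 \tag{6.16.21}$$

for sulfuric acid;

$$\mathrm{CO_2(aq)} + \mathrm{H_2O(l)} \rightleftharpoons \mathrm{HCO_3^-(aq)} + \mathrm{H^+(aq)}; \qquad K_{a1} = 4.45 \times 10^{-7}\ \mathrm{mol/L}; \quad \mathrm{p}K_{a1} = 6.352 \tag{6.16.22}$$

$$\mathrm{HCO_3^-(aq)} \rightleftharpoons \mathrm{CO_3^{2-}} + \mathrm{H^+(aq)}; \qquad K_{a2} = 4.69 \times 10^{-11}\ \mathrm{mol/L}; \quad \mathrm{p}K_{a2} = 10.329 \tag{6.16.23}$$

for “carbonic acid”; and finally

$$\mathrm{H_3PO_4(aq)} \rightleftharpoons \mathrm{H_2PO_4^-(aq)} + \mathrm{H^+(aq)}; \qquad K_{a1} = 7.11 \times 10^{-3}\ \mathrm{mol/L}; \quad \mathrm{p}K_{a1} = 2.148 \tag{6.16.24}$$

$$\mathrm{H_2PO_4^-(aq)} \rightleftharpoons \mathrm{HPO_4^{2-}(aq)} + \mathrm{H^+(aq)}; \qquad K_{a2} = 6.32 \times 10^{-8}\ \mathrm{mol/L}; \quad \mathrm{p}K_{a2} = 7.199 \tag{6.16.25}$$

$$\mathrm{HPO_4^{2-}(aq)} \rightleftharpoons \mathrm{PO_4^{3-}(aq)} + \mathrm{H^+(aq)}; \qquad K_{a3} = 4.5 \times 10^{-13}\ \mathrm{mol/L}; \quad \mathrm{p}K_{a3} = 12.35 \tag{6.16.26}$$

for phosphoric acid. Polyacidic bases are rarer.

FIGURE 6.8 Titration of 50 mL of 2 × 10⁻⁴ M acetic acid with 4 × 10⁻⁴ M sodium hydroxide. Equivalence point after 25 mL of base were added: pH 7.47. [Plot: pH (3 to 10) versus mL of OH⁻ added (0 to 30), with curves labeled p(acetic acid) and p(acetate).]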
To treat the phosphoric acid titrations step by step [16], one must identify all the reactions, including three acid–base dissociations, water dissociation, and two equations for the conservation of mass and charge, for the six unknowns [H3PO4], [H2PO4⁻], [HPO4²⁻], [PO4³⁻], [H⁺], and [OH⁻]; at any given pH, typically one of these six equations will be dominant. A master equation similar to Eq. (6.16.15) for dibasic acids would require solving a quartic equation and is impractical; for a tribasic acid, a master equation would require solving a quintic equation, for which no general closed-form solution exists. In all these cases, useful approximations deal only with the significant concentrations and thus involve at most quadratic equations.

A useful quantity is the fraction of total acid that is in any of its intermediate states. For a dibasic acid, this is

$$C_A \equiv [\mathrm{H_2A}] + [\mathrm{HA^-}] + [\mathrm{A^{2-}}] = [\mathrm{H_2A}]\{1 + K_{a1}/[\mathrm{H^+}] + K_{a1}K_{a2}/[\mathrm{H^+}]^2\} \tag{6.16.27}$$

$$\alpha_0 \equiv [\mathrm{H_2A}]/C_A = [\mathrm{H^+}]^2\{[\mathrm{H^+}]^2 + K_{a1}[\mathrm{H^+}] + K_{a1}K_{a2}\}^{-1} \tag{6.16.28}$$

$$\alpha_1 \equiv [\mathrm{HA^-}]/C_A = K_{a1}[\mathrm{H^+}]\{[\mathrm{H^+}]^2 + K_{a1}[\mathrm{H^+}] + K_{a1}K_{a2}\}^{-1} \tag{6.16.29}$$

$$\alpha_2 \equiv [\mathrm{A^{2-}}]/C_A = K_{a1}K_{a2}\{[\mathrm{H^+}]^2 + K_{a1}[\mathrm{H^+}] + K_{a1}K_{a2}\}^{-1} \tag{6.16.30}$$
For a general n-basic acid, with the common denominator

$$D_n \equiv [\mathrm{H^+}]^n + K_{a1}[\mathrm{H^+}]^{n-1} + K_{a1}K_{a2}[\mathrm{H^+}]^{n-2} + \cdots + K_{a1}K_{a2}\cdots K_{an}$$

these expressions become

$$\alpha_0 \equiv [\mathrm{H}_n\mathrm{A}]/C_A = [\mathrm{H^+}]^n D_n^{-1} \tag{6.16.31}$$

$$\alpha_1 \equiv [\mathrm{H}_{n-1}\mathrm{A^-}]/C_A = K_{a1}[\mathrm{H^+}]^{n-1} D_n^{-1} \tag{6.16.32}$$

$$\alpha_2 \equiv [\mathrm{H}_{n-2}\mathrm{A^{2-}}]/C_A = K_{a1}K_{a2}[\mathrm{H^+}]^{n-2} D_n^{-1}, \quad \ldots, \quad \alpha_n \equiv [\mathrm{A}^{n-}]/C_A = K_{a1}K_{a2}\cdots K_{an} D_n^{-1} \tag{6.16.33}$$
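The fractions of Eqs. (6.16.31) to (6.16.33) are easy to tabulate numerically. The sketch below is one possible implementation, not the text's own; the function name `alpha_fractions` is an illustrative choice, and the dissociation constants for phosphoric acid are computed from the pK_a values quoted in Eqs. (6.16.24) to (6.16.26).

```python
def alpha_fractions(h_conc, ka_values):
    """Return [alpha_0, ..., alpha_n] for an n-basic acid at a given [H+]."""
    n = len(ka_values)
    terms = []
    ka_product = 1.0
    for j in range(n + 1):
        if j > 0:
            ka_product *= ka_values[j - 1]            # Ka1 * Ka2 * ... * Kaj
        terms.append(ka_product * h_conc ** (n - j))  # numerator of alpha_j
    denominator = sum(terms)                          # common denominator
    return [term / denominator for term in terms]


# Phosphoric acid: Ka values derived from the quoted pKa1, pKa2, pKa3.
H3PO4_PKA = (2.148, 7.199, 12.35)
H3PO4_KA = [10.0 ** (-pka) for pka in H3PO4_PKA]

# At pH = pKa2 the species H2PO4- and HPO4(2-) should be nearly equal.
fractions = alpha_fractions(10.0 ** (-7.199), H3PO4_KA)
for j, alpha in enumerate(fractions):
    print(f"alpha_{j} = {alpha:.6f}")
```

Because every fraction shares the same denominator, the fractions sum to 1 by construction, which is a convenient check on any hand calculation.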
PROBLEM 6.16.1. [Na⁺] does not formally enter into Eq. (6.16.15), which in a titration of CH3COOH with NaOH requires

1. [Na⁺] + [H⁺] = [OH⁻] + [CH3COO⁻] (charge balance)

If we were to include [Na⁺], we would have this charge balance equation, plus four more equations, totaling 5 equations in the 5 unknowns [CH3COOH], [CH3COO⁻], [H⁺], [OH⁻], and [Na⁺]:

2. [CH3COOH] + [CH3COO⁻] = C_A + C_B (fate of acetic acid)
3. [Na⁺] = C_B (fate of sodium hydroxide)
4. [H⁺][CH3COO⁻]/[CH3COOH] = K_a (acetic acid dissociation)
5. [H⁺][OH⁻] = K_w (water dissociation)
PROBLEM 6.16.2. Prove Eq. (6.16.15) and Eq. (6.16.16).

PROBLEM 6.16.3. Sketch curves for the progress of a titration of 50 mL of 2 × 10⁻⁴ M acetic acid, a weak acid (K_a = 1.75 × 10⁻⁵ mol/L), with a strong base, 4 × 10⁻⁴ M NaOH. The equivalence point of the titration will be reached when 25 mL of NaOH have been added. The following regions require different equations to be solved:
(i) Before any base is added, there are equal amounts of [H⁺] and [CH3COO⁻] produced by the dissociation of acetic acid; show that [H⁺] = [CH3COO⁻] = 5.1054 × 10⁻⁵ M and [CH3COOH] = 1.4895 × 10⁻⁴ M.
(ii) For the first few milliliters of base added, more OH⁻ will react with CH3COOH to produce H2O and CH3COO⁻.
(iii) In the buffer region.
(iv) At the equivalence point, all the initial concentration of CH3COOH will have been converted to acetate, CH3COO⁻, and hydrolysis will set in: acetate will react with water to produce equal amounts of OH⁻ and CH3COOH.
(v) Beyond the equivalence point, the pH is dominated by the addition of excess strong base; the acid is gone (except for a small back-reaction); the conjugate base has been made as large as possible by the equivalence point and now is slightly affected by [H⁺] or the extra [OH⁻] produced.

PROBLEM 6.16.4. Repeat the titration of acetic acid, but now starting with 50 mL of 1 M acetic acid and adding y mL of 1 M NaOH.

PROBLEM 6.16.5. For a dibasic acid H2A with two equilibria

H2A ⇌ H⁺ + HA⁻, K_a1 = [H⁺][HA⁻]/[H2A]
HA⁻ ⇌ H⁺ + A²⁻, K_a2 = [H⁺][A²⁻]/[HA⁻]

plus a stoichiometric condition

C_A = [H2A] + [HA⁻] + [A²⁻]

and the electroneutrality condition

[H⁺] = [HA⁻] + 2[A²⁻]

we get four equations in four unknowns. Certain simplifications are possible:

1. As a first approximation, show that by (1a) ignoring the second dissociation, [A²⁻] ≈ 0, whence the electroneutrality condition yields [H⁺] ≈ [HA⁻]; (1b) assuming [H⁺] ≫ [OH⁻]; and (1c) setting C_B = 0, these approximations used in the master equation yield

[H⁺] ≈ {K_a1 (C_A − [H⁺])}^(1/2)

Note also that within this approximation the second dissociation yields

[A²⁻] = K_a2 [HA⁻]/[H⁺] ≈ K_a2

2. Show that, as a second approximation,

[H⁺]′ = [HA⁻] + [A²⁻], [HA⁻]′ = [HA⁻] − [A²⁻]

from which a better value for [A²⁻], called [A²⁻]′, can also be calculated.
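The cubic of Eq. (6.16.16) has exactly one positive root when C_A and C_B are non-negative, so it can be solved robustly by bisection. The Python sketch below is an illustrative numerical check of the figures quoted in Problem 6.16.3 and in the caption of Fig. 6.8, not the text's own method. The function names are hypothetical, and the bookkeeping (C_B equal to the diluted base added, C_A equal to the diluted acid minus base) follows the equations listed in Problem 6.16.1. It is restricted to the range up to the equivalence point.

```python
import math

KA = 1.75e-5   # acetic acid dissociation constant, mol/L, Eq. (6.16.4)
KW = 1.00e-14  # ionic product of water, Eq. (6.16.2)


def hydrogen_ion(ca, cb, ka=KA, kw=KW):
    """Positive root of the cubic Eq. (6.16.16), found by bisection."""
    def cubic(h):
        return h**3 + (cb + ka) * h**2 - (ca * ka + kw) * h - kw * ka

    low, high = 1.0e-15, 10.0        # cubic(low) < 0 < cubic(high)
    for _ in range(200):
        mid = math.sqrt(low * high)  # geometric midpoint (log-scale bisection)
        if cubic(mid) > 0.0:
            high = mid
        else:
            low = mid
    return math.sqrt(low * high)


def titration_ph(v_base_ml, v_acid_ml=50.0, c_acid=2.0e-4, c_base=4.0e-4):
    """pH after adding v_base_ml of NaOH (valid up to the equivalence point)."""
    total_ml = v_acid_ml + v_base_ml
    mmol_acid = v_acid_ml * c_acid
    mmol_base = v_base_ml * c_base
    ca = (mmol_acid - mmol_base) / total_ml  # formal acid concentration C_A
    cb = mmol_base / total_ml                # formal conjugate-base conc. C_B
    return -math.log10(hydrogen_ion(ca, cb))


h_initial = hydrogen_ion(2.0e-4, 0.0)        # region (i): no base added yet
print(f"[H+] before titration = {h_initial:.4e} M")
for volume in (0.0, 5.0, 12.5, 20.0, 25.0):
    print(f"{volume:5.1f} mL NaOH  ->  pH = {titration_ph(volume):.2f}")
```

At such low concentrations the water term K_w is not negligible at the equivalence point, which is why the full cubic, rather than the usual quadratic hydrolysis approximation, is used here.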
6.17 EQUILIBRIA IN NONAQUEOUS SOLVENTS

Brønsted–Lowry acid–base equilibria in nonaqueous solvents are very different from those in water, because bare protons (H⁺) do not exist with the nonaqueous solvent molecules, but stay close, by Coulomb attraction, to the corresponding conjugate base. Thus perchloric acid HClO4 will dissociate into its conjugate base ClO4⁻ plus the associated proton p⁺ or H⁺. Much depends on the dielectric constant ε of the solvent. Many solvents are amphiprotic: they can undergo autoprotolysis (e.g., 2H2O ⇌ H3O⁺ + OH⁻; 2C2H5OH ⇌ C2H5OH2⁺ + C2H5O⁻; 2CH3COOH ⇌ CH3COOH2⁺ + CH3COO⁻; 2NH3 ⇌ NH4⁺ + NH2⁻; or in general 2S ⇌ SH⁺ + S⁻); some are aprotic (e.g., benzene, C6H6, and carbon tetrachloride, CCl4); some are basic but not acidic; none are known to be acidic but not basic. Table 6.1 lists some acid dissociation constants in different solvents (similar data are available for bases).

The Hammett²⁸ acidity function H0 can be used to measure or estimate the acid dissociation pK_a for the conjugate acid BH⁺ of many weak bases B in solvents of high dielectric constant:

$$H_0 \equiv \mathrm{p}K_a + \log_{10}\{[\mathrm{B}]/[\mathrm{BH^+}]\} \tag{6.17.1}$$

This function and its dependence on acid concentration yield estimates of pK_a for weak uncharged bases.
28 Louis Plack Hammett (1894–1987).
Table 6.1 First Acid Dissociation Constants (pKa1) in Various Solvents

Acid | H2O (ε = 78.30) | C2H5OH (ε = 24.55) | CH3COOH (ε = 6.70)
HClO4 | −∞ | | 4.87
H2SO4 | −∞ | | 7.27
HCl | −∞ | | 8.55
HCOOH | 3.75 | 9.15 |
CH3COOH | 4.76 | 10.32 |
C6H5COOH | 4.20 | 10.25 |
6.18 LEWIS ACIDS AND LEWIS BASES

A vast generalization beyond the Brønsted–Lowry concepts of acids and bases is the concept of a Lewis²⁹ base (an electron pair donor) and a Lewis acid (an electron pair acceptor). This concept has been used extensively in all branches of chemistry. In physical organic chemistry, quantities of the type pA = −log10[A] have been used extensively to study reactivities, for example in the Hammett equation.
6.19 ELECTROCHEMISTRY. ELECTRODE POTENTIALS, AND THE NERNST EQUATION

For oxidation–reduction reactions in aqueous solution under an externally applied electrical potential, or in its absence, one can write O for the oxidized species and R for the reduced species, and the half-cell reaction can be written as

$$n_O\,\mathrm{O(aq)} + n\,e^- \rightleftharpoons n_R\,\mathrm{R(aq)} \tag{6.19.1}$$

where n_O molecules (or ions) of O produce n_R molecules (or ions) of R, with the addition of n electrons. This reduction must be matched by an oxidation at the other electrode. By convention, we can use the normal hydrogen electrode (NHE) or standard hydrogen electrode (SHE), which assumes unit hydrogen ion activity (1 M) and 1 atm H2(g) pressure. The half-cell standard reduction potential for SHE or NHE is defined as E°red = 0.000 V = E°ox:

$$n\,\mathrm{H^+(aq, 1\ M)} + n\,e^- \rightleftharpoons (n/2)\,\mathrm{H_2(g)} \tag{6.19.2}$$

This then gives us an overall cell reaction

$$n_O\,\mathrm{O(aq)} + (n/2)\,\mathrm{H_2(g, 1\ atm)} \rightleftharpoons n\,\mathrm{H^+(aq, 1\ M)} + n_R\,\mathrm{R(aq)} \tag{6.19.3}$$

The Gibbs free energy of this reaction is given by

$$\Delta G = \Delta G^{\circ} + RT \ln\left\{\frac{[\mathrm{R}]^{n_R}[\mathrm{H^+}]^{n}}{[\mathrm{O(aq)}]^{n_O}[\mathrm{H_2(g)}]^{n/2}}\right\} \tag{6.19.4}$$

29 Gilbert Newton Lewis (1875–1946).
As a concrete example, consider (a) an electrochemical cell involving two Pt electrodes (the left one, Pt_L, and the right one, Pt_R) and (b) the equilibrium between silver chloride, a salt with sparing solubility in water (K_sp = [Ag⁺][Cl⁻] = 1.8 × 10⁻¹⁰ mol² L⁻²), and a hydrogen electrode. Hydrogen gets oxidized, and silver chloride gets reduced:

$$\mathrm{H_2(g)} \rightleftharpoons 2\mathrm{H^+(aq)} + 2e^-(\mathrm{Pt_L}) \tag{6.19.5}$$

$$\mathrm{AgCl(s)} + e^-(\mathrm{Pt_R}) \rightleftharpoons \mathrm{Ag(s)} + \mathrm{Cl^-(aq)} \tag{6.19.6}$$

For chemical balance, the two half-reactions have to involve the same number of net electrons. By convention, half-reactions are tabulated internationally as reductions; thus the above half-reactions are assigned:

$$2\mathrm{H^+(aq)} + 2e^-(\mathrm{Pt_L}) \rightleftharpoons \mathrm{H_2(g)}; \qquad E_{\mathrm{red1}} \tag{6.19.7}$$

$$2\mathrm{AgCl(s)} + 2e^-(\mathrm{Pt_R}) \rightleftharpoons 2\mathrm{Ag(s)} + 2\mathrm{Cl^-(aq)}; \qquad E_{\mathrm{red2}} \tag{6.19.8}$$

Electrochemists usually avoid the suffix “red” in E_red1; it is added here for emphasis. The half-reaction of Eq. (6.19.5) is an oxidation; the half-reaction of Eq. (6.19.6) is a reduction. The net reaction is the sum

$$2\mathrm{AgCl(s)} + \mathrm{H_2(g)} + 2e^-(\mathrm{Pt_R}) \rightleftharpoons 2\mathrm{Ag(s)} + 2\mathrm{Cl^-(aq)} + 2\mathrm{H^+(aq)} + 2e^-(\mathrm{Pt_L}) \tag{6.19.9}$$

The overall electrochemical reaction is

$$2\mathrm{AgCl(s)} + \mathrm{H_2(g)} \rightleftharpoons 2\mathrm{Ag(s)} + 2\mathrm{Cl^-(aq)} + 2\mathrm{H^+(aq)} \tag{6.19.10}$$

and the net cell potential is

$$E_{\mathrm{cell}} = E_{\mathrm{red2}} - E_{\mathrm{red1}} \tag{6.19.11}$$

The overall cell reaction is written diagrammatically as

$$\mathrm{Pt_L}\,|\,\mathrm{H_2(g)}\,|\,\mathrm{HCl(aq)}\,|\,\mathrm{AgCl(s)}\,|\,\mathrm{Ag}\,|\,\mathrm{Pt_R} \tag{6.19.12}$$
where the single vertical bar denotes a phase boundary. By convention, for chemists, the anode is the electrode at which oxidation occurs [the left electrode in Eq. (6.19.5) and also in Eq. (6.19.12)], and the cathode is the electrode at which reduction occurs [the right electrode in Eq. (6.19.6) and also in Eq. (6.19.12)]. (Alas, physicists consider the cathode as the electrode from which electrons are emitted: this can disagree with the chemists’ convention.) If the reaction is spontaneous as written, then ΔG_cell < 0, the cell potential is positive (E_cell > 0), and, when a wire connects Pt_L to Pt_R in Eq. (6.19.12), electrons will flow externally through the wire from Pt_L to Pt_R until the chemical concentrations reach equilibrium values and the reaction stops. Such a cell is considered a galvanic cell. If the reaction is nonspontaneous as written, then (i) ΔG_cell > 0, (ii) E_cell < 0, and (iii) the cell is called an electrolytic cell (driven by an outside source of potential). It can be shown that

$$\Delta G_{\mathrm{cell}} = -nFE_{\mathrm{cell}} \tag{6.19.13}$$
where F is the Faraday³⁰ constant, F = 96,485 C mol⁻¹ (equivalently, J V⁻¹ mol⁻¹), and n is the number of electrons involved in the balanced half-reactions, for example in Eqs. (6.19.5) and (6.19.6). The Nernst³¹ equation is the electrochemical analog of Eq. (6.2.19):

$$E = E^{\circ} + (RT/nF) \ln\left[\frac{[\mathrm{O}]^{n_O}[\mathrm{H_2(g)}]^{n/2}}{[\mathrm{R}]^{n_R}[\mathrm{H^+}]^{n}}\right] \tag{6.19.14}$$

Since the hydrogen electrode by convention was set to E = 0.000 V for unit activities of both H⁺ and H2(g), this simplifies to

$$E = E^{\circ} + (RT/nF) \ln\{[\mathrm{O(aq)}]^{n_O}/[\mathrm{R}]^{n_R}\} \tag{6.19.15}$$

For example, for the AgCl-H2 electrode reaction, Eq. (6.19.10), the cell potential is

$$E = E^{\circ}_{\mathrm{red2}} - E^{\circ}_{\mathrm{red1}} - (RT/2F) \ln\{[\mathrm{Cl^-(aq)}]^2[\mathrm{H^+(aq)}]^2\} \tag{6.19.16}$$

FIGURE 6.9 Conventional symbols for a three-electrode electrochemical cell (CE, RE, WE). Most of the IR drop is between WE and CE.
(assuming that all activity coefficients γ = 1). Several standard electrode potentials (reduction potentials at unit activities at 298.15 K) are listed in Table 6.2. The standard potentials are valid at “zero current,” that is, before any electrons are ever moved. In practical cells and when finite currents are passed, the cell potentials are affected by the finite resistance R of the electrolyte, which causes an “IR drop” across the cell, and also by “overpotentials,” due to polarizations of the solution caused by (i) a finite mass transfer rate, (ii) a preceding reaction, or (iii) charge transfer. If the “IR drop” is less than 0.002 V, then two-electrode cells are adequate for reproducible measurements (e.g., in polarography). In general, to compensate for larger IR drops, a three-electrode setup is used: most of the current I is passed between (i) the working electrode (WE) and (ii) an auxiliary electrode or counter electrode (CE), between which most of the IR drop will occur. The potential is monitored between WE and (iii) a reference electrode (RE), which draws very little current; it is most often an NHE, or a standard calomel electrode (Hg|Hg2Cl2|KCl), or an Ag|AgNO3 electrode. A hopefully small fraction of the overall internal resistance, known as the “uncompensated” resistance R_u, will still be present between WE and RE; the goal is to make R_u/R tolerably small. Figure 6.9 shows the symbols used for three-electrode electrochemical cells.
30 Michael Faraday (1791–1867).
31 Walther Hermann Nernst (1864–1941).
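As a worked illustration of Eqs. (6.19.13) and (6.19.16), the Python sketch below evaluates the potential of the AgCl-H2 cell for an assumed HCl concentration. It is a sketch under stated assumptions: concentrations are used in place of activities (all activity coefficients set to 1), H2 is at unit pressure, [H⁺] = [Cl⁻] equals the formal HCl molarity, and the 0.01 M concentration and the function names are hypothetical choices for illustration. The standard potential of the AgCl/Ag couple is the Table 6.2 value.

```python
import math

R_GAS = 8.314462618    # gas constant, J/(mol K)
FARADAY = 96485.33212  # Faraday constant, C/mol
T_STANDARD = 298.15    # K

E0_AGCL = 0.22233      # V, AgCl(s) + e- -> Ag(s) + Cl-(aq), Table 6.2
E0_SHE = 0.0           # V, standard hydrogen electrode (by definition)


def agcl_h2_cell_potential(c_hcl, temperature=T_STANDARD):
    """Cell potential from Eq. (6.19.16), with [H+] = [Cl-] = c_hcl."""
    n = 2                                       # electrons in Eq. (6.19.10)
    quotient = (c_hcl ** 2) * (c_hcl ** 2)      # [Cl-]^2 [H+]^2
    nernst_slope = R_GAS * temperature / (n * FARADAY)
    return E0_AGCL - E0_SHE - nernst_slope * math.log(quotient)


def gibbs_energy(n, e_cell):
    """Free energy change in J/mol from Eq. (6.19.13)."""
    return -n * FARADAY * e_cell


e_standard = agcl_h2_cell_potential(1.0)        # unit concentrations
e_dilute = agcl_h2_cell_potential(0.01)         # assumed 0.01 M HCl
print(f"E (1 M HCl)    = {e_standard:.5f} V")
print(f"E (0.01 M HCl) = {e_dilute:.4f} V")
print(f"Delta G standard = {gibbs_energy(2, e_standard) / 1000.0:.2f} kJ/mol")
```

The positive cell potential and negative free energy confirm that the cell of Eq. (6.19.12) is galvanic as written; diluting the HCl makes the potential more positive.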
Table 6.2 Selected Standard Electrode Reduction Potentials E°red in Water (V vs. NHE) at 298.15 K, Assuming Unimolar Concentrations or Activities for Solutes and Unit Pressures or Fugacities for Gases (Ordered First by Potential, then Alphabetically)

(ordered by potential)
Half-reaction    E°red (V)
(1/2)F2(g) + H⁺(aq) + e⁻ → HF(aq)    +3.05
F2(g) + 2e⁻ → 2F⁻(aq)    +2.87
HMnO4⁻(aq) + 3H⁺(aq) + 2e⁻ → MnO2(s) + 2H2O(l)    +2.09
O3(g) + 2H⁺(aq) + 2e⁻ → O2(g) + H2O(l)    +2.075
S2O8²⁻(aq) + 2e⁻ → 2SO4²⁻(aq)    +2.010
Ag²⁺(aq) + e⁻ → Ag⁺(aq)    +1.98
BrO4⁻(aq) + 2H⁺(aq) + 2e⁻ → BrO3⁻(aq) + H2O(l)    +1.85
Co³⁺(aq) + e⁻ → Co²⁺(aq)    +1.82
H2O2(aq) + 2H⁺(aq) + 2e⁻ → 2H2O(l)    +1.776
AgO(s) + 2H⁺(aq) + e⁻ → Ag⁺(aq) + H2O(l)    +1.77
MnO4⁻(aq) + 4H⁺(aq) + 3e⁻ → MnO2(s) + 2H2O(l)    +1.70
Pb⁴⁺(aq) + 2e⁻ → Pb²⁺(aq)    +1.69
PbO2(s) + 4H⁺(aq) + SO4²⁻(aq) + 2e⁻ → PbSO4(s) + 2H2O(l)    +1.685
Co³⁺(aq) + e⁻ → Co²⁺ (in 3 M HNO3)    +1.68
Au⁺(aq) + e⁻ → Au(s)    +1.68
HClO2(aq) + 2H⁺(aq) + 2e⁻ → HClO(aq) + H2O(l)    +1.67
Ag2O3(s) + 6H⁺(aq) + 4e⁻ → 2Ag⁺(aq) + 3H2O(l)    +1.67
HClO(aq) + H⁺(aq) + e⁻ → (1/2)Cl2(g) + H2O(l)    +1.63
Ce⁴⁺(aq) + e⁻ → Ce³⁺(aq)    +1.61
NiO2(s) + 4H⁺(aq) + 2e⁻ → Ni²⁺(aq) + 2H2O(l)    +1.59
Au³⁺(aq) + 3e⁻ → Au(s)    +1.52
Mn³⁺(aq) + e⁻ → Mn²⁺(aq)    +1.51
MnO4⁻(aq) + 8H⁺(aq) + 5e⁻ → Mn²⁺(aq) + 4H2O(l)    +1.491
2ClO3⁻(aq) + 12H⁺(aq) + 10e⁻ → Cl2(g) + 6H2O(l)    +1.49
PbO2(s, α) + 4H⁺(aq) + 2e⁻ → Pb²⁺(aq) + 2H2O(l)    +1.468
PbO2(s, β) + 4H⁺(aq) + 2e⁻ → Pb²⁺(aq) + 2H2O(l)    +1.460
BrO3⁻(aq) + 5H⁺(aq) + 4e⁻ → HBrO(aq) + 2H2O(l)    +1.45
Ce⁴⁺(aq) + e⁻ → Ce³⁺ (in 1 M H2SO4)    +1.44
2HIO(aq) + 2H⁺(aq) + 2e⁻ → I2(s) + 2H2O(l)    +1.44
2NH3OH⁺(aq) + H⁺(aq) + 2e⁻ → N2H5⁺(aq) + 2H2O(l)    +1.42
CoO2(s) + 4H⁺(aq) + e⁻ → Co³⁺(aq) + 2H2O(l)    +1.42
Cl2(g) + 2e⁻ → 2Cl⁻(aq)    +1.3583
Cr2O7²⁻(aq) + 14H⁺(aq) + 6e⁻ → 2Cr³⁺(aq) + 7H2O(l)    +1.33
Au³⁺(aq) + 2e⁻ → Au⁺(aq)    +1.29
Tl³⁺(aq) + 2e⁻ → Tl⁺(aq)    +1.247
O2(g) + 4H⁺(aq) + 4e⁻ → 2H2O(l)    +1.229
MnO2(s) + 4H⁺(aq) + 2e⁻ → Mn²⁺(aq) + 2H2O(l)    +1.208
ClO4⁻(aq) + 2H⁺(aq) + 2e⁻ → ClO3⁻(aq) + H2O(l)    +1.20
2IO3⁻(aq) + 12H⁺(aq) + 10e⁻ → I2(s) + 6H2O(l)    +1.20
ClO2(g) + H⁺(aq) + e⁻ → HClO2(g)    +1.19
Pt²⁺(aq) + 2e⁻ → Pt(s)    +1.188
ClO3⁻(aq) + 2H⁺(aq) + e⁻ → ClO2(g) + H2O(l)    +1.18
Ag2O(s) + 2H⁺(aq) + 2e⁻ → 2Ag(s) + H2O(l)    +1.17
HSeO4⁻(aq) + 3H⁺(aq) + 2e⁻ → H2SeO3(aq) + H2O(l)    +1.15
AuCl2⁻(aq) + e⁻ → Au(s) + 2Cl⁻(aq)    +1.15
IO3⁻(aq) + 5H⁺(aq) + 4e⁻ → HIO(aq) + 2H2O(l)    +1.13
Cu²⁺(aq) + 2CN⁻(aq) + e⁻ → Cu(CN)2⁻(aq)    +1.12
Br2(aq) + 2e⁻ → 2Br⁻(aq)    +1.087
Br2(l) + 2e⁻ → 2Br⁻(aq)    +1.066
VO2⁺(aq) + 2H⁺(aq) + e⁻ → VO²⁺(aq) + H2O(l)    +1.00
Pd²⁺(aq) + 2e⁻ → Pd(s)    +0.987
NO3⁻(aq) + 4H⁺(aq) + 3e⁻ → NO(g) + 2H2O(l)    +0.96
AuBr2⁻(aq) + e⁻ → Au(s) + 2Br⁻(aq)    +0.96
MnO2(s) + 4H⁺(aq) + e⁻ → Mn³⁺(aq) + 2H2O(l)    +0.95
AuCl4⁻(aq) + 3e⁻ → Au(s) + 4Cl⁻(aq)    +0.93
2Hg²⁺(aq) + 2e⁻ → Hg2²⁺(aq)    +0.905
MnO4⁻(aq) + H⁺(aq) + e⁻ → HMnO4⁻(aq)    +0.90
Hg²⁺(aq) + 2e⁻ → Hg(l)    +0.85
AuBr4⁻(aq) + 3e⁻ → Au(s) + 4Br⁻(aq)    +0.85
NO3⁻(aq) + 2H⁺(aq) + e⁻ → NO2(g) + H2O(l)    +0.80
Ag⁺(aq) + e⁻ → Ag(s)    +0.7996
Hg2²⁺(aq) + 2e⁻ → 2Hg(l)    +0.7961
Fe³⁺(aq) + e⁻ → Fe²⁺(aq)    +0.77
Fe³⁺(aq) + e⁻ → Fe²⁺ (in 1 M HCl)    +0.770
PtCl4²⁻(aq) + 2e⁻ → Pt(s) + 4Cl⁻(aq)    +0.758
H2SeO3(aq) + 4H⁺(aq) + 4e⁻ → Se(s) + 3H2O(l)    +0.74
PtCl6²⁻(aq) + 2e⁻ → PtCl4²⁻(aq) + 2Cl⁻(aq)    +0.726
Tl³⁺(aq) + 3e⁻ → Tl(s)    +0.72
p-benzoquinone + 2H⁺(aq) + 2e⁻ → hydroquinone    +0.6992
Fe(CN)6³⁻(aq) + e⁻ → Fe(CN)6⁴⁻ (in 1 M H2SO4)    +0.69
O2(g) + 2H⁺(aq) + 2e⁻ → H2O2(l)    +0.682
H2MoO4(aq) + 2H⁺(aq) + 2e⁻ → MoO2(s) + 2H2O(l)    +0.65
Hg2SO4(s) + 2e⁻ → 2Hg(l) + SO4²⁻(aq)    +0.6158
S2O3²⁻(aq) + 6H⁺(aq) + 4e⁻ → (1/4)S8(s) + 3H2O(l)    +0.60
MnO4⁻(aq) + 2H2O(l) + 3e⁻ → MnO2(s) + 4OH⁻(aq)    +0.59
AuI2⁻(aq) + e⁻ → Au(s) + 2I⁻(aq)    +0.58
H3AsO4(aq) + 2H⁺(aq) + 2e⁻ → H3AsO3(aq) + H2O(l)    +0.56
AuI4⁻(aq) + 3e⁻ → Au(s) + 4I⁻(aq)    +0.56
I2(s) + 2e⁻ → 2I⁻(aq)    +0.535
I3⁻(aq) + 2e⁻ → 3I⁻(aq)    +0.5338
CO(g) + 2H⁺(aq) + 2e⁻ → C(s) + H2O(l)    +0.52
Cu⁺(aq) + e⁻ → Cu(s)    +0.520
SO2(aq) + 4H⁺(aq) + 4e⁻ → (1/8)S8(s) + 2H2O(l)    +0.50
CH3OH(aq) + 2H⁺(aq) + 2e⁻ → CH4(g) + H2O(l)    +0.50
H2MoO4(aq) + 6H⁺(aq) + 3e⁻ → Mo³⁺(aq) + 4H2O(l)    +0.43
O2(g) + 2H2O(l) + 4e⁻ → 4OH⁻(aq)    +0.401
Fe(CN)6³⁻(aq) + e⁻ → Fe(CN)6⁴⁻(aq)    +0.36
Ag2O(s) + H2O(l) + 2e⁻ → 2Ag(s) + 2OH⁻(aq)    +0.342
Cu²⁺(aq) + 2e⁻ → Cu(s)    +0.340
VO²⁺(aq) + 2H⁺(aq) + e⁻ → V³⁺(aq) + H2O(l)    +0.337
UO2⁺(aq) + 4H⁺(aq) + e⁻ → U⁴⁺(aq) + 2H2O(l)    +0.323
Bi³⁺(aq) + 3e⁻ → Bi(s)    +0.308
Re³⁺(aq) + 3e⁻ → Re(s)    +0.300
Hg2Cl2(s) + 2e⁻ → 2Hg(l) + 2Cl⁻(aq)    +0.2682
GeO(s) + 2H⁺(aq) + 2e⁻ → Ge(s) + H2O(l)    +0.26
Hg2Cl2(s, calomel) + 2e⁻ → 2Hg(l) + 2Cl⁻ (in sat’d KCl) (SCE)    +0.2415
H3AsO3(aq) + 3H⁺(aq) + 3e⁻ → As(s) + 3H2O(l)    +0.24
AgCl(s) + e⁻ → Ag(s) + Cl⁻(aq, 1 M KCl)    +0.236
AgCl(s) + e⁻ → Ag(s) + Cl⁻(aq)    +0.22233
AgCl(s) + e⁻ → Ag(s) + Cl⁻(aq, 4 M KCl)    +0.200
SbO⁺(aq) + 2H⁺(aq) + 3e⁻ → Sb(s) + H2O(l)    +0.20
AgCl(s) + e⁻ → Ag(s) + Cl⁻(aq, saturated KCl)    +0.197
TiO²⁺(aq) + 2H⁺(aq) + e⁻ → Ti³⁺(aq) + H2O(l)    +0.19
SO4²⁻(aq) + 4H⁺(aq) + 2e⁻ → SO2(g) + 2H2O(l)    +0.17
UO2²⁺(aq) + e⁻ → UO2⁺(aq)    +0.163
HSO4⁻(aq) + 3H⁺(aq) + 2e⁻ → SO2(g) + 2H2O(l)    +0.16
Cu²⁺(aq) + e⁻ → Cu⁺(aq)    +0.159
Sn⁴⁺(aq) + 2e⁻ → Sn²⁺(aq)    +0.15
(1/8)S8(s) + 2H⁺(aq) + 2e⁻ → H2S(g)    +0.14
HCHO(aq) + 2H⁺(aq) + 2e⁻ → CH3OH(aq)    +0.13
C(s, graphite) + 4H⁺(aq) + 4e⁻ → CH4(g)    +0.13
Ge⁴⁺(aq) + 4e⁻ → Ge(s)    +0.12
H2MoO4(aq) + 6H⁺(aq) + 6e⁻ → Mo(s) + 4H2O(l)    +0.11
N2H4(aq) + 4H2O(l) + 2e⁻ → 2NH4⁺(aq) + 4OH⁻(aq)    +0.11
Ru(NH3)6³⁺(aq) + e⁻ → Ru(NH3)6²⁺(aq)    +0.10
Cu(NH3)4²⁺(aq) + e⁻ → Cu(NH3)2⁺(aq) + 2NH3(aq)    +0.10
HgO(s) + H2O(l) + 2e⁻ → Hg(l) + 2OH⁻(aq)    +0.0977
N2(g) + 2H2O(l) + 6H⁺(aq) + 6e⁻ → 2NH4OH(aq)    +0.092
Fe3O4(s) + 8H⁺(aq) + 8e⁻ → 3Fe(s) + 4H2O(l)    +0.085
S4O6²⁻(aq) + 2e⁻ → 2S2O3²⁻(aq)    +0.08
AgBr(s) + e⁻ → Ag(s) + Br⁻(aq)    +0.07133
2H⁺(aq) + 2e⁻ → H2(g) (standard hydrogen electrode, SHE)    0 (by definition)
Fe³⁺(aq) + 3e⁻ → Fe(s)    −0.04
HCOOH(aq) + 2H⁺(aq) + 2e⁻ → HCHO(aq) + H2O(l)    −0.03
P(s, white) + 3H⁺(aq) + 3e⁻ → PH3(g)    −0.063
WO3(s, probably) + 6H⁺(aq) + 6e⁻ → W(s) + 3H2O(l)    −0.09
SnO2(s) + 2H⁺(aq) + 2e⁻ → SnO(s) + H2O(l)    −0.09
SnO(s) + 2H⁺(aq) + 2e⁻ → Sn(s) + H2O(l)    −0.10
Se(s) + 2H⁺(aq) + 2e⁻ → H2Se(g)    −0.11
CO2(g) + 2H⁺(aq) + 2e⁻ → CO(g) + H2O(l)    −0.11
CO2(g) + 2H⁺(aq) + 2e⁻ → HCOOH(aq)    −0.11
P(s, red) + 3H⁺(aq) + 3e⁻ → PH3(g)    −0.111
WO2(g) + 4H⁺(aq) + 4e⁻ → W(s) + 2H2O(l)    −0.12
Pb²⁺(aq) + 2e⁻ → Pb(Hg)    −0.1205
Pb²⁺(aq) + 2e⁻ → Pb(s)    −0.1263
O2(g) + H⁺(aq) + e⁻ → HOO•(aq)    −0.13
Sn²⁺(aq) + 2e⁻ → Sn(s)    −0.1364
Si(s) + 4H⁺(aq) + 4e⁻ → SiH4(g)    −0.14
MoO2(s) + 4H⁺(aq) + 4e⁻ → Mo(s) + 2H2O(l)    −0.15
AgI(s) + e⁻ → Ag(s) + I⁻(aq)    −0.15224
As(s) + 3H⁺(aq) + 3e⁻ → AsH3(g)    −0.23
V³⁺(aq) + e⁻ → V²⁺(aq)    −0.255
Ni²⁺(aq) + 2e⁻ → Ni(s)    −0.25
H3PO4(aq) + 2H⁺(aq) + 2e⁻ → H3PO3(aq) + H2O(l)    −0.276
Co²⁺(aq) + 2e⁻ → Co(s)    −0.28
Ge(s) + 4H⁺(aq) + 4e⁻ → GeH4(g)    −0.29
Cd²⁺(aq) + 2e⁻ → Cd(Hg)    −0.3521
In³⁺(aq) + 3e⁻ → In(s)    −0.34
Eu³⁺(aq) + e⁻ → Eu²⁺(aq)    −0.35
Tl⁺(aq) + e⁻ → Tl(s)    −0.3365
PbSO4(s) + 2e⁻ → Pb(Hg) + SO4²⁻(aq)    −0.3505
Tl⁺(aq) + e⁻ → Tl(Hg)    −0.3568
PbSO4(s) + 2e⁻ → Pb(s) + SO4²⁻(aq)    −0.3588
Cu2O(s) + H2O(l) + 2e⁻ → 2Cu(s) + 2OH⁻(aq)    −0.360
PbI2(s) + 2e⁻ → Pb(s) + 2I⁻(aq)    −0.365
GeO2(s) + 2H⁺(aq) + 2e⁻ → GeO(s) + H2O(l)    −0.37
Cd²⁺(aq) + 2e⁻ → Cd(s)    −0.4026
Cr³⁺(aq) + e⁻ → Cr²⁺(aq)    −0.42
2CO2(g) + 2H⁺(aq) + 2e⁻ → HOOCCOOH(aq)    −0.43
Fe²⁺(aq) + 2e⁻ → Fe(s)    −0.44
H3PO3(aq) + 3H⁺(aq) + 3e⁻ → P(s, red) + 3H2O(l)    −0.454
H3PO3(aq) + 2H⁺(aq) + 2e⁻ → H3PO2(aq) + H2O(l)    −0.499
6.19
E L E C T R O C H E M I S T R Y . E L E C T R O D E P O T E N T I A L S , A N D T H E N E R N S T E Q UA T I O N
Table 6.2 (Continued ) H3PO2(aq) þ Hþ(aq) þ e ! P(s, white) þ 2H2O(l)
0.508
(1/8)S8(s) þ 2e ! 2S2(aq)
0.508
U (aq) þ e ! U (aq)
0.52
Ga3þ(aq) þ 3e ! Ga(s)
0.53
4þ
3þ
þ
2TiO2(s) þ 2H (aq) þ 2e ! Ti2O3(s) þ H2O(l)
0.56
PbO(s) þ H2O(l) þ 2e ! Pb(s) þ 2OH (aq)
0.58
Ta3þ(aq) þ 3e ! Ta(s)
0.6
Au(CN)2 (aq) þ e ! Au(s) þ 2CN (aq)
0.60
Ni(OH)2(s) þ 2e ! Ni(s) þ 2OH(aq)
0.66
Cr (aq) þ 3e ! Cr(s) 3þ
0.74
þ
Ta2O5(s) þ 10H (aq) þ 10e ! 2Ta(s) þ 5H2O(l)
0.75
Zn2þ(aq) þ 2e ! Zn(s)
0.7618
Zn (aq) þ 2e ! Zn(Hg) 2þ
0.7628
2H2O(l) þ 2e ! H2(g) þ 2OH (aq)
0.8227
Bi(s) þ 3Hþ(aq) þ 3e ! BiH3(g?)
0.8
TiO2þ(aq) þ 2Hþ(aq) þ 4e ! Ti(s) þ H2O(l)
0.88
B(OH)3(aq) þ 3Hþ(aq) þ 3e ! B(s) þ 3H2O(l)
0.89
þ
SiO2(s) þ 4H (aq) þ 4e ! Si(s) þ 2H2O(l) þ
0.91
Sn(s) þ 4H (aq) þ 4e ! SnH4(g)
1.07
Nb3þ(aq) þ 3e ! Nb(s)
1.099
V2þ(aq) þ 2e ! V(s)
1.13
Te(s) þ 2e ! Te2(aq)
1.143
Mn (aq) þ 2e ! Mn(s) 2þ
1.185
Zn(OH)4 (aq) þ 2e ! Zn(s) þ 4OH (aq) 2
1.119
Ti3þ(aq) þ 3e ! Ti(s)
1.21
ZnO2 (aq) þ 2H2O(l) þ 2e ! Zn(s) þ 4OH (aq)
1.216
Ti2O3(s) þ 2Hþ(aq) þ 2e ! 2TiO(s) þ H2O(l)
1.23
þ
1.31
Zr (aq) þ 4e ! Zr(s)
1.45
ZrO2(s) þ 4Hþ(aq) þ 4e ! Zr(s) þ 2H2O(l)
1.553
TiO(s) þ 2H (aq) þ 2e ! Ti(s) þ H2O(l)
2þ
Ti (aq) þ 2e ! Ti(s) 2þ
1.63
Al (aq) þ 3e ! Al(s)
1.66
U3þ(aq) þ 3e ! U(s)
1.66
3þ
Al3þ(aq) þ 3e ! Al(in 0.1 M NaOH)
1.706
Be2þ(aq) þ 2e ! Be(s)
1.85
Ac3þ (aq) þ 3e ! Ac(s)
2.20
H2(g) þ 2e ! 2H (aq)
2.35
Al(OH)3(s) þ 3e ! Al(s) þ 3OH(aq)
2.31
Al(OH)4 (aq) þ 3e ! Al(s) þ 4OH (aq)
2.33
ZrO(OH)2(s) þ H2O(l) þ 4e ! Zr(s) þ 4OH(aq)
2.36
Y (aq) þ 3e ! Y(s) 3þ
−2.372
Mg²⁺(aq) + 2e⁻ → Mg(s)
2.372
La3þ(aq) þ 3e ! La(s)
2.379
þ
Na (aq) þ e ! Na(s)
2.7109
Ra2þ(aq) þ 2e ! Ra(s)
2.8
Eu (aq) þ 2e ! Eu(s) 2þ
2.812
Ca (aq) þ 2e ! Ca(s)
2.868
Sr2þ(aq) þ 2e ! Sr(s)
2.899
La(OH)3(s) þ 3e ! La(s) þ 3OH(aq)
2.90
Ba2þ(aq) þ 2e ! Ba(s)
2.912
2þ
þ
K (aq) þ e ! Na(s) þ
2.931
Rb (aq) þ e ! Rb(s)
2.98
Csþ(aq) þ e ! Cs(s)
3.026
N2(g) þ 4H2O(l) þ 2e ! 2NH2OH(aq) þ 2OH (aq) þ
3.04
Li (aq) þ e ! Li(s)
3.0401
(3/2)N2(g) þ Hþ(aq) þ e ! NH3(aq)
3.09
(ordered alphabetically): Ac3þ (aq) þ 3e ! Ac(s) þ
2.20
Ag (aq) þ e ! Ag(s)
þ0.7996
Agþþ(aq) þ e ! Agþ(aq)
þ1.98
AgBr(s) þ e ! Ag(s) þ Br (aq)
þ0.07133
AgCl(s) þ e ! Ag(s) þ Cl (aq)
þ0.22233
þ0.236
AgCl(s) þ e ! Ag(s) þ Cl (aq, 4 M KCl)
þ0.200
AgCl(s) þ e ! Ag(s) þ Cl(aq, saturated KCl)
þ0.197
AgCl(s) þ e ! Ag(s) þ Cl (aq, 1 M KCl)
AgI(s) þ e ! Ag(s) þ I (aq) þ
0.15224 þ
AgO(s) þ 2H (aq) þ e ! Ag (aq) þ H2O(l)
þ1.77
Ag2O(s) þ 2Hþ(aq) þ 2e ! 2Ag(s) þ H2O(l)
þ1.17
Ag2O(s) þ H2O(l) þ 2e ! 2Ag(s) þ 2OH (aq)
þ0.342
Ag2O3(s) þ 6Hþ(aq) þ 4e ! 2Agþ(aq) þ 3H2O(l)
þ1.67
Al3þ(aq) þ 3e ! Al (in 0.1 M NaOH)
1.706
Al (aq) þ 3e ! Al(s)
1.66
Al(OH)3(s) þ 3e ! Al(s) þ 3OH(aq)
2.31
Al(OH)4(aq) þ 3e ! Al(s) þ 4OH(aq)
2.33
As(s) þ 3Hþ(aq) þ 3e ! AsH3(g)
0.23
3þ
þ
Au (aq) þ e ! Au(s)
þ
þ1.68
Au (aq) þ 2e ! Au (aq)
þ1.29
Au3þ(aq) þ 3e ! Au(s)
þ1.52
3þ
AuBr2(aq) þ e ! Au(s) þ 2Br(aq) AuBr4(aq) þ 3e ! Au(s) þ 4Br(aq) AuCl2(aq) þ e ! Au(s) þ 2Cl(aq) AuCl4(aq) þ 3e ! Au(s) þ 4Cl(aq) Au(CN)2 (aq) þ e ! Au(s) þ 2CN (aq)
þ0.96 þ0.85 þ1.15 þ0.93 0.60
AuI₂⁻(aq) + e⁻ → Au(s) + 2I⁻(aq)
þ0.58
AuI4(aq) þ 3e ! Au(s) þ 4I(aq)
þ0.56
Ba (aq) þ 2e ! Ba(s)
2.912
Be2þ(aq) þ 2e ! Be(s)
1.85
B(OH)3(aq) þ 3Hþ(aq) þ 3e ! B(s) þ 3H2O(l)
0.89
2þ
Bi (aq) þ 3e ! Bi(s)
þ0.308
Bi(s) þ 3Hþ(aq) þ 3e ! BiH3(g)
0.8
3þ
Br2(aq) þ 2e ! 2Br(aq)
þ1.087
Br2(l) þ 2e ! 2Br(aq)
þ1.066
BrO3(aq) þ 5Hþ(aq) þ 4e ! HBrO(aq) þ 2H2O(l)
þ
þ1.45
BrO4 (aq) þ 2H (aq) þ 2e ! BrO3 (aq) þ H2O(l)
þ1.85
C(s, graphite) þ 4Hþ(aq) þ 4e ! CH4(g)
þ0.13
CH3OH(aq) þ 2Hþ(aq) þ 2e ! CH4(g) þ H2O(l)
þ0.50
þ
CO(g) þ 2H (aq) þ 2e ! C(s) þ H2O(l)
þ0.52
CO2(g) þ 2Hþ(aq) þ 2e ! CO(g) þ H2O(l)
0.11
CO2(g) þ 2Hþ(aq) þ 2e ! HCOOH(aq)
0.11
2CO2(g) þ 2Hþ(aq) þ 2e ! HOOCCOOH(aq)
0.43
2.76
Cd (aq) þ 2e ! Cd(Hg)
0.3521
Cd2þ(aq) þ 2e ! Cd(s)
0.4026
Ca (aq) þ 2e ! Ca(s) 2þ
2þ
Ce (aq) þ e ! Ce (in 1 M H2SO4)
þ1.44
Ce4þ(aq) þ e ! Ce3þ(aq)
þ1.61
4þ
3þ
Cl2(g) þ 2e ! 2Cl (aq) þ
þ1.3583
ClO2(g) þ H (aq) þ e ! HClO2(g)
þ1.19
ClO3(aq) þ 2Hþ(aq) þ e ! ClO2(g) þ H2O(l)
þ1.18
2ClO3(aq) þ 12Hþ(aq) þ 10e ! Cl2(g) þ 6H2O(l)
þ1.49
ClO4(aq) þ 2Hþ(aq) þ 2e ! ClO3(aq) þ H2O(l)
þ1.20
Co2þ(aq) þ 2e ! Co(s)
0.28
Co (aq) þ e ! Co (aq) 3þ
þ1.82
2þ
Co3þ(aq) þ e ! Co2þ(in 3 M HNO3) þ
þ1.68
CoO2(s) þ 4H (aq) þ e ! Co (aq) þ 2H2O(l) 3þ
Cr (aq) þ 3e ! Cr(s) 3þ
0.74
Cr3þ(aq) þ e ! Cr2þ(aq) Cr2O72(aq) þ 14Hþ(aq) þ 6e þ
0.42 ! 2Cr (aq) þ 7H2O(l) 3þ
Cs (aq) þ e ! Cs(s)
þ1.33 3.026
Cuþ(aq) þ e ! Cu(s)
þ0.520
Cu (aq) þ 2CN (aq) þ e ! 2þ
þ1.42
Cu(CN)2(aq)
þ1.12
Cu2þ(aq) þ 2e ! Cu(s)
þ0.340
Cu2þ(aq) þ e ! Cuþ(aq)
þ0.159
Cu(NH3)42þ(aq) þ e ! Cu(NH3)2þ(aq) þ 2NH3(aq)
Cu2O(s) þ H2O(l) þ 2e ! 2Cu(s) þ 2OH (aq)
+0.10 −0.360
Eu²⁺(aq) + 2e⁻ → Eu(s)
2.812
Eu3þ(aq) þ e ! Eu2þ(aq)
0.35
(1/2) F2(g) þ Hþ(aq) þ e ! HF(aq)
þ3.05
(1/2) F2(g) þ e ! F(aq)
þ2.87
Fe(CN)63(aq) þ e ! Fe(CN)64(aq)
þ0.36
Fe(CN)63(aq) þ e 2þ
!
Fe(CN)64(in
1 M H2SO4)
þ0.69
Fe (aq) þ 2e ! Fe(s)
0.48
Fe3þ(aq) þ 3e ! Fe(s)
0.04
Fe3þ(aq) þ e ! Fe2þ(aq)
þ0.77
Fe (aq) þ e ! Fe (in 1 M HCl) 3þ
þ0.770
2þ
þ
Fe3O4(s) þ 8H (aq) þ 8e ! 3Fe(s) þ 4H2O(l)
þ0.085
Ga3þ(aq) þ 3e ! Ga(s)
0.53
Ge4þ(aq) þ 4e ! Ge(s)
þ0.12
þ
Ge(s) þ 4H (aq) þ 4e ! GeH4(g)
0.29
GeO2(s) þ 2Hþ(aq) þ e ! GeO(s) þ 2H2O(l)
0.37
GeO(s) þ 2Hþ(aq) þ 2e ! Ge(s) þ H2O(l)
þ0.26
2Hþ(aq) þ 2e ! H2(g) (standard hydrogen electrode, SHE)
0 (by definition)
H2(g) þ 2e ! 2 (aq)
2.35
þ
H3AsO3(aq) þ 3H (aq) þ 3e ! As(s) þ 3H2O(l)
þ0.24
H3AsO4(aq) þ 2Hþ(aq) þ 2e ! H3AsO3(aq) þ H2O(l)
þ0.56
HCHO(aq) þ 2Hþ(aq) þ e ! CH3OH(aq)
þ0.13
HCOOH(aq) þ 2Hþ(aq) þ 2e ! HCHO(aq) þ H2O(l)
0.03
HClO(aq) þ Hþ(aq) þ e ! (1/2) Cl2(g) þ H2O(l)
þ1.63
þ
HClO2(aq) þ 2H (aq) þ 2e ! HClO(aq) þ H2O(l)
þ1.67
2HIO(aq) þ 2Hþ(aq) þ 2e ! I2(s) þ 2H2O(l)
þ1.44
þ
HMnO4 (aq) þ 3H (aq) þ 2e ! MnO2(s) þ 2H2O(l)
þ2.09
H2MoO4(aq) þ 6Hþ(aq) þ 6e ! Mo(s) þ 4H2O(l)
þ0.11
þ
H2MoO4(aq) þ 6H (aq) þ 3e ! Mo (aq) þ 2H2O(l)
3þ
þ0.43
H2O(l) þ e ! (1/2) H2(g) þ OH (aq)
0.8227
H2O2(aq) þ 2Hþ(aq) þ 2e ! 2H2O(l)
þ1.776
H3PO2(aq) þ Hþ(aq) þ e ! P(s, white) þ 2H2O(l)
0.508
þ
H3PO3(aq) þ 3H (aq) þ 3e ! P(s, red) þ 3H2O(l)
0.454
H3PO3(aq) þ 2Hþ(aq) þ 2e ! H3PO2(aq) þ H2O(l)
0.499
þ
H3PO4(aq) þ 2H (aq) þ 2e ! H3PO3(aq) þ H2O(l)
0.276
HSO4(aq) þ 3Hþ(aq) þ 2e ! SO2(g) þ 2H2O(l)
þ0.16
HSeO4(aq) þ 3Hþ(aq) þ 2e ! H2SeO3(aq) þ H2O(l)
þ1.15
þ
H2SeO3(aq) þ 4H (aq) þ 4e ! Se(s) þ 3H2O(l)
þ0.74
Hg2þ(aq) þ 2e ! Hg(l)
þ0.85
Hg22þ(aq) þ 2e ! 2Hg(l)
þ0.7961
Hg2þ(aq) þ e ! (1/2) Hg22þ(aq)
þ0.905
Hg2Cl2(s) þ 2e ! 2Hg(l) þ 2Cl(aq)
þ0.2682
Hg2Cl2(s) þ 2e ! 2Hg(l) þ 2Cl (in sat’d KCl)
þ0.2415
Hg₂SO₄(s) + 2e⁻ → 2Hg(l) + SO₄²⁻(aq)
þ0.6158
HgO(s) þ H2O(l) þ 2e ! Hg(l) þ 2OH(aq)
þ0.0977
I2(s) þ 2e ! 2I (aq)
þ0.535
I3(s) þ 2e ! 3I(aq) þ
þ0.5338
IO3 (aq) þ 5H (aq) þ 4e ! HIO(aq) þ 2H2O(l)
þ
þ1.13
2IO3 (aq) þ 12H (aq) þ 10e ! I2(s) þ 6H2O(l)
þ1.20
In3þ(aq) þ 3e ! In(s)
0.34
Kþ(aq) þ e ! Na(s)
2.931
La3þ(aq) þ 3e ! La(s)
2.379
La(OH)3(s) þ 3e ! La(s) þ 3OH (aq) þ
2.90
Li (aq) þ e ! Li(s)
3.0401
Mg2þ(aq) þ 2e ! Mg(s)
2.372
Mn2þ(aq) þ 2e ! Mn(s)
1.185
Mn (aq) þ e ! Mn (aq)
þ1.51
MnO2(s) þ 4Hþ(aq) þ 2e ! Mn2þ(aq) þ 2H2O(l)
þ1.208
MnO2(s) þ 4Hþ(aq) þ e ! Mn3þ(aq) þ 2H2O(l)
þ0.95
MnO4(aq) þ 2H2O(l) þ 3e ! MnO2(s) þ 4OH(aq)
þ0.59
MnO4(aq) þ Hþ(aq) þ e ! HMnO4 (aq) MnO4(aq) þ 8Hþ(aq) þ 5e ! Mn2þ(aq) þ 4H2O(l) MnO4(aq) þ 4Hþ(aq) þ 3e ! MnO2(s) þ 2H2O(l) MoO2(s) þ 4Hþ(aq) þ 4e ! Mo(s) þ 2H2O(l)
þ0.90
3þ
2þ
þ1.491 þ1.70 0.15
(3/2)N2(g) þ Hþ(aq) þ e ! NH3(aq)
3.09
N2(g) þ 4H2O(l) þ 2e ! 2NH2OH(aq) þ 2OH(aq)
3.04
þ
N2(g) þ 2H2O(l) þ 6H (aq) þ 6e ! 2NH4OH(aq)
þ0.092
N2H4(aq) þ 4H2O(l) þ 2e ! 2NH4þ(aq) þ 4OH(aq)
þ0.11
þ
þ
þ
2NH3OH (aq) þ H (aq) þ 2e ! N2H5 (aq) þ 2H2O(l)
þ1.42
Naþ(aq) þ e ! Na(s)
2.7109
Nb (aq) þ 3e ! Nb(s) 3þ
1.099
Ni (aq) þ 2e ! Ni(s)
0.23
NiO2(s) þ 4Hþ(aq) þ 2e ! Ni2þ(aq) þ 4OH(aq)
þ1.59
Ni(OH)2(s) þ 2e ! Ni(s) þ 2OH(aq)
0.66
2þ
þ
NO3 (aq) þ 4H (aq) þ 3e ! NO(g) þ 2H2O(l)
þ0.96
NO3(aq) þ 2Hþ(aq) þ 3e ! NO2(g) þ H2O(l)
þ0.80
O2(g) þ 4Hþ(aq) þ 4e ! 2H2O(l)
þ1.229
O2(g) þ 2Hþ(aq) þ 2e ! 2H2O2(l)
þ0.682
O2(g) þ 2H2O(l) þ 4e ! 4OH(aq)
þ0.401
þ
O2(g) þ H (aq) þ e ! HOO(aq)
0.13
O3(g) þ 2Hþ(aq) þ 2e ! O2(g) þ H2O(l)
þ2.075
þ
p-benzoquinone þ 2H (aq) þ 2e ! hydroquinone
þ0.6992
P(s, white) þ 3Hþ(aq) þ 3e ! PH3(g)
0.063
P(s, red) þ 3Hþ(aq) þ 3e ! PH3(g)
−0.111
Pb²⁺(aq) + 2e⁻ → Pb(Hg)
0.1205
Pb2þ(aq) þ 2e ! Pb(s)
0.1263
Pb (aq) þ 2e ! Pb (aq) 4þ
þ1.69
2þ
PbI2(s) þ 2e ! Pb(s) þ 2I(aq)
0.365
PbO(s) þ H2O(l) þ 2e ! Pb(s) þ 2OH (aq)
0.58
PbO2(s, a) þ 4Hþ(aq) þ 2e ! Pb2þ(aq) þ 2H2O(l)
þ1.468
þ
PbO2(s, b) þ 4H (aq) þ 2e ! Pb (aq) þ 2H2O(l)
þ1.460
PbO2(s) þ 4Hþ(aq) þ SO42(aq) þ 2e ! PbSO4(s) þ 2H2O(l)
þ1.685
PbSO4(s) þ 2e ! Pb(s) þ SO42(aq)
0.3588
2þ
PbSO4(s) þ 2e ! Pb(Hg) þ SO4 (aq)
0.3505
Pd2þ(aq) þ 2e ! Pd(s)
þ0.987
2
Pt (aq) þ 2e ! Pt(s)
þ1.188
PtCl42(aq) þ 2e ! Pt(s) þ 4Cl(aq)
þ0.758
PtCl62(aq) þ 2e 2þ
þ0.726
2þ
!
PtCl42(aq) þ 2Cl(aq)
Ra (aq) þ 2e ! Ra(s) þ
2.8
Rb (aq) þ e ! Rb(s)
2.98
Re (aq) þ 3e ! Re(s)
þ0.300
Ru(NH3)63þ(aq) þ e ! Ru(NH3)62þ(aq)
þ0.10
3þ
(1/8)S8(s) þ 2e ! 2S (aq)
0.508
(1/8)S8(s) þ 2Hþ(aq) þ 2e ! H2S(g)
þ0.14
2
þ
SO2(aq) þ 4H (aq) þ 4e ! (1/8)S8(s) þ 2H2O(l)
þ0.50
SO42(aq) þ 4Hþ(aq) þ 2e ! SO2(g) þ 2H2O(l)
þ0.17
þ
S2O3 (aq) þ 6H (aq) þ 2e ! (1/4) S8(s) þ 3H2O(l)
þ0.60
S2O82(aq) þ 2e ! 2SO42(aq)
þ2.010
2
S4O6 (aq) þ 2e ! 2S2O3 (aq) 2
þ
2
þ
þ0.08
SbO (aq) þ 2H (aq) þ 3e ! Sb(s) þ H2O(l)
þ0.20
Se(s) þ 2Hþ(aq) þ 2e ! H2Se(g)
0.11
Si(s) þ 4Hþ(aq) þ 4e ! SiH4(g)
0.14
SiO2(s) þ 4Hþ(aq) þ 4e ! Si(s) þ 2H2O(l)
0.91
Sn2þ(aq) þ 2e ! Sn(s)
0.1364
Sn4þ(aq) þ 2e ! Sn2þ(aq)
þ0.15
Sn(s) þ 4Hþ(aq) þ 4e ! SnH4(g)
1.07
þ
SnO(s) þ 2H (aq) þ 2e ! Sn(s) þ H2O(l)
0.10
SnO2(s) þ 2Hþ(aq) þ 2e ! SnO(s) þ H2O(l)
0.09
Sr2þ(aq) þ 2e ! Sr(s)
2.899
Ta3þ(aq) þ 3e ! Ta(s)
0.6
Ta2O5(s) þ 10Hþ(aq) þ 10e ! 2Ta(s) þ 5H2O(l)
0.75
Te(s) þ 2e ! Te2(aq)
1.143
Ti (aq) þ 2e ! Ti(s) 2þ
1.63
Ti3þ(aq) þ 3e ! Ti(s) þ
1.21
TiO (aq) þ 2H (aq) þ 4e ! Ti(s) þ H2O(l) 2þ
þ
TiO (aq) þ 2H (aq) þ e ! Ti (aq) þ H2O(l) 2þ
3þ
0.88 þ0.19
TiO(s) + 2H⁺(aq) + 2e⁻ → Ti(s) + H₂O(l)
1.31
2TiO2(s) þ 2Hþ(aq) þ 2e ! Ti2O3(s) þ H2O(l)
0.56
Ti2O3(s) þ 2Hþ(aq) þ 2e ! 2TiO(s) þ H2O(l)
1.23
Tlþ(aq) þ e ! Tl(Hg)
0.3568
Tlþ(aq) þ e ! Tl(s)
0.3365
Tl3þ(aq) þ 2e ! Tlþ(aq)
þ1.247
Tl (aq) þ 3e ! Tl(s)
þ0.72
U3þ(aq) þ 3e ! U(s)
1.66
U4þ(aq) þ e ! U3þ(aq)
0.52
UO2þ(aq) þ 4Hþ(aq) þ e ! U4þ(aq) þ 2H2O(l) UO22þ(aq) þ e ! UO2þ(aq) 2þ
þ0.323
V (aq) þ 2e ! V(s)
1.13
V3þ(aq) þ e ! V2þ(aq)
0.255
3þ
þ0.163
VO2þ(aq) þ 2Hþ(aq) þ e ! V3þ(aq) þ H2O(l)
þ0.337
VO2þ(aq) þ 2Hþ(aq) þ e ! VO2þ(aq) þ H2O(l)
þ1.00
WO2(g) þ 4Hþ(aq) þ 4e ! W(s) þ 2H2O(l)
0.12
þ
WO3(s, probably) þ 6H (aq) þ 6e ! W(s) þ 3H2O(l)
0.09
Y3þ(aq) þ 3e ! Y(s)
2.372
Zn (aq) þ 2e ! Zn(s) 2þ
0.7618
Zn2þ(aq) þ 2e ! Zn(Hg)
0.7628
ZnO2 (aq) þ 2H2O(l) þ 2e ! Zn(s) þ 4OH (aq)
1.216
Zn(OH)42(aq) þ 2e ! Zn(s) þ 4OH(aq)
1.119
Zr2þ(aq) þ 4e ! Zr(s)
1.45
ZrO(OH)2(s) þ H2O(l) þ 4e ! Zr(s) þ 4OH(aq)
2.36
þ
ZrO2(s) þ 4H (aq) þ 4e ! Zr(s) þ 2H2O(l)
1.553
Source: From references 17 and 18.
To eliminate the exchange of ions between the cathode area and the anode area, a salt bridge (e.g., KCl in an agar–agar jelly) is used.
6.20 GOUY–CHAPMAN DOUBLE-LAYER THEORY
Consider a macroscopic metal electrode: its conduction electrons all tend to travel at the outer surface of the electrode, and they will induce positively charged cations to move close to the electrode surface; the "electrons inside electrode | cations" system is called the Helmholtz^32 double layer [19]. If this layer of cations were at a fixed distance $d$ from a flat electrode, and if the medium had a uniform dielectric constant $\varepsilon$, then the double layer would have a voltage-independent capacitance per unit area $\varepsilon\varepsilon_0/d$; experimentally, these assumptions prove invalid and too simplistic. In fact, gegenions (counterions) will also accumulate as one moves away from the electrode into the solution, forming an onion-skin type of alternating shells of opposite charge, until the whole solution is polarized and current stops flowing. The Gouy^33–Chapman^34 theory [20–22] uses a more realistic model of imaginary, infinitesimally thin laminae, starting from the electrode, with ion concentrations $n_i$ given by

$$n_i = n_i^0 \exp(-z_i |e| V / k_B T) \tag{6.20.1}$$

where $e$ is the electronic charge, $z_i$ (positive or negative) is the charge per ion (in units of $|e|$), and $n_i^0$ is the bulk concentration of the ions of type $i$. Then the total charge per unit volume is

$$\rho(x) = \sum_i n_i z_i |e| = \sum_i z_i |e| n_i^0 \exp(-z_i |e| V / k_B T)$$

The Poisson^35 equation (Eq. (2.7.59)) then states

$$\rho(x) = -\varepsilon\varepsilon_0 \,(d^2V/dx^2) \quad \text{(SI)} \tag{6.20.2}$$

which yields the Poisson–Boltzmann equation:

$$d^2V/dx^2 = -(|e|/\varepsilon\varepsilon_0) \sum_i z_i n_i^0 \exp(-z_i |e| V / k_B T) \tag{6.20.3}$$

It can be shown formally that $d^2V/dx^2 = (1/2)(d/dV)(dV/dx)^2$, so the differential is given by

$$d(dV/dx)^2 = -(2|e|/\varepsilon\varepsilon_0) \sum_i z_i n_i^0 \exp(-z_i |e| V / k_B T)\, dV$$

which after integration yields

$$(dV/dx)^2 = (2 k_B T/\varepsilon\varepsilon_0) \sum_i n_i^0 \exp(-z_i |e| V / k_B T) + \text{constant}$$

After choosing that, for large $x$, both $V = 0$ and $dV/dx = 0$, we finally get

$$(dV/dx)^2 = (2 k_B T/\varepsilon\varepsilon_0) \sum_i n_i^0 \left[\exp(-z_i |e| V / k_B T) - 1\right]$$

For a binary electrolyte (anion and cation have equal and opposite formal charges $z$, each with bulk concentration $n^0$) one gets

$$\frac{\tanh(z|e|V/4k_BT)}{\tanh(z|e|V_0/4k_BT)} = \exp\left[-(2 n^0 z^2 e^2/\varepsilon\varepsilon_0 k_B T)^{1/2}\, x\right] \tag{6.20.4}$$

where $V_0$ is the potential at the electrode surface ($x = 0$).
PROBLEM 6.20.1. Prove that $d^2V/dx^2 = (1/2)(d/dV)(dV/dx)^2$.
32 Heinrich Ludwig Ferdinand von Helmholtz (1821–1894).
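The Gouy–Chapman profile for a symmetric electrolyte can be explored numerically. The short Python sketch below is illustrative only: the function names, the default relative permittivity of water (78.5), and the conversion from mol/L are assumptions made here, and the profile uses the standard form of the result in which the tanh arguments carry the factor 1/4, i.e. tanh(z|e|V/4k_BT).

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge |e|, C
K_B = 1.380649e-23           # Boltzmann constant, J/K
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
N_A = 6.02214076e23          # Avogadro constant, 1/mol


def debye_length(conc_molar, z=1, eps_r=78.5, temp=298.15):
    """Decay length (m) of the exponential in the binary-electrolyte result.

    conc_molar is the bulk concentration in mol/L (assumed input unit).
    """
    n0 = conc_molar * 1000.0 * N_A  # ions per cubic metre
    kappa = math.sqrt(2.0 * n0 * z**2 * E_CHARGE**2 / (eps_r * EPS0 * K_B * temp))
    return 1.0 / kappa


def gouy_chapman_potential(x, v0, conc_molar, z=1, eps_r=78.5, temp=298.15):
    """Potential V(x) in volts at distance x (m) from a flat electrode held at v0."""
    kappa = 1.0 / debye_length(conc_molar, z, eps_r, temp)
    a = z * E_CHARGE / (4.0 * K_B * temp)
    # tanh(a V) = tanh(a V0) * exp(-kappa x), solved for V
    return math.atanh(math.tanh(a * v0) * math.exp(-kappa * x)) / a


if __name__ == "__main__":
    print(debye_length(0.001))                        # metres
    print(gouy_chapman_potential(1e-8, 0.05, 0.001))  # volts
```

For a 0.001 M 1:1 electrolyte in water at 298 K the decay length comes out near 10 nm, which is why the diffuse layer matters mostly in dilute solutions.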
33 Louis Georges Gouy (1854–1926).
34 David Leonard Chapman (1869–1958).
35 Siméon Denis Poisson (1781–1840).
6.21 NERNST–PLANCK AND COTTRELL EQUATIONS
Much of the work given hereinafter is inspired by [17]. The Nernst–Planck equation shows three contributions to the flux (current density) $\mathbf{J}_i$ for species $i$:

$$\mathbf{J}_i(\mathbf{r}) = -D_i \nabla C_i - (z_i F/RT)\, D_i C_i \nabla\phi + C_i \mathbf{v} \tag{6.21.1}$$

where the first term is Fick's first law of diffusion (Eq. (4.16.1)), the second term represents migration, and the third term represents convection contributions to the overall current. $C_i$, $z_i$, and $D_i$ are, respectively, the concentration, electrical charge, and diffusion coefficient for species $i$; $\mathbf{v}$ and $\mathbf{r}$ are the velocity and position vectors, $\phi$ is the electrostatic potential, $F$ is Faraday's constant, $R$ is the gas constant, and $T$ is the absolute temperature. When the concentration $C_i$ changes with time, but convection and migration can be neglected, then Fick's second law applies in three dimensions and in one dimension:

$$\partial C_i(\mathbf{r},t)/\partial t = D_i \nabla^2 C_i(\mathbf{r},t) \quad \text{(3D)}; \qquad \partial C_i(x,t)/\partial t = D_i\, \partial^2 C_i(x,t)/\partial x^2 \quad \text{(1D)} \tag{(4.16.2)}$$
Diffusion coefficients for ions in water at 298 K are roughly $D_i \approx 10^{-5}$ cm² s⁻¹, but are larger for H⁺ and OH⁻ ions and much smaller for biopolymers. At 298.15 K for 0.001 M concentrations of LiCl, NaCl, KCl, RbCl, CsCl, KNO₃, MgCl₂, CaCl₂, SrCl₂, Li₂SO₄, Na₂SO₄, and Cs₂SO₄, the respective values are $D_i$ = 1.345, 1.586, 1.964, 2.024, 2.013, 1.899, 1.189, 1.249, 1.269, 0.990, 1.175, and 1.487 × 10⁻⁵ cm² s⁻¹. At 298.15 K for vanishingly small aqueous concentrations of H⁺, Li⁺, Na⁺, K⁺, and Ca²⁺, the respective values are $D_i$ = 9.313, 1.0286, 1.3349, 1.9565, and 0.7919 × 10⁻⁵ cm² s⁻¹.
The discussion now turns to how to explain routine electrochemical measurements, for example, cyclic voltammograms. First, consider a planar electrode, with three boundary conditions: (i) $C_i(x, t = 0) = C_i^0$; (ii) $\lim_{x\to\infty} C_i(x,t) = C_i^0$: far from the planar electrode, the concentration of species $i$ will not change appreciably; (iii) $C_i(0, t > 0) = 0$: just after $t = 0$, species $i$ has been consumed at the planar electrode. After using several Laplace transforms, the diffusion-limited current $i(t)$ due to solute $i$ is given by the Cottrell^36 equation, which, in practice, is valid only between about 20 ms and 200 s, that is, before convection sets in:

$$i(t) = nFA\,J_i(0,t) = nFA\,D_i\,[\partial C_i(x,t)/\partial x]_{x=0} = nFA\,D_i^{1/2}\,C_i^0\,\pi^{-1/2}\,t^{-1/2} \tag{6.21.2}$$
36 Frederick Garner Cottrell (1877–1948).
where $A$ is the area of the electrode. The spatial dependence is

$$C_i(x,t) = C_i^0\, \mathrm{erf}\left[(1/2)\, x\, D_i^{-1/2}\, t^{-1/2}\right] \tag{6.21.3}$$

where erf(x) is the error function, Eq. (2.21.3).
Second, consider a rapid oxidation–reduction reaction at a planar electrode:

$$\mathrm{O} + n e^- \rightleftharpoons \mathrm{R} \tag{6.21.4}$$

for which the Nernst equation, Eq. (6.19.14), is

$$E = E^{\circ} + (RT/nF) \ln\left[C_O(0,t)/C_R(0,t)\right] \tag{6.21.5}$$
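and assume that the above discussion and assumptions apply to both O and R species: $\partial C_O(x,t)/\partial t = D_O\, \partial^2 C_O(x,t)/\partial x^2$ and $\partial C_R(x,t)/\partial t = D_R\, \partial^2 C_R(x,t)/\partial x^2$, with the boundary conditions $C_O(x,0) = C_O^0$, $\lim_{x\to\infty} C_O(x,t) = C_O^0$, $C_R(x,0) = C_R^0$, $\lim_{x\to\infty} C_R(x,t) = C_R^0$, and an additional flux balance condition at the electrode surface: $D_O\,[\partial C_O(x,t)/\partial x]_{x=0} + D_R\,[\partial C_R(x,t)/\partial x]_{x=0} = 0$. After using a Laplace transform, the resulting current is

$$i_{\text{planar}}(t) = nFA\, D_O^{1/2}\, C_O^0\, \pi^{-1/2}\, t^{-1/2} \left\{1 + \left[D_O^{1/2} C_O(0,t) / D_R^{1/2} C_R(0,t)\right]\right\}^{-1} \tag{6.21.6}$$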
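The Cottrell decay of Eq. (6.21.2) and the erf concentration profile of Eq. (6.21.3) are easy to evaluate numerically. The following Python sketch is only an illustration under stated assumptions: the function names are hypothetical, and the units (cm, cm² s⁻¹, mol cm⁻³, giving the current in amperes) are one consistent choice made here, not something the text prescribes.

```python
import math

FARADAY = 96485.33212  # C/mol


def cottrell_current(t, n, area_cm2, d_cm2_s, c0_mol_cm3):
    """Diffusion-limited current (A) at time t (s) after the potential step."""
    return n * FARADAY * area_cm2 * math.sqrt(d_cm2_s) * c0_mol_cm3 / math.sqrt(math.pi * t)


def concentration_profile(x_cm, t, d_cm2_s, c0_mol_cm3):
    """Concentration (mol/cm^3) at distance x from the electrode at time t."""
    return c0_mol_cm3 * math.erf(x_cm / (2.0 * math.sqrt(d_cm2_s * t)))


if __name__ == "__main__":
    # 1 mM solute (1e-6 mol/cm^3), D = 1e-5 cm^2/s, 1 cm^2 electrode, n = 1
    for t in (0.02, 1.0, 200.0):
        print(t, cottrell_current(t, 1, 1.0, 1e-5, 1e-6))
```

Because the current scales as t^(-1/2), a plot of i(t) against t^(-1/2) should be a straight line through the origin within the time window quoted above; deviations at long times are the usual sign that convection has set in.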
PROBLEM 6.21.1. Using Laplace transforms, derive Eq. (6.21.2) from Fick's second law.
PROBLEM 6.21.2. By inverting the Laplace transform $\bar{C}_i(x,k) = (1/k)\,C_i^0 - (1/k)\,C_i^0 \exp[-x (k/D_i)^{1/2}]$, derive the spatial dependence, Eq. (6.21.3).
Third, consider a triangular cyclic potential sweep under reversible (Nernstian) conditions for a planar electrode (Fig. 6.10), to derive a cyclic voltammogram (Fig. 6.11). In the Nernst expression, Eq. (6.19.14), the potential becomes time-dependent, $E(t) = E(0) - vt$, where $v$ is the potential scan speed, whence

$$E(t) = E(0) - vt = E^{\circ} + (RT/nF) \ln\left[C_O(0,t)/C_R(0,t)\right] \tag{6.21.7}$$
which complicates matters (an analytical form is no longer obtainable by Laplace transforms); much labor finally yields [23]

$$i_{\text{planar}}(t) = nFA\, C_O^0\, \pi^{1/2} D_O^{1/2} \sigma^{1/2}\, \chi(\sigma t) \tag{6.21.8}$$

where the special function $\chi(\sigma t)$ can be found from a numerical evaluation of the integral:

$$\int_{z=0}^{z=\sigma t} \chi(z)\,(\sigma t - z)^{-1/2}\, dz = \left[1 + \xi\theta S(\sigma t)\right]^{-1} \tag{6.21.9}$$

where the auxiliary variables are $\sigma \equiv (nF/RT)v$, $\xi \equiv (D_O/D_R)^{1/2}$, $\theta \equiv \exp[(nF/RT)(E_i - E^{\circ})]$ with $E_i = E(0)$, and $S(\sigma t) \equiv \exp(-\sigma t)$ [23]. If a spherical electrode (e.g., a dropping mercury electrode of radius $r_0$) is used instead of a planar electrode, an additional special function $\phi(\sigma t)$ must be defined, so that the total current becomes

$$i = i_{\text{planar}}(t) + i_{\text{spherical correction}}(t) = i_{\text{planar}}(t) + nFA\, D_O C_O^0\, r_0^{-1}\, \phi(\sigma t) \tag{6.21.10}$$

Numerical values for the two auxiliary dimensionless functions $\chi(\sigma t)$ and $\phi(\sigma t)$ are given in Table 6.3 [23]. Figure 6.12 shows a theoretical linear potential sweep voltammogram for a planar electrode, using the $\chi(\sigma t)$ data of Table 6.3. The current drops beyond the peak $i_p$ shown in Fig. 6.12 because the species being oxidized (or reduced) is depleted: the diffusion of analyte from the bulk solution has not kept pace with the electrochemical process at the electrode.

FIGURE 6.10 Triangular cyclic potential sweep: potential E versus time t, starting from E_i at t = 0 and reversing at t = λ.
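The peak of the planar voltammogram can be estimated from the maximum tabulated value of π^(1/2)χ(σt), 0.4463 (Table 6.3). The Python sketch below is a hedged illustration, not the author's procedure: it assumes the Nicholson–Shain form of the planar current, i = nFA C_O^0 (π D_O σ)^(1/2) χ(σt) with σ = (nF/RT)v, and the function names and the unit choices (cm, cm² s⁻¹, mol cm⁻³, V s⁻¹) are assumptions introduced here.

```python
import math

FARADAY = 96485.33212   # C/mol
GAS_CONSTANT = 8.314462618  # J/(mol K)
CHI_PEAK = 0.4463  # maximum of pi^(1/2) * chi(sigma t), taken from Table 6.3


def sweep_sigma(n, scan_rate_v_s, temp=298.15):
    """sigma = (nF/RT) v, in 1/s."""
    return n * FARADAY * scan_rate_v_s / (GAS_CONSTANT * temp)


def peak_current(n, area_cm2, d_o_cm2_s, c_o_mol_cm3, scan_rate_v_s, temp=298.15):
    """Peak current (A) of a reversible linear-sweep voltammogram, planar electrode."""
    sigma = sweep_sigma(n, scan_rate_v_s, temp)
    return CHI_PEAK * n * FARADAY * area_cm2 * c_o_mol_cm3 * math.sqrt(d_o_cm2_s * sigma)


if __name__ == "__main__":
    # 1 mM analyte, D = 1e-5 cm^2/s, 1 cm^2 electrode, n = 1, v = 0.1 V/s
    print(peak_current(1, 1.0, 1e-5, 1e-6, 0.1))
```

Under these assumptions the peak current grows as the square root of the scan speed, so quadrupling v should double i_p; this is a common experimental check for a diffusion-controlled reversible process.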
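FIGURE 6.11 Resulting cyclic voltammogram for a Nernstian reversible redox process. [Plot of current (arbitrary units) versus potential E, marking the switching point, the peak currents ipc and ipa, the half-peak potential Ep/2, and the approximate position of E°′.]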
Table 6.3 Numerical Values for the Auxiliary Functions χ(σt) and φ(σt)
(E − E1/2)/mV   π^(1/2)χ(σt)   φ(σt)      (E − E1/2)/mV   π^(1/2)χ(σt)   φ(σt)
130             0.009          0.008      −5              0.400          0.548
100             0.020          0.019      −10             0.418          0.596
80              0.042          0.041      −15             0.432          0.641
60              0.084          0.087      −20             0.441          0.685
50              0.117          0.124      −25             0.445          0.725
45              0.138          0.146      −28.50          0.4463         0.756
40              0.160          0.173      −30             0.446          0.763
35              0.185          0.208      −35             0.443          0.796
30              0.211          0.236      −40             0.438          0.826
25              0.240          0.273      −50             0.421          0.875
20              0.269          0.314      −60             0.399          0.912
15              0.298          0.357      −80             0.353          0.957
10              0.328          0.403      −100            0.312          0.980
5               0.355          0.451      −120            0.280          0.991
0               0.380          0.499      −150            0.245          0.997
Note: χ(σt) is a maximum at (E − E1/2) = −0.0285 V, while φ(σt) is at a maximum for (E − E1/2) < −0.150 V [23].
The peak current $i_p$ in a linear or cyclic voltammogram is found easily, while the peak potential $E_p$ may sometimes be difficult to identify if the peak is broad. The peak current $i_p$ is known as the limiting current, that is, the limiting value of the faradaic current, attained when the species being oxidized or reduced is constantly arriving at the electrode at the maximum possible rate. It is often convenient to report an empirical, easily measured half-peak potential $E_{p/2}$, at which the current has reached $i_p/2$, one-half of its maximum value. This $E_{p/2}$ depends on experimental conditions (the potential scan speed $v$, and whether the electrode reaction is reversible, irreversible, or quasi-reversible). $E_{p/2}$ is not identical to $E_{1/2}$, the half-wave potential. In contrast, the half-wave potential $E_{1/2}$ has a thermodynamic significance derived from the Nernst equation and is independent of experimental conditions. If reversibility reigns, then (in volts)

$$E_{p/2} \approx E_{1/2} + (0.0285/n) \approx E^{\circ} + (RT/nF) \ln (D_O/D_R)^{1/2} + (0.0285/n) \tag{6.21.11}$$

FIGURE 6.12 Theoretical linear sweep voltammogram for a reversible charge transfer and a planar electrode, using the dimensionless auxiliary function χ(σt) of Table 6.3. The half-wave potential E1/2 and the half-peak potential Ep/2 (scan-dependent) are shown. [Plot of π^(1/2)χ(σt) versus n(E − E1/2) = (RT/F) ln ξ + n(Ei − E°) − (RT/F)σt (in mV), marking the peak (Ep, ip), (1/2)ip, E1/2, and Ep/2.]
FURTHER READING
For kinetics: See references 24 and 25.
For pH, and so on: See references 17 and 24.
For electrochemistry: See references 17 and 26.

REFERENCES
1. M. Bodenstein and S. C. Lind, Geschwindigkeit der Bildung des Bromwasserstoffs aus seinen Elementen, Z. Physik. Chem. 57:168–175 (1906).
2. J. R. Arnold and W. F. Libby, Age determinations by radiocarbon content: Checks with samples of known age, Science 110:678–680 (1949).
3. J. A. Christiansen, Kgl. Dansk. Videnskab. Selsk. Math.-fys. Medd. 1:14 (1919).
4. K. F. Herzfeld, Zur Theorie der Reaktionsgeschwindigkeiten in Gasen, Ann. Physik 66:635–667 (1919).
5. M. Polanyi, Reaction isochore and reaction velocity from the standpoint of statistics, Z. Elektrochem. 26:49–54 (1920).
6. W. J. Moore, Physical Chemistry, 4th edition, Prentice-Hall, Englewood Cliffs, NJ, 1972.
7. R. A. Marcus, On the theory of oxidation–reduction reactions involving electron transfer. I, J. Chem. Phys. 24:966–978 (1956).
8. R. A. Marcus, Electron transfer reactions in chemistry: Theory and experiment (Nobel Lecture), Angew. Chem. Int. Ed. Engl. 32:1111–1121 (1993).
9. L. T. Calcaterra, G. L. Closs, and J. R. Miller, Fast intramolecular electron transfer in radical ions over long distances across rigid saturated hydrocarbon spacers, J. Am. Chem. Soc. 105:670–671 (1983).
10. J. R. Miller, L. T. Calcaterra, and G. L. Closs, Intramolecular long-distance electron transfer in radical ions. The effects of free energy and solvent on the reaction rates, J. Am. Chem. Soc. 106:3047–3049 (1984).
11. H. M. McConnell, Intramolecular charge transfer in aromatic free radicals, J. Chem. Phys. 35:508–518 (1961).
12. J. G. Simmons, Generalized formula for the electric tunnel effect between similar electrodes separated by a thin insulating film, J. Appl. Phys. 34:1793–1803 (1963).
13. J. G. Simmons, Conduction in thin dielectric films, J. Phys. D 4:613–657 (1971).
14. J. N. Brønsted, Einige Bemerkungen über den Begriff der Säuren und Basen, Rec. Trav. Chim. 42:718–728 (1923).
15. T. M. Lowry, The Uniqueness of Hydrogen, J. Soc. Chem. Ind. London 42:43–47 (1923).
16. H. A. Laitinen, Chemical Analysis: an Advanced Text and Reference, McGraw-Hill, New York, 1960.
17. A. J. Bard and L. R. Faulkner, Electrochemical Methods: Fundamentals and Applications, 2nd edition, Wiley, Hoboken, NJ, 2001.
18. en.wikipedia.org/wiki/Standard_electrode_potential_(data_page).
19. H. Helmholtz, Pogg. Ann. 89:211 (1853).
20. G. Gouy, Compt. Rend. 149:654 (1909).
21. G. Gouy, J. Phys. 4:457 (1910).
22. D. L. Chapman, Philos. Mag. 6:475 (1913).
23. R. S. Nicholson and I. Shain, Single scan and cyclic methods applied to reversible, irreversible and kinetic systems, Anal. Chem. 36:706–723 (1964).
24. P. W. Atkins and J. de Paula, Physical Chemistry, 9th edition, Oxford University Press, Oxford, UK, 2010.
25. J. I. Steinfeld, J. S. Francisco, and W. L. Hase, Chemical Kinetics and Dynamics, 2nd edition, Prentice-Hall, Upper Saddle River, NJ, 1998.
26. D. A. Skoog, F. J. Holler, and S. R. Crouch, Principles of Instrumental Analysis, 6th edition, Thomson Brooks Cole, Belmont, CA, 2007.
CHAPTER
7
Symmetry
“Similes cum similibus facillime congregantur.” [Similar people congregate most easily with similar people.] [Paraphrased remark of Marcus Porcius Priscus Cato (ca. 236–149 BC), quoted in De Senectute by Marcus Tullius Cicero (106–43 BC)]
7.1 SYMMETRY
Symmetry is a property we find in objects with at least one dimension (D): 1-D: symmetry of beads on a string; 2-D: symmetry of objects in a plane; 3-D: symmetry of objects in space. Empty space has the most symmetry. In zero dimensions, any symmetry is allowed. An object that does not have to fill space can have any arbitrary symmetry (e.g., no symmetry, or a sevenfold rotation axis). However, if this object must fill 2-D or 3-D space, it must meet certain local symmetry requirements, which, coupled with translational symmetry operators, allow the space to be completely filled.
Symmetry in nature had been a preoccupation for the Greek philosophers Pythagoras^1, Plato^2, and Euclid^3. The first detailed attention to crystal symmetry is due to mineralogists, who noticed that certain crystal faces grew bigger than others, from sample to sample, but in 1669 Steno^4 found that interfacial angles were constant, while in 1784 Haüy^5 found that cleavage planes had rational intercepts. In 1848 Pasteur^6 separated d- and l-tartaric acid crystals by their mirror-image crystalline forms, which would lead to the three-dimensionality of the covalent bonding of carbon. In 1850 Bravais^7 found his 14 lattices, while between 1890 and 1905 the mathematicians Schönflies^8 and Fedorov^9 found all the 230 possible space groups, well in advance of von Laue's^10 first look at crystals by X-ray diffraction in 1912.
As already discussed in Section 2.4, in 3-D any point (denoted by the vector r) can be described by its three coordinate projections x, y, z (in units such as m, nm, Å, or pm) using an orthogonal coordinate system with unit vectors e_X, e_Y, e_Z; hence r = x e_X + y e_Y + z e_Z. In noncrystallographic textbooks, the position vector r is usually given in a Cartesian (orthogonal) system. Crystals are defined as symmetric objects with translational symmetry, with a fundamental repeat unit, or primitive (P) direct-lattice unit cell ("unit cell" for short) of volume V, repeated ad infinitum by multiple application of three unit cell vectors a, b, and c, along the unit vector directions e_a, e_b, e_c. These vectors may form oblique angles α, β, and γ between them (Figs. 2.4 and 7.1), but in crystals with a higher symmetry, some or all of these angles may be right angles. The translation vectors, expressed in the same system and the same units, will be la + mb + nc, where l, m, n are integers. In the lowest-symmetry case (triclinic crystal, Fig. 7.1), the unit cell is an oblique parallelepiped. Translational symmetry requires that any matter located at x e_a + y e_b + z e_c must also be replicated exactly at the coordinates (x + l|a|) e_a + (y + m|b|) e_b + (z + n|c|) e_c, where l, m, and n are integers (positive, negative, or zero). This translation symmetry defines the unit cell and the unit cell axes a, b, c. In the least symmetric case, the contents of the unit cell (atoms, ions, molecules, trapped solvents, proteins) may not have any symmetry at all.

FIGURE 7.1 (A repeat of Fig. 2.4.) The primitive direct-lattice unit cell in a triclinic (lowest-symmetry) crystal is an oblique parallelepiped with sides a, b, c, interfacial angles α, β, and γ, and unit vectors e_a, e_b, and e_c.

1 Pythagoras of Samos (ca. 570 BC–ca. 495 BC).
2 Plato (427 BC–347 BC).
3 Euclid (324 BC–264 BC).
4 Nicolas Steno (1638–1685).
5 Abbé René-Just Haüy (1743–1822).
6 Louis Pasteur (1822–1895).
The Physical Chemist’s Toolbox, Robert M. Metzger. © 2012 John Wiley & Sons, Inc. Published 2012 by John Wiley & Sons, Inc.
Most often, within certain highly symmetric unit cells, local (intra-cell) symmetries may exist, which are not integer multiples of the translations. These local symmetries are (i) combinations of halves of the unit cell translations (face-centering, body-centering, or end-centering operations), and/or (ii) symmetry operations that replicate the contents of one part of the unit cell in another part of it: rotation, inversion, and mirror reflection operations, which are often combined with translations by rational fractions of the unit cell vectors (roto-reflection, roto-inversion, and glide operations). When these local symmetries exist, the basic repeat unit is no longer the whole unit cell, but is a rational fraction of it, called the asymmetric unit. The unit cell is then mapped by applying all the applicable local symmetries.
7 Auguste Bravais (1811–1863).
8 Arthur Moritz Schönflies (1853–1928).
9 Evgraf Stepanovich Fedorov [or Fyodorov] (1853–1919).
10 Max Theodor Felix von Laue (1879–1960).
7.2 SYMMETRY OPERATIONS AND POINT GROUPS
The symmetry operations of a certain object, or repeat unit, form a mathematical group. A finite group of order h is a mathematical set of h objects, operators, matrices, depictions of molecules, or elements (J_1, J_2, ..., J_h) closed under one operation, usually called group multiplication (closure means that the product of any two elements of the group is again an element of the group); one of these operators is the identity I; the inverse operator J_i^(-1) of any operator J_i within the group must also belong to the group: J_i^(-1) J_i = I = J_i J_i^(-1).
Space can be completely filled, or "tiled," in two and three dimensions only by certain regular forms. For instance, a two-dimensional plane can be tiled (covered completely using only multiples of the two lattice vectors or "translations" a and b) by only four systems (Fig. 7.2):
(a) the general oblique parallelogram, and as subcases thereof:
(b) the rectangle,
(c) the square, and
(d) the parallelogram with equal sides and angles equal to 60° and 120° (third of a hexagon).
This tiling problem is so named by analogy to the problem of how to cover a floor with tiles of a single shape: not all shapes will cover the floor completely (squares, rectangles, triangles, and hexagons will do, but pentagons, heptagons, or circles will not). Table 7.1 shows the total number of symmetry objects allowed in two and three dimensions. We delay the presentation of the Bravais lattices and the space groups, and we first deal with the symmetry operators and the point groups. The eight symmetry operators are five proper operators and three improper (i.e., combination) operators:
FIGURE 7.2 The four planar systems: (a) oblique, (b) rectangular, (c) square, (d) hexagonal.
Table 7.1 Number of Symmetry Objects in Two and Three Dimensions
In Two Dimensions (2-D)   In Three Dimensions (3-D)   Restriction
10 point groups           32 point groups             No translations
4 systems                 7 systems                   Worry about a, b, c, α, β, γ
5 Bravais lattices        14 Bravais lattices         Fill space with empty bodies
17 space groups           230 space groups            Fill space with full bodies
A. Five proper operators:
1. Translation operators.
2. Identity operator 1 = C1.
3. The inversion center, or inversion operator, or centrosymmetry operator i = Ci = 1̄.
4. Various rotation operators (1 = C1 = 1-fold, or identity rotation, or rotation by 360°; 2 = C2 = 2-fold, or rotation by 180°; 3 = C3 = 3-fold, or rotation by 120°; 4 = C4 = 4-fold, or rotation by 90°; and 6 = C6 = 6-fold, or rotation by 60°).
5. Reflection or mirror plane (m).
B. Three improper or combination operators:
6. Roto-inversion operators (combinations of rotation plus inversion: 1̄, 2̄, 3̄, 4̄, or 6̄). Some scientists use roto-reflection operators instead, while crystallographers have chosen roto-inversion.
7. The screw rotations: 2₁, 3₁, 3₂, 4₁, 4₂, 4₃, 6₁, 6₂, 6₃, 6₄, 6₅ (N_m = rotation by 360°/N plus translation by m/N cell constants along the axis of rotation).
8. The glide planes: a, b, c (not to be confused with unit cell translations), d (diamond), n (net) [e.g., glide operation a perpendicular to the a, b plane at z/c = 1/4: mirror plane about z/c = 1/4, followed by a translation of a/2 along a].
Since handedness (left-handed versus right-handed) is important in molecules, the eight symmetry operations can be rethought as (C) 4 operations of the first kind (which preserve handedness): translation, identity, rotation, and screw rotation; (D) 4 operations of the second kind (which reverse handedness, and produce enantiomorphs): inversion, reflection, roto-inversion, and glide planes.
To obtain the coordinate transformations for rotation operators placed at the origin (without loss of generality, assume that the rotation axis is parallel to the z axis, Fig. 7.3), we can define the arbitrary rotation through an angle θ, as given by

$$x' = x\cos\theta + y\sin\theta, \qquad y' = -x\sin\theta + y\cos\theta \tag{7.1.1}$$
For the important rotations by θ = 30°, 45°, 60°, 90° we get the expressions given in Table 7.2. Figure 7.4 shows a systematic procedure for determining the point group to which a molecule belongs, using a nomenclature introduced by Schönflies;
[Figure 7.3: geometric construction of the rotated axes, showing x' = OA' = OR + RA' = x cos θ + y sin θ and y' = OB' = PT − A'T = y cos θ − x sin θ.]
FIGURE 7.3 Rotate the coordinate system (x, y) by a positive angle θ to a new coordinate system (x', y'), using Eq. (7.1.1).
Table 7.2 Effect of the Principal Rotation Operators Cn Parallel to the z Axis and Through the Origin (Rotation by θ Degrees)^a

Cn     θ       x'                  y'                   z'
C12    30°     √3x/2 + y/2         −x/2 + √3y/2         z
C8     45°     x/√2 + y/√2         −x/√2 + y/√2         z
C6     60°     x/2 + √3y/2         −√3x/2 + y/2         z
C4     90°     y                   −x                   z
C3     120°    −x/2 + √3y/2        −√3x/2 − y/2         z
C2     180°    −x                  −y                   z

a From the old coordinates (x, y, z) they generate the new coordinates (x', y', z').
the names C, D, and S stand for cyclic, dihedral, and Spiegel (mirror), respectively; the subscripts h, v, and d stand for horizontal, vertical, and diagonal mirror planes, respectively, with respect to the principal rotation axis (which is taken as vertical). Table 7.3 shows a few point groups of interest to molecules (and to crystals). The Schönflies notation is being replaced in the crystallographic literature by the Hermann¹¹–Mauguin¹² or international notation. The seven crystal systems in three-dimensional space are listed with their defining symmetry elements in Table 7.4.
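The entries of Table 7.2 follow directly from Eq. (7.1.1), and a short numerical check makes that explicit. The sketch below is illustrative only: the function names `rotate_axes` and `max_table_error` and the test point are assumptions introduced here, not part of the original text.

```python
import math

def rotate_axes(x, y, theta_deg):
    """New coordinates (x', y') of a fixed point after rotating the axes by +theta, Eq. (7.1.1)."""
    t = math.radians(theta_deg)
    return (x * math.cos(t) + y * math.sin(t),
            -x * math.sin(t) + y * math.cos(t))

R2, R3 = math.sqrt(2), math.sqrt(3)

# Closed-form entries of Table 7.2, keyed by the rotation angle in degrees.
TABLE_7_2 = {
    30:  lambda x, y: (R3 * x / 2 + y / 2, -x / 2 + R3 * y / 2),
    45:  lambda x, y: (x / R2 + y / R2, -x / R2 + y / R2),
    60:  lambda x, y: (x / 2 + R3 * y / 2, -R3 * x / 2 + y / 2),
    90:  lambda x, y: (y, -x),
    120: lambda x, y: (-x / 2 + R3 * y / 2, -R3 * x / 2 - y / 2),
    180: lambda x, y: (-x, -y),
}

def max_table_error(x=0.3, y=-1.7):
    """Largest deviation between the tabulated forms and Eq. (7.1.1) at one test point."""
    worst = 0.0
    for theta, closed_form in TABLE_7_2.items():
        general = rotate_axes(x, y, theta)
        tabulated = closed_form(x, y)
        worst = max(worst, abs(general[0] - tabulated[0]), abs(general[1] - tabulated[1]))
    return worst

if __name__ == "__main__":
    print("largest deviation:", max_table_error())
```

The z coordinate is left out because a rotation about z leaves it unchanged.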
7.3 GROUP THEORY AND CHARACTER TABLES
We will show how a finite group “works” by using the crystallographic point group D3h (6̄m2) as a working example and considering the operations
11 Carl Hermann (1898–1961).
12 Charles-Victor Mauguin (1878–1958).
[Figure 7.4: flowchart. Decision nodes, in order: Linear? i? Two or more Cn with n > 2? C5? i? Find the Cn with highest n; are there n C2 perpendicular to this Cn? σh? n σd? Cn? σh? n σv? S2n? σ? i? Outcomes with examples: D∞h (acetylene), C∞v (HCl), Ih (C60), Oh (SF6), Td (CH4), Dnh (D2h: ethylene), Dnd (D2d: allene), Dn, Cnh (C2h: trans-1,2-C2H2F2), Cnv (C2v: H2O), S2n, Cn (C2: H2O2), Cs (pyridine), Ci (meso-tartaric acid), C1 (CBrClFI).]
FIGURE 7.4 Scheme for determining molecular point groups (Schönflies notation).
possible for a single trigonal bipyramidal molecule with vertices A, B, C, D, and E (Fig. 7.5). Figure 7.6 shows how this trigonal bipyramid is affected by each symmetry operation. Table 7.5 presents the group multiplication table. Next, we discuss representations and character tables. A finite group of order h can be represented by h matrices, each of dimension h × h, which can act on a basis set of h column matrices of dimension h × 1. However, for each group there exist so-called irreducible representations: the smaller submatrices that remain when all the representation matrices are brought to block-diagonal form. Table 7.6 presents the full character tables for a number of point groups, crystallographic and otherwise; Table 7.7 lists abbreviated character tables for the 32 crystallographic point groups. The groups Ci, Cs, C2, C3, and C2h are abelian or commutative (all of their operators commute, FG = GF, and their group multiplication tables are
Table 7.3 Point Groups of Interest to Chemistry (in Schönflies and also Hermann–Mauguin Notation), with Examples of Molecules that Belong to Them^a
(Each family is listed for n = 2, 3, 4, 5, 6, 7, ∞; the Hermann–Mauguin symbol follows in parentheses where the source gives it.)

Cn: one n-fold principal rotation axis Cn (rotation by 360°/n).
  C2 (2); C3 (3); C4 (4); C5* (5*); C6 (6); C7*; C∞*. Example: H2O2 (C2).

Cnh: one n-fold principal rotation axis Cn, plus a horizontal mirror plane.
  C2h (2/m); C3h (3/m); C4h (4/m); C5h* (5/m*); C6h (6/m); C∞h*. Examples (structural drawings in the original): trans-1,2-C2H2F2 (C2h), B(OH)3 (C3h).

Cnv: one n-fold principal rotation axis Cn, plus vertical mirror planes σv [pyramids].
  C2v (2mm); C3v (3m); C4v (4mm); C5v* (5mm); C6v (6mm); C∞v*. Examples: H2O, SO2, NO2 (C2v); NH3 (C3v); HCl, OCS, cone (C∞v).

Dn: one n-fold principal rotation axis Cn, plus n C2 axes perpendicular to Cn.
  D2 (222); D3 (32); D4 (422); D5* (522*); D6 (622); D∞*.

Dnh: one n-fold principal rotation axis Cn, n C2 axes perpendicular to Cn, and a horizontal mirror plane σh [planes or bipyramids].
  D2h (mmm); D3h (6̄m2); D4h (4/mmm); D5h* (5/mmm*); D6h (6/mmm); D∞h*. Examples: C2H4 (D2h); PCl5, BF3 (D3h); AuCl4− (D4h); eclipsed ferrocene, eclipsed ruthenocene (D5h); C6H6 (D6h); N2, C2H2 (D∞h).

Dnd: one n-fold principal rotation axis Cn, n C2 axes perpendicular to Cn, and n dihedral mirror planes σd.
  D2d (4̄2m); D3d (3̄m); D4d*; D5d* (5̄m); D6d*; D∞d*. Examples: allene (D2d); staggered ethane (D3d); circle (D∞d).

Sn: one n-fold roto-reflection axis Sn (rotation by Cn plus one mirror plane perpendicular to Cn).
  S2 = Ci (1̄); S4 (4̄); S6 = C3i (3̄). Examples: meso-tartaric acid (Ci); tetraphenylmethane (S4).

Special groups:
  R3* (full rotation group): sphere
  Ih* (icosahedron): C60
  Oh (octahedron): SF6
  Td (tetrahedron): CH4

a The point groups indicated by an asterisk (*) will not form space groups.
Table 7.4 The Defining Symmetry Elements of the 7 Crystal Systems in 3-D

Name                        Unit Cell Conditions                              Defining Symmetry Element
Triclinic                   a ≠ b ≠ c ≠ a;  α ≠ β ≠ γ ≠ α                     (1) none
Monoclinic                  a ≠ b ≠ c ≠ a;  α = γ = 90°, β ≠ 90°              (2) one two-fold axis C2
Orthorhombic                a ≠ b ≠ c ≠ a;  α = β = γ = 90°                   (222) three mutually perpendicular two-fold axes C2
Tetragonal                  a = b ≠ c;      α = β = γ = 90°                   (4) one four-fold axis C4
Hexagonal                   a = b ≠ c;      α = β = 90°, γ = 120°             (6) one six-fold axis C6
Rhombohedral or trigonal    a = b = c;      α = β = γ ≠ 90°                   (3) one three-fold axis C3
Cubic                       a = b = c;      α = β = γ = 90°                   (23) four three-fold axes C3 (along the body diagonals)
symmetric about the main diagonal); many of the others (for example D3h, Table 7.5) are nonabelian or noncommutative (“revolt from suburbia,” hehe!), that is, FG ≠ GF for at least some pairs of operators. By convention, if the irreducible representations can be one-dimensional matrices, the rows are labeled A (B) if the basis or representation is symmetric, that is, has eigenvalue +1 (is antisymmetric, i.e., has eigenvalue −1), when operated on by the principal rotation axis. In groups that have an inversion operator i, the suffix g = gerade = even (u = ungerade = odd) denotes the eigenvalue after operating with i: +1 for gerade, −1 for ungerade. The comments x, y, z and their combinations (x² − y², etc.) indicate that those variables belong to that representation; this is useful for deciding about selection rules for electric-dipole-allowed spectroscopic transitions. The label E is used for two-dimensional representations, and the label T (or F) for three-dimensional representations.
The character tables of Tables 7.6 and 7.7 are best explained by example. For instance, consider the bent molecule NO2, which belongs to point group C2v, and choose a minimum basis set of atomic orbitals centered on the three atoms (Fig. 7.7). To exploit the molecular symmetry, it is wise to orient the molecule with the z axis bisecting the ONO bond angle and with the x axis normal to the NO2 molecular plane. Consider what will happen to the column vector representing the 2px orbitals centered on the three atoms, 2px(N), 2px(OA), 2px(OB):

$$\begin{pmatrix} 2p_x(\mathrm{N}) \\ 2p_x(\mathrm{O_A}) \\ 2p_x(\mathrm{O_B}) \end{pmatrix} \tag{7.3.1}$$
[Figure 7.5: trigonal bipyramid with vertices A, B, C, D, E, showing the axes C2, C2', C2'', the planes σh, σv, σv', σv'', and the labels: C3(z), rotate by 120 degrees clockwise; C3'(z), rotate by 120 degrees counterclockwise; S3(z), rotate by 120 degrees clockwise, then reflect; S3'(z), rotate by 120 degrees counterclockwise, then reflect.]
FIGURE 7.5 Symmetry operations in the point group D3h (6̄m2), applicable to a triangular bipyramid of that symmetry. The corners of the bipyramid are labeled A, B, C, D, E. The symmetry operations are the two rotations about the vertical threefold axis, C3 (clockwise rotation by 120°) and C3' (counterclockwise rotation by 120°, or clockwise by 240°), three horizontal twofold axes C2, C2', and C2'', the horizontal mirror plane σh, the two roto-reflections S3 and S3', and three vertical symmetry planes σv, σv', and σv''.
[Figure 7.6: eleven drawings of the bipyramid with relabeled vertices, one each for σv, σv', σv'', σh, S3(z), S3'(z), C3(z), C3'(z), C2, C2', C2''.]
FIGURE 7.6 Effect of 11 of the 12 symmetry operators (all except identity) of point group D3h (6̄m2) on a triangular bipyramid (think of the molecule PCl5), with five marked vertices A, B, C, D, and E (the chlorine atoms).

Table 7.5 Group Multiplication Table for Point Group D3h (6̄m2)^a
(Columns: first operation. Rows: second operation.)

          I     C3    C3'   C2    C2'   C2''  σh    S3    S3'   σv    σv'   σv''
I         I     C3    C3'   C2    C2'   C2''  σh    S3    S3'   σv    σv'   σv''
C3(z)     C3    C3'   I     C2''  C2    C2'   S3    S3'   σh    σv''  σv    σv'
C3'(z)    C3'   I     C3    C2'   C2''  C2    S3'   σh    S3    σv'   σv''  σv
C2        C2    C2'   C2''  I     C3    C3'   σv'   σv''  σv    S3'   σh    S3
C2'       C2'   C2''  C2    C3'   I     C3    σv''  σv    σv'   S3    S3'   σh
C2''      C2''  C2    C2'   C3    C3'   I     σv    σv'   σv''  σh    S3    S3'
σh        σh    S3    S3'   σv'   σv''  σv    I     C3    C3'   C2''  C2    C2'
S3(z)     S3    S3'   σh    σv    σv'   σv''  C3    C3'   I     C2'   C2''  C2
S3'(z)    S3'   σh    S3    σv''  σv    σv'   C3'   I     C3    C2    C2'   C2''
σv        σv    σv'   σv''  S3    S3'   σh    C2''  C2    C2'   I     C3    C3'
σv'       σv'   σv''  σv    σh    S3    S3'   C2    C2'   C2''  C3'   I     C3
σv''      σv''  σv    σv'   S3'   σh    S3    C2'   C2''  C2    C3    C3'   I

a The order of this group is 12, that is, the group has 12 elements. Each symmetry element can occur only once in each row and each column.
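The properties stated for Table 7.5 (order 12, each element once per row and column, noncommutativity) can be checked by representing each D3h operation as a permutation of the five vertices. The following is a minimal sketch under an assumed labeling (vertices 0, 1, 2 equatorial; 3, 4 axial) that is not taken from Fig. 7.5; the choice of generators and all names are likewise illustrative.

```python
# Each operation is a tuple p where p[i] is the vertex that vertex i is sent to.
# Assumed (hypothetical) labeling: 0, 1, 2 are equatorial vertices; 3, 4 are axial.
IDENTITY = (0, 1, 2, 3, 4)
C3 = (1, 2, 0, 3, 4)        # threefold rotation cycles the equatorial vertices
C2 = (0, 2, 1, 4, 3)        # twofold axis through vertex 0: swaps 1<->2 and 3<->4
SIGMA_H = (0, 1, 2, 4, 3)   # horizontal mirror: swaps the two axial vertices

def compose(p, q):
    """Permutation obtained by applying q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def generate_group(generators):
    """Close a set of generators under composition (breadth of products)."""
    group = {IDENTITY}
    frontier = [IDENTITY]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return sorted(group)

def is_latin_square(group):
    """True if every element occurs exactly once in each row and column of the table."""
    n = len(group)
    rows_ok = all(len({compose(g, h) for h in group}) == n for g in group)
    cols_ok = all(len({compose(h, g) for h in group}) == n for g in group)
    return rows_ok and cols_ok

def is_abelian(group):
    return all(compose(g, h) == compose(h, g) for g in group for h in group)

D3H = generate_group([C3, C2, SIGMA_H])

if __name__ == "__main__":
    print("order:", len(D3H))
    print("Latin square:", is_latin_square(D3H))
    print("abelian:", is_abelian(D3H))
```

Because the permutations act faithfully on the five vertices, the multiplication table built from `compose` has the same structure as Table 7.5, up to the naming of the operations.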
Table 7.6 Full Character Tables for Several Point Groups^a

C1 = 1 (h = 1)      I
A                   1

Cs = C1h = m (h = 2)   I    σh
A'                     1    1     x, y, Rz, x², y², z², xy
A''                    1    −1    z, Rx, Ry, yz, zx

Ci = S2 = 1̄ (h = 2)    I    i
Ag                     1    1     Rx, Ry, Rz, x², y², z², xy, yz, zx
Au                     1    −1    x, y, z

C2 = 2 (h = 2)         I    C2
A                      1    1     z, Rz, x², y², z², xy
B                      1    −1    x, y, Rx, Ry, yz, zx

C3 = 3 (h = 3)         I    C3    C3²      [ε = exp(2πi/3)]
A                      1    1     1        z, Rz, x² + y², z²
E                      1    ε     ε*       (x, y), (Rx, Ry), (x² − y², xy), (yz, zx)
                       1    ε*    ε

C4 = 4 (h = 4)         I    C2    C4    C4³
A                      1    1     1     1     z, Rz, x² + y², z²
B                      1    1     −1    −1    x² − y², xy
E                      1    −1    i     −i    (x, y), (Rx, Ry), (yz, zx)
                       1    −1    −i    i

C2v = 2mm (h = 4)      I    C2    σv(xz)   σv'(yz)
A1                     1    1     1        1        z, x², y², z²
A2                     1    1     −1       −1       Rz, xy
B1                     1    −1    1        −1       x, Ry, zx
B2                     1    −1    −1       1        y, Rx, yz
(continued)
398
7
Table 7.6
(Continued )
C3v ¼ 3m
I
A1 A2 E
2C3
1 1 1
C4v ¼ 4mm
1 1 1
I
A1 A2 B1 B2 E
1 1 1 1 2
1 1 1 1 0
I
2C5
A1 A2 E1 E2
1 1 2 2
1 1 2 cos a 2cos 2a 2 cos 2a2cos a
1 1 1
z, x þ y , z2 Rz (x, y) (Rx, Rx) (zx, yz)
1 1 1 1 0
1 1 1 1 0
z, x þ y , z2 Rz x2 y2 xy (x, y) (Rx, Ry) (zx, yz)
2C52
5sv
h ¼ 10, a 72
1 1
1 1 0
z, x2 þ y2, z2 Rz (x, y) (Rx, Ry) (zx, yz) (x2 y2, xy)
0
C6v ¼ 6mm
I
2C6
2C3
C2
3sv
3sd
A1 A2 B1 B2 E1 E2
1 1 1 1 2 2
1 1 1 1 1 0
1 1 1 1 1 12
1 1 1 1 2 0
1 1 1 1 0 0
1 1 1 1 0 0
C1v P A1( þ) P A2( ) Q E1( )
I
C2
1
1
1
1
1
1
1
1
2
2
2
2
E2(D) etc.
2
h¼4
2sv
1 1 1 1 2
C5v
h¼6 2
2sv0
C2
2C4
3sv
2Cf
2
h ¼ 12 z, x2 þ y2, z2 Rz
(x, y) (Rx, Ry) (zx, yz) (x2 y2, xy)
1sv
2 cos f
0
2 cos 2f
0
2
h¼1 z x2 þ y2, z2 Rz (x, y) (Rx, Ry) (zx, yz)
D2 ¼ V ¼ 222
I
C2(z)
C2(y)
C2(x)
h¼4
A1 B1 B2 B3
1 1 1 1
1 1 1 1
1 1 1 1
1 1 1 1
x2 þ y2, z2 z, Rz xy y, Ry zx x, Rz yz
Table 7.6
(Continued )
D3 ¼ 32
I
2C3
3C2
h¼6
1 1 0
x þy , z z, Rz (x, y) (Rx, Ry), (x2 þ y2, xy), (zx, yz) 2
2
2
A1 A2 E
1 1 2
D2h ¼ mmm
I
C2(z)
C2(y)
C2(x)
i
s(xy)
s(zx)
s(yz)
h¼8
Ag B1g B2g B3g Au B1u B2u B3u
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
x2 þ y2, z2 Rz, xy Ry, zx Rx, yz
1 1 1
D3h ¼ 6m2
I
2C3
3C2
sh
2S3
3sv
h ¼ 12
A10 A20 E0 A100 A200 E
1 1 2 1 1 2
1 1 1 1 1 1
1 1 0 1 1 0
1 1 2 1 1 2
1 1 1 1 1 1
1 1 0 1 1 0
x2 þ y2, z2 Rz (x, y) (x2 y2, xy)
D4h ¼ 4/mmm
I
2C4
C2
2C20
2C200
i
2S4
sh
2sv
2sd
h ¼ 16
A1g A2g B1g B2g Eg A1u A2u B1u B2u Eu
1 1 1 1 2 1 1 1 1 2
1 1 1 1 0 1 1 1 1 0
1 1 1 1 2 1 1 1 1 2
1 1 1 1 0 1 1 1 1 0
1 1 1 1 0 1 1 1 1 0
1 1 1 1 2 1 1 1 1 2
1 1 1 1 0 1 1 1 1 0
1 1 1 1 2 1 1 1 1 2
1 1 1 1 0 1 1 1 1 0
1 1 1 1 0 1 1 1 1 0
x2 þ y2, z2 Rz x2 y2 xy Rx Ry zx yz
D5h
I
2C5
A10 A20 E10 E20 A100 A200 E100 E20
1 1 2 2 1 1 2 2
1 1 2cos a 2cos 2a 2cos 2a 2cos a 1 1 2cos a 2cos 2a 2cos 2a 2cos a
z (Rx, Ry) (zx, yz)
2C25
5C2
sh
2S5
1 1
1 1 0 0 1 1 0 0
1 1 2 2 1 1 2 2
1 1 2cos a 2cos 2a 2cos 2a 2cos a 1 1 2cos a2cos 2a 2cos 2a2cos a
1 1
z
(x, y)
2S25
5sv
h ¼ 20, a 72
1 1
1 1 0 0 1 1 0 0
x2 þ y2, z2 Rz (x, y) (x2 y2, xy)
1 1
(Rx, Ry) (zx, yz)
(continued )
400
Table 7.6
7
(Continued )
D6h ¼ 6/mmm
I
A1g A2g B1g B2g E1g E2g A1u A2u B1u B2u E1u E2u
D1h P A1g( þ g) P A1u( u ) P A2g( g) P A2u( u ) Q E1g( u) Q E1u( u)
2C6
1 1 1 1 2 2 1 1 1 1 2 2
1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1
1C20
I
3C20
C2
2C3
1 1 1 1 2 2 1 1 1 1 2 2
1 1 1 1 0 0 1 1 1 1 0 0
i
2Cf
3C200
i
1 1 1 1 0 0 1 1 1 1 0 0
2S3
1 1 1 1 2 2 1 1 1 1 2 2
1 1 1 1 1 1 1 1 1 1 1 1
1sv
2S6
3sd
sh
1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 2 2 1 1 1 1 2 2
1 1 1 1 0 0 1 1 1 1 0 0
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
2
0
2cos f
2
0
2cos f
2
0
2cos f
2
0
2cos f
Rz (x, y)
E2g(Dg)
2
0
2cos 2f
2
0
2cos 2f
(xy, x2 y2)
E2u(Du) etc.
2
0
2cos 2f
2
0
2cos 2f
m D2d ¼ Vd 42
I
A1 A2 B1 B2 E
A1g A2g Eg A1u A2u E
1 1 1 1 2
I 1 1 2 1 1 2
1 1 1 1 0
2C3 1 1 1 1 1 1
2C20
C2
2S4
1 1 1 1 2
3C2 1 1 0 1 1 0
1 1 1 1 0
i 1 1 2 1 1 2
2S6 1 1 1 1 1 1
1 1 1 1 0 0 11 1 1 1 0 0
x2 þ y2, z2
1
Rz
(Rx, Ry) (zx, yz)
sd
h¼8
1 1 1 1 0
x þ y , z2 Rz x2 y2 xy (x, y) (zx, yz) (Rx, Ry) 2
2
3sd
h ¼ 12
1 1 0 1 1 0
x þy , z Rz (Rx, Ry)(x2 y2, xy)(zx, yz) 2
z (x, y)
2
2
h ¼ 24
3sv
h¼1
2Sf
1
D3d ¼ 3m
SY MM ETR Y
x þ y2, z2 Rz 2
(Rx, Ry) (zx yz) (x2 y2, xy) z
(x, y)
Table 7.6 D4d
(Continued ) I
A1 A2 B1 B2 E1 E2 E3
2S8
1 1 1 1 2 2 2
2S82
2C4
1 1 1 1 21/2 0 21/2
1 1 1 1 0 2 0
4C20
C2
1 1 1 1 21/2 0 21/2
1 1 1 1 2 2 2
1 1 1 1 0 0 0
m Td ¼ 43
I
8C3
3C2
6S4
6sd
A1 A2 E T1 T2
1 1 2 3 3
1 1 1 0 0
1 1 0 1 1
1 1 0 1 1
1 1 0 1 1
6C2
6C4
3C2
i
6S4
8S6
3sh
6sd
A1g A2g Eg T1g T2g A1u A2u Eu T1u T2u
1 1 2 3 3 1 1 2 3 3
1 1 1 0 0 1 1 1 0 0
1 1 0 1 1 1 1 0 1 1
1 1 0 1 1 1 1 0 1 1
1 1 2 1 1 1 1 2 1 1
1 1 2 3 3 1 1 2 3 3
1 1 0 1 1 1 1 0 1 1
1 1 1 0 0 1 1 1 0 0
1 1 2 1 1 1 1 2 1 1
1 1 0 1 1 1 1 0 1 1
12C25
20C3
15C2
A T1 T2 G H
1 3 3 4 5
1 (1/2)(1 þ 51/2) (1/2)(1 51/2) 1 0
1 (1/2)(1 51/2) (1/2)(1 þ 51/2) 1 0
1 0 0 1 1
1 1 1 0 1
a
x þ y2, z2 Rz z (x, y) (x2 y2, xy) (Rx Ry) (zx yz)
(2z2 x2 y2, x2 y2) (Rx Ry Rz) (x, y, z) (xy, yz, zx)
8C3
12C5
1 1 1 1 0 0 0
x2 þ y2 þ z2
I
I
h ¼ 16 2
h ¼ 24
Oh ¼ m3m
I
4sd
h ¼ 48 x2 þ y2 þ z2 (2z2 x2 y2, x2 y2) (Rx, Ry, Rz) (zx, yz, xy)
(x, y, z)
h ¼ 48 z þ y2þ z2 (x, y, z) (Rx, Ry, Rz) (2z2 x2 y2, x2 y2, xy, yz, zx)
a The noncrystallographic point groups are identified by an asterisk (*). As before, h = order of the group [1].
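The identity E (written I in the character tables) leaves the AOs untouched, while C2 turns 2px(OA) into −2px(OB) and 2px(N) into −2px(N); the plane σv(xz), which contains the C2 axis and is normal to the molecular plane, turns 2px(OA) into 2px(OB), and so forth. By looking at Fig. 7.7, one can construct the 3 × 3 matrices representing the symmetry operations E, C2, σv(xz), and σv'(yz). The first two are

$$R_1(E) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad R_2(C_2) = \begin{pmatrix} -1 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & -1 & 0 \end{pmatrix},$$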
402
Table 7.7
7
Abbreviated Character Tables for the 32 Crystallographic Point Groups [2]
C1 ¼ 1
I h¼1
A
1
Ci ¼ 2
C2 ¼ 2
Ag Aux, y, z
C2h ¼ 2/m
Ag Bg Au; z Ag Bux, y
A; z B; x,y
C2v ¼ 2mm A1; z B2; y A2 A1; z B1; x
Cs ¼ m
: : :
I I I
i C2 sh
A0 ; x, y A00 ; z
: :
1 1
1 1
D2 ¼ V ¼ 222
: : :
I I I
C2 C2 C2(x)
sh sv C2(y)
i sv0 C2(z)
A1 B3; x B1; z A1 B2; y
: : : : :
1 1 1 1 1
1 1 1 1 1
1 1 1 1 1
1 1 1 1 1
C4 ¼ 4
I
:
A; z B Eu; x iy
C4
C24
C34 S34 1 1 i i
S4 ¼ 4
:
I
S4
S24
A B; z E; x iy
: : : :
1 1 1 1
1 1 i i
1 1 1 1
C3 ¼ 3
:
I
C3
C23 _[e exp(2pi/3)]
A; z E; x iy
: : :
1 1 1
1 e e2
1 e2 e
C3v ¼ 3m A1; z A2 E; x, y
:
I
2C3
3sv
D3 ¼ 32
:
I
2C3
3Cx
A1 A2; x E; x, y
: : :
1 1 2
1 1 1
1 1 0
Table 7.7
(Continued )
C6 ¼ 6
:
I
C6
C26
C36
C46
C56 _[o exp(2pi/6)]
A; z B E1
: : : : : :
1 1 1 1 1 1
1 1 o2 o o o2
1 1 o o2 o2 o
1 1 1 1 1 1
1 1 o2 o o o2
1 1 o o2 o2 o2
E2; x iy
C4v ¼ 4mm D4 ¼ 422 D2d A1; z A2 B1 B2 E; x, y
A1 A2; z B1 B2 E; x, y
¼ 4m2
A1 A2 B1 B2; z E; x, y
C24
2C4
2sv
2sv0
:
I
C24
2C4
2C2
2C20
:
I
C2
2S4
2C2
2sd
: : : : :
1 1 1 1 2
1 1 1 1 2
1 1 1 1 0
1 1 1 1 0
1 1 1 1 0
2C36
2C26
2C6
3C2
2C20
:
I
3sv
3sv0
:
I
sh
2C26 2S62
2C6
D3h ¼ 6m2
C36
2S6
3C2
3sv
A10 A20 A100 A200 ; z E0 ; x, y E00
: : : : : :
1 1 1 1 2 2
1 1 1 1 2 2
1 1 1 1 1 1
1 1 1 1 1 1
1 1 1 1 0 0
1 1 1 1 0 0
: C6v ¼ 6mm A1; z A2 B2 B1 E1 E2; x, y
I
I
D6 ¼ 622
A1 A2; z B1 B2 E2 E1; x, y
:
T ¼ 23
:
I
3C2
4C3
4C23
A E
: : : :
1 1 1 3
1 1 1 1
1 e e2 0
1 e2 e 0
F; x, y, z
O ¼ 432 A1 A2 E F2 F1; x, y, z
m Td ¼ 43
: :
I I
C42 C2
2C4 2S4
2sv 2C2
2sv0 2sd
A1 A2 E F2; x, y, z F1
: : : : :
1 1 2 3 3
1 1 1 0 0
1 1 2 1 1
1 1 0 1 1
1 1 0 1 1
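[Figure 7.7: the NO2 molecule with axes x, y, z, the planes σv and σv', and the 2px orbitals on N, OA, and OB.]
FIGURE 7.7 The NO2 molecule, which belongs to the point group C2v = 2mm. The C2 axis is along z, which points up; the 2px orbitals centered on N and on the oxygen atoms OA and OB are shown. The relative phases of the 2px orbitals are indicated by + and − signs inside a square box.

With σv(xz) taken as the plane that exchanges the two oxygen atoms and σv'(yz) as the molecular plane, the remaining two matrices are

$$R_3(\sigma_v) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \qquad R_4(\sigma_v') = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix} \tag{7.3.2}$$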
The sum of the diagonal elements, also called the trace, or the “character,” of the four matrices for E, C2, σv(xz), and σv'(yz) is 3, −1, 1, and −3, respectively. The matrices are not all diagonal, but are block-diagonal, and this representation is therefore called reducible. The first row and first column will at most change the sign of 2px(N), but not mix 2px(N) with the orbitals on the two oxygen atoms. However, C2 and σv(xz) interchange 2px(OA) with 2px(OB), which is why the block of the last two rows and columns is not diagonal for these two operations. If we rewrite the basis given above into a symmetry-adapted basis:

$$\begin{pmatrix} 2p_x(\mathrm{N}) \\ 2p_x(\mathrm{O_A}) + 2p_x(\mathrm{O_B}) \\ 2p_x(\mathrm{O_A}) - 2p_x(\mathrm{O_B}) \end{pmatrix} \tag{7.3.3}$$
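then the representation becomes diagonal (σv = σv(xz) exchanges the two oxygen atoms; σv' = σv'(yz) is the molecular plane):

$$I_1(E) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad I_2(C_2) = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix},$$

$$I_3(\sigma_v) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix}, \qquad I_4(\sigma_v') = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix} \tag{7.3.4}$$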
which can be described as a direct sum of three one-dimensional irreducible representations: Γ(1) + Γ(1)' + Γ(1)'', where Γ(1) is spanned by the basis function 2px(N), Γ(1)' is spanned by the basis function 2px(OA) + 2px(OB), and Γ(1)'' is spanned by the basis function 2px(OA) − 2px(OB). With the x axis normal to the molecular plane, the first two functions change sign under C2 and under σv'(yz) but not under σv(xz), while the third is unchanged by C2 and changes sign under both mirror planes. These three one-dimensional irreducible representations are therefore labeled B1 for Γ(1), B1 for Γ(1)', and A2 for Γ(1)'', as seen in the character table repeated below. The entries {“x,” “y,” “z”}, {“Rx,” “Ry,” “Rz”}, and {“xy,” “z²,” etc.} apply to linear functions, to rotations, and to quadratic functions, respectively, that “belong to” (transform like) that particular representation.

C2v = 2mm (h = 4)      I    C2    σv(xz)   σv'(yz)
A1                     1    1     1        1        z, x², y², z²
A2                     1    1     −1       −1       Rz, xy
B1                     1    −1    1        −1       x, Ry, zx
B2                     1    −1    −1       1        y, Rx, yz
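The decomposition of the 2px representation of NO2 can be checked with the standard reduction formula, n_i = (1/h) Σ_R χ(R) χ_i(R). The sketch below assumes the orientation described in the text (z along C2, x normal to the molecular plane), so that σv(xz) exchanges the two oxygen atoms and σv'(yz) is the molecular plane; the matrix and function names are illustrative.

```python
# Basis order: 2px(N), 2px(O_A), 2px(O_B).
# Assumed convention: sv(xz) swaps O_A and O_B; sv'(yz) is the molecular plane.
REPRESENTATION = {
    "I":       [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
    "C2":      [[-1, 0, 0], [0, 0, -1], [0, -1, 0]],
    "sv(xz)":  [[1, 0, 0], [0, 0, 1], [0, 1, 0]],
    "sv'(yz)": [[-1, 0, 0], [0, -1, 0], [0, 0, -1]],
}

# C2v character table, columns in the order I, C2, sv(xz), sv'(yz).
C2V_CHARACTERS = {
    "A1": [1, 1, 1, 1],
    "A2": [1, 1, -1, -1],
    "B1": [1, -1, 1, -1],
    "B2": [1, -1, -1, 1],
}

def trace(matrix):
    return sum(matrix[i][i] for i in range(len(matrix)))

def characters(representation):
    """Characters (traces) in the column order of the character table."""
    return [trace(representation[op]) for op in ("I", "C2", "sv(xz)", "sv'(yz)")]

def reduce_representation(chi):
    """Multiplicity of each irreducible representation (every class has one element)."""
    h = len(chi)
    return {name: sum(c * ci for c, ci in zip(chi, row)) // h
            for name, row in C2V_CHARACTERS.items()}

CHI = characters(REPRESENTATION)
MULTIPLICITIES = reduce_representation(CHI)

if __name__ == "__main__":
    print("characters:", CHI)
    print("decomposition:", MULTIPLICITIES)
```

Under these assumptions the three out-of-plane orbitals span two B1 functions and one A2 function; choosing the other labeling of the two mirror planes would interchange B1 and B2 but leave A2 unchanged.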
PROBLEM 7.3.1. Prove that molecules with permanent electric or magnetic dipole moments must belong to one of the following point groups: Cn, Cnv, or Cs [1].
PROBLEM 7.3.2. Prove that molecules can be chiral (= optically active) if and only if they do not have an Sn improper rotation axis (n ≥ 1) [1, p. 435].
7.4 BRAVAIS LATTICES
All planar lattices shown in Fig. 7.2 are “primitive”; that is, the bodies at the origin are reproduced nowhere else within the cell, but repeat only at all unit cell corners. If centering is now allowed in addition to whole-cell translations (this centering involves the fractional translations a/2 and/or b/2), then one gets for planar systems the centered rectangular cell, as well as the primitive rectangular cell. The four plane systems of Fig. 7.2 thus yield the five plane Bravais lattices of Fig. 7.8 (primitive oblique, primitive rectangular, centered rectangular, primitive square, and primitive hexagonal).
PROBLEM 7.4.1. Prove that a centered square lattice of side a is not unique (Fig. 7.9).
Figure 7.10 shows the 14 three-dimensional Bravais lattices available for monoatomic solids. This means that, for a crystal consisting of only one kind of atom, there are only 14 ways in which this crystal can fill space. In practice, the stable crystal structures of the chemical elements choose only a few of these lattices.
[Figure 7.8: five unit cells, panels (a) to (e), with cell edges a and b; |a| = |b| in panels (d) and (e).]
FIGURE 7.8 The five planar Bravais lattices: (a) primitive oblique (parallelogram), (b) primitive rectangular, (c) centered rectangular, (d) primitive square, (e) primitive hexagonal.
Problem 7.4.2 and Fig. 7.11 explain simply why only certain proper rotations will “tile” two-dimensional space.
PROBLEM 7.4.2. Show that in a two-dimensional lattice (Fig. 7.11), apart from the trivial rotation n = 1 (θ = 360°), only rotations by n = 2 (θ = 180°), n = 3 (θ = 120°), n = 4 (θ = 90°), and n = 6 (θ = 60°) are allowed.
PROBLEM 7.4.3. Show why (a) there is no C-centered cubic cell, (b) there is no I-centered monoclinic cell, and (c) there is no face-centered tetragonal cell.
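One common route to the result of Problem 7.4.2 (not necessarily the geometric construction intended by Fig. 7.11) uses the fact that a lattice symmetry, written in a basis of lattice vectors, is an integer matrix, so its trace 2 cos(2π/n) must be an integer. The sketch below simply scans n for that condition; the function name, the upper limit, and the tolerance are assumptions made for illustration.

```python
import math

def allowed_rotation_orders(max_n=12, tol=1e-9):
    """Orders n for which the trace 2*cos(2*pi/n) of a planar rotation is an integer."""
    allowed = []
    for n in range(1, max_n + 1):
        trace = 2.0 * math.cos(2.0 * math.pi / n)
        if abs(trace - round(trace)) < tol:
            allowed.append(n)
    return allowed

ALLOWED = allowed_rotation_orders()

if __name__ == "__main__":
    print("lattice-compatible rotation orders:", ALLOWED)
```

Since |2 cos θ| ≤ 2, the only possible integer traces are −2, −1, 0, 1, and 2, which is why scanning beyond n = 6 turns up nothing new.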
FIGURE 7.9 A centered square plane lattice of side a (two centered unit cells are shown) is equivalent to a smaller primitive square lattice of side √2 |a|/2 = |a|/√2 = 0.70711 |a|, constructed by using half the cell diagonals (thicker lines).
PROBLEM 7.4.4. For a monoatomic cubic crystal consisting of spherical atoms packed as close as possible, given the choices of a simple cubic crystal (SCC: atoms at cell corners only; this structure is rarely used in nature, but is found in α-Po), a body-centered cubic crystal (BCC: atoms at cell corners and at the body center), and a face-centered cubic crystal (FCC: atoms at cell corners and at face centers), show that the density is largest (or the void volume is smallest) for the FCC structure (see Fig. 7.12). In particular, show that the packing density of spheres is (a) 52% in a simple cubic cell; (b) 68% for a body-centered cubic cell; (c) 74% for a face-centered cubic cell.
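The three packing densities can be checked numerically once the contact condition for touching spheres is written down for each cell. In the sketch below (names are illustrative) the cell edge is a = 1, and the sphere radius is set by the direction along which neighbors touch: the cell edge for SCC (r = a/2), the body diagonal for BCC (r = √3 a/4), and the face diagonal for FCC (r = √2 a/4).

```python
import math

# (atoms per conventional cell, sphere radius in units of the cell edge a)
CUBIC_CELLS = {
    "SCC": (1, 1.0 / 2.0),            # spheres touch along the cell edge
    "BCC": (2, math.sqrt(3.0) / 4.0), # spheres touch along the body diagonal
    "FCC": (4, math.sqrt(2.0) / 4.0), # spheres touch along the face diagonal
}

def packing_fraction(atoms_per_cell, radius_over_a):
    """Fraction of the cubic cell volume (a = 1) occupied by hard spheres."""
    sphere_volume = (4.0 / 3.0) * math.pi * radius_over_a ** 3
    return atoms_per_cell * sphere_volume

FRACTIONS = {name: packing_fraction(*cell) for name, cell in CUBIC_CELLS.items()}

if __name__ == "__main__":
    for name, value in FRACTIONS.items():
        print(f"{name}: {100.0 * value:.1f}%")
```

The closed forms behind these numbers are π/6, √3π/8, and π/(3√2), respectively.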
[Figure 7.10: unit-cell drawings with axes X, Y, Z and edges a, b, c for Triclinic 1P; Monoclinic 2P and 2C; Orthorhombic 222P, 222C, 222F, and 222I; Tetragonal 4P and 4I (b = a); Cubic 23P, 23I, and 23F (b = a, c = a); Rhombohedral 3R (b = a, c = a); Hexagonal 3P (γ = 120°).]
FIGURE 7.10 The 14 possible Bravais lattices for crystals of a monoatomic molecule. The full designation shown here bears a numerical prefix, for example, 23F for face-centered cubic. When space groups are generated from the Bravais lattices, then this numerical prefix is dropped (e.g., the 23P cubic Bravais lattice reappears simply as P), because the other numbers or letters that follow the P will identify the space group uniquely.
[Figure 7.11: lattice points A', A, A'' and B, B'' separated by the lattice spacing a, with rotation angle α.]
FIGURE 7.11 Diagram to illustrate the allowed proper rotations in a plane.
FIGURE 7.12 The three cubic lattices: simple cubic (SCC) (rare in nature), body-centered cubic (BCC), and face-centered cubic (FCC = cubic closest-packed = CCP).
[Figure 7.13: layer B placed on layer A, with two arrows. Bottom arrow: interstice left by both layers A and B; if this interstice is filled by a third layer C, then cubic closest-packed stacking (fcc) is formed. Top arrow: interstice filled by an atom in layer A; if a third layer is added to cover this interstice, then ...ABAB... periodicity results and a hexagonally close-packed lattice (hcp) of cannonballs is formed.]
FIGURE 7.13 Close packing of spheres in the CCP and HCP structures.
PROBLEM 7.4.5. The HCP structure constructed by Bravais is shown in Fig. 7.13, by overlaying a layer of atoms B on top of a hexagonally ordered single layer of atoms A. If another layer of atoms A is put on top (occupying the interstice shown by the top arrow), then the HCP structure results. If, instead, a layer C is put on top, by filling the interstice indicated by the bottom arrow, then the CCP = FCC lattice is obtained. Now we must translate this into putting only two atoms into the correct positions in space group P6₃/mmc (No. 194, D6h⁴), Z = 2. This can only be done if the atoms occupy the special positions (1/3, 2/3, 1/4) and (2/3, 1/3, 3/4) (Wyckoff notation 2c). The first atom at (1/3, 2/3, 1/4) will generate layer A, while the second one at (2/3, 1/3, 3/4) will generate layer B. For convenience, let us translate both atoms by −(1/3, 2/3, 1/4), adding whole lattice translations where needed, so that the layer A atom is moved to (0, 0, 0) and
[Figure 7.14: two layers A with a layer B between them, cell axes a, b, c, and the vector (1/3)a + (2/3)b + (1/2)c.]
FIGURE 7.14 Construction of HCP stacking in a hexagonal Bravais lattice. The atoms in layer B are placed over the centers of the triangles of layer A. The vector from the origin atom in the displaced unit cell at (0,0,0) to the closest atom in layer B at (1/3,2/3,1/2) is (1/3)a + (2/3)b + (1/2)c.
[Figure 7.15: hexagonal cell with axes a, b (|b| = |a|, 120° apart) and c, points P, Q, R, S, T, U, and sphere radius r.]
FIGURE 7.15 Proof of axial ratio in ideal HCP structure.
the layer B atom is moved to (1/3, 2/3, 1/2). In Fig. 7.14, the sides of the displaced unit cell are indicated by the thicker lines. (a) Show that in the “ideal” HCP structure the axial ratio c/a should be (8/3)^(1/2) = 1.63299 (Fig. 7.15). [Most elements that have an HCP structure have axial ratios between 1.57 and 1.63, while Cd and Zn have c/a = 1.86 (because the fiction that atoms are rigid spheres is just that: fiction!)]. (b) Show also that the packing density for the hexagonal close-packed (HCP) structure is 74%.
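Both parts of Problem 7.4.5 can be checked numerically. The sketch below assumes touching hard spheres with a = 2r: a layer B atom sits above the centroid of a triangle of layer A atoms, at a horizontal distance a/√3 from each of them, so the layer spacing is h = (a² − a²/3)^(1/2) and c = 2h. The function names are illustrative.

```python
import math

def ideal_hcp_axial_ratio():
    """c/a for touching hard spheres: c is twice the spacing between layers A and B."""
    a = 1.0
    centroid_distance = a / math.sqrt(3.0)               # triangle vertex to centroid
    layer_spacing = math.sqrt(a ** 2 - centroid_distance ** 2)
    return 2.0 * layer_spacing / a

def hcp_packing_fraction(c_over_a):
    """Two spheres of radius a/2 in a hexagonal cell of volume a*a*c*sin(120 deg)."""
    a = 1.0
    r = a / 2.0
    cell_volume = a * a * (c_over_a * a) * math.sin(math.radians(120.0))
    return 2.0 * (4.0 / 3.0) * math.pi * r ** 3 / cell_volume

C_OVER_A = ideal_hcp_axial_ratio()
HCP_FRACTION = hcp_packing_fraction(C_OVER_A)

if __name__ == "__main__":
    print(f"ideal c/a = {C_OVER_A:.5f}")
    print(f"HCP packing fraction = {100.0 * HCP_FRACTION:.1f}%")
```

The packing fraction comes out equal to π/(3√2), the same value as for FCC, as expected for two stackings of identical close-packed layers.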
7.5 THE 32 CRYSTALLOGRAPHIC POINT GROUPS
The 32 crystallographic point groups, first mentioned in Table 7.1, are now described in Table 7.8 (ordered by principal symmetry axes and also by the crystal system to which they belong). The 230 space groups of Schönflies and Fedorov were generated systematically by combining the 14 Bravais lattices with the intra-unit cell symmetry operations for the 32 crystallographic point
Table 7.8 The 32 Crystallographic Point Groups, Listed by Main Symmetry Axes or Plane, Using Both the Schoenflies Notation (S, e.g., C2v) and the Hermann–Mauguin or International Notation (HM, e.g., mm2)a 1 or m S
HM
sys
C1
1
tc:1
Ci Cs
1 m
tc:2 m:2
2 S C2 C2h D2 C2v D2h D2d
Principal Symmetry Axes (or Planes) and Crystal System 3 4 HM sys S HM sys S HM sys 2 2/m 222 2mm mmm 42m
m:1 m:3 o:1 o:2 o:3 te:4
C3 C3h D3 C3v D3h D3d T Th O Td Oh
3 6 32 3m 62m 3m 23 m3 432 43m m3m
Crystal System
Essential Symmetry
Triclinic Monoclinic Orthorhombic Tetragonal Trigonal Hexagonal Cubic
1 or 1 2 or m 222 or mm21 4 or 4 3 or 3 6 or 6 23
tg:1 h:2 tg:3 tg:4 h:6 tg:5 c:1 c:2 c:3 c:4 c:5
C4 C4h D4 C4v D4h S4
4 4/m 422 4mm 4/mmm 4/m
tg:1 tg:3 tg:4 tg:5 tg:7 tg:2
6 S
HM
sys
C6 C6h D6 C6v D6h S6 ¼ C3i
6 6/m 622 6mm 6/mmm i
h:1 h:3 h:4 h:5 h:7 tg:2
Point Groups 1,1 2, m, 2/m 222, mm2, mmm 4, 422, 4, 4/m, 4mm, 42m, 4/mmm 3, 3, 321, 32, 3m 6, 622, 6, 6/m, 6mm, 62m, 6/mmm 23, m3, 432, 43m, m3m
a An identifier also shows to which of the seven crystal systems the associated space groups belong (tc = triclinic, m = monoclinic, o = orthorhombic, te = tetragonal, tg = trigonal, h = hexagonal, c = cubic), followed by an integer (to indicate ordering within each system); for example, c:3 means cubic, third group. The 11 centrosymmetric point groups are in boldface (these are known as the 11 Laue symmetries; the X-ray beam adds a centering operator to the problem).
groups that can tile three-dimensional space within each unit cell; this work can be reconstructed laboriously [3]. A few words about notation: The Schoenflies notation beloved by chemists is being replaced by the Hermann–Mauguin, or international symbol, which identifies better the important symmetry elements of each space group (but loses the reminder about the point group parentage used in the Schoenflies notation). The Soviet literature used the Shubnikov¹³ system, which is similar to the Hermann–Mauguin system. Molecules with an internal symmetry that is not one of these 32 crystallographic point groups can still form a crystal, albeit with a symmetry lower than that of the molecule. For instance, the C60 molecule (buckminsterfullerene), a truncated icosahedron, belongs to the icosahedral point group Ih, while crystals of C60 at room temperature belong to space group Pa3̄, which is derived from the much less symmetrical point group Th (see below); thus there is some rotation of the C60 molecule at room temperature, which makes it a plastic crystal.
13 Aleksei Vasilievich Shubnikov (1887–1970).
Also, quasi-crystals of apparent macroscopic fivefold symmetry do exist, despite the topological restriction that an object with a fivefold axis cannot tile two-dimensional space; these funny quasi-crystals do have voids in them, which “fill” the remainder of the space needed. Furthermore, there are many more “magnetic point groups” than 32; there are other, more specialized point groups; one can also go to more than three dimensions, and so on.
7.6 THE 17 PLANE GROUPS
Figure 7.16 shows the 17 plane groups with all their symmetry operations.
7.7 THE 230 CRYSTALLOGRAPHIC SPACE GROUPS
Table 7.9 lists the 230 space groups [4]. The orientation of the coordinate axes is chosen, by convention, differently in different crystal systems (e.g., the unique axis is b in monoclinic crystals, but c in tetragonal, hexagonal, and trigonal systems). This means that in a descent-of-symmetry analysis, where a chemical distortion lowers the symmetry from one space group to the other, one often has to contend with a rotation of the axes and/or a translation of the origin; this introduces 4 × 4 transformation matrices (discussed in the next section). The space group designations are formed by first writing the Bravais lattice type (Fig. 7.10) and then writing the Hermann–Mauguin symmetry indicators; the numerical prefix of the Bravais lattice type (e.g., 4 in 4P, 23 in 23F, or 3 in 3R) is dropped, since whatever follows repeats the information. The number of asymmetric units per unit cell, Z, is only for atoms in general positions; an integer submultiple of Z will apply to the atoms that lie on symmetry axes or planes. For instance, NaCl belongs to the cubic space group Fm3̄m (No. 225), with Z = 192 for the general position, but there are only 8 ions in the unit cell (one twenty-fourth of that number): 4 Na+ cations, at the “special” position (0,0,0) plus its three other symmetry-related positions (due to face-centering), and 4 Cl− anions at the “special” position (1/2,1/2,1/2) plus its three symmetry-related positions (due to face-centering). The monoclinic crystals now are listed with the b axis as the unique axis, but prior to 1940, another popular “setting” used c as the unique axis. Of the 230 space groups, 7 have two choices of unit cell, a primitive rhombohedral one (R) and, for convenience, a nonprimitive hexagonal one (H), with three times the volume of the rhombohedral cell. The 3 × 3 transformation matrices from rhombohedral (obverse, or positive, or direct) axes aR, bR, cR to hexagonal axes aH, bH, cH and vice versa are shown in the caption to Fig. 7.17.
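The count of ions on special positions in NaCl can be illustrated by applying only the face-centering translations to the two sites quoted above. This is a minimal sketch, not a full space-group generator: it uses exact fractions, reduces coordinates modulo 1, and all names are illustrative.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
ZERO = Fraction(0)

# The four translations of an F-centered cell, in fractional coordinates.
F_CENTERING = [
    (ZERO, ZERO, ZERO),
    (ZERO, HALF, HALF),
    (HALF, ZERO, HALF),
    (HALF, HALF, ZERO),
]

def orbit_under_centering(site):
    """Distinct positions generated from one site by the F-centering translations."""
    images = set()
    for shift in F_CENTERING:
        images.add(tuple((s + t) % 1 for s, t in zip(site, shift)))
    return sorted(images)

NA_SITES = orbit_under_centering((ZERO, ZERO, ZERO))
CL_SITES = orbit_under_centering((HALF, HALF, HALF))

if __name__ == "__main__":
    print("Na+ sites:", [tuple(str(c) for c in p) for p in NA_SITES])
    print("Cl- sites:", [tuple(str(c) for c in p) for p in CL_SITES])
    print("ions per cell:", len(NA_SITES) + len(CL_SITES))
```

Both sites sit on points of full cubic site symmetry, so the remaining point-group operations map each of them onto itself or onto a centering-related copy and add no further positions.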
Another funny problem is how to view the relationship between trigonal, hexagonal, and rhombohedral space groups. For an interesting discussion see Ashcroft and Mermin [5 p. 125 footnote]. Figure 7.18 shows the information provided by the International Tables for X-Ray Crystallography, Volume I (and also the newer Volume A) for that most frequent and popular of space groups for organic crystals, P21/c. The top line in Fig. 7.18 identifies the crystal system (monoclinic), the full international symbol (P 1 21/c 1), which says that the lattice is primitive (P), that along the
[Figure 7.16: symmetry diagrams of the 17 plane groups. For each group the figure gives the number, the full symbol (short symbol in parentheses), and the general equivalent positions:]
OBLIQUE
#1 p1: [x,y]
#2 p211 (p2): [x,y; -x,-y]
RECTANGULAR
#3 p1m1 (pm): [x,y; -x,y]
#4 p1g1 (pg): [x,y; -x,0.5+y]
#5 c1m1 (cm): [x,y; -x,y; 0.5+x,0.5+y; 0.5-x,0.5+y]
#6 p2mm (pmm): [x,y; -x,y; -x,-y; x,-y]
#7 p2mg (pmg): [x,y; -x,-y; 0.5+x,-y; 0.5-x,y]
#8 p2gg (pgg): [x,y; -x,-y; 0.5+x,0.5-y; 0.5-x,0.5+y]
#9 c2mm (cmm): [(x,y; -x,y; -x,-y; x,-y) + (0,0; 1/2,1/2)]
SQUARE
#10 p4: [x,y; -y,x; -x,-y; y,-x]
#11 p4mm (p4m): [x,y; -y,x; -x,-y; y,-x; y,x; x,-y; -y,-x; -x,y]
#12 p4gm (p4g): [x,y; -y,x; -x,-y; y,-x; 4 others]
HEXAGONAL
#13 p3: [x,y; -y,x-y; y-x,-x]
#14 p3m1: [x,y; -y,x-y; y-x,-x; -y,-x; x,x-y; y-x,y]
#15 p31m: [x,y; -y,x-y; y-x,-x; y,x; -x,y-x; x-y,-y]
#16 p6: [x,y; -y,x-y; y-x,-x; -x,-y; y,y-x; x-y,x]
#17 p6m: [x,y; -y,x-y; y-x,-x; y,x; -x,y-x; x-y,-y; -x,-y; y,y-x; x-y,x; -y,-x; x,x-y; y-x,y]
FIGURE 7.16 The 17 plane groups [4].
Table 7.9 List of the 230 Crystallographic Space Groupsᵃ

No.  Space Group  Long (HM)     Old (S)  Point Group  i?  H?  Z  Unique Axis  Alternate Settings
Triclinic
1    P1           P 1           C1^1     C1           N   Y   1  -            -
2    P-1          P -1          Ci^1     Ci           Y   N   2  -            -
Monoclinic
3    P2           P 1 2 1       C2^1     C2           N   Y   2  b            c
4    P21          P 1 21 1      C2^2     C2           N   Y   2  b            c
5    C2           C 1 2 1       C2^3     C2           N   Y   4  b            c: B2
6    Pm           P 1 m 1       Cs^1     Cs           N   N   2  b            c
7    Pc           P 1 c 1       Cs^2     Cs           N   N   2  b            c: Pb
8    Cm           C 1 m 1       Cs^3     Cs           N   N   4  b            c: Bm
9    Cc           C 1 c 1       Cs^4     Cs           N   N   4  b            c: Bb
10   P2/m         P 1 2/m 1     C2h^1    C2h          Y   N   4  b            c
11   P21/m        P 1 21/m 1    C2h^2    C2h          Y   N   4  b            c
12   C2/m         C 1 2/m 1     C2h^3    C2h          Y   N   8  b            c: B2/m
13   P2/c         P 1 2/c 1     C2h^4    C2h          Y   N   4  b            c: P2/b
14   P21/c        P 1 21/c 1    C2h^5    C2h          Y   N   4  b            c: P21/b; P21/n
15   C2/c         C 1 2/c 1     C2h^6    C2h          Y   N   8  b            c: B2/b
Orthorhombic 16 P222 17 P2221 18 P21212 19 P212121 20 C2221 21 C222 22 F222 23 I222 24 I212121 25 Pmm2 26 Pmc21
P222 P 2 2 21 P 21 21 2 P 21 21 21 C 2 2 21 C222 F222 I222 I 21 21 21 Pmm2 P m c 21
D12 D22 D23 D24 D25 D26 D27 D28 D29 1 C2v 2 C2v
D2 D2 D2 D2 D2 D2 D2 D2 D2 C2v C2v
N N N N N N N N N N N
Y Y Y Y Y Y Y Y Y N N
4 4 4 4 8 8 16 8 8 4 4
— c c — c — — — — c c
5 5:P2122, P2212 5:P21212
27 28
Pcc2 Pma2
Pcc2 Pma2
3 C2v 4 C2v
C2v C2v
N N
N N
4 4
c c
29
Pca21
P c a 21
5 C2v
C2v
N
N
4
c
30
Pnc2
Pnc2
6 C2v
C2v
N
N
4
c
31
Pmn21
P m n 21
7 C2v
C2v
N
N
4
c
32 33
Pba2 Pna21
Pba2 P n a 21
8 C2v 9 C2v
C2v C2v
N N
N N
4 4
c c
34 35 36
Pnn2 Cmm2 Cmc21
P n n 2 Cmm2 C m c 21
10 C2v 11 C2v 12 C2v
C2v C2v C2v
N N N
N N N
4 8 8
c c c
5:C2122, C2212 5:A222,B222 5 5 5 5:P2mm,Pm2m 5:P21ma,Pb21mc, Pm21b,Pcm21, P21am 5:P2aa,Pb2b 5:P2mb,Pc2m,Pm2a, Pbm2, P2cm 5:P21ab,Pc21b, Pbc21,P21ca 5:P2na,Pb2n,Pn2b, Pcn2, P2an 5:P21mn,Pn21n, Pnm21,P21nm 5:P2cb,Pc2a 5:P21nb,Pc21n, Pn21a, Pbn21, P21cn 5:P2nn,Pn2n 5:A2mm,Bm2m 5:A21ma,Bb21m, Bm21b,Ccm21, A21am (continued )
Table 7.9 (Continued ) No.
Space Group
Long (HM)
Old (S)
Point Group
i?
H?
Z
Unique Axis
37 38
Ccc2 Amm2
Ccc2 Amm2
13 C2v 14 C2v
C2v C2v
N N
N N
8 8
c c
39
Abm2
Abm2
15 C2v
C2v
N
N
8
c
40
Ama2
Ama2
16 C2v
C2v
N
N
8
c
41
Aba2
Aba2
17 C2v
C2v
N
N
8
c
42 43 44 45 46
Fmm2 Fdd2 Imm2 Iba2 Ima2
Fmm2 Fdd2 Imm2 Iba2 Ima2
18 C2v 19 C2v 20 C2v 21 C2v 22 C2v
C2v C2v C2v C2v C2v
N N N N N
N N N N N
16 16 8 8 8
c c c c c
47 48 49 50 51
Pmmm Pnnn Pccm Pban Pmma
P P P P P
1 D2h 2 D2h 3 D2h 4 D2h 5 D2h
D2h D2h D2h D2h D2h
Y Y Y Y Y
N N N N N
8 8 8 8 8
c c c c c
52
Pnna
P 2/n 21/n 2/a
6 D2h
D2h
Y
N
8
c
53
Pmna
P 2/m 2/n 21/a
7 D2h
D2h
Y
N
8
c
54
Pcca
P 21/c 2/c 2/a
8 D2h
D2h
Y
N
8
c
55 56 57
Pbam Pccn Pbcm
P 21/b 21/a 2/m P 21/c 21/c 2/n P 2/b 21/c 21/m
9 D2h 10 D2h 11 D2h
D2h D2h D2h
Y Y Y
N N N
8 8 8
c c c
58 59 60
Pnnm Pmmn Pbcn
P 21/n 21/n 2/m P 21/m 21/m 2/n P 21/b 2/c 21/n
12 D2h 13 D2h 14 D2h
D2h D2h D2h
Y Y Y
N N N
8 8 8
— c c
61 62
Pbca Pnma
P 21/b 21/c 21/a P 21/n 21/m 21/a
15 D2h 16 D2h
D2h D2h
Y Y
N N
8 8
c c
63
Cmcm
C 2/m 2/c 21/m
17 D2h
D2h
Y
N
16
c
64
Cmca
C 2/m 2/c 21/a
18 D2h
D2h
Y
N
16
c
65 66 67
Cmmm Cccm Cmma
C 2/m 2/m 2/m C 2/c 2/c 2/m C 2/m 2/m 2/a
19 D2h 20 D2h 21 D2h
D2h D2h D2h
Y Y Y
N N N
16 16 16
c c c
68
Ccca
C 2/c 2/c 2/a
22 D2h
D2h
Y
N
16
c
69
Fmmm
F 2/m 2/m 2/m
23 D2h
D2h
Y
N
32
c
2/m 2/m 2/m 2/n 2/n 2/n 2/c 2/c 2/m 2/b 2/a 2/n 21/m 2/m 2/a
Alternate Settings 5:A2aa,Bb2b 5:B2mm,Cm2m, Am2m,Bmm2, C2mm 5:B2cm,Cm2a,Ac2m, Bma2,C2mb 5:Cc2m,Am2a,Bm2, C2cm 5:B2cb,Cc2a,Ac2a, Bba2,C2cb 5:F2mm,Fm2m 5:F2dd,Fd2d 5:Im2m,I2mm 5:I2cb,Ic2a 5:I2mb,Ic2m, Im2a, Ibm2,I2cm — — 5:Pmaa,Pbmb 5:Pncb,Pcna 5:Pbmm,Pmcm, Pmam,Pmmb, Pcmm 5:Pbnn,Pncn,Pnan, Pnab,Pcnn 5:Pbmn,Pncm,Pman, Pnmb, Pcnm 5:Pbaa,Pbcb,Pbab, Pccb,Pcaa 5:Pmcb,Pcma,Pmcb 5:Pnaa,Pbnb 5:Pmca,Pbma,Pcmb, Pcam, Pmab 5:Pnmm,Pnmm 5:Pnca,Pbna,Pcnb, Pcan,Pnab 5:Pcab 5:Pbnm,Pmcn,Pnam, Pmnb, Pcma 5:Amma,Bbmm, Bmmb,Ccmm, Amam 5:Abma,Bbcm,Bmab, Ccmb,Acam 5:Ammm,Bmmm 5:Amaa,Bbmb 5:Abmm,Bmcm, Bmam,Cmmb, Acmm 5:Abaa,Bbcb,Cccb, Acaa 5
Table 7.9 (Continued ) No. 70 71 72 73 74
Space Group Fddd Immm Ibam Ibca Imma
Tetragonal 75 P4 76 P41 77 P42 78 P43 79 I4 80 I41 81 P4 82 I4 83 P4/m 84 P42/m 85 P4/n 86 P42/n 87 I4/m 88 I41/a 89 P422 90 P4212 91 P4122 92 P41212 93 P4222 94 P42212 95 P4322 96 P43212 97 I422 98 I4122 99 P4mm 100 P4bm 101 P42cm 102 P42nm 103 P4cc 104 P4nc 105 P42mc 106 P42bc 107 I4mm 108 I4cm 109 I41md 110 I41cd 111 P42m 112 P42c 113 P421m 114 P421c 115 P4m2 116 P4c2 117 P4b2 118 P4n2
Long (HM)
Old (S)
Point Group
i?
H?
Z
Unique Axis
Alternate Settings
F 2/d 2/d 2/d I 2/m 2/m 2/m I 2/b 2/a 2/m I 2/b 2/c 2/a I 2/m 2/m 2/a
24 D2h 25 D2h 26 D2h 27 D2h 28 D2h
D2h D2h D2h D2h D2h
Y Y Y Y Y
N N N N N
32 16 16 16 16
c c c c c
5 5 5:Imcb,Icma 5:Icab 5:Ibmm,Imcm,Imam, Immb,Icmm
P4 P 41 P 42 P 43 I4 I 41 P 4 I 4 P 4/m P 42/m P 4/n P 42/n I 4/m I 41/a P422 P 4 21 2 P 41 2 2 P 41 21 2 P 42 2 2 P 41 2 2 P 43 2 2 P 43 21 2 I422 I 41 2 2 P4mm P4bm P 42c m P 42n m P4cc P4nc P 42m c P 42b c I4mm I4cm I 41m d I 41c d P 4 2 m P 4 2 c P 4 21m P 4 21c P 4 m 2 P 4 c 2 P 4 b 2 P 4 n 2
C14 C24 C34 C44 C54 C64 S14 S24 1 C4h 2 C4h 3 C4h 3 C4h 5 C4h 6 C4h D14 D24 D34 D44 D54 D64 D74 D84 D94 D10 4 1 C4v 2 C4v 3 C4v 4 C4v 5 C4v 6 C4v 7 C4v 8 C4v 9 C4v 10 C4v 11 C4v 12 C4v 1 D2d 2 D2d 3 D2d 4 D2d 5 D2d 6 D2d 7 D2d 8 D2d
C4 C4 C4 C4 C4 C4 S4 S4 C4h C4h C4h C4h C4h C4h D4 D4 D4 D4 D4 D4 D4 D4 D4 D4 C4v C4v C4v C4v C4v C4v C4v C4v C4v C4v C4v C4v D2d D2d D2d D2d D2d D2d D2d D2d
N N N N N N N N Y Y Y Y Y Y N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N
Y Y Y Y Y Y N N N N N N N N Y Y Y Y Y Y Y Y Y Y N N N N N N N N N N N N N N N N N N N N
4 4 4 4 8 8 4 8 8 8 8 8 16 16 8 8 8 8 8 8 8 8 16 16 8 8 8 8 8 8 8 8 16 16 16 16 8 8 8 8 8 8 8 8
c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c c
C4 C41 C42 C42 F4 F41 C-4 F-4 C4/m C42/m C4/a C42/a F4/m F41/d C422 C4221 C4122 C41221 C4222 C42221 C4322 C43221 F422 F4122 C4mm C4mb C42mc C42mn C4cc C4cn C42cm C42cb F4mm F4mc F41dm F41dc C 4m2 C 4c2 C 4m21 C 4c21 C 42m C 42c C 42b C 42n (continued )
Table 7.9 (Continued ) No.
Space Group
Long (HM)
Old (S)
Point Group
i?
H?
Z
Unique Axis
119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142
I4m2 I4c2 I42m I42d P4/mmm P4/mcc P4/nbm P4/nnc P4/mbm P4/mnc P4/nmm P4/ncc P42/mmc P42/mcm P42/nbc P42/nnm P42/mbc P42/mnm P42/nmc P42/ncm I4/mmm I4/mcm I41/amd I41/acd
I 4 m 2 I 4 c 2 I 4 2 m I 4 2 d P 4/m 2/m 2/m P 4/m 2/cm 2/c P 4/n 2/b 2/m P 4/n 2/n 2/c P 4/m 21/b 2/m P 4/m 21/n 2/c P 4/m 21/m 2/m P 4/n 21/c 2/c P 42/m 2/m 2/c P 42/m 2/c 2/m P 42/m 2/b 2/c P 42/n 2/n 2/m P 42/m 21/b 2/c P 42/m 21/n 2/m P 42/n 21/m 2/c P 42/n 21/c 2/m I 4/m 2/m 2/m I 4/m 2/c 2/m I 41/a 2/m 2/d I 41/a 2/c 2/d
9 D2d 10 D2d 11 D2d 12 D2d 1 D4h 2 D4h 3 D4h 4 D4h 5 D4h 6 D4h 7 D4h 8 D4h 9 D4h 10 D4h 11 D4h 12 D4h 13 D4h 14 D4h 15 D4h 16 D4h 17 D4h 18 D4h 19 D4h 20 D4h
D2d D2d D2d D2d D4h D4h D4h D4h D4h D4h D4h D4h D4h D4h D4h D4h D4h D4h D4h D4h D4h D4h D4h D4h
N N N N Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y
N N N N N N N N N N N N N N N N N N N N N N N N
16 16 16 16 16 16 16 16 16 16 16 16 16 16 16 16 16 16 16 16 32 32 32 32
c c c c c c c c c c c c c c c c c c c c c c c c
P3 P 31 P 32 R3 P3 R3 P312 P321 P 31 1 2 P 31 2 1 P 32 1 2 P 32 2 1 R32 P3m1 P31m P3c1 P31c R3m R3c P 3 1 m P 3 1 c P 3 m 1 P 3 c 1 R 3 m R 3 2/c
C13 C23 C33 C43 C13i C23i D13 D23 D33 D43 D53 D63 D73 1 C3v 2 C3v 3 C3v 4 C3v 5 C3v 6 C3v 1 D3d 2 D3d 3 D3d 4 D3d 5 D3d 6 D3d
C3 C3 C3 C3 C3i C3i D3 D3 D3 D3 D3 D3 D3 C3v C3v C3v C3v C3v C3v D3d D3d D3d D3d D3d D3d
N N N N Y Y N N N N N N N N N N N N N Y Y Y Y Y Y
Y Y Y Y N N Y Y Y Y Y Y Y N N N N N N N N N N N N
3 3 3 3 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 12 12 12 12 12 12
c c c c c c c c c c c c c c c c c c c c c c c c c
Trigonal 143 P3 144 P31 145 P32 146 R3 147 P3 148 R3 149 P312 150 P321 151 P3112 152 P3121 153 P3212 154 P3221 155 R32 156 P3m1 157 P31m 158 P3c1 159 P31c 160 R3m 161 R3c 162 P31m 163 P31c 164 P3m1 165 P3c1 166 R3m 167 R3c
Alternate Settings F 42m F 42c F 4m2 F 4d2 C4/mmm C4/mcc C4/amb C4/acn C4/mmb C4/mcn C4/amm C4/acc C42/mcm C42/mmc C42/acb C42/amn C42/mcb C42/mmn C42/acm C42/amc F4/mmm F4/mmc F41/ddm F41/ddc
Hexagonal Hexagonal
Hexagonal
Hexagonal Hexagonal
Hexagonal Hexagonal
Table 7.9 (Continued ) No.
Space Group
Long (HM)
Old (S)
Point Group
i?
H?
Z
Unique Axis
6 61 65 62 64 63 6 6/m 63/m 622 61 2 2 65 2 2 62 2 2 64 2 2 63 2 2 6mm 6cc 63c m 63m c 6 m 2 6 c 2 6 2 m 6 2 c 6/m 2/m 2/m 6/m 2/c 2/c 63/m 2/c 2/m 63/m 2/m 2/c
C16 C26 C36 C46 C56 C16 1 C3h 1 C6h 2 C6h D16 D26 D36 D46 D56 D66 1 C6v 2 C6v 3 C6v 4 C6v 1 D3h 2 D3h 3 D3h 4 D3h 1 D6h 2 D6h 3 D6h 4 D6h
C6 C6 C6 C6 C6 C6 C3h C6h C6h D6 D6 D6 D6 D6 D6 C6v C6v C6v C6v D3h D3h D3h D3h D6h D6h D6h D6h
N N N N N N N Y Y N N N N N N N N N N N N N N Y Y Y Y
Y Y Y Y Y Y N N N Y Y Y Y Y Y N N N N N N N N N N N N
6 6 6 6 6 6 6 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 24 24 24 24
c c c c c c c c c c c c c c c c c c c c c c c c c c c
T1 T2 T3 T4 T5 T1h T2h T3h T4h T5h T6h T7h O1 O2 O3 O4 O5 O6 O7 O8 T1d T2d
T T T T T Th Th Th Th Th Th Th O O O O O O O O T T
N N N N N Y Y Y Y Y Y Y N N N N N N N N N N
Y Y Y Y Y N N N N N N N Y Y Y Y Y Y Y Y N N
24 48 24 24 24 24 24 24 96 48 24 24 24 24 96 96 48 24 24 48 24 96
Hexagonal 168 P6 169 P61 170 P65 171 P62 172 P64 173 P63 174 P6 175 P6/m 176 P63/m 177 P622 178 P6122 179 P6522 180 P6222 181 P6422 182 P6322 183 P6mm 184 P6cc 185 P63cm 186 P63mc 187 P6m2 188 P6c2 189 P62m 190 P62c 191 P6/mmm 192 P6/mcc 193 P63/mcm 194 P63/mmc
P P P P P P P P P P P P P P P P P P P P P P P P P P P
Cubic 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216
P23 F23 I23 P 21 3 I 21 3 Pm3 P 2/n 3 F 2/m 3 F 2/d 3 I 2/m 3 P 21/a 3 I 21/a 3 P432 P 42 3 2 F432 F 41 3 2 I432 P 43 3 2 P 41 3 2 I 41 3 2 P 4 3 m F 4 3 2
P23 F23 I23 P213 I213 Pm3 Pn3 Fm3 Fd3 Im3 Pa3 Ia3 P432 P4232 F432 F4132 I432 P4332 P4132 I4132 P43m F432
Alternate Settings
(continued )
Table 7.9 (Continued ) No. 217 218 219 220 221 222 223 224 225 226 227 228 229 230
Space Group I43m P43n F43c I43d Pm3m Pn3n Pm3n Pn3m Fm3m Fm3c Fd3m Fd3c Im3m Ia3d
Long (HM) I -4 3 m P -4 3 n F -4 3 c I -4 3 d P 4/m -3 2/m P 4/n -3 2/n P 42/m -3 2/n P 42/n -3 2/m F 4/m -3 2/m F 4/m -3 2/c F 41/d -3 2/m F 41/d -3 2/c I 4/m -3 2/m I 41/a -3 2/d
Old (S)
Point Group
i?
H?
Z
T3d T4d T5d T6d O1h O2h O3h O4h O5h O6h O7h O8h O9h O10 h
T T T T Oh Oh Oh Oh Oh Oh Oh Oh Oh Oh
N N N N Y Y Y Y Y Y Y Y Y Y
N N N N N N N N N N N N N N
48 24 96 48 48 48 48 48 192 192 192 192 96 96
Unique Axis
Alternate Settings
ᵃ With the numerical order of listing in the International Tables [4], the long (HM: Hermann–Mauguin or international) and old (S: Schönflies) designators, the point group, the existence of an inversion center i, H? = handedness of the space group = chirality (no inversion centers, mirror planes, or glide planes; optically active molecules crystallize only in chiral space groups), Z = number of asymmetric units per unit cell, Unique Axis = choice of unique axis (usually the axis of highest symmetry) = (a, b, or c), and the alternate settings for the space group and unit cell axes (No. = the number of space group designations equivalent to the standard one, if used) [4].
b axis there is a twofold screw axis (21) with translation b/2, and that normal to the b axis there is also a glide plane with glide translation c/2. The next items give the space group number (which is increasingly used as an identifier in X-ray structure papers), the short Hermann–Mauguin or international symbol (P21/c), and the old Schönflies symbol (C2h^5), which helps little: it says only that this is the fifth space group developed from point group C2h.
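FIGURE 7.17 Relationship between the rhombohedral axes aR, bR, cR (obverse, or positive, or direct setting) and the hexagonal axes aH, bH, cH [4, pp. 20–21]. [Drawing not reproduced; it shows aH, bH, and cH with the origin at (0,0,0).]
\[ \mathbf{a}_R = \tfrac{2}{3}\mathbf{a}_H + \tfrac{1}{3}\mathbf{b}_H + \tfrac{1}{3}\mathbf{c}_H, \qquad \mathbf{a}_H = \mathbf{a}_R - \mathbf{b}_R \]
\[ \mathbf{b}_R = -\tfrac{1}{3}\mathbf{a}_H + \tfrac{1}{3}\mathbf{b}_H + \tfrac{1}{3}\mathbf{c}_H, \qquad \mathbf{b}_H = \mathbf{b}_R - \mathbf{c}_R \]
\[ \mathbf{c}_R = -\tfrac{1}{3}\mathbf{a}_H - \tfrac{2}{3}\mathbf{b}_H + \tfrac{1}{3}\mathbf{c}_H, \qquad \mathbf{c}_H = \mathbf{a}_R + \mathbf{b}_R + \mathbf{c}_R \]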
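As a quick consistency check on the obverse-setting relations in the caption to Fig. 7.17, the two 3 × 3 matrices should be mutual inverses, and the determinant of the rhombohedral-to-hexagonal matrix should equal the volume ratio of the two cells. The sketch below uses exact fractions; the names HEX_TO_RHOMB, RHOMB_TO_HEX, matmul, and det3 are illustrative choices, not notation from the International Tables.

```python
from fractions import Fraction as F

# Rows give a_R, b_R, c_R as combinations of (a_H, b_H, c_H), obverse setting.
HEX_TO_RHOMB = [
    [F(2, 3), F(1, 3), F(1, 3)],
    [F(-1, 3), F(1, 3), F(1, 3)],
    [F(-1, 3), F(-2, 3), F(1, 3)],
]
# Rows give a_H, b_H, c_H as combinations of (a_R, b_R, c_R).
RHOMB_TO_HEX = [
    [1, -1, 0],
    [0, 1, -1],
    [1, 1, 1],
]

def matmul(p, q):
    """Product of two 3 x 3 matrices stored as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Substituting the rhombohedral expressions into the hexagonal ones must
# give back a_H, b_H, c_H, that is, the identity matrix.
round_trip = matmul(RHOMB_TO_HEX, HEX_TO_RHOMB)
volume_ratio = det3(RHOMB_TO_HEX)
print(round_trip)
print(volume_ratio)
```

The determinant gives the ratio of the hexagonal cell volume to the rhombohedral cell volume, which is the factor of three quoted in the text.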
FIGURE 7.18 Symmetry operations in space group P21/c (#14). From Henry and Lonsdale [4]. [Diagrams not reproduced: the left-hand projection shows the equivalent positions of the asymmetric unit, labeled with the heights +, -, 1/2+, 1/2-; the right-hand projection shows the symmetry elements.] The tabulated information is:

Monoclinic    2/m    P 1 21/c 1    No. 14    P 21/c    C2h^5
Origin at -1; unique axis b

Number of positions, Wyckoff notation, and point symmetry; co-ordinates of equivalent positions:
4   e   1     x,y,z; -x,-y,-z; -x,1/2+y,1/2-z; x,1/2-y,1/2+z.
2   d   -1    1/2,0,1/2; 1/2,1/2,0.
2   c   -1    0,0,1/2; 0,1/2,0.
2   b   -1    1/2,0,0; 1/2,1/2,1/2.
2   a   -1    0,0,0; 0,1/2,1/2.

Conditions limiting possible reflections:
General: hkl: no conditions; h0l: l=2n; 0k0: k=2n
Special: as above, plus hkl: k+l=2n

Symmetry of special projections:
(001) pgm; a'=a, b'=b
(100) pgg; b'=b, c'=c
(010) p2; c'=c/2, a'=a

FIGURE 7.19 The B1 or halite (NaCl), CsCl, FCC, and BCC structures. [Drawings not reproduced; each cubic cell has edge a.]
The two drawings are projections on the z/c = 0 plane, that is, the (001) plane, of (on the left) the positions of the asymmetric unit and (on the right) the symmetry operators. The convention is that the origin is in the upper left-hand corner of each diagram, the a axis points vertically down, and the b axis points across. The right-hand diagram indicates the symmetry operations: (a) centers of inversion symmetry at 0,0,0; 0,1/2,0; 1,0,0; 1/2,0,0; 0,1,0; 1/2,1/2,0; 1/2,1,0; 1,1/2,0; 1,1,0; (b) glide planes perpendicular to b at y = 1/4 and also at y = 3/4, both with glide translation c/2 (implied); (c) 21 screw axes parallel to b at x = 0, z = 1/4; at x = 1/2, z = 1/4; at x = 1, z = 1/4. Note that the center of symmetry is not specified in the space group symbol. On the left-hand diagram, the general point in the asymmetric unit is indicated by the open circle. When a symmetry operation changes a left-handed object (open circle) into a right-handed object, this is indicated by a circle containing a comma. The +, -, 1/2+, 1/2-, and so on, are fractional coordinates along c (z/c fractional coordinates). The line below indicates where the convention sets the origin (at a center of inversion symmetry -1). The unique axis b is indicated (cell angle β ≠ 90°), since an older setting of monoclinic space groups used the c axis as the unique axis (cell angle γ ≠ 90°). In each cell, there are Z = 4 asymmetric units (of point symmetry 1, that is, no symmetry, and Wyckoff¹⁴ notation e, that is, the general position x, y, z), and the conditions for these general positions are three: (i) there are no restrictions for the general hkl reflection; (ii) for k = 0 reflections (h0l), l must be even for a reflection to be present; this condition is generated by the glide plane; (iii) for h = 0 and l = 0 reflections (0k0), k must be even, that is, k = odd is an "absent" reflection (except for the Renninger¹⁵ double diffraction effect); this condition is generated by the two-fold screw axis.
The general (x, y, z) position (first position) is mapped into the equivalent (−x, −y, −z) position by the center of inversion symmetry at (0,0,0); this second position is of opposite handedness to the first position. The first position is mapped into position (−x, 1/2 + y, 1/2 − z) (third position) by the two-fold screw axis parallel to b at x = 0, z = 1/4, with translation b/2. The first position is mapped into position (x, 1/2 − y, 1/2 + z) (fourth position) by the glide plane perpendicular to b at y = 1/4 with translation c/2 along z. If the object or asymmetric unit has inversion symmetry, then there are four possible unique locations: (1/2, 0, 1/2) (Wyckoff notation d), or (0, 0, 1/2) (Wyckoff notation c), or (1/2, 0, 0) (Wyckoff notation b), or (0, 0, 0) (Wyckoff notation a). If the position (1/2, 0, 1/2) is occupied, then there can be only Z = 2 of these, the other being at (1/2, 1/2, 0). However, these positions will contribute to diffraction spots hkl only if the sum of k and l is even. Finally, a projection of the space group P21/c along (001) has plane group symmetry pgm, while that along (100) has symmetry pgg, and that along (010) has symmetry p2, with a halved c axis. Table 7.10 shows the effect of some symmetry elements in a space group on the coordinates of an atom at the general position (x, y, z) in the unit cell.
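The reflection conditions for the general position can be checked numerically by summing the phase factor exp[2πi(hx + ky + lz)] over the four equivalent positions listed in Fig. 7.18. The following is a minimal sketch: the function names and the atom coordinates are arbitrary choices made here for illustration, and the sum is for a single atom of unit scattering power.

```python
import cmath

def equivalent_positions(x, y, z):
    """General (Wyckoff e) positions of P21/c, b-unique setting, origin at -1."""
    return [
        (x, y, z),
        (-x, -y, -z),
        (-x, 0.5 + y, 0.5 - z),
        (x, 0.5 - y, 0.5 + z),
    ]

def structure_factor(hkl, xyz):
    """Sum of exp[2*pi*i*(hx + ky + lz)] over the four equivalent positions."""
    h, k, l = hkl
    return sum(cmath.exp(2j * cmath.pi * (h * x + k * y + l * z))
               for x, y, z in equivalent_positions(*xyz))

# Arbitrary general position, chosen only for illustration.
atom = (0.123, 0.271, 0.389)

amplitudes = {hkl: abs(structure_factor(hkl, atom))
              for hkl in [(1, 0, 1), (1, 0, 2), (0, 1, 0), (0, 2, 0), (1, 1, 1)]}
for hkl, amp in amplitudes.items():
    print(hkl, "absent" if amp < 1e-9 else "allowed")
```

With these coordinates the h0l reflections with odd l and the 0k0 reflections with odd k vanish, while the other reflections tested do not.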
14 Ralph Walter Graystone Wyckoff, Sr. (1897–1994).
15 Mauritius Renninger (1905–1987).
Table 7.10 Effect of Some Symmetry Operations on the General Coordinate (x, y, z) [6]

Operator                                                                                   Result
Mirror plane "m" ∥ ac, that is, the (010) plane, at y = 1/4                                x, 1/2 − y, z
Glide plane "c" ∥ ac, that is, the (010) plane, with glide translation c/2 at y = 1/4      x, 1/2 − y, 1/2 + z
Net glide "n" ∥ ac, that is, the (010) plane, with glide translation (a + c)/2 at y = 1/4  1/2 + x, 1/2 − y, 1/2 + z
21 screw axis ∥ c at x = 1/2, y = 1/2 with translation c/2                                 1 − x, 1 − y, 1/2 + z

PROBLEM 7.7.1. Using the structure factor phase term exp[2πi(hx + ky + lz)], prove that a two-fold screw axis, parallel to b at x = 0, z = 1/4 with translation b/2, mapping the general point (x, y, z) into the point (−x, y + 1/2, 1/2 − z), causes an extinction for 0 k 0, for k = odd.
PROBLEM 7.7.2. Given that the equation for the observed X-ray diffracted intensity contains the phase factor exp[2πi(hx + ky + lz)], where h, k, and l are the Miller indices and x, y, z are the atom coordinates, and that the h, k, l indices are negative and positive or zero, prove that a two-fold screw axis, parallel to b, at x = 0, z = 1/4, with translation b/2, which maps the general point (x, y, z) into the point (−x, y + 1/2, 1/2 − z), causes an extinction 0 k 0, k = odd.

7.8 LISTING OF ELEMENTS, SIMPLE COMPOUNDS, AND THEIR CRYSTAL STRUCTURES
Table 7.11 lists most chemical elements, several oxides, halides, and chalcogenides, together with the crystal structures of their more usual and stable polymorphs. Figure 7.19 shows a few elementary cubic structures.

16 Eugene Paul Wigner = Pál Jenő Wigner (1902–1995).
17 Frederick Seitz (1911–2008).

7.9 THE WIGNER–SEITZ CELL
Physicists prefer, to the crystallographic unit cell (which can contain several asymmetric units or repeat units), the Wigner¹⁶–Seitz¹⁷ cell. For a primitive lattice it is constructed by taking the direct lattice vectors a, b, c and their negatives −a, −b, −c, bisecting all six of these vectors, and then constructing the planes normal to these vectors at their midpoints; the volume enclosed by these planes is also primitive (contains only one lattice point) and has volume equal to V. For nonprimitive cells, or when there are several asymmetric units in the crystallographic unit cell (Z > 1), the vectors from the origin to the center of the nearest-neighbor repeat unit (e.g., the atom at the center in a BCC lattice, or the atom in the middle of the face in an FCC lattice) are bisected by a plane, which defines part of the surface of the Wigner–Seitz cell. Some Wigner–Seitz cells are shown in Fig. 7.20. This construction also works in dimensions other than 3. The reason for this complication is that the symmetries about the origin
Ac Ag AgBr AgCl AgF AgI Al AlAs a-Al2O3 b-Al2O3 AlP AlSb Ar As Au a-B20 a-BN ¼ hBN b-BN ¼ cBN g-BN Ba BaO BaS BaSe BaTe BaTiO3 BaTiO3 Be BeS BeSe BeTe Bi BiO a-Bi2O3 b-Bi2O3 g-Bi2O3 d-Bi2O3 Br2
Element or Comp.
Table 7.11
A9 B3 B4 BCC B1 B1 B1 B1 disD5 disG5 HCP B3 B3 B3
FCC
FCC FCC B1 B1 B1 B3 FCC B3 D51 D56 B3 B3 FCC
Structure Type Fm3m Fm3m Fm3m Fm3m Fm3m F 43m Fm3m F 43m R 3c P63/mmc F 43m F 43m Fm3m R 3m Fm3m R 3m P63/mmc F 43m P63mc Im3m Fm3m Fm3m Fm3m Fm3m P4/mmm Pm3m P63/mmc F 43m F 43m F 43m R 3m R3m P21/c P421c I23 Fm3m Bmab
Space Group 225 225 225 225 225 216 225 216 167 194 216 216 225 166 225 166 194 216 194 229 225 225 225 225 123 221 194 216 216 216 166 160 14 114 197 225 64
Space Group#
2 4 2 2 4 4 4 4 1 1 2 4 4 4 2 3 4 4 13 2 4
4 4 4 4 4 4 4 4 6 12 4 4 4 2 4
Z
2.5–2.9 3.615 2.55 5.025 5.523 6.39 6.60 6.99 3.9860 4.012 2.2808 4.85 5.07 5.54 4.7364 3.88 5.850 7.758 10.268 5.660 6.67
4.086 5.776 5.556 4.92 6.47 4.042 5.62 4.74943 5.56 5.45 6.13 5.26 4.142 4.0781
a (A)
4.48
8.165
8.72
3.5735
4.0259
120
120
120
g (deg.)
4.17
112.98
b (deg.)
120
57.23
54.12
a (deg.)
6.66
12.96465 22.55
c (A)
9.71 7.510 5.731
b (A)
Crystal Structures of Most Chemical Elements and Some Compounds [5], [7]a
@ 123 K
@ 473 K
Graph.sft Hard Hard
@ 4.2 K
Corundum
Note
C C C60 C60 K3C60 a-Ca g-Ca CaCO3 CaCO3 CaF2 CaO CaS CaSe CaTe CaTiO3(id) CaTiO3(real) Cd CdI2 CdO CdS CdTe b-Ce g-Ce d-Ce CeO Ce2O3 Cl2 b-Co a-Co CoFe2O4 COOK (CHOH)2 COONa. 4H2O Cr CrO2 Cr2O3 Cs CsBr CsCl CsCl CsF
BCC B2 B2 B1 B1
BCC
A18 FCC HCP H11
HCP C6 B1 B3 B3 HCP FCC BCC B1
FCC FCC BCC G1 G2 C1 B1 B1 B1 B1 G5
A4 A9
227 194 202 205 225 225 229 167 62 225 225 225 225 225 221 62 194 164 225 216 216 194 225 229 225 164 138 225 194 227
18 229 136 167 229 221 221 225 225
Fd3m P63/mmc Fm3 Pa3 Fm3m Fm3m Im3m R 3c Pnma Fm3m Fm3m Fm3m Fm3m Fm3m Pm3m Pnma P63/mmc P 3m1 Fm3m F 43m F 43m P63/mmc Fm3m Im3m Fm3m P 3m1 P42/ncm Fm3m P63/mmc Fd3m
P21212 Im3m P42/mnm R 3c Im3m Pm3m Pm3m Fm3m Fm3m
4 2 2 6 2 1 1 4 4
4 4 4 4 2 2 4 4 4 4 4 4 1 4 2 1 4 4 4 2 4 2 4 1 8 4 2 8
8
11.91 2.8845 4.421 4.954 6.08 4.296 4.121 7.08 6.01
3.56696 2.46 14.17 14.04 14.24 5.57 4.477 6.361 5.72 5.462 4.797 5.69 5.91 6.34 (3.825) 5.441 2.9736 4.24 4.689 5.82 6.48 3.65 5.150 4.11 5.089 3.888 8.56 3.5442 2.5074 8.37
14.32
5.96
2.917 13.584
6.20
4.0699
120
120
120
5.381 5.6058 6.835
7.644
6.069 6.12
120
4.94
46.12
120
7.94
6.69
Cs Cl @ 773 K
@173 K
Rochelle Salt
Co ferrite
H1003 K
(continued )
Ideal perovsk. Perovskite
H737 K Calcite Aragonite Fluorite
Diamond Graphite @ 300 K @11 K
CsI Cu CuBr CuCl CuF CuI Cu2O Dy Er Eu EuO F2 a-Fe g-Fe FeO Fe2O3 Fe3O4 a-FeOOH Ga GaAs b-Ga2O3 GaP GaSb Gd Ge GeO2 H2 He Hf HfO2 Hg HgS HgSe HgTe Ho I2 In InAs In2O3
Element or Comp.
B3 B3 HCP A4 C4 HCP HCP HCP C1 A10 B3 B3 B3 HCP A14 A6 B3
A11 B3
BCC FCC B1 D51 H11
HCP HCP BCC B1
B1 FCC B3 B3 B3 B3
Structure Type
Table 7.11 (Continued )
Pm3m Fm3m F 43m F 43m F 43m F 43m P4232 P63/mmc P63/mmc Im3m Fm3m C2/c Im3m Fm3m Fm3m R 3c Fd3m Pbnm Cmca F 43m C2/m F 43m F 43m P63/mmc Fd3m P42/mnm P63/mmc P63/mmc P63/mmc F4m3m R 3m F 43m F 43m F 43m P63/mmc Cmca I4/mmm F 43m R 3c
Space Group 221 225 216 216 216 216 208 194 194 229 225 15 229 225 225 167 227 53 64 216 12 216 216 194 227 136 194 194 194 225 166 216 216 216 194 64 139 216 167
Space Group#
4 4 2 8 2 2 2 2 4 1 4 4 4 2 4 2 4 6
2 4 4 6 8 4 8 4
1 4 4 4 4 4 2 2 2 2 4
Z 4.5667 3.6147 5.69 5.41 4.26 6.04 4.2696 3.578 3.532 4.573 5.1439 5.50 2.86645 3.64 4.31 5.035 8.380 4.596 4.5107 5.6355 12.214 5.45 6.12 3.622 5.64613 4.3975 3.75 3.57 3.20 5.115 2.999 5.85 6.08 6.43 3.557 7.250 3.241 6.04 5.487
a (A)
4.774
70.53
14.510
5.620 9.772 4.936
120
120
120 120 120
120 120
g (deg.)
2.8625 6.12 5.83 5.06
90.0
b (deg.)
120
5.7981
3.0371
a (deg.)
5.748
3.021 4.5157
13.720
7.28
5.648 5.589
c (A)
9.957 7.6448
3.28
b (A)
@ 227 K
@ 4.2 K @2K
Stable
G54 K G1183 K H1183 K Def. oxide Hematite Magnetite Goethite
Cuprite
@ 298 K
Note
InSb InP Ir K KBiO3 KF KBr KCl KI Kr KH2PO4 a-La b-La g-La LaCoO3 LaCrO3 LaCuO3 La2CoO4 La2CuO4 La2CuO4 LaFeO3 LaNiO3 La2NiO4 LaO La2O3 LaSrCuO4 La1.85Sr0.15CuO4 LaTiO3 LaTiO3 (ideal) Li Li LiBr LiCl LiF LiI Lu Mg MgAl2O4 MgFe2O4 MgO MgS MgSe
G5 BCC HCP B1 B1 B1 B1 HCP HCP H11 H11 B1 B1 B1
B1
B1 B1 B1 B1 FCC H22 HCP FCC BCC
B3 B3 FCC BCC
F 43m F 43m Fm3m Im3m Pn3 Fm3m Fm3m Fm3m Fm3m Fm3m I 42d P63/mmc Fm3m Im3m R 3c Pnma R 3c Cmca I4/mmm Cmca Pmna R 3c I4/mmm Fm3m C 3m I4/mmm I4/mmm Pnma Pm3m Im3m P63/mmc Fm3m Fm3m Fm3m Fm3m P63/mmc P63/mmc Fd3m Fd3m Fm3m Fm3m Fm3m 216 216 225 229 201 225 225 225 225 225 122 225 225 225 167 62 167 64 139 64 62 167 139 225 164 139 139 62 221 229 225 225 225 225 225 194 194 227 227 225 225 225
4 4 4 2 12 4 4 4 4 4 4 2 4 4 2 4 4 4 4 4 4 2 2 4 1 2 2 4 1 2 2 4 4 4 4 2 2 8 8 4 4 4
6.48 5.87 3.8389 5.344 10.016 5.344 6.58 6.28 7.07 5.72 7.437 3.770 5.307 4.26 5.3778 5.512 5.431 12.548 3.7873 13.1529 5.553 5.393 3.855 5.144 3.9373 3.765 3.7793 5.601 4.060 3.5087 3.111 5.501 5.14 4.0279 6.00 3.50 3.2028 8.116 8.359 4.205 5.20 5.45 5.4006 7.867
5.3548 5.563
5.590
5.488
5.488
120 120 5.55 5.1998
120
120
60.80
60.85
60.798
120
5.093
13.27 13.2260 7.906
12.652
7.752
5.476
6.945 12.159
(continued )
þFe,Cr,Mn Mg ferrite
@ 77 K
H533 K H1137 K
@ 58 K @ 293 K
@ 291 K
MgSiO3 a-Mn b-Mn MnFe2O4 MnO Mn2O3 MnS (red) MnSe Mo MoO2 b-N2 a-NH4Br b-NH4Br a-NH4Cl b-NH4Cl NH4H2PO4 Na NaBr NaBrO3 NaCl NaClO3 NaCuO3 NaF NaI Nb NbO a-Nd b-Nd NdO NdTiO3 Ne Ni NiAs NiFe2O4 Np Np Np a-O2 b-O2
Element or Comp.
BCC
FCC FCC B8 H11
hex HCP B1
B1 B1 BCC
HCP B1 B2 B1 B2 H22 BCC B1 B1 B1 B1
B3 B3 BCC
A12 A13 H11 B1
Structure Type
Table 7.11 (Continued )
Pmna I 43m P4132 Fd3m Fm3m Ia2 F 43m F 43m Im3m P21/c P63/mmc Fm3m Pm3m Fm3m Pm3m I 42d Im3m Fm3m P213 Fm3m P213 P 1 Fm3m Fm3m Im3m Pm3m P63/mmc Im3m Fm3m Pnma Fm3m Fm3m P63/mmc Fd3m Pnma P4212 Im3m C2/m R 3m
Space Group 62 217 213 227 225 206 216 216 229 14 194 225 221 225 221 122 229 225 198 225 198 2 225 225 229 221 194 194 225 62 229 225 194 227 62 90 229 12 166
Space Group#
3
4 58 20 8 4 16 4 4 2 4 2 4 1 4 1 4 2 4 4 4 4 1 4 4 2 3 2 2 4 4 4 4 2 8 4 4 2
Z 4.933 8.894 6.30 8.419 4.4345 9.408 5.60 5.82 3.150 5.6109 4.039 6.91 4.06 6.547 3.8758 7.479 4.2906 5.973 6.72 5.63874 6.568 2.748 4.628 6.47 3.3008 4.2101 3.6579 4.13 4.994 5.485 4.429 3.52394 3.610 8.357 6.663 4.897 3.52 5.403 3.307
a (A)
3.429
4.723
5.589
6.671
4.8562
6.902
b (A)
5.86 11.256
4.887 3.388
5.028
7.779
11.7992
3.462
7.516
5.6285 6.670
4.780
c (A)
76.24
a (deg.)
132.53
113.41
b (deg.)
120
128.16
120
g (deg.)
Ni ferrite G543 K 543 K G T G 850 K H850 K G23.9K 23.9 K G T G 43.8 K
@ 4.2 K
H1135 K
@ 291 K
@ 45 K @ 523 K @ 291 K @ 523 K G457 K
Mn ferrite
Note
g-O2 Os OsO2 g-P4 (white) P4 (black) P (red) Pa Pb PbO PbO a-PbO2 Pb2O3 Pd a-Po b-Po a-Pr b-Pr PrO PrO2 Pt a-PtO2 b-PtO2 a-Pu b-Pu g-Pu d-Pu e-Pu Rb RbBr a-RbCl b-RbCl RbF RbI Re Rh RhO2 Rn Ru RuO2 a-S8 b-S8 g-S8
FCC HCP C4 A16 Mono Mono
FCC Cubic BCC B1 B2 B1 B1 B1 HCP FCC
BC
HCP BCC B1 C1 FCC
FCC SCC
Tetrahedryl FCC
A17
HCP C4
223 194 136 12 64 139 225 57 129 136 14 225 221 166 194 229 225 225 225 191 58 11 — 70 225 229 225 221 225 225 225 194 225 136 225 194 136 170 14
Pm3n P63/mmc P42/mnm C2/m Cmca I4/mmm Fm3m Pbcm P4/nmm P42/mnm P21/c Fm3m Pm3m R 3m P63/mmc Im3m Fm3m Fm3m Fm3m C6/mmm Pnnm P21/m Monocl. Fddd Fm3m Im3m Fm3m Pm3m Fm3m Fm3m Fm3m P63/mmc Fm3m P42/mnm Fm3m P63/mmc P42/mnm Fddd P21/c 2 2 128 48
2 4 1 4 4 4 2 4 2
2 2 4 4 4 1 2 16 34 8 4
4 4 2 2 4 4 1
8 2 2 4
2.6984 4.4919 10.4434 10.90
6.83 2.7304 4.5003 9.1709 3.31 7.34 3.925 4.9496 4.743 3.96 4.9578 8.466 3.8902 3.345 3.359 3.6725 4.13 5.031 5.36 3.9237 3.08 4.488 6.1835 9.284 3.1587 4.6371 3.6361 5.709 6.85 3.749 6.548 5.64 7.34 2.7553 3.8044 4.4862
12.8401 10.96
4.533 4.8244 10.463 5.7682
5.625
5.876
8.3385 10.48
4.2730 3.1066 24.4367 11.02
3.0884
4.4493
4.19 3.138 10.973 7.859 10.162
98.22 11.835
5.476 5.01 3.3878 7.814
3.238
4.3097 3.1839 5.4336 4.38
96.73
101.81 92.13
124.80
90.31
120
120
120
120
K K K K K
@ 298 K @ 376 K
@ 83 K @ 293 K
@ 294 @ 463 @ 508 @ 593 @ 773
@1073 K
(continued )
White P Black P White-to-violet
H43.8 K
Sb Sb2O3 a-Sb2O4 Sb2O5 Sc Sc Sc2O3 Se a-Se b-Se Si b-SiC a-SiO2(quar) b-SiO2(quar) SiO2(atrid) SiO2(btrid) SiO2(bcrist) Fused SiO2 Sm SmO a-Sn gray b-Sn white SnO SnO2 a-Sr b-Sr g-Sr SrCoO3 SrFeO3 Sr2MnO4 SrO SrS SrSe SrTe SrTiO3 Sr2TiO4 Ta Tb Tc
Element or Comp.
BCC HCP HCP
B1 B1 B1 B1
FCC HCP BCC
B1 A4 A5
C10 C9
A4 B3 C8 C8
A8
FCC HCP
A7
Structure Type
Table 7.11 (Continued )
3 4 8 4 2 2 4 2 2 1 1 2 4 4 4 4 1 2 2 2 2
166 225 227 141 129 136 225 194 229 221 221 139 225 225 225 225 195 139 229 194 194
6 4 4 4 4 2 16 3 32 32 8 4 4 4 4 8
166 56 33 15 225 194 206 152 14 14 227 216 216 152
R-3m Pccn Pna21 C2/c Fm3m P63/mmc Ia3 P3121 P21/n P21/c Fd3m F 43m P3121 P6222 Orthorhombic P63/mmc P213 Tetragonal? R 3m Fm3m Fd3m I41/amd P4/nmm P42/mnm Fm3m P63/mmc Im3m Pm3m Pm3m I4/mmm Fm3m Fm3m Fm3m Fm3m P23 I4/mmm Im3m P63/mmc P63/mmc
Z
194 198
Space Group#
Space Group
8.982 4.943 6.47 5.8197 3.803 4.720 6.073 4.31 4.84 3.8625 3.869 3.787 5.144 6.02 6.23 6.47 3.9051 3.884 3.3026 3.585 2.735
5.03 7.1473
4.4976 4.911 5.456 12.646 4.541 3.302 9.8459 4.3545 9.65 9.31 5.43095 4.357 4.9134 4.9965
a (A)
9.67 8.07
12.464 4.814 4.7820
b (A)
5.662 4.388
12.495
7.05
3.1749 4.838 3.160
8.22
5.4052 5.4546
4.9496 11.61 12.85
5.245
5.412 11.787 5.4247
c (A)
23.31
57.17
a (deg.)
120 120
90.77 93.13
103.91
b (deg.)
120 120
120
120
g (deg.)
Cassiterite @ 298 K @ 521 K @ 887 K
G287 K @298 K
573–870 K @ 298 K m.p.1953 K 1143–1743 K @1573 K
Note
Te a-Th b-Th ThO2 a-Ti b-Ti TiO TiO2 TiO2 TiO2 Ti2O3 Tl Tl TlBa2Ca2Cu3O9 Tl2Ba2CaCu2O8 Tl2Ba2Ca2Cu3O10 Tl2Ba2CuO6 TlBr TlCl TlI Tl2O3 Tm a-U b-U g-U UO2 V V2O3 a-W b-W Xe Y YBa2Cu3O7 Y2Cu2O5 YAlO3 YNiO3 Y2O3 YTiO3 Yb YbO Zn ZnO
FCC B1 HCP B4
BCC A15 FCC HCP
BCC C1 BCC
HCP A20
B2 B2 B2
HCP BCC
C4 C5 C21
A8 FCC BCC C1 HCP BCC
P3121 Fm3m Im3m Fm3m P63/mmc Im3m C2/m P42/mnm I41/amd Pbca R 3c P63/mmc Im3m P4/mmm I4/mmm I4/mmm Ccc2 Pm3m Pm3m Pm3m Ia3 P63/mmc Amam P42nm Im3m Fm3m Im3m R 3c Im3m Pm3n Fm3m P63/mmc Pmmm Pna21 Pnma Pnma Ia3 Pnma Fm3m Fm3m P63/mmc P63mc 154 225 229 225 194 229 12 136 141 61 167 194 229 123 139 139 37 221 221 221 206 194 63 102 229 225 229 167 229 223 225 194 47 33 62 62 206 62 225 225 194 186
3 4 2 4 2 2 10 2 4 8 6 2 2 1 2 2 4 1 1 1 16 2 4 30 2 4 2 6 2 8 4 2 1 4 4 4 16 4 4 4 2 2
4.4467 5.074 4.11 5.59 2.953 3.33 4.142 4.59 3.7842 9.184 5.148 3.4496 3.874 3.853 3.8558 3.8503 23.2382 3.97 3.8340 4.198 10.543 3.523 2.8479 10.52 3.49 5.470 3.0399 5.105 3.16475 5.048 6.25 3.663 3.8240 10.796 5.329 5.516 10.604 5.316 5.479 4.877 2.659 3.24 7.611
5.679
4.935 5.18
5.814 11.6901 12.457 7.370 7.419
14.449
5.564 4.9455 5.57
15.913 29.2596 35.88 5.4684
5.855 2.96 9.5146 5.145 13.636 5.5137
107.53
120 120
120
120
120 C
120
4.729
3.8879 3.494 5.179 5.178
5.8580
5.4727
5.447
9.340
120
5.9149
Zincite
Yttria
Tc ¼ 9 2 K
@ 298 K H973 K @ 88 K
@ 298 K
@ 298 K @ 973 K @1073 K
@ 298 K
(continued )
Tc ¼ 99 K Tc ¼ 125 K
@ 291 K @ 535 K
Rutile Anatase Brookite
@ 298 K @1173K
@298 K @1723K
B3 B4 B3 B3 HCP BCC Mono Tetragonal
Structure Type F 43m P63mc F 43m F 43m P63/mmc Im3m P21/c P42/nmc I41/amd
Space Group 216 194 216 216 194 229 14 137 141
Space Group# 4 2 4 4 2 2 4 2 4
Z 5.423 3.811 5.67 6.09 3.229 3.62 5.145 3.64 6.58
a (A)
5.2075
b (A)
5.3107 5.27 5.93
120
g (deg.)
5.141 99.23
b (deg.) 120
a (deg.)
6.234
c (A)
Zircon
@1113 K
Zn blende Wurtzite
Note
ᵃ "Structure Type" refers to FCC, HCP, and BCC, or otherwise to the structure types defined in the late 1930s by a German compilation of crystal structures called "Strukturbericht": A1 = FCC, A2 = BCC, A3 = HCP, A4 = diamond, A5 = β-Sn, A6 = In, A7 = α-As, A8 = γ-Se, A9 = graphite, A10 = α-Hg, A11 = α-Ga, A12 = α-Mn, A13 = β-Mn, A14 = I2, A15 = Cr3Si, A16 = α-S8, A17 = black P, A20 = α-U, B1 = NaCl = halite, B2 = CsCl = cesium chloride, B3 = ZnS = zinc blende = sphalerite, B4 = ZnO = wurtzite structure, B8 = NiAs, B10 = PbO, B21 = CO, B24 = TlF, B26 = CuO, B33 = CrB, C1 = CaF2 = fluorite, C4 = TiO2 = rutile, C5 = TiO2 = anatase, C8 = SiO2 = high quartz = β-quartz, C9 = SiO2 = high cristobalite = β-cristobalite, C10 = SiO2 = upper high tridymite = β-tridymite, C21 = TiO2 = brookite, C22 = Fe2P, C23 = PbCl2, C24 = HgBr2, C25 = HgCl2, C35 = CaCl2, G1 = CaCO3 = calcite, G2 = CaCO3 = aragonite, D1 = NH3, D51 = corundum = α-Al2O3, H11 = MgAl2O4 = spinel, H22 = KH2PO4, G5 = ideal perovskite = ideal CaTiO3, S12 = (Mg,Fe)2SiO4 = olivine. The axes b and/or c are not repeated if they are defined by crystal symmetry to be equal to a; the angles α, β, γ are not given if they are defined by crystal symmetry to be equal to 90°. However, for hexagonal crystals, the angle γ is listed as 120°, as a reminder.
ZnS ZnS ZnSe ZnTe Zr b-Zr ZrO2 ZrO2 ZrSiO4
Element or Comp.
Table 7.11 (Continued )
FIGURE 7.20 Some Wigner–Seitz cells (all of which enclose only one atom or asymmetric unit). [Drawings not reproduced; panels: Wigner–Seitz cell of a planar oblique parallelogram lattice, Wigner–Seitz cell of BCC, Wigner–Seitz cell of FCC; cubic cell edge a.] The cell for a two-dimensional planar parallelogram lattice is an irregular hexagon. For a BCC lattice, it is a truncated octahedron (with 8 hexagonal and 6 square faces). For an FCC lattice, it is a rhombic dodecahedron (with 12 rhombus faces). (For convenience in drawing this figure, the FCC atoms were not put in the usual position starting from (0,0,0), but are shifted, so that only one atom (hidden) is enclosed by the dodecahedron.)
(0,0,0) are better seen in the Wigner–Seitz cell and that the cell reciprocal to it allows for a better discussion of momentum space behavior. In fact, the Wigner–Seitz cell in reciprocal space is used as the first Brillouin¹⁸ zone.
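¹⁸ Leon Nicolas Brillouin (1889–1969).

7.10 RECIPROCAL LATTICE

We repeat and complete some concepts introduced in Section 2.4. Given the unit cell sides a, b, c and angles α (between b and c), β (between c and a), and γ (between a and b), we recall the inner product or dot product

$$\mathbf{a}\cdot\mathbf{b} = |\mathbf{a}||\mathbf{b}|\cos\gamma = \mathbf{b}\cdot\mathbf{a} \tag*{((2.4.11))}$$

$$\mathbf{b}\cdot\mathbf{c} = |\mathbf{b}||\mathbf{c}|\cos\alpha = \mathbf{c}\cdot\mathbf{b} \tag*{((2.4.12))}$$

$$\mathbf{c}\cdot\mathbf{a} = |\mathbf{c}||\mathbf{a}|\cos\beta = \mathbf{a}\cdot\mathbf{c} \tag*{((2.4.13))}$$

and the vector product

$$\mathbf{v} = \mathbf{a}\times\mathbf{b} = \begin{vmatrix} \mathbf{e}_x & \mathbf{e}_y & \mathbf{e}_z \\ a_x & a_y & a_z \\ b_x & b_y & b_z \end{vmatrix} = \mathbf{e}_v\,|\mathbf{a}||\mathbf{b}|\sin\gamma \tag*{((2.4.16))}$$

with area A ≡ |a × b| = |a||b| sin γ. The unit cell volume is given by

$$V \equiv (\mathbf{a}\times\mathbf{b})\cdot\mathbf{c} = (\mathbf{b}\times\mathbf{c})\cdot\mathbf{a} = (\mathbf{c}\times\mathbf{a})\cdot\mathbf{b} \tag*{((2.4.26))}$$

$$V = abc\,[1 - \cos^2\alpha - \cos^2\beta - \cos^2\gamma + 2\cos\alpha\cos\beta\cos\gamma]^{1/2} \tag*{((2.4.28))}$$

$$V = \mathbf{a}\cdot(\mathbf{b}\times\mathbf{c}) = \begin{vmatrix} a_x & a_y & a_z \\ b_x & b_y & b_z \\ c_x & c_y & c_z \end{vmatrix} \tag*{((2.4.29))}$$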
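The two volume expressions, Eqs. (2.4.28) and (2.4.29), can be compared numerically. The following is a minimal sketch, not part of the original text: the function names and the choice of Cartesian orientation (a along x, b in the xy plane) are assumptions made for illustration, and the cell parameters are those listed for monoclinic ZrO2 in Table 7.11.

```python
import math

def cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Unit cell volume from Eq. (2.4.28); angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(t)) for t in (alpha_deg, beta_deg, gamma_deg))
    return a * b * c * math.sqrt(1.0 - ca**2 - cb**2 - cg**2 + 2.0 * ca * cb * cg)

def cartesian_cell(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Cartesian components of a, b, c, with a along x and b in the xy plane.
    This orientation is an assumption; the volume does not depend on it."""
    al, be, ga = (math.radians(t) for t in (alpha_deg, beta_deg, gamma_deg))
    va = (a, 0.0, 0.0)
    vb = (b * math.cos(ga), b * math.sin(ga), 0.0)
    cx = c * math.cos(be)
    cy = c * (math.cos(al) - math.cos(be) * math.cos(ga)) / math.sin(ga)
    cz = math.sqrt(c**2 - cx**2 - cy**2)
    return va, vb, (cx, cy, cz)

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

# Monoclinic ZrO2 as listed in Table 7.11 (lengths in angstroms)
params = (5.145, 5.2075, 5.3107, 90.0, 99.23, 90.0)
va, vb, vc = cartesian_cell(*params)
v_formula = cell_volume(*params)      # Eq. (2.4.28)
v_triple = dot(va, cross(vb, vc))     # Eq. (2.4.29)
print(v_formula, v_triple)
```

For a monoclinic cell both routes reduce to V = abc sin β, which gives a quick sanity check on either implementation.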
We now define a set of reciprocal lattice cell axes a*, b*, c* and angles α*, β*, γ* by

$$\mathbf{a}^* \equiv \mathbf{b}\times\mathbf{c}/V = \mathbf{e}_{a^*}\, bc\sin\alpha / V \tag{7.10.1}$$

$$\mathbf{b}^* \equiv \mathbf{c}\times\mathbf{a}/V = \mathbf{e}_{b^*}\, ca\sin\beta / V \tag{7.10.2}$$

$$\mathbf{c}^* \equiv \mathbf{a}\times\mathbf{b}/V = \mathbf{e}_{c^*}\, ab\sin\gamma / V \tag{7.10.3}$$

where e_a*, e_b*, and e_c* are unit vectors in the a*, b*, and c* directions (Fig. 7.21). This definition, of course, guarantees orthonormality between the direct space and the reciprocal or dual space:

$$\mathbf{a}\cdot\mathbf{a}^* = \mathbf{b}\cdot\mathbf{b}^* = \mathbf{c}\cdot\mathbf{c}^* = 1 \tag{7.10.4}$$

$$\mathbf{a}\cdot\mathbf{b}^* = \mathbf{a}\cdot\mathbf{c}^* = \mathbf{b}\cdot\mathbf{a}^* = \mathbf{b}\cdot\mathbf{c}^* = \mathbf{c}\cdot\mathbf{a}^* = \mathbf{c}\cdot\mathbf{b}^* = 0 \tag{7.10.5}$$

Note also that in a triclinic crystal a and a* are not collinear; in a monoclinic crystal (b unique setting) b* is parallel to b, but a and c form the obtuse angle β, while a* and c* form a smaller acute angle β* given by β* = 180° − β. The reciprocal lattice vectors and the direct lattice vectors are a yin-yang duo of concepts, as are position space and momentum space, or space domain and time domain. Fourier transformation helps us walk across from one space to the other, as convenience dictates: Some problems are easy in one space, others in the space dual to it; this amphoterism is frequent in physics. The directions of the direct and reciprocal lattice vectors are shown as face normals in Fig. 7.22.

By convention, the triad u, v, w denotes a point in direct space in terms of its “reduced coordinates” (i.e., the point is at r_uvw = ua + vb + wc from the origin; usually u, v, w are not integers). The symbol [u v w] refers to the real-space direction (vector) ua + vb + wc (now u, v, w are normally taken to be positive or negative integers or zero), which is also called a “zone axis.” The symbol ⟨u v w⟩ denotes a series of such vectors which are different in direction but equivalent by crystal symmetry to each other. Here [2 2 1] is the same direction in direct space as [4 4 2].
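The orthonormality relations (7.10.4) and (7.10.5) follow directly from the cross-product definitions and can be verified numerically. The sketch below is illustrative only: the triclinic cell vectors are hypothetical values chosen for the example, and the helper names are not from the text.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def reciprocal_vectors(a, b, c):
    """Return a*, b*, c* from Eqs. (7.10.1)-(7.10.3) for Cartesian a, b, c."""
    volume = dot(a, cross(b, c))
    a_star = tuple(x / volume for x in cross(b, c))
    b_star = tuple(x / volume for x in cross(c, a))
    c_star = tuple(x / volume for x in cross(a, b))
    return a_star, b_star, c_star

# Hypothetical triclinic cell (Cartesian components, angstroms), for illustration only
a = (5.0, 0.0, 0.0)
b = (1.0, 6.0, 0.0)
c = (1.0, 2.0, 7.0)

direct = (a, b, c)
recip = reciprocal_vectors(a, b, c)

# overlap[i][j] = (direct vector i) . (reciprocal vector j); should be the identity
overlap = [[dot(d, r) for r in recip] for d in direct]
for row in overlap:
    print([round(x, 12) for x in row])
```

Note that a* here is not parallel to a, as expected for a triclinic cell.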
FIGURE 7.21 Construction of reciprocal lattice vectors a*, b*, c* in a general triclinic cell of sides a, b, c.
FIGURE 7.22 Miller indices of crystal faces (100), (010), and (001), along with directions of the normals hkl (double arrows) to these faces: The normal to (100) is parallel to a* (100 = 1a* + 0b* + 0c*); the normal to (010) is parallel to b* (010 = 0a* + 1b* + 0c*); the normal to the face (001) is parallel to c* (001 = 0a* + 0b* + 1c*). Note that {hkl} indicates a set of (hkl) planes, where the indices are related by the symmetry operators of the crystal; these {hkl} are called “forms of planes.” By convention, direct-space directions are enclosed by square brackets: For example, [100] indicates the vector a, while [11̄1] indicates the vector a − b + c; a family of symmetry-related vectors in direct space is enclosed in ⟨ ⟩; for example, if [110], [011], and [11̄0] are symmetry-related, then ⟨110⟩ refers to all three.
FIGURE 7.23 The (111) plane (the locus of points coplanar with thick arrows).
The symbol hkl refers to a point in reciprocal space, with reciprocal lattice vector r*_hkl = ha* + kb* + lc* pointing to that point (arrowhead of vector) from the origin (tail of vector). The symbol (hkl) refers to the “Miller indices” of a crystal face; the scheme was invented by Miller¹⁹ in 1839. Figure 7.23 shows the (111) plane, or crystal face, in a triclinic unit cell. Since the crystal faces (planes) are physically determined by the end of crystal growth, they necessarily contain within the plane direct-space vectors of the type r_hkl = ha + kb + lc (h, k, l integers). Then the normals to these crystal faces must be parallel to the reciprocal lattice vectors r*_hkl = ha* + kb* + lc*. For example, the (124) face of a triclinic crystal has face normal r*_124 = a* + 2b* + 4c* (which is parallel to r*_248 and to r*_4,8,16, etc.).

PROBLEM 7.10.1. Find the direction cosines of the vectors a*, b*, and c* in terms of the direct lattice vectors a, b, and c.

¹⁹ William Hallowes Miller (1801–1880).
FIGURE 7.24 Face A of this crystal contains the points 1,1/2,1; 1,1,1/2; and 0,1,1, and it shares vectors with faces (100), (001), and (010). This face is therefore face (1 2 2) (see Problem 2.17).
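The assignment of Miller indices to a face such as face A in Fig. 7.24 can be sketched in a few lines. Because a × b = Vc*, b × c = Va*, and c × a = Vb*, the cross product of two in-plane vectors, computed directly on their fractional components, gives the components of the face normal along a*, b*, c*. The function name and the sign convention (first nonzero index positive) are assumptions for illustration.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def miller_indices(p1, p2, p3):
    """Miller indices of the plane through three points given in fractional
    (reduced) coordinates. Sign convention: first nonzero index is positive."""
    p1, p2, p3 = ([Fraction(x) for x in p] for p in (p1, p2, p3))
    u = [q - p for p, q in zip(p1, p2)]
    v = [q - p for p, q in zip(p1, p3)]
    # Components of u x v along a*, b*, c* (up to the common factor V)
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    # Clear denominators, then divide out the greatest common divisor
    common_den = reduce(lambda x, y: x * y // gcd(x, y), (f.denominator for f in n), 1)
    ints = [int(f * common_den) for f in n]
    g = reduce(gcd, (abs(i) for i in ints))
    if g == 0:
        raise ValueError("the three points are collinear")
    ints = [i // g for i in ints]
    if next(i for i in ints if i != 0) < 0:
        ints = [-i for i in ints]
    return tuple(ints)

# The three points quoted for face A in Fig. 7.24
face_a = miller_indices((1, "1/2", 1), (1, 1, "1/2"), (0, 1, 1))
print(face_a)
```

Exact rational arithmetic (`Fraction`) is used so that reducing the indices to smallest integers does not suffer from floating-point round-off.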
PROBLEM 7.10.2. Find the Miller indices of the face A shown in Fig. 7.24, if A contains the three direct-lattice points 1,0.5,1; 1,1,0.5; and 0,1,1.

PROBLEM 7.10.3. The oldest mathematical definition of crystals, Haüy’s law, states that the dihedral angles between crystal faces are constant, if the crystal habit (growth pattern) is the same. Compute the dihedral angle between two general crystal faces (hkl) and (h′k′l′).

PROBLEM 7.10.4. Start from a face-centered (FCC) lattice, which is defined as having eight equal objects at the eight cell corners (0,0,0), (1,0,0), (0,1,0), (0,0,1), (1,1,0), (0,1,1), (1,0,1), (1,1,1) and also six more objects at the six face centers (1/2,1/2,0), (1/2,0,1/2), (0,1/2,1/2), (1,1/2,1/2), (1/2,1,1/2), (1/2,1/2,1). The body-centered lattice has the same eight objects at the eight cell corners, plus a ninth object at the center of the cell (1/2,1/2,1/2). Prove that the reciprocal of an fcc lattice is a bcc lattice, and vice versa.

PROBLEM 7.10.5. Find the conditions for a vector ma + nb + pc to lie in an (hkl) plane with normal ha* + kb* + lc*.

PROBLEM 7.10.6. Find the symmetry of (100), (110), and (111) faces in an fcc lattice of side a.

PROBLEM 7.10.7. Given R ≡ [1 − cos²α − cos²β − cos²γ + 2 cos α cos β cos γ]^{1/2}, prove that

$$\begin{aligned}
R\,\mathbf{e}_a &= \sin\alpha\,\mathbf{e}_{a^*} + \sin\beta\cos\gamma\,\mathbf{e}_{b^*} + \cos\beta\sin\gamma\,\mathbf{e}_{c^*} \\
R\,\mathbf{e}_b &= \cos\gamma\sin\alpha\,\mathbf{e}_{a^*} + \sin\beta\,\mathbf{e}_{b^*} + \sin\gamma\cos\alpha\,\mathbf{e}_{c^*} \\
R\,\mathbf{e}_c &= \sin\alpha\cos\beta\,\mathbf{e}_{a^*} + \cos\alpha\sin\beta\,\mathbf{e}_{b^*} + \sin\gamma\,\mathbf{e}_{c^*} \\
R\,\mathbf{e}_{a^*} &= \sin\alpha\,\mathbf{e}_a + [(\cos\alpha\cos\beta - \cos\gamma)/\sin\alpha]\,\mathbf{e}_b + [(\cos\gamma\cos\alpha - \cos\beta)/\sin\alpha]\,\mathbf{e}_c \\
R\,\mathbf{e}_{b^*} &= [(\cos\alpha\cos\beta - \cos\gamma)/\sin\beta]\,\mathbf{e}_a + \sin\beta\,\mathbf{e}_b + [(\cos\beta\cos\gamma - \cos\alpha)/\sin\beta]\,\mathbf{e}_c \\
R\,\mathbf{e}_{c^*} &= [(\cos\gamma\cos\alpha - \cos\beta)/\sin\gamma]\,\mathbf{e}_a + [(\cos\beta\cos\gamma - \cos\alpha)/\sin\gamma]\,\mathbf{e}_b + \sin\gamma\,\mathbf{e}_c
\end{aligned}$$

Show also that, if an atom is at fractional coordinates r_a, r_b, r_c along the direct unit cell axes e_a, e_b, e_c and is also at fractional coordinates r_a*, r_b*, r_c* along the reciprocal cell axes e_a*, e_b*, e_c*, then the following holds:

$$\begin{aligned}
R\,r_a &= \sin\alpha\, r_{a^*} + [(\cos\alpha\cos\beta - \cos\gamma)/\sin\beta]\, r_{b^*} + [(\cos\gamma\cos\alpha - \cos\beta)/\sin\gamma]\, r_{c^*} \\
R\,r_b &= [(\cos\alpha\cos\beta - \cos\gamma)/\sin\alpha]\, r_{a^*} + \sin\beta\, r_{b^*} + [(\cos\beta\cos\gamma - \cos\alpha)/\sin\gamma]\, r_{c^*} \\
R\,r_c &= [(\cos\gamma\cos\alpha - \cos\beta)/\sin\alpha]\, r_{a^*} + [(\cos\beta\cos\gamma - \cos\alpha)/\sin\beta]\, r_{b^*} + \sin\gamma\, r_{c^*}
\end{aligned}$$

and conversely:

$$\begin{aligned}
R\,r_{a^*} &= [\sin\alpha]\, r_a + [\cos\gamma\sin\alpha]\, r_b + [\sin\alpha\cos\beta]\, r_c \\
R\,r_{b^*} &= [\sin\beta\cos\gamma]\, r_a + [\sin\beta]\, r_b + [\cos\alpha\sin\beta]\, r_c \\
R\,r_{c^*} &= [\cos\beta\sin\gamma]\, r_a + [\sin\gamma\cos\alpha]\, r_b + [\sin\gamma]\, r_c
\end{aligned}$$

PROBLEM 7.10.8. There are six choices of orthogonalized coordinate systems:

(1) e_x along a,      e_y along b*,     e_z in ac plane
(2) e_x along a,      e_y in ab plane,  e_z along c*
(3) e_x along a*,     e_y along b,      e_z in bc plane
(4) e_x in ab plane,  e_y along b,      e_z along c*
(5) e_x along a*,     e_y in bc plane,  e_z along c
(6) e_x in ac plane,  e_y along b*,     e_z along c

In each of these six systems, find the expressions for: (a) the x, y, z components of the direct cell axes (a_x, a_y, a_z), (b_x, b_y, b_z), and (c_x, c_y, c_z), (b) the x, y, z components of the reciprocal cell axes (a*_x, a*_y, a*_z), (b*_x, b*_y, b*_z), and (c*_x, c*_y, c*_z), (c) the x, y, z components of oblique reduced coordinates (r_a, r_b, r_c: these are what crystallographers quote as “x/a, y/b, z/c” fractional coordinates for an atom in the asymmetric unit) in direct space, (d) the x, y, z components of the reciprocal axes hkl.
PROBLEM 7.10.9. For an FCC crystal of side a (e.g., Au), determine the atomic arrangement and the first few (smallest) interatomic distances on a general face with Miller indices h, k, and l. Then show the results for faces 001, 110, 111. See Figs. 7.25 and 7.26.
7.11 SYMMETRY OF 2-D SURFACES

By convention, surface superlattices are given names like (√3 × 1) R 30°; this is Wood’s²⁰ notation of 1964 [8], [9]. This notation describes the symmetry of the superlattice, but does not identify its origin, that is, the exact point where the superlattice is anchored (even though this is a very desirable datum for aficionados of chemisorption). The (√3 × 1) R 30° superlattice means that (1) the a_S and b_S axes of the superlattice crystal lie in the plane of the surface of index hkl (whose axes we call a_hkl and b_hkl); (2) the first superlattice axis a_S is √3 times longer than the first surface axis a_hkl; (3) the second superlattice axis b_S is equal in length to the second surface axis b_hkl; (4) these two superlattice axes a_S and b_S are then rotated by 30° clockwise. An alternative notation uses 2 × 2 matrices, but also fails to specify the site to which the superlattice is anchored.

²⁰ Elizabeth A. Wood (1912–2006).

FIGURE 7.25 The (100) planes in an FCC crystal (e.g., Au, for which |a| = |b| = |c| = 4.0781 Å). Eight FCC unit cells are shown. To help guide the eye, the atoms that lie on the (001) planes are marked with solid dots; all other atoms are marked with gray dots, even though they are crystallographically equivalent to the black dots. To keep the picture uncluttered, not all atoms on the face centers are shown (many such atoms on faces away from the front of the picture are suppressed). The rectangular inset at the bottom of the diagram shows the simple square basis of the (001) face, with an included angle of 90°. The basis vectors are a001 = {from (2,1/2,1/2)FCC to (2,0,1)FCC} = −(1/2)b + (1/2)c and b001 = {from (2,1/2,1/2)FCC to (2,1,1)FCC} = (1/2)b + (1/2)c.

PROBLEM 7.11.1. For an Au(111) face, describe the superlattice (√3 × √3) R 30°, or, more precisely, Au(111)(√3 × √3) R 30°. Determine the length of all defined basis vectors. See Fig. 7.27.

PROBLEM 7.11.2. If the adsorbate consists of atoms of equal radius to that of the (111) face atoms, then for the superlattice (√3 × √3) R 30° show that the fractional monolayer coverage is 1/3 of a monolayer, or 1/3 Langmuir.

PROBLEM 7.11.3. For an FCC crystal of side a (e.g., Au), determine the atomic arrangement and the first few (smallest) interatomic distances on a face with intercepts 1, 1, and 1/3, along a, b, and c, respectively, that is, with Miller indices 3, 3, 1. Describe in detail an “Au(3, 3, 1) (√2 × 1) R 60°” surface. See Fig. 7.28.
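The geometric content of Wood's notation can be checked numerically for the Au(111)(√3 × √3) R 30° case. The sketch below takes the surface basis vectors a111 = (−a + b)/2 and b111 = (−b + c)/2 and the supercell combinations 2a111 + b111 and −a111 + b111 from the labels of Figs. 7.26 and 7.27; the variable and helper names are assumptions for illustration.

```python
import math

A_AU = 4.0781  # cubic cell edge of Au in angstroms, as quoted in Fig. 7.25

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    return math.sqrt(dot(u, u))

def combine(m, u, n, v):
    """Return the vector m*u + n*v."""
    return tuple(m * x + n * y for x, y in zip(u, v))

def angle_deg(u, v):
    return math.degrees(math.acos(dot(u, v) / (norm(u) * norm(v))))

# Obtuse (111) surface cell in cubic Cartesian coordinates
a111 = (-A_AU / 2, A_AU / 2, 0.0)
b111 = (0.0, -A_AU / 2, A_AU / 2)

# Supercell vectors of the (sqrt3 x sqrt3) R 30 degree superlattice
a_s = combine(2, a111, 1, b111)
b_s = combine(-1, a111, 1, b111)

length_ratio = norm(a_s) / norm(a111)                             # sqrt(3) scaling
rotation = angle_deg(a_s, a111)                                   # rotation angle
area_ratio = norm(cross(a_s, b_s)) / norm(cross(a111, b111))      # supercell / cell area
coverage = 1.0 / area_ratio                                       # one adsorbate per supercell

print(norm(a111), angle_deg(a111, b111), length_ratio, rotation, coverage)
```

The area ratio equals the determinant of the 2 × 2 matrix relating the supercell to the surface cell, which is why the alternative matrix notation carries the same coverage information.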
FIGURE 7.26 The (111) planes in an FCC crystal (e.g., Au, for which |a| = |b| = |c| = 4.0781 Å and |a_FCC/2^{1/2}| = 2.8836 Å). Eight FCC unit cells are shown, with the (111) plane outlined by lines of dashes, and a second (111) plane marked by dashed lines. To help guide the eye, the atoms that lie on the two (111) planes are marked with solid dots; all other atoms are marked with gray dots, even though they are crystallographically equivalent to the black dots. To keep the picture uncluttered, not all atoms on the face centers are shown (many such atoms on faces away from the front of the picture are suppressed). The rectangular inset at the bottom of the diagram shows the hexagonal basis of the (111) face. There are two choices of unit cell, with an acute angle between the axes, or an oblique angle. In surface science the oblique choice is taken, with basis vectors a111 = {from (1/2,1/2,0)FCC to (0,1,0)FCC} = −(1/2)a + (1/2)b and b111 = {from (1/2,1/2,0)FCC to (1/2,0,1/2)FCC} = −(1/2)b + (1/2)c.

7.12 DESCENT OF SYMMETRY

One can start from the most symmetric crystal system (cubic) and relax, one after the other, the constraints on unit cell sides and angles of Table 7.1, gradually removing the symmetry elements; the space group changes (sometimes with a translation of origin dictated by tradition: see the next section), and one descends to the least symmetric system (triclinic). But, as Fig. 7.29 shows, the hexagonal system is a “fresh start” that does not originate from the cubic system. For example, the family of perovskite minerals and high-temperature ceramic superconductors exhibits this descent of symmetry, from the cubic “ideal” perovskite structure (space group Pm3m; the real mineral perovskite is orthorhombic, space group Pnma, with a fourfold larger unit cell than the ideal cubic one) to orthorhombic structures for the highest-critical
temperature superconductors. (Perovskite, calcium titanium oxide CaTiO3, is named after Perowski.²¹)

FIGURE 7.27 Construction of the (√3 × √3) R 30° superlattice of the (111) plane of an FCC crystal. If this is Au, then the full Wood’s notation designation is Au(111)(√3 × √3) R 30°. Shown are an acute surface unit cell (center, included angle 60°) and the preferred obtuse unit cell (lower left) with basis vectors a111 = −a/2 + b/2 and b111 = −b/2 + c/2, with lengths 2^{−1/2}a and 2^{−1/2}a each and with an included angle of 120°. Also shown are two alternate but equivalent settings of the primitive (√3 × √3) R 30° supercell: The bottom one is anchored at interstitial sites (shaded circles) with basis vectors a_{√3×√3 R30} = 2a111 + b111 = −a + (1/2)b + (1/2)c, and b_{√3×√3 R30} = −a111 + b111 = (1/2)a − b + (1/2)c, while the top supercell (with the same size and orientation) is anchored with its origin over one of the atoms (dark circle) in the (111) face.

FIGURE 7.28 Construction of the (311) plane of an FCC crystal, which includes lattice points (0,0,0), (1,0,−3), (0,1,−1), and a few other points, such as (0,2,−2) and (1/2,0,−3/2). The points that lie in the (311) plane (see discussion) are surrounded by rectangles with rounded edges.

FIGURE 7.29 Descent of symmetry. Crystal systems shown: cubic, tetragonal, hexagonal, orthorhombic, trigonal or rhombohedral, monoclinic, triclinic.
7.13 COVARIANT AND CONTRAVARIANT TRANSFORMATIONS [10,11]

Covariant. We first present the transformation laws for covariant quantities. We want to transform an “old” set of quantities to a “new” set of quantities, due to a transformation (typically, rotation). Let the old unit cell be represented by the 1 × 3 row vector V_o ≡ (a_o b_o c_o), and the new unit cell be represented by the 1 × 3 row vector V_n ≡ (a_n b_n c_n). Then there exists a 3 × 3 transformation matrix P such that

$$V_n \equiv (a_n\; b_n\; c_n) = V_o P = (a_o\; b_o\; c_o)\begin{pmatrix} P_{11} & P_{12} & P_{13} \\ P_{21} & P_{22} & P_{23} \\ P_{31} & P_{32} & P_{33} \end{pmatrix} \tag{7.13.1}$$

Similarly, the Miller indices H_o ≡ (h_o k_o l_o) ≡ h a_o* + k b_o* + l c_o* denote the normal to an imaginary plane in a crystal, or a real crystal face; they transform to H_n = (h_n k_n l_n) as

$$H_n \equiv (h_n\; k_n\; l_n) = H_o P = (h_o\; k_o\; l_o)\begin{pmatrix} P_{11} & P_{12} & P_{13} \\ P_{21} & P_{22} & P_{23} \\ P_{31} & P_{32} & P_{33} \end{pmatrix} \tag{7.13.2}$$

A form of planes {hkl} is a set of planes that are symmetry-related in some space group, for example {hkl} = (hkl), (hkl̄), (hk̄l), (h̄kl); this form will transform as in Eq. (7.13.2).

²¹ Count Lev Alekseyevich Perowski (1792–1856).
The inverse transformation from “new” to “old” is given by the matrix Q inverse to P: Q ≡ P⁻¹, defined so that PQ = I:

$$V_o \equiv (a_o\; b_o\; c_o) = V_n Q = (a_n\; b_n\; c_n)\begin{pmatrix} Q_{11} & Q_{12} & Q_{13} \\ Q_{21} & Q_{22} & Q_{23} \\ Q_{31} & Q_{32} & Q_{33} \end{pmatrix} \tag{7.13.3}$$

Contravariant. Next, we define the transformation laws for contravariant quantities:

1. Column (position) vectors U = ua + vb + wc = [uvw] (e.g., atom positions, or “zone axis” [u v w]) which are taken from old U_o to new U_n.
2. The translation vector S that takes us from an old origin O_o to a new origin O_n.
3. A vector R that represents the reciprocal lattice vector with components a*, b*, and c*: it too must be taken from R_o to R_n.

Note that ⟨uvw⟩ is a form of zone axes, i.e., a set of symmetry-related directions in real space, e.g., ⟨uvw⟩ = [uvw] and [vuw] if the crystal is tetragonal, etc. We collect a column vector representation of these three contravariant quantities:

$$U_o \equiv \begin{pmatrix} u_o \\ v_o \\ w_o \end{pmatrix}; \qquad S_o \equiv \begin{pmatrix} x \\ y \\ z \end{pmatrix}; \qquad R_o \equiv \begin{pmatrix} a_o^* \\ b_o^* \\ c_o^* \end{pmatrix} \tag{7.13.4}$$

It turns out that all these contravariant quantities use the inverse transformation Q defined in Eq. (7.13.3) above:

$$U_n \equiv \begin{pmatrix} u_n \\ v_n \\ w_n \end{pmatrix} = Q U_o = \begin{pmatrix} Q_{11} & Q_{12} & Q_{13} \\ Q_{21} & Q_{22} & Q_{23} \\ Q_{31} & Q_{32} & Q_{33} \end{pmatrix}\begin{pmatrix} u_o \\ v_o \\ w_o \end{pmatrix} \tag{7.13.5}$$

The inverse transformation of contravariant quantities will use the matrix P:

$$U_o \equiv \begin{pmatrix} u_o \\ v_o \\ w_o \end{pmatrix} = P U_n = \begin{pmatrix} P_{11} & P_{12} & P_{13} \\ P_{21} & P_{22} & P_{23} \\ P_{31} & P_{32} & P_{33} \end{pmatrix}\begin{pmatrix} u_n \\ v_n \\ w_n \end{pmatrix} \tag{7.13.6}$$
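The point of pairing covariant and contravariant laws is that products such as hu + kv + lw are unchanged by the change of cell. The following sketch illustrates this with a hypothetical cell transformation (a_n = a_o + b_o, b_n = −a_o + b_o, c_n = c_o); the matrices P and Q, the sample indices, and the helper names are all assumptions chosen for the example.

```python
from fractions import Fraction as F

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Hypothetical P: a_n = a_o + b_o, b_n = -a_o + b_o, c_n = c_o  (V_n = V_o P)
P = [[F(1), F(-1), F(0)],
     [F(1), F(1), F(0)],
     [F(0), F(0), F(1)]]
# Its inverse Q, so that P Q = I
Q = [[F(1, 2), F(1, 2), F(0)],
     [F(-1, 2), F(1, 2), F(0)],
     [F(0), F(0), F(1)]]
identity = [[F(int(i == j)) for j in range(3)] for i in range(3)]

H_o = [[F(1), F(2), F(3)]]                # Miller indices: covariant row vector
U_o = [[F(1, 4)], [F(1, 2)], [F(3, 4)]]   # atom position: contravariant column vector

H_n = matmul(H_o, P)   # Eq. (7.13.2)
U_n = matmul(Q, U_o)   # Eq. (7.13.5)

# The scalar hu + kv + lw is the same in the old and the new description
invariant_old = matmul(H_o, U_o)[0][0]
invariant_new = matmul(H_n, U_n)[0][0]
print(H_n, U_n, invariant_old, invariant_new)
```

The invariance holds because H_n U_n = H_o P Q U_o = H_o U_o whenever PQ = I.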
Four by Four. A convenient way of describing a symmetry operation is by using, not the 3 × 3 matrix that could represent three orthogonal rotations or three translations (but not both), but rather a 4 × 4 augmented matrix Q. For instance, we can represent the symmetry operator number 3, namely (−x + y, −x, z + 2/3) in space group P3₁21 (# 152) as the matrix Q₃:

$$Q_3 = \begin{pmatrix} -1 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 2/3 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{7.13.7}$$

The 3 × 3 submatrix of the first three rows and first three columns describes the “rotation”; the last column, first three rows indicates the translation vector 0, 0, 2c/3. This is convenient, since we can represent the coordinates of an old point R_o = 0.235a + 0.347b + 0.180c by the 4 × 1 augmented contravariant “column vector”

$$R_o = \begin{pmatrix} x_o \\ y_o \\ z_o \\ 1 \end{pmatrix} = \begin{pmatrix} 0.235 \\ 0.347 \\ 0.180 \\ 1 \end{pmatrix} \tag{7.13.8}$$

and we can obtain the coordinates of a new point R_n = 0.112a − 0.235b + 0.847c by the matrix multiplication:

$$R_n = Q_3 R_o \tag{7.13.9}$$

This transformation applies to contravariant quantities such as zone axes. If, instead, one is transforming a unit cell U_o = (a_o b_o c_o) into a new cell U_n = (a_n b_n c_n), it is really a covariant quantity, which should be represented as a 1 × 4 row vector; it transforms using the matrix inverse to Q₃, namely P₃:

$$P_3 = Q_3^{-1} \tag{7.13.10}$$

In practice, to find the inverse of Q₃, one must use the matrix identity:

$$Q_3 Q_3^{-1} = \mathbf{1} \tag{7.13.11}$$

where 1 is the diagonal 4 × 4 unit matrix. In detail, one obtains by application of Cramer’s²² rule:

$$P_3 = Q_3^{-1} = \begin{pmatrix} 0 & -1 & 0 & 0 \\ 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & -2/3 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{7.13.12}$$

and then

$$U_n = U_o P_3 \tag{7.13.13}$$
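The augmented-matrix bookkeeping of Eqs. (7.13.7) to (7.13.12) is easy to check with exact rational arithmetic. In the sketch below the matrices are entered as reconstructed above for operator 3 of space group P3₁21; the helper name `matmul` and the extra check that applying the operator three times gives a pure lattice translation are additions for illustration.

```python
from fractions import Fraction as F

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Operator (-x + y, -x, z + 2/3) as a 4 x 4 augmented matrix, Eq. (7.13.7)
Q3 = [[F(-1), F(1), F(0), F(0)],
      [F(-1), F(0), F(0), F(0)],
      [F(0), F(0), F(1), F(2, 3)],
      [F(0), F(0), F(0), F(1)]]

# Its inverse, Eq. (7.13.12)
P3 = [[F(0), F(-1), F(0), F(0)],
      [F(1), F(-1), F(0), F(0)],
      [F(0), F(0), F(1), F(-2, 3)],
      [F(0), F(0), F(0), F(1)]]

identity = [[F(int(i == j)) for j in range(4)] for i in range(4)]

# Old point of Eq. (7.13.8) as an augmented column vector
R_o = [[F("0.235")], [F("0.347")], [F("0.180")], [F(1)]]
R_n = matmul(Q3, R_o)   # Eq. (7.13.9)

# A threefold screw axis applied three times is a pure translation along c
Q3_cubed = matmul(Q3, matmul(Q3, Q3))

print([float(row[0]) for row in R_n])
```

Checking Q₃P₃ = 1 by direct multiplication is often quicker than inverting by hand, since the inverse of an augmented matrix with rotation part A and translation t has rotation part A⁻¹ and translation −A⁻¹t.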
One can also transform symmetry operators whenever the coordinate system itself gets changed, as for instance in selecting an alternate setting for a space group. Then one must use a similarity transformation: in the old system let the 4 × 4 symmetry operator be denoted by Q₃, and in the new system by Q₃′; let the coordinate system transformation be represented by the 4 × 4 matrix S, whose inverse matrix is S⁻¹; then the similarity transformation yields:

$$Q_3' = S^{-1} Q_3 S \tag{7.13.14}$$

Generation of (230) Space Groups Using (4) by (4) Matrices. Presumably, one could derive all 230 space groups by considering the internal symmetries of the 4 × 4 symmetry operators like Eq. (7.13.7).

²² Gabriel Cramer (1704–1752).
REFERENCES
1. P. W. Atkins, Physical Chemistry, 6th edition, Freeman, New York, 1998.
2. M. Hamermesh, Group Theory and Its Application to Physical Problems, Addison-Wesley, Reading, MA, 1962.
3. M. J. Buerger, Elementary Crystallography, Wiley, New York, 1963.
4. N. F. M. Henry and K. Lonsdale, eds., International Tables for X-Ray Crystallography, Vol. I: Symmetry Groups, 3rd edition, Kynoch Press, Birmingham, UK, 1969.
5. N. W. Ashcroft and N. D. Mermin, Solid State Physics, Saunders, Philadelphia, PA, 1976.
6. G. H. Stout and L. H. Jensen, X-ray Structure Determination, Macmillan, London, UK, 1968.
7. D. E. Gray, ed., American Institute of Physics Handbook, 2nd edition, McGraw-Hill, New York, 1963.
8. E. A. Wood, Bell Syst. Tech. J. 43: 541 (1964).
9. E. A. Wood, Bell Syst. Tech. Publ. Monograph 4680 (1964).
10. T. Hahn, ed., International Tables for Crystallography, Vol. A: Space Group Symmetry, Reidel, Dordrecht, Holland, 1983, p. 70.
11. D. McKie and C. McKie, Essentials of Crystallography, Blackwell, Oxford, UK, 1986, pp. 143–149.
CHAPTER
8
Solid-State Physics
“A fact is a simple statement that everyone believes: it is innocent, unless found guilty. A hypothesis is a novel suggestion that no one wants to believe: it is guilty, until found effective.” Edward Teller (1908–2003)
8.1 ELECTRICAL RESISTANCE, HALL EFFECT, DRUDE MODEL, TUNNELING, AND THE LANDAUER FORMULA

Resistance. In 1827 Ohm¹ found a linear relationship between applied voltage V and the measured current I [1]:

$$V = IR \tag{8.1.1}$$

where R is the electrical resistance. Ohm’s law is applicable only if the electrical conduction in a bulk conductor is limited by scattering of the carriers by impurities or lattice defects (scattering centers). In SI units V, I, and R are in volts,² amperes,³ and ohms, respectively (in cgs units: statvolts, statamperes, and statohms). The reciprocal of resistance R is the conductance G (in SI units: siemens,⁴ formerly mho). In a solid of length L, width W, thickness T, and cross-sectional area A = TW, the resistance is an extensive property, while its volume resistivity ρ_V, defined by

$$R = \rho_V L/A = \rho_V L/TW \tag{8.1.2}$$

is an intensive property (ohm m); conversely,

$$\rho_V = RA/L = RTW/L \tag{8.1.3}$$

(see Fig. 8.1a). The reciprocal of the volume, or bulk, resistivity is the volume conductivity σ_V (siemens m⁻¹):

$$\sigma_V \equiv 1/\rho_V = L/RA \tag{8.1.4}$$

Thus the static isotropic bulk resistivity ρ_V is in ohm m (SI) or in statohm cm (cgs). Selected conductivities are listed in Table 8.1. If the resistance R of a surface is measured (e.g., a very thin film) then the surface resistance R (ohms) can be related to the surface resistivity ρ_S (ohms per square) (Fig. 8.1b) by

$$R = \rho_S L/W \tag{8.1.5}$$

$$\rho_S = RW/L \tag{8.1.6}$$

¹ Georg Simon Ohm (1789–1854).
² Count Alessandro Giuseppe Antonio Anastasio Volta (1745–1827).
³ André-Marie Ampère (1775–1836).
⁴ Ernst Werner von Siemens (1816–1892).

FIGURE 8.1 (a) Bulk or volume resistance R of a sample of length L, width W, thickness T, and cross-sectional area A = TW; (b) surface resistance R; (c) four-probe method (voltage drop ΔV measured at constant current I).

Surface resistance R (ohms) and surface resistivity ρ_S (ohms per square) have, dimensionally, the same units. If the thickness of the surface is well known
Table 8.1 Selected Static Bulk Volume Electrical Conductivities σ_V (in S m⁻¹) at 300 K

Metals:
Ag                     63.0 × 10⁶
Al                     37.8 × 10⁶
Au                     45.2 × 10⁶
Cu                     56.9 × 10⁶
Cu, annealed           58.0 × 10⁶
Na                     23.8 × 10⁶

Semimetal:
Graphite               1.38 × 10⁵
Graphene (estim.)      10⁸
BN (hexagonal phase)

ω_p
(translation: frequencies larger than the plasma frequency can traverse the solid, that is, the solid becomes transparent to radiation for wavelengths shorter than the plasma wavelength λ_p ≡ 2πc/ω_p). Drude theory provides an estimate of λ_p from combining Eqs. (8.1.14) and (8.1.22). The Drude guess of the plasma wavelength for alkali metals is within 20% of the experimental value. In most metals the plasma wavelength is in the ultraviolet region of the spectrum. Metals are thus characterized by plasma oscillations, quantized as plasmons; these are oscillations of the electrical charge density of the free electrons relative to the stationary ionic cores. These plasmons have quantized energies:

$$E_p = h\nu_p = \hbar\omega_p = \hbar\,(n|e|^2/m\varepsilon_0)^{1/2} \quad \text{(SI)}; \qquad E_p = h\nu_p = \hbar\omega_p = \hbar\,(4\pi n|e|^2/m)^{1/2} \quad \text{(cgs)} \tag{8.1.27}$$

[Here n is not a quantum number, but the charge density of Eq. (8.1.2).] If plasmons couple with photons, they form a plasma polariton. At the surface of metals, the surface plasmon-polaritons, also called “surface plasmons,” are not the same as the “bulk” plasmons; these surface plasmons are affected (i.e., shifted slightly in energy) by monolayer adsorbates; thus Surface Plasmon Resonance (SPR) spectroscopy yields information about the nature of the binding of the adsorbates onto a metal surface. The surface plasmons are excited by a p-polarized electromagnetic wave (polarized in the plane of incidence) that crosses a glass medium (1), such as a prism, and is partially reflected by a metallic film (2) back into the glass medium; the dispersion relation is

$$K(\omega) = (\omega/c)\,[\varepsilon_1\varepsilon_2\mu_1\mu_2/(\varepsilon_1\mu_1 + \varepsilon_2\mu_2)]^{1/2} \tag{8.1.28}$$
(the s-polarized electromagnetic wave, which is polarized perpendicular to the plane of incidence, generates no plasmons). Surface plasmons also enhance the surface sensitivity of several spectroscopic techniques, including fluorescence, Raman¹⁶ scattering, and second harmonic generation; these are then called resonance fluorescence, resonance Raman, and resonant second harmonic generation. For metal nanoparticles, the surface plasmons create a new, intense, characteristic colored absorption band. Metal plasmons are usually described by Mie scattering theory.

¹⁶ Sir Chandrasekhara Venkata Raman (1888–1970).

Hall Effect in the Drude Model. The Drude treatment of the Hall effect starts from the Lorentz force on an electron:

$$d\mathbf{p}/dt = -|e|\,(\mathbf{E} + (\mathbf{p}/m)\times\mathbf{B}) - \mathbf{p}/\langle\tau\rangle \quad \text{(SI)}; \qquad d\mathbf{p}/dt = -|e|\,(\mathbf{E} + (\mathbf{p}/mc)\times\mathbf{B}) - \mathbf{p}/\langle\tau\rangle \quad \text{(cgs)} \tag{8.1.29}$$

which, at steady state, using the definition (Section 2.7) of the cyclotron frequency ω_c

$$\omega_c \equiv |e|B/m \quad \text{(SI)}; \qquad \omega_c \equiv |e|B/mc \quad \text{(cgs)} \tag*{((2.7.25))}$$

yields components

$$0 = -|e|E_x - \omega_c p_y - p_x/\langle\tau\rangle \tag{8.1.30}$$

$$0 = -|e|E_y + \omega_c p_x - p_y/\langle\tau\rangle \tag{8.1.31}$$

whence

$$\sigma_0 E_x = \omega_c\langle\tau\rangle\, j_y + j_x \tag{8.1.32}$$

$$\sigma_0 E_y = -\omega_c\langle\tau\rangle\, j_x + j_y \tag{8.1.33}$$

and finally

$$R_H = -1/n|e| \quad \text{(SI)}; \qquad R_H = -1/n|e|c \quad \text{(cgs)} \tag{8.1.34}$$
that is, Drude predicted that the Hall coefficient R_H is always negative and independent of the magnetic field, as Hall had claimed (incorrectly), and that the transverse magnetoresistance is also always negative. Reality is more complicated: There are both positive and negative Hall coefficients, and Eq. (8.1.34) should be modified by replacing −|e| by q (q > 0 for holes, q < 0 for electrons). Thus modified, Eq. (8.1.34) is also obtained from “semiclassical theory” for electrons in periodic potentials. In modern practice the Hall coefficient of Eq. (8.1.11) is used to calibrate magnetic fields B to within 0.1% (“Hall probes”) if n and q are known, or to measure the sign and magnitude of the charge carriers if B is known.

PROBLEM 8.1.2. Derive Eq. (8.1.17).

PROBLEM 8.1.3. Use Eqs. (8.1.5), (8.1.20), and (8.1.21) to derive Eq. (8.1.22).

PROBLEM 8.1.4. Use the electromagnetic wave equation −∇²E(ω) = (ω²/c²) ε(ω) E(ω) to derive Eq. (8.1.24) from Eq. (8.1.23).

PROBLEM 8.1.5. Estimate the plasma wavelength (cgs) by linking Eqs. (8.1.15) and (8.1.25) and using the magnitude of the Bohr radius a₀.

There are modern additions to the ordinary Hall effect, which have become very important in magnetics technology:

1. Giant magnetoresistance (GMR), a 10–80% decrease in electrical resistance in the presence of a magnetic field, was found by the groups of Grünberg¹⁷ and Fert¹⁸ in thin-film structures consisting of two ferromagnetic thin films separated by a thin diamagnetic layer [5,6]. GMR is used routinely in magnetization read heads in magnetic hard-disk storage devices: The magnetization is “written” inductively, but is “read” by measuring the resistance change due to the magnetization (in multilayer GMR, the transverse resistance difference between parallel and antiparallel magnetization of the ferromagnetic layers can be as large as 10%). The discovery of GMR gave birth to “spintronics.”
2. Colossal magnetoresistance (CMR) is observed below 300 K in manganese perovskite structures [7]; this has not been used in technology.
3. Tunneling magnetoresistance (TMR) occurs when the diamagnetic insulator layer, about 1 nm thick, is MgO [8,9]; the resistance change can be as large as 600% at 300 K. TMR is now being used in read heads in magnetic hard-disk storage devices.

The quantum Hall effect is described below.

¹⁷ Peter Andreas Grünberg (1939– ).
¹⁸ Albert Fert (1938– ).

Another success story for the Drude model was the explanation for the Wiedemann¹⁹–Franz²⁰ law of 1853, which stated the empirical observation that the ratio of the thermal conductivity κ to the electrical conductivity σ of most metals was roughly the same and depended only on the absolute temperature, that is, that the so-called Lorenz number κ/σT was independent of metal and temperature. The thermal conductivity κ (> 0) is defined by assuming that the heat flow J_H is due to the negative gradient of the absolute temperature T (Fourier’s law):

$$\mathbf{J}_H \equiv -\kappa\nabla T \tag{8.1.35}$$
(heat flows from hot to cold). Drude obtained the expression k/sT (3/2) (kB/e)2 ¼ 1.11 108 watt O/K2, which is 50% of the approximate experimental k/sT values. This good fit was made possible by the fortuitous cancellation of two errors, both factors of about 100; in particular, a large value for the electronic specific heat CV was used, which by classical equipartition arguments would equal (3/2) kB, but is 100 times smaller experimentally. Drude also estimated the thermopower as Q ¼ kB/2e. Tunneling and the Landauer Resistance Quantum. Across nanoscopic gaps (vacuum, or atoms or molecules for distances of the order of 5 nm or less), Ohm’s law will no longer apply: The IV curve will be nonlinear, the conductance could be “ballistic” (i.e., free from scattering events), and the main mechanism is quantum-mechanical tunneling. Landauer21 proved in 1957 that (in later terminology) a “metal–molecule–metal” or a “metal–nanowire–metal” sandwich allows [10] a maximum current I: 2e I¼ h
þ1 ð
de½ fL ðeÞ fR ðeÞTrfGa ðeÞGR ðeÞGr ðeÞGL ðeÞg
ð8:1:36Þ
1
where e is the charge on one electron, h is Planck’s22 constant, e is the energy, fL(e) and fR(e) are the Fermi–Dirac distributions in the left and right electrodes, respectively, Ga(e) and Gr(e) are the advanced (and retarded) Green’s23 function for the molecule, GR(e) and GL(e) are the matrices that describe the 19 20
Gustav Heinrich Wiedemann (1826–1899).
Rudolf Franz (1827–1902). Rolf William Landauer (1927–1999). 22 Max Planck (1858–1947). 23 George Green (1793–1841). 21
8.1
E L E C T R I C A L R E S I S T A N C E , H A L L E F F E C T , D R UD E M O D E L , T U N N E L I N G
coupling between molecule and the metal electrodes, and Tr{} is the trace operator. B€ uttiker24 extended this to a multi-lead geometry, and to the case when a magnetic field is present [11]. From this formula the quantum of resistance R0 and its reciprocal, the quantum of conductance, G0, are given by R0 h=2 e2 ¼ 12:91 JA2 s2 ¼ 12:906493 kO
ð8:1:37Þ
G0 1=R0 ¼ 7:74809 105 S
ð8:1:38Þ
(and this quantity is now known to 1 part in $10^9$). This is not to say that the intrinsic or internal resistance of a single molecule (or an atom or a nanowire) is greater than or equal to $R_0$, but that the minimum resistance of the molecule plus the two electrodes is $R_0$. This quantum of conductance has been measured even at room temperature in a multi-walled carbon nanotube, and for a single string of Au atoms across a scanning tunneling microscope, it has been measured in a break junction mode [12]. If one puts two electrons per quantum state (“spin-up and spin-down”), then the minimum resistance becomes $2R_0$ = 25.812807 kΩ. The internal resistance of a molecule has not yet been measured. Recently, it was shown that a degenerate quasi-one-dimensional electron gas in a GaAs | GaAl$_{1-x}$As$_x$ system, when interrogated in a four-probe geometry, has zero resistance drop between probes 2 and 3, in contrast to the expected $R_0$ between probes 1 and 4; the transport is ballistic [13]. The resistance of Eq. (8.1.37) must be divided by a factor $N$, if $N$ elementary one-dimensional wires, or $N$ molecules, bridge the gap in parallel between the two metal contacts:

$$R_N = h/2e^2N = (12.91/N)\ \mathrm{k\Omega} \tag{8.1.39}$$
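The numerical values in Eqs. (8.1.37) to (8.1.39) can be checked directly from the defining constants. The following is a minimal Python sketch rather than part of the original text; it assumes the exact 2019 SI values of $h$ and $e$, and the function names are illustrative.

```python
# Quantum of resistance and conductance from the defining constants.
H_PLANCK = 6.62607015e-34    # Planck constant h, J s (exact in the 2019 SI)
E_CHARGE = 1.602176634e-19   # elementary charge e, C (exact in the 2019 SI)


def quantum_of_resistance():
    """R0 = h / (2 e^2) in ohms, Eq. (8.1.37)."""
    return H_PLANCK / (2.0 * E_CHARGE ** 2)


def parallel_channel_resistance(n_channels):
    """R_N = h / (2 e^2 N) for N parallel one-dimensional channels, Eq. (8.1.39)."""
    if n_channels < 1:
        raise ValueError("n_channels must be a positive integer")
    return quantum_of_resistance() / n_channels


r0 = quantum_of_resistance()
g0 = 1.0 / r0                # quantum of conductance, Eq. (8.1.38)

print(f"R0 = {r0 / 1e3:.4f} kOhm")
print(f"G0 = {g0:.5e} S")
print(f"R_N for N = 4 channels: {parallel_channel_resistance(4) / 1e3:.3f} kOhm")
```

The value N = 4 is only an example; any positive number of parallel channels can be passed.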
There is also an integer or fractional quantum Hall effect, whereby in two-dimensional systems at low temperatures (usually) and high magnetic fields, the Hall conductivity $\sigma_H$ is quantized in units of $e^2/h$:

$$\sigma_H = pe^2/h = p\,(3.874 \times 10^{-5}\ \mathrm{S\,cm}) = p\,(25.813\ \mathrm{k\Omega\,cm})^{-1} \tag{8.1.40}$$

Here $p$ is either an integer ($p$ = 1, 2, 3, etc.) for the integer quantum Hall effect, first measured by von Klitzing²⁵ at cryogenic temperatures [14], or a rational fraction ($p$ = 1/3, 1/5, 5/2, etc.) for the fractional quantum Hall effect, first measured by Tsui,²⁶ Störmer,²⁷ and Laughlin²⁸ [15]. The very well measured quantity $h/e^2$ = 25.81280745 kΩ is called the von Klitzing constant, although it should also be called Landauer’s constant.

²⁴ Markus Büttiker (1950– ).
²⁵ Klaus von Klitzing (1943– ).
²⁶ Daniel C. Tsui (1939– ).
²⁷ Horst L. Störmer (1949– ).
²⁸ Robert B. Laughlin (1950– ).
For “metal | insulator | metal” (MIM) sandwiches, assuming a rectangular barrier of energy $\Phi_B$ and width $d$ on both sides of the molecule, in the direct tunneling regime $V < \Phi_B e^{-1}$, the Simmons²⁹ formula [16,17] can be used:

$$I = \frac{e}{2\pi h d^2}\left\{\left(\Phi_B - \frac{eV}{2}\right)\exp\left[-\frac{4\pi\sqrt{2m}}{h}\,\alpha\left(\Phi_B - \frac{eV}{2}\right)^{1/2} d\right] - \left(\Phi_B + \frac{eV}{2}\right)\exp\left[-\frac{4\pi\sqrt{2m}}{h}\,\alpha\left(\Phi_B + \frac{eV}{2}\right)^{1/2} d\right]\right\} \tag{(6.15.7)}$$

where the dimensionless constant $\alpha$ corrects for a possible nonrectangular barrier, or for using the electron rest mass $m$ in place of a somewhat smaller “effective mass” $m^* = \alpha m$. The Simmons formula was already presented, in a different context, in Chapter 6. When the applied bias exceeds the tunneling barrier height $\Phi_B$, then cold-electron emission through a trapezoidal barrier can also occur from the electrode with greater surface roughness (e.g., the top electrode). This “field” emission is described by the Fowler³⁰–Nordheim³¹ [18] equation:

$$J = \frac{e^3V^2}{8\pi h\Phi_B d^2}\exp\left[-\frac{8\pi\sqrt{2m}\,\Phi_B^{3/2}\,d}{3ehV}\right] \tag{8.1.41}$$
where $J$ is the current density, $V$ is the voltage (volts), $\Phi_B$ is the tunneling barrier height relative to the electrode Fermi level, $e$ is the electronic charge, $m$ is the mass of the electron, $h$ is Planck’s constant, and $d$ is the gap between electrodes.

²⁹ John George Simmons (1931– ).
³⁰ Sir Ralph Howard Fowler (1889–1944).
³¹ Lothar Wolfgang Nordheim (1899–1985).
³² Harden Marsden McConnell (1927– ).

Superexchange. Consider an electron traversing a “molecule–bridge–molecule” sandwich “D–B–A,” across a covalent bridge B from the electron donor D to the electron acceptor A; let the distance between D and A be $d_{DA}$. If the conduction occurs by adiabatic “superexchange” or “coherent tunneling” through the molecule [19], using virtual states along the bridge, then the experimental conductivity will be given by a formula similar to Eq. (6.15.7):

$$\sigma = \sigma_0(T)\exp(-\beta d_{DA}) \tag{8.1.42}$$

If the bridge B consists of several identical repeating components (e.g., phenylene or methylene groups), then the bias-independent decay constant $\beta$ was estimated by McConnell³² [19] from

$$\beta = (2/a)\ln_e(\Delta E_B/\Delta E_{DB}) \tag{(6.15.6)}$$

where $\Delta E_{DB}$ is the energy gap between the initial D–B–A state (assumed to lie lower) and the relevant D⁺–B⁻–A state, while $\Delta E_B$ is the coupling energy between adjacent bridge components, and $a$ is the length of the repeating component in the bridge. This bias-independent decay constant $\beta$ can be linked to the constants used in Eq. (8.1.41):

$$\beta = 4\pi\,2^{1/2}m^{1/2}h^{-1}\alpha\,\Phi_B^{1/2} \tag{(6.15.8)}$$

In contrast, when the conductivity is due to scattering, or is “ohmic,” then

$$\sigma \approx K/d_{DA} \tag{8.1.43}$$
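To get a feeling for the magnitudes involved, the decay constant of Eq. (6.15.8) and the exponential distance dependence of Eq. (8.1.42) can be evaluated numerically. The sketch below is an illustration only: the 1 eV barrier, $\alpha$ = 1, and the donor–acceptor distances are hypothetical inputs, and the function names are not from the text.

```python
import math

H_PLANCK = 6.62607015e-34    # Planck constant h, J s
M_E = 9.1093837015e-31       # electron rest mass m, kg
E_CHARGE = 1.602176634e-19   # elementary charge, C (numerically J per eV)


def decay_constant(barrier_ev, alpha=1.0):
    """beta = 4 pi sqrt(2 m Phi_B) alpha / h, in 1/m, as in Eq. (6.15.8)."""
    phi_joule = barrier_ev * E_CHARGE
    return 4.0 * math.pi * math.sqrt(2.0 * M_E * phi_joule) * alpha / H_PLANCK


def relative_conductivity(beta_per_m, d_da_m):
    """sigma / sigma_0 = exp(-beta d_DA), as in Eq. (8.1.42)."""
    return math.exp(-beta_per_m * d_da_m)


beta = decay_constant(barrier_ev=1.0, alpha=1.0)   # hypothetical 1 eV barrier
print(f"beta = {beta * 1e-10:.3f} per angstrom")
for d_nm in (0.5, 1.0, 2.0):                        # hypothetical D-A distances
    ratio = relative_conductivity(beta, d_nm * 1e-9)
    print(f"d_DA = {d_nm} nm -> sigma/sigma_0 = {ratio:.3e}")
```

For these inputs $\beta$ comes out close to 1 Å⁻¹, so each additional ångström of bridge lowers the tunneling conductivity by roughly a factor of $e$, in sharp contrast to the $1/d_{DA}$ dependence of the ohmic case.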
There is also another quantum limit, called the “Coulomb blockade” [20]: If an electron is confined to a small dot (that is, a two-dimensional confined region, or quantum dot) of capacitance $C$ (typically 1 fF), then adding another electron will cost a “charging energy” $e^2/C$. If $(e^2/2C) < k_BT$, this Coulomb blockade occurs: No more charges can be added, until the threshold voltage

$$V_{CB} = (k_BT/e) \tag{8.1.44}$$

is reached. This causes a flat region of no current increase in the $IV$ curve from $V = 0$ to $V = V_{CB}$. When $V > (k_BT/e)$, the maximum capacitance has been exceeded, and the first charge can move off the quantum dot: A finite current can be observed. At $T \approx 300$ K, $V_{CB}$ = 0.026 V.
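The two energy scales quoted above can be compared with a few lines of arithmetic. This is a minimal sketch with illustrative function names; the 1 fF capacitance is the typical value mentioned in the text, and nothing else is assumed.

```python
K_B = 1.380649e-23           # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19   # elementary charge, C


def thermal_voltage(temperature_k):
    """k_B T / e in volts, the quantity appearing in Eq. (8.1.44)."""
    return K_B * temperature_k / E_CHARGE


def charging_energy_ev(capacitance_f):
    """Charging energy e^2 / C expressed in electron volts."""
    return E_CHARGE ** 2 / capacitance_f / E_CHARGE


v_cb = thermal_voltage(300.0)
e_charge_ev = charging_energy_ev(1e-15)   # C = 1 fF, the typical value in the text

print(f"k_B T / e at 300 K = {v_cb:.4f} V")
print(f"e^2 / C for 1 fF   = {e_charge_ev * 1e3:.3f} meV")
```

For a 1 fF dot the charging energy is a fraction of a millielectronvolt, far below $k_BT$ at room temperature (about 26 meV).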
8.2 FERMI–DIRAC STATISTICS FOR ELECTRON GAS: SOMMERFELD MODEL

In the first major fix to the Drude model, Sommerfeld³³ abandoned the use of the classical Maxwell³⁴–Boltzmann (MB) distribution of molecular velocities

$$f_{MB}(v) = n(m/2\pi k_BT)^{3/2}\exp(-mv^2/2k_BT) \tag{8.2.1}$$

that Drude had used (see Eq. (5.2.24)), and he replaced it by the Fermi–Dirac (FD) distribution (proved below):

$$f_{FD}(v) = 2(m/h)^3\,\{1 + \exp[((1/2)mv^2 - k_BT_0)/k_BT]\}^{-1} \tag{8.2.2}$$

where the Fermi temperature $T_0$ (typically tens of thousands of kelvin) is defined by the normalization condition:

$$n = \int f_{FD}(v)\,dv \tag{8.2.3}$$
Electrons must follow the FD distribution because they are spin-1/2 fermions and are subject to the Pauli exclusion principle. The MB and FD distributions are compared in Fig. 8.3. The FD implies that most low-energy levels are occupied by two electrons (spin-up and spin-down), that at very high kinetic energies there are no electrons, and that at $T$ = 0 K the distribution of Eq. (8.2.2) drops from 1 to zero abruptly at the Fermi energy $\varepsilon_F$.

³³ Arnold Johannes Wilhelm Sommerfeld (1868–1951).
³⁴ James Clerk Maxwell (1831–1879).

FIGURE 8.3 Comparison of the Maxwell–Boltzmann (MB) and Fermi–Dirac (FD) distributions, Eqs. (8.2.1) and (8.2.2), for the case $T_0 = 100\,T$. Within the dimensionless abscissa parameter $x$, $v^2$ is the independent variable. It is clear that the MB distribution peaks at very low velocities, while the FD occupancy is 1 (2 if you include spin) at lower temperatures, 0 at high temperatures, 1/2 at the Fermi temperature (at $x \equiv mv^2/2k_BT = 100$), and between 0 and 1 in a narrow $x$ range around $x$ = 100. From Ashcroft and Mermin [4]. [Plot: ordinate $f_{MB}(v)$ or $f_{FD}(v)$, from 0 to 1.2; abscissa $x = mv^2/2k_BT$, from 0 to 120; curves $f_{FD}(v) = 1/[1 + \exp(((1/2)mv^2 - k_BT_0)/k_BT)]$ for $T_0 = 100\,T$ and $f_{MB}(v) = \exp(-(1/2)mv^2/k_BT)$.]

PROBLEM 8.2.1. Evaluate Eq. (8.2.3) by integration.
A good way to introduce quantum mechanics for electrons in metals is to (1) assume for them free-wave wavefunctions (that are “free” within the crystal):

$$\psi(\mathbf{r}) = V^{-1/2}\exp(i\mathbf{k}\cdot\mathbf{r}) \tag{8.2.4}$$

as solutions to the Schrödinger equation in the case of zero potential:

$$-(\hbar^2/2m)\nabla^2\psi(\mathbf{r}) = E\psi(\mathbf{r}) \tag{8.2.5}$$

but (2) impose Born³⁵–von Kármán³⁶ periodic boundary conditions along $x$, $y$, and $z$ (using a rectangular macroscopic box of sides $A$, $B$, and $C$ and volume $V$):

$$\psi(x, y, z) = \psi(x + A, y, z) = \psi(x, y + B, z) = \psi(x, y, z + C) \tag{8.2.6}$$

These conditions will yield discrete values for the three projections of the wavevector $\mathbf{k}$:

$$k_x = 2\pi n_x/A;\quad k_y = 2\pi n_y/B;\quad k_z = 2\pi n_z/C \quad (n_x, n_y, n_z\ \text{integers}) \tag{8.2.7}$$

due to the imposed periodicity of the complex exponential: $\exp(ik_xA) = 1$ when $k_xA = 2n\pi$, and so on. Equation (8.2.7) assumed an orthogonal macroscopic crystal of orthorhombic symmetry; for a triclinic parallelepiped, the projections of $\mathbf{k}$ would be along the $\mathbf{a}^*$, $\mathbf{b}^*$, $\mathbf{c}^*$ axes, with $n_x$, $n_y$, $n_z$ integers. For simplicity, hereinafter we assume a large cubic crystal, of molar dimensions, so that $A = B = C = L$ and with volume $V = L^3$. The wavefunction in Eq. (8.2.4) is normalized and is also an eigenfunction of the momentum operator:

$$\mathbf{p} \equiv (\hbar/i)(\partial/\partial\mathbf{r}) = -i\hbar\nabla \tag{8.2.8}$$

so that the eigenvalue will be $\hbar\mathbf{k}$:

$$\mathbf{p}[V^{-1/2}\exp(i\mathbf{k}\cdot\mathbf{r})] = (\hbar/i)(\partial/\partial\mathbf{r})[V^{-1/2}\exp(i\mathbf{k}\cdot\mathbf{r})] = \hbar\mathbf{k}\,[V^{-1/2}\exp(i\mathbf{k}\cdot\mathbf{r})] \tag{8.2.9}$$

³⁵ Max Born (1882–1970).
³⁶ Theodore von Kármán = Szőllőskislaki Kármán Tódor (1881–1963).
The quantum numbers for the free waves ($n_x$, $n_y$, $n_z$) have an enormous range of values, positive and negative (of the order of half the cube root of Avogadro’s³⁷ number each). We want to know the number of points allowed in k-space. In one-dimensional space, the segment between successive $n_x$ values is simply $2\pi/L$; in two dimensions, the area between successive $n_x$ and $n_y$ points is $(2\pi/L)^2$; in three dimensions, it is the volume $(2\pi/L)^3$. If the crystal has volume $V$, then the three-dimensional region of k-space of volume $X$ will contain $X/(2\pi/L)^3 = XV/8\pi^3$ k values (points); in other words, the k-space density will be $V/8\pi^3$. We now fill the volume $V$ with electrons with free-wave solutions (each with two possible spin angular momentum projection eigenvalues: $\hbar/2$ or $-\hbar/2$). Let us fill all $N$ electrons, lowest-energy first, within a defined sphere of radius $k_F$ (called the Fermi wavevector); the number of k values allowed within this sphere will be

$$(4/3)\pi k_F^3\,(V/8\pi^3) = Vk_F^3/6\pi^2 \tag{8.2.10}$$

and $N$ will be related to this by

$$N = 2Vk_F^3/6\pi^2 \tag{8.2.11}$$

The electron density $n \equiv N/V$ will be given by

$$n = k_F^3/3\pi^2 \tag{8.2.12}$$

while the Fermi momentum $p_F$, Fermi speed $v_F$, and Fermi energy $\varepsilon_F$ will be defined by

$$p_F \equiv \hbar k_F;\quad v_F \equiv p_F/m;\quad \varepsilon_F \equiv \hbar^2k_F^2/2m \tag{8.2.13}$$

For the metals listed in Table 8.1, the Fermi wavevector, speed, and energy are of the order of

$$k_F = 3.63/(r_m/a_0)\ \text{Å}^{-1};\quad v_F = 4.20 \times 10^6/(r_m/a_0)\ \mathrm{m\,s^{-1}};\quad \varepsilon_F = 50.1/(r_m/a_0)^2\ \mathrm{eV} \tag{8.2.14}$$

³⁷ Lorenzo Romano Amedeo Carlo Bernadette Avogadro, Conte di Quaregna e Cerreto (1776–1856).
To compute the ground-state energy of the $N$ electrons in the volume $V$, the following sum must be evaluated:

$$E = 2\sum_{k \le k_F}(\hbar^2k^2/2m) \tag{8.2.15}$$

This sum is evaluated by replacing it with an integral:

$$E/V = (1/4\pi^3)\int_{k \le k_F}(\hbar^2k^2/2m)\,4\pi k^2\,dk = \hbar^2k_F^5/10\pi^2m \tag{8.2.16}$$

The energy per electron then becomes simply

$$E/N = 3\hbar^2k_F^2/10m = (3/5)\varepsilon_F \tag{8.2.17}$$

The Fermi temperature is finally defined as

$$T_F \equiv \varepsilon_F/k_B = 5.82 \times 10^5\ \mathrm{K}\,(a_0/r_m)^2 \tag{8.2.18}$$
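Equations (8.2.12) to (8.2.18) can be turned into a short calculation that starts from the conduction-electron density. The sketch below is illustrative and makes two assumptions that are not stated at this point in the text: that $r_m$ is the radius of the sphere whose volume equals the volume per conduction electron, and that copper has the commonly tabulated density $n = 8.47 \times 10^{28}$ m⁻³. The function and key names are hypothetical.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_E = 9.1093837015e-31   # electron rest mass, kg
K_B = 1.380649e-23       # Boltzmann constant, J/K
EV = 1.602176634e-19     # joules per electron volt
A0 = 5.29177210903e-11   # Bohr radius, m


def fermi_parameters(n):
    """Free-electron Fermi quantities for an electron density n (in m^-3)."""
    k_f = (3.0 * math.pi ** 2 * n) ** (1.0 / 3.0)      # Eq. (8.2.12) solved for k_F
    e_f = (HBAR * k_f) ** 2 / (2.0 * M_E)              # Eq. (8.2.13), in joules
    r_m = (3.0 / (4.0 * math.pi * n)) ** (1.0 / 3.0)   # assumed definition of r_m
    return {
        "k_f_per_angstrom": k_f * 1e-10,
        "v_f_m_per_s": HBAR * k_f / M_E,               # Eq. (8.2.13)
        "e_f_ev": e_f / EV,
        "energy_per_electron_ev": 0.6 * e_f / EV,      # Eq. (8.2.17)
        "t_f_kelvin": e_f / K_B,                       # Eq. (8.2.18)
        "rm_over_a0": r_m / A0,
    }


cu = fermi_parameters(8.47e28)   # assumed conduction-electron density of Cu
for name, value in cu.items():
    print(f"{name:>24s} = {value:.4g}")
```

With these inputs the products $k_F\,(r_m/a_0)$ and $\varepsilon_F\,(r_m/a_0)^2$ reproduce the prefactors 3.63 Å⁻¹ and 50.1 eV of Eq. (8.2.14), which is a convenient consistency check on the assumed definition of $r_m$.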
One can also define a pressure $P$ from the thermodynamic relationship $P = -(\partial E/\partial V)_N$:

$$P = -(3/5)(\partial N\varepsilon_F/\partial V)_N = 2E/3V \tag{8.2.19}$$

and obtain the bulk modulus of elasticity, or volumetric elasticity $B$ (defined to be the reciprocal of the isothermal compressibility $\kappa$):

$$B \equiv 1/\kappa \equiv -V(\partial P/\partial V)_T = 10E/9V \tag{8.2.20}$$

The calculated $B$ is within an order of magnitude of the experimental $B$ for several metals (and even closer to the experimental $B$ for alkali metals).

PROBLEM 8.2.1. Prove Eq. (8.2.19).
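The free-electron pressure and bulk modulus of Eqs. (8.2.19) and (8.2.20) follow from the electron density alone, since $E/V = (3/5)n\varepsilon_F$. The following sketch evaluates them; the copper density used as input is an assumed, commonly tabulated value, and the function name is illustrative.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_E = 9.1093837015e-31   # electron rest mass, kg


def pressure_and_bulk_modulus(n):
    """Return (P, B) in pascals for a free-electron gas of density n (m^-3)."""
    k_f = (3.0 * math.pi ** 2 * n) ** (1.0 / 3.0)   # Eq. (8.2.12)
    e_f = (HBAR * k_f) ** 2 / (2.0 * M_E)           # Eq. (8.2.13), in joules
    energy_density = 0.6 * n * e_f                  # E/V from Eq. (8.2.17)
    pressure = (2.0 / 3.0) * energy_density         # Eq. (8.2.19)
    bulk_modulus = (10.0 / 9.0) * energy_density    # Eq. (8.2.20)
    return pressure, bulk_modulus


p_cu, b_cu = pressure_and_bulk_modulus(8.47e28)     # assumed density for Cu
print(f"P = {p_cu / 1e9:.1f} GPa, B = {b_cu / 1e9:.1f} GPa")
```

Note that the two results always stand in the fixed ratio $B/P = 5/3$, whatever the density.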
Now the Fermi–Dirac distribution function of Eq. (8.2.2) will be proved. For a system of $N$ particles in equilibrium at a finite temperature $T$ (where $N$ is very large, of the order of Avogadro’s number), statistical mechanics suggests that the statistical weight $P_N(E)$ for the energy state $E$ is given by

$$P_N(E) = \exp(-E/k_BT)\Big/\sum_\alpha \exp(-E_\alpha^N/k_BT) \tag{8.2.21}$$

where the denominator $\sum_\alpha \exp(-E_\alpha^N/k_BT)$ is the partition function and $E_\alpha^N$ is the energy of the $\alpha$th state of the $N$-particle system. It turns out that this partition function is related to the Helmholtz free energy $A = U - TS$ by

$$\exp(-A_N/k_BT) = \sum_\alpha \exp(-E_\alpha^N/k_BT) \tag{8.2.22}$$

so that, more simply:

$$P_N(E) = \exp[-(E - A_N)/k_BT] \tag{8.2.23}$$

The probability of there being one electron in the one-electron level $i$ within this $N$-electron system, summed over the states $\alpha$ that have an electron in level $i$ (or, equivalently, one minus the sum over the states $\tau$ for which there is no electron in level $i$), is

$$f_i^N = \sum_\alpha P_N(E_\alpha^N) = 1 - \sum_\tau P_N(E_\tau^N) \tag{8.2.24}$$

Let there be one more particle. Then the only difference in energies will be $\varepsilon_i$, the energy of the one-electron state for which the $(N + 1)$th electron state differs from the $N$th electron state.

$$f_i^N = 1 - \sum_\alpha P_N(E_\alpha^{N+1} - \varepsilon_i) \tag{8.2.25}$$

Then

$$P_N(E_\alpha^{N+1} - \varepsilon_i) = \exp[(\varepsilon_i - \mu)/k_BT]\,P_{N+1}(E_\alpha^{N+1}) \tag{8.2.25}$$

where the chemical potential $\mu$ (equal to the partial molar Gibbs³⁸ free energy, the partial molar Helmholtz free energy, the partial molar enthalpy, and also the partial molar internal energy, as shown in Section 4.8), is defined by

$$\mu \equiv (\partial G/\partial n)_{T,p} \equiv (\partial A/\partial n)_{T,V} \equiv (\partial H/\partial n)_{T,V} \equiv (\partial U/\partial n)_{S,p} \equiv (\partial U/\partial n)_{S,V} = A_{N+1} - A_N \tag{4.8.6}$$

All this reduces to

$$f_i^N = 1 - \exp[(\varepsilon_i - \mu)/k_BT]\,f_i^{N+1} \tag{8.2.26}$$

Since $N$ is of the order of Avogadro’s number, $N + 1 \approx N$, we have $f_i^N \approx f_i^{N+1}$, so finally

$$f_i^N = \{1 + \exp[(\varepsilon_i - \mu)/k_BT]\}^{-1} \tag{8.2.27}$$

An important result is

$$\lim_{T \to 0}\mu = \varepsilon_F \tag{8.2.28}$$
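The occupancy of Eq. (8.2.27) is easy to evaluate numerically, provided the exponential is handled so that it cannot overflow for levels far above the chemical potential. The function below is a minimal sketch with illustrative names; energies are taken in electron volts, and the $T$ = 0 branch implements the step function implied by Eq. (8.2.28).

```python
import math

K_B_EV = 8.617333262e-5   # Boltzmann constant, eV/K


def fermi_dirac(energy_ev, mu_ev, temperature_k):
    """Occupancy f = 1 / (1 + exp[(e - mu) / k_B T]) of a one-electron level."""
    if temperature_k <= 0.0:
        # T = 0 limit: sharp step at the chemical potential.
        if energy_ev < mu_ev:
            return 1.0
        return 0.5 if energy_ev == mu_ev else 0.0
    x = (energy_ev - mu_ev) / (K_B_EV * temperature_k)
    if x > 0.0:
        # Rewrite with exp(-x) so a large positive x cannot overflow.
        ex = math.exp(-x)
        return ex / (1.0 + ex)
    return 1.0 / (1.0 + math.exp(x))


mu = 7.0   # hypothetical chemical potential, eV
for e in (6.9, 6.95, 7.0, 7.05, 7.1):
    print(f"e = {e:.2f} eV -> f = {fermi_dirac(e, mu, 300.0):.4f}")
```

At 300 K the occupancy changes from nearly 1 to nearly 0 within a few $k_BT$ (about 0.1 eV) around $\mu$, which is the narrow transition region seen in Fig. 8.3.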
By tradition, $\varepsilon_F$ is the Fermi energy only at $T$ = 0, but is called the Fermi level at $T$ > 0. It can be shown (with some pain) that the heat capacity at constant volume $c_v$ for free electrons is

$$c_v = \pi^2nk_B^2T/2\varepsilon_F \tag{8.2.29}$$

which is typically 100 times less than the classical estimate $c_v = (3/2)nk_B$. Clearly, both electrical conductivity and specific heat are dominated by the partially filled electron states around $\varepsilon_F$. Three-dimensional metals have specific heats that obey the empirical relationship

$$c_v(\mathrm{exp}, 3\mathrm{D}) = \gamma T + AT^3 \tag{8.2.30}$$

while two-dimensional metals (e.g., graphite or graphene) obey

$$c_v(\mathrm{exp}, 2\mathrm{D}) = \gamma T + BT^2 \tag{8.2.31}$$

The terms $AT^3$ and $BT^2$ represent the contributions of the phonon field in three and two dimensions, respectively. Sommerfeld also got improved estimates for the ratio $(\kappa/\sigma T) = (\pi^2/3)(k_B/e)^2$ and of the thermopower $Q = -(\pi^2k_B^2T/6e\varepsilon_F)$.
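The claim that the electronic heat capacity is about 100 times smaller than the classical estimate follows from dividing Eq. (8.2.29) by $(3/2)nk_B$, which gives $(\pi^2/3)(T/T_F)$. The sketch below evaluates this ratio and the Sommerfeld value of $\kappa/\sigma T$; the Fermi temperature used is an assumed value of roughly the size found for copper, and the function names are illustrative.

```python
import math

K_B = 1.380649e-23           # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19   # elementary charge, C


def electronic_to_classical_ratio(temperature_k, fermi_temperature_k):
    """c_v of Eq. (8.2.29) divided by (3/2) n k_B, i.e. (pi^2 / 3) (T / T_F)."""
    return (math.pi ** 2 / 3.0) * (temperature_k / fermi_temperature_k)


def lorenz_number():
    """Sommerfeld value of kappa / (sigma T) = (pi^2 / 3) (k_B / e)^2, in W ohm K^-2."""
    return (math.pi ** 2 / 3.0) * (K_B / E_CHARGE) ** 2


ratio = electronic_to_classical_ratio(300.0, 8.16e4)   # assumed T_F, roughly Cu
print(f"c_v(electronic) / c_v(classical) at 300 K = {ratio:.4f}")
print(f"kappa / (sigma T) = {lorenz_number():.3e} W ohm / K^2")
```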
³⁸ Josiah Willard Gibbs, Jr. (1839–1903).
8.3 X-RAY DIFFRACTION

X-rays were discovered by Röntgen³⁹ in 1895, and they were first used to study crystals in 1913 by von Laue.⁴⁰ The diffraction of X rays by matter corresponds to almost elastic scattering of the X-ray photon (its energy is almost unchanged; indeed, the index of refraction of molecules and crystals to X rays is essentially unity). This scattering conserves the magnitude of the photon energy $h\nu = \hbar\omega = \hbar c|\mathbf{k}|$, but changes its direction in space from wavevector $\mathbf{k}$ to wavevector $\mathbf{k}'$. Here $c$ is the speed of light, and $c|\mathbf{k}|$ is the angular frequency $\omega$. Starting from the unit cell vectors $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$, from an incident wavevector $\mathbf{k}$, and a scattered wavevector $\mathbf{k}'$ (approximately of the same length as $\mathbf{k}$), von Laue’s conditions for X-ray diffraction are:

$$\mathbf{a}\cdot(\mathbf{k} - \mathbf{k}') = 2\pi h;\quad \mathbf{b}\cdot(\mathbf{k} - \mathbf{k}') = 2\pi k;\quad \mathbf{c}\cdot(\mathbf{k} - \mathbf{k}') = 2\pi l \tag{8.3.1}$$

where $h$, $k$, and $l$ are integers (positive, negative, or zero; here $h$ is not Planck’s constant). Bragg’s⁴¹ law (1912) states that

$$n\lambda = 2d_{hkl}\sin\theta_{hkl} \tag{8.3.2}$$

where $\theta_{hkl}$ is the Bragg angle, $2\theta_{hkl}$ is the scattering angle, $d_{hkl}$ is the distance (Å or pm) between planes ($hkl$) of electron-rich matter causing constructive interference of diffracted X-ray intensities, and $n$ is the order of the reflection (usually taken as $n$ = 1; here $n$ is not the refractive index) (Fig. 8.4). Bragg’s law can be rewritten in terms of the unit vector of the incident beam $\mathbf{S}_0$ (parallel to $\mathbf{k}$), the unit vector denoting the diffracted beam $\mathbf{S}$ (parallel to $\mathbf{k}'$), and the reciprocal lattice vector $\mathbf{r}^*_{hkl} = (h\mathbf{a}^* + k\mathbf{b}^* + l\mathbf{c}^*)$ (Section 7.10) such that

$$\mathbf{S} - \mathbf{S}_0 = \lambda\mathbf{r}^*_{hkl} = \lambda(h\mathbf{a}^* + k\mathbf{b}^* + l\mathbf{c}^*) = 2\sin(\theta_{hkl})\,\mathbf{r}^*_{hkl}/|\mathbf{r}^*_{hkl}| \tag{8.3.3}$$

FIGURE 8.4 Bragg’s law construction. Assume that the two parallel incoming X-ray wavelets 1 and 2 are phase-coherent. Then constructive interference will occur if the outgoing wavelet 1′ is longer than outgoing wavelet 2′ by path length difference ($2d_{hkl}\sin\theta_{hkl}$), which is equal to an integer number times the X-ray wavelength $\lambda$. [Diagram labels: incoming wavelets 1 and 2 along $\mathbf{S}_0$ or $\mathbf{k}$; outgoing wavelets 1′ and 2′ along $\mathbf{S}$ or $\mathbf{k}'$; angles $\theta_{hkl}$; plane spacing $d_{hkl}$; vector $\mathbf{r}^*_{hkl}$.]

³⁹ Wilhelm Conrad Röntgen (1845–1923).
⁴⁰ Max Theodor Felix von Laue (1879–1960).
⁴¹ Sir William Lawrence Bragg (1890–1971).
This combines Bragg’s equation and Laue’s three equations. Another way of stating the result is that constructive interference, or diffraction of X rays by electrons, will occur if and only if the two wavevectors $\mathbf{k}$ and $\mathbf{k}'$ (equal in magnitude) differ by a reciprocal lattice vector $\mathbf{G}$ so that $\mathbf{k} + \mathbf{G} = \mathbf{k}'$, whence $(\mathbf{k} + \mathbf{G})^2 = k'^2 = k^2$, or

$$2\mathbf{k}\cdot\mathbf{G} + \mathbf{G}\cdot\mathbf{G} = 0 \tag{8.3.4}$$
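The equivalence of Bragg’s law, Eq. (8.3.2), and the wavevector condition, Eq. (8.3.4), can be verified numerically for a single set of planes. The sketch below is illustrative: the geometry (planes normal to $y$), the plane spacing of 2.0 Å, and the function names are assumptions made for the example, while 1.5406 Å is the familiar Cu Kα₁ wavelength.

```python
import math


def bragg_angle(d_spacing, wavelength, order=1):
    """Bragg angle theta in radians from n lambda = 2 d sin(theta), Eq. (8.3.2)."""
    sin_theta = order * wavelength / (2.0 * d_spacing)
    if not 0.0 < sin_theta <= 1.0:
        raise ValueError("no Bragg reflection: n * lambda exceeds 2 d")
    return math.asin(sin_theta)


def laue_residual(d_spacing, wavelength):
    """Return 2 k.G + G.G for a first-order reflection; Eq. (8.3.4) says it is zero.

    Illustrative geometry: the planes are normal to y, so G = (0, 2 pi / d),
    and the incident wavevector k makes the angle theta with the planes.
    """
    theta = bragg_angle(d_spacing, wavelength)
    k_mag = 2.0 * math.pi / wavelength
    k = (k_mag * math.cos(theta), -k_mag * math.sin(theta))
    g = (0.0, 2.0 * math.pi / d_spacing)
    return 2.0 * (k[0] * g[0] + k[1] * g[1]) + g[0] ** 2 + g[1] ** 2


theta_deg = math.degrees(bragg_angle(d_spacing=2.0, wavelength=1.5406))
print(f"theta = {theta_deg:.2f} deg, scattering angle 2 theta = {2 * theta_deg:.2f} deg")
print(f"2 k.G + G.G = {laue_residual(2.0, 1.5406):.2e}")
```

A spacing smaller than $\lambda/2$ raises an error, which reflects the fact that no reflection is possible when $n\lambda > 2d_{hkl}$.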
Ewald’s⁴² sphere of reflections in reciprocal space explains when and in which direction diffraction will occur. A vector $\mathbf{k}$ is drawn from the “origin of the reciprocal lattice” O (e.g., the center of the crystal) parallel to the incident X-ray beam, to “hit” a reciprocal lattice point A. If the vector $\mathbf{G}$ (or $\Delta\mathbf{k}$) represents the distance between two reciprocal lattice points A and B, then in the direction O to B a scattered wave (vector $\mathbf{k}'$ or $\mathbf{S}$) will appear. Ewald drew a circle (in 2D) or a sphere (in 3D), called the sphere of reflection of radius $2\pi/\lambda$, around the point O; diffraction occurs when this sphere intersects a reciprocal lattice point (Figs. 8.5 and 8.6). As the crystal and/or the detector are moved, the reciprocal lattice points which cross the Ewald sphere satisfy Eq. (8.3.2) or (8.3.3), and a diffracted beam is formed in direction $\mathbf{k}'$.

Macroscopic crystals are not absolutely perfect: The condition for Bragg reflection, Eq. (8.3.2), is also the condition for total internal reflection. Thus, an absolutely perfect millimeter-sized crystal will reflect internally most of the X-ray beam at the Bragg angles. The imperfection is that each crystal contains crystalline domains, 1 to 10 μm in size, which are slightly misaligned relative to each other (by a few minutes of arc at most); this is what permits the observation of X-ray diffraction “peaks” and contributes to their finite width by the size of the “perfect” crystallites (Scherrer⁴³ line broadening or shape factor):

$$s_{hkl} = K\lambda/(B_{hkl}\cos\theta_{hkl}) \tag{8.3.5}$$

where $s_{hkl}$ is the crystallite size (nm), $\lambda$ is the X-ray wavelength (nm), $\theta_{hkl}$ is the Bragg diffraction angle, $B_{hkl}$ is the width at half-maximum of the diffraction peak (in radians), and $K$ is a shape factor for the average crystallite, usually $K$ = 0.9. If the diffracted intensity is unacceptably low, a quick thermal shock to the crystal may micro-shatter the crystal and thus form those domains, thus restoring a larger Bragg diffracted beam intensity.
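As a worked illustration of Eq. (8.3.5), the sketch below converts a measured peak width into a crystallite size. The peak position and width are hypothetical inputs, the function name is illustrative, and the calculation assumes that the width is read off a $2\theta$ scan in degrees and that instrumental broadening has already been removed.

```python
import math


def scherrer_size(wavelength_nm, fwhm_deg, two_theta_deg, shape_factor=0.9):
    """Crystallite size s = K lambda / (B cos(theta)) in nm, Eq. (8.3.5)."""
    fwhm_rad = math.radians(fwhm_deg)           # B must be in radians
    theta = math.radians(two_theta_deg / 2.0)   # Bragg angle is half of 2 theta
    return shape_factor * wavelength_nm / (fwhm_rad * math.cos(theta))


# Hypothetical peak at 2 theta = 40 degrees, 0.2 degrees wide, Cu K-alpha radiation.
size_nm = scherrer_size(wavelength_nm=0.15406, fwhm_deg=0.2, two_theta_deg=40.0)
print(f"crystallite size = {size_nm:.1f} nm")
```

Because $s_{hkl}$ is inversely proportional to $B_{hkl}$, halving the peak width doubles the estimated crystallite size.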
⁴² Paul Peter Ewald (1888–1985).
⁴³ Paul Scherrer (1890–1965).

FIGURE 8.5 Ewald’s sphere of reflections and the Bragg–Laue equation. Inset: Planar view of diffraction event: the incoming X-ray beam of wavelength $\lambda$ and direction $\mathbf{S}_0/\lambda$ gets diffracted into the beam with vector $\mathbf{S}/\lambda$; $\mathbf{S}$ and $\mathbf{S}_0$ are vectors of unit length; the two vectors form an angle $2\theta_{hkl}$, called the scattering angle, given by the Bragg or Laue–Bragg equation. A right angle is formed between the addition vector $(\mathbf{S}_0 + \mathbf{S})/\lambda$ and the reciprocal lattice vector $\mathbf{r}^*_{hkl}$ responsible for constructive interference (diffraction) by a plane of electrons located in the crystal plane with Miller indices $h$, $k$, and $l$. [Diagram labels: X-rays in (direction $\mathbf{S}_0$ is fixed); origin and site of crystal; rotation of crystal and its reciprocal lattice about some laboratory axis; limiting sphere (2/λ); sphere of reflection (traces out a toroid as crystal rotates); diffracted beam when reciprocal lattice point ($hkl$) crosses sphere of reflection.]

FIGURE 8.6 Ewald circle of reflection for a planar lattice. [Diagram labels: axes a and b; origin O; wavevectors $\mathbf{k}$ and $\mathbf{k}'$; K = 4a - b.]
8.4 QUANTUM NUMBERS IN A MACROSCOPIC SOLID: BLOCH WAVES

Even in a macroscopic crystal, each electron, being a fermion, must possess a unique set of quantum numbers apart from the “internal” set of quantum numbers within the atom, ion, or molecule. Assuming that there is translational periodicity in the three-dimensional crystal, we obtain

$$\mathbf{R} = n_a\mathbf{a} + n_b\mathbf{b} + n_c\mathbf{c} \tag{8.4.1}$$

where $n_a$, $n_b$, and $n_c$ are positive or negative integers or zero. An acceptable “Bloch”⁴⁴ wavefunction $\psi(\mathbf{r} + \mathbf{R})$ is given by [21]

$$\psi(\mathbf{r} + \mathbf{R}) = u_k(\mathbf{r})\exp(i\mathbf{k}\cdot\mathbf{R}) \tag{8.4.2}$$

provided that the wavefunction $u_k(\mathbf{r})$ within the zeroth cell is periodic with the lattice:

$$u_k(\mathbf{r}) = u_k(\mathbf{r} + \mathbf{R}) = u_k(\mathbf{r} + n_a\mathbf{a} + n_b\mathbf{b} + n_c\mathbf{c}) \tag{8.4.3}$$

Here the Bloch wavevector $\mathbf{k}$ is $2\pi$ times the crystallographic reciprocal lattice vector:

$$\mathbf{k} \equiv 2\pi(h\mathbf{a}^* + k\mathbf{b}^* + l\mathbf{c}^*) \tag{8.4.4}$$

$$= (2\pi/V)(h[\mathbf{b}\times\mathbf{c}] + k[\mathbf{c}\times\mathbf{a}] + l[\mathbf{a}\times\mathbf{b}]) \tag{8.4.5}$$

where $V$ is the volume of the direct lattice primitive unit cell:

$$V = \mathbf{a}\cdot[\mathbf{b}\times\mathbf{c}] \tag{(2.4.26)}$$
Equation (8.4.2) suggests that a wavefunction $u_k(\mathbf{r})$ needs to be found by standard quantum-chemical means for only the atoms or molecules in the one direct-lattice primitive unit cell. For each of the Avogadro’s number’s worth of fermions in a solid, the factor $\exp(i\mathbf{k}\cdot\mathbf{R})$ in Eq. (8.4.2) provides a new quantum “number,” the wavevector $\mathbf{k}$, that guarantees the fermion requirement of a unique set of quantum numbers. The Bloch waves were conceived to explain the behavior of conduction electrons in a metal. Simply put, the Bloch theorem guarantees that, if the correct wavefunction is found for the zeroth cell, the wavefunctions outside the cell are a repetition of that wavefunction, multiplied by the factor $\exp(i\mathbf{k}\cdot\mathbf{R})$. Among many choices, the wavefunctions $u_k(\mathbf{r})$ can be Wannier⁴⁵ functions, which are defined to be mutually orthogonal, $\langle u_k(\mathbf{r})|u_{k'}(\mathbf{r})\rangle = \delta_{kk'}$, while for atomic or molecular wavefunctions this orthogonality does not necessarily hold.

⁴⁴ Felix Bloch (1905–1983).
⁴⁵ Gregory Hugh Wannier (1911–1983).

A very important consideration is that the atomic wavefunctions for a single atom $\psi(\mathbf{r})$ must extend to $r$ = infinity, where they must vanish; 0.4 nm or so away from the nucleus the $\psi(\mathbf{r})$ are small in amplitude, but not zero; placing another nonbonded atom centered less than 0.3 nm away causes a problem with orthogonality; there is a finite overlap of wavefunctions [which is also true for nonbonded molecules at van der Waals⁴⁶ separations (0.35 nm) from each other]. Indeed, the Bloch waves (i.e., crystal periodicity) cause discontinuities between the wavefunctions centered at neighboring atoms in the region of overlap, with artificial kinetic energy contributions (from the $\nabla^2\psi \neq 0$ term of the Hamiltonian⁴⁷). Something must be done to modify the atomic (or molecular) wavefunctions to deal with the atom (or molecule) “next door.” This explains the need for various schemes to improve the atomic wavefunctions in the crystal: (a) the tight-binding model, (b) the cellular or Wigner⁴⁸–Seitz⁴⁹ method, (c) the augmented plane wave (APW) method due to Slater,⁵⁰ (d) the orthogonalized plane wave (OPW) method due to Herring,⁵¹ (e) the Green’s function method of Korringa,⁵² Kohn,⁵³ and Rostoker⁵⁴ (KKR), and other theoretical schemes designed to deal with the problem.
8.5 BLOCH WAVES IN ONE DIMENSION AND DISPERSION RELATIONS

In one dimension, the Bloch result reduces to an earlier Floquet⁵⁵ result [22]: In one dimension the periodicity of the lattice requires that the eigenfunction of the appropriate Hamiltonian must satisfy

$$\psi(x + Nd) = \exp(ikNd)\,\psi(x) \tag{8.5.1}$$

In one dimension, the wavevector $k$ simplifies to

$$k = 2\pi n/d \tag{8.5.2}$$

where $n$ is an integer. Consider a one-dimensional (usually almost infinite) set of $N$ atoms, molecules, or point masses, all equally spaced at inter-particle distances $d$ along the real-space coordinate $x$, with Born–von Kármán periodic boundary conditions for the potential energy:

$$V(x) = V(x + Nd) \tag{8.5.3}$$

and, concomitantly, for the allowed Schrödinger⁵⁶ wavefunctions:

$$\psi(x) = \psi(x + Nd) \tag{8.5.4}$$

⁴⁶ Johannes Diderick van der Waals (1837–1923).
⁴⁷ Sir William Rowan Hamilton (1805–1865).
⁴⁸ Eugene Paul Wigner (1902–1995).
⁴⁹ Frederick Seitz (1911–2008).
⁵⁰ John Clarke Slater (1900–1976).
⁵¹ W. Conyers Herring (1914–2009).
⁵² Jan Korringa (ca. 1910– ).
⁵³ Walter Kohn (1923– ).
⁵⁴ Norman Rostoker (1925– ).
⁵⁵ Achille Marie Gaston Floquet (1847–1920).
⁵⁶ Erwin Rudolf Josef Alexander Schrödinger (1887–1961).
FIGURE 8.7 Parabolic energy dispersion relation for free particle $E(k) = \hbar^2k^2/2m$ in one dimension, ignoring Brillouin zones. [Plot: ordinate $E(k) = 4\pi k^2$ in arbitrary units, from 0 to 500; abscissa wavevector $k$ (= 2π/d) for $d$ = 1 meter, from –8 to 8; the points $k = \pm\pi$ and $k = \pm 2\pi$ are marked.]

Let us again define the wavevector $k$:

$$k \equiv 2\pi n/Nd \tag{8.5.5}$$

parallel to $x$, where $n$ is some integer in the range $n$ = (0, 1, …, $N - 1$, $N$). Multiplying the wavevector $k$ by Planck’s constant $h$ and dividing by $2\pi$ yields the crystal momentum $p \equiv \hbar k$. This crystal momentum of the electron includes a lattice component and is therefore not a true free-particle momentum, as it was in free-electron theory. Using the wavevector $k$, one can develop a dispersion relation, or $E$ versus $k$ relation, for the one-particle energy $E(k)$. For free electrons (the “empty lattice”) this dispersion relation is simply given by the kinetic energy:

$$E(k) = \hbar^2k^2/2m^* \tag{8.5.6}$$
This parabola is plotted in Fig. 8.7. If, however, the $k$ values are restricted within the first Wigner–Seitz cell, that is, into the first Brillouin⁵⁷ zone, $-\pi \le kd \le \pi$, then the parabola must be folded over, as seen in Fig. 8.8. However, at the extrema $kd = \pm\pi$, $dE/dk$ has a discontinuity between the two branches: The upper and lower slopes are different. The eigenvectors of the corresponding free-electron Hamiltonian are the traveling-wave exponential functions $\exp(+i\pi x/d)$ and $\exp(-i\pi x/d)$, or the standing-wave sine functions $\sin(\pi x/d)$ and cosine functions $\cos(\pi x/d)$. Since

$$\psi(+) = \exp(i\pi x/d) + \exp(-i\pi x/d) = 2\cos(\pi x/d) \tag{8.5.7}$$

$$\psi(-) = \exp(i\pi x/d) - \exp(-i\pi x/d) = 2i\sin(\pi x/d) \tag{8.5.8}$$

therefore $|\psi(+)|^2$ has maxima at $x$ = 0, $d$, and so on (i.e., at the lattice sites for the ions) and vanishes halfway between lattice sites. It follows that at the lattice sites $|\psi(+)|^2$ will be attracted to the ions, and therefore its potential energy will be lowered. In contrast, $|\psi(-)|^2$ vanishes at $x$ = 0, $d$, and so on (i.e., at the lattice sites for the ions) and has maxima halfway between lattice sites; so its potential energy will increase. This helps us to understand why an energy gap tends to form: The standing waves must undergo Bragg scattering at the band edges $k = \pm\pi/d$, and, in a periodic linear lattice, the quadratic dependence of $E$ on $k$ of Eq. (8.5.6) must be “softened” at the band edges, so that $dE/dk = 0$ at $k = \pm\pi/d$; then an energy gap of size $2|U_G|$ must open up between a lower-energy filled band and an upper-energy band that is empty. If $dE/dk \neq 0$, then Umklapp scattering will occur: The crystal momentum phonon will and must “borrow” momentum and energy from the reciprocal lattice:

$$(\mathbf{k} + \mathbf{G})^2 = k^2 \tag{8.5.9}$$

⁵⁷ Léon Nicolas Brillouin (1889–1969).

FIGURE 8.8 Parabolic dispersion relation for free particle $E(k) = \hbar^2k^2/2m$ in one dimension, folded over and constrained into the first and second Brillouin zones. [Plot: ordinate $E(k) = 4\pi k^2$ in arbitrary units, from 0 to 500; abscissa wavevector $k$ (= π/d) for $d$ = 1 meter, from –4 to 4; branches labeled “First Brillouin Zone” and “Second Brillouin Zone”; the points $k = \pm\pi$ are marked.]
One way of seeing this explicitly is to consider the Schrödinger equation modified for a periodic lattice with Born–von Kármán periodic boundary conditions: assuming a wavefunction $\psi(\mathbf{r}) = \sum_q c_q\exp(i\mathbf{q}\cdot\mathbf{r})$ and a potential $U(\mathbf{r})$ which has the periodicity of the lattice, $U(\mathbf{r}) = \sum_G U_G\exp(i\mathbf{G}\cdot\mathbf{r})$, where the Fourier⁵⁸ coefficients $U_G$ are given by $U_G = \int_{\mathrm{cell}} U(\mathbf{r})\exp(-i\mathbf{G}\cdot\mathbf{r})\,d\mathbf{r}$, the Schrödinger equation is rewritten as

$$[(\hbar^2/2m)q^2 - E]\,c_q + \sum_G U_G\,c_{q-G} = 0 \tag{8.5.10}$$

which has a nonvanishing $U_G$ potential only when $\mathbf{G}$ is a reciprocal lattice vector, that is, where Bragg scattering can occur [for free electrons, the second term of Eq. (8.5.10) is zero]: an electron of momentum $+(h/2d)$ scatters off a lattice phonon of momentum $\hbar G$ and reappears as an electron of momentum $(-h/2d)$; the conservation of energy is maintained by the interaction of the electron with the phonon spectrum, and the recoil is absorbed by the phonons.

⁵⁸ Jean-Baptiste Fourier (1768–1830).

This curling over of the energy $E(k)$ near the zone boundary $k = \pi/d$ can be discussed as follows: The wave equation

$$H\psi = (T + U)\psi = \varepsilon\psi \tag{8.5.11}$$

of the electron in the crystal is rewritten using (i) a Fourier series for the potential

$$U(x) = \sum_G U_G\exp(iGx) \tag{8.5.12}$$

where $U_G$ decreases as $G^{-2}$ for a bare Coulomb potential, and (ii) a different Fourier expansion for the eigenfunction

$$\psi = \sum_K C(K)\exp(iKx) \tag{8.5.13}$$

whence the Schrödinger equation becomes

$$\sum_K (\hbar^2K^2/2m)\,C(K)\exp(iKx) + \sum_G\sum_K U_G\,C(K)\exp(i(K + G)x) = \varepsilon\sum_K C(K)\exp(iKx) \tag{8.5.14}$$

which yields the “central equation”:

$$[(\hbar^2K^2/2m) - \varepsilon]\,C(K) + \sum_G U_G\,C(K - G) = 0 \tag{8.5.15}$$

Although this is an infinite series in $G$, in practice only a few coefficients $C(K)$ are significant. At the zone boundary, where $K^2 = (G/2)^2$, this central equation can be simplified, if only one of the Fourier coefficients $U_G$, call it $U$, is significant, and yet is also small in comparison with the kinetic energy $\hbar^2K^2/2m$: the energy becomes simply

$$\varepsilon = \hbar^2G^2/8m \pm U \tag{8.5.16}$$
A similar analysis for the two-level case yields

$$[\varepsilon - (\hbar^2/2m)q^2]\,c_q = U_G\,c_{q-G} \tag{8.5.17}$$

$$[\varepsilon - (\hbar^2/2m)(q - G)^2]\,c_{q-G} = U_G^*\,c_q \tag{8.5.18}$$

with solution

$$\varepsilon = \frac{\hbar^2}{4m}\left[q^2 + (q - G)^2\right] \pm \left\{\frac{\hbar^4}{16m^2}\left[q^2 - (q - G)^2\right]^2 + |U_G|^2\right\}^{1/2} \tag{8.5.19}$$

In the kinetic energy term, $q^2$ will equal $(q - G)^2$ only when $|q| = |q - G|$, that is, when $q$ lies in a Bragg plane; in that case

$$\varepsilon = (\hbar^2/2m)q^2 \pm |U_G| \tag{8.5.20}$$

The formation of this energy gap is depicted qualitatively in Fig. 8.9.
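The opening of the gap can also be followed numerically from the two-level solution, Eq. (8.5.19). The sketch below is a minimal illustration in reduced units ($\hbar = m = 1$, lattice spacing $d$ = 1, so $G = 2\pi$); the value $|U_G|$ = 0.5 and the function name are hypothetical choices made for the example.

```python
import math


def two_band_energies(q, g, u_abs):
    """Lower and upper roots of Eq. (8.5.19) in units where hbar = m = 1."""
    e_q = 0.5 * q ** 2             # free-electron energy at q
    e_qg = 0.5 * (q - g) ** 2      # free-electron energy at q - G
    mean = 0.5 * (e_q + e_qg)
    root = math.hypot(0.5 * (e_q - e_qg), u_abs)
    return mean - root, mean + root


G = 2.0 * math.pi   # reciprocal lattice vector for d = 1 (reduced units)
U = 0.5             # hypothetical |U_G| in the same reduced units

for fraction in (0.0, 0.25, 0.45, 0.5):
    q = fraction * G
    lower, upper = two_band_energies(q, G, U)
    print(f"q = {fraction:.2f} G: lower = {lower:8.4f}, upper = {upper:8.4f}")
```

Far from the Bragg plane the lower root stays close to the free-electron parabola, whereas at $q = G/2$ the two roots are separated by exactly $2|U_G|$, as stated by Eq. (8.5.20).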
FIGURE 8.9 (a) Depiction of free-electron parabola $E = ak^2$ in the neighborhood of a Bragg reflection at point K (where $E$ = 0); a gap will open soon at the crossing point K/2. (b) A gap opens at K/2. (c) Gap for tight-binding model. (d) Gap for tight-binding model in extended-zone scheme. (e) Gap for tight-binding model in reduced-band scheme. Inspired by Ashcroft and Mermin [4].
8.6 BAND STRUCTURES

We now discuss, in levels of ever-increasing complication (and hopefully correctness), how the energy levels in a solid are best described as energy bands. These bands of energy change with direction inside a crystal. The surface of constant energy is called a Fermi surface. Before we start, we should remind ourselves that the face-centered cubic (FCC) structure (a Bravais⁵⁹ lattice) is not a primitive structure (it contains $Z$ = 4 identical atoms per unit cell). The primitive cell is shown by construction in Fig. 8.10.

FIGURE 8.10 Nonprimitive face-centered cubic structure (Bravais lattice 23F, $Z$ = 4) and a primitive rhombohedral subcell ($Z$ = 1). The new axes for the primitive rhombohedral cell are {$\mathbf{a}_1 \equiv \mathbf{a}/2 + \mathbf{b}/2$, $\mathbf{a}_2 \equiv \mathbf{a}/2 + \mathbf{c}/2$, and $\mathbf{a}_3 \equiv \mathbf{b}/2 + \mathbf{c}/2$}. The interaxial angles between these axes are $\cos^{-1}\{(\mathbf{a}_1\cdot\mathbf{a}_2)\,|\mathbf{a}_1|^{-1}|\mathbf{a}_2|^{-1}\} = \cos^{-1}\{(a^2/4)\,[a\,2^{-1/2}]^{-2}\} = \cos^{-1}\{1/2\} = 60°$. [Diagram labels: cubic axes a, b, c (X and Z directions marked); primitive axes $\mathbf{a}_1$, $\mathbf{a}_2$, $\mathbf{a}_3$; lattice points (0,0,0), (1/2,0,1/2), (0,1/2,1/2), (1,1/2,1/2), (1/2,1/2,1), (1,1,1).]

⁵⁹ Auguste Bravais (1811–1863).

Band Structure for the Free-Electron Case. If the electron is free, then the Bloch functions are simple plane waves, because the wavefunctions $u_k(\mathbf{r})$ used for the expansion Eq. (8.4.2) are themselves plane waves. For an electron gas with no lattice and no imposed symmetry, Fermi–Dirac statistics apply: At 0 K all electrons pair up (spin-up and spin-down), with an occupancy of 2 for every $k$ value from $k$ = 0 to the Fermi wavevector $k_F = 1.92/r_s = 3.63\,(a_0/r_s)$ Å⁻¹, and from zero energy up to the Fermi energy $\varepsilon_F = \hbar^2k_F^2/2m = 50.1\ \mathrm{eV}\,(r_s/a_0)^{-2}$, where $r_s$ is the radius per conduction electron and $a_0$ is the Bohr radius, and the energy levels are spherically symmetric in k-space. The Fermi surface is a sphere of radius $k_F$. Note that the ratio ($r_s/a_0$) varies from 0.2 to 1.0 nm for metals (Table 8.3). The symmetry of the lattice will impose distinct shapes on the Brillouin zones (which by definition are the Wigner–Seitz cells of the reciprocal lattice) for each type of symmetry. Figure 8.11 shows the first Brillouin zone for a face-centered cubic structure. Brillouin zones are delimited by Bragg planes (see Fig. 8.12): The $(n + 1)$th Brillouin zone is the set of points that are neither in the $(n - 1)$th nor in the $n$th zone and that can be reached from the $n$th zone by crossing only one Bragg plane.

PROBLEM 8.6.1. A fictional simple cubic crystal has a lattice constant $a$ = 4.21 Å. Compute the four lowest free-electron energy levels along the wavevector $k$ in the reduced zone scheme at the k-space point ($\pi/2a$, 0, 0).

Figure 8.13 shows the Fermi surface of Al [23], while Fig. 8.14 shows the “overall” Fermi surface for solid elements in the periodic table. To summarize, the Fermi surface is a surface of constant energy (the Fermi energy at 0 K,
470
8
SO LI D - STA TE P HYS IC S
FIGURE 8.11 First Brillouin zone for a face-centered cubic (FCC) crystal (whose reciprocal lattice is body-centered cubic). The point Γ is at the origin (0, 0, 0); the Miller60 indices of three faces are shown. The points L and X are at the centers of a hexagonal and a square face, respectively, while points U and K bisect sides, and point W is at the vertex of three adjacent sides.
[Figure labels: symmetry points Γ, X, L, W, U, K; symmetry lines Δ, Λ, Σ, Q; faces (200), (111), (11-1).]
the Fermi level at T > 0) in k-space, which separates the unfilled orbitals from the filled orbitals. If there were no Bragg reflections and the electrons were truly free, then the Fermi surface in 3D at 0 K would be simply the surface of a sphere of radius $k_F = (2mE_F)^{1/2}\hbar^{-1}$. In real crystals, sets of lattice points will enable Bragg diffraction and will distort this sphere into a “hypersphere” with “bumps and knobs and bananas.” The anisotropy in the electrical properties of the metal depends on the shape of the Fermi surface, because the electrical current is due to changes in the occupancy of states near the Fermi surface. A material whose Fermi level falls in a gap between bands is an insulator or semiconductor, depending on the size of the band gap. When the Fermi level for a material falls inside a band gap, there is no Fermi surface. Solids with a large density of states at the Fermi level become unstable at low temperatures, and tend to form ground states where the condensation
FIGURE 8.12 Brillouin zones for a square-planar Bravais lattice. The small circles indicate reciprocal lattice points. The first three Brillouin zones lie entirely within the square of side 2b; each of them has area $b^2$. The first Brillouin zone, indicated by “1”, is centered at the origin and includes the origin point. The regions of the second Brillouin zone are indicated by “2”, and those of the third by “3”. The diagonal and horizontal lines indicate Bragg “planes” (which must be lines in 2D). Zones 4 (regions labeled 4*), 5 (not shown), and 6 (not shown) lie partially outside the square of side 2b. Adapted from Ashcroft and Mermin [4].
60 William Hallowes Miller (1801–1880).
FIGURE 8.13 Free-electron Fermi surface of Al [23]. The first Brillouin zone (called “first zone” in the illustration) is completely filled, because it is inside the Fermi sphere; its center is at k = 0. The second surface shown (“second zone”) encloses empty levels: the filled levels are between the concave surface faces shown and the Fermi sphere; its center is also at k = 0. The third band (“third zone”) “hot dogs” are filled states, but their origin is at the center of one of the rectangular faces of the “first zone.” The fourth Brillouin zone shows small pockets of electron concentration; their origin is again at the center of one of the rectangular faces of the “first zone.”
[Panels: 1st zone, full; 2nd zone, pocket of holes; 3rd zone, regions of electrons; 4th zone, regions of electrons. Symmetry points Γ, X, L, W, U, K are marked.]
FIGURE 8.14 Fermi surfaces of selected elements in the periodic table [www.phys.ufl.edu/fermisurface/html/Z055.html].
energy comes from opening a gap at the Fermi surface. Examples of such ground states are superconductors, ferromagnets, Jahn61–Teller62 distortions, and spin-density waves. The state occupancy of fermions is governed by Fermi–Dirac statistics, so at finite temperatures the Fermi surface is accordingly broadened.
61 Hermann Arthur Jahn (1907–1979). 62 Edward Teller (1908–2003).
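The free-electron relations quoted above can be checked numerically. The following is only a minimal sketch, not the author's worked solution: it assumes the standard free-electron results $k_F = (9\pi/4)^{1/3}/r_s$ and $E = \hbar^2|\mathbf{k} + \mathbf{G}|^2/2m$, rounded values of the physical constants, and hypothetical function names. It evaluates $k_F$ and $\varepsilon_F$ for $r_s = a_0$ and lists the four lowest distinct empty-lattice levels for the simple cubic crystal of Problem 8.6.1.

```python
import math
from itertools import product

HBAR2_OVER_2M = 3.80998   # hbar^2 / (2 m_e) in eV * Angstrom^2 (rounded)
BOHR_RADIUS = 0.529177    # a_0 in Angstrom (rounded)


def fermi_wavevector(r_s):
    """Free-electron k_F (1/Angstrom) for an electron-sphere radius r_s (Angstrom)."""
    return (9.0 * math.pi / 4.0) ** (1.0 / 3.0) / r_s


def fermi_energy(r_s):
    """Free-electron Fermi energy hbar^2 k_F^2 / (2m), in eV."""
    return HBAR2_OVER_2M * fermi_wavevector(r_s) ** 2


def empty_lattice_levels(a, k_reduced, n_levels=4, n_max=2):
    """Lowest distinct levels (eV) of E = (hbar^2/2m)|k + G|^2, simple cubic lattice.

    a is the lattice constant in Angstrom; k_reduced is k in units of pi/a;
    G runs over (2*pi/a)*(n1, n2, n3) with |n_i| <= n_max.
    """
    unit = HBAR2_OVER_2M * (math.pi / a) ** 2
    levels = set()
    for n in product(range(-n_max, n_max + 1), repeat=3):
        q2 = sum((k_reduced[i] + 2 * n[i]) ** 2 for i in range(3))
        levels.add(round(unit * q2, 6))  # rounding merges degenerate levels
    return sorted(levels)[:n_levels]


print(fermi_wavevector(BOHR_RADIUS), fermi_energy(BOHR_RADIUS))
print(empty_lattice_levels(4.21, (0.5, 0.0, 0.0)))
```

In units of $\hbar^2\pi^2/2ma^2$ the four levels at (π/2a, 0, 0) are 1/4, 9/4, 17/4, and 25/4; the higher ones are degenerate, which the sketch hides by keeping only distinct values.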
De Haas63–van Alphen64 and Shubnikov65–de Haas Effects. Electronic Fermi surfaces have been measured through observation of the oscillation of magnetic and transport properties in magnetic fields H, for example the de Haas–van Alphen effect (dHvA), discovered in 1930 [24], and the Shubnikov–de Haas effect (SdH) [25]. The former is an oscillation in magnetic susceptibility, while the latter is in resistivity. The oscillations are periodic in 1/H, and they occur because of the quantization of energy levels in the plane perpendicular to a magnetic field, a phenomenon first predicted by Landau66. The new states are called Landau levels, and they are separated by an energy

$$\hbar\omega_c = e\hbar H/(m^* c) \tag{8.6.1}$$

where $\omega_c$ is the cyclotron frequency, e is the electronic charge, m* is the effective mass of the electron, and c is the speed of light. Onsager67 proved that the period of oscillation Δ(1/H) is related to the cross section $A_\perp$ of the Fermi surface (typically given in Å⁻²) perpendicular to the magnetic field direction by the equation

$$A_\perp = 2\pi e/[\hbar c\,\Delta(1/H)] \tag{8.6.2}$$

Thus by measuring the periods of oscillation Δ(1/H) for various applied field directions, one can map the Fermi surface. The dHvA and SdH oscillations can be seen if the magnetic fields are large enough that the circumference of the cyclotron orbit is smaller than the mean free path. Therefore dHvA and SdH experiments are usually performed at national or international high-field facilities.

Angle-Resolved Photoemission. The best experimental technique to resolve the electronic structure of crystals in momentum-energy space, and, consequently, the Fermi surface, is angle-resolved photoemission spectroscopy (ARPES).

Two-Photon Positron Annihilation. With positron annihilation, the two photons carry away the momentum of the electron; as the momentum of a thermalized positron is negligible, the momentum distribution of the electron can be determined. Because the positron can be polarized, one can also get the momentum distribution for the two spin states in magnetized materials.

The Tight-Binding Method. The tight-binding method starts from the Hamiltonian for the ionic (or molecular) core:

$$\hat{H}_{co}\psi_n = E_n\psi_n \tag{8.6.3}$$

and considers the crystal Hamiltonian as a perturbation to it:

$$\hat{H} = \hat{H}_{co} + \Delta U(\mathbf{r}) \tag{8.6.4}$$

63 Wander Johannes de Haas (1878–1960). 64 P. M. van Alphen (1906–1967). 65 Lev Vasiyevich Shubnikov (1901–1937). 66 Lev Davidovich Landau (1908–1968). 67 Lars Onsager (1903–1976).
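For the N sites in the lattice, we need Bloch-type functions of the type

$$\psi_n(\mathbf{r}) = \sum_{\mathbf{R}} \exp(i\mathbf{k}\cdot\mathbf{R})\,\varphi(\mathbf{r}-\mathbf{R}) \tag{8.6.5}$$

where the $\varphi(\mathbf{r})$ are not necessarily the same as the $\psi_n(\mathbf{r})$. To solve the problem, one expands $\varphi(\mathbf{r})$ in terms of the $\psi_n(\mathbf{r})$:

$$\varphi(\mathbf{r}) = \sum_m b_m(\mathbf{k})\,\psi_m(\mathbf{r}) \tag{8.6.6}$$

then one writes

$$\hat{H}\psi_n(\mathbf{r}) = \hat{H}_{co}\psi_n(\mathbf{r}) + \Delta U(\mathbf{r})\psi_n(\mathbf{r}) = E(\mathbf{k})\psi_n(\mathbf{r}) \tag{8.6.7}$$

Premultiplying Eq. (8.6.7) by $\psi_m^*(\mathbf{r})$, integrating, and using Eq. (8.6.3), one gets

$$[E(\mathbf{k}) - E_m]\int \psi_m^*(\mathbf{r})\psi_n(\mathbf{r})\,d\mathbf{r} = \int \psi_m^*(\mathbf{r})\,\Delta U(\mathbf{r})\,\psi_n(\mathbf{r})\,d\mathbf{r} \tag{8.6.8}$$

which, using the orthonormality of the $\psi_m(\mathbf{r})$, yields an eigenvalue equation for the coefficients $b_n(\mathbf{k})$ and for the Bloch energies E(k):

$$[E(\mathbf{k}) - E_m]\,b_m(\mathbf{k}) = -[E(\mathbf{k}) - E_m]\sum_n b_n \sum_{\mathbf{R}\neq 0}\int \psi_m^*(\mathbf{r})\psi_n(\mathbf{r}-\mathbf{R})\exp(i\mathbf{k}\cdot\mathbf{R})\,d\mathbf{r} + \sum_n b_n\int \psi_m^*(\mathbf{r})\,\Delta U(\mathbf{r})\,\psi_n(\mathbf{r})\,d\mathbf{r} + \sum_n b_n\sum_{\mathbf{R}\neq 0}\int \psi_m^*(\mathbf{r})\,\Delta U(\mathbf{r})\,\psi_n(\mathbf{r}-\mathbf{R})\exp(i\mathbf{k}\cdot\mathbf{R})\,d\mathbf{r} \tag{8.6.9}$$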
A careful analysis of this complicated result shows that $E(\mathbf{k}) \approx E_0$ and $b_m(\mathbf{k}) \approx 0$ unless $E_m \approx E_0$. For one-dimensional systems within the tight-binding approximation [26]

$$E(k) = U - 2t\cos(kd) \quad \text{(1D)} \tag{8.6.10}$$

where U is the on-site energy

$$U \equiv \langle\psi_i|\hat{H}|\psi_i\rangle \tag{8.6.11}$$

for the electron on site i, $\hat{H}$ is the one-electron Hamiltonian, $\psi_i$ is the Wannier eigenfunction for the electron localized at site i, and t is the Mulliken68 transfer integral for an electron moving from site i to the adjacent site i + 1:

$$t \equiv \langle\psi_i|\hat{H}|\psi_{i+1}\rangle \tag{8.6.12}$$

The tight-binding energy E(k), Eq. (8.6.10), has, as required, dE/dk = 0 at k = ±π/d (Fig. 8.15).
68 Robert Sanderson Mulliken (1896–1986).
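The dispersion of Eq. (8.6.10) is easy to explore numerically. The sketch below is illustrative only: the function names are hypothetical, the values U = 5 and t = 1.5 are taken from the curve E(k) = 5 − 3 cos(k) plotted in Fig. 8.15, and the finite-chain formula assumes an open chain of N identical sites with nearest-neighbor coupling only (the Hückel-type result). It evaluates E(k), the spread of E(k) across the first Brillouin zone, and the discrete levels of a finite chain.

```python
import math


def tight_binding_energy(k, U, t, d=1.0):
    """One-dimensional tight-binding dispersion, Eq. (8.6.10)."""
    return U - 2.0 * t * math.cos(k * d)


def bandwidth(U, t, d=1.0, n_points=2001):
    """max[E(k)] - min[E(k)] sampled over the first Brillouin zone."""
    ks = [(-1.0 + 2.0 * i / (n_points - 1)) * math.pi / d for i in range(n_points)]
    energies = [tight_binding_energy(k, U, t, d) for k in ks]
    return max(energies) - min(energies)


def open_chain_levels(n_sites, U, t):
    """Levels of an open chain of n_sites sites (assumed nearest-neighbor model)."""
    return [U - 2.0 * t * math.cos(j * math.pi / (n_sites + 1))
            for j in range(1, n_sites + 1)]


U, t = 5.0, 1.5                       # parameters of the curve in Fig. 8.15
print(bandwidth(U, t))                # spread of E(k) over the zone
print(open_chain_levels(2, U, t))     # dimer: two levels at U - t and U + t
levels = open_chain_levels(200, U, t)
print(max(levels) - min(levels))      # approaches the band limit for large N
```

For two sites the levels are separated by 2t, and as N grows the spread of the levels tends toward 4t, the width of the quasi-continuous band.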
FIGURE 8.15 Dispersion curve for the tight-binding case E = U − 2t cos(kd) in one dimension: first Brillouin zone.
[Plot: E(k) = 5 − 3 cos(k), in arbitrary units (vertical axis, 2 to 9), versus the k-vector in units of π/d for d = 1 meter (horizontal axis, −4 to 4).]
For an assemblage of two identical molecules spaced d nm apart, the HOMO and LUMO energies split into four levels, each pair split by 2t eV (“dimer splitting”) [26]; here t is akin to the Hückel69 resonance integral β of Section 3.15. Indeed, chemists will remember Eq. (8.6.10) from the simple Hückel molecular orbital theory for aromatic π-electron systems. As the number of molecules N increases, the energy levels become spaced more closely, until they form a quasi-continuous band of bandwidth W, where

$$W = 4t \tag{8.6.13}$$

The factor of 4 can be verified by defining W ≡ max[E(k)] − min[E(k)] in Eq. (8.6.10). This band can be filled by electrons (or holes) symmetrically up to the maximum (minimum) Fermi wavevectors $k_F$ ($-k_F$), either with only one electron per site (if the Coulomb electron–electron repulsion discourages more than one electron per site) or with two electrons (spin up and spin down) per site. If the band is filled up to the band edge, then

$$k_F = \pi/d \tag{8.6.14}$$

If the band is only partially filled, then $k_F$ will be some fraction of (π/d). The Bragg (Umklapp) X-ray scattering occurs between the extrema of band filling, at the reciprocal wavevector $2k_F$. One can define the Fermi energy $\varepsilon_F$ as the highest energy occupied in the band:

$$\varepsilon_F \equiv \hbar^2 k_F^2/2m^* \tag{8.6.15}$$

where m* is the (effective) electron mass. One can also define a Fermi momentum through the operator $-i\hbar\nabla$ (this is not a true momentum, because the crystal lattice reaction is included). The nomenclature for band filling is that a filled band has two electrons (or holes) per site (with spin up and spin down); a half-filled band has only one electron (or hole) per site; a quarter-filled band has one electron (or one hole) per two sites. Rather than evaluating t directly, it is very convenient to use the Mulliken–Wolfsberg70–Helmholz71 approximation [27,28]:

$$t \approx qS \tag{8.6.16}$$

where S is the intersite overlap integral:

$$S \equiv \langle\psi_i|\psi_{i+1}\rangle \tag{8.6.17}$$

and q is some phenomenological constant. This is particularly valid for small t; note that t can be positive or negative, depending upon how the orbitals interact [29].

Cellular Method. The cellular method of Wigner and Seitz (1933) assumes that the solid is divided into “cells”; with an ionic core at the origin the atomic wavefunction is of the type

$$\psi(r, \theta, \phi) = Y_{l,m}(\theta, \phi)\,\chi_{n,l}(r) \tag{8.6.18}$$

that satisfies the Schrödinger equation for the atom, but when applied to the crystal, it must satisfy periodic boundary conditions:

$$\psi(\mathbf{r}, E) = \exp(-i\mathbf{k}\cdot\mathbf{R})\,\psi(\mathbf{r}+\mathbf{R}, E) \tag{8.6.19}$$

$$\hat{\mathbf{n}}(\mathbf{r})\cdot\nabla\psi(\mathbf{r}, E) = -\exp(-i\mathbf{k}\cdot\mathbf{R})\,\hat{\mathbf{n}}(\mathbf{r}+\mathbf{R})\cdot\nabla\psi(\mathbf{r}+\mathbf{R}, E) \tag{8.6.20}$$

One way to do this is to expand it in terms of the atomic solution:

$$\psi(\mathbf{r}, E) = \sum_{l,m} A_{l,m}\,Y_{l,m}(\theta, \phi)\,\chi_{n,l}(r) \tag{8.6.21}$$

keeping as many terms as practical, using a finite set of boundary points to ensure compliance with Eqs. (8.6.19) and (8.6.20). In practice, this is not very easy.

Band Structure for the Muffin-Tin Potential. The muffin-tin potential assumes

$$U(\mathbf{r}) = V(|\mathbf{r}-\mathbf{R}|) \quad \text{for } |\mathbf{r}-\mathbf{R}| \le r_0 \text{ (core region)} \tag{8.6.22}$$

$$U(\mathbf{r}) = 0 \quad \text{for } |\mathbf{r}-\mathbf{R}| > r_0 \text{ (interstitial region)} \tag{8.6.23}$$

where $r_0$ is assumed to be less than half the nearest-neighbor distance.

Orthogonalized Plane Waves (OPW). This method makes the plane waves orthogonal to the core electron wavefunctions, to avoid the slow convergence due to oscillations of the conduction electron states in the neighborhood of the atomic core electrons [30].

69 Erich Armand Arthur Joseph Hückel (1896–1980). 70 Max Wolfsberg (1928– ). 71 Lindsay J. Helmholz (ca. 1930– ).
Augmented Plane Waves (APW). This method is a detailed application of the muffin-tin potential, as is the Korringa–Kohn–Rostoker method (KKR) [30].
8.7 HUBBARD HAMILTONIAN

Very useful are the Hubbard72 Hamiltonian [31–33], first discussed by Van Vleck73 [34]:

$$\hat{H} = t\sum_{i,\sigma}\left(a^\dagger_{i,\sigma}a_{i+1,\sigma} + a^\dagger_{i+1,\sigma}a_{i,\sigma}\right) + U\sum_i a^\dagger_{i,\alpha}a_{i,\alpha}a^\dagger_{i,\beta}a_{i,\beta} \tag{8.7.1}$$

and the extended Hubbard Hamiltonian [35]:

$$\hat{H} = t\sum_{i,\sigma}\left(a^\dagger_{i,\sigma}a_{i+1,\sigma} + a^\dagger_{i+1,\sigma}a_{i,\sigma}\right) + U\sum_i a^\dagger_{i,\alpha}a_{i,\alpha}a^\dagger_{i,\beta}a_{i,\beta} + V\sum_{i,\sigma,\sigma'} a^\dagger_{i+1,\sigma}a_{i+1,\sigma}a^\dagger_{i,\sigma'}a_{i,\sigma'} \tag{8.7.2}$$

where $a^\dagger_{i,\sigma}$ creates an excitation of spin σ at site i, and $a_{i,\sigma}$ annihilates it (α and β denote the two spin states), t is the Mulliken transfer integral, Eq. (8.6.12), U is the on-site Coulomb energy [compare Eq. (8.6.11)], and V is the nearest-neighbor Coulomb interaction energy. The Mulliken transfer integral t is negative, and its magnitude is typically of the order of 1 to 2 eV; U is the energy that inhibits a second electron from residing on a lattice site that already has one electron on it; it is positive and typically is of the order of 2 to 6 eV; finally, V is the interaction energy between electrons on adjacent sites; it is typically of the order of 0.2 to 1 eV. Despite the phenomenological attractiveness of these Hamiltonians for molecular physics, they have been solved exactly only for the two-state system [36], and numerically for a restricted set of conditions [37].
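The exactly solvable two-site case can be made concrete with a small sketch. This is an illustration under stated assumptions, not the treatment of ref. [36]: it takes two sites, two electrons, total $S_z = 0$, and the basis |↑↓, 0⟩, |0, ↑↓⟩, |↑, ↓⟩, |↓, ↑⟩; the relative signs of the hopping matrix elements depend on the chosen fermion ordering (an assumption here), but the eigenvalues do not. The closed-form levels are 0, U, and $[U \pm (U^2 + 16t^2)^{1/2}]/2$; the function names are hypothetical.

```python
import math


def hubbard_dimer_matrix(t, U):
    """4x4 Hamiltonian in the basis |ud,0>, |0,ud>, |u,d>, |d,u> (sign convention assumed)."""
    return [[U, 0.0, t, -t],
            [0.0, U, t, -t],
            [t, t, 0.0, 0.0],
            [-t, -t, 0.0, 0.0]]


def hubbard_dimer_levels(t, U):
    """Closed-form eigenvalues of the half-filled two-site Hubbard model (S_z = 0 sector)."""
    root = math.sqrt(U * U + 16.0 * t * t)
    return sorted([0.5 * (U - root), 0.0, U, 0.5 * (U + root)])


def singlet_eigenvector(t, U, energy):
    """Unnormalized eigenvector for either of the two mixed ionic/covalent singlet levels."""
    a, b = 2.0 * t, energy - U
    return [a, a, b, -b]


def residual(matrix, vector, energy):
    """Largest component of H v - E v; zero for an exact eigenpair."""
    return max(abs(sum(row[j] * vector[j] for j in range(4)) - energy * vector[i])
               for i, row in enumerate(matrix))


t, U = -1.0, 4.0                       # illustrative values in eV
levels = hubbard_dimer_levels(t, U)
ground = levels[0]
print(levels)
print(residual(hubbard_dimer_matrix(t, U), singlet_eigenvector(t, U, ground), ground))
```

For U much larger than |t| the ground-state energy tends to −4t²/U, the familiar superexchange scale.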
72 John Hubbard (1931–1980). 73 John Hasbrouck Van Vleck (1899–1980).

8.8 MIXED VALENCE AND ONE-DIMENSIONAL INSTABILITIES

“Mixed valence” was defined by chemists as follows: If ions (or molecules) of two different formal charges or valences occupy the same crystallographic site, then an intermediate “mixed-valent” state (an average of the two valences) is assigned to the ions or molecules at that site [38]. A crystal, or even a thin film, is a three-dimensional object, but some of its properties can be “quasi-one-dimensional”; that is, they resemble, but do not coincide with, one-dimensional physics, whose characteristics are simple but often peculiar [39]. As discussed above, a one-dimensional periodic chain (of atoms, molecules, or simply electrons localized at lattice sites with a distance d between them) can be described by Bloch waves: If there are two electrons (or atoms, or molecules, with two allowed spin states) per site, one calls this a “filled band”; if there is only one electron (etc.) per site, then it is a “half-filled band”; if there is only one electron (etc.) per every two sites, we have a “quarter-filled” band, but other partial fillings are also possible. Such “quasi-one-dimensional” features, embedded in three-dimensional crystals, have been studied in Nb3Ge, Nb3Sn, K0.3MoO3, TTF·TCNQ, and many other “mixed-valent” solids. A statistical–mechanical argument shows that in an infinite one-dimensional chain two (or more) distinct and separate phases cannot coexist at equilibrium [40]. Peierls74 showed [41,42] that an instability in a one-dimensional chain, with one electron per site, driven by electron–phonon interactions, can lead to a subtle structural distortion and to a first-order Peierls phase transition, at and below a finite temperature $T_P$ (the Peierls temperature) [42]. For instance, at and below $T_P$ either a dimerization into two sets of unequal interparticle distances d′ and d″ (such that d′ + d″ = 2d) or some other structural distortion must occur. The electronic energy of the metallic chain may also be lowered by the formation of a charge-density wave (CDW) of amplitude ρ(x):

$$\rho(x) = \rho_0[1 + a\cos(2k_F x + \phi)] \tag{8.8.1}$$
where x is the coordinate along the chain, $\rho_0$ is the uniform charge density, and $a\rho_0$ is the charge modulation amplitude. This phase transition opens up a “Peierls” energy gap Δ in the dispersion relation for the energy [42]. Chemists are familiar with a conceptually similar Jahn–Teller [43] distortion in organometallic systems. The Peierls distortion disrupts the periodicity and can transform a metal above $T_P$ to an electrical semiconductor or insulator below $T_P$. Below $T_P$, extra X-ray reflections appear, as static (locked) CDW states of the conduction electrons at $2k_F$ (or higher harmonics), or static (locked) spin-density wave (SDW) states at $4k_F$ (or higher harmonics), couple with the atoms or molecules in the lattice and cause a slight lattice distortion. Above $T_P$, these CDW and SDW are mobile excitations, with no phase-locking between excitations on nearby chains; their X-ray signatures are diffuse reflections, similar to thermal diffuse scattering streaks in reciprocal space, which sharpen as the temperature is lowered and $T_P$ is approached. From the detection of $2k_F$ diffuse scattering peaks the charge transfer between TTF and TCNQ in the organic crystal TTF·TCNQ was determined to be 0.59 (thus TTF$^{+0.59}$ TCNQ$^{-0.59}$ instead of TTF$^{+1}$ TCNQ$^{-1}$); the crystal is metallic above $T_P \approx 60$ K, but semiconducting below $T_P$. When the band filling is a rational fraction (1/4, 1/2, 2/3, 1, etc.), then the SDW and CDW excitations coincide with certain Bragg reflections of the background lattice of periodicity d and thus are more difficult to detect. For ρ = 1/2, localization of one electron (or hole) on every other site will cause CDW Bragg scattering at $2k_F$ if one assumes that the spin of the electron (or hole) is either uncorrelated between sites or ferromagnetically aligned.

If, instead, the spins are correlated antiferromagnetically (electron with spin up on site 1, no electron on site 2, electron with spin down at site 3, no electron on site 4, and so on), then one has a SDW with $4k_F$ scattering. When the band filling is irrational, the CDW and SDW reflections at wavelength λ are incommensurate with the background lattice of periodicity d, and the extent of charge transfer ρ (defined as the ratio of $N_e$, the net number of electrons, to N, the total number of sites) can be measured directly by the equation

$$\rho = N_e/N = 2d/(j\lambda) = 2dk_F/\pi \tag{8.8.2}$$

where j = 1 for $2k_F$ and j = 2 for $4k_F$ scattering. A similar magnetic ordering [44] in the spin system, with no obvious changes in the electrical charge transport, can transform an ordered uniform antiferromagnetic or paramagnetic chain (S = 1/2 per site) above a “spin-Peierls” transition temperature $T_{SP}$ [45] into a chain of spin-paired singlet “dimers” (S = 0 per dimer) below $T_{SP}$: the archetypical example is Wurster’s75 blue perchlorate [46]. There is also a spin-density wave (SDW) instability: Many ρ = 1/2 “one-chain” salts are paramagnetic above a SDW ordering temperature $T_{SDW}$ and become antiferromagnetic with a trapped SDW state below $T_{SDW}$. The theory of CDW and SDW instabilities has received much attention: it differentiates between the weak-coupling limit (U ≪ t) [47–49], the intermediate-coupling limit [50,51,52], and the strong-coupling limit (U ≫ t) [35,53,54]. There are three other possible instabilities:

1. Instabilities in a 1-D system, if driven by a strong on-site electron–electron Coulomb repulsion U, lead to a Mott76–Hubbard insulator [55], particularly for a ρ = 1 system; here charge localization ensues, and the crystal becomes an insulator. For a chemist, a Mott–Hubbard insulator is like a NaCl crystal, where the energy barrier to moving a second electron onto the Cl⁻ site is prohibitively high, as is the cost of moving an electron off a Na⁺ site.
2. For ρ values with rational fractions (ρ = 1/2, 2/3), a Wigner crystal [56] can occur: The charges alternate regularly in the crystal, yielding a “frozen CDW”, so that one site has ρ = 0, the next ρ = 1, and so on. A Wigner crystal is thus the antithesis of a mixed-valent [38] state.
3. An Anderson77 metal-insulator transition is driven by even a weak random field, due to structural disorder, which can then localize the electronic states.

74 Sir Rudolf Ernst Peierls (1907–1995).
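Returning to Eqs. (8.8.1) and (8.8.2), the relation between the superlattice wavelength and the band filling can be illustrated with a few lines of code. This is only a sketch: the function names are hypothetical, the lattice spacing d = 1 is an arbitrary unit, and the filling 0.59 is the TTF·TCNQ value quoted in the text.

```python
import math


def band_filling_from_wavelength(d, wavelength, j=1):
    """Eq. (8.8.2): rho = 2d / (j * lambda); j = 1 for 2k_F, j = 2 for 4k_F scattering."""
    return 2.0 * d / (j * wavelength)


def fermi_wavevector_from_filling(rho, d):
    """Inverse of rho = 2 d k_F / pi."""
    return rho * math.pi / (2.0 * d)


def cdw_density(x, rho0, a, k_f, phase=0.0):
    """Charge-density wave of Eq. (8.8.1)."""
    return rho0 * (1.0 + a * math.cos(2.0 * k_f * x + phase))


d = 1.0                                       # lattice spacing, arbitrary units
k_f = fermi_wavevector_from_filling(0.59, d)  # filling quoted for TTF.TCNQ
lam_2kf = 2.0 * math.pi / (2.0 * k_f)         # modulation wavelength for 2k_F
lam_4kf = 2.0 * math.pi / (4.0 * k_f)         # modulation wavelength for 4k_F
print(lam_2kf / d)                            # about 3.39 lattice spacings
print(band_filling_from_wavelength(d, lam_2kf, j=1))
print(band_filling_from_wavelength(d, lam_4kf, j=2))
```

A filling of 0.59 thus corresponds to a $2k_F$ modulation wavelength of about 3.39 lattice spacings, which is incommensurate with the underlying lattice.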
8.9 DEFECTS AND MOBILE EXCITATIONS IN SOLIDS AND MOLECULES

This section collects brief summaries of several named “effects” that are well understood and bandied about by expert practitioners of the “art.” Well-studied static defects in insulating ionic crystals (e.g., NaCl) are various centers: (i) F (for Farbzentrum) center: one trapped electron replacing an anion; (ii) M center: two electrons in two adjacent vacant anion sites; (iii) R center: three electrons in three adjacent anion sites; (iv) VK center: two adjacent anions bound together as a diatomic dianion; (v) H center: a dianion occupying an anion site; (vi) FA center: one impurity cation and one electron replacing a regular cation and its nearest-neighbor ion. These defects can be detected optically and by stress experiments.

Mobile defects are Frenkel78 excitons, Mott–Wannier excitons, polarons, bipolarons, polaritons, and solitons. A Frenkel exciton is a neutral quasi-particle, proposed in 1931 by Frenkel, consisting of an excited bound-state electron and its associated “Coulomb hole” in a low-dielectric-constant solid, that can travel throughout the lattice without transporting net charge; since the interaction between electron and hole is large, the exciton width is about one unit cell, or even a single molecule; its binding energy is between 0.1 and 1 eV; it thus tends to be more “localized” than the Wannier exciton. A Mott–Wannier exciton is a neutral quasi-particle, consisting of an excited bound-state electron and its associated “Coulomb hole” in a high-dielectric-constant solid, that can also travel throughout the lattice without transporting net charge; since the exciton radius is several lattice constants, its binding energy is as low as 0.01 eV; it thus tends to be more “delocalized” than the Frenkel exciton. Optical excitation transfer can occur between molecules as much as 10 nm apart through the dipole–dipole coupling between molecules (one an excited “photon donor” chromophore, the other an unexcited “photon acceptor” chromophore), by a mechanism known as Förster79 resonance transfer or fluorescence resonance energy transfer (FRET); its characteristic dependence on the distance r between the two chromophores is $r^{-6}$.

75 Casimir Wurster (1854–1918). 76 Sir Nevill Francis Mott (1905–1996). 77 Philip Warren Anderson (1923– ).
A polaron is a fermion quasi-particle consisting of an anion (or cation) defect with an associated polarized Gegenion (= counterion) atmosphere or polarization; this is an excited state of the system, with energy intermediate between the valence band and the conduction band. Its mobility within the lattice is due to the fact that there is a low energy barrier for the polarization to move from one site to the next. Polarons are the dominant excitations in conducting polymers. A bipolaron is a boson quasi-particle consisting of two spin-paired polarons. A phonon–polariton is a boson quasi-particle that couples an infrared photon with an optical phonon. A soliton is a giant solitary wave produced in canals by a cancellation of nonlinear and dispersive effects. The connection between aqueous solitons and tsunamis (“harbor waves”) is not definitively established. In “doped” conducting polyacetylene, a neutral soliton is a collective excitation of a polyacetylene oligomer that has amplitude for several adjacent sites [57]. A Fermi liquid is a quantum-mechanical liquid of fermions at very low temperatures, with properties resembling those of a Fermi gas of noninteracting fermions.

78 Yakov Il’ich Frenkel (1894–1952). 79 Theodor Förster (1910–1974).
In one dimension the Tomonaga80–Luttinger81 liquid of interacting electrons (treated pairwise as bosons82) is a more suitable model; here charge and spin excitations move separately.

Breit–Wigner resonances have Lorentzian line shapes $f_{BW}(E) = [(E - R_{res})^2 + (W_{res}/2)^2]^{-1}$ and are symmetrical in E about $R_{res}$ (where $W_{res}$ is the width of the resonance at half-height, see Section 2.1). In contrast, Fano83 resonances have a different lineshape:

$$f_{Fano}(E) = (qW_{res}/2 + E - R_{res})^2\,[(E - R_{res})^2 + (W_{res}/2)^2]^{-1} \tag{8.9.1}$$

where q is the Fano parameter; the Fano lineshape is asymmetric, because of the coupling of a discrete channel with a continuum of states.

The Kondo84 effect is a scattering of conduction electrons by magnetic impurities. The total expression for the electrical resistance R(T) of a 3D metal as a function of the absolute temperature T is

$$R(T) = R_0 + aT^2 + bT^5 + c\ln(T_K/T) \tag{8.9.2}$$

where $R_0$ is the residual resistance at 0 K, the term $aT^2$ is the contribution from the Fermi liquid, the term $bT^5$ is the contribution from lattice vibrations, and the last term, $c\ln(T_K/T)$, is the Kondo term, where $T_K$ is the Kondo temperature. The second and third terms imply that R increases rapidly with temperature, but the Kondo term provides for a resistance minimum at a finite temperature (identified here with $T_K$), where dR/dT = 0, that is, when $2a + 5bT_K^3 = cT_K^{-2}$ (usually $T_K \approx$ 10 to 60 K).
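The position of the resistance minimum implied by Eq. (8.9.2) can be located numerically. The sketch below is illustrative only: the coefficient values are hypothetical (arbitrary resistance units, chosen just to place the minimum near 15 K), the function names are not from the text, and the minimum is found by bisection on the condition dR/dT = 0, written as $2aT^2 + 5bT^5 - c = 0$.

```python
import math


def resistance(T, R0, a, b, c, TK):
    """Eq. (8.9.2): residual + Fermi-liquid + phonon + Kondo terms."""
    return R0 + a * T ** 2 + b * T ** 5 + c * math.log(TK / T)


def resistance_minimum(a, b, c, T_lo=1e-3, T_hi=1e3, tol=1e-10):
    """Temperature where dR/dT = 0, i.e. 2aT^2 + 5bT^5 = c, found by bisection."""
    def g(T):
        return 2.0 * a * T ** 2 + 5.0 * b * T ** 5 - c

    if g(T_lo) > 0.0 or g(T_hi) < 0.0:
        raise ValueError("the bracket [T_lo, T_hi] does not contain the minimum")
    while T_hi - T_lo > tol:
        T_mid = 0.5 * (T_lo + T_hi)
        if g(T_mid) < 0.0:
            T_lo = T_mid      # still on the falling (Kondo-dominated) side
        else:
            T_hi = T_mid      # already on the rising side
    return 0.5 * (T_lo + T_hi)


# Hypothetical coefficients, arbitrary resistance units.
a, b, c = 1e-4, 1e-9, 0.05
T_min = resistance_minimum(a, b, c)
print(T_min)
print(resistance(T_min, 1.0, a, b, c, 20.0))
```

Because $2aT^2 + 5bT^5$ increases monotonically with T for positive a and b, the bisection bracket contains exactly one root, so the minimum found is unique.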
8.10 LATTICE ENERGIES: MADELUNG, REPULSION, DISPERSION, DIPOLE–DIPOLE, AND OTHERS

The classical Coulomb binding energy (or electrostatic binding energy, or Madelung85 energy) $E_M$ [58] of an ionic lattice with formal charges $z_i|e|$ on ion i, $z_j|e|$ on ion j (i, j = 1, 2, . . ., $MN_A/Z$), and interionic distance $|\mathbf{r}_i - \mathbf{r}_j|$ is a doubly periodic sum over M ions per unit cell and ($N_A/Z$) unit cells (where $N_A$ is Avogadro’s number, and i > j):

$$E_M = \frac{1}{4\pi\varepsilon_0}\sum_{i=2}^{MN_A/Z}\sum_{j=1}^{i-1}\frac{z_i z_j |e|^2}{|\mathbf{r}_i - \mathbf{r}_j|} \tag{8.10.1}$$

Equation (8.10.1) implies two lattice sums (sums over the whole lattice), but by crystal symmetry it can be rewritten in terms of a summation over one unit cell and a lattice sum over periodic translation vectors $\mathbf{r}_d$ of the whole lattice:

$$E_M = \frac{N_A|e|^2}{4\pi Z\varepsilon_0}\sum_{m=1}^{M}\sum_{n=1}^{M} z_m z_n \sum_{d=1}^{N_A/Z}\frac{1}{|\mathbf{r}_m - \mathbf{r}_n - \mathbf{r}_d|} \tag{8.10.2}$$

80 Shin-ichiro Tomonaga (1906–1979). 81 Joaquin Mazdak Luttinger (1923–1997). 82 Satyendra Nath Bose (1894–1974). 83 Ugo Fano (1912–2001). 84 Jun Kondo (1930– ). 85 Erwin Madelung (1881–1972).
This lattice sum is only conditionally convergent. For a one-dimensional binary lattice with anion–cation distance a ($z_1 = 1$, $r_1 = 0$, $z_2 = -1$, $r_2 = a$, M = 2, Z = 1, $r_d = da$), the lattice sum can be evaluated analytically from the alternating infinite series:

$$E_M = \frac{N_A|e|^2}{4\pi\varepsilon_0}\,z_1 z_2\sum_{d=-\infty}^{d=\infty}|r_1 - r_2 - da|^{-1} = -\frac{2N_A|e|^2}{4\pi\varepsilon_0 a}\left(1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \ldots\right) = -\frac{N_A|e|^2}{4\pi\varepsilon_0 a}\,2\ln 2 = -1.386294361\,\frac{N_A|e|^2}{4\pi\varepsilon_0 a} \tag{8.10.3}$$

The Madelung constant α is defined as the ratio (α ≡ $E_M$/H) of the Madelung energy to the Coulomb attractive energy H between the nearest-neighbor anion and cation. Thus, for the one-dimensional lattice of equidistant cations and anions of Eq. (8.10.3), α = 1.386294361. For three-dimensional crystals the lattice summation converges only slowly, and in any brute-force computational scheme one must make sure that, as one sums outward from the “zeroth unit cell” at the center of the crystal, the ions included at any stage have as close to zero net charge as possible. For some crystals the Madelung constants α have been evaluated (Table 8.4), using component potentials obtained by summing certain infinite series. For binary crystals (1 cation and 1 anion per formula unit), assuming full formal valences on anions and cations, the Madelung constant α reflects the extra binding due to the three-dimensional packing of ions in the crystal lattice. For more complex crystals (e.g., CaF2 or Cu2O in Table 8.4), α ≡ $E_M$/H (where H is the Coulomb attraction of only one cation and its nearest-neighbor anion) does not reflect clearly this “extra binding,” because H ought to
Table 8.4 Madelung Constants α ≡ E_M/H for Ionic Lattices(a)

Structure                          Space Group   Structure Type   α                              Reference
Sodium chloride (halite) NaCl      Fm-3m         B1               1.7475645946331821906362119    59
Cesium chloride CsCl               Pm-3m         B2               1.76267477307099               60
Zinc blende (cubic) ZnS            F-43m         B3               1.63805505338879               60
Zinc oxide (wurtzite) ZnO          P6_3mc        B4               1.6413216273                   60
Calcium fluoride (fluorite) CaF2   Fm-3m         C1               11.6365752270767               60
Cu2O                               P4_232        C3               10.2594570330750               60
Niobium monoxide NbO               Pm-3m         ––               3.008539964                    61

(a) The Structure Types are from Table 7.11.
FIGURE 8.16 Experimental Born–Haber cycle for sodium chloride. The experimental binding energy ΔH = −8.0 eV is reasonably close to the Madelung energy $E_M$ = −8.923446 eV (after one adds to $E_M$ a relatively small ad hoc positive $E_{rep}$).
[Cycle, starting from the elements in their standard states at 298.15 K and 1 bar, Na(c), Cl2(g): Na(c), (1/2)Cl2(g) to Na(g), (1/2)Cl2(g), ΔH = 108 kJ/mol; to Na(g), Cl(g), ΔH = 122 kJ/mol; to Na+(g), e−(g), Cl(g), I_D = 496 kJ/mol; to Na+(g), Cl−(g), −A_A = −349 kJ/mol; to Na+Cl−(c), −ΔH = −788 kJ/mol = −8.0 eV/ion pair. Direct formation of Na+Cl−(c) from the elements: −ΔH = −411 kJ/mol.]
be redefined. Also, if the ions in the crystal are only partially ionized, $E_M$ is an ionic “overestimate.” For the crystals that are “fully” ionic (that is, those for which the formal valence charges $z_m$ are presumed to be correct, because they are crystals formed between cations of low Pauling electronegativity and anions of high Pauling86 electronegativity, e.g. Cs or NaCl), the Madelung energy is a large fraction of the total experimental lattice binding energy $E_T$: All one needs to match the experimental binding enthalpy obtainable from the Born–Haber87 cycle (Fig. 8.16) is to assume that the rest of it is an ad hoc interionic repulsion energy $E_{rep}$, for which alternate forms are either

$$E_{rep} = \frac{N_A}{2Z}\sum_{m=1}^{M}\sum_{n=1}^{M}\sum_{d=1}^{N_A/Z}\frac{C_m C_n}{|\mathbf{r}_m - \mathbf{r}_n - \mathbf{r}_d|^{12}} \tag{8.10.4}$$

borrowed from the Lennard-Jones88 “6–12” potential, or

$$E_{rep} = \frac{N_A}{2Z}\sum_{m=1}^{M}\sum_{n=1}^{M}\sum_{d=1}^{N_A/Z} D_m D_n \exp(-E\,|\mathbf{r}_m - \mathbf{r}_n - \mathbf{r}_d|^2) \tag{8.10.5}$$

using the Gilbert89 softness parameter E. The parameters $C_m$, $D_m$, and E are arbitrarily chosen to fit experiment: The physical basis for the repulsion energy is that if ions or molecules are brought too close together, the kinetic energy of the crystal will rise rapidly, destroying the crystal.
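Two of the numerical statements above are simple enough to verify directly. The sketch below is illustrative, with hypothetical function names: the first function sums the alternating series of Eq. (8.10.3) term by term, which shows how slowly a brute-force lattice sum converges, and the second adds up the steps of the Born–Haber cycle using the enthalpies shown in Fig. 8.16.

```python
import math


def madelung_1d(n_terms):
    """Partial sum 2 * (1 - 1/2 + 1/3 - ...) for the alternating 1D chain of Eq. (8.10.3)."""
    return 2.0 * sum((-1.0) ** (n + 1) / n for n in range(1, n_terms + 1))


def born_haber_formation_enthalpy(sublimation, dissociation, ionization,
                                  electron_attachment, lattice):
    """Sum of the cycle steps (kJ/mol); the last two are entered with their signs."""
    return sublimation + dissociation + ionization + electron_attachment + lattice


for n_terms in (10, 1000, 100000):
    # The error falls off only as 1/n_terms.
    print(n_terms, madelung_1d(n_terms), 2.0 * math.log(2.0))

# Values read from Fig. 8.16 (kJ/mol).
print(born_haber_formation_enthalpy(108, 122, 496, -349, -788))
```

The cycle closes on the formation enthalpy of NaCl(c) shown in the figure, and the partial sums of the chain approach 2 ln 2 with an error of roughly one over the number of terms.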
86 Linus Carl Pauling (1901–1994). 87 Fritz Haber (1868–1934). 88 Sir John Edward Lennard-Jones (1894–1954). 89 P. M. Gilbert ( ).
For organic ionic crystals, the overall charges on ions $z_i$ can be delocalized onto the atom positions, but these partial atom-in-molecule charges (derived from the diagonal terms of the electron density function, or from the LCAO-MO HOMO coefficients) are not quantum-mechanical observables: only the total molecular charge is. We discuss this again below. When these partial charges $z_i$ are used, then the intra-ionic or intramolecular charge–charge energies must be excluded from $E_M$; a trivially computed term $H^{(1)}$ (no lattice sum here!) is then subtracted for each ion in the zeroth cell, so that

$$E_M = -H^{(1)} + \frac{N_A|e|^2}{4\pi Z\varepsilon_0}\sum_{m=1}^{M}\sum_{n=1}^{M} z_m z_n\sum_{d=1}^{N_A/Z}\frac{1}{|\mathbf{r}_m - \mathbf{r}_n - \mathbf{r}_d|} \tag{8.10.6}$$

and

$$H^{(1)} = \frac{N_A|e|^2}{4\pi Z\varepsilon_0}\sum_{m=n+1}^{M}\ {\sum_{n=1}^{m}}'\ \frac{z_m z_n}{|\mathbf{r}_m - \mathbf{r}_n|} \tag{8.10.7}$$
Ewald Method. The best way to compute the lattice sum in Eq. (8.10.6) is the Ewald fast-convergence method [62], which uses an integral transform:

$$\frac{1}{r} = \frac{2}{\sqrt{\pi}}\int_{t=0}^{t=\infty} dt\,\exp(-r^2t^2) = \frac{2}{\sqrt{\pi}}\int_{t=0}^{t=\eta^{1/2}V^{-1/3}} dt\,\exp(-r^2t^2) + \frac{2}{\sqrt{\pi}}\int_{t=\eta^{1/2}V^{-1/3}}^{t=\infty} dt\,\exp(-r^2t^2) \tag{8.10.8}$$

and then splits up the integrand into two components, as shown: The first sum is Fourier-transformed, to yield a sum over the reciprocal lattice; the second sum remains over the direct lattice. The evaluation of a limit will also yield a trivial term. The Ewald method is best derived using potentials, not energies. One defines the self-excluded charge distribution:

$$\rho'(i, \mathbf{r}) \equiv \rho(\mathbf{r}) - z_i|e|\,\delta(\mathbf{r} - \mathbf{r}_i) \tag{8.10.9}$$
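where δ(r) is the Dirac delta function, and also the self-excluded electrical potential:

$$\Psi'_i(\mathbf{r}_i) \equiv \iiint dv(\mathbf{r}')\,\rho'(i, \mathbf{r}')\,|\mathbf{r}' - \mathbf{r}_i|^{-1} \tag{8.10.10}$$

where the integral ranges over the whole crystal. Ewald’s result for the potentials and for the Madelung energy is

$$\Psi'_m(\mathbf{r}_m) = A_m^{(1)} + D_m^{(1)} + R_m^{(1)} \tag{8.10.11}$$

$$A_m^{(1)} \equiv -\frac{2\sqrt{\eta}}{\sqrt{\pi}\,V^{1/3}}\,z_m \tag{8.10.12}$$

$$D_m^{(1)} \equiv \sum_n z_n\ {\sum_d}'\ \frac{1 - \mathrm{erf}\left(\eta^{1/2}V^{-1/3}|\mathbf{r}_m - \mathbf{r}_n + \mathbf{r}_d|\right)}{|\mathbf{r}_m - \mathbf{r}_n + \mathbf{r}_d|} \tag{8.10.13}$$

$$R_m^{(1)} \equiv \frac{1}{\pi V}\ {\sum_{\mathbf{h}}}''\ \frac{1}{h^2}\exp\left(-\frac{\pi^2 V^{2/3} h^2}{\eta}\right)\sum_n z_n \exp[2\pi i\,\mathbf{h}\cdot(\mathbf{r}_m - \mathbf{r}_n)] \tag{8.10.14}$$

$$E_M = \sum_{m=1}^{M} z_m \Psi'_m(\mathbf{r}_m) \tag{8.10.15}$$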
where erf(bx) is the error function integral:

$$\mathrm{erf}(bx) \equiv 2b\pi^{-1/2}\int_{t=0}^{t=x} dt\,\exp(-b^2t^2) \tag{(2.21.3)}$$

and erf(∞) is the improper integral:

$$\mathrm{erf}(\infty) \equiv 2b\pi^{-1/2}\int_{t=0}^{t=\infty} dt\,\exp(-b^2t^2) = 1 \tag{(2.21.2)}$$
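The three Ewald terms of Eqs. (8.10.12) to (8.10.14) can be tried out on rock salt. What follows is a minimal sketch under stated assumptions, not the procedure of ref. [62]: a conventional cubic NaCl cell of edge 1 (so V = 1) holding eight ions of charge ±1, with the splitting parameter written as a single number kappa standing for $\eta^{1/2}V^{-1/3}$, hypothetical function names, and arbitrarily chosen cutoffs for the two sums. The Madelung constant is referred to the nearest-neighbor distance, which is half the cell edge.

```python
import math
from itertools import product

# Conventional cubic NaCl cell, edge length 1: (charge, fractional position).
NA_SITES = [(0.0, 0.0, 0.0), (0.0, 0.5, 0.5), (0.5, 0.0, 0.5), (0.5, 0.5, 0.0)]
CL_SITES = [(0.5, 0.0, 0.0), (0.0, 0.5, 0.0), (0.0, 0.0, 0.5), (0.5, 0.5, 0.5)]
IONS = [(1.0, s) for s in NA_SITES] + [(-1.0, s) for s in CL_SITES]


def ewald_potential(target, ions, kappa=3.0, n_real=2, n_recip=6):
    """Self-excluded potential at ions[target] (units of |e| per cell edge).

    kappa plays the role of eta**(1/2) * V**(-1/3); here V = 1.
    """
    z_t, r_t = ions[target]
    # Direct-lattice term, Eq. (8.10.13): 1 - erf(x) = erfc(x).
    direct = 0.0
    for idx, (z, r) in enumerate(ions):
        for cell in product(range(-n_real, n_real + 1), repeat=3):
            if idx == target and cell == (0, 0, 0):
                continue  # skip the ion's interaction with itself
            dist = math.sqrt(sum((r_t[c] - r[c] + cell[c]) ** 2 for c in range(3)))
            direct += z * math.erfc(kappa * dist) / dist
    # Reciprocal-lattice term, Eq. (8.10.14), h = 0 excluded; only the real part survives.
    recip = 0.0
    for h in product(range(-n_recip, n_recip + 1), repeat=3):
        h2 = h[0] ** 2 + h[1] ** 2 + h[2] ** 2
        if h2 == 0:
            continue
        structure = sum(
            z * math.cos(2.0 * math.pi * sum(h[c] * (r_t[c] - r[c]) for c in range(3)))
            for z, r in ions)
        recip += math.exp(-math.pi ** 2 * h2 / kappa ** 2) / h2 * structure
    recip /= math.pi
    # Self term, Eq. (8.10.12).
    self_term = -2.0 * kappa * z_t / math.sqrt(math.pi)
    return direct + recip + self_term


def madelung_constant_nacl(kappa=3.0):
    """alpha referred to the nearest-neighbor distance (half the cell edge)."""
    return -ewald_potential(0, IONS, kappa=kappa) * 0.5


print(madelung_constant_nacl())
```

A useful check on any such implementation is that the result should not depend on the splitting parameter, as long as both sums are carried far enough.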
In Eq. (8.10.14) the reciprocal lattice vector is defined by $\mathbf{h} \equiv h\mathbf{a}^* + k\mathbf{b}^* + l\mathbf{c}^*$ (crystallographer’s convention), and V is the unit cell volume. The single prime on the direct lattice sum over d in Eq. (8.10.13) means that the term d = 0 is avoided whenever m = n; the double prime on the reciprocal lattice sum over h in Eq. (8.10.14) means that the term h = 0 is avoided.

The Ewald series can be proved as follows. Both the charge density and the electrostatic potential are periodic functions and can therefore be expanded in Fourier series. The total potential is

$$\Psi(\mathbf{r}) \equiv \iiint dv(\mathbf{r}')\,\rho(\mathbf{r}')\,|\mathbf{r} - \mathbf{r}'|^{-1} \tag{8.10.16}$$

$$= \iiint dv(\mathbf{r}')\sum_{i=1}^{MN_A/Z} z_i|e|\,\delta(\mathbf{r}' - \mathbf{r}_i)\,|\mathbf{r} - \mathbf{r}'|^{-1} \tag{8.10.17}$$

$$= \sum_{i=1}^{MN_A/Z} z_i|e|\,|\mathbf{r} - \mathbf{r}_i|^{-1} \tag{8.10.18}$$
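By using the integral transform of Eq. (8.10.8), and rearranging the order of summation and integration, Eq. (8.10.17) can be rewritten as a sum of two potentials:

$$\Psi(\mathbf{r}) = \Psi_1(\mathbf{r}) + \Psi_2(\mathbf{r}) \tag{8.10.19}$$

where

$$\Psi_1(\mathbf{r}) = \int_{t=0}^{t=\eta^{1/2}V^{-1/3}} dt \sum_{i=1}^{i=MN_A/Z} 2 z_i |e| \pi^{-1/2} \exp\left(-|\mathbf{r} - \mathbf{r}_i|^{2} t^{2}\right) \tag{8.10.20}$$

$$\Psi_2(\mathbf{r}) = \int_{t=\eta^{1/2}V^{-1/3}}^{t=\infty} dt \sum_{i=1}^{i=MN_A/Z} 2 z_i |e| \pi^{-1/2} \exp\left(-|\mathbf{r} - \mathbf{r}_i|^{2} t^{2}\right) \tag{8.10.21}$$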
Using Eqs. ((2.21.2)) and ((2.21.3)) in Eq. (8.10.21), one gets for Ψ2(r) in real space

$$\Psi_2(\mathbf{r}) = \sum_{i=1}^{i=MN_A/Z} z_i |e|\, |\mathbf{r} - \mathbf{r}_i|^{-1} \left[1 - \mathrm{erf}\left(\eta^{1/2} V^{-1/3} |\mathbf{r} - \mathbf{r}_i|\right)\right] \tag{8.10.22}$$

which, evaluated at atom position r_m, finally becomes (apart from the factor |e|) the direct-lattice term of Eq. (8.10.13):

$$\Psi_2(\mathbf{r}_m) = \sum_{n=1}^{n=M} z_n |e| \sum_{d}{}^{\prime}\, |\mathbf{r}_m - \mathbf{r}_n + \mathbf{r}_d|^{-1} \left[1 - \mathrm{erf}\left(\eta^{1/2} V^{-1/3} |\mathbf{r}_m - \mathbf{r}_n + \mathbf{r}_d|\right)\right] = |e|\, D_m^{(1)}$$
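In contrast, Ψ1(r) of Eq. (8.10.20) is evaluated in reciprocal space, using the theta-function transformation. Indeed, part of Ψ1(r) is a periodic function, which is expanded in a Fourier series:

$$\sum_{i=1}^{i=MN_A/Z} 2 z_i |e| \pi^{-1/2} \exp\left(-|\mathbf{r} - \mathbf{r}_i|^{2} t^{2}\right) = \sum_{h} F(\mathbf{h}) \exp(2\pi i\, \mathbf{h}\cdot\mathbf{r}) \tag{8.10.23}$$

whose Fourier coefficients are given by

$$F(\mathbf{h}) = V^{-1} \iiint_{V} dv(\mathbf{r})\, 2|e| \pi^{-1/2} \sum_{i=1}^{i=MN_A/Z} z_i \exp\left(-|\mathbf{r} - \mathbf{r}_i|^{2} t^{2} - 2\pi i\, \mathbf{h}\cdot\mathbf{r}\right) \tag{8.10.24}$$

where the integral is only over the volume of the zeroth unit cell; multiplying and dividing the integrand by exp(2πi h·r_j), then interchanging summation and integration:

$$F(\mathbf{h}) = 2|e| \pi^{-1/2} Z V^{-1} N_A^{-1} \sum_{j=1}^{j=MN_A/Z} z_j \exp(-2\pi i\, \mathbf{h}\cdot\mathbf{r}_j) \iiint_{V} dv(\mathbf{r} - \mathbf{r}_j) \exp\left[-|\mathbf{r} - \mathbf{r}_j|^{2} t^{2} - 2\pi i\, \mathbf{h}\cdot(\mathbf{r} - \mathbf{r}_j)\right]$$

Integrating over the whole crystal in spherical polar coordinates with h as the polar axis yields

$$\iiint_{VN_A/Z} dv(\mathbf{r} - \mathbf{r}_j) \exp\left[-|\mathbf{r} - \mathbf{r}_j|^{2} t^{2} - 2\pi i\, \mathbf{h}\cdot(\mathbf{r} - \mathbf{r}_j)\right] = 2\pi \int_{q=0}^{q=\infty} dq\, q^{2} \int_{\theta=0}^{\theta=\pi} d\theta\, \sin\theta\, \exp\left(-t^{2}q^{2} - 2\pi i |\mathbf{h}| q \cos\theta\right) = \pi^{3/2} t^{-3} \exp\left(-\pi^{2} h^{2} t^{-2}\right) \tag{8.10.25}$$

Since the remaining sum over j has the periodicity of ρ(r), we obtain

$$F(\mathbf{h}) = 2\pi |e| V^{-1} t^{-3} \exp\left(-\pi^{2} h^{2} t^{-2}\right) \sum_{n=1}^{n=M} z_n \exp(-2\pi i\, \mathbf{h}\cdot\mathbf{r}_n)$$

and thus the Fourier expansion becomes

$$\Psi_1(\mathbf{r}) = 2\pi |e| V^{-1} \sum_{h}{}^{\prime} \exp(2\pi i\, \mathbf{h}\cdot\mathbf{r}) \sum_{n=1}^{n=M} z_n \exp(-2\pi i\, \mathbf{h}\cdot\mathbf{r}_n) \int_{t=0}^{t=\eta^{1/2}V^{-1/3}} dt\, t^{-3} \exp\left(-\pi^{2} h^{2} t^{-2}\right) \tag{8.10.26}$$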
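Carrying out the integration over t yields

$$\Psi_1(\mathbf{r}) = \pi^{-1} |e| V^{-1} \sum_{h}{}^{\prime}\, h^{-2} \exp\left(-\frac{\pi^{2} V^{2/3} h^{2}}{\eta}\right) \sum_{n=1}^{n=M} z_n \exp\left[2\pi i\, \mathbf{h}\cdot(\mathbf{r} - \mathbf{r}_n)\right] \tag{8.10.27}$$

which, evaluated at r = r_m and apart from the factor |e|, is R_m^(1) of Eq. (8.10.14). To avoid the singularity at r_m, a limit must be secured:

$$\lim_{\mathbf{r}\to\mathbf{r}_m} \left\{ -|\mathbf{r} - \mathbf{r}_m|^{-1} + |\mathbf{r} - \mathbf{r}_m|^{-1} \left[1 - \mathrm{erf}\left(\eta^{1/2} V^{-1/3} |\mathbf{r} - \mathbf{r}_m|\right)\right] \right\} = \lim_{q\to 0} \left\{ -2\pi^{-1/2} q^{-1} \int_{t=0}^{t=\eta^{1/2} q V^{-1/3}} dt\, \exp(-t^{2}) \right\} = -2\pi^{-1/2} \eta^{1/2} V^{-1/3} \tag{8.10.28}$$

which, after multiplication by z_m (and again apart from the factor |e|), is A_m^(1) of Eq. (8.10.12). Summing over the self-excluded potentials in the zeroth unit cell, the Ewald result for the energy is [now including the intramolecular correction H^(1) of Eq. (8.10.7)]:

$$E_M = \left(N_A |e|^{2} / 4\pi\varepsilon_0 Z\right) \left[A^{(1)} + D^{(1)} + R^{(1)} - H^{(1)}\right] \tag{8.10.29}$$

$$A^{(1)} \equiv -\frac{\sqrt{\eta}}{\sqrt{\pi}\, V^{1/3}} \sum_{m} z_m^{2} \tag{8.10.30}$$

$$D^{(1)} \equiv \frac{1}{2} \sum_{m} z_m \sum_{n} z_n \sum_{d}{}^{\prime}\, \frac{1 - \mathrm{erf}\left(\sqrt{\eta}\, |\mathbf{r}_m - \mathbf{r}_n + \mathbf{r}_d| / V^{1/3}\right)}{|\mathbf{r}_m - \mathbf{r}_n + \mathbf{r}_d|} \tag{8.10.31}$$

$$R^{(1)} \equiv \frac{1}{2\pi V} \sum_{h}{}^{\prime}\, \frac{1}{h^{2}} \exp\left(-\frac{\pi^{2} V^{2/3} h^{2}}{\eta}\right) \sum_{m} z_m \sum_{n} z_n \exp\left[2\pi i\, \mathbf{h}\cdot(\mathbf{r}_m - \mathbf{r}_n)\right] \tag{8.10.32}$$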
The value of the Ewald sum is independent of η, but its convergence seems to be optimal in both direct and reciprocal spaces if η ≈ V^(1/3). Using 64-bit binary precision in computer programs, E_M and α values precise to eight decimal figures can be obtained. The above Ewald series requires that the sum of charges in the unit cell be zero: $\sum_{i=1}^{i=M} z_i = 0$. In case this sum is not zero, but is balanced by a uniform background of charge (Fermi sea in a metallic conductor), Fuchs⁹⁰ derived a simple correction term [63,64].
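As an illustration of the three-term Ewald sum A + D + R, the following sketch evaluates it for the rock-salt structure in reduced units (cubic cell edge a = 1, so V = 1, and energies in units of e²/4πε0a). It is a minimal sketch under stated assumptions, not the author's program: the function name `ewald_energy`, the choice η = 4, and the truncation limits `n_direct` and `n_recip` are all hypothetical, and the intramolecular correction H^(1) is not needed for a simple ionic lattice.

```python
import math

# Rock-salt (NaCl) cubic cell with edge a = 1, so V = 1; eight ions per cell.
NA_SITES = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.0), (0.5, 0.0, 0.5), (0.0, 0.5, 0.5)]
CL_SITES = [(0.5, 0.0, 0.0), (0.0, 0.5, 0.0), (0.0, 0.0, 0.5), (0.5, 0.5, 0.5)]
SITES = NA_SITES + CL_SITES
CHARGES = [1.0] * 4 + [-1.0] * 4

def ewald_energy(sites, charges, eta=4.0, n_direct=2, n_recip=4, volume=1.0):
    """A + D + R per unit cell for a cubic cell of edge 1 (reduced units)."""
    kappa = math.sqrt(eta) / volume ** (1.0 / 3.0)  # eta^(1/2) V^(-1/3)
    ions = list(zip(sites, charges))

    # A: self term
    a_term = -kappa / math.sqrt(math.pi) * sum(z * z for z in charges)

    # D: direct-lattice sum; the term d = 0 is skipped when m = n
    d_term = 0.0
    shifts = range(-n_direct, n_direct + 1)
    for m, (rm, zm) in enumerate(ions):
        for n, (rn, zn) in enumerate(ions):
            for dx in shifts:
                for dy in shifts:
                    for dz in shifts:
                        if m == n and dx == 0 and dy == 0 and dz == 0:
                            continue
                        dist = math.sqrt((rm[0] - rn[0] + dx) ** 2
                                         + (rm[1] - rn[1] + dy) ** 2
                                         + (rm[2] - rn[2] + dz) ** 2)
                        d_term += 0.5 * zm * zn * math.erfc(kappa * dist) / dist

    # R: reciprocal-lattice sum; the term h = 0 is skipped
    r_term = 0.0
    indices = range(-n_recip, n_recip + 1)
    for h in indices:
        for k in indices:
            for l in indices:
                h2 = float(h * h + k * k + l * l)  # |h|^2 for a cubic cell of edge 1
                if h2 == 0.0:
                    continue
                phases = [2.0 * math.pi * (h * r[0] + k * r[1] + l * r[2]) for r in sites]
                re = sum(z * math.cos(p) for z, p in zip(charges, phases))
                im = sum(z * math.sin(p) for z, p in zip(charges, phases))
                damping = math.exp(-math.pi ** 2 * volume ** (2.0 / 3.0) * h2 / eta)
                r_term += damping / h2 * (re * re + im * im)
    r_term /= 2.0 * math.pi * volume

    return a_term + d_term + r_term

energy_per_cell = ewald_energy(SITES, CHARGES)
# Four NaCl pairs per cell; nearest-neighbour distance a/2 = 0.5.
madelung_constant = -energy_per_cell * 0.5 / 4.0
print(round(madelung_constant, 6))
```

Repeating the call with a different η (for example `eta=9.0` with a larger `n_recip`) should return the same energy to many figures, which is a convenient check that both sums have converged.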
90 Klaus Emil Julius Fuchs (1911–1988).
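PROBLEM 8.10.1. Derive Shockley’s⁹¹ proof of the Ewald formulas [65]. At each lattice point, superpose a fictitious spherically symmetric Gaussian charge distribution, and subtract it out again:

$$\rho'(m;\mathbf{r}) \equiv \rho_1(m;\mathbf{r}) + \rho_2(\mathbf{r}) + \rho_3(m;\mathbf{r}) \tag{8.10.33}$$

where

$$\rho_1(m;\mathbf{r}) \equiv z_m |e| \left\{ -V^{-1} (\eta/\pi)^{3/2} \exp\left(-\eta V^{-2/3} |\mathbf{r} - \mathbf{r}_m|^{2}\right) \right\} \tag{8.10.34}$$

$$\rho_2(\mathbf{r}) \equiv \sum_{j=1}^{j=MN/Z} z_j |e| \left\{ +V^{-1} (\eta/\pi)^{3/2} \exp\left(-\eta V^{-2/3} |\mathbf{r} - \mathbf{r}_j|^{2}\right) \right\} \tag{8.10.35}$$

$$\rho_3(m;\mathbf{r}) \equiv \sum_{j=1,\, j \neq m}^{j=MN/Z} z_j |e| \left\{ \delta(\mathbf{r} - \mathbf{r}_j) - V^{-1} (\eta/\pi)^{3/2} \exp\left(-\eta V^{-2/3} |\mathbf{r} - \mathbf{r}_j|^{2}\right) \right\} \tag{8.10.36}$$

Correspondingly, define three components of the self-excluded potential as

$$\Psi_m(\mathbf{r}_m) \equiv \Psi_{m1}(\mathbf{r}_m) + \Psi_{m2}(\mathbf{r}_m) + \Psi_{m3}(\mathbf{r}_m) \tag{8.10.37}$$

$$\Psi_{m1}(\mathbf{r}_m) \equiv \iiint dv(\mathbf{r}' - \mathbf{r}_m)\, \rho_1(m;\mathbf{r}' - \mathbf{r}_m)\, |\mathbf{r}' - \mathbf{r}_m|^{-1} \tag{8.10.38}$$

$$\Psi_{m2}(\mathbf{r}_m) \equiv \iiint dv(\mathbf{r}')\, \rho_2(\mathbf{r}')\, |\mathbf{r}' - \mathbf{r}_m|^{-1} \tag{8.10.39}$$

$$\Psi_{m3}(\mathbf{r}_m) \equiv \iiint dv(\mathbf{r}')\, \rho_3(m;\mathbf{r}')\, |\mathbf{r}' - \mathbf{r}_m|^{-1} \tag{8.10.40}$$

Prove the Ewald formulas, Eqs. (8.10.12)–(8.10.14).

PROBLEM 8.10.2. Derive Bertaut’s⁹² proof of Ewald’s formula [66]. It starts with a theorem of electrostatics: If one replaces the system of point charges

$$\rho(\mathbf{r}) = \sum_{i=1}^{i=MN_A/Z} z_i |e|\, \delta(\mathbf{r} - \mathbf{r}_i)$$

whose periodic Fourier transform is P(h), by a system of spatially diffuse point charges θ(r − r_i):

$$\rho'(\mathbf{r}) = \sum_{i=1}^{i=MN_A/Z} z_i |e|\, \theta(\mathbf{r} - \mathbf{r}_i) \tag{8.10.41}$$

where ∭ dv(r) θ(r) = 1, and the periodic Fourier transform of θ(r) is Θ(h):

$$\Theta(\mathbf{h}) = \iiint dv(\mathbf{r})\, \theta(\mathbf{r}) \exp(2\pi i\, \mathbf{h}\cdot\mathbf{r})$$

then the total energy E_M does not change, provided that (i) the θ(r − r_i) are spherically symmetric, and (ii) the functions θ(r − r_i) do not overlap each other. Bertaut chooses:

$$\theta(\mathbf{r}) \equiv V^{-1}\, 2^{3/2} \eta^{3/2} \pi^{-3/2} \exp\left(-2\eta V^{-2/3} |\mathbf{r}|^{2}\right) \tag{8.10.42}$$

so its Fourier transform is

$$\Theta(\mathbf{h}) = \exp\left(-4\pi^{2} h^{2} V^{2/3} / 8\eta\right) \tag{8.10.43}$$

Next, Bertaut repeatedly uses the convolution theorem. He defines a “grand total electrostatic energy”:

$$E_{TM} \equiv (1/2) \iiint dv(\mathbf{r})\, \rho(\mathbf{r})\, |\mathbf{r}|^{-1} \iiint dv(\mathbf{r}')\, \rho(\mathbf{r} + \mathbf{r}') \tag{8.10.44}$$

which is infinite and also contains an infinite self-energy term E_self:

$$E_M = E_{TM} - E_{self} \tag{8.10.45}$$

$$E_{self} = (1/2)\, N_A |e|^{2} Z^{-1} \sum_{n=1}^{n=M} z_n^{2} \iiint dv(\mathbf{r}')\, \delta(\mathbf{r}')\, |\mathbf{r}'|^{-1} \tag{8.10.46}$$

Getting a finite E_M from two infinite energies E_TM and E_self seems impractical, but using the equivalent charge distribution θ(r), Eq. (8.10.42), will create two finite energies, E'_TM and E'_self, whose difference should be the same as E_TM − E_self. In particular, Bertaut defines

$$E'_{TM} \equiv (1/2) \iiint dv(\mathbf{r})\, \rho'(\mathbf{r})\, |\mathbf{r}|^{-1} \iiint dv(\mathbf{r}')\, \rho'(\mathbf{r} + \mathbf{r}')$$

where ρ'(r) is defined as the convolution of ρ(r) and θ(r):

$$\rho'(\mathbf{r}') \equiv \iiint dv(\mathbf{r})\, \rho(\mathbf{r})\, \theta(\mathbf{r} - \mathbf{r}')$$

whose periodic Fourier transform is P(h)Θ(h). Note also that

$$\iiint dv(\mathbf{r})\, \rho'(\mathbf{r})\, \rho'(\mathbf{r} + \mathbf{r}') = |e|^{2} \sum_{i=1}^{i=MN_A/Z} \sum_{j=1\,(i \neq j)}^{j=MN_A/Z} z_i z_j\, p(\mathbf{r}' + \mathbf{r}_i - \mathbf{r}_j) + |e|^{2} (N/Z) \sum_{m=1}^{m=M} z_m^{2}\, p(\mathbf{r}')$$

91 William Bradford Shockley (1910–1989).
92 Erwin Felix-Lewy Bertaut (1913–2003).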
For the two-dimensional “slice” an Ewald-type sum E_M^(12) was derived [67]: All the atoms are in the ab plane, the nth ion is at the origin, and the three-dimensional volume V reduces to the two-dimensional volume V = ab sin γ; the direct space terms A^(12) and D^(12) are the same as in the three-dimensional case, but R^(12) is different:

$$E_M^{(12)} = \left(e^{2}/4\pi\varepsilon_0\right) \left(N_A/2Z\right) \left[A^{(12)} + D^{(12)} + R^{(12)}\right] \tag{(8.10.29)}$$

$$A^{(12)} = -2\pi^{-1/2} V^{-1/3} \eta^{1/2} \sum_{m} z_m^{2} \tag{(8.10.30)}$$

$$D^{(12)} \equiv \sum_{m} z_m \sum_{n} z_n \sum_{d}{}^{\prime}\, \frac{1 - \mathrm{erf}\left(\sqrt{\eta}\, |\mathbf{r}_m - \mathbf{r}_n + \mathbf{r}_d| / V^{1/3}\right)}{|\mathbf{r}_m - \mathbf{r}_n + \mathbf{r}_d|} \tag{(8.10.31)}$$

and R^(12) is:
2 0 13 2 2=3 2 X X00 1 X 2 p V h 41 erf@ A5 1=2 2=3 z z exp½2pih ðrm rn Þ m hh m n n Z p V 0 1 2 2=3 2 X X00 X 4 p V h @ A exp z z exp½2pih ðrm rn Þ þ 3=2 2=3 m h m n n Z p V
1 2 ð pffiffiffi p u¼1 Z 2Z m 2p exp u @ A u du T 2 þ u2 2p! V 1=3
Xp¼1 ð1Þp p¼1
0
u¼0
(8.10.47)

where T² ≡ π²V^(2/3)η^(−1)(h²a*² + k²b*² + 2hka*b* sin γ) and Z_m is the (angstrom or nanometer) distance of the mth ion from a plane parallel to the slice and going through the nth ion.

For the one-dimensional “chain” an Ewald-type sum E_M^(11) was also derived [67]: All the atoms are parallel to the a axis, atom m is at the origin, the other atoms n are at Y_n and Z_n, and the three-dimensional volume V becomes V = a. The direct space term A^(11) does not change, but D^(11) and R^(11) are different:

$$E_M^{(11)} = \left(N_A e^{2} / 8\pi\varepsilon_0 Z\right) \left[A^{(11)} + D^{(11)} + R^{(11)}\right]$$

$$A^{(11)} = -2\pi^{-1/2} V^{-1/3} \eta^{1/2} \sum_{m} z_m^{2}$$

with R^(11) and D^(11) given by:
0 1 2 2=3 1 X 2 X00 p V a*2 A z expð2pihxm Þexp@ h m m a Z Xp¼1 ð1Þp Zp
P Y2m þ Z2m p¼1 ð2pÞ!a2p=3 t¼1 ð
ðð8:10:29ÞÞ ðð8:10:30ÞÞ
ð8:10:48Þ
tp Y2m þ Z2m þ t expðtÞ dt
t¼0
Dð11Þ
1 X 2 X00 2 2=3 *2 zm exp ð 2pihx ÞEi p V a =Z m h a m
ð8:10:49Þ
where Ei(x) is the exponential integral:

$$\mathrm{Ei}(x) \equiv \int_{t=x}^{t=\infty} dt\, t^{-1} \exp(-t) \tag{8.10.50}$$

and the last integral in Eq. (8.10.49) can be obtained in terms of series and exponential integrals.

The Ewald series for the three-dimensional crystal can also be differentiated. The first derivative yields expressions for the Madelung electric field F_M (due to local charges). The second derivative yields the Madelung field gradient, or, equivalently, the internal or dipolar or Lorentz field F_D (due to local dipoles) [68–71]. This second derivative also generates the dimensionless 3 × 3 Lorentz factor tensor L with its nine components L_αβ:

$$\mathbf{F}_D = \mathbf{E} + \varepsilon_0^{-1}\, \mathbf{L}\cdot\mathbf{P} \quad (\mathrm{SI}); \qquad \mathbf{F}_D = \mathbf{E} + 4\pi\, \mathbf{L}\cdot\mathbf{P} \quad (\mathrm{cgs}) \tag{8.10.51}$$
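where P is the polarization of the lattice (the vector sum of all the local dipoles in the lattice). In particular, consider

$$F_\alpha(\mathbf{r}) = E_\alpha + \sum_{n=1}^{n=\infty} \left(\partial^{2}/\partial r_\alpha \partial r_\beta\right) |\mathbf{r}_m - \mathbf{r}|^{-1}\, \mu_\beta \tag{8.10.52}$$

which can be used to define a 3 × 3 Lorentz-factor tensor [68]:

$$L_{\alpha\beta}(\mathbf{r}_m) = \sum_{n=1}^{n=M} \left(\partial^{2}/\partial r_\alpha \partial r_\beta\right) |\mathbf{r}_m - \mathbf{r}_n|^{-1} \tag{8.10.53}$$

whose nine terms can be evaluated in an Ewald-type sum as

$$L_{\alpha\beta}(\mathbf{r}_m) \equiv A_{\alpha\beta}(\mathbf{r}_m) + D_{\alpha\beta}(\mathbf{r}_m) + R_{\alpha\beta}(\mathbf{r}_m) \tag{8.10.54}$$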
Aab ðrm Þ ¼ 31 Z1 Z3=2 p3=2 dab
ð8:10:55Þ
( ¼m X 3=2 1 nX 3Ra;mn Rb;mn dab R2mn 0 Z Dab ðrm Þ ¼ ½1 erfðRmn Þ d 4p Z n¼1 R5mn 2 3 ) ð8:10:56Þ 2
3R R d R R R a;mn b;mn ab mn a;mn b;mn 5 þ4 exp R2mn þ R4mn 2R2mn
Rab ðrm Þ ¼
X
n¼M 1 X00 Qa Qb 2 exp Q cos½2ph ðrm rn Þ 2 h Z Q n¼1
ð8:10:57Þ
where two dimensionless vectors were defined: R_mn ≡ η^(1/2)V^(−1/3)(r_m − r_n) in direct space and Q ≡ πV^(1/3)η^(−1/2)h in reciprocal space. The calculated Lorentz factor components for the two unique molecular centers in the naphthalene crystal are L11(1,1) = 0.187573, L12(1,1) = 0, L22(1,1) = 0.626530, L13(1,1) = 0.019127, L23(1,1) = 0, L33(1,1) = 0.185895, L11(1,2) = 0.947176, L12(1,2) = 0, L22(1,2) = 0.326888, L13(1,2) = 0.018085, L23(1,2) = 0, L33(1,2) = 0.274064 ([72]; crystal structure from Cruickshank [73]).

Tensor forms of Clausius⁹³–Mossotti⁹⁴ and Lorentz–Lorenz⁹⁵ equations. If there are no permanent local dipoles, but only induced dipoles because atoms or molecules have a tensor polarizability α, then

$$\mathbf{P} = N \boldsymbol{\alpha}\cdot\mathbf{F}_D \tag{8.10.58}$$
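where N is the number of molecules per unit volume. The electrical permittivity (dielectric constant) tensor ε can be defined by

$$\boldsymbol{\varepsilon} \equiv (\varepsilon_0 \mathbf{E} + \mathbf{P}) / (\varepsilon_0 \mathbf{E}) \quad (\mathrm{SI}); \qquad \boldsymbol{\varepsilon} \equiv (\mathbf{E} + 4\pi\mathbf{P}) / \mathbf{E} \quad (\mathrm{cgs}) \tag{8.10.59}$$

Then the dipolar field is given by

$$\mathbf{F}_D \equiv \frac{1}{3\varepsilon_0} \left[(\varepsilon + 2)/(\varepsilon - 1)\right] \mathbf{P} \quad (\mathrm{SI}); \qquad \mathbf{F}_D = (4\pi/3) \left[(\varepsilon + 2)/(\varepsilon - 1)\right] \mathbf{P} \quad (\mathrm{cgs}) \tag{8.10.60}$$

Let the polarization of a crystal be represented by

$$\mathbf{P} = \sum_{i} d_i \left(\boldsymbol{\mu}_i^{0} + \boldsymbol{\mu}_i^{ind}\right) \quad (\mathrm{SI\ and\ cgs}) \tag{8.10.61}$$

$$\mathbf{P} = \sum_{i} d_i \left(\boldsymbol{\mu}_i^{0} + \alpha_{ii} \mathbf{F}_{Di}\right) \quad (\mathrm{SI\ and\ cgs}) \tag{8.10.62}$$

where d_i is the local density of atoms of type i in the crystal, μ_i^0 is the local permanent static electric dipole moment, and μ_i^ind is the dipole moment induced by the field F_Di. Using this information (with no permanent dipoles, and combining P = Σ_i d_i α_ii F_D with Eq. (8.10.60)), the isotropic scalar Clausius–Mossotti equation becomes

$$(\varepsilon - 1)/(\varepsilon + 2) = \frac{1}{3\varepsilon_0} \sum_{i} d_i \alpha_{ii} \quad (\mathrm{SI}); \qquad (\varepsilon - 1)/(\varepsilon + 2) = (4\pi/3) \sum_{i} d_i \alpha_{ii} \quad (\mathrm{cgs}) \tag{8.10.63}$$

Merging the previous equations, the anisotropic Clausius–Mossotti equation becomes

$$\boldsymbol{\alpha}\cdot\left(\mathbf{I} + \mathbf{L}\cdot\boldsymbol{\varepsilon}\cdot\mathbf{I} - \mathbf{L}\cdot\mathbf{I}\right) = (\varepsilon_0/N) \left(\boldsymbol{\varepsilon}\cdot\mathbf{I} - \mathbf{I}\right) \quad (\mathrm{SI}); \qquad \boldsymbol{\alpha}\cdot\left(\mathbf{I} + \mathbf{L}\cdot\boldsymbol{\varepsilon}\cdot\mathbf{I} - \mathbf{L}\cdot\mathbf{I}\right) = \frac{1}{4\pi N} \left(\boldsymbol{\varepsilon}\cdot\mathbf{I} - \mathbf{I}\right) \quad (\mathrm{cgs}) \tag{8.10.64}$$

where I is the unit vector along E.

93 Rudolf Julius Emanuel Clausius = Rudolf Gottlieb (1822–1888).
94 Ottaviano Fabrizio Mossotti (1791–1863).
95 Ludvig Lorenz (1829–1891).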
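For a single isotropic species, the standard scalar Clausius–Mossotti relation in SI units, (ε − 1)/(ε + 2) = Nα/3ε0, can be inverted for ε. The sketch below assumes that standard form; the function names and the numerical values of N and α are hypothetical and serve only as an illustration.

```python
EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m

def permittivity_from_polarizability(number_density, alpha):
    """Solve (eps - 1)/(eps + 2) = N*alpha/(3*eps0) for the relative permittivity."""
    x = number_density * alpha / (3.0 * EPS0)
    if x >= 1.0:
        raise ValueError("N*alpha/(3*eps0) must be below 1 for a finite permittivity")
    return (1.0 + 2.0 * x) / (1.0 - x)

def polarizability_from_permittivity(number_density, eps):
    """Inverse relation: alpha = 3*eps0*(eps - 1) / ((eps + 2)*N)."""
    return 3.0 * EPS0 * (eps - 1.0) / ((eps + 2.0) * number_density)

# Hypothetical inputs: N in m^-3, alpha in C m^2 V^-1.
N_DENSITY, ALPHA = 1.0e28, 1.0e-40
eps_r = permittivity_from_polarizability(N_DENSITY, ALPHA)
print(eps_r)
print(polarizability_from_permittivity(N_DENSITY, eps_r))  # recovers ALPHA
```

The divergence of ε as Nα/3ε0 approaches 1 is why the helper refuses inputs at or beyond that limit.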
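If I is oriented along the principal-axis system of the optical indicatrix, then for each of the three components of the refractive index (n1, n2, n3) the anisotropic Lorentz–Lorenz equation is

$$\alpha_{ii} + \left(n_i^{2} - 1\right) \alpha_{i\alpha} L_{\alpha i} = (\varepsilon_0/N) \left(n_i^{2} - 1\right) \quad (\mathrm{SI}); \qquad \alpha_{ii} + \left(n_i^{2} - 1\right) \alpha_{i\alpha} L_{\alpha i} = (4\pi N)^{-1} \left(n_i^{2} - 1\right) \quad (\mathrm{cgs}) \tag{8.10.65}$$

A general method for lattice sums of potentials r^(−n), n > 3, was suggested [74], described obliquely [75], and implemented for the general dispersion sum (n = 6) [76]:

$$E_{disp} = (N_A/2Z) \sum_{i} \sum_{j \neq i} B_i B_j\, r_{ij}^{-6} = (N_A/2Z) \left[A^{(6)} + D^{(6)} + R^{(6)}\right]$$

where the terms A^(6), D^(6), and R^(6) are: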
Dð6Þ
Rð6Þ
3=2 3=2
p
Z 6V 2
X
B i i
2
ð8:10:66Þ
3
Z 12V 2
X
B2 i i
jrm rn þ rd j2 X X X0 exp Z V 2=3 Bm Bn d jrm rn þ rd j6 m n " # jrm rn þ rd j2 Z2 jrm rn þ rd j4 1þZ þ V 2=3 V 4=3
ð8:10:67Þ
ð8:10:68Þ
82 0 13 < pffiffiffi 1=3 p9=2 X0 3 X pV jhj A5 h j m Bm expð2pih rm Þj2 4 pð1 erfÞ@ h : Z 3V 0 1) 2 3 pffiffiffi 2 3=2 2 2=3 Z Z ð8:10:69Þ 5exp@ p Vpffiffiffijhj AÞ þ4 Z 2p3 Vjhj3 pV 1=3 jhj
Although the lattice sum of the Coulomb potential is only conditionally convergent (that is, it requires the presence of equal densities of positive terms and negative terms), the elements of the lattice sums not yet multiplied by the positive and negative charges do converge to finite (if large) limits, at least in the Ewald formalism [62]; this allows the introduction of Hund⁹⁶–Madelung H_ij and Hund–Naor⁹⁷ N_ij lattice sums for pairs of sites (i, j) in the zeroth cell, which, when multiplied by the appropriate charges, will yield the site potentials and the Madelung energy:

$$H_{ij} = \sum_{d}{}^{\prime}\, |\mathbf{r}_i - \mathbf{r}_j + \mathbf{r}_d|^{-1} \tag{8.10.70}$$

$$\phi_j = \left(e^{2}/4\pi\varepsilon_0\right) (N/Z) \sum_{i} z_i H_{ij} \tag{8.10.71}$$

$$E_M = \left(e^{2}/4\pi\varepsilon_0\right) (N/2Z) \sum_{i} \sum_{j} z_i z_j H_{ij} \tag{8.10.72}$$

[67]. If i = j, Eq. (8.10.70) should diverge, but instead converges, by computer calculation, to a very large number (about 10^36); this is not understood.
96 Friedrich Hermann Hund (1896–1997).
97 P. Naor (ca. 1930– ).
“A first-rate theory predicts, a second-rate theory forbids, a third-rate theory explains after the fact.” Alexander I. Kitaigorodskii (1914–1985)

Prediction of Crystal Structures. Starting from a molecule, a molecular complex, or an ionic structural unit, one would like to predict its crystal structure, presumably by finding a minimum Gibbs free energy, or a minimum lattice energy E_T (unless there are polymorphs). The general problem of predicting crystal structures has not yet been solved. For molecular crystals, in which the attractive term would be the London⁹⁸ dispersion energy E_L and the repulsive term could be an r^(−12) potential, Kitaigorodskii⁹⁹ had hoped to find, for example, for any carbon atom, regardless of bonding, a general London parameter A_C and a repulsion parameter B_C, and ditto for hydrogen atoms [77]. This was too simplistic, as were the usual dicta that “the lock fits a key” or that “nature abhors a vacuum.” The overall classical lattice energy E_T may consist of the Madelung energy E_M, a dipole–dipole or Keesom¹⁰⁰ energy E_μμ, a charge–dipole energy E_cμ, a charge-induced dipole (or polarization) energy E_cα, a dipole-induced dipole (or Debye¹⁰¹ induction) energy E_μα, an induced-dipole-to-induced-dipole (or van der Waals or London dispersion) energy E_L = E_αα, and a repulsion energy E_R:

$$E_T = E_M + E_{\mu\mu} + E_{c\mu} + E_{c\alpha} + E_{\mu\alpha} + E_L + E_R \tag{8.10.74}$$

where the atom-in-molecule charges z_i, the atom-in-molecule hybridization dipoles μ_i, and the atom-in-molecule polarizabilities α_i would be obtained from a quantum-mechanical calculation; the atom-in-molecule repulsion parameter C_i of Eq. (8.10.4) would still remain an ad hoc quantity. Such a scheme has not yet been tested. When a “trial structure” is very close to the right answer, a least-squares energy minimization of E_T will always converge to the “right result,” but, in general, despite modest recent progress, finding the global minimum (or the multiple local minima needed for polymorphs) in E_T as a function of molecular orientation and packing still remains a formidable and unsolved computational and intellectual challenge.
8.11 SUPERCONDUCTIVITY

Electrical superconductivity was discovered in Hg at 4.2 K by Kamerlingh-Onnes¹⁰² in 1911, but was explained theoretically only in 1957 (Bardeen¹⁰³–Cooper¹⁰⁴–Schrieffer¹⁰⁵ (BCS) theory). For superconductors, the electrons
98 Fritz Wolfgang London (1900–1954).
99 Alexander Isaakovich Kitaigorodskii (1914–1985).
100 Willem Hendrik Keesom (1876–1956).
101 Peter Joseph William Debye (1884–1966).
102 Heike Kamerlingh-Onnes (1853–1923).
103 John Bardeen (1908–1991).
104 Leon Neil Cooper (1930– ).
105 John Robert Schrieffer (1931– ).
FIGURE 8.17 Cooper pairs of electrons with equal and opposite momenta attract each other: in a metal, a rectangular lattice of positive ions (cations) is shown, with a free electron with momentum p that (1) has just traveled upwards and (2) has attracted some cations toward itself. Then a second free electron (3) with equal and opposite momentum –p is attracted to electron (1) because the cations, being much more massive, have not yet relaxed back to their original unperturbed positions.
close to the Fermi surface form “Cooper pairs” of electrons with opposite spin and opposite crystal momentum, thanks to a “critical” coupling of both electrons to certain lattice phonons: this pairing leads to a vanishingly small resistance. This Cooper-pair coupling is described in Fig. 8.17: Thanks to the relatively much larger mass of the ionic cores in a metal, two almost free metallic electrons, with equal but opposite momenta p and –p, may be indirectly coupled to each other by an electron–phonon–electron attraction. The Cooper pairs are bosons, and below a critical T_c (which is affected by both applied pressure and applied magnetic field) can condense to the same momentum state and wavefunction for all Cooper pairs in the solid; these pairs have long-distance phase coherence and are present in all known superconductors. However, the condensation of these Cooper pair bosons is attributed to electron–phonon coupling only for monoatomic and diatomic metals (BCS theory), where the critical temperature T_c depends on isotopic mass:

$$T_c = 1.14\, \theta_D \exp\left[-1 / U_{ep} D(\varepsilon_F)\right] \tag{8.11.1}$$
where θ_D is the Debye temperature of the lattice, U_ep is the electron–phonon coupling energy, and D(ε_F) is the density of states at the Fermi energy ε_F. For elemental superconductors, the product U_ep D(ε_F) is about 0.1 to 0.5, so T_c is one to five orders of magnitude smaller than θ_D. For an elemental superconductor the BCS theory predicted, and experiment confirmed, that T_c depends on the isotopic mass because of the factor θ_D. The superconductors known today do not all obey BCS theory. An early practical hope to find technologically useful materials with T_c close to room temperature led to wide searches across ever more complicated chemical structures, as seen at the bottom of Table 8.5. In 1964 Little¹⁰⁶ proposed that the Cooper pairs could be coupled by an electron–exciton interaction, which should be two orders of magnitude larger
106 William A. Little (1930– ).
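Table 8.5 Selected Electrical Superconductors. Structural Details for Some Entries are Given in Section 12.3

| Material | Critical Temperature, Tc (K) | Critical Magnetic Field, Hc (gauss) | Critical Current Density, jc (A m^-3) | Type I or II? |
| Al | 1.140 | 105 | | I |
| Hg | 4.153 | 412 | | I |
| La | 6.00 | 1100 | | I |
| Nb | 9.50 | 1980 | | I |
| Pb | 7.193 | 803 | | I |
| Tl | 2.39 | 171 | | I |
| Mo0.475Re0.525 | 10.9 | | | |
| Nb3Sn | 17.91 | | | |
| Nb3Ge | 23.2 | | | |
| (TMTSF)2ClO4 | 1.4 | | | |
| κ-(BEDT-TTF)2Cu(NCS)2 | 10.4 | | | |
| Cs3C60 | 38 | | | |
| KC8 | 0.55 | | | |
| (SN)x | 0.3 | | | |
| La1.85Sr0.15CuO4 (“214”) | 36 | | | |
| YBa2Cu3O7−x (“123”) | 90 | | | |
| Bi2Sr2CaCu2O8 (“2212”) | 108 | | | |
| Tl2Ba2Ca2Cu3O10 (“2223”) | 127 | | | |
| HgBa2Ca2Cu3Ox (“2223”) | 135 | | | |
| LaFePO | 4 | | | |
| LaFeAsO1−x (“1111”) | 56 | | | |
| (BaK)Fe2As2 (“122”) | 36 | | | |
| MgB2 | 39 | | | |

Type (I or II?) entries for the 17 materials from Mo0.475Re0.525 through MgB2, in the order given (16 entries): II, II, II, II, II, ?, ?, II, II, II, II, II, II, II, II, II.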
than the electron–phonon coupling U_ep of BCS, thus predicting room-temperature superconductivity [78]; this proposal inspired many searches for high-T_c organic superconductivity, but has never found experimental confirmation.

For external magnetic fields between zero and a critical field H_c, there is a partial (for type II superconductors) or complete (for type I superconductors) flux exclusion within the superconductor (Meissner¹⁰⁷–Ochsenfeld¹⁰⁸ effect), so a bulk superconductor is diamagnetic. Type II superconductors allow a partial penetration of the field, in quantized flux units, into them, creating local normal (nonsuperconducting) regions. Type I superconductors have a first-order phase transition between metal and superconductor: The superconducting phase is lower than the normal state by an energy difference (“gap”). In contrast, type II superconductors have no energy gap, and the transition is second-order.

In the macroscopic and phenomenological Ginzburg¹⁰⁹–Landau description of superconductivity, a complex “order parameter” Ψ(r) = |ψ(r)| exp(iφ) is proposed, which equals zero above T_c and whose magnitude determines
107 Fritz Walther Meissner (1882–1974).
108 Robert Ochsenfeld (1901–1993).
109 Vitali Lazarovich Ginzburg (1916– ).
the degree of superconducting order as a function of r. In BCS theory this ψ(r) can be the one-particle wavefunction of all the degenerate Cooper pairs:

$$\mathbf{j} = -\left[\left(2e^{2}/m\right) \mathbf{A} + (eh/2\pi m)\, \nabla\phi\right] |\psi(\mathbf{r})|^{2} \tag{8.11.2}$$

where A is the magnetic vector potential. Applying Eq. (8.11.2) to the interior of a superconducting ring, one obtains the striking result that the magnetic flux Φ enclosed by this ring must be quantized:

$$\Phi = nhc/2e \tag{8.11.3}$$

where the flux quantum or fluxoid is Φ0 = hc/2e = 2.06783366 × 10^(−7) gauss cm² = 2.067833667 × 10^(−15) Wb (note that 1 Wb = 1 V s = 1 J A^(−1) = 1 T m²). It should be noted that some mechanism other than the electron–phonon coupling of BCS theory must be operative for organic superconductors (T_c < 13 K) and for high-temperature cuprate superconductors (T_c < 125 K).
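Both the BCS estimate of Eq. (8.11.1) and the flux quantum of Eq. (8.11.3) are easy to evaluate numerically. The sketch below is illustrative only: the function names are hypothetical, the Debye temperature and coupling product passed to `bcs_tc` are made-up inputs, and the flux quantum is computed in its SI form h/2e (the cgs expression hc/2e describes the same quantity).

```python
import math

H_PLANCK = 6.62607015e-34    # Planck constant, J s (exact SI value)
E_CHARGE = 1.602176634e-19   # elementary charge, C (exact SI value)

def bcs_tc(theta_debye, coupling):
    """Tc = 1.14 * theta_D * exp(-1/(U_ep * D(eps_F))); coupling = U_ep * D(eps_F)."""
    return 1.14 * theta_debye * math.exp(-1.0 / coupling)

def flux_quantum_weber():
    """Flux quantum h/(2e) in webers."""
    return H_PLANCK / (2.0 * E_CHARGE)

phi0 = flux_quantum_weber()
print(phi0)          # in Wb
print(phi0 * 1.0e8)  # in gauss cm^2, since 1 Wb = 1e8 gauss cm^2

# Hypothetical inputs: theta_D = 300 K and a coupling product of 0.25.
print(bcs_tc(300.0, 0.25))
```

Because the coupling product sits in an exponent, doubling it from 0.25 to 0.5 raises the estimated T_c by a factor of exp(2), roughly 7.4.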
Josephson110 Effect. If two superconductors are separated by a thin layer ( 0Þ
ð9:2:18Þ
This can be differentiated once, to yield a linear homogeneous second-order differential equation:

$$L\, (d^{2}I/dt^{2}) + R\, (dI/dt) + (1/C)\, I = 0 \qquad (t > 0) \tag{9.2.19}$$

A trial solution could be

$$I = A \exp(st) \tag{9.2.20}$$

which yields from Eq. (9.2.18)

$$(Ls + R + 1/Cs)\, A \exp(st) = E \tag{9.2.21}$$

For s = 0, the impedance Z(s) ≡ {Ls + R + 1/Cs} becomes infinite, while in practice the current I in Eq. (9.24) must stay small and finite. By applying Eq. (9.2.20) to Eq. (9.2.19), one gets the indicial equation:

$$Ls^{2} + Rs + 1/C = 0 \tag{9.2.22}$$

which has two roots:

$$s_{1,2} = -(R/2L) \pm \left[(R^{2}/4L^{2}) - 1/LC\right]^{0.5} \tag{9.2.23}$$

So the general solution is

$$I = A \exp(s_1 t) + B \exp(s_2 t) \tag{9.2.24}$$
and the situation must be analyzed for when the roots are real and unequal, or real and equal, or complex conjugates of each other.

In a series RLC circuit, the current i_R(t) through the resistor R is in phase with the voltage v_R(t) across it; the current i_C(t) through the capacitor C is π/2 radians ahead of the phase of the voltage v_C(t) across it; the current i_L(t) through the inductor L is π/2 radians behind the phase of the voltage v_L(t) across it. The current I(t) must be the same throughout the circuit (conservation of charge), but the drops in voltage are different within each of the three components R, C, and L. In a series RC circuit, the current I(t) decays with time t: I(t) = I0 exp(−t/RC), and the product RC is popularly called the time constant. If in the series RLC circuit of Fig. 9.2 the DC battery source is replaced by an AC source of angular frequency ω:

$$V(t) = V_0 \exp(i\omega t) = V_0 \left[\cos(\omega t) + i \sin(\omega t)\right] \tag{9.2.25}$$

then the analysis starts by assigning to the resistor R a real impedance Z_R:

$$Z_R \equiv R \tag{9.2.26}$$

to the capacitor a purely imaginary impedance Z_C:

$$Z_C \equiv -i/\omega C \tag{9.2.27}$$

and to the inductor a purely imaginary impedance Z_L:

$$Z_L \equiv i\omega L \tag{9.2.28}$$
For the real part of the applied voltage, V(t) = V0 cos(ωt), the root-mean-square voltage V_rms is defined by

$$V_{rms} \equiv \left\{ (2\pi/\omega)^{-1} \int_{t=0}^{t=2\pi/\omega} dt\, V_0^{2} \cos^{2}(\omega t) \right\}^{1/2} = 2^{-1/2} V_0 \tag{9.2.29}$$

PROBLEM 9.2.2. Prove Eq. (9.2.29).

Similarly, for a current I(t) = I0 cos(ωt), the root-mean-square current I_rms is defined by

$$I_{rms} \equiv \left\{ (2\pi/\omega)^{-1} \int_{t=0}^{t=2\pi/\omega} dt\, I_0^{2} \cos^{2}(\omega t) \right\}^{1/2} = 2^{-1/2} I_0 \tag{9.2.30}$$
The inhomogeneous integro-differential equation for this series RLC circuit now becomes

$$L\, (dI/dt) + I(t) R + C^{-1} \int_{0}^{t} I(t)\, dt = V_0 \exp(i\omega t) \tag{9.2.31}$$

where the initial condition can be chosen as V = 0 at t = 0. The overall impedance Z is the sum of Eqs. (9.2.26)–(9.2.28):

$$Z \equiv R + i\left(\omega L - \omega^{-1} C^{-1}\right) \qquad (\text{series RLC}) \tag{9.2.32}$$

This complex quantity has a magnitude

$$|Z| = \left[R^{2} + \left(\omega L - \omega^{-1} C^{-1}\right)^{2}\right]^{1/2} \tag{9.2.33}$$

and a phase angle θ:

$$Z = |Z| \exp(i\theta) \tag{9.2.34}$$

In an Argand¹⁷ diagram representation of Z, called a phasor diagram, θ is the angle by which the current I(t) lags the voltage V(t). The reciprocal of the impedance Z is the admittance Y:

$$Y \equiv 1/Z = \left[R + i\left(\omega L - \omega^{-1} C^{-1}\right)\right]^{-1} = \left[R - i\left(\omega L - \omega^{-1} C^{-1}\right)\right] \left[R^{2} + \omega^{2} L^{2} - 2LC^{-1} + \omega^{-2} C^{-2}\right]^{-1} \tag{9.2.35}$$

A big goal in power transmission is to match the overall C and L, so that the phasor angle θ is driven to as close to zero as possible; this effort to achieve electrical resonance (by approaching LC = ω^(−2), as shown below) maximizes the efficiency of electrical power transmission. This is why it is difficult to suddenly restart an electricity grid after a power failure: The 300,000-V transmission lines between cities already have a mostly inductive coupling to the ground, and the loads at the power consumption sites are also mainly inductive (if they consist of air conditioners and heat pumps eagerly waiting to restart). The current I(t) due to a voltage V0 exp(iωt) in a series RLC circuit is

$$I(t) = V(t)/Z = V_0 \exp(i\omega t) / \left[R + i\left(\omega L - \omega^{-1} C^{-1}\right)\right] \tag{9.2.36}$$
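The impedance, phase angle, and current amplitude of a series RLC circuit can be evaluated directly with complex arithmetic. The sketch below assumes the standard sign convention Z_C = −i/ωC, so that Z = R + i(ωL − 1/ωC); the function names and component values are hypothetical and chosen only for illustration.

```python
import cmath
import math

def series_rlc_impedance(omega, resistance, inductance, capacitance):
    """Z = R + i*(omega*L - 1/(omega*C)) for a series RLC circuit."""
    return complex(resistance, omega * inductance - 1.0 / (omega * capacitance))

def current_phasor(v0, omega, resistance, inductance, capacitance):
    """Complex amplitude I0 such that I(t) = I0 * exp(i*omega*t)."""
    return v0 / series_rlc_impedance(omega, resistance, inductance, capacitance)

# Hypothetical component values (ohm, henry, farad, volt).
R_OHM, L_HENRY, C_FARAD, V0 = 50.0, 0.1, 1.0e-6, 10.0
omega0 = 1.0 / math.sqrt(L_HENRY * C_FARAD)  # frequency at which the reactance vanishes

for omega in (0.5 * omega0, omega0, 2.0 * omega0):
    z = series_rlc_impedance(omega, R_OHM, L_HENRY, C_FARAD)
    i0 = current_phasor(V0, omega, R_OHM, L_HENRY, C_FARAD)
    # Columns: omega, |Z| in ohm, phase of Z in degrees, current amplitude in A
    print(omega, abs(z), math.degrees(cmath.phase(z)), abs(i0))
```

Below ω0 the phase of Z is negative (capacitive, current leads the voltage); above ω0 it is positive (inductive, current lags); at ω0 the impedance reduces to R and the current amplitude is largest.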
PROBLEM 9.2.3. Expand Eq. (9.2.36) into real and imaginary parts.

The resistance R dissipates power irretrievably; the energy lost from the circuit per cycle, E_R, is released as heat:

$$p_R = [I(t)V(t)]_R = I(t) V_R(t) = I(t)^{2} R = V_0^{2} R^{-1} \cos^{2}(\omega t) \tag{9.2.37}$$

$$E_R = \int_{t=0}^{t=2\pi/\omega} dt\, V_0^{2} R^{-1} \cos^{2}(\omega t) = \pi \omega^{-1} V_0^{2} R^{-1} = \pi \omega^{-1} I_0^{2} R \tag{9.2.38}$$

PROBLEM 9.2.4. Prove Eq. (9.2.38).

17 Jean-Robert Argand (1768–1822).
In contrast, the capacitance C stores energy (in an electric field internal to the capacitor) during half the cycle, then releases it later in the cycle [Eq. (9.2.16)], so that in a full 360° (or 2π radian) cycle, C stores no net energy E_C:

$$p_C = [I(t)V(t)]_C = V_0 \cos(\omega t)\, C V_0 \left[d\cos(\omega t)/dt\right] = -\left(\omega C V_0^{2}/2\right) \sin(2\omega t) \tag{9.2.39}$$

$$E_C = -\int_{t=t_1}^{t=t_2} dt \left(\omega C V_0^{2}/2\right) \sin(2\omega t) \tag{9.2.40}$$

$$E_C = 0 \qquad \text{for } t_2 - t_1 = 2\pi/\omega \tag{9.2.41}$$

$$E_C = (1/2)\, C V_0^{2} \qquad \text{for } t_1 = \pi/2\omega \text{ and } t_2 = \pi/\omega \tag{9.2.42}$$

PROBLEM 9.2.5. Prove Eqs. (9.2.41) and (9.2.42).

The inductance L stores energy in the magnetic field, then releases it, so that, in a full 360° cycle, L also stores no net energy E_L; using Eq. (9.2.17) and I = I0 cos(ωt), we obtain

$$p_L = [V(t)I(t)]_L = V_L(t) I(t) = I(t)\, L\, (dI/dt) = I_0 \cos(\omega t) \left[-\omega L I_0 \sin(\omega t)\right] \tag{9.2.43}$$

$$E_L = -\omega I_0^{2} L \int_{t=t_1}^{t=t_2} dt\, \cos(\omega t) \sin(\omega t) \tag{9.2.44}$$

$$E_L = 0 \qquad \text{if } t_1 = 0 \text{ and } t_2 = 2\pi/\omega \tag{9.2.45}$$

$$E_L = (1/2)\, I_0^{2} L \qquad \text{if } t_1 = \pi/2\omega \text{ and } t_2 = \pi/\omega \tag{9.2.46}$$

PROBLEM 9.2.6. Prove Eqs. (9.2.45) and (9.2.46).
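The energy bookkeeping for R and C can be checked by numerical integration of the instantaneous power. The sketch below is a minimal illustration with hypothetical component values: it integrates p_R = V0² cos²(ωt)/R and p_C = V(t) C dV/dt for V(t) = V0 cos(ωt) with a simple midpoint rule, and compares the results with πV0²/(ωR) per cycle for the resistor and with the maximum stored energy CV0²/2 for the capacitor.

```python
import math

def integrate(f, t1, t2, steps=100000):
    """Midpoint-rule integral of f from t1 to t2."""
    dt = (t2 - t1) / steps
    return sum(f(t1 + (k + 0.5) * dt) for k in range(steps)) * dt

# Hypothetical values: 50 Hz drive, 10 V amplitude, 100 ohm, 10 microfarad.
OMEGA, V0, R_OHM, C_FARAD = 2.0 * math.pi * 50.0, 10.0, 100.0, 1.0e-5

def p_resistor(t):
    return V0 ** 2 / R_OHM * math.cos(OMEGA * t) ** 2

def p_capacitor(t):
    # V(t) * C * dV/dt with V(t) = V0 * cos(omega * t)
    return V0 * math.cos(OMEGA * t) * C_FARAD * (-OMEGA * V0 * math.sin(OMEGA * t))

period = 2.0 * math.pi / OMEGA
e_r = integrate(p_resistor, 0.0, period)                          # dissipated per cycle
e_c_cycle = integrate(p_capacitor, 0.0, period)                   # net over a full cycle
e_c_quarter = integrate(p_capacitor, period / 4.0, period / 2.0)  # one charging quarter-cycle

print(e_r, math.pi * V0 ** 2 / (OMEGA * R_OHM))
print(e_c_cycle)
print(e_c_quarter, 0.5 * C_FARAD * V0 ** 2)
```

The capacitor power changes sign every quarter period, so the integration window matters: over any full period the net energy is zero, while over a single charging quarter-cycle it equals the maximum stored energy.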
Therefore the instantaneous power in a series RLC circuit varies at twice the frequency of the applied voltage. At a very special frequency ω0, called the resonant frequency:

$$\omega_0 = (LC)^{-1/2} \tag{9.2.47}$$

when the impedance becomes purely resistive, that is, when the imaginary impedance i(ωL − 1/ωC) becomes zero, then resonance occurs; this is shown graphically in Fig. 9.3.
FIGURE 9.3 Frequency dependence of the three components of impedance (resistive, capacitative, and inductive), showing the resonance frequency ω0 = (LC)^(−1/2). [Plot of |Z| (ohms) versus ω, with curves |Z_L| = ωL, |Z_C| = 1/ωC, and Z_R = R.]
The quality factor, or Q-factor, is a general dimensionless parameter, used in mechanical, electrical, electromagnetic, and optical contexts. Given some signal intensity S(ω) as a function of frequency ω, the Q-factor is defined as the resonance frequency divided by the bandwidth Δω (see Fig. 9.4):

$$Q \equiv \omega_0 / \Delta\omega \tag{9.2.48}$$

As shown in Fig. 9.4, Δω is the full width at half-maximum, FWHM (often incorrectly referred to as the half-width). Since at the bandwidth points the signal is (1/2)S(ω0), a factor of 2 down from the peak S(ω0), the signal is also 3 dB down from its peak value. Three decibels corresponds to 0.30103 = log10 2 times 1 bel. The bel B, named in honor of Bell,¹⁸ the inventor of the telephone, is defined as B = log10(S/S0) and dB = 10 log10(S/S0).

FIGURE 9.4 Quality factor Q (artificial data). [Plot of signal intensity S(ω) (arbitrary units) versus angular frequency ω (radians per second): ω0 = 350, S(ω0) = 500; the half-maximum (3 dB) points, 0.5 S(ω0) = 250, lie at ω = 341.675 and ω = 358.325, so Δω = 16.651 and Q = ω0/Δω = 350/16.651 = 21.02.]

The Q factor is defined in several fields. For a mechanical system that consists of a mass m attached to a spring obeying Hooke’s¹⁹ law (constant k_H) and a mechanical resistance R, Q can be shown to become

$$Q = m^{1/2} k_H^{1/2} R^{-1} = \omega m / R \tag{9.2.49}$$

where ω is the angular frequency of oscillation: ω = k_H^(1/2) m^(−1/2). For a laser system we have

$$Q = \omega E / P = 2\pi\nu E / P \tag{9.2.50}$$

where ν is the frequency (Hz) of the optical cavity, E is the stored energy, and P = −dE/dt is the energy dissipation. The shape of the function depends on the details of the system discussed. For a series RLC circuit at resonance, the quality factor Q0 may be defined as:

$$Q_0 = 2\pi\, (\text{maximum energy stored in L or C per cycle}) / (\text{energy lost in R per cycle}) \tag{9.2.51}$$
In applying this formula, one should be careful with the numerator, since no net energy is stored per full cycle in either L or C [Eqs. (9.2.41) and (9.2.45)], and power flows in and out of L and C at twice the frequency of the voltage. For a series RLC circuit, Q0 is given by

$$Q_0 = \omega_0 L / R = 1/\omega_0 R C = L^{1/2} R^{-1} C^{-1/2} \tag{9.2.52}$$

If R is small, then Q0 is large. Equation (9.2.51) also applies to radio receivers, whose “tank” circuit can be analyzed as a single RLC circuit.

PROBLEM 9.2.7. Prove Eq. (9.2.52) from Eq. (9.2.51), using Eq. (9.2.38) and Eq. (9.2.46).
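The three equivalent forms of Q0 in Eq. (9.2.52), and the energy-ratio definition of Eq. (9.2.51), can be compared numerically. The sketch below is only an illustration: the function names and component values are hypothetical, the maximum stored energy is taken as I0²L/2, and the energy lost per cycle in R is taken from Eq. (9.2.38) as πI0²R/ω0.

```python
import math

def resonant_frequency(inductance, capacitance):
    """omega0 = (L*C)^(-1/2)."""
    return 1.0 / math.sqrt(inductance * capacitance)

def series_q_forms(resistance, inductance, capacitance):
    """The three algebraically equivalent forms of Q0 for a series RLC circuit."""
    omega0 = resonant_frequency(inductance, capacitance)
    return (omega0 * inductance / resistance,
            1.0 / (omega0 * resistance * capacitance),
            math.sqrt(inductance / capacitance) / resistance)

def q_from_energies(resistance, inductance, capacitance, i0=1.0):
    """Q0 = 2*pi * (maximum stored energy) / (energy lost in R per cycle)."""
    omega0 = resonant_frequency(inductance, capacitance)
    stored = 0.5 * inductance * i0 ** 2
    lost_per_cycle = math.pi * i0 ** 2 * resistance / omega0
    return 2.0 * math.pi * stored / lost_per_cycle

# Hypothetical component values (ohm, henry, farad).
R_OHM, L_HENRY, C_FARAD = 5.0, 0.1, 1.0e-6
print(series_q_forms(R_OHM, L_HENRY, C_FARAD))
print(q_from_energies(R_OHM, L_HENRY, C_FARAD))
```

The current amplitude cancels in the energy ratio, which is why Q0 depends only on R, L, and C.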
For a parallel RLC circuit (Fig. 9.5) the impedance is

$$Z = \left[R^{-1} + i\left(\omega C - \omega^{-1} L^{-1}\right)\right]^{-1} \qquad (\text{parallel RLC}) \tag{9.2.53}$$

FIGURE 9.5 Parallel RLC circuit.

18 Alexander Graham Bell (1847–1922).
19 Robert Hooke (1635–1703).
Q can be increased in appropriately designed circuits; this is utilized in several instruments (radios, quartz crystal oscillators, quartz clocks, etc.) that are “tuned” to detect resonant transitions. The experimental methods of Chapter 11 which contain the word “resonance” (e.g., “nuclear magnetic resonance,” “electron paramagnetic resonance,” etc.) refer to an allowed absorption or emission process (just as in optical spectroscopy), which is measured in a circuit electrically tuned to the frequency for the quantum-mechanical transition. Of course, absorption or emission of light by an atom or molecule also occurs only if the light energy matches the energy level difference; nevertheless, by tradition the term “resonance” is not used in that case.

Three theorems help in analyzing the effect of circuits: (i) Thevenin’s theorem (1883) states that if a two-terminal connection is used to probe any circuit, then, no matter how complex this circuit is, it can be analyzed as a voltage source in series with a single impedance.²⁰ (ii) Norton’s theorem (1926) states that the same two-terminal connection to an arbitrary circuit can be considered as a short-circuit current in parallel with a single impedance.²¹ (iii) The maximum power transfer theorem states that maximum power is transferred to an external circuit if the impedance of the external circuit is the complex conjugate of the internal impedance of the circuit, as defined by Thevenin’s theorem.
9.3 VACUUM-TUBE DIODE The next two sections deal with those vacuum-tube electronic devices which, historically, first provided rectification and amplification (gain); in the twenty-first century these vacuum-tube devices have been almost completely superseded by semiconductor rectifiers and transistors. Vacuum-tube devices are high-voltage, low-current devices with lifetimes of a few thousand hours, while semiconductor devices are low-voltage, high-current devices with an almost infinite life-span (which can be abruptly truncated by high-voltage pulses—for example, static electricity discharges). Both types of devices dissipate heat during operation. Figure 9.6 shows these various electronic devices in their circuit representation. The rationale for presenting vacuum-tube devices first is pedagogical: some principles are in common. The story begins with the Edison22 incandescent light bulb, which uses thermo-ionic emission but was made practical and long-lived by Langmuir’s23 expedient of evacuating the bulb and back-filling it with Ar gas in 1901.
20 Leon Charles Thevenin (1857–1926).
21 Edward Lawry Norton (1898–1983).
22 Thomas Alva Edison (1847–1931).
23 Irving Langmuir (1881–1957).
FIGURE 9.6 Vacuum-tube diode, triode, and pentode, bipolar junction rectifiers and transistors, and unipolar field-effect transistors (FETs). Panels (circuit symbols): vacuum-tube diode (Fleming, 1901); vacuum-tube triode (de Forest, 1905); vacuum-tube pentode; pn junction diode or rectifier; Zener pn diode (rectifier); bipolar npn junction transistor; bipolar pnp junction transistor; unipolar n-channel MOSFET or IGFET (depletion or enhancement modes); unipolar p-channel MOSFET or IGFET (enhancement mode); unipolar n-channel JFET (depletion mode); unipolar p-channel JFET (depletion mode). Terminal labels: anode, cathode, and grid for the vacuum tubes; collector (C), base (B), and emitter (E) for the bipolar transistors; drain (D), gate (G), source (S), and body for the FETs. Arrows mark the favored direction of electron flow, which is grid-limited in the triode, base-limited in the bipolar transistors, and gate-limited in the FETs.
The vacuum-tube diode, invented by Fleming24 in 1904 [2,3], works because of the relative geometrical shapes of the two concentric electrodes, the cathode and the anode. It consists of a cylindrical glass enclosure that is partially evacuated, bonded, and sealed to a metal base. It contains an inner metallic thin-wire “cathode” (negative electrode, consisting of W, oxide-covered W, or a Th-W alloy), placed along the cylinder axis. This cathode is electrically heated to 900 K or above, using an auxiliary filament circuit, typically driven by a 6.3-V power supply, to foster thermionic emission of electrons from the cathode. This cathode is cylindrically surrounded by a metallic outer electrode, the anode or “positive electrode” or “plate,” which is a hollow metallic cylinder, whose axis coincides with that of the cathode. The
24 Sir John Ambrose Fleming (1849–1945).
electrons are then accelerated easily from the cathode to the anode by an externally applied “forward bias” $+V$. The electrons must travel across the “space-charge region” generated by the other boiled-off electrons traveling through the vacuum. If a reverse bias ($-V$) is applied, the electrons “boiled off” the cylindrical anode will most often miss the small wire cathode for geometrical reasons: the cross section is too small, so the reverse current is very small; a vacuum-tube diode works because of the relative geometries of anode and cathode. The electrical current from cathode to plate, called the plate current $I_b$, is given by

$$I_b = A T^2 \exp(-b/T) \tag{9.3.1}$$
where $A$ is a constant, $T$ is the absolute temperature, and $b$ is related to the work function of the cathode material (metal or metal alloy). Increasing the cathode temperature increases the saturation current. Child25 showed in 1911 that, at moderate forward bias $E_b$, when the current is limited by space charge, the plate current $I_b$ rises as

$$I_b = K E_b^{3/2} \tag{9.3.2}$$

where $K$ is determined by the diode dimensions. This 3/2 law is simply understood: the number of electrons in the inter-electrode space is proportional to $E_b$, while their velocity is proportional to $E_b^{1/2}$. At much higher forward bias, $I_b$ becomes independent of $E_b$, or “saturates”; that is, the overall IV curve in the forward bias direction becomes S-shaped. Under reverse bias (cathode positive, plate negative), the very few electrons that may escape the unheated plate will miss the wire, because the solid angle subtended is so small (or the scattering cross section is small), so the reverse-bias current is low, except at very high negative bias. To recapitulate, a vacuum-tube diode works in part because of its geometry (any electron leaving the wire cathode and traversing the space-charge region can easily hit the plate that surrounds the cathode cylindrically, while few electrons leaving the plate are likely to hit the cathode) and in part because of temperature (the cathode is heated, the anode is not).
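The two regimes of the diode can be sketched in a few lines. The sketch below takes Eq. (9.3.1) in the Richardson form $A T^2 \exp(-b/T)$ and Eq. (9.3.2) as the 3/2 law. Joining the two regimes with a simple `min()` cap is an assumption made here for illustration (the text only states that the curve saturates), and all numerical constants are hypothetical rather than data for a real tube.

```python
import math


def richardson_current(temperature, a_const, b_const):
    """Emission-limited (saturation) current, Eq. (9.3.1): A * T^2 * exp(-b/T)."""
    return a_const * temperature ** 2 * math.exp(-b_const / temperature)


def child_current(plate_voltage, k_const):
    """Space-charge-limited current, Eq. (9.3.2): K * Eb^(3/2); zero under reverse bias."""
    if plate_voltage <= 0.0:
        return 0.0
    return k_const * plate_voltage ** 1.5


def diode_plate_current(plate_voltage, temperature, a_const, b_const, k_const):
    """Crude forward curve: the 3/2 law capped by the emission limit (assumed join)."""
    return min(child_current(plate_voltage, k_const),
               richardson_current(temperature, a_const, b_const))


if __name__ == "__main__":
    # Hypothetical constants: A in A K^-2, b in K, K in A V^-1.5.
    a_const, b_const, k_const = 20.0, 52000.0, 2e-5
    for e_b in (-50.0, 25.0, 100.0, 400.0, 1000.0):
        print(e_b, diode_plate_current(e_b, 2400.0, a_const, b_const, k_const))
```

Doubling the plate voltage in the space-charge regime multiplies the current by $2^{3/2} \approx 2.83$, while raising the cathode temperature raises only the saturation ceiling.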
9.4 VACUUM-TUBE TRIODE

The vacuum-tube triode (Fig. 9.7) is a voltage-driven device, invented by De Forest26 in 1908 [4], with three electrodes (cathode, anode, grid): current amplification is achieved by interposing a cylindrical wire mesh control “grid” electrode between the heated wire cathode and the unheated hollow cylinder anode of a diode. This grid acts as an imperfect electrostatic shield. Current amplification occurs, again, because of the relative geometrical sizes and placements of the electrodes. As shown in Fig. 9.7, the cathode can be
25 Clement D. Child (1868–1933).
26 Lee De Forest (1873–1961).
FIGURE 9.7 Vacuum-tube triode. Left: Concentric cylinders of cathode, grid, and anode or plate (in vacuum-tube diodes the grid is absent). Right: Common-cathode or grounded-cathode circuit for amplification (common-grid or common-anode geometries are also possible). The auxiliary heater circuit (underneath the cathode) is independent, and is not always shown in circuit diagrams. Circuit labels: small AC input signal; small reverse grid voltage $E_g$ or $E_c$; large forward plate voltage $E_p$ or $E_b$ from the “B” battery; plate current $I_p$ (electron flow within the triode and in the plate circuit); large AC output signal; heater circuit; ground (grounded cathode).
grounded (in the common-cathode or grounded-cathode geometry); the other, logically equivalent choices are the common-grid and common-anode (common-plate) geometries.

1. The cathode is a wire-shaped electrode with relatively high electrical resistance, which emits electrons when heated electrically (6.3-V cathode heater circuit, which often is not shown).

2. The anode is a hollow cylinder, with the cathode placed along its axis; a high positive or “forward” voltage or bias (anode +, cathode −) accelerates electrons from the cathode toward the anode. As in the vacuum-tube diode, the geometrical placement ensures that electrons boiled off the cathode will radiate symmetrically outwards and will have a good chance of being captured by the anode. Conversely, at reverse bias, if the anode is negative (−) and the cathode is positive (+), the electrons emitted thermally from the anode have little chance of hitting the cathode, because the cross-sectional area of the cathode is so small compared with that of the anode (diode action).

3. The grid is a wire-mesh cylindrical electrode, placed between cathode and anode, so that its axis also coincides with the cathode. A retarding voltage, or reverse bias, between cathode and grid (grid −, cathode +) can prevent the flow of electrons from cathode to anode; conversely, an accelerating voltage (grid +, cathode −) will accelerate the electrons. The cross-sectional area of the grid is so small that under an accelerating voltage the capture of the electrons by the grid is inefficient, and the electrons sail past the grid to the anode.

“Forward bias” means that the plate is kept at a large positive accelerating potential $E_p > 0$, which is similar to the positive bias in a vacuum-tube diode. The potential $E_p$ applied to the plate is traditionally called the “B+” potential, obtained from the positive terminal (battery +) of a DC power supply or battery; therefore $E_p$ is also called $E_b$.
In contrast to the large positive accelerating plate potential, the grid is held at a relatively small negative “control” or “grid” potential or bias $E_c$ or $E_g$ ($E_c = E_g < 0$). This control bias tends to drive the boiled-off electrons from the space-charge region back toward the cathode. We discuss two situations, for negative grid bias first and for positive grid bias next.

(A) If $E_c < 0$ is applied in a triode, then the plate current $I_b$ versus plate voltage $E_b$ curve is shifted from Eq. (9.3.2), so that the plate current is reduced considerably (see Fig. 9.7). Indeed, at some sufficiently negative $E_c$, $I_p$ becomes zero; this occurs when the grid voltage $E_c$ is sufficiently negative to neutralize the positive cathode-to-anode field created by the positive plate voltage $E_b$. Below we define $\mu$ as the amplification factor, which depends on the geometry of the tube and not on either $E_c$ or $E_b$. If $(E_c + E_b/\mu) > 0$, then the plate current under forward bias and negative grid bias is given by

$$I_b = K (E_c + E_b/\mu)^{3/2} \tag{9.3.3}$$

If instead $(E_c + E_b/\mu) < 0$, then the plate current $I_b$ becomes zero [5]:

$$I_b = 0 \tag{9.3.4}$$
(B) If the grid bias or voltage $E_c$ becomes positive, then there are two currents in the region between grid and plate, the grid current $I_c$ and the plate current $I_b$, so the total current is

$$I_b + I_c = K (E_c + E_b/\mu)^{3/2} \tag{9.3.5}$$

If $E_c < E_b$, then typically $I_c < I_b$. The amplification factor of the triode $\mu$ is defined by

$$\mu \equiv \frac{(\partial I_b/\partial E_c)}{(\partial I_b/\partial E_b)} = -(\partial E_b/\partial E_c)_{I_b} \approx -(\delta E_b/\delta E_c)_{I_b} \tag{9.3.6}$$

where the $\delta$ are either finite increments (in practice) or (in theory) partial differentials $\partial$. The experimental values of $\mu$ range between 8 and 100. The (dynamical) plate resistance $r_p$ (ohms) is given by

$$r_p = (\partial E_b/\partial I_b)$$

The transconductance (or mutual conductance, in siemens) $g_m$ is defined by

$$g_m = \mu/r_p = (\partial I_b/\partial E_c) \tag{9.3.7}$$

The transconductance (in siemens) is used as the figure of merit for the triode: one wants both a low plate resistance $r_p$ and a high amplification factor $\mu$. Of the three triode parameters $\mu$, $r_p$, and $g_m$, only two are independent.

PROBLEM 9.3.1. Show that $\mu$ in Eq. (9.3.5) is the same as $\mu$ in Eq. (9.3.6).

The plots of the plate current $I_b$ as a function of grid voltage $E_c$ (Fig. 9.8) and of the plate current $I_b$ as a function of plate voltage $E_b$ (Fig. 9.9) are very similar to
FIGURE 9.8 Plot of plate current $I_b = I_p$ versus grid or control voltage $E_c = E_g$ for various fixed values of the plate voltage $E_p$ in a vacuum tube triode (common-cathode or grounded-cathode circuit). The grid is back-biased, while the plate is forward-biased. Note that the curves are essentially the same, but displaced along the grid voltage axis. The amplification factor is $\mu = 5$. Adapted from Terman [5]. Axes: grid voltage $E_c = E_g$ (volts), −100 to 20; plate current $I_b = I_p$ (amperes), 0 to 0.08; curves for $E_p$ = 50 to 400 V in 50-V steps.
each other. Figure 9.10 shows how the triode parameters are obtained from the plots. The triode is operated at relatively high voltages and low currents, in the region where the plate current is linear with the grid voltage (to avoid distortion). Typically, the DC voltages applied are 6.3 V for the cathode heater, −50 V for the grid circuit, and +300 V for the plate circuit. In typical triodes, the plate resistance $r_p$ may vary from a few hundred ohms to several thousand ohms. The dynamical transconductance thus may equal several millisiemens. The amplification factor of triodes rarely exceeds 100 and is typically between 5 and 50. A fairly large load resistor is usually placed in the plate circuit. In amplifier operation, a small AC voltage, added in series to the grid circuit, is amplified into a large AC voltage in the plate circuit (see Fig. 9.7) [5,7].
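The relation between the 3/2 law and the three triode parameters can be checked numerically. The sketch below evaluates the plate current for negative grid bias, reading Eqs. (9.3.3) and (9.3.4) as $I_b = K(E_c + E_b/\mu)^{3/2}$ above cutoff and zero below it, and estimates $g_m$ and $r_p$ by central finite differences. The constant $K$ and the operating point are hypothetical; $\mu = 5$ is the value used in Figs. 9.8 and 9.9.

```python
def plate_current(e_c, e_b, mu, k_const):
    """Plate current for negative grid bias: K*(Ec + Eb/mu)^(3/2), or zero below cutoff."""
    drive = e_c + e_b / mu
    return k_const * drive ** 1.5 if drive > 0.0 else 0.0


def triode_parameters(e_c, e_b, mu, k_const, step=1e-3):
    """Return (mu_estimate, r_p, g_m) from central finite differences at (Ec, Eb)."""
    d_ib_d_ec = (plate_current(e_c + step, e_b, mu, k_const)
                 - plate_current(e_c - step, e_b, mu, k_const)) / (2.0 * step)
    d_ib_d_eb = (plate_current(e_c, e_b + step, mu, k_const)
                 - plate_current(e_c, e_b - step, mu, k_const)) / (2.0 * step)
    g_m = d_ib_d_ec          # transconductance, siemens
    r_p = 1.0 / d_ib_d_eb    # dynamical plate resistance, ohms
    return g_m * r_p, r_p, g_m


if __name__ == "__main__":
    # Hypothetical operating point: Ec = -20 V, Eb = 300 V, K = 2e-4 A V^-1.5.
    mu_est, r_p, g_m = triode_parameters(-20.0, 300.0, 5.0, 2e-4)
    print("Ib =", plate_current(-20.0, 300.0, 5.0, 2e-4), "A")
    print("mu =", mu_est, " r_p =", r_p, "ohm", " g_m =", g_m, "S")
```

The product $g_m r_p$ returns the $\mu$ that was put into the current law, which is the content of Problem 9.3.1 in numerical form; only two of the three parameters are independent.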
FIGURE 9.9 Plot of plate current $I_b$ versus plate voltage $E_b$ for various fixed values of the grid voltage $E_c$ in a vacuum tube triode (common-cathode or grounded-cathode circuit). The grid is back-biased, the plate is forward-biased. Note that the curves are essentially the same, but are displaced along the plate voltage axis and resemble those of Fig. 9.8. In actual triodes the amplification factor varies and is not just simply the $\mu = 5$ used here. Adapted from Terman [5]. Axes: plate voltage $E_b = E_p$ (volts), 0 to 500; plate current $I_p$ (amperes), 0 to 0.08; curves for $E_g$ = 0 to −60 V in 10-V steps.
FIGURE 9.10 Characteristics of a typical common-cathode or grounded-cathode vacuum tube triode (curves of plate current $I_b$ versus plate voltage $E_b$, for different values of grid voltage $E_g = E_c$). The horizontal line connects points of constant plate current $I_b$, and the vertical arrows show how Eq. (9.3.7) is applied to obtain the transconductance. Note that changing $E_g$ only displaces the IV curve but, to first approximation, does not distort it. Adapted from Mandl [6]. Axes: plate voltage $E_p$ (volts), 0 to 400; plate current $I_p$ (mA), 0 to 16; curves for $E_{grid}$ = 0, −2, −6, −8, and −10 V. Annotations: at constant $I_p$ = 4 mA, $\delta E_{grid}$ = −10 − (−2) = −8 V (magnitude 8 V) and $\delta E_{plate}$ = 245 − 100 = 145 V, so the amplification factor is $\mu$ = 145/8 = 18.1; at $E_g$ = −8 V, $\delta E_p$ = 210 − 160 = 50 V and $\delta I_p$ = 6.8 − 2.0 = 4.8 mA, so the dynamical plate resistance is $r_p$ = 50/0.0048 = 10,400 ohms; the transconductance is $g_m = \mu/r_p$ = 18.1/10,400 = 1.74 millisiemens.
For instance, the 12AX7 twin triode tube has a 6.3-V filament voltage that produces a filament current of 0.3 A; for a fixed plate voltage $E_b = 250$ V and a grid voltage $E_c = -2$ V, one gets a plate current $I_b = 1.2$ mA, an amplification factor $\mu = 100$, a plate resistance $r_p = 62{,}500\ \Omega$, and a transconductance of 1.6 millisiemens. If the cathode, grid, and plate were perfect cylinders, perfectly nested within each other, with no edge effects, then the amplification factor $\mu$ would be independent of plate, grid, and filament voltages and could be calculated from the theory of electrostatic shielding. In practice this is not the case, so curves of constant $\mu$ are not horizontal, but are weakly dependent on plate voltage or grid voltage in plots similar to Figs. 9.8 or 9.9.

Several more “grid” electrodes can be added. A vacuum-tube pentode has, radially outward, a control grid, a screen grid, a suppressor grid, and finally the plate electrode. These extra grids are set at potentials (zero at cathode, intermediate for control grid, maximum on screen grid, zero for the suppressor grid, and fairly high for plate) such that the cathode is effectively screened from the anode, so that the number of electrons drawn from the cathode is now almost independent of the plate voltage (see Fig. 9.11).

We will see later that the characteristics of vacuum triodes are replicated in npn junction transistors, albeit by a very different mechanism. One can operate very well with vacuum tubes, but they are more costly and difficult to manufacture and are prone to relatively early failure, mainly due to the tube becoming “gassy” with the evaporation of W from the filament or from vacuum breakdown, due to the prolonged heating of the vacuum tube. The typical service lifetimes of vacuum tubes (a few thousand hours) are much shorter than those of transistors.
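The numbers quoted for the 12AX7 and the read-off values annotated in Fig. 9.10 can be checked against $g_m = \mu/r_p$ and the finite-increment form of Eq. (9.3.6). This is only an arithmetic sketch; the helper names are illustrative, and the inputs are the values stated in the text and in the figure.

```python
def amplification_factor(delta_e_plate, delta_e_grid):
    """mu = -(delta Eb / delta Ec) at constant plate current, Eq. (9.3.6)."""
    return -delta_e_plate / delta_e_grid


def plate_resistance(delta_e_plate, delta_i_plate):
    """r_p = delta Eb / delta Ib at constant grid voltage, in ohms."""
    return delta_e_plate / delta_i_plate


def transconductance_ms(mu, r_p_ohms):
    """g_m = mu / r_p, returned in millisiemens."""
    return 1e3 * mu / r_p_ohms


if __name__ == "__main__":
    # Fig. 9.10 read-offs: grid goes from -2 V to -10 V, plate from 100 V to 245 V.
    mu = amplification_factor(245.0 - 100.0, -10.0 - (-2.0))
    r_p = plate_resistance(210.0 - 160.0, (6.8 - 2.0) * 1e-3)
    print(mu, r_p, transconductance_ms(mu, r_p))
    # 12AX7 values quoted in the text: mu = 100, r_p = 62,500 ohms.
    print(transconductance_ms(100.0, 62500.0))
```

Note the sign convention: making the grid more negative by 8 V must be compensated by a 145-V increase in plate voltage, so the minus sign in Eq. (9.3.6) yields a positive $\mu$.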
FIGURE 9.11 Plot of plate current $I_p$ versus plate voltage $E_p$ for a typical pentode at various set control-grid potentials $E_c$. Adapted from Terman [5]. Axes: plate voltage $E_p$ (volts), 0 to 250; plate current $I_p$ (amperes), 0 to 0.01; curves for $E_c$ = 0, −1.5, and −3.0 V.
9.5 CONDUCTION IN PURE AND DOPED SI AND GE

High-purity Ge or Si became available in the late 1940s. Adding carefully controlled levels of impurities (controlled doping) was then initiated and made the “silicon revolution” possible. The starting material for Si wafer fabrication is sand (SiO2), which is reduced in an arc furnace with coal and other additives to 98% Si. This powdered Si is reacted with HCl:

$$\mathrm{Si + 3\,HCl \rightarrow SiHCl_3 + H_2} \tag{9.5.1}$$

The liquid SiHCl3 is fractionally distilled and is then reacted with H2 to produce high-purity polycrystalline Si:

$$\mathrm{SiHCl_3 + H_2 \rightarrow Si + 3\,HCl} \tag{9.5.2}$$
The starting material for Ge is GeCl4. Si or Ge crystals can be grown as 2-inch-thick sausage-shaped ingots. To produce Si ingots, there are two techniques: an adaptation of the ingot-pulling or Czochralski27 [8] process invented in 1916, which is a very slow pulling of the ingot from a hot melt, and the Bridgman28 floating-zone-refining method [9]. For GaAs, a Bridgman zone-refining process is used. In the crucible-pulling process [8], this high-purity Si, in a light vacuum or in an atmosphere of Ar or He, is melted in an SiO2 crucible supported by a carbon crucible. A seed Si crystal, with prominent (100) or (101) faces, on the tip of a rod, is made to touch the melt, then the rod is pulled slowly
27 Jan Czochralski (1885–1953).
28 Percy Williams Bridgman (1882–1961).
(1 mm min$^{-1}$), while the crucible and the rod are rotated at about 50 rpm in opposite directions. The ingots can be single crystals, and are later cut with a diamond saw, to form thin wafers, which are then polished to 5 nm roughness by using a high flux of liquid containing small particles of abrasives and an added potential (electropolishing). Both Si and Ge are covalently bonded network crystals with the diamond crystal structure (Chapter 8). There are $4.52 \times 10^{22}$ Ge atoms per cm$^3$ and $4.34 \times 10^{22}$ Si atoms per cm$^3$. The excitation energy $E$ needed to produce a free electron within the crystal is $E = 0.75$ eV for Ge and $E = 1.12$ eV for Si. The conductivity $\sigma$ (siemens cm$^{-1}$) for electrons (with density $[n]$ electrons cm$^{-3}$) and for holes (with density $[p]$ holes cm$^{-3}$) is given by

$$\sigma = e([n]\mu_n + [p]\mu_p) \tag{9.5.3}$$

where $e$ is the electronic charge, $\mu_n$ is the electron mobility (cm$^2$ s$^{-1}$ V$^{-1}$), and $\mu_p$ is the hole mobility. In all materials there is an equilibrium constant linking the concentrations $[n]$ and $[p]$:

$$K_{eq} = [n][p] \tag{9.5.4}$$
For Ge at 300 K, $K_{eq} = 6.25 \times 10^{26}$ cm$^{-6}$; for Si at 300 K, $K_{eq} = 2.25 \times 10^{20}$ cm$^{-6}$; for Si at temperatures below 700 K the equilibrium constant is $K_{eq} = 1.55 \times 10^{33}\, T^3 \exp(-1.21\,|e|/k_B T)$. Indeed, for Si at $T = 300$ K, $K_{eq} = 1.58 \times 10^{20}$ [10]. In a pure “undoped” crystal, the intrinsic concentration $[n_i]$ involves an equal number of electrons and holes:

$$[n_i] = [n] = [p] \tag{9.5.5}$$

However, electrons and holes have different mobilities. For instance, in silicon (Si) $\mu_n = 1200$ cm$^2$ s$^{-1}$ V$^{-1}$ and $\mu_p = 250$ cm$^2$ s$^{-1}$ V$^{-1}$. In germanium (Ge) $\mu_n = 3600$ cm$^2$ s$^{-1}$ V$^{-1}$ and $\mu_p = 1700$ cm$^2$ s$^{-1}$ V$^{-1}$. The intrinsic concentration $[n_i]$ depends on temperature and excitation potential; because $[n_i] = K_{eq}^{1/2}$, the excitation energy enters with a factor of one-half in the exponent:

$$[n_i] = A T^{3/2} \exp(-eE/2k_B T) \tag{9.5.6}$$
where the pre-exponential factor is different for every material. For Ge, $A = 9.64 \times 10^{15}$ and $E = 0.75$ volt; therefore at 300 K $[n_i] = 2.37 \times 10^{13}$ electrons cm$^{-3}$, $\sigma = 2.13 \times 10^{-2}$ S cm$^{-1}$, and $[n_i] = [n] = [p] = 2.37 \times 10^{13}$ cm$^{-3}$. For Si, $\sigma = 4.34 \times 10^{-6}$ S cm$^{-1}$ (the difference between Si and Ge is mainly due to the difference in the excitation energies $E$). These data are summarized in Table 9.2. When Ge is lightly n-doped (with group V (group 15) elements, such as P, As, or Sb, which act as electron donors), $[n] = 1.75 \times 10^{14}$ electrons cm$^{-3}$ (i.e., $3.87 \times 10^{-9}$ free electrons per Ge atom) and $[p] = 3.57 \times 10^{12}$ holes cm$^{-3}$, with $\sigma = 0.1$ S cm$^{-1}$. It is important to realize that, because of the equilibrium constant of Eq. (9.5.4), as an electron donor “dopant” is added, the spontaneous thermal generation of n and p carriers from the underlying Ge or Si is affected: any extra holes will recombine with the available electrons. For moderately p-doped Ge (Ge doped with group III (group 13) elements, such as B, Al, Ga, or In, which act as electron acceptors), $[n] = 1.70 \times 10^{11}$ electrons cm$^{-3}$,
Table 9.2 Crucial Data for Ge, Si, and GaAs [5,11,12]

| Property | Ge | Si | GaAs |
|---|---|---|---|
| Space group | Fd-3m (diamond) | Fd-3m (diamond) | F-43m (zincblende) |
| Molar mass (g/mol) | 72.6 | 28.09 | 144.6 |
| Unit cell side (Å) | 5.64613 | 5.43095 | 5.6355 |
| Unit cell volume (Å^3; 1 Å^3 = 10^-3 nm^3) | 179.98 | 160.19 | 178.98 |
| Number of formula units per cell Z | 8 | 8 | 4 |
| Density (#Ge, #Si, or #GaAs cm^-3) | 4.44 × 10^22 | 4.994 × 10^22 | 2.235 × 10^22 |
| Density (g cm^-3) | 5.362 | 2.909 | 5.368 |
| Excitation energy, or energy gap (eV) | 0.66 | 1.12 | 1.42 |
| Equilibrium constant at 300 K, Keq = [n][p] (cm^-6) | 5.61 × 10^26 | 2.25 × 10^20 | 3.20 × 10^12 |
| Intrinsic carrier concentration [ni] = Keq^(1/2) (cm^-3) | 2.37 × 10^13 | 1.45 × 10^10 | 1.79 × 10^6 |
| [ni]/[At] = no. impurities/atom | 5.53 × 10^-10 | 3.46 × 10^-13 | 8.01 × 10^-17 |
| Electron mobility μn (cm^2 V^-1 s^-1) | 3600 | 1350 | 8500 |
| Hole mobility μp at 300 K (cm^2 V^-1 s^-1) | 1700 | 480 | 400 |
| Intrinsic conductivity at 300 K (S cm^-1) | 2.13 × 10^-2 | 4.34 × 10^-6 | 10^-8 |
| n-dopant elements (electron donors) | P, As, Sb | P, As, Sb | Si, Se |
| p-dopant elements (electron acceptors) | B, Al, Ga, In | B, Al, Ga, In | Be, Mg |
$[p] = 3.68 \times 10^{15}$ holes cm$^{-3}$, and $\sigma = 1$ S cm$^{-1}$. For heavily n-doped Ge, $[n] = 1.75 \times 10^{17}$ electrons cm$^{-3}$, $[p] = 3.57 \times 10^{9}$ holes cm$^{-3}$, and $\sigma = 100$ S cm$^{-1}$. The ranges of conductivity and hole and electron concentrations for Ge and Si are given in Table 9.3. The energy bands (valence and conduction) are depicted schematically in Fig. 9.12. Insulators are characterized by a filled valence band, a large energy gap (large with respect to thermal energies), and an empty conduction band. Intrinsic semiconductors have the same energy band structure as insulators, but the intrinsic defect states allow for a small population of electrons (at the bottom of the conduction band) and an equal number of holes (at the top of the valence band): it is these electrons and holes that permit the intrinsic conductivity of Si or Ge. These bands can be loosely considered as the broadening of the HOMO and LUMO levels of single atoms or molecules into the valence and conduction bands, respectively. An “n-type” semiconductor has a set of donor atom levels (As or Se atom levels for
Table 9.3 Ranges of Electron and Hole Concentration, Mobility, and Conductivity in Typical Samples of Ge and Si [5,10–12]

| Quantity | Intrinsic (No Doping) | n-Type (Light Doping) | p-Type (Moderate Doping) | Deg. n-Type (Heavy Doping) |
|---|---|---|---|---|
| #Ge atoms cm^-3 | 4.44 × 10^22 | 4.44 × 10^22 | 4.44 × 10^22 | 4.44 × 10^22 |
| [n] (electrons cm^-3) | 2.37 × 10^13 | 1.75 × 10^14 | 1.70 × 10^11 | 1.75 × 10^17 |
| [p] (holes cm^-3) | 2.37 × 10^13 | 3.57 × 10^12 | 3.68 × 10^15 | 3.57 × 10^9 |
| σ (S cm^-1) | 2.13 × 10^-2 | 1.00 × 10^-1 | 1.00 × 10^0 | 1.00 × 10^2 |
| #Si atoms cm^-3 | 4.99 × 10^22 | 4.99 × 10^22 | 4.99 × 10^22 | 4.99 × 10^22 |
| [n] (electrons cm^-3) | 1.45 × 10^10 | 7.5 × 10^13 | 9.55 × 10^5 | 2 × 10^19 |
| [p] (holes cm^-3) | 1.45 × 10^10 | 2.8 × 10^6 | 2.2 × 10^14 | 10.5 |
| σ (S cm^-1) | 4.34 × 10^-6 | 5 × 10^-2 | 1.1 × 10^-2 | 6 × 10^2 |
Figure 9.12 panels (energy bands, with the energy gap ΔE between valence band and conduction band; D marks donor levels, A marks acceptor levels): (A) Insulator: filled valence band, empty conduction band, and large energy gap ΔE. (B) Intrinsic semiconductor: almost filled valence band, almost empty conduction band, large energy gap ΔE, a few electrons and a few holes. (C) n-type semiconductor: filled valence band, almost empty conduction band, large energy gap ΔE, a few donors D in the gap, a few electrons in the conduction band. (D) p-type semiconductor: almost filled valence band, empty conduction band, large energy gap ΔE, a few acceptors A in the gap, a few holes in the valence band. (E) Metal: half-filled band, with many empty states available for conduction electrons. (F) Semimetal: valence band and empty conduction band overlap.
Ge, P, or S atom levels for Si) close to the conduction band, and an extra set of free electrons in the bottom of the conduction band. A “p-type” semiconductor has a set of acceptor atom levels (Ga or Zn for Ge, Al or Mg for Si) close to the valence band and has a set of extra holes close to the top of the valence band.
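The interplay of Eqs. (9.5.3) and (9.5.4) can be sketched numerically. The example below assumes full ionization of the dopants and overall charge neutrality ($[n] - [p]$ equals the net donor density), which is an added assumption not stated in the text; the function names are illustrative. The Ge inputs ($K_{eq} = 6.25 \times 10^{26}$ cm$^{-6}$, $\mu_n = 3600$, $\mu_p = 1700$ cm$^2$ s$^{-1}$ V$^{-1}$) are the values quoted above.

```python
import math

E_CHARGE = 1.602176634e-19  # electronic charge, coulombs


def carrier_densities(net_donors, k_eq):
    """Solve [n][p] = Keq together with [n] - [p] = net_donors (all in cm^-3).

    net_donors > 0 gives n-type material, net_donors < 0 gives p-type.
    Full dopant ionization is assumed.
    """
    root = math.sqrt(net_donors ** 2 + 4.0 * k_eq)
    if net_donors >= 0.0:
        n = 0.5 * (net_donors + root)
        p = k_eq / n
    else:
        p = 0.5 * (-net_donors + root)
        n = k_eq / p
    return n, p


def conductivity(n, p, mu_n, mu_p):
    """Eq. (9.5.3): sigma = e([n] mu_n + [p] mu_p), in S cm^-1."""
    return E_CHARGE * (n * mu_n + p * mu_p)


if __name__ == "__main__":
    k_eq_ge, mu_n, mu_p = 6.25e26, 3600.0, 1700.0
    for label, net in (("intrinsic", 0.0),
                       ("light n-doping", 1.75e14 - 3.57e12),
                       ("moderate p-doping", -(3.68e15 - 1.70e11))):
        n, p = carrier_densities(net, k_eq_ge)
        print(label, n, p, conductivity(n, p, mu_n, mu_p))
```

With these inputs the sketch returns conductivities close to the 0.02, 0.1, and 1 S cm$^{-1}$ quoted for intrinsic, lightly n-doped, and moderately p-doped Ge, and it shows how adding donors suppresses the hole density through the equilibrium constant.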
9.6 RECTIFICATION IN pn JUNCTION DIODES OR RECTIFIERS

The first pn junction diode or rectifier was reported in 1949 [13]. The term “diode” comes from the vacuum-tube literature, and the new device was called a rectifier when it was used in electrical rectifier circuits. However, the term “diode” should be reserved for vacuum-tube devices, and “rectifier” should be used for semiconductor pn junction devices. In a typical np or pn junction diode an n-doped region is put in intimate electrical contact with a p-doped region. This can be done in two ways: (i) In the grown junction, a single crystal is slowly pulled from a melt that contains, say, a group V (group 15) element, making the part of the crystal above the melt n-doped; in the middle of the process, an excess of impurities of the opposite kind (a group III or group 13 element) is added to the melt, so the part pulled thereafter from the melt will be overall p-doped. (ii) In the fused-junction diode, for example, In is melted atop a slab of n-type Ge, so that the region around the In becomes overall p-doped. The important junction is at the interface between the n-type region and the p-type region (Fig. 9.13).
FIGURE 9.12 Insulators, semiconductors, metals, and semimetals.
FIGURE 9.13 Distribution of free carriers (electrons, holes), stationary uncompensated charges (cations in n region, anions in p region). Panels: (A) Fixed cations (in the n region) and anions (in the p region), with mobile electrons (left) and mobile positive charges, or holes (right); the junction is the space-charge or depletion region, and arrows mark the electric field in it. Fixed positive charges sit on group-V (15) electron donor atoms after electron donation; fixed negative charges sit on group-III (13) electron acceptor atoms after electron acceptance. (B) Concentrations of electron donors, electron acceptors, mobile electrons, and mobile holes. (C) Concentration of uncompensated net charges (positive on left, negative on right). (D) Electrical potential as a function of position x (at zero applied bias), showing the built-in potential $V_{bi}$ for holes and free electrons. (E) Electron energy: band edges $E_C$ and $E_V$ in the n and p regions, the Fermi level $E_F$, and the energies $eV_{bi}$, $eV_n$, and $eV_p$. (F) Electrical symbol for the np rectifier shown above (arrow follows hole current direction).
When such a pn junction is subjected to an electric field, then the thickness of the depletion, or space-charge, region is affected (Fig. 9.14): at forward bias it becomes thinner, at reverse bias it thickens. By a convention in solid-state physics, the energy plotted in Figs. 9.13 and 9.14 is the electron energy; therefore a positive bias moves the energy level downwards. The overall current density $I$ (A m$^{-2}$) is given by the Ebers29–Moll30 equation [14]:

$$I = I_{rs}[\exp(eV/k_B T) - 1] \tag{9.6.1}$$

29 Jewell James Ebers (1921–1959).
30 John L. Moll (1921– ).
FIGURE 9.14 Effect of forward and reverse bias on the size of the pn junction region. Panels (electron energy versus position, with the junction at x = 0, holes on the p side and electrons on the n side): (A) No bias, V = 0: barrier ΔE, with a depletion or space-charge region of uncompensated net charges on the two sides of the pn junction. (B) Forward bias, V > 0 (for the p region): barrier ΔE − |V|; the depletion or space-charge region shrinks in size. (C) Reverse bias, V < 0 (for the p region): barrier ΔE + |V|; the depletion or space-charge region increases in size. Adapted from Terman [5].
where $k_B$ is Boltzmann’s constant, $T$ is the absolute temperature, $e$ is the electronic charge, $V$ is the external voltage applied to the rectifier, and $I_{rs}$ is the reverse saturation current or generation flux; $I_{rs}$ is due to a flow of minority intrinsic carriers (of the opposite charge to the majority carriers, but several orders of magnitude lower in concentration) that is largely unaffected by $V$ (for $V < -0.1$ V). Typically, $I_{rs}$ is of the order of a few microamperes and is opposite in sign to the forward current. At $T = 300$ K the Ebers–Moll equation is

$$I = I_{rs}[\exp(38.68\,V) - 1] \tag{9.6.2}$$

($V$ in volts) (Fig. 9.15). Equation (9.6.1) will be seen again below for transistors. For practical diodes, the coefficient of $V$ in the exponent of Eq. (9.6.2) should be somewhere between 15.5 and 39 [15]. The Ebers–Moll equation is the same as the Shockley31 equation [9]:

$$J = J_p + J_n = J_s[\exp(eV/k_B T) - 1] \tag{9.6.3}$$

31 William Bradford Shockley (1910–1989).
FIGURE 9.15 Rectifier current–voltage (IV) plot. The reverse current before Zener breakdown is $-I_{rs}$, the (negative) reverse saturation current. The curve follows the Ebers–Moll equation, Eq. (9.6.1), $I = I_{rs}[\exp(eV/k_B T) - 1]$, which does not work very well in the Zener breakdown region. Axes: voltage $V$ (volts), with ticks from −3 V to 2 V; current $I$ (A m$^{-2}$). Labeled features: forward current, reverse current (before breakdown), breakdown voltage, and Zener region.
where $J_s = e D_p^{1/2} p_{n0} \tau_p^{-1/2} + e D_n^{1/2} n_{p0} \tau_n^{-1/2} = e D_p p_{n0} L_p^{-1} + e D_n n_{p0} L_n^{-1}$ [12]. $D_p$ and $D_n$ are the diffusion coefficients for holes and electrons, respectively; $p_{n0}$ and $n_{p0}$ are the equilibrium hole density on the n side and the equilibrium electron density on the p side, respectively; $L_p$ and $L_n$ are the diffusion lengths for holes and for electrons, respectively; $\tau_p$ and $\tau_n$ are the recombination lifetimes for holes and electrons, respectively. Furthermore, $L_p = D_p^{1/2} \tau_p^{1/2}$ and $L_n = D_n^{1/2} \tau_n^{1/2}$. Shockley derived Eq. (9.6.3) for the case where (a) the depletion layer ends suddenly, with dipole layers at both extremities; (b) the Boltzmann32 distribution is valid for the concentrations of both electrons and holes:

$$p_{n0} = n_i \exp[(E_i - E_F)/k_B T] \tag{9.6.4}$$

$$n_{p0} = n_i \exp[-(E_i - E_F)/k_B T] \tag{9.6.5}$$

and $n_i$ is the intrinsic concentration of either holes or electrons at zero applied volts; (c) the minority carrier concentration is low, relative to the majority carrier concentration; and (d) there is no extra “external” current in the depletion layer. In Eq. (9.6.1) the current $I$ is zero at $V = 0$; $I$ grows very rapidly in the forward direction; at $V < 0$, but before Zener33 breakdown, $I$ is the small, negative, voltage-independent reverse saturation current $-I_{rs} = -|I_{rs}|$.

32 Ludwig Eduard Boltzmann (1844–1906).
33 Clarence Melvin Zener (1905–1993).
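The two equivalent expressions for the saturation current density $J_s$ can be compared numerically. The sketch below reads them as $e D^{1/2} \tau^{-1/2}$ times the minority density and as $e D / L$ times the minority density, with $L = (D\tau)^{1/2}$; the function names and all input values are hypothetical and chosen only to exercise the algebra.

```python
import math

E_CHARGE = 1.602176634e-19  # electronic charge, coulombs


def js_from_lifetimes(d_p, tau_p, p_n0, d_n, tau_n, n_p0):
    """J_s = e*sqrt(Dp/tau_p)*p_n0 + e*sqrt(Dn/tau_n)*n_p0."""
    return E_CHARGE * (math.sqrt(d_p / tau_p) * p_n0 + math.sqrt(d_n / tau_n) * n_p0)


def js_from_diffusion_lengths(d_p, tau_p, p_n0, d_n, tau_n, n_p0):
    """J_s = e*Dp*p_n0/Lp + e*Dn*n_p0/Ln, with L = sqrt(D*tau)."""
    l_p = math.sqrt(d_p * tau_p)
    l_n = math.sqrt(d_n * tau_n)
    return E_CHARGE * (d_p * p_n0 / l_p + d_n * n_p0 / l_n)


if __name__ == "__main__":
    # Hypothetical inputs: D in cm^2 s^-1, tau in s, densities in cm^-3 (J_s in A cm^-2).
    args = (12.0, 1e-6, 1e5, 35.0, 1e-6, 1e4)
    print(js_from_lifetimes(*args), js_from_diffusion_lengths(*args))
```

Both functions return the same value to rounding error, because $D/L = D/(D\tau)^{1/2} = (D/\tau)^{1/2}$.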
Beyond some large negative $V_{brk}$, where the electric field exceeds $3 \times 10^{7}$ V/m, electrical breakdown occurs [16]: the current $I$ now becomes negative and very large; this effect is not included in Eq. (9.6.3) but is shown in Fig. 9.15. If this large negative voltage $V_{brk}$ is removed quickly enough, the overall current is limited by the circuitry, and the crystal has not been heated unduly, then the np junction can recover from the breakdown, and the IV curve remains reversible. If too much Joule heating occurs for too long a time at $V < V_{brk}$ (i.e., in the Zener region), then the breakdown is irreversible and the rectifier is destroyed. The negative current at $V < V_{brk}$ is used in heavily doped reverse-biased Zener diodes to regulate power supply voltages precisely. At even more negative potentials, avalanche breakdown occurs, where impact ionization creates electron–hole pairs.

In the n region, the majority carriers are electrons, which constitute the “drift current,” but there are usually some holes also, whose concentration is typically at least 100 times lower than that of the electrons; these holes in the n region constitute a diffusion current. Symmetrically, in the p region, the majority carriers are holes, which constitute the drift current; the minority carriers (100 to 1000 times lower in concentration) are the electrons, which in the p region make up a diffusion current. Ignoring electron–hole recombination and thermal generation, the sum of drift (majority) and diffusion (minority) current is constant across the length of the p region and also within the n region. The total current (electron + hole) is constant across the whole device, by conservation of charge.

The built-in potential $V_{bi}$ in Fig. 9.13(D) in a pn junction rectifier is about 0.4 V for Ge, 0.8 V for Si, and 1.2 V for GaAs for a background doping concentration of about $10^{14}$ cm$^{-3}$. Some important characteristics of rectifiers are as follows:

1. The junction capacitance under reverse bias is of the order of

$$C = 5 \text{ to } 50\ \mathrm{pF} \tag{9.6.6}$$
2. The dynamic resistance $R$ of a rectifier of cross-sectional area $A$, total forward current density $I$ (amperes m$^{-2}$), and total forward current $I_t = IA$ (amperes) (i.e., the resistance to a small increase in voltage $dV$ added to the forward bias) is

$$R = (1/A)(dV/dI) = (39\,I_t)^{-1} \tag{9.6.7}$$

if one uses Eq. (9.6.2) in the limit $\exp(39\,V) \gg 1$. This dynamic resistance is typically 25 ohms for a forward current of 1 mA, and it decreases for larger currents.

3. The rectification ratio

$$RR(V) \equiv -I(V)/I(-V) \tag{9.6.8}$$

is the ratio of the current $I$ for a stated positive voltage $V$ to the negative of the current $I$ at the equal and opposite voltage $-V$. A typical and desirable value is $RR \geq 100$.

Note that, in the forward bias mode for an np rectifier, electrons from the n region, which carry the so-called “drift current” of majority carriers, rush
toward the np junction; on the p side of the junction, these electrons quickly meet an excess of holes, recombine with them, and disappear (on the p side of the junction, electrons are the minority carriers and become a relatively unimportant “diffusion current”). In the p region, the holes “take over”: they rush in the direction opposite to the motion of the electrons in the n region, and this reverse positive current, due to holes, is equal in magnitude to the electron current in the n region. When the current is switched very rapidly, the rectifier may not be able to respond with infinite speed.

The electrical contacts to the rectifiers or transistors or vacuum tubes discussed here are, by preference, nonrectifying or ohmic, which occurs when the Fermi levels bend, but meet with conduction bands on both sides of the junction. When, however, the motion of electrons (or holes) from semiconductor to metal encounters a barrier that is higher in one direction than in the opposite direction, then the junction is called rectifying, and one speaks of a Schottky34 barrier [17]. Table 9.4 gives characteristics for a few typical pn junction rectifiers.

The Esaki35 tunnel diode exhibits a region where the current decreases with increasing bias (this is called “negative differential resistance,” or NDR) (Fig. 9.16); it occurs in junctions between heavily degenerate n- and p-regions (labeled n+ and p+); the decrease in current is due to elastic tunneling, whose probability becomes significant when filled and empty bands reach the same level (case V in Fig. 9.16). This tunnel diode has two advantages. First, the negative resistance of the device means it can be used as an amplifier (see Fig. 9.16E). Second, the diode emits microwaves, as can the related Gunn36 diode, which has a thin lightly doped layer sandwiched between two thicker heavily doped layers; NDR is also seen for the Gunn diode (see Chapter 10).
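The rectifier characteristics in Eqs. (9.6.7) and (9.6.8) can be checked numerically. The following is a minimal sketch, assuming the ideal junction law I = Irs[exp(39 V) − 1] with e/kBT ≈ 39 V⁻¹ near room temperature; the function names and the saturation current value are illustrative choices, not taken from the text.

```python
import math

E_OVER_KT = 39.0  # e/(k_B T) in 1/V near room temperature (assumed, as in Eq. 9.6.7)

def diode_current(v, i_rs):
    """Ideal rectifier current I = I_rs * (exp(39 V) - 1), in amperes."""
    return i_rs * (math.exp(E_OVER_KT * v) - 1.0)

def dynamic_resistance(i_total):
    """Small-signal resistance R = 1 / (39 * I_t) in ohms, valid when exp(39 V) >> 1."""
    return 1.0 / (E_OVER_KT * i_total)

def rectification_ratio(v, i_rs):
    """RR(V) = -I(V) / I(-V) for a positive test voltage V."""
    return -diode_current(v, i_rs) / diode_current(-v, i_rs)

# Dynamic resistance at a forward current of 1 mA (about 25.6 ohms).
print(dynamic_resistance(1e-3))
# Rectification ratio of the ideal junction at +/-0.1 V; i_rs is a hypothetical value.
print(rectification_ratio(0.1, 1e-12))
```

For this ideal law the rectification ratio reduces to exp(39 V) and does not depend on Irs; real rectifiers deviate from it because of series resistance and breakdown.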
Table 9.4 Characteristics for Some Commercial pn Junction Rectifiers

Type     Continuous       Peak             Reverse           Reverse     Capacitance  Notes
         VF (V) @ IF (A)  VF (V) @ IF (A)  VR (V) @ IR (μA)  Rec. (ns)   C (pF)
PAD-1    0.8    0.005     -      -         20     1 pA       -           0.8          Lowest IF
ID101    0.8    0.001     1.1    0.03      10     10 pA      -           0.8          Very low IF
1N3595   0.7    0.010     1.0    0.2       150    3          3000        8.0          General-purpose
1N4002   0.9    1         2.3    25        100    50         3500        15           Industrial standard
1N5819   0.4    1         1.1    20        40     10000      -           50           Power Schottky
1N5625   1.1    5         2.0    50        400    50         2500        45           5-A rectifier

Source: Horowitz and Hill, Ref. 16.

34 Walter Hermann Schottky (1886–1976).
35 Reona “Leo” Esaki (1925– ).
36 John Battiscombe Gunn (1928–2008).
9.7 pnp AND npn TRANSISTORS

When an np rectifier is connected, through a shared p region, to a pn rectifier, we have an npn junction “triode” transistor, or bipolar junction transistor (BJT). This transistor can amplify signals, just as does the vacuum-tube triode, but by a totally different mechanism. The first transistor, made by Bardeen,37 Shockley, and Brattain38 at Bell Telephone Laboratories in 1947, was a point-contact transistor (Fig. 9.17) [18–21].

37 John Bardeen (1908–1991).
38 Walter Houser Brattain (1902–1987).
FIGURE 9.16 Esaki tunnel diode. (A) Diode made by joining a heavily doped p-region to a heavily doped n-region, and band energy diagram for thermal equilibrium and zero bias (I = 0 at V = 0). (B) IV curve, showing the region where R < 0, with points R, T, S, V, and F marked. (C) Band diagrams that correspond to the points R, T, S, V, and F. (D) Symbol for Esaki tunnel diode. (E) Use of tunnel diode as an amplifier: If the load resistor on the right (+R) matches the negative resistance −R within the diode, then the total output resistance is zero; a small signal Vin on the left can be amplified to a large signal Vout on the right (“infinite gain”).
FIGURE 9.17 The first transistor, a point-contact transistor, invented by Bardeen, Brattain, and Shockley in 1947 [18,19]. The emitter and collector contacts touch p-type regions on an n-type Ge wafer, which serves as the base. Adapted from Terman [5].
At the request of Brattain, Pierce39 coined the name “transistor,” an abbreviation of the neologism “transfer resistor.” Later, Teal40 and Sparks41 made the first Ge BJT at Bell Labs in 1950, and Teal made the first Si BJT at Texas Instruments in 1954 [22], ushering in the “silicon age.” Junction transistors (Figs. 9.18A and 9.18B) can be made in two ways: (1) as a grown-junction transistor or (2) as a fused-junction transistor.

1. The grown-junction pnp transistor is made by pulling a crystal from a melt that is p-doped, then interrupting the pulling, doping the melt with excess n material, then pulling the crystal for a short time longer, then doping the melt again with p material. The cross-sectional area of such a transistor is of the order of 0.25 mm², and the thickness of the intermediate n region is about 0.1 mm (Fig. 9.18). If, instead, one desires an npn transistor, the doping is done in reverse order (p for n, n for p). One end of the transistor is called the emitter (E) region, which emits either electrons or holes, and the other is the collector (C) region. The region in the middle is called the base (B) region. The E region is designated with an arrow in Fig. 9.18 (arrowhead away from B for npn, arrowhead toward B for pnp).

2. The second type of pnp junction transistor is the fused-junction transistor, where an n-type Ge or Si wafer is contacted on both sides with In drops and then heated; the In-rich regions become the p-type E and C regions, while the n-region in the middle is the B region.

The circuits can use the transistor in a “grounded-emitter,” “grounded-base,” or “grounded-collector” arrangement, as shown in Figs. 9.19A, 9.19B, and 9.19C, respectively. These three choices are mirrored in vacuum-tube triode circuit design (Figs. 9.19A′, 9.19B′, and 9.19C′). The first point-contact transistor of Fig. 9.17 was used in a grounded-base or common-base circuit. Note that in Fig. 9.20A, the emitter is forward-biased (i.e., VE = −0.1 V), while the collector is back-biased (VC = 6 V), as the polarity on the battery indicates. Under those conditions, the majority electron current flows from
39 John Robinson Pierce (1910–2002).
40 Gordon Kidd Teal (1907–2003).
41 Morgan Sparks (1916–2008).
FIGURE 9.18 Depiction of bipolar junction transistors: (A) npn and (B) pnp, and their representation in common-base or grounded-base circuit diagrams (C) npn, and (D) pnp, with the signs for positive currents I and voltages V indicated. (E) Four-terminal “black box” representation of the common-base npn circuit. After applying a small voltage vE, one measures a small current iE on the left, and on the output side one measures an output voltage vC and an output current iC; the base current iB is not measured, but inferred. Further panels: (F) npn and (G) pnp symbols; (H, I) lead arrangements (E, B, C) for the TO-92 case (polymer) and the TO-5 and TO-18 cases (metal).

FIGURE 9.19 Common-base, common-emitter, and common-collector circuits for a bipolar npn transistor (A, B, C, respectively), and the equivalent grounded-grid, grounded-cathode, and grounded-plate circuits for vacuum-tube triodes (A′ corresponds to A, B′ to B, and C′ to C). Panel titles: (A) npn common-base; (B) npn common-emitter; (C) npn common-collector; (A′) grounded-grid circuit; (B′) grounded-cathode circuit; (C′) grounded-plate or cathode-follower circuit. Adapted from Terman [5].
FIGURE 9.20 npn bipolar junction transistor: (A) Circuit arrangement, (B) carrier concentrations (the electron concentration within the base is shown by a dashed thick line), (C) potential distribution, and (D) current components. Adapted from Terman [5].
Labels in panel A: VC = VCB = collector-base voltage, typically 6 V; VE = VEB = emitter-base voltage, typically −0.1 V.
Labels in panel B (charge concentration per cm³, not to scale): n-region (emitter, σ = 100 siemens): electrons 1.75 × 10¹⁷, holes 3.57 × 10⁹; p-region (base, σ = 1 siemens): holes 3.68 × 10¹⁵, electrons 1.70 × 10¹¹; n-region (collector, σ = 0.1 siemens): electrons 1.75 × 10¹⁴, holes 3.57 × 10¹². The emitter junction thickness, base width, and collector junction thickness are marked.
Labels in panel D: IE = total emitter current; I′E = electron current from emitter; I″E = hole current to emitter; IC = total collector current; IC0 = reverse saturation current; IB = IE − I′C − IC0 = base current; some holes are lost to recombination.
emitter to base (the current IE flows in the direction opposite to that of the arrow in Fig. 9.20A), and then most of it flows from the base to the collector; the current IC flows in the direction indicated in Fig. 9.20A. Some of the electrons that flow through the base recombine with holes traveling in the reverse direction. Therefore the total collector current is always slightly smaller than the emitter current, but the aim is to make IC as close to IE as possible, by adjusting the majority carrier concentrations (the emitter region is more heavily doped than base or collector) and by reducing the thickness of the base region. The rest of Fig. 9.20 shows diagrammatically how the electron and hole concentrations vary as one moves from the emitter n-region to the base p-region to the collector n-region (Fig. 9.20B), how the voltages vary from E to B to C (Fig. 9.20C), and how the total current, and contributors to it, vary in the E, B, and C regions (Fig. 9.20D).
The motion of carriers through an npn transistor is depicted in Fig. 9.20. The majority carriers in the emitter (n region; electrons) rush toward the base (p region), where their concentration quickly decays, as a function of penetration in the base region. Since the base region is made very thin, the concentration of electrons may decrease roughly linearly with distance within the base region, as shown in a thick dashed line in Fig. 9.20B. At the base-to-collector interface the concentration of minority carriers (electrons) is very low, but, within the collector, the electrons are again majority carriers, which by a combination of drift and diffusion hurtle on to the collector terminal of the circuit (under reverse bias). Most of the potentials applied from the outside to the transistor are dissipated within the emitter-to-base and base-to-collector junctions (the former very thin, the latter a bit thicker).

As indicated in Fig. 9.20D, there is also a base current IB, which is designed to be as small as possible, compared to IE, so that IC can be as close as possible to IE. The total current in the emitter IE, not too far from the emitter-to-base junction, consists of two contributions: the electron current (drift current) I′E, which proceeds under forward bias toward the emitter-to-base junction, and the much smaller hole current (a diffusion current), which originates in the base and proceeds in the opposite direction, but decays exponentially, as the distance from the junction increases. The total current in the collector, close to the base-to-collector junction, consists of electrons I′C (a large fraction of I′E) that have somehow evaded capture within the base and proceed against reverse bias in the collector region. The rest of the electron current in the collector is what in pn diodes is called reverse saturation current Irs, and here it is called collector current with zero emitter current; IC0 = Irs. I′C is the “useful” electron flow in the transistor.

Figure 9.21 shows that the total emitter current IE depends exponentially on the emitter voltage VE and is displaced to the left, as the collector voltage VC increases. Figure 9.22 shows that when the npn transistor is designed properly, the collector current IC is almost independent of the collector bias VC and increases linearly as the emitter current IE is increased. The collector current IC is also very close to the emitter current IE; that is, the dimensions and conductivities of the three regions are so adapted that the base current IB is kept small. The curves in Fig. 9.22 for the npn transistor resemble the curves in Fig. 9.11 for the vacuum-tube pentode. The base current IB (which one seeks to minimize) consists mainly of holes within the base, which are generated from the electron donor cations in order
FIGURE 9.21 Emitter current IE as a function of emitter-to-base voltage VE for an npn junction transistor in the grounded-base or common-base connection for two different values of the collector voltage VC (VC = 0 and VC = 40 V). Axes: emitter voltage VE from 0 to −0.12 V; emitter current IE from 0 to −0.003 A. Adapted from Terman [5].
FIGURE 9.22 Collector current IC as a function of collector-to-base voltage VC at several different emitter current values IE (−1 mA to −5 mA), for an npn junction transistor in the common-base (grounded-base) configuration. Axes: collector voltage VCB = VC from 0 to 30 V; collector current IC from 0 to 0.005 A. Adapted from Terman [5]. The load line is a straight line that shows the effective load resistance RL = 7000 Ω; it intersects the collector voltage axis at the voltage of the collector circuit (or collector power supply voltage, here and in Fig. 9.21, VCB = 30 V), and the collector current axis at VCB/RL = 30/7000 = 0.0043 A. The operating point, or quiescent point, is the intersection of this load line with the collector current curve for the chosen emitter bias (here VE = 15.5 V); it gives the collector current in the absence of an externally applied signal. Note the similarity with Fig. 9.11.
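The load-line construction described in the caption of Fig. 9.22 amounts to one line of arithmetic. The sketch below assumes a purely resistive load and a flat collector characteristic (IC independent of VC); the function names and the 2 mA example current are illustrative, not values quoted in the text.

```python
def load_line_current(v_c, v_supply, r_load):
    """Collector current (A) on the load line at collector voltage v_c (V)."""
    return (v_supply - v_c) / r_load

def operating_voltage(i_c, v_supply, r_load):
    """Collector voltage (V) where the load line meets a flat I_C characteristic."""
    return v_supply - i_c * r_load

V_SUPPLY = 30.0   # collector supply voltage in volts (from the caption of Fig. 9.22)
R_LOAD = 7000.0   # load resistance in ohms (from the caption of Fig. 9.22)

# Intercept on the current axis: 30/7000 A, about 4.3 mA.
print(load_line_current(0.0, V_SUPPLY, R_LOAD))
# Collector voltage for an assumed collector current of 2 mA: close to 16 V.
print(operating_voltage(2.0e-3, V_SUPPLY, R_LOAD))
```

The intercept on the voltage axis is the supply voltage itself, since the load-line current vanishes there.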
to replace the holes which were lost by electron–hole recombination in the region close to the base-to-emitter junction, using electrons from the emitter. The emitter current IE is not independent of the collector voltage VC: Figure 9.23 shows that VC affects the thickness of the base-to-collector junction and thus influences the diffusion time for currents to cross the base. Transistor designers wish to bring the ratio IC/IE as close to unity as possible. Four factors tend to make IC differ from IE:

1. Only the part of IE that is carried across the emitter-to-base junction, namely I′E (see Fig. 9.20D), will contribute to the formation of IC. The ratio I′E/IE is called the emitter efficiency, and it is made to approach 1 from below, typically 0.99, by doping the emitter more heavily than the base material (e.g., see Fig. 9.20B: 100 siemens vs. 1 siemens).
FIGURE 9.23 Characteristics of the same npn junction transistor as in Figs. 9.20 and 9.21, but in a common-emitter circuit: collector current IC (0 to 0.005 A) as a function of collector-to-emitter voltage VC = VCE (0 to 30 V) for base currents IB = 0, 0.050, 0.100, 0.150, 0.200, and 0.250 mA. From the curves, β = (4.45 − 0.65)/(0.250 − 0.050) = 19. Adapted from Terman [5].
2. The ratio IC/IE is affected by the carrier recombination in the base region. The fraction of minority carriers (electrons) that cross the base (p-doped) region unscathed is called the transport factor and can be made as large as 0.99 by reducing the thickness of the base.

3. Electrons that cross the base-to-collector junction modify (reduce) the minority hole concentration in the collector region, increasing the hole concentration gradient: this accounts for a current multiplication factor, which can be as large as 1.003.

4. The presence of the reverse saturation current Irs = IC0 (this is usually a small effect).

The total collector current IC depends on the reverse saturation current Irs (as explained for pn junctions above; also called IC0, or collector current at zero emitter current) and on the emitter voltage VE by the Ebers–Moll equation [already introduced in Eq. (9.6.1)] [14]:

$$I_C = I_{rs}\,[\exp(eV_E/k_B T) - 1] \qquad (9.7.1)$$
The important practical output parameters are the small-signal collector current iC and small-signal collector voltage vC as a function of the input parameters: the small-signal emitter voltage vE, the relatively small forward-bias emitter voltage VE, and the large reverse-bias collector voltage VC. For the common-base circuit, the collector current IC is about the same as the emitter current IE, so the doubt arises: Where is the amplification? The answer is that the output voltage can be drawn across a large output load resistance RL (Fig. 9.22). A totally different way of looking at transistor action is by using a common-emitter circuit (Fig. 9.19B): Now the input current is a relatively small iB, and the output is the relatively large collector current iC; this collector current is still controlled by the Ebers–Moll equation, but the current gain is now explicit:

$$\beta = (dI_C/dI_B) \qquad (9.7.2)$$

Typical β values of 50 to 250 are available in transistors, but β does vary from sample to sample within a batch: therefore a circuit that depends on a particular value of β is poorly designed.
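To see why β scatters so much from sample to sample, it helps to combine the factors listed above numerically. The sketch below assumes that α = IC/IE can be approximated by the product of emitter efficiency, transport factor, and multiplication factor (neglecting IC0), and it uses the relation β = α/(1 − α) given later for the common-emitter circuit; both function names are illustrative.

```python
def alpha_from_factors(emitter_efficiency, transport_factor, multiplication):
    """Approximate alpha = I_C / I_E as the product of the three factors (I_C0 neglected)."""
    return emitter_efficiency * transport_factor * multiplication

def beta_from_alpha(alpha):
    """Common-emitter current gain beta = alpha / (1 - alpha)."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    return alpha / (1.0 - alpha)

# Typical factor values quoted in the text: 0.99, 0.99, and 1.003.
alpha = alpha_from_factors(0.99, 0.99, 1.003)
print(alpha, beta_from_alpha(alpha))

# A change of one percent in alpha roughly doubles beta.
for a in (0.98, 0.99):
    print(a, beta_from_alpha(a))
```

Because β depends on the small difference 1 − α, minor variations in doping or base thickness produce large changes in β, which is consistent with the advice not to design a circuit around a particular β.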
9.8 SMALL-SIGNAL THEORY FOR TRANSISTORS

Transistors are always operated so that amplification is obtained for a signal whose voltage is small on the input side (relative to the bias on the emitter) and small on the output side (relative to the collector voltage). Therefore small-signal theory applies. The designer obviously wants all signals to be amplified linearly, that is, by a common factor, so that the output is an undistorted copy of the input. In small-signal theory, one considers the superposition of a small voltage vE onto the DC power-supply-generated voltage VE, and the generation of a small voltage vC on the output, thanks to the application of a large
(reverse-bias) voltage VC. The small incremental emitter current iE and collector current iC must also be considered (see Fig. 9.18E). One can define four coefficients or hybrid parameters h11, h12, h21, and h22 and assume a linear dependence of the emitter voltage vE and the collector current iC on the emitter current iE and the collector voltage vC, as follows:

$$v_E = h_{11}\, i_E + h_{12}\, v_C \qquad (9.8.1)$$

$$i_C = h_{21}\, i_E + h_{22}\, v_C \qquad (9.8.2)$$

where

$$h_{11} = (dV_E/dI_E)|_{V_C = \text{constant}} = (v_E/i_E)|_{v_C = 0} = \text{dynamic resistance} \qquad (9.8.3)$$

$$h_{12} = (dV_E/dV_C)|_{I_E = \text{constant}} = (v_E/v_C)|_{i_E = 0} \qquad (9.8.4)$$

$$h_{21} = (dI_C/dI_E)|_{V_C = \text{constant}} = (i_C/i_E)|_{v_C = 0} = -\alpha_0 = -(\text{zero-freq. } \alpha) \qquad (9.8.5)$$

$$h_{22} = (dI_C/dV_C)|_{I_E = \text{constant}} = (i_C/v_C)|_{i_E = 0} = \text{dynamic conductance} \qquad (9.8.6)$$
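Equations (9.8.1) and (9.8.2) define a linear two-port, which is easy to evaluate directly. The sketch below uses the common-base values listed in Table 9.5, with h21 taken as −0.98 and h12 as 5 × 10⁻⁴; the sign and the exponent are assumptions, chosen so that the other tabulated resistances are reproduced. The function name and the 0.1 mA test signal are illustrative.

```python
def common_base_response(i_e, v_c, h11, h12, h21, h22):
    """Return (v_e, i_c) from Eqs. (9.8.1) and (9.8.2) for small signals i_e and v_c."""
    v_e = h11 * i_e + h12 * v_c   # Eq. (9.8.1)
    i_c = h21 * i_e + h22 * v_c   # Eq. (9.8.2)
    return v_e, i_c

# Assumed typical values: h11 in ohms, h22 in siemens, h12 and h21 dimensionless.
H = dict(h11=30.0, h12=5e-4, h21=-0.98, h22=1e-6)

# Response to a 0.1 mA emitter signal with the collector voltage held fixed (v_c = 0).
v_e, i_c = common_base_response(1e-4, 0.0, **H)
print(v_e, i_c)   # about 3 mV and -0.098 mA
```

With vC = 0 the ratio iC/iE is simply h21, so the collector signal current is 98% of the emitter signal current with the opposite sign convention.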
The linearity implicit in Eqs. (9.8.1) and (9.8.2) is not guaranteed, but, for small signals, it is a good first approximation. The term α0 is the zero-frequency limit to the frequency-dependent dimensionless coefficient α. The coefficients are designated in Fig. 9.24A, and typical values are given in Table 9.5. The dynamic resistance h11 of the emitter-base circuit is of the order of 100 Ω, while h22 is the dynamic conductance of the collector-base circuit, and it ranges from 0.5 × 10⁻⁶ to 5 × 10⁻⁶ siemens. The coefficient h21 is always negative and slightly larger than −1. The coefficient h12 represents the relative effectiveness of the emitter and collector voltages in influencing the emitter current. The above equations are appropriate for a common-base circuit.

The characteristics of a typical junction transistor are shown in Table 9.5, with several ways of analyzing the behavior of the same npn junction. The first three are for the common-base circuit shown so far, as shown in Fig. 9.23. Figure 9.24A identifies the four hybrid parameters or coefficients hij defined in Eqs. (9.8.1) to (9.8.6). If one adopts a T-description (Fig. 9.24B), then one must define new parameters as follows:

$$r_E \equiv h_{11} - (1 + h_{21})\, h_{12}\, h_{22}^{-1} \qquad (9.8.7)$$

$$r_B \equiv h_{12}\, h_{22}^{-1} \qquad (9.8.8)$$

$$r_C \equiv (1 - h_{12})\, h_{22}^{-1} \approx h_{22}^{-1} \qquad (9.8.9)$$

$$\alpha \equiv -(h_{21} + h_{12})(1 - h_{12})^{-1} \approx -h_{21} \equiv \alpha_0 \qquad (9.8.10)$$
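The conversion from hybrid parameters to the T-equivalent circuit can be checked against Table 9.5. This is a minimal sketch of Eqs. (9.8.7) to (9.8.10), assuming h21 = −0.98 and h12 = 5 × 10⁻⁴ (the sign and exponent that reproduce the tabulated rE = 20 Ω and rB = 500 Ω); the function name is illustrative.

```python
def h_to_t(h11, h12, h21, h22):
    """Convert common-base hybrid parameters to T-equivalent parameters."""
    r_e = h11 - (1.0 + h21) * h12 / h22      # Eq. (9.8.7), ohms
    r_b = h12 / h22                          # Eq. (9.8.8), ohms
    r_c = (1.0 - h12) / h22                  # Eq. (9.8.9), ohms
    alpha = -(h21 + h12) / (1.0 - h12)       # Eq. (9.8.10), dimensionless
    return r_e, r_b, r_c, alpha

r_e, r_b, r_c, alpha = h_to_t(h11=30.0, h12=5e-4, h21=-0.98, h22=1e-6)
print(r_e, r_b, r_c, alpha)   # roughly 20 ohm, 500 ohm, 1e6 ohm, 0.98
```

The small corrections from h12 (0.05% in rC and about 0.001 in α) show why the approximations rC ≈ 1/h22 and α ≈ −h21 are adequate in practice.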
FIGURE 9.24 (A, B) Equivalent circuits for an npn transistor: (C) In common-base configuration and (D) in common-emitter configuration. The circuit elements shown include the current generators (h21 iE and α iE), the voltage generators (h12 vC and μ0 v′C), the resistances h11, rE, rB, rC, r′B, r′C, and (1 − α)rC, the conductance h22, and the collector capacitance CC. See Table 9.5 for numerical values.
If one considers Fig. 9.24C instead, which attempts to display the actual physical processes, then one must also define:

$$r'_E \equiv \text{effective emitter resistance} \approx h_{11} - r'_B\,(1 - \alpha_0) \qquad (9.8.11)$$

$$r'_B \equiv \text{effective base resistance} \qquad (9.8.12)$$

$$r'_C \equiv \text{effective collector resistance} \equiv (dV_C/dI_C)_{I_E = \text{const.}} \approx h_{22}^{-1} \qquad (9.8.13)$$

$$\mu_0 \equiv h_{12} - h_{22}\, r'_B \qquad (9.8.14)$$

For the common-emitter circuit (Fig. 9.19B) the current gain β is used:

$$\beta = (dI_C/dI_B) \approx \alpha\,(1 - \alpha)^{-1} \qquad (9.8.15)$$

Table 9.5 Characteristics of a Typical Junction Transistor (a)

Common-Base (circuit Fig. 9.19A), Eqs. (9.8.3)–(9.8.6), Equiv: Fig. 9.24A
    h11 = 30 Ω;  h22 = 10⁻⁶ siemens;  h12 = 5 × 10⁻⁴;  h21 = −0.98
Common-Base (circuit Fig. 9.19A), Eqs. (9.8.7)–(9.8.10), Equiv: Fig. 9.24B
    rE = 20 Ω;  rC = 10⁶ Ω;  rB = 500 Ω;  α = 0.98
Common-Base (circuit Fig. 9.19A), Eqs. (9.8.11)–(9.8.14), Equiv: Fig. 9.24C
    r′E = 24 Ω;  r′C = 10⁶ Ω;  r′B = 300 Ω;  α = 0.98;  μ0 = 2 × 10⁻⁴
Common-Emitter (circuit Fig. 9.19B), Eqs. (9.8.7)–(9.8.10), Equiv: Fig. 9.24D
    rE = 20 Ω;  rC (1 − α0) = 20,000 Ω;  rB = 500 Ω;  α = 0.98;  current gain β = 49

(a) Zero-frequency α0 = 0.98, alpha cutoff frequency fα = 1 MHz, capacitance of base-to-collector junction CC = 10 pF.
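The entries of Table 9.5 for the physical equivalent circuit and for the common-emitter circuit follow from the hybrid parameters by simple arithmetic. The sketch below assumes the forms r′E ≈ h11 − r′B(1 − α0), μ0 = h12 − h22 r′B, and β = α0/(1 − α0), together with h12 = 5 × 10⁻⁴; these are reconstructions chosen because they reproduce the tabulated numbers, and the function name is illustrative.

```python
def physical_parameters(h11, h12, h22, alpha0, r_b_eff):
    """Return (r'_E, r'_C, mu_0, beta, r_C*(1 - alpha0)) from the hybrid parameters."""
    r_e_eff = h11 - r_b_eff * (1.0 - alpha0)   # Eq. (9.8.11), ohms
    r_c_eff = 1.0 / h22                        # Eq. (9.8.13), ohms
    mu0 = h12 - h22 * r_b_eff                  # Eq. (9.8.14), dimensionless
    beta = alpha0 / (1.0 - alpha0)             # common-emitter current gain
    r_c_ce = r_c_eff * (1.0 - alpha0)          # collector resistance seen in common-emitter
    return r_e_eff, r_c_eff, mu0, beta, r_c_ce

# Assumed inputs: common-base values of Table 9.5 with r'_B = 300 ohms.
print(physical_parameters(h11=30.0, h12=5e-4, h22=1e-6, alpha0=0.98, r_b_eff=300.0))
# roughly (24 ohm, 1e6 ohm, 2e-4, 49, 20000 ohm)
```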
The common-base circuit for an npn transistor (Fig. 9.19A) seems logical and simple, but its efficiency in amplification is not obvious and must be explained. The npn transistor in a common-emitter circuit (Fig. 9.19B) is a bit easier to understand as a current amplifier. There are four rules:

1. The collector must be more positive than the emitter (usually by several volts).

2. The emitter-base and base-collector circuits behave like diodes, but the emitter-base junction is forward-biased (VEB = 0.6 V), while the base-collector junction is reverse-biased. Typically VB = VE + VEB ≈ VE + 0.6 V.

3. The maximum ratings of temperature, collector current IC, base current IB, emitter-to-collector voltage VCE, and power dissipation ICVCE should not be exceeded.

4. The collector current IC is roughly proportional to IB:

$$I_C = \beta\, I_B \qquad (9.9.15)$$
One can imagine a “Transistor Man” (Fig. 9.25) whose only job is to adjust the resistor on the right in the collector circuit, so that the collector current IC he reads from the meter on the right is exactly β times the base current IB he has just read on the left. Table 9.6 collects some data on bipolar transistors.
FIGURE 9.25 Explanation of the term “transistor” as “transfer resistor”: the constant job of Mr. Transistor Man for a common-emitter npn transistor is to turn the rheostat on the right, as needed, so that the equation IC = βIB (with a preset value of the current gain β, say β = 100) is always obeyed; he makes sure that the ammeter on the right registers a current IC that is always a fixed amount β times larger than the current IB measured by the ammeter on the left. Adapted from [16].
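The “Transistor Man” picture can be turned into a few lines of code. The sketch below assumes, beyond what the text states, that the collector current cannot exceed what the supply can drive through the load resistor (the transistor then saturates); the function name, supply voltage, and load resistance are hypothetical.

```python
def collector_current(i_b, beta, v_supply, r_load):
    """Return I_C = beta * I_B, capped at the largest current the supply allows."""
    i_max = v_supply / r_load        # assumed saturation limit, not from the text
    return min(beta * i_b, i_max)

BETA = 100.0        # preset current gain, as in the caption of Fig. 9.25
V_SUPPLY = 10.0     # hypothetical collector supply, volts
R_LOAD = 1000.0     # hypothetical collector load, ohms

for i_b in (10e-6, 50e-6, 1e-3):
    print(i_b, collector_current(i_b, BETA, V_SUPPLY, R_LOAD))
```

For the two smaller base currents the proportional rule IC = βIB holds; for the largest one the cap applies, which corresponds to rule 1 (the collector must stay more positive than the emitter) no longer being satisfied.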
Table 9.6 Characteristics of Selected Bipolar Transistors

Use                    VCE (V)  ICmax (mA)  β    at IC (mA)  CCB (pF)  fα (MHz)  npn Examples  pnp Examples
General purpose        25       200         200  2           1.8–2.8   300       2N4124 (a)    2N2126 (a)
High gain, low noise   25       300         250  50          4         300       2N6008 (a)    2N6009 (a)
High current           30–60    600         150  150         5         300       2N2219 (b)    2N2905 (b)
High voltage           150      600         100  10          3–6       250       2N5550 (a)    2N5401 (a)
High speed             12       100         50   8           1.5       900       2N918 (c)     2N4208 (c)

(a) TO-92 plastic casing. (b) TO-5 metal casing. (c) TO-18 metal casing (see Fig. 9.18).
9.9 LARGE-SIGNAL BEHAVIOR OF JUNCTION TRANSISTORS

The Ebers–Moll equation for a pnp transistor can be rewritten as

$$I_E = a_{11}[\exp(e\phi_E/k_B T) - 1] + a_{12}[\exp(e\phi_C/k_B T) - 1] \qquad (9.9.1)$$

$$I_C = a_{21}[\exp(e\phi_E/k_B T) - 1] + a_{22}[\exp(e\phi_C/k_B T) - 1] \qquad (9.9.2)$$

where the quantities φE and φC are the junction potentials at the emitter-to-base junction and at the base-to-collector junction, respectively; it is assumed that (i) the resistivities of the semiconductor regions are low, (ii) the injected current densities are low, and (iii) the space-charge layer widening effects are negligible. In general [14],

$$a_{12} = a_{21} \qquad (9.9.3)$$

The four coefficients a11, a12, a21, and a22 can be obtained from the four following transistor parameters:

IE0 = the current at the emitter-base junction at saturation, with zero collector current,
IC0 = the current at the collector-base junction at saturation, with zero emitter current,
αN = the “normal” current gain, with the emitter functioning as an emitter, and the collector functioning as a collector (normal α), and
αI = the “inverted” current gain, with the collector functioning as an emitter, and the emitter functioning as a collector (inverted α).

It can be shown [14] that

$$I_E = -\frac{I_{E0}}{1 - \alpha_N \alpha_I}[\exp(e\phi_E/k_B T) - 1] + \frac{\alpha_I I_{C0}}{1 - \alpha_N \alpha_I}[\exp(e\phi_C/k_B T) - 1] \qquad (9.9.4)$$

$$I_C = \frac{\alpha_N I_{E0}}{1 - \alpha_N \alpha_I}[\exp(e\phi_E/k_B T) - 1] - \frac{I_{C0}}{1 - \alpha_N \alpha_I}[\exp(e\phi_C/k_B T) - 1] \qquad (9.9.5)$$

which can be solved as

$$I_E + \alpha_I I_C = -I_{E0}[\exp(e\phi_E/k_B T) - 1] \qquad (9.9.6)$$

$$I_C + \alpha_N I_E = -I_{C0}[\exp(e\phi_C/k_B T) - 1] \qquad (9.9.7)$$

Of the four parameters in Eqs. (9.9.6) and (9.9.7), one can be eliminated by using

$$\alpha_I I_{C0} = \alpha_N I_{E0} \qquad (9.9.8)$$
FIGURE 9.26 Comparison of FETs and BJTs, arranged by input polarity (− INPUT, + INPUT) and output polarity (+ OUTPUT, − OUTPUT). The device groups shown are: n-channel depletion-mode MOSFETs and n-channel JFETs; n-channel enhancement-mode MOSFETs and npn junction transistors; p-channel enhancement-mode MOSFETs and pnp junction transistors; p-channel JFETs.

FIGURE 9.27 Transistor characteristics in the collector region, showing region 1 (collector voltage saturation, or collector current cutoff), region 2 (active region, for IC > 0 and VC > 0), and region 3 (collector current saturation, or collector voltage cutoff). Axes: collector voltage ECB = EC from −5 to 35 V; collector current IC from −0.001 to 0.007 A; curves for IE = −1 mA to −7 mA.
which is a consequence of Eq. (9.9.3) [14]. This leaves two equations in three unknowns:

$$I_E + \alpha_I I_C = -I_{E0}[\exp(e\phi_E/k_B T) - 1] \qquad (9.9.9)$$

$$I_C + \alpha_N I_E = -(\alpha_N/\alpha_I)\, I_{E0}[\exp(e\phi_C/k_B T) - 1] \qquad (9.9.10)$$
9.10 UNIPOLAR OR FIELD-EFFECT TRANSISTORS (FET)

The surface FET was proposed by Lilienfeld42 [23] and by Heil43 [24]. The junction unipolar or field-effect transistor (JFET) was proposed by Shockley [25] but was first realized by Dacey44 and Ross45 [26]. Figure 9.28 shows schematically an FET. The conducting n-type region (“channel”) is controlled by the electric field between gate and body; the silicon oxide (glass) insulator layer has a resistance as large as 10¹⁴ Ω, to prevent any current from the gate to channel, source, or drain: the present minimum thickness of this insulator is about 5 to 6 atoms. The design rule (DR) is the lateral distance between components.

The integrated circuit (IC) using FETs was invented almost simultaneously in 1959 by Kilby46 at Texas Instruments [27] and by Noyce47 at Fairchild Corp. [28]. Thereafter, the dramatic integration and compression of circuit sizes was made possible by photolithographic design of the whole circuit, using visible light (down to DR = 150 nm), deep UV light (DR < 150 nm), and then electron beams (down to DR = 50 nm at present): the stages were LSI (large-scale integration), then VLSI (very large scale integration), then ULSI (ultra large-scale integration), and VHSIC (very high speed integrated circuits): these circuits contain FETs rather than BJTs, which are more difficult to place into ICs. At present, DR = 40 nm devices are approaching commercialization. The technological drive to smaller DR is spurred by the increase in speed of the circuit as DR decreases: Moore48 observed a doubling of IC integration per unit area every two years in the 1960s [29]; this doubling (“Moore’s law”), with a concomitant doubling of circuit speed, has continued unabated ever since; this is a triumph of engineering, driven by a profit motive. The ultimate limit (DR = 5 nm) may be reached by 2020, but photolithography setup costs and heat dissipation become huge problems.

FIGURE 9.28 n-channel MOSFET, showing the source, gate, and drain contacts (Au), the SiO2 insulator, the n-Si source and drain regions, and the p-Si body or substrate; an n-type conducting “channel” is formed when the gate is positive. The thickness of the silicon oxide is made as small as possible. The “design rule” for FETs or other components is the minimum distance used between components; small design rules make large-scale IC integration possible. Adapted from Horowitz and Hill [15].

42 Julius Edgar Lilienfeld (1882–1963).
43 Oscar Heil (1908–1994).
44 G. C. Dacey (1921– ).
45 Ian Munro Ross (1927– ).
46 Jack St. Clair Kilby (1923–2005).
47 Robert Norton Noyce (1927–1990).
48 Gordon Earle Moore (1929– ).

Both bipolar junction transistors (BJTs) and field-effect transistors (FETs) are charge-control devices [15]. The functions of the emitter, base, and collector electrodes of the BJT are replaced by the source, gate, and drain
electrodes of the FET, respectively. There are, however, differences between the two [15]:

(A) In an npn BJT, the collector-to-base junction is reverse-biased, so no majority-carrier electrons would flow from the base to the collector if this were the only bias present; however, the emitter-to-base junction is forward-biased sufficiently (i.e., by an extra 0.6 V or so), so that the electrons that leave the emitter n region and enter the base p region have enough energy to penetrate across the back-biased base-to-collector junction and finally penetrate into the collector region. However, because of the perennial equilibrium between electron concentration and hole concentration in all semiconducting regions, there is some small “minority carrier” electron current that enters the base (base current) and heads from the base toward the collector, across the base-to-collector junction. Thus, the large collector current is controlled by a much smaller, but significant, base current; the BJT is therefore, roughly, a constant-gain current amplifier, or a dynamic transconductance device.

(B) The conduction in a FET channel (the region between the source electrode and the drain electrode) is controlled by an electric field created in the region between the source and drain electrodes on one side and the gate electrode on the other; this field can narrow the channel dimensions laterally, forcing the charge carrier that goes from the source electrode to the drain electrode to traverse a narrower region than if the electric field were absent. No current enters the circuit from the gate electrode.

There are three main differences between FETs and BJTs:

1. The FET scheme depends, as in the vacuum-tube triode, on the relative size and spacing of three electrodes: the source (S), the gate (G), and the drain (D).

2. The signal travels through a thick, or even molecularly thin, semiconductor that connects these electrodes; it could be an inorganic semiconductor (doped Si, doped Ge), an organic conducting polymer (polyaniline, polythiophene, polyacetylene), a carbon nanotube, or an organic semiconductor (sexithiophene).

3. While BJTs have a significant base current, the FETs have no gate current.

There are several types of FETs; they are distinguished by the following:

(i) Polarity. n-channel and p-channel (where the majority carriers are electrons and holes, respectively).

(ii) Channel Doping. Depletion-mode (where the material conducts even at zero bias) or enhancement-mode (conductivity is achieved only beyond a certain bias).

(iii) Interface Region. Semiconductor junction (JFET), metal–semiconductor (MESFET), or metal–oxide–semiconductor (MOSFET); these MOSFETs have an insulating oxide region (e.g., SiO2) with a typical resistance of 10¹⁴ ohms and are often also called insulated-gate FETs (IGFETs); at the research level, there are also molecular FETs (MolFETs, where the molecule can be a polymer such as poly(ethylenedioxythiophene), a fullerene, or a semiconducting single-walled carbon nanotube).
Although the two choices each of type (JFET, MOSFET), channel (n-channel, p-channel), and doping (enhancement, depletion) could yield eight combinations, in practice only five main FET types exist:
1. n-channel JFETs
2. p-channel JFETs
3. n-channel depletion-mode MOSFETs
4. n-channel enhancement-mode MOSFETs
5. p-channel enhancement-mode MOSFETs
The advantages of FETs are:
1. Low current
2. Easy integrability into integrated circuits (IC)
The dependence of the source-to-drain current ID on the source-to-drain voltage VD is explained in Fig. 9.29 [15]. The gate controls the conduction from source to drain through the electrical field it generates within the channel and
[Figure 9.29, panels (A) to (C): cross sections of an n-channel MOSFET with an Au gate on an SiO2 insulator, n-Si source and drain regions, and a p-Si base held at −VB. In all panels VS = 0 and VG > VT (panel A: VG > VT = Vbi). (A) VD small; (B) VD = VDsat, pinch-off; (C) VD > VDsat.]
FIGURE 9.29 MOSFET operation for grounded source S. (A) In the linear region (that is, at relatively low drain voltage VD), electrons going from source to drain preferentially travel within the narrow n-channel; the corresponding current ID depends linearly on VD; as VD increases further, the current also depends on $V_D^2$. (B) At the onset of saturation (VD = VDsat) the n-channel narrows to zero at the contact with the drain region; this is the pinch-off point. The electrons must now also travel in the broadened depletion region. (C) Beyond saturation (VD > VDsat) the n-channel no longer reaches the drain region, and electrons traveling from source to drain must go through the depletion region; the current becomes independent of VD. In all three cases, VG is kept larger than VT, where VT is the threshold voltage for nonzero drain current and is similar to the built-in bias Vbi discussed earlier for pn junction rectifiers. Adapted from Horowitz and Hill [15].
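the depletion region beyond. The drain current IDS for a MOSFET with grounded source (so that VGS = VG) can be modeled as [12]

$$I_{DS}^{\text{linear}} = 2a\left[V_{DS}(V_{GS}-V_T) - \tfrac{1}{2}V_{DS}^{2}\right] \quad \text{for } V_{DS} < (V_{GS}-V_T) \qquad (9.10.1)$$

$$I_{DS}^{\text{sat}} = a\,(V_{GS}-V_T)^{2} \quad \text{for } V_{DS} \ge (V_{GS}-V_T) \qquad (9.10.2)$$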
where VT is the threshold voltage for the gate to allow the onset of current flow; this is akin to the built-in voltage Vbi discussed earlier for pn junction rectifiers. The constant a will not be discussed here. The two regimes [Eqs. (9.10.1) and (9.10.2)] are explained in the caption to Fig. 9.29. Typical IDS versus VDS curves for a VN0106 n-channel MOSFET are shown in Fig. 9.30 (for this commercial MOSFET VT = 1.63 V). The important take-home message is that the very desirable “saturation regime” (Fig. 9.30C), where IDS becomes (to first approximation) independent of VDS, is achieved after n-channel pinch-off, as the current goes through both the n-channel and the widened depletion region; this regime is nice and horizontal (a boon to circuit designers), as was seen for the vacuum-tube pentode, Fig. 9.11, and for the BJT, Fig. 9.22. The first unipolar transistor was a JFET; the layout of an n-channel JFET is given in Fig. 9.31. The mechanism of function is similar to that of the MOSFET: the n-channel is narrowed by the electric field applied to the gate and the body. The operational characteristics, mutatis mutandis, resemble those of the MOSFET discussed above.
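The two-regime model of Eqs. (9.10.1) and (9.10.2) can be sketched numerically. The sketch below is only an illustration: it takes the quadratic term with the minus sign that appears in the caption of Fig. 9.30 (which makes the two regimes join smoothly at VDS = VGS − VT), it borrows a = 0.08 and VT = 1.63 V from that caption, and it assumes that a is in A/V² and that no current flows below threshold. The function name `drain_current` is hypothetical.

```python
def drain_current(v_ds, v_gs, v_t=1.63, a=0.08):
    """Drain current (A) of an n-channel MOSFET with grounded source.

    Defaults are the VN0106 values quoted in the caption of Fig. 9.30;
    the unit of `a` (A/V^2) is an assumption made for this sketch.
    """
    overdrive = v_gs - v_t
    if overdrive <= 0.0:
        # Assumption: the channel does not conduct below threshold.
        return 0.0
    if v_ds < overdrive:
        # "Linear" regime, Eq. (9.10.1)
        return 2.0 * a * (v_ds * overdrive - 0.5 * v_ds ** 2)
    # Saturation regime, Eq. (9.10.2): independent of v_ds
    return a * overdrive ** 2


if __name__ == "__main__":
    for v_gs in (1.85, 1.90, 1.95):
        currents = [drain_current(v_ds, v_gs) for v_ds in (0.05, 0.2, 5.0, 20.0)]
        print(v_gs, ["%.5f" % i for i in currents])
```

With these parameters the saturation currents for VGS = 1.85, 1.90, and 1.95 V come out near 4, 6, and 8 mA, and they no longer change between VDS = 5 V and 20 V.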
[Figure 9.30: plots of IDS versus VDS. (A) Schematic characteristics showing the linear regime, the saturation regime, and the breakdown regime. (B) MOSFET VN0106 for low VDS: IDS (amperes, 0 to 0.025) versus VDS (volts, 0 to 0.2), with curves for VGS = 2.0, 2.25, and 3.3 V. (C) MOSFET VN0106 for high VDS (saturation): IDS (amperes, 0 to 0.01) versus VDS (volts, 0 to 20), with curves for VG = 1.85, 1.90, and 1.95 V.]
FIGURE 9.30 (A) General MOSFET characteristics with three regimes: “linear,” $I_{DS} = 2a[V_{DS}(V_{GS}-V_T) - 0.5\,V_{DS}^2]$; “saturation,” $I_{DS} = a(V_{GS}-V_T)^2$; and “breakdown.” (B) VN0106 n-channel MOSFET at low bias (VT = 1.63 V): $I_{DS} = 0.16\,[V_{DS}(V_{GS}-V_T) - 0.5\,V_{DS}^2]$ (“linear regime”). (C) VN0106 n-channel MOSFET at high bias (VT = 1.63 V) (“saturation”): IDS is independent of VDS: $I_{DS} = 0.08\,(V_{GS}-V_T)^2$. Adapted from Horowitz and Hill [15].
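[Figure 9.31: (A) cross section showing source (S), gate (G), and drain (D) contacts (Au) through an SiO2 insulator, a p-Si gate region, the n-Si channel, and the p-Si body; (B) circuit symbol with terminals S, G, and D.]
FIGURE 9.31 (A) Diagram of n-channel JFET. (B) Circuit symbol for n-channel JFET.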
Table 9.7 Characteristics of Selected JFETs and MOSFETs

JFETs
Name     Chn  BVGSS  IDS       IDS       VGSoff & VP  VGSoff & VP  Ciss      Crss
              (V)    min (mA)  max (mA)  min (V)      max (V)      max (pF)  max (pF)
2N4416   n    30     5         15        2.5          6            4         0.8
2N5457   n    25     1         5         0.5          6            7         3
2N5460   p    40     1         5         0.75         6            7         2
2SJ72    p    25     5         30        0.3          2            185       55

MOSFETs
Name     Chn  RDS      @ VGS  VGSth    VGSth    ID(on)    Crss      BVDS     BVGS  IGSS
              max (Ω)  (V)    min (V)  max (V)  min (mA)  max (pF)  max (V)
SD211    n    45       10     0.5      2        -         0.5       30       15    10
2N4351   n    300      10     1.5      5        3         2.5       25       35    0.01
CD3600   p    500      10     1.8      -        1.3       0.8       15       15    0.02
2N4352   p    600      10     1.5      6        2         2.5       25       35    0.01

Power MOSFETs
Name     Chn  RDS      VGS      VGSth    Crss      Crss      BVDS  Cont. drain
              max (Ω)  max (V)  typ (V)  typ (pF)  max (pF)  (V)   curr. max (A)
VN0610L  n    6        5        2.5      60        5         60    0.2
IRF520   n    0.6      10       4        180       15        100   4
VQ2004J  p    5        10       4.5      150       20        60    0.4
IRF9640  p    0.5      10       4        1100      150       200   11

Source: Horowitz and Hill [15].
9.11 JFETS Table 9.7 provides some characteristics of selected commercial MOSFETs and JFETS [15].
9.12 OPERATIONAL AMPLIFIERS The work of the circuit designer has been facilitated by the development of inexpensive microchips, called operational amplifiers (“op amps”), which contain transistor elements to perform certain analog or digital functions. Other simple integrated circuits perform logical operations. Figure 9.32 shows a general scheme for an op amp, and some sample uses. Figure 9.33 shows AND, OR, NOT, XOR, and NAND gates with their associated “truth tables.”
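The truth tables of Fig. 9.33 can be generated mechanically from bitwise operations on 0 and 1. The following is a minimal sketch, not a model of any particular integrated circuit; the names `GATES`, `logic_not`, and `truth_table` are hypothetical, and NOR is included because it appears in the figure.

```python
from itertools import product

# Two-input gates written as bitwise operations on the integers 0 and 1.
GATES = {
    "AND": lambda a, b: a & b,
    "NAND": lambda a, b: 1 - (a & b),
    "OR": lambda a, b: a | b,
    "NOR": lambda a, b: 1 - (a | b),
    "XOR": lambda a, b: a ^ b,
}


def logic_not(a):
    """One-input inverter (NOT)."""
    return 1 - a


def truth_table(gate):
    """Rows (A, B, Out) for every combination of the two inputs."""
    return [(a, b, gate(a, b)) for a, b in product((0, 1), repeat=2)]


if __name__ == "__main__":
    for name, gate in GATES.items():
        print(name, truth_table(gate))
    print("NOT", [(a, logic_not(a)) for a in (0, 1)])
```

The rows are listed in the order (0,0), (0,1), (1,0), (1,1); since every gate shown is symmetric in A and B, the order of the two middle rows does not affect the output column.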
9.13 HISTORICAL INTRODUCTION TO COMPUTERS Large-scale computation was needed by mathematicians, astronomers, and “natural philosophers,” but it was also needed by navigators, census-takers, engineers, and warmongers.
[Figure 9.32, four panels: operational amplifier (inverting input V−, noninverting input V+, output Vo); current-to-voltage amplifier, Vo = IR; voltage integrator, Vo = −(1/RC)∫V dt; voltage differentiator, Vo = −RC(dV/dt).]
FIGURE 9.32 Operational amplifier and some of its uses.
Weaving machines were first “programmed” by punched-card systems (the Jacquard49 process, 1801). The compilation of census data on 80-column punched Hollerith50 cards became possible in 1886 and sped up the 1890 US census. Assorted card-punches, card-readers, and card-sorters were later commercialized by the Computing Tabulating Recording Company (formed
FIGURE 9.33 Logic gates and truth tables.
A  B  AND  NAND  OR  NOR  XOR
0  0  0    1     0   1    0
1  0  0    1     1   0    1
0  1  0    1     1   0    1
1  1  1    0     1   0    0

A  NOT (or INVERT)
0  1
1  0

49 Joseph-Marie Jacquard (1752–1834).
50 Herman Hollerith (1860–1929).
in 1911), which later became International Business Machines Co., then IBM Corp. Hollerith cards, popularly called IBM cards, were used up to the 1970s to store programs and data, and then they became obsolete. Punched paper tape was used for programs and data in early computers and in Teletype® machines, but it too disappeared around 1980. The advent of digital computers was presaged by Babbage’s51 ideas for an Analytical Engine in 1837 (the machine was mechanical and was never finished, despite many years of work and a valiant attempt by Ada Byron52 at propagandizing it and writing for it maybe the first-ever “computer program”). Analog computers became practical first and were available in the 1930s to compute trajectories for naval gunnery and to design electrical circuits. Digital computers were first built at Harvard University (Aiken’s53 Automatic Sequence Controlled Calculator, Mark I, 1939–1944) and at the University of Pennsylvania by Eckert54 and Mauchly55 (Electronic Numerical Integrator and Computer, ENIAC, 1946); they used vacuum tubes instead of the cumbersome and slow mechanical switches. ENIAC morphed into an Eckert–Mauchly design of BINAC, which was sold to Remington Rand and became Univac I. Von Neumann,56 a mathematician, first conceived of the idea of storing both data and digital computing program in the same physical memory (provided that some digital bit let the machine know what was program and what was data); this modified ENIAC into EDVAC (Electronic Discrete Variable Automatic Computer, 1947). In those early days, the first “computer bug” in 1947 was a moth that got trapped in the circuitry, causing a program failure. (This term was coined by Hopper,57 a mathematician who later was co-inventor of the programming language COBOL.) In the early 1950s, computers became first an academic improvement (Illiac at the University of Illinois) and then a commercial enterprise. Watson’s58 IBM Corp.
came to dominate and almost monopolize the market by offering to lease and maintain, rather than sell, their expensive mainframes. The primacy of IBM mainframes in the business world helped to overwhelm the competition. IBM was followed by the “seven dwarfs” (Remington-Rand, Burroughs, Digital Equipment (DEC), Control Data Corp. (CDC), National Cash Register (NCR), General Electric (GE), and Honeywell). In Europe, Atlas (UK), Bull (France), Olivetti (Italy), and Siemens (Germany) never amounted to a serious challenge to US primacy. In Japan, Fujitsu mounted a major challenge. In the mid-1960s, timesharing systems were introduced. Cray,59 the main architect of Control Data Corp., developed in the CDC 6600 mainframe the idea of pipelining instructions through 10 parallel processors to speed up
51 Charles Babbage (1792–1871).
52 Augusta Ada Byron King, Countess of Lovelace (1815–1852).
53 Howard Hathaway Aiken (1900–1973).
54 John Adam Presper Eckert, Jr. (1919–1995).
55 John William Mauchly (1907–1980).
56 John Lajos von Neumann (1903–1957).
57 Rear Admiral Grace Brewster Murray Hopper (1906–1992).
58 Thomas John Watson, Sr. (1874–1956).
59 Seymour Roger Cray (1925–1996).
scientific computation; later he started Cray Research. Digital Equipment Corporation developed the laboratory “midi” computer (PDP-1, then PDP-8, PDP-11, then VAX 11/780). In the System/360, IBM mated scientific and business computing into a single architecture, conceived by Amdahl.60 The design ideas for convenient and user-friendly computers were much advanced by the Xerox Palo Alto Research Center, which designed future systems in the mid-1970s by introducing the mouse, the WYSIWYG concept (“what you see is what you get”), and GUI (graphic user interfaces) in an Alto computer (never marketed). These ideas were stupidly discarded by Xerox Corp. management (major strategic error); these developments lay fallow until they were picked up in the microcomputer revolution. As data volumes increased, data storage on punched cards and paper tape was replaced by magnetic storage media with ever-increasing storage densities, made possible by dramatic reductions in magnetic particle size (tape and drum, now mostly obsolete, and disk), and by optical media (magneto-optical recording, erasable optical media). The PC or microcomputer developed around the Altair 8800 (born 1975), the Tandy Corp. TRS-80 (“Trash-80”), the Apple I, and the IBM PC. The “mouse,” laser printer, and Postscript hardware (Adobe Corp.), as well as the telefacsimile (“fax”) machine, provided added flexibility for computer design. Single-user desktops have proliferated worldwide. Servers came to process large-volume digital traffic on the internet. Laptops have almost replaced desktops, and tablets are mounting an assault on laptops. By 2000, distributed computing and workstations had mostly displaced powerful mainframes, except for intense computation-limited applications, such as weather forecasting, nuclear and large-molecule quantum-mechanical calculations, and so on. Massively parallel architectures (e.g., 256 parallel processors), evolved from the CDC 6600 concept, are speeding the throughput of computation.
9.14 ELEMENTARY CONCEPTS All digital computers assume that “0” and “1” are two fundamental states of the elementary circuit. There are many ways of representing 0 and 1 (Table 9.8). The first computers (see Table 9.9) used bulky vacuum tubes (Maniac, Eniac, Illiac); the transistor changed all that, and so did the integrated circuit. The first computers stored the data in “nonvolatile” magnetic memories, which had the advantage of preserving the data when power failed. These “ferrite” memories were gradually replaced by MOS (metal oxide
Table 9.8 Choices of Storing “0” or “1” Bits
“1”   ON    5 V   Capacitor charged      Magnetization up
“0”   OFF   2 V   Capacitor discharged   Magnetization down

60 Gene Myron Amdahl (1922– ).
Table 9.9
Short Historical List of Digital Computers
Year
Name
Core Type
CPU Type
1944 1946 1951 1951 1960 1962 1962 1965 1965 1965 1964 1960 1965 1973 1978 1976 1984
Harvard U. Mark I ENIAC (U. Penn.) Illiac Univac I IBM 704 Atlas IBM 7090 IBM/360-91 Burroughs 5500 Univac 1108 CDC 6600 PDP-1 PDP-8 TRS-80 PC IBM PC Apple I PC Apple Macintosh
Vacuum tube Vacuum tube Vacuum tube Vacuum tube Drum Drum Ferrite core ICa Ferrite core Ferrite core MOS Ferrite core Ferrite core MOS (Z80) MOS (Intel 8066) MOS (6502) MOS (Motorola 60126)
Vacuum tube Vacuum tube Vacuum tube Transistors Transistors
a
Word Length Bits
Memory Size CPU Bytes___ Speed (MHz)
23 decimal digits 10 12 ?? 48 32 32 36 36 60 18 12 8 16 16 16
0.04 0.5 0.46 16.7 262 k 4K 4K 4K 8K 4K 256 K
40 0.2 0.66 4.77 1
IC, integrated circuit.
semiconductor) memories, which were cheaper to produce but volatile. The modern computer consists of the following:
1. A central processing unit (CPU), which executes all instructions.
2. Core memory (the data here are usually “volatile”); British books call this the “store”; the Germans call it “Speicher.” In the old days, ferrite cores were used; this memory was not volatile. Nowadays it is usually MOS memory (volatile).
3. Peripherals for information storage and retrieval: magnetic tape, magnetic disk, optical disk (the data here are usually nonvolatile, that is, they are still available after a power shutdown, and they remain until they are overwritten or erased).
4. Peripherals for user input/output: card readers (obsolete), card punches (obsolete), line printers (almost obsolete), laser printers, terminals, also called CRT (cathode-ray tubes), speech recognition devices, speech synthesizers, optical scanners, modems (modulators–demodulators, to piggyback digital data onto an acoustical carrier for telephone transmission), IR laser ports, and so on.
5. A starting set of permanently “wired” instructions (in read-only memory, or ROM, or “boot block”), used to start the computer, or “boot” it after a power shutdown. These instructions address a few key hardware locations (in core, on disk, etc.) in which other start-up data and instructions can be accessed. If this ROM, or the all-important “boot blocks” on a hard disk, are somehow destroyed, the computer cannot be started.
The data in a computer are fundamentally digital bits, requiring binary logic, but they are organized for convenience in a large number of computer
Table 9.10 Representation of Numbers in Decimal, Binary, Octal, and Hexadecimal Systems
Decimal  Binary  Octal  Hexadecimal
0        0       0      0
1        1       1      1
2        10      2      2
3        11      3      3
4        100     4      4
5        101     5      5
6        110     6      6
7        111     7      7
8        1000    10     8
9        1001    11     9
10       1010    12     A
11       1011    13     B
12       1100    14     C
13       1101    15     D
14       1110    16     E
15       1111    17     F
16       10000   20     10
17       10001   21     11
words of identical size (word length), each of which is often subdivided into a small number of computer bytes. If we need to represent base-10 numbers in binary bits, we have to remember how to do this. The bits can be assembled into “3-bit bytes” (octal representation) or into “4-bit bytes” (hexadecimal representation), as seen in Table 9.10. For instance, let us back-transform $21_8$ into a base-10 number: $2 \times 8^1 + 1 \times 8^0 = 16_{10} + 1_{10} = 17_{10}$. The representation of all numeric data in computer words is in binary form, but early on it was realized that the largest integer representable in an N-bit word is about $2^N$ (strictly, $2^N - 1$), or, if one bit is needed to represent the sign of the integer (+ or −), about $2^{N-1}$. If one more bit is needed to represent a “mask” for numeric data (instead of program), then $2^{N-2}$. Thus a 32-bit integer word with a sign bit can hold no integer larger than about $2^{31} = 2{,}147{,}483{,}648$. For larger or smaller numbers, the “floating-point” representation is used; a certain number of bits is reserved for the exponent and its sign, the rest for the mantissa. The issue is to utilize the word size to maximum advantage. For instance, if eight bits (out of 36) are used for a binary exponent and its sign (leading 1 for −, leading 0 for +), then the full eight-bit field can be as large as $11111111_2 = 377_8 = 3 \times 8^2 + 7 \times 8^1 + 7 \times 8^0 = 192_{10} + 56_{10} + 7_{10} = 255_{10}$; with the leading bit taken as the sign, the largest exponent magnitude is $1111111_2 = 177_8 = 127_{10}$. If the other 28 bits of the 36-bit word are used for a binary representation of the mantissa and its sign, then $2^{27} = 134{,}217{,}728_{10}$, which says that the precision of the mantissa is no better than 1 part in 134,217,728, or about 1 part in $10^8$. The IBM/360 and its successors (IBM/370, 4341, 3090, etc.) used a 32-bit word, divided into four eight-bit “bytes,” adequate for business applications, and chose a hexadecimal representation to increase the exponent range, but this sacrifices precision. Thus, using an 8-bit hexadecimal exponent and a 24-bit mantissa, including signs, six hexadecimal digits, or approximately six decimal digits of precision, were achieved; the magnitude limits for the one-word (or single-precision) real numbers are from $16^{-65} = 5.397 \times 10^{-79} \approx 10^{-78}$ to $16^{63} = 7.237 \times 10^{75}$.
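The base conversions of Table 9.10 and the word-size limits quoted above are easy to check numerically. The sketch below is only an illustration; the helper names `to_base`, `from_base`, and `decimal_digits` are hypothetical, and the bit counts passed to them are the ones used in the text.

```python
import math

DIGITS = "0123456789ABCDEF"


def to_base(n, base):
    """Digit string of a non-negative integer n in base 2..16."""
    if not 2 <= base <= 16 or n < 0:
        raise ValueError("need 2 <= base <= 16 and n >= 0")
    if n == 0:
        return "0"
    out = []
    while n:
        n, remainder = divmod(n, base)
        out.append(DIGITS[remainder])
    return "".join(reversed(out))


def from_base(text, base):
    """Back-transform a digit string in the given base to an integer."""
    value = 0
    for ch in text.upper():
        value = value * base + DIGITS.index(ch)
    return value


def decimal_digits(bits):
    """Number of decimal digits carried by a field of `bits` binary bits."""
    return bits * math.log10(2)


if __name__ == "__main__":
    print(from_base("21", 8))                    # 17
    print(to_base(17, 2), to_base(17, 16))       # 10001 11
    print(to_base(from_base("11111111", 2), 8))  # 377
    print(2 ** 31, 2 ** 27)                      # 2147483648 134217728
    print(round(decimal_digits(27), 1))          # 8.1
    print(16.0 ** -65, 16.0 ** 63)               # hexadecimal exponent range
```

Python's built-in `int("21", 8)`, `bin`, `oct`, and `hex` do the same conversions; the explicit loops are shown only to make the positional arithmetic visible.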
For greater scientific precision, Cray Research adopted in the Cray-1 a 64-bit word, with 16 bits reserved for the exponent (up to $177777_8 = 65{,}535_{10}$) and 48 bits for the mantissa: the number range goes from $2^{-8193} \approx 10^{-2467}$ to $2^{8191} \approx 10^{2465}$, and the precision is 48 significant binary bits, or 1 part in $2^{48} \approx 3 \times 10^{14}$. However, for integers, the full 64-bit word is not used, but only 46 bits (almost what is reserved for the mantissa of a real constant): this allows for integers between approximately $-10^{14}$ and $+10^{14}$ to be represented. One way to increase precision, at the expense of computing speed, is to “chain” two or more words to provide a “double-precision” or “extended precision” word. The number of bits reserved to the exponent is sometimes increased, sometimes not, but the mantissa receives lots of extra bits from the extra chained words. This was done routinely for scientific work on 12-bit PDP-8 machines and on the early 16-bit PCs (IBM PC, PC AT, from Intel 8086 to 80186 to 80286 to 80486 CPUs), until 32-bit PCs became the norm (Pentium CPUs). Alphanumeric information (alphabetic or numeric), as seen in keys on a typewriter or a computer terminal keyboard, plus special “keys” like “ring a bell,” or “carriage return” or “line feed” or “backspace,” and so on, can be represented by several digital code conventions. The early dominance of IBM Corp. in the computer workplace meant that BCD (binary coded decimal) or EBCDIC (extended binary coded decimal interchange code) dominated keyboard design, magnetic tape storage conventions, and so on. A different code, agreed upon by the non-IBM mainframe manufacturers, has now emerged triumphant: the ASCII code (American Standard Code for Information Interchange). For instance, ASCII code 10 (octal 012) means line feed, ASCII code 13 (octal 015) means carriage return, and so on. Usually one byte of every computer (6-bit, 7-bit, or most usually 8-bit) is enough to represent one ASCII character, sometimes called a Hollerith constant.
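The ASCII codes of a few characters, including the control “keys” mentioned above, can be listed in decimal, octal, hexadecimal, and binary with a few lines of Python. This is a minimal sketch; the helper name `describe` is hypothetical.

```python
def describe(ch):
    """Return the ASCII code of ch as (decimal, octal, hex, 8-bit binary)."""
    code = ord(ch)
    return code, format(code, "03o"), format(code, "02X"), format(code, "08b")


if __name__ == "__main__":
    samples = [
        ("bell", "\a"),
        ("backspace", "\b"),
        ("line feed", "\n"),
        ("carriage return", "\r"),
        ("letter A", "A"),
    ]
    for name, ch in samples:
        print(name, describe(ch))
    # One 8-bit byte per character when text is stored as ASCII:
    print(list("FET".encode("ascii")))   # [70, 69, 84]
```

Line feed comes out as decimal 10 (octal 012) and carriage return as decimal 13 (octal 015); every code shown fits in 7 bits, so an 8-bit byte is sufficient.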
9.15 COMPUTER ARCHITECTURE The central processing unit (CPU), controlled by a computer “clock,” fetches instructions and data from memory, executes add, multiply, bit-compare, skip-to-new-address, and other elementary operations, “mails” the results back into memory, and prepares for the next instruction. The CPU is truly the “heart” of the computer. Every machine has a set of fundamental machine instructions (typically between 50 and 200 of them), which are represented as “machine language instructions” (e.g., digital add, bit compare, floating-point multiply, branch to, etc.), plus combinations of several such instructions. The instructions are hardware-dependent. An assembler converts assembler-language instructions into an executable program. However, the average machine life is between 5 and 7 years (less for a cheap PC), and so a massive rewrite of programs written in assembler language, which is very much machine-specific, would keep legions of computer programmers superbusy beyond belief. As discussed below, to avoid this “quick death,” high-level programming languages have evolved, designed to be somewhat hardware-independent; of course, they need hardware-dependent compilers (= translators) and assemblers.
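The fetch, execute, and store cycle described above can be illustrated with a toy machine. Everything in this sketch is an assumption made for illustration: the six-instruction set (LOAD, ADD, MUL, STORE, JUMPZ, HALT), the single accumulator, and the function name `run` are hypothetical and do not correspond to any real CPU.

```python
def run(program, memory, max_steps=1000):
    """Toy fetch-decode-execute loop (hypothetical instruction set).

    program: list of (opcode, operand) pairs
    memory:  list of numbers, modified in place and returned
    """
    acc = 0   # accumulator register
    pc = 0    # program counter: index of the next instruction
    for _ in range(max_steps):
        opcode, operand = program[pc]   # fetch
        pc += 1                         # prepare for the next instruction
        if opcode == "LOAD":            # memory -> accumulator
            acc = memory[operand]
        elif opcode == "ADD":
            acc += memory[operand]
        elif opcode == "MUL":
            acc *= memory[operand]
        elif opcode == "STORE":         # "mail" the result back to memory
            memory[operand] = acc
        elif opcode == "JUMPZ":         # skip to a new address if acc == 0
            if acc == 0:
                pc = operand
        elif opcode == "HALT":
            return memory
        else:
            raise ValueError("unknown opcode: %r" % (opcode,))
    raise RuntimeError("step limit exceeded")


if __name__ == "__main__":
    # Compute (2 + 3) * 4 and store it in memory cell 3.
    program = [("LOAD", 0), ("ADD", 1), ("MUL", 2), ("STORE", 3), ("HALT", None)]
    print(run(program, [2, 3, 4, 0]))   # [2, 3, 4, 20]
```

A real machine encodes each (opcode, operand) pair as bits in a word; an assembler is the program that turns mnemonic names like these into those bit patterns.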
Table 9.11 Some Operating Systems
Computer Name                   Year  Operating System or Overlay
IBM 360                         1965  HASP (Houston Automatic Spooling System)
IBM 4341                        1980  VM/CMS (overlay on HASP)
AT&T computers                  1980  UNIX
IBM PC                          1981  Microsoft DOS
Cray X/MP                       1986  Unicos (dialect of UNIX)
Digital Equipment Corp. PDP-10  1977  TOPS-10
Digital Equipment Corp. VAX11   1980  VMS
Apple Macintosh                 1986  Mac OS, Linux
IBM PC                          1990  Microsoft Windows (overlay on DOS), Linux
All programs, compilers, linkers, libraries, and “drivers” (code to interact with and control data flow from and to peripheral devices, such as terminals, printers, disks, etc.) are managed by an operating system (OS). Depending on the computer, the OS is either single-tasking, multi-tasking with interrupts, or truly time-sharing (allowing several jobs to flow through the CPU in an interleaved fashion). The names of some operating systems are given in Table 9.11. Some, like VM/CMS or Microsoft Windows, are not true systems, but overlays on the simpler “bare bones” operating system that drives the hardware. There has been a convergence of all PC systems (except for Macintosh) to be driven by DOS, plus GUI overlays like Windows 3.1, 95, 98, 2000, NT, 7, and so on. There has been a convergence of workstations and large computer systems to operate under UNIX, a simple system developed by AT&T for the computers managing its telephone network. Linux has become an “open-source” noncommercial implementation of UNIX. All device calls are routed from the CPU to a “bus,” or common line, with addresses on the bus identifying which device is being called (disk drive, modem, terminal), with hardware commands to talk or to listen, and so on. Mainframe computers may have several buses, some to accommodate high-speed data transfer (for example, from memory to CPU or from disk drive to CPU). The access time on the bus is fairly slow for “finding” the data on a magnetic disk (say 5 ms), because an electromechanical arm has to be positioned to read the first bit; then serial reading of many data from the same file may be much faster. Some buses (e.g., IEEE-488) are used for rapid access to scientific instruments with digital I/O capabilities.
The need for speed in digital computers has been met by advances in hardware, first by the invention of the transistor and then by the development of the integrated circuit (IC), in which several transistors, rectifiers, resistors, and capacitors are built up on a single Si substrate. Lithography developed to provide ever faster and denser circuits; lithography uses a mask and a photoresist, which is oxidized or damaged by light and then chemically etched away to selectively bare, for further action, the parts of the surface beneath the exposed resist. Then reactive ion-beam etching, metal vapor deposition, and other processing steps allow for the build-up of metal, oxide, and insulator layers that form a multi-decker sandwich of electronic devices. By decreasing the wavelength of light (using long X-rays or energetic electron beams at the present time), “design rules” of dimensions down to half the wavelength of the light used allow for closer approach of components, thus decreasing the transit time for electronic signals and increasing the CPU speed.
9.16 COMPILERS A way to avoid all the pain of machine-language or assembler programming is the combination of a high-level language; a “compiler,” which translates, much as a dictionary would, the high-level instructions into machine code; and a “linker,” which connects parts of the task with each other, with “library” routines such as sin(x), cos(x), random(x), and so on, and with input/output (I/O) calls, to create an executable program. The compilers should allow the same “source” code to be compiled, with minor modifications, on any computer “platform” that supports a compiler for that language. Around the IBM 704 grew a scientific compiler, FORTRAN (formula translator), developed by Backus61 in the mid-1950s; over the decades it evolved into FORTRAN IV, FORTRAN 77, FORTRAN 90, and so on, and is the preeminent (if often inconvenient) source language for scientific computation. ALGOL-60, an international algorithmic language, was developed with logical completeness by Wirth62 and Backus, but did not “catch on,” except among minor non-IBM computer manufacturers, and ultimately disappeared. PL-I, or programming language One, was meant by IBM to be a superset of both FORTRAN and ALGOL, but it was resisted in the marketplace and disappeared. BASIC, or Beginner's All-purpose Symbolic Instruction Code, was developed by Kemeny63 as a “baby FORTRAN” for simple computers (e.g., minicomputers). BASIC does not wait for the whole user-written program to be finished, but translates each line as soon as it is typed. It was ideally suited for a simple learning environment. Microsoft VISUAL BASIC is a GUI-interfaced version. Microsoft QUICK BASIC 4.5 is much better than FORTRAN embodiments in accessing instruments for real-time data acquisition and control. PASCAL was developed as a scientific language. It was followed by C and its successor C++, which have practical shortcuts for matrix operations that FORTRAN treats so clumsily: C++ is a favorite for computer science courses. Python has since joined it as a favorite teaching language. Around 1965 the US government sponsored the introduction of COBOL (Common Business-Oriented Language) as a simple platform-independent language for database management. Computer scientists introduced LISP (a processor of hierarchical lists of commands) and SNOBOL, as well as many other compilers. The Internet has generated HTML (HyperText Markup Language) and JavaScript, which are languages and protocols for creating websites. The present generation of Mac and PC-trained users know how to use word processing programs (Microsoft Word, WordPerfect, LaTeX, etc.), plotting programs (Origin, KaleidaGraph, Delta), CAD/CAM programs,
61 John Warner Backus (1924–2007).
62 Niklaus Emil Wirth (1934– ).
63 John George Kemeny (1926–1992).
database and accounting programs (Lotus 1-2-3, Quicken, Excel), Internet access programs (Netscape, Eudora), and instrument interface protocols (LabVIEW), but have forgotten how to write computer code. A modest introduction to programming is given next. Table 9.12 shows a simple FORTRAN computer source program INVERT, to invert 3 × 3 matrices. The source program is written in “Hollerith card image” format: instructions must be between columns 7 and 72; a character in column 6 indicates that the line is a continuation of the previous line; comment lines have “C” in column 1 (these are instructive but not essential). Table 9.13 shows the input data, and Table 9.14 shows the output data. Of course, the INVERT program would be much shorter in C++!
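The “card image” column rules just described can be made concrete with a short sketch that splits one fixed-form FORTRAN line into its fields. It is an illustration under stated assumptions, not part of any compiler: the function name `classify_fixed_form` is hypothetical, and two details go beyond what the text states (an asterisk in column 1 also marks a comment, and a zero in column 6 does not mark a continuation), following the FORTRAN 77 rules.

```python
def classify_fixed_form(line):
    """Split one fixed-form FORTRAN source line (80-column card image)."""
    card = line.rstrip("\n").ljust(80)[:80]
    # Column 1: "C" marks a comment ("*" is also accepted in FORTRAN 77,
    # which is an addition to the rule stated in the text).
    if card[0] in "Cc*":
        return {"kind": "comment", "text": card[1:].rstrip()}
    if card.strip() == "":
        return {"kind": "blank"}
    label = card[0:5].strip()        # columns 1-5: statement number
    # Column 6: any character other than blank (or zero, per FORTRAN 77)
    # marks a continuation of the previous line.
    is_continuation = card[5] not in " 0"
    statement = card[6:72].strip()   # columns 7-72: the statement itself
    sequence = card[72:80].strip()   # columns 73-80: ignored by the compiler
    return {
        "kind": "continuation" if is_continuation else "statement",
        "label": label,
        "statement": statement,
        "sequence": sequence,
    }


if __name__ == "__main__":
    for text in ("C---THIS IS A COMMENT",
                 "      DIMENSION A(3,3), B(3,3)",
                 "    2 FORMAT (3F10.3)"):
        print(classify_fixed_form(text))
```

For example, the line carrying statement number 2 is reported with label "2" and statement "FORMAT (3F10.3)", while anything typed in columns 73 to 80 ends up in the sequence field and plays no role in the program.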
Table 9.12
Fortran Program INVERT with Comments (FILE NAME: INVERT.FOR)
C23456789a123456789b123456789c123456789d123456789e123456789f123456789g1 CB.01 PROGRAM MATRIX INVERT (VERSION 10 OCT 1998)-------------------C CALLS: DETA (PAGE 2), INV (PAGE 3) C---WRITTEN BY R.M.METZGER FOR DIGITAL FORTRAN 5.0 FOR IBM PC-----C---INPUT ON FORTRAL LOGICAL 5, OUTPUT ON FORTRAN LOGICAL 6 C01 DATA CARDS TYPE 01 (THREE OF THEM): 3X3 MATRIX TO BE INVERTED: C01 COLS 01-10: A(1,1)=ELEMENT IN FIRST ROW, FIRST COLUMN C01 (F10.3) C01 COLS 11-20: A(1,2)=ELEMENT IN FIRST ROW, SECOND COLUMN C01 (F10.3) C01 COLS 21-30: A(1,3)=ELEMENT IN FIRST ROW, THIRD COLUMN C0 (F10.3) C01 ---------CARD TWO: SAME FORMAT, FOR A(2,1), A(2,2), A(2,3) C01 ---------CARD THREE: SAME FORMAT, FOR A(3,1), A(3,2), A(3,3) C---IN FORTRAN, COMMENT LINES (NOT USED BY COMPILER) MUST HAVE A C C IN COLUMN 1. ALL PROGRAM STATEMENTS BELOW MUST BE BETWEEN COLUMNS C 7 AND 72; COLUMNS 73 TO 80 CAN BE USED TO STORE THE PROGRAM LINE C NUMBERS (NOT USED BY COMPILER). COLUMN 6, IF IT CONTAINS ANY C CHARACTER, DENOTES THAT THE CURRENT LINE IS A CONTINUATION OF C THE PREVIOUS LINE (UP TO 18 CONTINUATIONS ARE ALLOWED). C---SOME LINES CARRY NUMBERS IN COLS 2-5 FOR JUMPS, OR FOR DATA FORMAT C IN FORTRAN ALL VARIABLES (INTEGER, REAL=FLOATING POINT, CHARACTER) C CAN BE UP TO SIX CHARACTERS, THE FIRST BEING ALPHABETIC. C ALL SCALAR (NON-MATRIX) VARIABLES CAN BE LEFT UNDECLARED C---FORTRAN CONVENTION IS THAT ALL VARIABLES STARTING WITH LETTERS— C A THROUGH H, OR STARTING WITH LETTERS O THROUGH Z, ARE REAL, C (FLOATING-POINT), UNLESS DECLARED OTHERWISE, AND THAT ALL C VARIABLES STARTING WITH J THROUGH N ARE INTEGER. C---ALL CHARACTER STRING VARIABLES MUST BE DECLARED--------------C---OF COURSE, RESERVED WORDS LIKE WRITE, READ, IF, END, OR LIBRARY C CALLS LIKE SIN, COS, LOG10, ETC. CANNOT BE USED AS VARIABLES. 
C---DECLARATIONS (DIMENSION, REAL, INTEGER, DOUBLE PRECISION, ETC.)– C---MUST PRECEDE ANY EXECUTABLE STATEMENT IN A PROGRAM --------DIMENSION A(3,3), B(3,3) C---ABOVE DECLARATION ALLOWS FOR A 3 BY 3 SET OF ADDRESSES IN MEMORYC TO BE RESERVED FOR A, AND A 3 BY 3 ARRAY TO BE RESERVED FOR B. C---WE WILL READ IN MATRIX A, THEN INVERT IT AND PRINT OUT MATRIX B– C---FORTRAN II AND IV: LOWEST ARRAY ELEMENT INDEX MUST BE 1;--------C FORTRAN 77 AND 90: AN 8-ELEMENT ARRAY C(-2:5) CAN BE DEFINED,WITH C LOWEST INDEX -2, NEXT -1, NEXT 0, ETC., HIGHEST INDEX 5; THESE (continued )
C---ARRAY ELEMENTS ARE USED WITH COLONS INSTEAD OF COMMAS--------
      OPEN (UNIT=5, FILE='MYMX.DAT', STATUS='OLD')
C---THE ABOVE MACHINE-DEPENDENT STATEMENT ASSIGNS A PRE-WRITTEN DISK
C   FILE CALLED MYMX.DAT (WE SHOW IT BELOW), ASSIGNS TO IT A STATUS
C   AS AN OLD FILE (I.E. NOT TO BE OVERWRITTEN), AND SETS THE FORTRAN
C   LOGICAL FILE NUMBER TO BE 5; EVERY NEW READING OPERATION GIVEN
C---BELOW FROM 5 WILL READ ONE MORE LINE FROM FILE MYMX.DAT--------
      OPEN (UNIT=6,FILE='INVERTED.MX')
C---THE ABOVE MACHINE-DEPENDENT STATEMENT ASSIGNS A NEW FILE ON DISK
C---CALLED INVERTED.MX, TO BE WRITTEN AS FORTRAN LOGICAL 6 BELOW--
      WRITE (*,1)
    1 FORMAT (' THIS IS PROGRAM INVERT AT WORK. INPUT MATRIX FOLLOWS'/)
C---ON FORTRAN LOGICAL * (=THE USER CONSOLE TERMINAL) THE PROGRAM--
C   ANNOUNCES TO THE USER THAT IT HAS STARTED. THE LINE
C   THIS IS PROGRAM INVERT AT WORK. INPUT MATRIX FOLLOWS
C---WILL BE PRINTED, FOLLOWED BY A CARRIAGE RETURN (/)-----------
      WRITE (6,1)
C---WRITE SAME HEADER FOR FILE INVERTED.MX-----------------------
      READ (5,2) ((A(I,J),J=1,3),I=1,3)
    2 FORMAT (3F10.3)
C---PROGRAM READS IN MATRIX ELEMENTS A(1,1), THEN A(1,2), THEN A(1,3)
C   ASSUMING THAT A(1,1) IS WITHIN THE FIRST 10 COLUMNS (1-10) WITH
C   DECIMAL POINT IN COL.7 AND THREE DECIMAL DIGITS IN COLS. 8-10;
C   A(1,2) MUST BE WITHIN COLS 11-20, WITH DECIMAL PT. IN COL. 17;
C   A(1,3) MUST BE WITHIN COLS. 21-30, WITH DECIMAL PT. IN COL.27;
C   PROGRAM SKIPS TO NEXT LINE, ACCORDING TO THE FORMAT STATEMENT,
C   READS A(2,1), THEN A(2,2), THEN A(2,3), IN COLS.1-30, THEN
C---SKIPS TO THIRD LINE TO READ A(3,1), A(3,2), A(3,3)-----------
      WRITE (*,3) ((A(I,J),J=1,3),I=1,3)
    3 FORMAT (' ', 3F15.5)
C---MIRROR THE INPUT DATA ONTO USER COMPUTER TERMINAL, BUT ALLOWING
C---15 COLUMNS PER DATUM AND 5 DIGITS AFTER DECIMAL POINT
      WRITE (6,4) ((A(I,J),J=1,3),I=1,3)
    4 FORMAT (' ', 3F15.5)
C---WRITE THE INPUT DATA (MATRIX A) TO OUTPUT DISK FILE INVERTED.MX
      CALL DETA (A,DET)
C---THIS IS A CALL TO A SUBROUTINE (GIVEN BELOW), WITH INPUT DATA
C   A (THE 3 BY 3 MATRIX) AND OUTPUT DATUM DET. THE SUBROUTINE WILL
C---COMPUTE THE DETERMINANT OF MATRIX A--------------------------
      WRITE (6,5) DET
    5 FORMAT (' THE DETERMINANT OF MATRIX A IS DET=',F15.5)
      IF (DET.EQ.0.0) GO TO 901
C---BRANCH TO 901 BELOW IF THE MATRIX IS SINGULAR. IF NOT, CONTINUE
      CALL INV (A,DET,B)
      WRITE (*,6)
    6 FORMAT (' THE INVERSE MATRIX IS:')
      WRITE (*,4) ((B(I,J),J=1,3),I=1,3)
      WRITE (6,6)
      WRITE (6,4) ((B(I,J),J=1,3),I=1,3)
C---NEXT DO LOOP IS FORTRAN IV STYLE-----------------------------
C---NOW FIND AMIN=THE SMALLEST MATRIX ELEMENT IN A(3,3)-----------
C---FIRST INITIALIZE AMIN WITH A RIDICULOUSLY LARGE VALUE--------
      AMIN=1.4E+38
      IMIN=0
      JMIN=0
C---AMIN = 1.4 X 10**(+38)---------------------------------------
      DO 8 I=1,3,1
C---EXECUTE ALL COMMANDS UP TO AND INCLUDING STATEMENT 8;--------
C   THE FIRST TIME, I IS SET TO 1. WHEN STATEMENT 8 IS FINISHED,
C   CONTROL RETURNS TO THE STATEMENT ABOVE, I IS INCREMENTED BY 1
C   AND ALL STATEMENTS UP TO AND INCLUDING STATEMENT 8, ARE EXECUTED
C   THEN I IS INCREMENTED BY A STEP OF 1 UNTIL 3, AND THE LOOP IS
C---EXECUTED ONE LAST TIME--------------------------------------
         DO 7 J=1,3
C------INDENTING 3 SPACES IS OPTIONAL, DONE FOR CLARITY. J IS NOW
C      INITIALIZED TO 1, AND THE LOOP UP TO AND INCLUDING STATEMENT 7,
C      IS EXECUTED, THEN J IS INCREMENTED (BY THE DEFAULT INCREMENT 1)
C      TO 2, LOOP IS EXECUTED, THEN J=3, AND LOOP IS EXECUTED A
C------LAST TIME
            IF (AMIN.LE.A(I,J)) GO TO 7
C---------IF AMIN IS EITHER LESS THAN, OR EQUAL TO, A(I,J), THEN
C         SKIP TO STATEMENT LABEL 7 AND EXECUTE IT. OTHERWISE EXECUTE
C---------THE STATEMENT JUST BELOW THIS IF STATEMENT.--------------
            AMIN=A(I,J)
C---------AHA, AMIN IS RESET TO THE NEW MINIMUM
            IMIN=I
            JMIN=J
C---------THE INDICES I AND J FOR THAT (LOCAL) MINIMUM ARE STORED
    7    CONTINUE
C------CONTINUE IS AN INNOCUOUS STATEMENT, HELPFUL TO INDICATE
C------END OF DO LOOP
    8 CONTINUE
C---CONTINUE IS AN INNOCUOUS STATEMENT, WITH NO CALCULATION.
      WRITE (*,9) IMIN,JMIN,AMIN
    9 FORMAT (' THE SMALLEST MATRIX ELEMENT IS A(',I3,',',I3,')=',F15.5)
C---FIRST A SPACE (CARRIAGE CONTROL FOR LINE PRINTERS, NOT PRINTED)
C   THEN ALPHANUMERIC STRING "THE SMALLEST MATRIX ELEMENT IS A(",
C   THEN 3 SPACES FOR FIRST INTEGER VARIABLE, A COMMA, THEN 3 SPACES
C   FOR SECOND INTEGER VARIABLE, THEN 15 SPACES FOR THIRD OUTPUT
C---VARIABLE (FLOATING-POINT), WITH 5 DECIMAL SPACES--------------
      WRITE (6,9) IMIN,JMIN,AMIN
C---NEXT DO LOOP IS FORTRAN 77 STYLE-----------------------------
C---NOW FIND AMAX=THE LARGEST MATRIX ELEMENT IN A(3,3)-----------
C---INITIALIZE AMAX WITH A RIDICULOUSLY LARGE AND NEGATIVE VALUE--
      AMAX=-1.4E+38
      IMAX=0
      JMAX=0
C---AMAX = -1.4 X 10**(+38)--------------------------------------
      DO I=1,3
C---EXECUTE ALL COMMANDS UP TO THE END DO STATEMENT--------------
C   THE FIRST TIME, I IS SET TO 1, AND LOOP IS EXECUTED.
C   THE NEXT TIME, I IS INCREMENTED BY 1 TO 2, AND LOOP IS REPEATED.
C---THEN I BECOMES 3, AND THE LOOP IS EXECUTED ONE LAST TIME--------
         DO J=1,3,1
C------INDENTING 3 SPACES IS OPTIONAL, DONE FOR CLARITY. J IS NOW-----
C      INITIALIZED TO 1, AND THE LOOP UP TO END DO IS EXECUTED,
C      THEN J=2, LOOP IS EXECUTED, THEN J=3, AND LOOP IS EXECUTED A
C------THIRD TIME-----------------------------------------------
            IF (AMAX.LT.A(I,J)) THEN
C---------IF AMAX IS GREATER THAN, OR EQUAL TO, A(I,J), THEN
C---------SKIP; OTHERWISE, EXECUTE ALL UP TO ENDIF--------------
               AMAX=A(I,J)
C---------AMAX IS RESET TO NEW LOCAL MAXIMUM-----------------
               IMAX=I
               JMAX=J
            END IF
         END DO
C------END OF DO LOOP OVER J-----------------------------------
      END DO
C---ENDS OUTER DO LOOP OVER I-----------------------------------
      WRITE (*,11) IMAX,JMAX,AMAX
   11 FORMAT (' THE LARGEST MATRIX ELEMENT IS A(',I3,',',I3,')=',F15.5)
      WRITE (6,11) IMAX,JMAX,AMAX
C---LOOP TO IDENTIFY FIRST ZERO ELEMENT IN ARRAY. FORTRAN II STYLE
      IFOUND=0
      JFOUND=0
      DO 12 I=1,3
      IFOUND=I
      DO 12 J=1,3
C------SECOND NESTED (INNER) LOOP OVER J COULD END AT SAME STATEMENT
C------12 AS LOOP OVER I, BUT NEWER COMPILERS MAY COMPLAIN ABOUT IT.
      JFOUND=J
      IF (A(I,J)) 12,13,12
C---------IF VALUE INSIDE PARENTHESES IS NEGATIVE, GO TO 12, I.E.
C         STAY IN LOOP. IF IT IS ZERO, GO TO 13. IF IT IS POSITIVE,
C         GO TO 12. IT IS ILLEGAL TO JUMP INTO A LOOP FROM OUTSIDE
C---------(LOOP VARIABLE I OR J WOULD HAVE INDETERMINATE VALUES)---
   12 CONTINUE
   13 IF (IFOUND) 999,999,14
C---GO TO 14 IF IFOUND.GT.0--------------------------------------
   14 IF (JFOUND) 999,999,15
C---GO TO 15 IF JFOUND.GT.0--------------------------------------
   15 WRITE (6,16) IFOUND,JFOUND,A(IFOUND,JFOUND)
      WRITE (*,16) IFOUND,JFOUND,A(IFOUND,JFOUND)
C---WRITE FIRST ZERO VALUE---------------------------------------
   16 FORMAT (3H A(,I3,1H,,I3,2H)=,F15.5)
C---THIS WILL PRINT A(  1,  2)=        0.00000--------------------
C---COMPUTE SOME LOGARITHMS--------------------------------------
      DO 19 I=1,3
         DO 18 J=1,3
            IF (A(I,J).LE.0.0) GO TO 18
C---------AVOID TAKING LOGS OF ZERO OR NEGATIVE NUMBERS-----------
            AL=ALOG10(A(I,J))
C---------LIBRARY CALL TO LOGARITHM TO THE BASE 10-----------------
            WRITE (*,17) I,J,A(I,J),AL
   17       FORMAT (' LOG(A(',I3,',',I3,')=',F10.5,')=',F10.5)
            WRITE (6,17) I,J,A(I,J),AL
   18    CONTINUE
   19 CONTINUE
      GO TO 999
C---ABNORMAL TERMINATION MESSAGE SECTION BEGINS--------------------
  901 WRITE (*,902)
  902 FORMAT (' PROGRAM ABORTED. MATRIX WAS SINGULAR (DETERMINANT=0)')
      WRITE (6,902)
      GO TO 999
C---ABNORMAL TERMINATION MESSAGE SECTION ENDS----------------------
  999 STOP
C---THIS STOPS THE PROGRAM EXECUTION. MUST BE AT END OF MAIN PROGRAM,
C---OR ALSO ELSEWHERE IF YOU WANT TO STOP EVERYTHING----------------
CE.01-END OF MAIN PROGRAM (PAGE 01 OF PROGRAM INVERT)----------------
      END
      SUBROUTINE DETA (A,DET)
CB.02-COMPUTE DETERMINANT OF MATRIX A BY SARRUS RULE (10 OCT 1998)--
C   CALLED BY: MAIN PROGRAM (PAGE 1)
C   CALLS: NONE
C---INPUT: 3 BY 3 MATRIX A-----------------------------------------
C---OUTPUT: DET =DETERMINANT OF MATRIX A--------------------------
      DIMENSION A(3,3)
      DET=A(1,1)*A(2,2)*A(3,3)+A(1,2)*A(2,3)*A(3,1)+
     1A(1,3)*A(2,1)*A(3,2)-A(3,1)*A(2,2)*A(1,3)-A(3,2)*A(2,3)*A(1,1)
     2-A(3,3)*A(2,1)*A(1,2)
C---DET= DETERMINANT OF 3 BY 3 MATRIX BY SARRUS RULE----------------
      RETURN
C---RETURN PUTS PROGRAM EXECUTION BACK INTO CALLING PROGRAM (HERE,
C   THE MAIN PROGRAM) ON THE LINE AFTER THE SUBROUTINE WAS CALLED.
CE.02-END OF SUBROUTINE DETA (PAGE 02 OF PROGRAM INVERT)-----------
      END
      SUBROUTINE INV (A,DET,B)
CB.03-COMPUTE INVERSE OF A = CLASSICAL ADJOINT/DETA (10 OCT 1998)---
C   CALLED BY: MAIN PROGRAM (PAGE 1)
C   CALLS: NONE
C---INPUT : 3 BY 3 MATRIX A--------------------------------------
C           DET =DETERMINANT OF MATRIX A
C   OUTPUT: 3 BY 3 MATRIX B, THE INVERSE MATRIX
C---SEE J. B. DENCE, MATHEMATICAL TECHNIQUES OF CHEMISTRY (WILEY,
C---1975) PAGE 285-----------------------------------------------
      DIMENSION A(3,3),B(3,3)
      B(1,1)= (A(2,2)*A(3,3)-A(3,2)*A(2,3))/DET
      B(2,1)=-(A(2,1)*A(3,3)-A(3,1)*A(2,3))/DET
      B(3,1)= (A(2,1)*A(3,2)-A(3,1)*A(2,2))/DET
      B(1,2)=-(A(1,2)*A(3,3)-A(3,2)*A(1,3))/DET
      B(2,2)= (A(1,1)*A(3,3)-A(3,1)*A(1,3))/DET
      B(3,2)=-(A(1,1)*A(3,2)-A(3,1)*A(1,2))/DET
      B(1,3)= (A(1,2)*A(2,3)-A(2,2)*A(1,3))/DET
      B(2,3)=-(A(1,1)*A(2,3)-A(2,1)*A(1,3))/DET
      B(3,3)= (A(1,1)*A(2,2)-A(2,1)*A(1,2))/DET
      RETURN
C---RETURN PUTS PROGRAM EXECUTION BACK INTO CALLING PROGRAM (HERE,
C   THE MAIN PROGRAM) ON THE LINE AFTER THE SUBROUTINE WAS CALLED.
CE.03-END OF SUBROUTINE INV (PAGE 03 OF PROGRAM INVERT)--------------
      END
Table 9.13 Sample Input File “MYMX.DAT” for FORTRAN Program “INVERT.FOR”
     2.000     0.000     1.000
    -1.000     3.000     2.000
     4.000     2.000    -1.000
Table 9.14 Sample Output File “INVERTED.MX” Produced by FORTRAN Program “INVERT” from the Input File “MYMX.DAT”
 THIS IS PROGRAM INVERT AT WORK. INPUT MATRIX FOLLOWS
         2.00000        0.00000        1.00000
        -1.00000        3.00000        2.00000
         4.00000        2.00000       -1.00000
 THE DETERMINANT OF MATRIX A IS DET=      -28.00000
 THE INVERSE MATRIX IS:
         0.25000       -0.07143        0.10714
        -0.25000        0.21429        0.17857
         0.50000        0.14286       -0.21429
 THE SMALLEST MATRIX ELEMENT IS A(  2,  1)=       -1.00000
 THE LARGEST MATRIX ELEMENT IS A(  3,  1)=        4.00000
 A(  1,  2)=        0.00000
 LOG(A(  1,  1)=   2.00000)=   0.30103
 LOG(A(  1,  3)=   1.00000)=   0.00000
 LOG(A(  2,  2)=   3.00000)=   0.47712
 LOG(A(  2,  3)=   2.00000)=   0.30103
 LOG(A(  3,  1)=   4.00000)=   0.60206
 LOG(A(  3,  2)=   2.00000)=   0.30103
N.B. CONSOLE TERMINAL OUTPUT IS THE SAME AS IN FILE INVERTED.MX
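As a cross-check on the sample files, the classical-adjoint inversion performed by subroutines DETA and INV can be sketched in a few lines of Python. This is only an illustrative sketch and not part of the original program: the function names det3 and inv3 are ours, and the input matrix assumes that MYMX.DAT holds the rows (2, 0, 1), (-1, 3, 2), and (4, 2, -1).

```python
def det3(a):
    """Determinant of a 3x3 matrix (list of rows) by the rule of Sarrus."""
    return (a[0][0] * a[1][1] * a[2][2]
            + a[0][1] * a[1][2] * a[2][0]
            + a[0][2] * a[1][0] * a[2][1]
            - a[2][0] * a[1][1] * a[0][2]
            - a[2][1] * a[1][2] * a[0][0]
            - a[2][2] * a[1][0] * a[0][1])


def inv3(a):
    """Inverse of a 3x3 matrix as (classical adjoint) / determinant."""
    d = det3(a)
    if d == 0.0:
        raise ValueError("matrix is singular (determinant = 0)")
    b = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            # Minor of element (j, i): delete row j and column i of A.
            rows = [r for r in range(3) if r != j]
            cols = [c for c in range(3) if c != i]
            minor = (a[rows[0]][cols[0]] * a[rows[1]][cols[1]]
                     - a[rows[0]][cols[1]] * a[rows[1]][cols[0]])
            # Transposed cofactor divided by the determinant.
            b[i][j] = (-1) ** (i + j) * minor / d
    return b


if __name__ == "__main__":
    # Assumed contents of MYMX.DAT (see the lead-in).
    a = [[2.0, 0.0, 1.0], [-1.0, 3.0, 2.0], [4.0, 2.0, -1.0]]
    print("DET =", det3(a))
    for row in inv3(a):
        print("".join(f"{x:15.5f}" for x in row))
```

Multiplying the input matrix by the computed inverse and comparing the product with the unit matrix is a quick way to confirm that the signs of the cofactors were handled correctly.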
On a PC running Microsoft DOS, working in a subdirectory whose prompt is C:\DesignerStudio\Myprojects\:>, you compile and link the program with the Digital Visual Fortran 5.0 compiler:
C:\DesignerStudio\Myprojects\:>DF INVERT.FOR
The result should be a new file, compiled and linked, called INVERT.EXE, that is ready to run. Then, using some editor and staying in the same subdirectory C:\DesignerStudio\Myprojects\:>, you create the input data file MYMX.DAT, for example, by writing
C:\DesignerStudio\Myprojects\:>EDIT MYMX.DAT
and saving the file on exit. Then run the program:
C:\DesignerStudio\Myprojects\:>INVERT
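Because the program reads its data with FORMAT (3F10.3), each number must sit in its own 10-column field. Instead of typing the file by hand, the fixed-width lines can be generated; the short Python sketch below shows one way to do it. The helper name format_f10_3 and the matrix values are our own assumptions for illustration, not part of the original listing.

```python
def format_f10_3(rows):
    """Return the text of a data file whose lines match FORMAT (3F10.3)."""
    lines = []
    for row in rows:
        # Each value is right-justified in a 10-column field, 3 decimals.
        lines.append("".join(f"{x:10.3f}" for x in row))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Assumed sample matrix; replace with the data to be inverted.
    matrix = [[2.0, 0.0, 1.0], [-1.0, 3.0, 2.0], [4.0, 2.0, -1.0]]
    print(format_f10_3(matrix), end="")
```

Writing the returned string to a file named MYMX.DAT in the working subdirectory would give the program an input file with every value in the expected columns.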
For comparison, the same program has been rewritten by Adam Csoeke Peck in the programming language C++ (Table 9.15) with an input file (Table 9.16) and an output file (Table 9.17).
Table 9.15 C++ Program INVERT with Comments (Written by A. Csoeke Peck)
Driver.cpp
//InverterDriver.cpp
//Inverting a 3x3 array using procedure Invert()
#include <iostream>
#include <fstream>
using namespace std;
const int ROW = 3; //globals for 3x3 matrices
const int COL = 3;
double DETA(double A[ROW][COL])
{
  double DET;
  //cofactor expansion along the first row (signs +, -, +)
  DET = A[0][0]*(A[1][1]*A[2][2]-A[1][2]*A[2][1])
       -A[0][1]*(A[1][0]*A[2][2]-A[1][2]*A[2][0])
       +A[0][2]*(A[1][0]*A[2][1]-A[1][1]*A[2][0]);
  return DET;
}
void INV(double A[ROW][COL], double DET, double B[ROW][COL])
{
  B[0][0]= (A[1][1]*A[2][2]-A[2][1]*A[1][2])/DET;
  B[1][0]=-(A[1][0]*A[2][2]-A[2][0]*A[1][2])/DET;
  B[2][0]= (A[1][0]*A[2][1]-A[2][0]*A[1][1])/DET;
  B[0][1]=-(A[0][1]*A[2][2]-A[2][1]*A[0][2])/DET;
  B[1][1]= (A[0][0]*A[2][2]-A[2][0]*A[0][2])/DET;
  B[2][1]=-(A[0][0]*A[2][1]-A[2][0]*A[0][1])/DET;
  B[0][2]= (A[0][1]*A[1][2]-A[1][1]*A[0][2])/DET;
  B[1][2]=-(A[0][0]*A[1][2]-A[1][0]*A[0][2])/DET;
  B[2][2]= (A[0][0]*A[1][1]-A[1][0]*A[0][1])/DET;
}
int main()
{
  ifstream infile;
  ofstream outfile;
  double DET;
  double A[ROW][COL];
  double B[ROW][COL];
  infile.open("MYMX.DAT");
  outfile.open("MYOUTPUT.DAT");
  if (!infile)
  {
    cout
0 broadens the NMR signal greatly, making the experiment rather difficult, but not impossible. Magnitude of Relaxation Times. The relaxation times are such that very short-lived systems (e.g., transition states in chemical reactions) cannot be seen in NMR. NMR can detect species whose lifetime exceeds 1 ms. A tremendous advantage of work in solutions is that the rotational relaxation times (typically 1 ms to ns) average in three dimensions the
FIGURE 11.59 Multiplet structure for N equivalent spins: N = 0, singlet; N = 1, doublet, 1:1; N = 2, triplet, 1:2:1; N = 3, quartet, 1:3:3:1; N = 4, quintet, 1:4:6:4:1; N = 5, sextet, 1:5:10:10:5:1.
spin–spin splittings (due to the nuclei of the solvent adjacent to the molecule being studied). This minimizes, particularly with solvents of low viscosity or at high temperature, the effects of the solvent and “sharpens” the signal. As the temperature is lowered, or the solvent becomes more viscous, these averaging mechanisms will fail, and anisotropies in the signal will emerge.
NMR in Solids. In solids, these spin–spin interactions are not averaged: A ¹H NMR signal that is 1 Hz wide in solution will be broadened to about 100 kHz in a solid, decreasing signal-to-noise ratios and losing much chemically useful information. Two techniques have evolved in tandem to combat this broadening: magic-angle spinning and multiple-pulse narrowing.
Magic-Angle Spinning. One technique is to spin the sample at the so-called “magic angle” of 54.74°, which minimizes the spin–spin interaction effects: In the dipolar expansion rewritten in terms of the dipole–dipole interaction between two dipoles $\boldsymbol{\mu}_i$ and $\boldsymbol{\mu}_j$, with mutual angle of orientation $\theta_{ij}$:
$$E_{dd} = \sum_i \sum_{j<i} \left[\boldsymbol{\mu}_i\cdot\boldsymbol{\mu}_j - 3\,(\boldsymbol{\mu}_i\cdot\mathbf{r}_{ij})(\boldsymbol{\mu}_j\cdot\mathbf{r}_{ij})\,r_{ij}^{-2}\right] r_{ij}^{-3} \tag{11.21.29}$$
$$\phantom{E_{dd}} = \sum_i \sum_{j<i} \mu_i\,\mu_j\,r_{ij}^{-3}\left[1 - 3\cos^2\theta_{ij}\right] \tag{11.21.30}$$
the numerator will vanish, and the dipole–dipole forces will cancel when $1 - 3\cos^2\theta_{ij} = 0$, that is, when $\theta_{ij} = 54.73561032^\circ$. In practice, the sample, fitted with a plastic propeller, is spun about an axis inclined at 54.74° to the magnetic field by a compressed-air jet at 3–5 kHz; this spinning cancels a large part of the dipolar broadening.
Multiple-Pulse Narrowing. The other technique is to artificially realign the nuclear spins in a solid: several mutually orthogonal RF coils are mounted around the sample area; these coils receive RF energy at the frequency $\nu$ of interest, but for varying times, in a precise sequence of impulses, first introduced by Hahn,119 Purcell, and Waugh:120 these multiple pulses are calculated to combat, and even exploit, thermal relaxation. The net effect is to narrow the NMR resonances by “kicking” the Boltzmann population of thermally varied orientations into a single orientation, watching as the free induction decay lets these excited nuclei slowly dephase, and then kicking them again at 90°, etc., forcing the dephasing back toward sharpening the signal.
2D NMR. Such multiple-pulse sequences not only can help to detect solid-state NMR spectra, but also are applied to decouple spectra of molecules in solution, where certain chemical groups can be studied, by applying combinations of two or more NMR frequencies. Names, such as Overhauser,121 “Underhauser,” COSY, MAGIC, and so on, have been given to these pulse sequences. By varying two saturating frequencies, so-called “two-dimensional NMR” plots become possible: Plotting the signal
119 Erwin L. Hahn (1921– ).
120 John S. Waugh (1929– ).
121 Albert W. Overhauser (1925– ).
FIGURE 11.60 Normalized Lorentzian absorption lineshape function $\mathrm{Abs}(H) = \chi''_L(\omega-\omega_0) = \pi^{-1}T_2[1+T_2^2(\omega-\omega_0)^2]^{-1}$ (dotted line) and its rescaled derivative $d\,\mathrm{Abs}(H)/dH = d\chi''_L/d(\omega-\omega_0) \propto -2\pi^{-1}(\omega-\omega_0)\,T_2[1+T_2^2(\omega-\omega_0)^2]^{-2}$ (solid line) as a function of the DC magnetic field H. The peak of Abs(H) is at $T_2/\pi$; the horizontal line indicates FWHM = $2T_2^{-1}$; at the center of the absorbance ($\chi''_L(\omega-\omega_0)$ = max) the derivative vanishes; the derivative peaks are separated by $2\cdot 3^{-1/2}\,T_2^{-1}$. Axes: B (tesla), 0.337 to 0.343 (horizontal); Abs(B) or (d Abs/dB), -1 to 1.5 (vertical); the modulation sweep width is marked on the plot.
intensities as contour diagrams in a two-dimensional plot of varying frequencies may isolate the NMR transition of interest within very complicated biomolecules, by first saturating one resonance, then the other, thus allowing for the decipherment of local structure.
Derivative Detection of EPR Transition. The EPR spectrum is usually displayed as the first derivative of the absorption $\chi''(H)$, because the nonresonant low-frequency and low-amplitude RF modulation ($\omega_1/2\pi$ = typically 100 kHz) applied to the coils near the magnet is detected by a rectifier in addition to the drop in microwave power level due to the RF resonant absorption (typically $\omega_0/2\pi$ = 9.5 GHz if $H_0$ = 0.34 T): The signal is processed by a phase-sensitive circuit, which detects a back-and-forth sweep across resonance in small magnetic field increments (relative to the DC field and to the width of the measured spectrum), thus generating a response $d\chi''/dH$ (see Fig. 11.60). A 9.5-GHz 0.34-T EPR spectrometer can detect $10^{11}$ spins if the linewidth is 1 gauss; that is, an EPR spectrometer is four orders of magnitude more sensitive than an NMR spectrometer. However, paramagnetic samples are less prevalent than diamagnetic ones, so NMR has proven to be much more useful than EPR. EPR spectrometers now can reach 95 GHz, with a 10-fold increase in sensitivity over the 9.5-GHz instrument.
A typical EPR spectrum of electrochemically or chemically generated benzene radical anion C6H6•− in solution is displayed in Fig. 11.61. The spectrum is centered at g = 2.003, and consists of 7 lines, due to the hyperfine splitting of the electron resonance by 6 chemically equivalent protons with a hyperfine splitting constant a = 0.375 mT. If there are M chemically or topologically inequivalent sets of nuclei, each line of the spectrum is split again by every one of the M sets, each set with its own hyperfine splitting constant a; all the splittings (and splittings of splittings) are centered around the Larmor frequency; this can make a very complex spectrum.
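The line counts and relative intensities quoted for hyperfine patterns follow from binomial coefficients, and the bookkeeping is easy to automate. The Python sketch below is an illustration under simple first-order assumptions (spin-1/2 nuclei only, no second-order shifts, no linewidths); the function names equivalent_set and combine are ours.

```python
from itertools import product
from math import comb


def equivalent_set(n, a):
    """Stick pattern from n equivalent spin-1/2 nuclei with splitting a (mT).

    Returns a list of (field offset in mT, relative intensity) pairs.
    """
    return [((k - n / 2) * a, comb(n, k)) for k in range(n + 1)]


def combine(*sets):
    """Pattern for several inequivalent sets: each line of one set
    is split again by every other set."""
    lines = {}
    for combo in product(*sets):
        offset = round(sum(o for o, _ in combo), 6)
        weight = 1
        for _, w in combo:
            weight *= w
        lines[offset] = lines.get(offset, 0) + weight
    return sorted(lines.items())


if __name__ == "__main__":
    # Six equivalent protons, a = 0.375 mT (benzene radical anion).
    for offset, weight in equivalent_set(6, 0.375):
        print(f"{offset:+8.3f} mT  intensity {weight}")
    # Two inequivalent sets of four protons give 5 x 5 = 25 lines.
    pattern = combine(equivalent_set(4, 0.490), equivalent_set(4, 0.183))
    print(len(pattern), "lines")
```

For six equivalent protons the intensities come out as 1:6:15:20:15:6:1, and two inequivalent sets of four protons give a quintet of quintets, provided no two lines happen to coincide.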
FIGURE 11.61 EPR spectrum of electrochemically generated benzene radical anion, C6H6•−. The hyperfine interaction between the free spin and the six ¹H nuclei generates a seven-line spectrum of nominal relative intensities 1:6:15:20:15:6:1. The hyperfine splitting constant is 0.375 mT. Axes: Field B (tesla), 0.3395 to 0.3425 (horizontal); $d\chi''/dB$, -60 to 60 (vertical).
For instance, the naphthalene negative ion radical has 25 lines, due to four equivalent protons at positions 1, 4, 5, and 8, which generate a quintet with relative intensities 1:4:6:4:1, with a = 0.490 mT, and four other protons at positions 2, 3, 6, and 7, with a = 0.183 mT, which split each of these lines into a smaller quintet with relative intensities 1:4:6:4:1. The anthracene negative ion has a 1:2:1 triplet of splitting 0.43 mT, split into a 1:4:6:4:1 quintet, which is split again into a 1:4:6:4:1 quintet with splitting 0.11 mT. All of this makes sense in the McConnell122 equation:
$$a = Q\rho \tag{11.21.31}$$
where Q ≈ 2.3 mT and ρ is the spin density on the C atom closest to the proton. Actually, this Q is not too constant: Q = 2.304 mT for •CH3, 2.99 mT for •C5H5, 2.25 mT for C6H6•−, 2.74 mT for •C7H7, and 2.57 mT for C8H8•−. Figure 11.62 shows the spin densities calculated from the spectra for several cyclic hydrocarbons. These spin densities can also be obtained from open-shell theoretical calculations of the spin densities (= density at atom of spin-up (alpha) electrons minus spin-down (beta) electrons ≠ charge densities).
FIGURE 11.62 (a) Experimental spin densities on cyclic hydrocarbon radicals; the values marked on the structures are 0.200, 0.333, 0.125, 0.1429, 0.166, 0.193, 0.097, 0.22, 0.048, and 0.08. (b) Depiction of spin density of 1.000 on carbon atom of methyl radical, •CH3. (c) Qualitative explanation of the McConnell equation: An electron spin in the 2pz orbital on C induces an antiparallel nuclear spin orientation on the adjacent ¹H nucleus, by polarizing the C–H electron pair bond.
Stable Free Radicals. Stable free radicals are a small minority of the more than 6 million chemical compounds known by 2005. The oxygen molecule is paramagnetic (S = 1). In 1896, Ostwald stated that “free radicals cannot be isolated.” Only four years later, Gomberg123 made triphenylmethyl (Fig. 11.63), the first proven stable and persistent free radical [48]! An indefinitely stable free radical used as a reference in EPR is diphenyl-picryl hydrazyl (DPPH). Other persistent free radicals are Fremy’s124 salt (dipotassium nitrosodisulfonate, K+ −O3S-NO-SO3− K+), 2,2-diphenyl-1-picrylhydrazyl (DPPH), Galvinoxyl (2,6-di-tert-butyl-α-(3,5-di-tert-butyl-4-oxo-2,5-cyclohexadien-1-ylidene)-p-tolyloxy), named after Galvin Coppinger125, and nitroxides R-NO•, such as 2,2,6,6-tetramethyl-1-piperidinyloxy (TEMPO), with the spin concentrated in the terminal NO group and protected sterically from chemical attack by adjacent methyl groups.
122 Harden Marsden McConnell (1927– ).
123 Moses Gomberg (1866–1947).
124 Edmond Fremy (1814–1894).
g-Tensor. So far, the g-value has been presented as an isotropic quantity; it actually is a tensor, so that the spin Hamiltonian $\hat{H}_{EZ}$ for the electronic Zeeman126 effect should be written as
$$\hat{H}_{EZ} = \beta_e\,\mathbf{H}_0\cdot\mathbf{g}_e\cdot\mathbf{S} \tag{11.21.32}$$
In organic radicals in solution, the g-factor anisotropy cannot be detected; one needs oriented samples. In crystals of free radicals, this anisotropy is easily measured; for example, in crystals of sodium formate (Na+ HCOO−) the principal-axis components are gxx = 2.0032, gyy = 1.9975, and gzz = 2.0014. If there is some spin–orbit interaction in an organic molecule (e.g., if a compound contains S or Cl), then g-values as high as 2.0080 are encountered. In disordered powders with narrow EPR lineshapes, the g-factor anisotropy can produce considerable distortion in the overall signal, due to averaging of the g-tensor.
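To get a feeling for the size of this anisotropy, the principal values quoted above can be turned into an isotropic average and into an effective g-value for any field direction. The Python sketch below assumes the standard first-order relations for a diagonal g-tensor, $g_{iso} = (g_{xx}+g_{yy}+g_{zz})/3$ and $g_{eff} = (g_{xx}^2 l_x^2 + g_{yy}^2 l_y^2 + g_{zz}^2 l_z^2)^{1/2}$ for direction cosines $l_x, l_y, l_z$; these relations and the function names are our additions, not taken from the text.

```python
from math import cos, radians, sin, sqrt

# Principal g-values quoted in the text for sodium formate crystals.
G_PRINCIPAL = (2.0032, 1.9975, 2.0014)  # (gxx, gyy, gzz)


def g_isotropic(g):
    """Rotational average of the g-tensor: one third of its trace."""
    return sum(g) / 3.0


def g_effective(g, theta_deg, phi_deg):
    """Effective g-value for a field direction given by polar angles
    (theta from z, phi from x) in the principal-axis system."""
    theta, phi = radians(theta_deg), radians(phi_deg)
    direction = (sin(theta) * cos(phi), sin(theta) * sin(phi), cos(theta))
    return sqrt(sum((gi * li) ** 2 for gi, li in zip(g, direction)))


if __name__ == "__main__":
    print(f"g_iso = {g_isotropic(G_PRINCIPAL):.4f}")
    for name, (theta, phi) in {"x": (90, 0), "y": (90, 90), "z": (0, 0)}.items():
        g_eff = g_effective(G_PRINCIPAL, theta, phi)
        print(f"field along {name}: g_eff = {g_eff:.4f}")
```

With the field along a principal axis the effective value reduces to the corresponding principal component, which is why rotating an oriented crystal in the field maps out the tensor.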
125 Galvin M. Coppinger (1923– ).
126 Pieter Zeeman (1865–1943).
FIGURE 11.63 Some stable free radicals: (top left) triphenylmethyl; (top right) 2,2-diphenyl-1-picrylhydrazyl (DPPH); (bottom left) TEMPO; (bottom right) Galvinoxyl.
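Fine-Structure Splittings in ESR Spectra of Triplet States. Consider the Hamiltonian $\hat{H}$ for spin–spin (fine-structure) and Zeeman interactions of two spins $\mathbf{S}_1$ and $\mathbf{S}_2$ a mutual distance $r_{12}$ apart, in an external magnetic field $\mathbf{H}_0$:
$$\hat{H} = \beta_e\,\mathbf{H}_0\cdot\mathbf{g}_e\cdot(\mathbf{S}_1+\mathbf{S}_2) + g_e^2\beta_e^2\left[\mathbf{S}_1\cdot\mathbf{S}_2\,r_{12}^{-3} - 3\,(\mathbf{S}_1\cdot\mathbf{r}_{12})(\mathbf{S}_2\cdot\mathbf{r}_{12})\,r_{12}^{-5}\right] \tag{11.21.33}$$
Assume (1) an isotropic g-tensor for simplicity, (2) that the two spins couple: $\mathbf{S} = \mathbf{S}_1 + \mathbf{S}_2$, to form a singlet state $2^{-1/2}\{|\alpha_e\beta_e\rangle - |\beta_e\alpha_e\rangle\}$ and three triplet states $|\alpha_e\alpha_e\rangle$, $2^{-1/2}\{|\alpha_e\beta_e\rangle + |\beta_e\alpha_e\rangle\}$, and $|\beta_e\beta_e\rangle$. Then the Hamiltonian simplifies to
$$\hat{H} = g_e\beta_e\,\mathbf{H}_0\cdot\mathbf{S} + \mathbf{S}\cdot\mathbf{D}\cdot\mathbf{S} \tag{11.21.34}$$
where the symmetric fine-structure tensor $\mathbf{D}$ has diagonal and off-diagonal components (in an arbitrary coordinate system, e.g. when $\mathbf{H}_0$ is along the z axis) of the type
$$D_{xx} = (1/2)\,g_e^2\beta_e^2\,\langle r_{12}^{-3} - 3x_{12}^2\,r_{12}^{-5}\rangle \tag{11.21.35}$$
$$D_{xy} = (1/2)\,g_e^2\beta_e^2\,\langle -3x_{12}y_{12}\,r_{12}^{-5}\rangle \tag{11.21.36}$$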
After a transformation into a principal-axis system (X, Y, Z) the fine-structure tensor becomes a traceless symmetric diagonal tensor:
$$\mathbf{S}\cdot\mathbf{D}\cdot\mathbf{S} = -X S_X^2 - Y S_Y^2 - Z S_Z^2 = D\left(S_Z^2 - \tfrac{1}{3}S^2\right) + E\left(S_X^2 - S_Y^2\right) \tag{11.21.37}$$
Experimentally, this means that, between measurements, a crystal must be rotated about two mutually orthogonal axes, until extrema in the signals (usually symmetrical about the “impurity” signal at g = 2) are detected at certain angles. It is important and interesting to correlate the “principal axes” with the crystallographic axes and determine how they relate to axes in the molecules that exhibit the triplet signal. The D value gives the size of the interactions (typically, a fraction of 1 cm⁻¹), while E measures the asymmetry of the triplet state charge distribution (usually E is smaller than D). At high magnetic fields (H0 > 0.3 T) the transitions in the principal-axis plane are |D − 3E| and (3/2)|D|; at zero external field, the transitions are |D + E| and |D − E|; the absolute signs of D and E cannot be determined from an EPR spectrum. Even the EPR “powder” spectrum of randomly oriented crystallites can sometimes yield D and E values at the “turning points” of the distribution of spins.
Spin Labeling. The EPR of Fremy’s salt in water (or TEMPO in low-viscosity organic solvents) shows a 1:1:1 triplet with hyperfine splitting 1.3 mT, due to the I = 1 spin of ¹⁴N, centered around g = 2.002. McConnell showed that TEMPO and similar nitroxides, appropriately synthesized to be biocompatible with the target region, can be incorporated as “spin labels” into biologically interesting regions: mitochondria and other cellular components, phospholipid bilayers, nerve cells, and active sites of enzymes; if the medium is viscous, then the symmetrical pattern of Fig. 11.61 becomes unsymmetrical and distorted; this probes the relaxation times within the biological system.
In biophysical chemistry, the spin label method competes with the fluorescent label method, but both labels tend to be large molecules, which are intrusive in the very region they are probing.
127 Walter David Knight (1919–2000).
Nuclear Resonance in Paramagnetic Systems: Knight127 shift. If there is a paramagnetic species with z-component of spin $S_z$ and a nuclear z-component of spin $I_z$ in an external magnetic field $H_0$ along z, then the interaction Hamiltonian is
$$\hat{H} = g_e\beta_e H_0 S_z - g_N\beta_N H_0 I_z + a I_z S_z = -g_N\beta_N I_z\,(H_0 - aS_z/g_N\beta_N) + g_e\beta_e H_0 S_z \tag{11.21.38}$$
By collecting the terms in $I_z$ we see that there is an effective field $H_{\mathrm{eff}}$ acting on the nucleus:
$$H_{\mathrm{eff}} \equiv H_0 - aS_z/g_N\beta_N \tag{11.21.39}$$
which can be very large: For instance, when $H_0$ = 1 T, then the ¹H Larmor frequency is 42 MHz; an admittedly large hyperfine splitting constant a = 84 MHz will then cause $H_{\mathrm{eff}}$ to be either 0 T or 2 T, depending on the spin orientation! Therefore a nuclear resonance will shift upfield or downfield by an amount $\Delta H$:
$$\Delta H/H_0 \approx -a\langle S_z\rangle/g_N\beta_N H_0 \tag{11.21.40}$$
which strongly depends on T. Using the magnetic susceptibility of electrons of spin S:
$$\chi = N g_e^2\beta_e^2\,S(S+1)/3k_BT \tag{(5.9.12)}$$
and the Fermi contact term for the hyperfine interaction:
$$a \equiv (8\pi/3)\,g_e\beta_e g_N\beta_N\,|\psi_{1s}(0)|^2 \tag{(11.21.25)}$$
this becomes the Knight shift $\Delta H/H_0$:
$$\Delta H/H_0 = (8\pi/3N)\,\chi\,|\psi_{1s}(0)|^2 \tag{11.21.41}$$
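The numerical example given with Eq. (11.21.39) can be reproduced with a few lines of arithmetic. In the Python sketch below the hyperfine constant is expressed in frequency units, so that dividing $aS_z$ by the proton gyromagnetic ratio $\gamma_H/2\pi$ converts it to a field. The value 42.577 MHz/T for $\gamma_H/2\pi$ and the function name effective_field are our assumptions; the text itself rounds the Larmor frequency to 42 MHz at 1 T.

```python
# Proton Larmor frequency per tesla (gamma_H / 2 pi), rounded value.
GAMMA_H_MHZ_PER_T = 42.577


def effective_field(h0_tesla, a_mhz, sz):
    """Effective field at the nucleus, H_eff = H0 - a*Sz/(gN*betaN).

    a_mhz is the hyperfine constant in MHz; sz is +0.5 or -0.5.
    """
    return h0_tesla - a_mhz * sz / GAMMA_H_MHZ_PER_T


if __name__ == "__main__":
    for sz in (+0.5, -0.5):
        h_eff = effective_field(1.0, 84.0, sz)
        print(f"Sz = {sz:+.1f}: H_eff = {h_eff:.3f} T")
```

With the unrounded gyromagnetic ratio the two effective fields come out very close to, but not exactly at, the 0 T and 2 T quoted above.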
The Knight shift was first measured in metals, but is also appreciable for solutions containing paramagnetic ions. La3+ salts have been used to shift ¹H resonances, although the spin-lattice relaxation times lengthen considerably, and thus the signals become harder to detect.
Overhauser Effect. If one measures a nuclear transition in a paramagnetic system, while saturating the electron spin resonance, then the nuclear transition can be enhanced 100-fold, or a nuclear absorption can mutate into a nuclear emission. The reason is that one is playing with coupled Boltzmann populations of spins (electronic or nuclear): The relaxation rate for one is affected by the relaxation rate of the other. Consider the simultaneous change of electron spin (e) and nuclear spin (N), for example, $\alpha_e\beta_N \rightleftharpoons \beta_e\alpha_N$ or $\alpha_e\alpha_N \rightleftharpoons \beta_e\beta_N$ (see Fig. 11.64). Assume that the contact hyperfine interaction term $a(t)\,\mathbf{I}\cdot\mathbf{S}$ has a time-average value of zero, wiping out the hyperfine splittings: thus in the external magnetic field $H_0$ the Hamiltonian is
$$\hat{H} = g_e\beta_e H_0 S_z - g_N\beta_N H_0 I_z \tag{11.21.42}$$
FIGURE 11.64 Energy levels with Overhauser effect: (a) Relaxation due to a time-dependent isotropic contact electron-spin–nuclear-spin hyperfine interaction $a(t)\,\mathbf{I}\cdot\mathbf{S}$, which has a zero time-average, but allows processes X and Y and enhances nuclear spin transitions when the electron populations are made equal by saturation. (b) Relaxation is due to all dipole–dipole interactions, which allow processes X, Y, and PN; nuclear spin transitions are forced into emission by the Overhauser effect. In (a) the relative Boltzmann populations before saturation are shown: $\exp[(-g_e\beta_e - g_N\beta_N)H/2kT]$ for $\alpha_e\beta_N$, $\exp[(-g_e\beta_e + g_N\beta_N)H/2kT]$ for $\alpha_e\alpha_N$, $\exp[(g_e\beta_e - g_N\beta_N)H/2kT]$ for $\beta_e\beta_N$, and $\exp[(g_e\beta_e + g_N\beta_N)H/2kT]$ for $\beta_e\alpha_N$. The saturated transitions are marked SATURATE; the electron-spin relaxation paths are marked Pe.
At thermal equilibrium, when the transition rates between upper and lower states become equal, the ratio of the population $N_0^+$ of nuclear “up” spins $\beta_N$ ($I_z = +1/2$) to the population $N_0^-$ of nuclear “down” spins $\alpha_N$ ($I_z = -1/2$) is given by a Boltzmann factor:
$$N_0^+/N_0^- \equiv (N_{\alpha\alpha} + N_{\beta\alpha})\,(N_{\alpha\beta} + N_{\beta\beta})^{-1} = \exp(g_N\beta_N H_0/k_BT) \quad \text{(at equilibrium)} \tag{11.21.43}$$
If, however, the electron spin transition is saturated (this is shown by the wide arrows in Fig. 11.64a), then the populations of the electron spin-up and spin-down states are forced to become equal: $N_{\alpha\alpha} = N_{\beta\alpha}$, and $N_{\alpha\beta} = N_{\beta\beta}$. Under these conditions, the spin populations will depend only on the rate of simultaneous flips of both electron spin and nuclear spin ($\alpha_e\beta_N \rightleftharpoons \beta_e\alpha_N$, arrow X in Fig. 11.64a); this is permitted by contributions of the type $I^+S^-$ and $I^-S^+$ in the expansion of the contact term $a(t)\,\mathbf{I}\cdot\mathbf{S}$. Then the population ratios become
$$N^+/N^- = 2N_{\alpha\alpha}/2N_{\alpha\beta} = \exp[(g_N\beta_N + g_e\beta_e)H_0/k_BT] \quad \text{(at electron spin saturation)} \tag{11.21.44}$$
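Comparing the exponents of Eqs. (11.21.43) and (11.21.44) shows by how much saturation of the electron resonance can raise the nuclear population difference. The Python sketch below evaluates both Boltzmann ratios and the limiting factor $1 + g_e\beta_e/g_N\beta_N$ for protons. The constants are rounded CODATA values that we supply, the field and temperature are arbitrary illustrative choices, and the function names are ours.

```python
from math import exp

MU_B = 9.2740100783e-24  # Bohr magneton, J/T
MU_N = 5.0507837461e-27  # nuclear magneton, J/T
K_B = 1.380649e-23       # Boltzmann constant, J/K
G_E = 2.00231930436      # free-electron g-factor (magnitude)
G_H = 5.5856946893       # proton g-factor


def nuclear_ratio_equilibrium(h0, temp):
    """Nuclear population ratio at thermal equilibrium, Eq. (11.21.43)."""
    return exp(G_H * MU_N * h0 / (K_B * temp))


def nuclear_ratio_saturated(h0, temp):
    """Nuclear population ratio with the electron transition saturated,
    Eq. (11.21.44)."""
    return exp((G_H * MU_N + G_E * MU_B) * h0 / (K_B * temp))


def enhancement_factor():
    """High-temperature limit of the gain in population difference."""
    return 1.0 + G_E * MU_B / (G_H * MU_N)


if __name__ == "__main__":
    h0, temp = 0.34, 300.0  # tesla, kelvin (illustrative values)
    eq = nuclear_ratio_equilibrium(h0, temp)
    sat = nuclear_ratio_saturated(h0, temp)
    print(f"equilibrium excess: {eq - 1.0:.3e}")
    print(f"saturated excess:   {sat - 1.0:.3e}")
    print(f"limiting enhancement: {enhancement_factor():.1f}")
```

For protons the limiting factor evaluates to about 659, that is, one plus the ratio of the electron and proton magnetic moments.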
Thus, the nuclear spin population difference should increase by a factor of $(1 + g_e\beta_e/g_N\beta_N)$ = 659, which is not always reached in practice. When dipole–dipole interactions dominate (Fig. 11.64b), then the nuclei will emit energy, and the enhancement factor becomes negative.
Electron–Nuclear Double Resonance (ENDOR) Spectroscopy. This observes an electron spin resonance transition after a nuclear resonance transition has been saturated by a radio-frequency pulse (Fig. 11.65) so as to invert the relative populations of the $|\alpha_e\alpha_N\rangle$ and $|\alpha_e\beta_N\rangle$ spin states; this forces the populations of the $|\alpha_e\alpha_N\rangle$ and $|\beta_e\alpha_N\rangle$ states to be different and thus offers the opportunity to measure hyperfine splittings much more carefully, with better resolution than in standard EPR. There are many other specialized methods: electron–electron double resonance (ELDOR), TRIPLE, HYSCORE (hyperfine sublevel correlation spectroscopy, which is similar to 2D-EPR), electron spin-echo, and so on; these methods are not discussed here.
FIGURE 11.65 ENDOR experiment. $p \equiv g_e\beta_e H/2k_BT$; $q \equiv g_N\beta_N H/2k_BT$; $r \equiv a/4k_BT$. Hyperfine interaction is present, but the Overhauser effect is absent; the population expressions shown in the figure are valid at T high enough to have $p, q, r \ll 1$: then the Boltzmann factor is $\exp(-x) \approx 1 - x$. The level diagram shows the states $\alpha_e\alpha_N$, $\alpha_e\beta_N$, $\beta_e\beta_N$, and $\beta_e\alpha_N$ with their populations before and after the $\alpha_e\alpha_N \leftrightarrow \alpha_e\beta_N$ transition is saturated, the nuclear spacings $g_N\beta_N H - a/2$ and $g_N\beta_N H + a/2$, the electron spacing $g_e\beta_e H$, and the transitions marked SATURATE and OBSERVE.
Optically Detected Magnetic Resonance (ODMR). The first optically detected magnetic resonance experiment was done using the $^3P_1$ state of gas-phase mercury atoms [49]. While the sensitivity for a CW X-Band EPR experiment is typically $10^{11}$ spins (for a 1-gauss linewidth), ODMR can detect optically $10^6$ to $10^8$ spins and has even been improved to detect the spin on a single molecule (pentacene radical anion or cation embedded in terphenyl). ODMR is a double-resonance technique, in which transitions between spin sublevels are detected by optical means. Usually these are sublevels of a triplet state, and the transitions are induced by microwaves. For the different types of optical detection the following abbreviations are used: ADMR (absorption-detected magnetic resonance), DEDMR (delayed-emission (nonspecified)-detected magnetic resonance), DFDMR (delayed-fluorescence-detected magnetic resonance), FDMR (fluorescence-detected magnetic resonance), and PDMR (phosphorescence-detected magnetic resonance). If a reaction yield is followed, the expression RYDMR (reaction-yield-detected magnetic resonance) is used. A “spin microscope” has been proposed, based on ODMR and using a Si cantilever similar to AFM.
Nuclear Quadrupole Resonance (NQR) [50–54]. Nuclear (electric) quadrupole resonance (NQR) was invented in 1950 [55] and is applicable to nuclei with nonzero nuclear electric quadrupoles eQ, which are 3 × 3 tensors, whose significant components are the quadrupole coupling constant QCC:
$$\mathrm{QCC} \equiv e^2Qq_{zz}/h \tag{11.21.45}$$
and the electric field gradient asymmetry parameter:
$$\eta \equiv (q_{xx} - q_{yy})/q_{zz} \ge 0 \tag{11.21.46}$$
Here $q_{ij}$ is defined as the gradient of the electrical field at the nucleus undergoing NQR, due to the local electron density and the nearby nuclear charges:
$$q_{ij} \equiv \partial E/\partial x_i = \partial^2 V/\partial x_{ij}^2 \tag{11.21.47}$$
Table 11.14 Stable Nuclei and Their Quadrupole Moments [50]

Nucleus              I      Q (barns)^a
$^{2}_{1}$H          1      0.0027965
$^{6}_{3}$Li         1      0.000741
$^{7}_{3}$Li         3/2    0.039
$^{9}_{4}$Be         3/2    +0.029
$^{10}_{5}$B         3      0.074
$^{11}_{5}$B         3/2    0.036
$^{14}_{7}$N         1      0.0166
$^{17}_{8}$O         5/2    0.0301
$^{23}_{11}$Na       3/2    0.14–0.15
$^{27}_{13}$Al       5/2    +0.151
$^{33}_{16}$S        3/2    0.064
$^{35}_{17}$Cl       3/2    0.0802
$^{37}_{17}$Cl       3/2    0.0632
$^{39}_{19}$K        3/2    0.11
$^{45}_{21}$Sc       7/2    0.22
$^{51}_{23}$V        7/2    0.04
$^{55}_{25}$Mn       5/2    0.355
$^{59}_{27}$Co       7/2    0.40440
$^{63}_{29}$Cu       3/2    0.163
$^{67}_{30}$Zn       5/2    0.15
$^{69}_{31}$Ga       3/2    0.178
$^{71}_{31}$Ga       3/2    0.112
$^{75}_{33}$As       3/2    +0.32
$^{79}_{35}$Br       3/2    +0.332
$^{81}_{35}$Br       3/2    +0.282
$^{85}_{37}$Rb       5/2    0.27
$^{87}_{37}$Rb       3/2    0.13
$^{87}_{38}$Sr       9/2    0.2
$^{93}_{41}$Nb       9/2    0.2
$^{95}_{42}$Mo       5/2    0.12
$^{97}_{42}$Mo       5/2    1.1
$^{113}_{49}$In      9/2    +1.145
$^{115}_{49}$In      9/2    +1.165
$^{121}_{51}$Sb      5/2    0.5310
$^{123}_{51}$Sb      7/2    0.7
$^{127}_{53}$I       5/2    0.785
$^{133}_{55}$Cs      7/2    0.003
$^{135}_{56}$Ba      3/2    +0.182
$^{137}_{56}$Ba      3/2    +0.283
$^{139}_{57}$La      7/2    0.21
$^{141}_{59}$Pr      5/2    0.059
$^{143}_{60}$Nd      7/2    0.25
$^{147}_{62}$Sm      7/2    0.208
$^{149}_{62}$Sm      7/2    0.060
$^{151}_{63}$Eu      5/2    1.16
$^{153}_{63}$Eu      5/2    2.9
$^{155}_{64}$Gd      3/2    1.6
$^{157}_{64}$Gd      3/2    2
$^{159}_{65}$Tb      3/2    1.3
$^{161}_{66}$Dy      5/2    1.4
$^{163}_{66}$Dy      5/2    1.6
$^{165}_{67}$Ho      7/2    2.82
$^{167}_{68}$Er      7/2    2.83
$^{173}_{70}$Yb      5/2    2.8
$^{175}_{71}$Lu      7/2    5.68
$^{181}_{73}$Ta      7/2    3
$^{185}_{75}$Re      5/2    +2.8
$^{189}_{76}$Os      3/2    0.8
$^{191}_{77}$Ir      3/2    1.5
$^{193}_{77}$Ir      3/2    1.5
$^{197}_{79}$Au      3/2    +0.606
$^{201}_{80}$Hg      3/2    0.50

^a One barn equals $10^{-24}$ cm$^2$.
The potential energy due to the electrical quadrupole in a local principal-axis system, where the $eQ$ tensor is diagonal and $|q_{zz}| \geq |q_{yy}| \geq |q_{xx}|$ is chosen, is

$$|e|q = (|e|/2)\int \rho_n(\mathbf{r})\left[x^2\,\partial^2 V/\partial x^2 + y^2\,\partial^2 V/\partial y^2 + z^2\,\partial^2 V/\partial z^2\right] dv(\mathbf{r})$$

$$= |e|\int \rho_n(\mathbf{r})\left(3z^2 - r^2\right) dv(\mathbf{r})$$

$$= |e|\left[\int \psi(\mathbf{r})^{*}\left(3\cos^2\theta - 1\right) r^{2}\,\psi(\mathbf{r})\,dv(\mathbf{r}) + \sum_i Z_i R_i^{2}\left(3\cos^2\theta_i - 1\right)\right] \qquad (11.21.48)$$

where $|e|$ is the electronic charge, $\rho_n(\mathbf{r})$ is the charge density, $\psi(\mathbf{r})$ is the electronic wavefunction, and $Z_i$ and $R_i$ refer to the nuclear charge and the distance of nucleus $i$ from the nucleus undergoing NQR, respectively. The energy levels for nuclear spin $I$ and its z-component $m$ for the axially symmetric crystal are

$$E_m = e^2 Q q\,[4I(2I - 1)]^{-1}\left[3m^2 - I(I + 1)\right] \qquad (11.21.49)$$
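As an illustration of Eq. (11.21.49), the following Python sketch tabulates $E_m/h$ for an axially symmetric field gradient ($\eta = 0$) and the resulting transition frequencies. It assumes the equation reads $E_m = e^2Qq\,[3m^2 - I(I+1)]/[4I(2I-1)]$; the function names `nqr_levels` and `nqr_transitions` are hypothetical, and the 80 MHz coupling constant is only the typical chlorine value quoted in Table 11.17, not a measurement for a particular compound.

```python
from fractions import Fraction


def nqr_levels(spin, qcc_mhz):
    """Return {|m|: E_m/h in MHz} for an axially symmetric gradient (eta = 0)."""
    I = Fraction(spin)
    if I < 1:
        raise ValueError("a nuclear quadrupole moment requires I >= 1")
    levels = {}
    m = I
    while m >= 0:  # +m and -m are degenerate, so only |m| is tabulated
        levels[m] = float(qcc_mhz * (3 * m**2 - I * (I + 1)) / (4 * I * (2 * I - 1)))
        m -= 1
    return levels


def nqr_transitions(spin, qcc_mhz):
    """Frequencies (MHz) of the |m| -> |m| - 1 transitions, highest |m| first."""
    levels = nqr_levels(spin, qcc_mhz)
    ms = sorted(levels, reverse=True)
    return [levels[hi] - levels[lo] for hi, lo in zip(ms, ms[1:])]


if __name__ == "__main__":
    # Assumed example: I = 3/2 with QCC = 80 MHz (typical value, Table 11.17)
    for m, e in nqr_levels("3/2", 80.0).items():
        print(f"|m| = {m}: E/h = {e:+.2f} MHz")
    print("transitions (MHz):", nqr_transitions("3/2", 80.0))
```

For $I = 3/2$ this gives a single line at QCC/2, and for $I = 5/2$ two lines at (3/10) QCC and (3/20) QCC, which are the $\eta = 0$ limits of the entries in Table 11.15.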
As mentioned earlier, nuclei have a nonzero quadrupole moment $Q$ if and only if their nuclear spin quantum number is $I \geq 1$; such nuclei, if stable, are listed in Table 11.14. The formal expressions for the transition frequencies as a function of $I$ and $\eta$ are shown in Table 11.15. A few $^{35}_{17}$Cl NQR frequencies are listed in Table 11.16; a few halogen NQR frequencies are shown in Table 11.17. The NQR signal is measured by coupling an oscillating radio-frequency magnetic field $H$ with the magnetic dipole moment of the nucleus (as in
Table 11.15 Transition Frequencies in Units of the QCC ($e^2Qq/h$) and as a Function of the Asymmetry Parameter $\eta$ [50]

I      Transition         Frequency
1      (0 → ±1)           $(3/4)(1 \pm \eta/3)$
3/2    (±3/2 → ±1/2)      $(1/2)(1 + \eta^2/3)^{1/2}$
5/2    (±5/2 → ±3/2)      $(3/10)(1 - 0.2037\,\eta^2 + 0.1622\,\eta^4)$
5/2    (±3/2 → ±1/2)      $(3/20)(1 + 1.0926\,\eta^2 - 0.6340\,\eta^4)$
7/2    (±7/2 → ±5/2)      $(3/14)(1 - 0.1001\,\eta^2 - 0.0180\,\eta^4)$
7/2    (±5/2 → ±3/2)      $(2/14)(1 - 0.5667\,\eta^2 + 1.8595\,\eta^4)$
7/2    (±3/2 → ±1/2)      $(1/14)(1 + 3.6333\,\eta^2 - 7.2607\,\eta^4)$
9/2    (±9/2 → ±7/2)      $(4/24)(1 - 0.0809\,\eta^2 - 0.0043\,\eta^4)$
9/2    (±7/2 → ±5/2)      $(3/24)(1 - 0.1875\,\eta^2 - 0.1233\,\eta^4)$
9/2    (±5/2 → ±3/2)      $(2/24)(1 - 1.3381\,\eta^2 + 11.7224\,\eta^4)$
9/2    (±3/2 → ±1/2)      $(1/24)(1 + 9.0333\,\eta^2 - 45.6910\,\eta^4)$
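To show how the entries of Table 11.15 are used when the field gradient is not axially symmetric, here is a small Python sketch for $I = 1$, 3/2, and 5/2. It assumes the signs of the series coefficients are as written in the code comments (for $I = 5/2$: $1 - 0.2037\eta^2 + 0.1622\eta^4$ and $1 + 1.0926\eta^2 - 0.6340\eta^4$); the function name `nqr_frequencies` and the sample values of QCC and $\eta$ are hypothetical.

```python
import math


def nqr_frequencies(spin, qcc_mhz, eta):
    """Transition frequencies (MHz) from the Table 11.15 expressions.

    spin is one of "1", "3/2", "5/2"; eta is the asymmetry parameter (0 <= eta <= 1).
    """
    e2, e4 = eta**2, eta**4
    if spin == "1":
        # (3/4)(1 +/- eta/3): two lines that merge when eta = 0
        return [0.75 * qcc_mhz * (1 + eta / 3), 0.75 * qcc_mhz * (1 - eta / 3)]
    if spin == "3/2":
        # (1/2)(1 + eta^2/3)^(1/2): a single line
        return [0.5 * qcc_mhz * math.sqrt(1 + e2 / 3)]
    if spin == "5/2":
        return [
            0.30 * qcc_mhz * (1 - 0.2037 * e2 + 0.1622 * e4),  # 5/2 -> 3/2
            0.15 * qcc_mhz * (1 + 1.0926 * e2 - 0.6340 * e4),  # 3/2 -> 1/2
        ]
    raise ValueError("only I = 1, 3/2, 5/2 are implemented in this sketch")


if __name__ == "__main__":
    # Assumed example: a spin-1 nucleus with QCC = 4.0 MHz and eta = 0.3
    print(nqr_frequencies("1", 4.0, 0.3))
```

Because an $I = 3/2$ nucleus gives only one line, a single NQR frequency cannot fix QCC and $\eta$ separately; for $I = 1$ or $I \geq 5/2$ the two (or more) lines determine both.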
Table 11.16 $^{35}$Cl NQR Frequencies for Several Chlorine-Containing Compounds [50]

Compound               Frequencies (MHz)
Cl2(s)                 108.9
CHCl3                  38.254 & 38.308 @ 77 K
HgCl2                  22.251 & 22.0964 @ 296 K, 22.240 & 22.058 @ 300 K
GaCl2                  20.302 & 19.204 @ 300 K, 22.23 & 19.08 @ 305 K
GeCl2                  24.449 & 25.451 @ 77 K
BiCl3                  15.952 & 19.173 @ 291 K
K2TeCl6                15.13 & 14.99 @ 298 K
K2SnCl6                15.06 @ 298 K
K2PtCl6                25.82 @ 298 K
K2ReCl6                13.89 @ 298 K
Rb2TeCl6               15.14 @ 298 K
Rb2SnCl6               15.60 @ 298 K
Rb2PtCl6               26.29 @ 298 K
Rb2ReCl6               14.28 @ 298 K
Cs2TeCl6               15.60 @ 298 K
Cs2SnCl6               16.05 @ 298 K
Cs2PtCl6               26.60 @ 298 K
Cs2ReCl6               14.61 @ 298 K
NaClO3                 30.62 @ 77 K
NaClO3                 29.92 @ 296 K
Ba(ClO3)2·H2O          29.923 @ 77 K
Ba(ClO3)2·H2O          29.322 @ 299 K
Table 11.17 Typical Values of $e^2Qq$ (MHz) for Halogen Nuclei in Covalently Bonded Crystals [55]

Nucleus              $e^2Qq$ (MHz)
$^{35}_{17}$Cl       80
$^{79}_{35}$Br       500
$^{127}_{53}$I       2,000
FIGURE 11.66 Oscilloscope tracing of the $^{35}_{17}$Cl NQR signal from KClO3 [54]. Labels in the figure: resonance absorption line; transient from sweep sawtooth voltage; 28.2 Mc/s; approx. 15 kc/s.
NMR), but the NQR signal is due to the interaction of the nuclear electric quadrupole moment $eQ$ with the local electric field gradient. Large samples, preferably single crystals (typically 5 g, or 1 cm$^3$), are placed in an RF pickup coil, and an absorption is registered, using (i) marginal oscillators (< 10 MHz), (ii) regenerative or marginal oscillators (< 100 MHz), (iii) superregenerative oscillators (20–300 MHz) [55], (iv) microwave cavity oscillators (100–380 MHz), (v) microstrip oscillators (0.250–1 GHz), and (vi) pulsed (quadrupole spin-echo) methods. As in NMR, the width of the NQR line has contributions from the crystal inhomogeneity $\Delta B$ and from the reciprocals of the spin-lattice relaxation time $T_1$ and the spin-spin or “spin memory” relaxation time $T_2$. NQR is sometimes referred to as “zero-field NMR.” Figure 11.66 shows an early NQR signal for KClO3. The resonance frequency $\nu_{NQR}$ is very sensitive to the squared wavefunction at the nucleus $|\psi_{1s}(r = 0)|^2$, to the local crystal electric field, and also to temperature changes (Table 11.16). The NQR data, combined with crystallographic information, can probe structure and bonding in the vicinity of the NQR nucleus. NQR has a regrettable appetite for large samples (grams), but applications have been proposed for explosives detection (e.g., $^{14}_{7}$N NQR at 700–900 kHz for the chemical RDX, as long as the sample is not encased in metal!).

PROBLEM 11.21.4. If for HCl the $^{35}_{17}$Cl signal is found at $(e^2Qq/h) = 67.9$ MHz, and $Q = 0.0789$ barns, then estimate the electric field gradient $|e|q$ (whose cgs units are $|e|$ cm$^{-3}$).
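The arithmetic behind an estimate of this kind can be sketched in a few lines of Python. This is only one possible route, assuming Gaussian cgs units throughout (charge in esu, so that $e^2Qq$ comes out in erg) and solving $e^2Qq/h = \mathrm{QCC}$ for $q$; the function name `field_gradient_q` is hypothetical, and the constants are rounded literature values rather than numbers taken from the text.

```python
H_ERG_S = 6.62607e-27   # Planck constant in erg s (rounded)
E_ESU = 4.80320e-10     # elementary charge in esu (rounded)
BARN_CM2 = 1.0e-24      # 1 barn in cm^2


def field_gradient_q(qcc_hz, q_barn):
    """Solve QCC = e^2 Q q / h for q (in cm^-3), Gaussian cgs units."""
    return H_ERG_S * qcc_hz / (E_ESU**2 * q_barn * BARN_CM2)


if __name__ == "__main__":
    q = field_gradient_q(67.9e6, 0.0789)  # values quoted in the problem
    print(f"q = {q:.2e} cm^-3, so the gradient is |e| times that")
```

With the numbers quoted in the problem this gives $q$ of order $10^{25}$ cm$^{-3}$.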
11.22 ELECTROCHEMICAL METHODS

After earlier experiments with static electricity and with Franklin$^{128}$ fishing for thunderbolts, electrochemistry was born with Galvani’s$^{129}$ electrostatic stimulation of deceased frog muscles in 1791 and Volta’s$^{130}$ development of the “voltaic pile.” The methodical and routine study of current versus voltage
128 Benjamin Franklin (1706–1790).
129 Luigi Galvani (1737–1798).
130 Count Alessandro Giuseppe Antonio Anastasio Volta (1745–1827).
characteristics started in 1922 with Heyrovský’s$^{131}$ invention of the polarograph. Modern electrochemical techniques can be divided into four groups:

(1) In potentiometry, the electrical potential (voltage) is measured at almost zero or very low current with an unpolarized working electrode.
(2) In voltammetry (of which polarography is one technique), a significant current is measured as a function of voltage, and the working electrode is polarized.
(3) In amperometry, the current at a polarized working electrode, proportional to the analyte concentration, is measured at fixed potential.
(4) In coulometry, the complete conversion of the analyte to a product is determined by measuring the total charge consumed.

As a reminder, galvanic cells are spontaneous ($E_{cell} > 0$ and $\Delta G_{cell} < 0$), while electrolytic cells are driven by an external voltage supply ($E_{cell} < 0$ and $\Delta G_{cell} > 0$). Primary reference electrodes, with their reduction potentials in H2O at 298.15 K, are:

1. The standard or normal hydrogen electrode (SHE or NHE) “Pt | H2(g) | H$^+$ (aq, 1 M)” at 0.000 V by definition.
2. The saturated calomel electrode (SCE) “Hg | Hg2Cl2, KCl (aq, sat’d)” at 0.2412 V vs. SHE (which can also be used in nonaqueous solvents).
3. The silver/silver chloride electrode “Ag | AgCl, KCl (aq, sat’d)” at 0.22 V vs. SHE.
4. The “Hg | Hg2SO4, K2SO4 (aq, sat’d)” electrode at 0.64 V vs. SHE.
5. The “Hg | HgO, NaOH (aq, 0.1 M)” electrode at 0.926 V vs. SHE.

The solvent “windows” (that is, the potential ranges within which electrochemical measurements are possible, because within them the electrolyte does not undergo an unwanted side-reaction) are shown in Fig. 11.67. Electrochemical measurements require elaborately cleaned electrodes (polished metal surface, Hg drop, glassy carbon, etc.) and a “supporting electrolyte” (often at 0.1 M to 1.0 M concentration), which transmits the potential across the cell.
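Since potentials are routinely quoted against different reference electrodes, a short Python sketch of the bookkeeping may help. It uses the reference potentials exactly as listed above and simply shifts the zero of the scale; the dictionary keys and the function name `convert_reference` are hypothetical, and liquid-junction potentials and temperature corrections are ignored.

```python
# Reduction potentials in V vs. SHE (H2O, 298.15 K), as listed in the text above.
REFERENCE_V_VS_SHE = {
    "SHE": 0.000,
    "SCE": 0.2412,
    "Ag/AgCl": 0.22,
    "Hg/Hg2SO4": 0.64,
    "Hg/HgO": 0.926,
}


def convert_reference(e_volts, from_ref, to_ref):
    """Re-express a potential measured vs. from_ref on the to_ref scale."""
    # E(vs SHE) = E(vs from_ref) + E_from(vs SHE); then subtract E_to(vs SHE).
    return e_volts + REFERENCE_V_VS_SHE[from_ref] - REFERENCE_V_VS_SHE[to_ref]


if __name__ == "__main__":
    # Assumed example: a wave observed at +0.500 V vs. SCE
    print(f"{convert_reference(0.500, 'SCE', 'SHE'):.4f} V vs. SHE")
    print(f"{convert_reference(0.500, 'SCE', 'Ag/AgCl'):.4f} V vs. Ag/AgCl")
```

The same additive shift applies in either direction, so converting to another scale and back returns the original value.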
If the reaction at the anode and the cathode must be “shielded” from each other, then a salt bridge is placed between the two solutions: The salt bridge typically consists of 4 M aqueous KCl in gelatinous agar-agar; the K$^+$ and Cl$^-$ ions have comparable sizes and hence almost equal ion mobilities (4% difference), so a small flow of these ions in and out of the salt bridge transmits electrical potential differences between the two solutions, at an acceptable cost of a few millivolts of junction potential. Empirical formulas exist to correct for the temperature dependence of the reference potentials in aqueous solution. When one must work in nonaqueous solvents, because of their conveniently large “window,” one must add a 0.1 M to 1.0 M salt (see Fig. 11.67) to help conduct current, but there can be a problem with referencing the working electrode potential to a standard electrode. SCE can be used in many nonaqueous solvents, but in some cases such a direct experiment does not work; one must use the Ag|Ag$^+$ ion
131 Jaroslav Heyrovský (1890–1967).
FIGURE 11.67 Practical limits or “windows” or potential ranges for electrochemical measurements in aqueous solution or in nonaqueous solvents (potential axis from +3.0 to −3.0 V). Systems shown (solvent, electrolyte | electrode): 1 M H2SO4(aq) | Pt; pH 7 buffer(aq) | Pt; 1 M NaOH(aq) | Pt; 1 M H2SO4(aq) | Hg; 1 M KCl(aq) | Hg; 1 M NaOH(aq) | Hg; 0.1 M Et4NOH(aq) | Hg; 1 M HClO4(aq) | graphite; 1 M KCl(aq) | graphite; MeCN, 0.1 M TBABF4 | Pt; DMF, 0.1 M TBABF4 | Pt; C6H5CN, 0.1 M TBABF4 | Pt; THF, 0.1 M TBAP | Pt; PC, 0.1 M TEAP | Pt; CH2Cl2, 0.1 M TBAP | Pt; SO2, 0.1 M TBAP | Pt; NH3, 0.1 M KI | Pt. PC, propylene carbonate. The electrolytes are: TBABF4, tetrabutylammonium tetrafluoroborate; TBAP, tetrabutylammonium perchlorate; and TEAP, tetraethylammonium perchlorate.
electrode as a reference instead and must also use experiments with mixed cells that will allow a numerical change of reference to SCE.

In an ideal cell there are two half-reactions at the “left” and “right” electrodes, and most often there is also a finite internal cell resistance $R$:

$$E = E_{left} - E_{right} - IR \qquad (11.22.1)$$
An ideal unpolarized cell would have $R = 0$ and infinite current; an ideal polarized cell would have a fixed $R$ independent of $E$ and thus a constant current. Reality is somewhere in between: There are several sources of “polarization” that can be considered as finite contributions to the overall resistance $R > 0$ (or better, the impedance $Z$). The $IR$ drop, from whatever source, is also called the overpotential $\eta$ (i.e., $IR > 0$), which always decreases the overall $E$; remember that $R$ is always a function of time and $E$. The causes of polarization are (1) diffusion-limited mass transfer of ions from bulk to electrode, (2) chemical side reactions (if any), and (3) slow electron transfer at the electrode between the adsorbed species to be oxidized and the adsorbed species to be reduced. In potentiometry, where little current is passed, the emphasis is on electrodes that measure pH (pH electrode) or permit the chemical
identification of the analyte (ion-sensitive electrodes). The pH meter, invented by Beckman$^{132}$ in 1934 and by the Radiometer Co. in Denmark in 1936, is a high-impedance voltmeter that uses a pH electrode, consisting of a small Ag wire connected to an “Ag | AgCl | KCl (sat’d)” electrode, immersed in a small volume of 0.1 M HCl solution that is separated from the bulk solution by a thin, H$_3$O$^+$-permeable glass membrane; a potential of 0.0592 V per pH unit is detected, amplified, corrected for temperature dependence, and converted to display pH units directly.

Indicator electrodes can be metallic, conductive, or membrane-based. Metallic indicator electrodes may use a metal and its cation, for example, Ag|Ag$^+$ or Hg|Hg$_2^{2+}$ in neutral solutions, or Zn|Zn$^{2+}$, Cd|Cd$^{2+}$, Bi|Bi$^{3+}$, Tl|Tl$^+$, or Pb|Pb$^{2+}$ in deaerated solutions (other metals cannot be used because they are not selective enough for specific cations, or are too easily oxidized, or are too refractory). For instance, for the reduction Cd$^{2+}$ + 2e$^-$ → Cd, the Nernst$^{133}$ equation reads: $E_{ind} = E^{\ominus}_{Cd} - (0.0592/2)\log_{10}(1/a_{Cd^{2+}})$. Other metallic indicator electrodes use a metal and a very stable salt of that metal; for example, for the reduction AgCl(s) + e$^-$ → Ag(s) + Cl$^-$, the Nernst equation yields $E_{ind} = 0.222 - 0.0592\log_{10} a_{Cl^-}$. Another such application uses Schwarzenbach’s$^{134}$ ethylenediaminetetraacetic acid (EDTA), (HOOCCH$_2$)$_2$NCH$_2$CH$_2$N(CH$_2$COOH)$_2$, whose tetraanion is called “Y$^{4-}$” for short: since Y$^{4-}$ forms a very stable complex HgY$^{2-}$ with Hg$^{2+}$, one adds a known amount of HgY$^{2-}$ to the solution; then, for the reduction HgY$^{2-}$ + 2e$^-$ → Hg(l) + Y$^{4-}$, the Nernst equation $E_{ind} = 0.21 - (0.0592/2)\log_{10}(a_{Y^{4-}}/a_{HgY^{2-}})$ can be used to measure $a_{Y^{4-}}$.

Solid electrodes, consisting of any of the relatively few inorganic salts that are electrically conducting, will allow for the determination of certain ions. For instance, LaF3 (“doped” with EuF3) exhibits relatively mobile F$^-$ ions, so it can be used as a fluoride-sensitive electrode, although above pH 8 OH$^-$ is an “interfering” ion, as is H$^+$ below pH 5; these interfering ions would be detected by this electrode as if they were F$^-$. Table 11.18 lists several such metal salt electrodes for the detection of specific ions.

132 Arnold Orville Beckman (1900–2004).
133 Walther Hermann Nernst (1864–1941).
134 Gerold Karl Schwarzenbach (1904–1978).

Table 11.18 Crystalline Salt Electrodes as Specific Ion-Sensitive Electrodes in H2O [56]

Salt            Analyte Ion          Concentration Range                          Interfering Species
AgBr            Br$^-$               $10^{0}$ to $5 \times 10^{-6}$               CN$^-$, I$^-$, S$^{2-}$
CdEDTA          Cd$^{2+}$            $10^{-1}$ to $1 \times 10^{-7}$              Fe$^{2+}$, Pb$^{2+}$, Hg$^{2+}$, Ag$^+$, Cu$^{2+}$
AgCl            Cl$^-$               $10^{0}$ to $5 \times 10^{-5}$               CN$^-$, I$^-$, S$^{2-}$, OH$^-$, NH3
Ag3CuS2         Cu$^{2+}$            $10^{-1}$ to $1 \times 10^{-8}$              Hg$^{2+}$, Ag$^+$, Cd$^{2+}$
AgCN            CN$^-$               $10^{-1}$ to $1 \times 10^{-6}$              I$^-$, S$^{2-}$
LaF3 + EuF3     F$^-$                sat’d to $1 \times 10^{-6}$                  OH$^-$ above pH 8, H$^+$ below pH 5
AgI             I$^-$                $10^{0}$ to $5 \times 10^{-8}$               CN$^-$
PbS             Pb$^{2+}$            $10^{-1}$ to $1 \times 10^{-6}$              Hg$^{2+}$, Ag$^+$, Cu$^{2+}$
Ag2S            Ag$^+$/S$^{2-}$      Ag$^+$: $10^{0}$ to $1 \times 10^{-7}$       Hg$^{2+}$
                                     S$^{2-}$: $10^{0}$ to $1 \times 10^{-7}$     Hg$^{2+}$
AgSCN           SCN$^-$              $10^{0}$ to $5 \times 10^{-6}$               Br$^-$, CN$^-$, I$^-$, S$^{2-}$
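The Nernst relations quoted above for metallic indicator electrodes lend themselves to a small numerical sketch. The Python below assumes the 298 K slope of 0.0592 V per decade used in the text; the function names are hypothetical, and the standard potential of −0.403 V for Cd$^{2+}$/Cd is a commonly tabulated value supplied here as an assumption, since the text does not quote it.

```python
import math

NERNST_SLOPE_V = 0.0592  # (RT/F) ln 10 at about 298 K, as used in the text


def e_metal_cation(e_standard, n, a_cation):
    """Metal | cation electrode: E = E0 - (0.0592/n) log10(1/a)."""
    return e_standard - (NERNST_SLOPE_V / n) * math.log10(1.0 / a_cation)


def e_ag_agcl(a_chloride):
    """Ag | AgCl electrode: E = 0.222 - 0.0592 log10(a_Cl-)."""
    return 0.222 - NERNST_SLOPE_V * math.log10(a_chloride)


def chloride_activity(e_measured):
    """Invert the Ag | AgCl relation to get a_Cl- from a measured potential."""
    return 10 ** ((0.222 - e_measured) / NERNST_SLOPE_V)


if __name__ == "__main__":
    # Assumed example: Cd2+ at activity 0.01, with E0 = -0.403 V (not from the text)
    print(f"Cd electrode: {e_metal_cation(-0.403, 2, 0.01):.4f} V")
    print(f"Ag/AgCl at a(Cl-) = 0.1: {e_ag_agcl(0.1):.4f} V")
```

Each tenfold change in activity shifts the potential by 0.0592/n V, which is also the origin of the 0.0592 V per pH unit response of the glass electrode.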
Ion-sensitive electrodes can also be made using (i) specific ions dissolved in a nonpolar liquid, (ii) specific ions embedded in an ion-exchange polymer or liquid membrane matrix, or (iii) “hollow” molecules that can surround specific cations. The ion-sensitive layer must be separated from the bulk solution and also from an indicating electrode by ion-permeable polymers, such as polytetrafluoroethylene (Teflon®), Nafion®, or polyvinyl chloride (PVC). Rugged ion-sensitive field-effect transistors (ISFETs) may replace an indicating electrode, by measuring the current due to ions that penetrate a Si3N4 layer placed over the gate electrode (with the rest of the FET protected from the solution by an impervious encapsulant polymer). Since the gate electric field change is not specific to which ion enters the Si3N4 layer, ISFETs must be designed with care and are coming to market rather slowly. Next, gas-sensitive membrane electrodes (GSME) are also in wide use. Finally, there are enzyme-based biosensors (EBB) that depend on specific enzyme–analyte interactions, which can be measured by various ion- or molecule-sensitive electrodes. Table 11.19 reviews these various electrodes. Electrochemical sensors that detect specific and preselected analytes are now incorporated into convenient encapsulated hand-held packages and are in routine commercial use. A few multisensors for explosives or trace amounts of gases (e.g., the “Caltech nose”) also exist. However, the shelf life and re-usability of all these sensors have been a vexing problem.

Coulometry comes in several flavors: constant-potential or potentiostatic coulometry, constant-current or amperostatic coulometry, coulometric titrations, and electrogravimetry.
Constant-potential coulometry is related to industrial electroplating: One wants to know how to completely deposit a certain metal ion onto an electrode, without gas evolution (which may roughen the electrode surface) and without depositing another ion that may be present in the electrolyte; in practice, one may have to program the applied potential electronically to secure a complete deposition. Consider the equation

$$E_{appl} = [E_{right} - E_{left}] + (\eta_{conc,right} - \eta_{conc,left}) + (\eta_{kin,right} - \eta_{kin,left}) - IR \qquad (11.22.2)$$

where the term in square brackets is the difference between standard reduction potentials, corrected for the activities of the relevant species and therefore obtainable from the Nernst equation. The next three terms represent the overpotential, due to either concentration effects, kinetic effects, or the “IR” drop due to the effective electrical resistance of the solution; alas, these must be obtained from experiment. If the left-hand electrode is the reference electrode, we may neglect both $\eta_{conc,left}$ and $\eta_{kin,left}$, leaving

$$E_{appl} \approx [E_{right} - E_{left}] + \eta_{conc,right} + \eta_{kin,right} - IR \qquad (11.22.3)$$
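A brief Python sketch shows how the terms of Eqs. (11.22.2) and (11.22.3) combine. It reads Eq. (11.22.2) as $E_{appl} = (E_{right} - E_{left}) + (\eta_{conc,right} - \eta_{conc,left}) + (\eta_{kin,right} - \eta_{kin,left}) - IR$; the function name `applied_potential` and every number in the example are hypothetical, chosen only to show the bookkeeping.

```python
def applied_potential(e_right, e_left, eta_conc_right, eta_kin_right,
                      current_a, resistance_ohm,
                      eta_conc_left=0.0, eta_kin_left=0.0):
    """Evaluate Eq. (11.22.2); with the default (zero) left-hand
    overpotentials it reduces to Eq. (11.22.3)."""
    nernst_term = e_right - e_left                  # from the Nernst equation
    conc_term = eta_conc_right - eta_conc_left      # concentration overpotential
    kin_term = eta_kin_right - eta_kin_left         # kinetic overpotential
    ir_drop = current_a * resistance_ohm            # ohmic loss in the solution
    return nernst_term + conc_term + kin_term - ir_drop


if __name__ == "__main__":
    # Hypothetical numbers: the left-hand electrode is the reference electrode
    e_appl = applied_potential(e_right=0.30, e_left=0.24,
                               eta_conc_right=-0.05, eta_kin_right=-0.10,
                               current_a=0.10, resistance_ohm=0.5)
    print(f"E_appl = {e_appl:.3f} V")
```

Because the overpotentials and $R$ drift as the electrolysis proceeds, such a calculation would in practice be repeated with updated experimental values rather than evaluated once.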
As the electrolysis proceeds, the Nernst potential, $R$, and the two $\eta$ for the right-hand electrode will change with time. For instance, consider the electroplating reduction of Cu$^{2+}$ [56]:

$$\mathrm{Cu^{2+}(aq) + H_2O(l) \rightarrow Cu(s) + \tfrac{1}{2}\,O_2(g) + 2\,H^+(aq)} \qquad (11.22.4)$$
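Table 11.19 Liquid Membrane Electrodes (LME), Gas-Sensing Electrodes (GSME), and Enzyme-Based Biosensors (EBB) [56]

Analyte        Type    Concentration Range
NH$_4^+$       LME     $10^{0}$ to $5 \times 10^{-7}$
Cd$^{2+}$      LME     $10^{0}$ to $5 \times 10^{-7}$
Ca$^{2+}$      LME     $10^{0}$ to $5 \times 10^{-7}$
Cl$^-$         LME     $10^{0}$ to $5 \times 10^{-6}$
BF$_4^-$       LME     $10^{0}$ to $7 \times 10^{-6}$
NO$_3^-$       LME     $10^{0}$ to $7 \times 10^{-6}$
NO$_2^-$       LME     $1.4 \times 10^{0}$ to $3.6 \times 10^{-6}$
ClO$_4^-$      LME     $10^{0}$ to $7 \times 10^{-6}$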
Major Interferences, or Active Enzyme