CHARACTERIZATION OF MATERIALS
EDITORIAL BOARD

Elton N. Kaufmann (Editor-in-Chief), Argonne National Laboratory, Argonne, IL
Ronald Gronsky, University of California at Berkeley, Berkeley, CA
Reza Abbaschian, University of Florida at Gainesville, Gainesville, FL
Leonard Leibowitz, Argonne National Laboratory, Argonne, IL
Peter A. Barnes, Clemson University, Clemson, SC
Thomas Mason, Spallation Neutron Source Project, Oak Ridge, TN
Andrew B. Bocarsly, Princeton University, Princeton, NJ
Juan M. Sanchez, University of Texas at Austin, Austin, TX
Chia-Ling Chien, Johns Hopkins University, Baltimore, MD
David Dollimore, University of Toledo, Toledo, OH
Barney L. Doyle, Sandia National Laboratories, Albuquerque, NM
Brent Fultz, California Institute of Technology, Pasadena, CA
Alan I. Goldman, Iowa State University, Ames, IA

Alan C. Samuels (Developmental Editor), Edgewood Chemical Biological Center, Aberdeen Proving Ground, MD
EDITORIAL STAFF

VP, STM Books: Janet Bailey
Executive Editor: Jacqueline I. Kroschwitz
Editor: Arza Seidel
Director, Book Production and Manufacturing: Camille P. Carter
Managing Editor: Shirley Thomas
Assistant Managing Editor: Kristen Parrish
CHARACTERIZATION OF MATERIALS VOLUMES 1 AND 2
Characterization of Materials is available online in full color at www.mrw.interscience.wiley.com/com.
A John Wiley and Sons Publication
Copyright © 2003 by John Wiley & Sons, Inc. All rights reserved. Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400, fax 978-750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, e-mail: [email protected].

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services please contact our Customer Care Department within the U.S. at 877-762-2974, outside the U.S. at 317-572-3993, or fax 317-572-4002. Wiley also publishes its books in a variety of electronic formats. Some content that appears in print, however, may not be available in electronic format.

Library of Congress Cataloging in Publication Data is available.

Characterization of Materials, 2 volume set
Elton N. Kaufmann, editor-in-chief
ISBN: 0-471-26882-8 (acid-free paper)

Printed in the United States of America
10 9 8 7 6 5 4 3 2 1
CONTENTS, VOLUMES 1 AND 2

FOREWORD vii
PREFACE ix
CONTRIBUTORS xiii

COMMON CONCEPTS 1
Common Concepts in Materials Characterization, Introduction 1
General Vacuum Techniques 1
Mass and Density Measurements 24
Thermometry 30
Symmetry in Crystallography 39
Particle Scattering 51
Sample Preparation for Metallography 63

COMPUTATION AND THEORETICAL METHODS 71
Computation and Theoretical Methods, Introduction 71
Introduction to Computation 71
Summary of Electronic Structure Methods 74
Prediction of Phase Diagrams 90
Simulation of Microstructural Evolution Using the Field Method 112
Bonding in Metals 134
Binary and Multicomponent Diffusion 145
Molecular-Dynamics Simulation of Surface Phenomena 156
Simulation of Chemical Vapor Deposition Processes 166
Magnetism in Alloys 180
Kinematic Diffraction of X Rays 206
Dynamical Diffraction 224
Computation of Diffuse Intensities in Alloys 252

MECHANICAL TESTING 279
Mechanical Testing, Introduction 279
Tension Testing 279
High-Strain-Rate Testing of Materials 288
Fracture Toughness Testing Methods 302
Hardness Testing 316
Tribological and Wear Testing 324

THERMAL ANALYSIS 337
Thermal Analysis, Introduction 337
Thermal Analysis—Definitions, Codes of Practice, and Nomenclature 337
Thermogravimetric Analysis 344
Differential Thermal Analysis and Differential Scanning Calorimetry 362
Combustion Calorimetry 373
Thermal Diffusivity by the Laser Flash Technique 383
Simultaneous Techniques Including Analysis of Gaseous Products 392

ELECTRICAL AND ELECTRONIC MEASUREMENTS 401
Electrical and Electronic Measurement, Introduction 401
Conductivity Measurement 401
Hall Effect in Semiconductors 411
Deep-Level Transient Spectroscopy 418
Carrier Lifetime: Free Carrier Absorption, Photoconductivity, and Photoluminescence 427
Capacitance-Voltage (C-V) Characterization of Semiconductors 456
Characterization of pn Junctions 466
Electrical Measurements on Superconductors by Transport 472

MAGNETISM AND MAGNETIC MEASUREMENTS 491
Magnetism and Magnetic Measurement, Introduction 491
Generation and Measurement of Magnetic Fields 495
Magnetic Moment and Magnetization 511
Theory of Magnetic Phase Transitions 528
Magnetometry 531
Thermomagnetic Analysis 540
Techniques to Measure Magnetic Domain Structures 545
Magnetotransport in Metals and Alloys 559
Surface Magneto-Optic Kerr Effect 569

ELECTROCHEMICAL TECHNIQUES 579
Electrochemical Techniques, Introduction 579
Cyclic Voltammetry 580
Electrochemical Techniques for Corrosion Quantification 592
Semiconductor Photoelectrochemistry 605
Scanning Electrochemical Microscopy 636
The Quartz Crystal Microbalance in Electrochemistry 653

OPTICAL IMAGING AND SPECTROSCOPY 665
Optical Imaging and Spectroscopy, Introduction 665
Optical Microscopy 667
Reflected-Light Optical Microscopy 674
Photoluminescence Spectroscopy 681
Ultraviolet and Visible Absorption Spectroscopy 688
Raman Spectroscopy of Solids 698
Ultraviolet Photoelectron Spectroscopy 722
Ellipsometry 735
Impulsive Stimulated Thermal Scattering 744

RESONANCE METHODS 761
Resonance Methods, Introduction 761
Nuclear Magnetic Resonance Imaging 762
Nuclear Quadrupole Resonance 775
Electron Paramagnetic Resonance Spectroscopy 792
Cyclotron Resonance 805
Mössbauer Spectrometry 816

X-RAY TECHNIQUES 835
X-Ray Techniques, Introduction 835
X-Ray Powder Diffraction 835
Single-Crystal X-Ray Structure Determination 850
XAFS Spectroscopy 869
X-Ray and Neutron Diffuse Scattering Measurements 882
Resonant Scattering Techniques 905
Magnetic X-Ray Scattering 917
X-Ray Microprobe for Fluorescence and Diffraction Analysis 939
X-Ray Magnetic Circular Dichroism 953
X-Ray Photoelectron Spectroscopy 970
Surface X-Ray Diffraction 1007
X-Ray Diffraction Techniques for Liquid Surfaces and Monomolecular Layers 1027

ELECTRON TECHNIQUES 1049
Electron Techniques, Introduction 1049
Scanning Electron Microscopy 1050
Transmission Electron Microscopy 1063
Scanning Transmission Electron Microscopy: Z-Contrast Imaging 1090
Scanning Tunneling Microscopy 1111
Low-Energy Electron Diffraction 1120
Energy-Dispersive Spectrometry 1135
Auger Electron Spectroscopy 1157

ION-BEAM TECHNIQUES 1175
Ion-Beam Techniques, Introduction 1175
High-Energy Ion-Beam Analysis 1176
Elastic Ion Scattering for Composition Analysis 1179
Nuclear Reaction Analysis and Proton-Induced Gamma Ray Emission 1200
Particle-Induced X-Ray Emission 1210
Radiation Effects Microscopy 1223
Trace Element Accelerator Mass Spectrometry 1235
Introduction to Medium-Energy Ion Beam Analysis 1258
Medium-Energy Backscattering and Forward-Recoil Spectrometry 1259
Heavy-Ion Backscattering Spectrometry 1273

NEUTRON TECHNIQUES 1285
Neutron Techniques, Introduction 1285
Neutron Powder Diffraction 1285
Single-Crystal Neutron Diffraction 1307
Phonon Studies 1316
Magnetic Neutron Scattering 1328

INDEX 1341
FOREWORD

Whatever standards may have been used for materials research in antiquity, when fabrication was regarded more as an art than a science and tended to be shrouded in secrecy, an abrupt change occurred with the systematic discovery of the chemical elements two centuries ago by Cavendish, Priestley, Lavoisier, and their numerous successors. This revolution was enhanced by the parallel development of electrochemistry and eventually capped by the consolidating work of Mendeleyev, which led to the periodic chart of the elements. The age of materials science and technology had finally begun. This does not mean that empirical or trial-and-error work was abandoned as unnecessary, but rather that a new attitude had entered the field. The diligent fabricator of materials would welcome the development of new tools that could advance his or her work, whether exploratory or applied. For example, electrochemistry became an intimate part of the armature of materials technology. Fortunately, the physicist as well as the chemist was able to offer new tools. Initially these included a vast improvement of the optical microscope, the development of the analytic spectroscope, the discovery of x-ray diffraction, and the invention of the electron microscope. Moreover, many other items such as isotopic tracers, laser spectroscopes, and magnetic resonance equipment eventually emerged and were found useful in their turn as the science of physics and the demands for better materials evolved.

Quite apart from being used to re-evaluate the basis for the properties of materials that had long been useful, the new approaches provided much more important dividends. The ever-expanding knowledge of chemistry not only made it possible to improve upon those properties by varying composition, structure, and other factors in controlled amounts, but revealed the existence of completely new materials that frequently turned out to be exceedingly useful. The mechanical properties of relatively inexpensive steels were improved by additions of silicon, an element which had first been produced as a chemist's oddity. More complex ferrosilicon alloys revolutionized the performance of electric transformers. A hitherto all but unknown element, tungsten, provided a long-term solution in the search for a durable filament for the incandescent lamp. Eventually the chemists were to emerge with valuable families of organic polymers that replaced many natural materials.

The successes that accompanied the new approach to materials research and development stimulated an entirely new spirit of invention. What had once been dreams, such as the invention of the automobile and the airplane, were transformed into reality, in part through the modification of old materials and in part by creation of new ones. The growth in basic understanding of electromagnetic phenomena, coupled with the discovery that some materials possessed special electrical properties, encouraged the development of new equipment for power conversion and new methods of long-distance communication with the use of wired or wireless systems. In brief, the successes derived from the new approach to the development of materials stimulated attempts to achieve practical goals which had previously seemed beyond reach. The technical base of society was being shaken to its foundations. And the end is not yet in sight.

The process of fabricating special materials for well-defined practical missions, such as the development of new inventions or the improvement of old ones, has had, and continues to have, its counterpart in exploratory research that is carried out primarily to expand the range of knowledge and properties of materials of various types. Such investigations began in the field of mineralogy somewhat before the age of modern chemistry and were stimulated by the fact that many common minerals display regular cleavage planes and may exhibit unusual optical properties, such as different indices of refraction in different directions. Studies of this type became much broader and more systematic, however, once the variety of sophisticated exploratory tools provided by chemistry and physics became available. Although the groups of individuals involved in this work tended to live somewhat apart from the technologists, it was inevitable that some of their discoveries would eventually prove to be very useful. Many examples can be given.

In the 1870s a young investigator who was studying the electrical properties of a group of poorly conducting metal sulfides, today classed among the family of semiconductors, noted that his specimens seemed to exhibit a different electrical conductivity when the voltage was applied in opposite directions. Careful measurements at a later date demonstrated that specially prepared specimens of silicon displayed this rectifying effect to an even more marked degree. Another investigator discovered a family of crystals that displayed surface charges of opposite polarity when placed under unidirectional pressure, so-called piezoelectricity. Natural radioactivity was discovered in a specimen of a uranium mineral whose physical properties were under study. Superconductivity was discovered incidentally in a systematic study of the electrical conductivity of simple metals close to the absolute zero of temperature. The possibility of creating a light-emitting crystal diode was suggested once wave mechanics was developed and began to be applied to advance our understanding of the properties of materials further. Actually, achievement of the device proved to be more difficult than its conception; the materials involved had to be prepared with great care.

Among the many avenues explored for the sake of obtaining new basic knowledge is that related to the influence of imperfections on the properties of materials. Some imperfections, such as those which give rise to temperature-dependent electrical conductivity in semiconductors, salts, and metals, could be ascribed to thermal fluctuations. Others were linked to foreign atoms which were added intentionally or occurred by accident. Still others were the result of deviations in the arrangement of atoms from that expected in ideal lattice structures. As might be expected, discoveries in this area not only clarified mysteries associated with ancient aspects of materials research, but provided tests that could have a bearing on the properties of materials being explored for novel purposes. The semiconductor industry has been an important beneficiary of this form of exploratory research, since the operation of integrated circuits can be highly sensitive to imperfections.

In this connection, it should be added that the ever-increasing search for special materials that possess new or superior properties, under conditions in which the sponsors of exploratory research and development and the prospective beneficiaries of the technological advance have parallel interests, has made it possible for those engaged in the exploratory research to share in the funds directed toward applications. This has done much to enhance the degree of partnership between the scientist and engineer in advancing the field of materials research.

Finally, it should be emphasized again that whenever materials research has played a decisive role in advancing some aspect of technology, the advance has frequently been aided by the introduction of an increasingly sophisticated set of characterization tools drawn from a wide range of scientific disciplines. These tools usually remain a part of the array of test equipment.

FREDERICK SEITZ
President Emeritus, Rockefeller University
Past President, National Academy of Sciences, USA
PREFACE

Materials research is an extraordinarily broad and diverse field. It draws on the science, the technology, and the tools of a variety of scientific and engineering disciplines as it pursues research objectives spanning the very fundamental to the highly applied. Beyond the generic idea of a "material" per se, perhaps the single unifying element that qualifies this collection of pursuits as a field of research and study is the existence of a portfolio of characterization methods that is widely applicable irrespective of discipline or ultimate materials application. Characterization of Materials specifically addresses that portfolio, with which researchers and educators must have working familiarity.

The immediate challenge in organizing the content for a methodological reference work is determining how best to parse the field. By far the largest number of materials researchers are focused on particular classes of materials and also perhaps on their uses. Thus a comfortable choice would have been to commission chapters accordingly. Alternatively, the objective and product of any measurement—i.e., a materials property—could easily form a logical basis. Unfortunately, each of these approaches would have required mention of several of the measurement methods in just about every chapter. Therefore, if only to reduce redundancy, we have chosen a less intuitive taxonomy by arranging the content according to the type of measurement "probe" upon which a method relies. Thus you will find chapters focused on application of electrons, ions, x rays, heat, light, etc., to a sample as the generic thread tying several methods together.

Our field is too complex for this not to be an oversimplification, and indeed some logical inconsistencies are inevitable. We have tried to maintain the distinction between a property and a method. This is easy and clear for methods based on external independent probes such as electron beams, ion beams, neutrons, or x rays. However, many techniques rely on one and the same phenomenon for probe and property, as is the case for mechanical, electronic, and thermal methods. Many methods fall into both regimes. For example, light may be used to observe a microstructure, but may also be used to measure an optical property. From the most general viewpoint, we recognize that the properties of the measuring device and those of the specimen under study are inextricably linked; it is actually a joint property of the tool-plus-sample system that is observed. When tool and sample each contribute their own materials properties—e.g., electrolyte and electrode, pin and disc, source and absorber, etc.—distinctions are blurred. Although these distinctions in principle ought not to be taken too seriously, keeping them in mind will aid in efficiently accessing content of interest in these volumes.

Frequently, the materials property sought is not what is directly measured. Rather, it is deduced from direct observation of some other property or phenomenon that acts as a signature of what is of interest. These relationships take many forms. Thermal arrest, magnetic anomaly, diffraction spot intensity, relaxation rate, and resistivity, to name only a few, might all serve as signatures of a phase transition and be used as "spectator" properties to determine a critical temperature. Similarly, inferred properties such as charge carrier mobility are deduced from basic electrical quantities, and temperature-composition phase diagrams are deduced from observed microstructures. Characterization of Materials, being organized by technique, naturally places initial emphasis on the most directly measured properties, but authors have provided many application examples that illustrate the derivative properties a technique may address.

First among our objectives is to help the researcher discriminate among alternative measurement modalities that may apply to the property under study. The field of possibilities is often very wide, and although excellent texts treating each possible method in great detail exist, identifying the most appropriate method before delving deeply into any one seems the most efficient approach. Characterization of Materials serves to sort the options at the outset, with individual articles affording the researcher a description of the method sufficient to understand its applicability, limitations, and relationship to competing techniques, while directing the reader to more extensive resources that fit specific measurement needs. Whether one plans to perform such measurements oneself or simply needs to gain sufficient familiarity to collaborate effectively with experts in the method, Characterization of Materials will be a useful reference.

Although our expert authors were given great latitude to adjust their presentations to the "personalities" of their specific methods, some uniformity and circumscription of content was sought. Thus, you will find most units organized in a similar fashion. First, an introduction serves to describe succinctly for what properties the method is useful and what alternatives may exist. Underlying physical principles of the method and practical aspects of its implementation follow. Most units offer examples of data and their analyses, as well as warnings about common problems of which one should be aware. Preparation of samples and automation of the methods are also treated as appropriate.

As implied above, the level of presentation of these volumes is intended to be intermediate between cursory overview and detailed instruction. Readers will find that, in practice, the level of coverage is also very much dictated by the character of the technique described. Many are based on quite complex concepts and devices; others are less so, but still, of course, demand a precision of understanding and execution. What is or is not included in a presentation also depends on the technical background assumed of the reader. This obviates the need to delve into concepts that are part of rather standard technical curricula, while requiring inclusion of less common, more specialized topics. As much as possible, we have avoided extended discussion of the science and application of the materials properties themselves, which, although very interesting and clearly the motivation for research in the first place, do not generally speak to the efficacy of a method or its accomplishment.

This is a materials-oriented work and, as such, must overlap fields such as physics, chemistry, and engineering. There is no sharp delineation possible between a "physics" property (e.g., the band structure of a solid) and its materials consequences (e.g., conductivity, mobility, etc.). At the other extreme, it is not at all clear where a materials property such as toughness ends and an engineering property associated with performance and life-cycle begins. The very attempt to assign such concepts to only one disciplinary category serves no useful purpose. Suffice it to say, therefore, that Characterization of Materials has focused its coverage on a core of materials topics while trying to remain inclusive at the boundaries of the field.

Processing and fabrication are also important aspects of materials research. Characterization of Materials does not deal with these methods per se because they are not strictly measurement methods. Here again, however, no clear line can be drawn, and in such methods as electrochemistry, tribology, mechanical testing, and even ion-beam irradiation, where the processing can be the measurement, these aspects are perforce included.

The second chapter is unique in that it collects methods that are not, literally speaking, measurement methods; these articles do not follow the format found in subsequent chapters. As theory, simulation, or modeling methods, they certainly serve to augment experiment. They may be a necessary corollary to an experiment, to understand the result after the fact or to predict the result and thus help direct an experimental search in advance. More than this, as the equipment needs of many experimental studies increase in complexity and cost, as the materials themselves become more complex and multicomponent in nature, and as computational power continues to expand, simulation of properties will in fact become the measurement method of choice in many cases.

Another unique chapter is the first, covering "common concepts." It collects some of the ubiquitous aspects of measurement methods that would otherwise have had to be described repeatedly and in more detail in later units. Readers may refer back to this chapter as related topics arise around specific methods, or they may use this chapter as a general tutorial. The Common Concepts chapter, however, does not and should not eliminate all redundancies in the remaining chapters. Expositions within individual articles attempt to be somewhat self-contained, and the details of how a common concept actually relates to a given method are bound to differ from one to the next. Although Characterization of Materials is directed more toward the research lab than the classroom, the focused units, in conjunction with the first two chapters, can serve as a useful educational tool.

The content of Characterization of Materials previously appeared as Methods in Materials Research, a loose-leaf compilation amenable to updating. To retain the ability to keep content as up to date as possible, Characterization of Materials is also being published online, where several new and expanded topics will be added over time.
ACKNOWLEDGMENTS

First, we express our appreciation to the many expert authors who have contributed to Characterization of Materials. On the production side of the predecessor publication, Methods in Materials Research, we are pleased to acknowledge the work of a great many staff of the Current Protocols division of John Wiley & Sons, Inc. We also thank the previous series editors, Dr. Virginia Chanda and Dr. Alan Samuels. Republication in the present on-line and hard-bound forms owes its continuing quality to staff of the Major Reference Works group of John Wiley & Sons, Inc., most notably Dr. Jacqueline Kroschwitz and Dr. Arza Seidel.
For the editors,

ELTON N. KAUFMANN
Editor-in-Chief
CONTRIBUTORS Peter A. Barnes Clemson University Clemson, SC Electrical and Electronic Measurements, Introduction Capacitance-Voltage (C-V) Characterization of Semiconductors
Reza Abbaschian University of Florida at Gainesville Gainesville, FL Mechanical Testing, Introduction
John Ågren Royal Institute of Technology, KTH Stockholm, SWEDEN Binary and Multicomponent Diffusion
Jack Bass Michigan State University East Lansing, MI Magnetotransport in Metals and Alloys
Stephen D. Antolovich Washington State University Pullman, WA Tension Testing
Bob Bastasz Sandia National Laboratories Livermore, CA Particle Scattering
Samir J. Anz California Institute of Technology Pasadena, CA Semiconductor Photoelectrochemistry
Raymond G. Bayer Consultant Vestal, NY Tribological and Wear Testing
Georgia A. Arbuckle-Keil Rutgers University Camden, NJ The Quartz Crystal Microbalance in Electrochemistry
Goetz M. Bendele SUNY Stony Brook Stony Brook, NY X-Ray Powder Diffraction
Ljubomir Arsov University of Kiril and Metodij Skopje, MACEDONIA Ellipsometry
Andrew B. Bocarsly Princeton University Princeton, NJ Cyclic Voltammetry Electrochemical Techniques, Introduction
Albert G. Baca Sandia National Laboratories Albuquerque, NM Characterization of pn Junctions
Mark B.H. Breese University of Surrey, Guildford Surrey, UNITED KINGDOM Radiation Effects Microscopy
Sam Bader Argonne National Laboratory Argonne, IL Surface Magneto-Optic Kerr Effect
James C. Banks Sandia National Laboratories Albuquerque, NM Heavy-Ion Backscattering Spectrometry
Iain L. Campbell University of Guelph Guelph, Ontario CANADA Particle-Induced X-Ray Emission
Charles J. Barbour Sandia National Laboratory Albuquerque, NM Elastic Ion Scattering for Composition Analysis
Gerbrand Ceder Massachusetts Institute of Technology Cambridge, MA Introduction to Computation
Robert Celotta National Institute of Standards and Technology Gaithersburg, MD Techniques to Measure Magnetic Domain Structures
Gary W. Chandler University of Arizona Tucson, AZ Scanning Electron Microscopy
Haydn H. Chen University of Illinois Urbana, IL Kinematic Diffraction of X Rays
Long-Qing Chen Pennsylvania State University University Park, PA Simulation of Microstructural Evolution Using the Field Method
Chia-Ling Chien Johns Hopkins University Baltimore, MD Magnetism and Magnetic Measurements, Introduction
J.M.D. Coey University of Dublin, Trinity College Dublin, IRELAND Generation and Measurement of Magnetic Fields
Richard G. Connell University of Florida Gainesville, FL Optical Microscopy; Reflected-Light Optical Microscopy
Didier de Fontaine University of California Berkeley, CA Prediction of Phase Diagrams
T.M. Devine University of California Berkeley, CA Raman Spectroscopy of Solids
David Dollimore University of Toledo Toledo, OH Mass and Density Measurements; Thermal Analysis—Definitions, Codes of Practice, and Nomenclature; Thermometry; Thermal Analysis, Introduction
Barney L. Doyle Sandia National Laboratories Albuquerque, NM High-Energy Ion-Beam Analysis; Ion-Beam Techniques, Introduction
Jeff G. Dunn University of Toledo Toledo, OH Thermogravimetric Analysis
Gareth R. Eaton University of Denver Denver, CO Electron Paramagnetic Resonance Spectroscopy
Sandra S. Eaton University of Denver Denver, CO Electron Paramagnetic Resonance Spectroscopy
Fereshteh Ebrahimi University of Florida Gainesville, FL Fracture Toughness Testing Methods
Wolfgang Eckstein Max-Planck-Institut für Plasmaphysik Garching, GERMANY Particle Scattering
Arnel M. Fajardo California Institute of Technology Pasadena, CA Semiconductor Photoelectrochemistry
Kenneth D. Finkelstein Cornell University Ithaca, NY Resonant Scattering Techniques
Simon Foner Massachusetts Institute of Technology Cambridge, MA Magnetometry
Brent Fultz California Institute of Technology Pasadena, CA Electron Techniques, Introduction; Mössbauer Spectrometry; Resonance Methods, Introduction; Transmission Electron Microscopy
Jozef Gembarovic Thermophysical Properties Research Laboratory West Lafayette, IN Thermal Diffusivity by the Laser Flash Technique
Craig A. Gerken University of Illinois Urbana, IL Low-Energy Electron Diffraction
Atul B. Gokhale MetConsult, Inc. Roosevelt Island, NY Sample Preparation for Metallography
Alan I. Goldman Iowa State University Ames, IA X-Ray Techniques, Introduction; Neutron Techniques, Introduction
John T. Grant University of Dayton Dayton, OH Auger Electron Spectroscopy
Robert A. Jacobson Iowa State University Ames, IA Single-Crystal X-Ray Structure Determination
George T. Gray Los Alamos National Laboratory Los Alamos, NM High-Strain-Rate Testing of Materials
Duane D. Johnson University of Illinois Urbana, IL Computation of Diffuse Intensities in Alloys Magnetism in Alloys
Vytautas Grivickas Vilnius University Vilnius, LITHUANIA Carrier Lifetime: Free Carrier Absorption, Photoconductivity, and Photoluminescence
Michael H. Kelly National Institute of Standards and Technology Gaithersburg, MD Techniques to Measure Magnetic Domain Structures
Robert P. Guertin Tufts University Medford, MA Magnetometry
Elton N. Kaufmann Argonne National Laboratory Argonne, IL Common Concepts in Materials Characterization, Introduction
Gerard S. Harbison University of Nebraska Lincoln, NE Nuclear Quadrupole Resonance
Janice Klansky Buehler Ltd. Lake Bluff, IL Hardness Testing
Steve Heald Argonne National Laboratory Argonne, IL XAFS Spectroscopy
Chris R. Kleijn Delft University of Technology Delft, THE NETHERLANDS Simulation of Chemical Vapor Deposition Processes
Bruno Herreros University of Southern California Los Angeles, CA Nuclear Quadrupole Resonance
James A. Knapp Sandia National Laboratories Albuquerque, NM Heavy-Ion Backscattering Spectrometry
John P. Hill Brookhaven National Laboratory Upton, NY Magnetic X-Ray Scattering Ultraviolet Photoelectron Spectroscopy Kevin M. Horn Sandia National Laboratories Albuquerque, NM Ion Beam Techniques, Introduction Joseph P. Hornak Rochester Institute of Technology Rochester, NY Nuclear Magnetic Resonance Imaging James M. Howe University of Virginia Charlottesville, VA Transmission Electron Microscopy Gene E. Ice Oak Ridge National Laboratory Oak Ridge, TN X-Ray Microprobe for Fluorescence and Diffraction Analysis X-Ray and Neutron Diffuse Scattering Measurements
Thomas Koetzle Brookhaven National Laboratory Upton, NY Single-Crystal Neutron Diffraction Junichiro Kono Rice University Houston, TX Cyclotron Resonance Phil Kuhns Florida State University Tallahassee, FL Generation and Measurement of Magnetic Fields Jonathan C. Lang Argonne National Laboratory Argonne, IL X-Ray Magnetic Circular Dichroism David E. Laughlin Carnegie Mellon University Pittsburgh, PA Theory of Magnetic Phase Transitions Leonard Leibowitz Argonne National Laboratory Argonne, IL Differential Thermal Analysis and Differential Scanning Calorimetry
Supaporn Lerdkanchanaporn University of Toledo Toledo, OH Simultaneous Techniques Including Analysis of Gaseous Products
Daniel T. Pierce National Institute of Standards and Technology Gaithersburg, MD Techniques to Measure Magnetic Domain Structures
Nathan S. Lewis California Institute of Technology Pasadena, CA Semiconductor Photoelectrochemistry
Frank J. Pinski University of Cincinnati Cincinnati, OH Magnetism in Alloys Computation of Diffuse Intensities in Alloys
Dusan Lexa Argonne National Laboratory Argonne, IL Differential Thermal Analysis and Differential Scanning Calorimetry
Branko N. Popov University of South Carolina Columbia, SC Ellipsometry
Jan Linnros Royal Institute of Technology Kista-Stockholm, SWEDEN Carrier Lifetime: Free Carrier Absorption, Photoconductivity, and Photoluminescence
Ziqiang Qiu University of California at Berkeley Berkeley, CA Surface Magneto-Optic Kerr Effect
David C. Look Wright State University Dayton, OH Hall Effect in Semiconductors
Talat S. Rahman Kansas State University Manhattan, Kansas Molecular-Dynamics Simulation of Surface Phenomena
Jeffery W. Lynn University of Maryland College Park, MD Magnetic Neutron Scattering
T.A. Ramanarayanan Exxon Research and Engineering Corp. Annandale, NJ Electrochemical Techniques for Corrosion Quantification
Kosta Maglic Institute of Nuclear Sciences "Vinca" Belgrade, YUGOSLAVIA Thermal Diffusivity by the Laser Flash Technique
M. Ramasubramanian University of South Carolina Columbia, SC Ellipsometry
Floyd McDaniel University of North Texas Denton, TX Trace Element Accelerator Mass Spectrometry
S.S.A. Razee University of Warwick Coventry, UNITED KINGDOM Magnetism in Alloys
Michael E. McHenry Carnegie Mellon University Pittsburgh, PA Magnetic Moment and Magnetization Thermomagnetic Analysis Theory of Magnetic Phase Transitions
James L. Robertson Oak Ridge National Laboratory Oak Ridge, TN X-Ray and Neutron Diffuse Scattering Measurements
Keith A. Nelson Massachusetts Institute of Technology Cambridge, MA Impulsive Stimulated Thermal Scattering Dale E. Newbury National Institute of Standards and Technology Gaithersburg, MD Energy-Dispersive Spectrometry P.A.G. O’Hare Darien, IL Combustion Calorimetry Stephen J. Pennycook Oak Ridge National Laboratory Oak Ridge, TN Scanning Transmission Electron Microscopy: Z-Contrast Imaging
Ian K. Robinson University of Illinois Urbana, IL Surface X-Ray Diffraction John A. Rogers Bell Laboratories, Lucent Technologies Murray Hill, NJ Impulsive Stimulated Thermal Scattering William J. Royea California Institute of Technology Pasadena, CA Semiconductor Photoelectrochemistry Larry Rubin Massachusetts Institute of Technology Cambridge, MA Generation and Measurement of Magnetic Fields
Miquel Salmeron Lawrence Berkeley National Laboratory Berkeley, CA Scanning Tunneling Microscopy
Hugo Steinfink University of Texas Austin, TX Symmetry in Crystallography
Alan C. Samuels Edgewood Chemical Biological Center Aberdeen Proving Ground, MD Mass and Density Measurements Optical Imaging and Spectroscopy, Introduction Thermometry
Peter W. Stephens SUNY Stony Brook Stony Brook, NY X-Ray Powder Diffraction
Juan M. Sanchez University of Texas at Austin Austin, TX Computational and Theoretical Methods, Introduction Hans J. Schneider-Muntau Florida State University Tallahassee, FL Generation and Measurement of Magnetic Fields Christian Schott Swiss Federal Institute of Technology Lausanne, SWITZERLAND Generation and Measurement of Magnetic Fields Justin Schwartz Florida State University Tallahassee, FL Electrical Measurements on Superconductors by Transport Supapan Seraphin University of Arizona Tucson, AZ Scanning Electron Microscopy Qun Shen Cornell University Ithaca, NY Dynamical Diffraction Y Jack Singleton Consultant Monroeville, PA General Vacuum Techniques Gabor A. Somorjai University of California & Lawrence Berkeley National Laboratory Berkeley, CA Low-Energy Electron Diffraction Cullie J. Sparks Oak Ridge National Laboratory Oak Ridge, TN X-Ray and Neutron Diffuse Scattering Measurements Costas Stassis Iowa State University Ames, IA Phonon Studies Julie B. Staunton University of Warwick Coventry, UNITED KINGDOM Computation of Diffuse Intensities in Alloys Magnetism in Alloys
Ray E. Taylor Thermophysical Properties Research Laboratory West Lafayette, IN 47906 Thermal Diffusivity by the Laser Flash Technique Chin-Che Tin Auburn University Auburn, AL Deep-Level Transient Spectroscopy Brian M. Tissue Virginia Polytechnic Institute & State University Blacksburg, VA Ultraviolet and Visible Absorption Spectroscopy James E. Toney Applied Electro-Optics Corporation Bridgeville, PA Photoluminescence Spectroscopy John Unguris National Institute of Standards and Technology Gaithersburg, MD Techniques to Measure Magnetic Domain Structures David Vaknin Iowa State University Ames, IA X-Ray Diffraction Techniques for Liquid Surfaces and Monomolecular Layers Mark van Schilfgaarde SRI International Menlo Park, California Summary of Electronic Structure Methods György Vizkelethy Sandia National Laboratories Albuquerque, NM Nuclear Reaction Analysis and Proton-Induced Gamma Ray Emission Thomas Vogt Brookhaven National Laboratory Upton, NY Neutron Powder Diffraction Yunzhi Wang Ohio State University Columbus, OH Simulation of Microstructural Evolution Using the Field Method Richard E. Watson Brookhaven National Laboratory Upton, NY Bonding in Metals
Huub Weijers Florida State University Tallahassee, FL Electrical Measurements on Superconductors by Transport Jeffrey Weimer University of Alabama Huntsville, AL X-Ray Photoelectron Spectroscopy Michael Weinert Brookhaven National Laboratory Upton, NY Bonding in Metals Robert A. Weller Vanderbilt University Nashville, TN
Introduction to Medium-Energy Ion Beam Analysis Medium-Energy Backscattering and Forward-Recoil Spectrometry Stuart Wentworth Auburn University Auburn, AL Conductivity Measurement David Wipf Mississippi State University Mississippi State, MS Scanning Electrochemical Microscopy Gang Xiao Brown University Providence, RI Magnetism and Magnetic Measurements, Introduction
COMMON CONCEPTS

COMMON CONCEPTS IN MATERIALS CHARACTERIZATION, INTRODUCTION
As Characterization of Materials evolves, additional common concepts will be added. However, when it seems more appropriate, such content will appear more closely tied to its primary topical chapter.
From a tutorial standpoint, one may view this chapter as a good preparatory entrance to subsequent chapters of Characterization of Materials. In an educational setting, the generally applicable topics of the units in this chapter can play such a role, notwithstanding that they are each quite independent without having been sequenced with any pedagogical thread in mind. In practice, we expect that each unit of this chapter will be separately valuable to users of Characterization of Materials as they choose to refer to it for concepts underlying many of those presented in units covering specific measurement methods. Of course, not every topic covered by a unit in this chapter will be relevant to every measurement method covered in subsequent chapters. However, the concepts in this chapter are sufficiently common to appear repeatedly in the pursuit of materials research. It can be argued that the units treating vacuum techniques, thermometry, and sample preparation do not deal directly with the materials properties to be measured at all. Rather, they are crucial to preparation and implementation of such a measurement. It is interesting to note that the properties of materials nevertheless play absolutely crucial roles for each of these topics as they rely on materials performance to accomplish their ends. Mass/density measurement does of course relate to a most basic materials property, but is itself more likely to be an ancillary necessity of a measurement protocol than to be the end goal of a measurement (with the important exceptions of properties related to porosity, defect density, etc.). In temperature and mass measurement, appreciating the role of standards and definitions is central to proper use of these parameters. It is hard to think of a materials property that does not depend on the crystal structure of the materials in question.
Whether the structure is a known part of the explanation of the value of another property or its determination is itself the object of the measurement, a good grounding in essentials of crystallographic groups and syntax is a common need in most measurement circumstances. A unit provided in this chapter serves that purpose well. Several chapters in Characterization of Materials deal with impingement of projectiles of one kind or another on a sample, the reaction to which reflects properties of interest in the target. Describing the scattering of the projectiles is necessary in all these cases. Many concepts in such a description are similar regardless of projectile type, while the details differ greatly among ions, electrons, neutrons, and photons. Although the particle scattering unit in this chapter emphasizes the charged particle and ions in particular, the concepts are somewhat portable. A good deal of generic scattering background is provided in the chapters covering neutrons, x rays, and electrons as projectiles as well.
ELTON N. KAUFMANN
GENERAL VACUUM TECHNIQUES

INTRODUCTION

In this unit we discuss the procedures and equipment used to maintain a vacuum system at pressures in the range from 10⁻³ to 10⁻¹¹ torr. Total and partial pressure gauges used in this range are also described. Because there is a wide variety of equipment, we describe each of the various components, including details of their principles and technique of operation, as well as their recommended uses. SI units are not used in this unit. The American Vacuum Society attempted their introduction many years ago, but the more traditional units continue to dominate in this field in North America. Our usage will be consistent with that generally found in the current literature. The following units will be used. Pressure is given in torr; 1 torr is equivalent to 133.32 pascal (Pa). Volume is given in liters (L), and time in seconds (s). The flow of gas through a system, i.e., the "throughput" (Q), is given in torr-L/s. Pumping speed (S) and conductance (C) are given in L/s.
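The unit conventions above can be sketched in code. This is an illustrative helper, not part of the original text; the conversion factor is the one quoted in this unit (1 torr = 133.32 Pa), and the function names are my own.

```python
# Unit conventions used throughout this unit: pressure in torr,
# volume in liters, time in seconds. 1 torr = 133.32 Pa (from the text).

TORR_TO_PA = 133.32

def torr_to_pa(p_torr):
    """Convert a pressure from torr to pascal."""
    return p_torr * TORR_TO_PA

def throughput_torr_l_s(p_torr, volumetric_flow_l_s):
    """Throughput Q (torr-L/s) of a gas stream at pressure p_torr
    moving at a volumetric rate of volumetric_flow_l_s (L/s)."""
    return p_torr * volumetric_flow_l_s

print(torr_to_pa(1.0))     # 133.32 Pa
print(torr_to_pa(1e-3))    # ~0.133 Pa
```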
PRINCIPLES OF VACUUM TECHNOLOGY

The most difficult step in designing and building a vacuum system is defining precisely the conditions required to fulfill the purpose at hand. Important factors to consider include:

1. The required system operating pressure and the gaseous impurities that must be avoided;

2. The frequency with which the system must be vented to the atmosphere, and the required recycling time;

3. The kind of access to the vacuum system needed for the insertion or removal of samples.

For systems operating at pressures of 10⁻⁶ to 10⁻⁷ torr, venting the system is the simplest way to gain access, but for ultrahigh vacuum (UHV), e.g., below 10⁻⁸ torr, the pumpdown time can be very long, and system bakeout would usually be required. A vacuum load-lock antechamber for the introduction and removal of samples may be essential in such applications.
Because it is difficult to address all of the above questions, a viable specification of system performance is often neglected, and it is all too easy to assemble a more sophisticated and expensive system than necessary, or, if budgets are low, to compromise on an inadequate system that cannot easily be upgraded. Before any discussion of the specific components of a vacuum system, it is instructive to consider the factors that govern the ultimate, or base, pressure. The pressure can be calculated from

P = Q/S    (1)
where P is the pressure in torr, Q is the total flow, or throughput, of gas in torr-L/s, and S is the pumping speed in L/s. The influx of gas, Q, can be a combination of a deliberate influx of process gas from an exterior source and gas originating in the system itself. With no external source, the base pressure achieved is frequently used as the principal indicator of system performance. The most important internal sources of gas are outgassing from the walls and permeation from the atmosphere, most frequently through elastomer O-rings. There may also be leaks, but these can readily be reduced to negligible levels by proper system design and construction. Vacuum pumps also contribute to background pressure, and here again careful selection and operation will minimize such problems.

The Problem of Outgassing

Of the sources of gas described above, outgassing is often the most important. With a new system, the origin of outgassing may be in the manufacture of the materials used in construction, in handling during construction, and in exposure of the system to the atmosphere. In general these sources scale with the area of the system walls, so that it is wise to minimize the surface area and to avoid porous materials in construction. For example, aluminum is an excellent choice for use in vacuum systems, but anodized aluminum has a porous oxide layer that provides an internal surface for gas adsorption many times greater than the apparent surface, making it much less suitable for use in vacuum. The rate of outgassing in a new, unbaked system, fabricated from materials such as aluminum and stainless steel, is initially very high, on the order of 10⁻⁶ to 10⁻⁷ torr-L/s per cm² of surface area after one hour of exposure to vacuum (O'Hanlon, 1989). With continued pumping, the rate falls by one or two orders of magnitude during the first 24 hr, but thereafter drops very slowly over many months. Typically the main residual gas is water vapor.
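Equation 1 can be illustrated numerically. The outgassing rate below is the upper end of the range quoted in the text for a new, unbaked system after 1 hr of pumping; the chamber wall area and pumping speed are hypothetical values chosen only for the sketch.

```python
# Base-pressure estimate from Equation 1, P = Q/S.
# q is from the text (new, unbaked Al/stainless system, 1 hr of pumping);
# the wall area and effective pumping speed are illustrative assumptions.

q = 1e-6        # torr-L/s per cm^2, outgassing rate per unit area
area = 5000     # cm^2 of internal wall (hypothetical chamber)
S = 500         # L/s effective pumping speed (hypothetical pump)

Q = q * area    # total gas load, torr-L/s
P = Q / S       # base pressure, torr

print(f"Q = {Q:.1e} torr-L/s, P = {P:.1e} torr")  # P = 1.0e-05 torr
```

Even with a generous 500 L/s of pumping, the unbaked system sits in the 10⁻⁵ torr range until the outgassing rate falls, which is the point the text makes about reducing outgassing rather than adding pumping speed.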
In a clean vacuum system, operating at ambient temperature and containing only a moderate number of O-rings, the lowest achievable pressure is usually 10⁻⁷ to mid-10⁻⁸ torr. The limiting factor is generally residual outgassing, not the capability of the high-vacuum pump. The outgassing load is highest when a new system is put into service, but with steady use the sins of construction are slowly erased, and on each subsequent evacuation, the system will reach its typical base pressure more
rapidly. However, water will persist as the major outgassing load. Every time a system is vented to air, the walls are exposed to moisture and one or more layers of water will adsorb virtually instantaneously. The amount adsorbed will be greatest when the relative humidity is high, increasing the time needed to reach base pressure. Water is bound by physical adsorption, a reversible process, but the binding energy of adsorption is so great that the rate of desorption is slow at ambient temperature. Physical adsorption involves van der Waals forces, which are relatively weak. Physical adsorption should be distinguished from chemisorption, which typically involves the formation of chemical-type bonding of a gas to an atomically clean surface—for example, oxygen on a stainless steel surface. Chemisorption of gas is irreversible under all conditions normally encountered in a vacuum system. After the first few minutes of pumping, pressures are almost always in the free molecular flow regime, and when a water molecule is desorbed, it experiences only collisions with the walls, rather than with other molecules. Consequently, as it leaves the system, it is readsorbed many times, and on each occasion desorption is a slow process. One way of accelerating the removal of adsorbed water is by purging at a pressure in the viscous flow region, using a dry gas such as nitrogen or argon. Under viscous flow conditions, the desorbed water molecules rarely reach the system walls, and readsorption is greatly reduced. A second method is to heat the system above its normal operating temperature. Any process that reduces the adsorption of water in a vacuum system will improve the rate of pumpdown. The simplest procedure is to vent a vacuum system with a dry gas rather than with atmospheric air, and to minimize the time the system remains open following such a procedure. Dry air will work well, but it is usually more convenient to substitute nitrogen or argon.
From Equation 1, it is evident that there are two approaches to achieving a lower ultimate pressure, and hence a low impurity level, in a system. The first is to increase the effective pumping speed, and the second is to reduce the outgassing rate. There are severe limitations to the first approach. In a typical system, most of one wall of the chamber will be occupied by the connection to the high-vacuum pump; this limits the size of pump that can be used, imposing an upper limit on the achievable pumping speed. As already noted, the ultimate pressure achieved in an unbaked system having this configuration will rarely reach the mid-10⁻⁸ torr range. Even if one could mount a similar-sized pump on every side, the best to be expected would be a 6-fold improvement, achieving a base pressure barely into the 10⁻⁹ torr range, even after very long exhaust times. It is evident that, to routinely reach pressures in the 10⁻¹⁰ torr range in a realistic period of time, a reduction in the rate of outgassing is necessary—e.g., by heating the vacuum system. Baking an entire system to 400°C for 16 hr can produce outgassing rates of 10⁻¹⁵ torr-L/s per cm² (Alpert, 1959), a reduction of 10⁸ from those found after 1 hr of pumping at ambient temperature. The magnitude of this reduction shows that as large a portion as
possible of a system should be heated to obtain maximum advantage.

PRACTICAL ASPECTS OF VACUUM TECHNOLOGY

Vacuum Pumps

The operation of most vacuum systems can be divided into two regimes. The first involves pumping the system from atmosphere to a pressure at which a high-vacuum pump can be brought into operation. This is traditionally known as the rough vacuum regime and the pumps used are commonly referred to as roughing pumps. Clearly, a system that operates at an ultimate pressure within the capability of the roughing pump will require no additional pumps. Once the system has been roughed down, a high-vacuum pump must be used to achieve lower pressures. If the high-vacuum pump is the type known as a transfer pump, such as a diffusion or turbomolecular pump, it will require the continuous support of the roughing pump in order to maintain the pressure at the exit of the high-vacuum pump at a tolerable level (in this phase of the pumping operation the function of the roughing pump has changed, and it is frequently referred to as a backing or forepump). Transfer pumps have the advantage that their capacity for continuous pumping of gas, within their operating pressure range, is limited only by their reliability. They do not accumulate gas, an important consideration where hazardous gases are involved. Note that the reliability of transfer pumping systems depends upon the satisfactory performance of two separate pumps. A second class of pumps, known collectively as capture pumps, require no further support from a roughing pump once they have started to pump. Examples of this class are cryogenic pumps and sputter-ion pumps. These types of pump have the advantage that the vacuum system is isolated from the atmosphere, so that system operation depends upon the reliability of only one pump. Their disadvantage is that they can provide only limited storage of pumped gas, and as that limit is reached, pumping will deteriorate.
The effect of such a limitation is quite different for the two examples cited. A cryogenic pump can be totally regenerated by a brief purging at ambient temperature, but a sputter-ion pump requires replacement of its internal components. One aspect of the cryopump that should not be overlooked is that hazardous gases are stored, unchanged, within the pump, so that an unexpected failure of the pump can release these accumulated gases, requiring provision for their automatic safe dispersal in such an emergency.

Roughing Pumps

Two classes of roughing pumps are in use. The first type, the oil-sealed mechanical pump, is by far the most common, but because of the enormous concern in the semiconductor industry about oil contamination, a second type, the so-called "dry" pump, is now frequently used. In this context, "dry" implies the absence of volatile organics in the part of the pump that communicates with the vacuum system.
Oil-Sealed Pumps

The earliest roughing pumps used either a piston or liquid to displace the gas. The first production methods for incandescent lamps used such pumps, and the development of the oil-sealed mechanical pump by Gaede, around 1907, was driven by the need to accelerate the pumping process.

Applications. The modern versions of this pump are the most economic and convenient for achieving pressures as low as the 10⁻⁴ torr range. The pumps are widely used as a backing pump for both diffusion and turbomolecular pumps; in this application the backstreaming of mechanical pump oil is intercepted by the high-vacuum pump, and a foreline trap is not required.

Operating Principles. The oil-sealed pump is a positive-displacement pump, of either the vane or piston type, with a compression ratio of the order of 10⁵:1 (Dobrowolski, 1979). It is available as a single or two-stage pump, capable of reaching base pressures in the 10⁻² and 10⁻⁴ torr range, respectively. The pump uses oil to maintain sealing, and to provide lubrication and heat transfer, particularly at the contact between the sliding vanes and the pump wall. Oil also serves to fill the significant dead space leading to the exhaust valve, essentially functioning as a hydraulic valve lifter and permitting the very high compression ratio. The speed of such pumps is often quoted as the "free-air displacement," which is simply the volume swept by the pump rotor. In a typical two-stage pump this speed is sustained down to 1 × 10⁻¹ torr; below this pressure the speed decreases, reaching zero in the 10⁻⁵ torr range. If a pump is to sustain pressures near the bottom of its range, the required pump size must be determined from published pumping-speed performance data. It should be noted that mechanical pumps have relatively small pumping speed, at least when compared with typical high-vacuum pumps.
A typical laboratory-sized pump, powered by a 1/3 hp motor, may have a speed of 3.5 cubic feet per minute (cfm), or rather less than 2 L/s, as compared to the smallest turbomolecular pump, which has a rated speed of 50 L/s.

Avoiding Oil Contamination from an Oil-Sealed Mechanical Pump. The versatility and reliability of the oil-sealed mechanical pump carries with it a serious penalty. When used improperly, contamination of the vacuum system is inevitable. These pumps are probably the most prevalent source of oil contamination in vacuum systems. The problem arises when they are untrapped and pump a system down to its ultimate pressure, often in the free molecular flow regime. In this regime, oil molecules flow freely into the vacuum chamber. The problem can readily be avoided by careful control of the pumping procedures, but possible system or operator malfunction, leading to contamination, must be considered. For many years, it was common practice to leave a system in the standby condition evacuated only by an untrapped mechanical pump, making contamination inevitable.
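The speed comparison quoted above (3.5 cfm for a small mechanical pump vs. 50 L/s for the smallest turbomolecular pump) is easy to check; this small sketch uses the standard conversion 1 ft³ = 28.317 L.

```python
# Mechanical pump speeds are traditionally quoted in cubic feet per
# minute (cfm); high-vacuum pump speeds in L/s. 1 ft^3 = 28.317 L.

def cfm_to_l_per_s(cfm):
    """Convert a pumping speed from cfm to liters per second."""
    return cfm * 28.317 / 60.0

print(round(cfm_to_l_per_s(3.5), 2))   # ~1.65 L/s, "rather less than 2 L/s"
```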
Mechanical pump oil has a vapor pressure, at room temperature, in the low 10⁻⁵ torr range when first installed, but this rapidly deteriorates by up to two orders of magnitude as the pump is operated (Holland, 1971). A pump operates at temperatures of 60°C, or higher, so the oil vapor pressure far exceeds 10⁻³ torr, and evaporation results in a substantial flux of oil into the roughing line. When a system at atmospheric pressure is connected to the mechanical pump, the initial gas flow from the vacuum chamber is in the viscous flow regime, and oil molecules are driven back to the pump by collisions with the gas being exhausted (Holland, 1971; Lewin, 1985). Provided the roughing process is terminated while the gas flow is still in the viscous flow regime, no significant contamination of the vacuum chamber will occur. The condition for viscous flow is given by the equation

PD > 0.5    (2)
where P is the pressure in torr and D is the internal diameter of the roughing line in centimeters. Termination of the roughing process in the viscous flow region is entirely practical when the high-vacuum pump is either a turbomolecular or modern diffusion pump (see precautions discussed under Diffusion Pumps and Turbomolecular Pumps, below). Once these pumps are in operation, they function as an effective barrier against oil migration into the system from the forepump. Hoffman (1979) has described the use of a continuous gas purge on the foreline of a diffusion-pumped system as a means of avoiding backstreaming from the forepump.

Foreline Traps. A foreline trap is a second approach to preventing oil backstreaming. If a liquid nitrogen-cooled trap is always in place between a forepump and the vacuum chamber, cleanliness is assured. But the operative word is "always." If the trap warms to ambient temperature, oil from the trap will migrate upstream, and this is much more serious if it occurs while the line is evacuated. A different class of trap uses an adsorbent for oil. Typical adsorbents are activated alumina, molecular sieve (a synthetic zeolite), a proprietary ceramic (Micromaze foreline traps; Kurt J. Lesker Co.), and metal wool. The metal wool traps have much less capacity than the other types, and unless there is evidence of their efficacy, they are best avoided. Published data show that activated alumina can trap 99% of the backstreaming oil molecules (Fulker, 1968). However, one must know when such traps should be reactivated. Unequivocal determination requires insertion of an oil-detection device, such as a mass spectrometer, on the foreline. The saturation time of a trap depends upon the rate of oil influx, which in turn depends upon the vapor pressure of oil in the pump and the conductance of the line between pump and trap. The only safe procedure is frequent reactivation of traps on a conservative schedule.
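The viscous-flow condition of Equation 2 can be turned into a practical cutoff: the pressure at which roughing should be terminated for a given line diameter. The 2.5 cm line diameter below is a hypothetical example, not a value from the text.

```python
# Equation 2: flow in a roughing line stays viscous while P*D > 0.5,
# with P in torr and D in cm. To avoid oil backstreaming from an
# untrapped mechanical pump, stop roughing before P drops below this.

def viscous_flow_cutoff_torr(diameter_cm):
    """Lowest pressure (torr) at which flow in a line of the given
    internal diameter is still viscous, per Equation 2."""
    return 0.5 / diameter_cm

# Hypothetical roughing line with a 2.5 cm internal diameter:
print(viscous_flow_cutoff_torr(2.5))   # 0.2 torr
```

A wider roughing line pushes the cutoff to lower pressure, so the crossover to the high-vacuum pump can be made later without risking oil migration.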
Reactivation may be done by venting the system, replacing the adsorbent with a new charge, or by baking the adsorbent in a stream of dry air or inert gas to a temperature of 300°C for several hours. Some traps can be regenerated by heating in situ, but only using a stream of inert gas, at a pressure in the viscous flow region,
flowing from the system side of the trap to the pump (D.J. Santeler, pers. comm.). The foreline is isolated from the rest of the system and the gas flow is continued throughout the heating cycle, until the trap has cooled back to ambient temperature. An adsorbent foreline trap must be optically dense, so the oil molecules have no path past the adsorbent; commercial traps do not always fulfill this basic requirement. Where regeneration of the foreline trap has been totally neglected, acceptable performance may still be achieved simply because a diffusion pump or turbomolecular pump serves as the true ‘‘trap,’’ intercepting the oil from the forepump. Oil contamination can also result from improperly turning a pump off. If it is stopped and left under vacuum, oil frequently leaks slowly across the exhaust valve into the pump. When it is partially filled with oil, a hydraulic lock may prevent the pump from starting. Continued leakage will drive oil into the vacuum system itself; an interesting procedure for recovery from such a catastrophe has been described (Hoffman, 1979). Whenever the pump is stopped, either deliberately or by power failure or other failure, automatic controls that first isolate it from the vacuum system, and then vent it to atmospheric pressure, should be used. Most gases exhausted from a system, including oxygen and nitrogen, are readily removed from the pump oil, but some can liquify under maximum compression just before the exhaust valve opens. Such liquids mix with the oil and are more difficult to remove. They include water and solvents frequently used to clean system components. When pumping large volumes of air from a vacuum chamber, particularly during periods of high humidity (or whenever solvent residues are present), it is advantageous to use a gas-ballast feature commonly fitted to two-stage and also to some single-stage pumps. 
This feature admits air during the final stage of compression, raising the pressure and forcing the exhaust valve to open before the partial pressure of water has reached saturation. The ballast feature minimizes pump contamination and reduces pumpdown time for a chamber exposed to humid air, although at the cost of about ten-times-poorer base pressure.

Oil-Free ("Dry") Pumps

Many different types of oil-free pumps are available. We will emphasize those that are most useful in analytical and diagnostic applications.

Diaphragm Pumps

Applications: Diaphragm pumps are increasingly used where the absence of oil is an imperative, for example, as the forepump for compound turbomolecular pumps that incorporate a molecular drag stage. The combination renders oil contamination very unlikely. Most diaphragm pumps have relatively small pumping speeds. They are adequate once the system pressure reaches the operating range of a turbomolecular pump, usually well below 10⁻² torr, but not for rapidly roughing down a large volume. Pumps are available with speeds up to several liters per second, and base pressures from a few torr to as low as 10⁻³ torr, lower ultimate pressures being associated with the lower-speed pumps.
Operating Principles: Four diaphragm modules are often arranged in three separate pumping stages, with the lowest-pressure stage served by two modules in tandem to boost the capacity. Single modules are adequate for subsequent stages, since the gas has already been compressed to a smaller volume. Each module uses a flexible diaphragm of Viton or other elastomer, as well as inlet and outlet valves. In some pumps the modules can be arranged to provide four stages of pumping, providing a lower base pressure, but at lower pumping speed because only a single module is employed for the first stage. The major required maintenance in such pumps is replacement of the diaphragm after 10,000 to 15,000 hr of operation.

Scroll Pumps

Applications: Scroll pumps (Coffin, 1982; Hablanian, 1997) are used in some refrigeration systems, where the limited number of moving parts is reputed to provide high reliability. The most recent versions introduced for general vacuum applications have the advantages of diaphragm pumps, but with higher pumping speed. Published speeds on the order of 10 L/s and base pressures below 10⁻² torr make this an appealing combination. Speeds decline rapidly at pressures below 2 × 10⁻² torr.

Operating Principles: Scroll pumps use two enmeshed spiral components, one fixed and the other orbiting. Successive crescent-shaped segments of gas are trapped between the two scrolls and compressed from the inlet (vacuum side) toward the exit, where they are vented to the atmosphere. A sophisticated and expensive version of this pump has long been used for processes where leak-tight operation and noncontamination are essential, for example, in the nuclear industry for pumping radioactive gases. An excellent description of the characteristics of this design has been given by Coffin (1982). In this version, extremely close tolerances (10 µm) between the two scrolls minimize leakage between the high- and low-pressure ends of the scrolls.
The more recent pump designs, which substitute Teflon-like seals for the close tolerances, have made the pump an affordable option for general oil-free applications. The life of the seals is reported to be in the same range as that of the diaphragm in a diaphragm pump.

Screw Compressor: Although not yet widely used, pumps based on the principle of the screw compressor, such as that used to supercharge some high-performance cars, appear to offer interesting advantages: pumping speeds in excess of 10 L/s, direct discharge to the atmosphere, and ultimate pressures in the 10⁻³ torr range. If such pumps demonstrate high reliability in diverse applications, they will constitute the closest single-unit "dry" alternative to the oil-sealed mechanical pump.

Molecular Drag Pump

Applications: The molecular drag pump is useful for applications requiring pressures in the 1 to 10⁻⁷ torr range and freedom from organic contamination. Over this range the pump permits a far higher throughput of gas than a standard turbomolecular pump. It has also
been used in the compound turbomolecular pump as an integral backing stage. This will be discussed in detail under Turbomolecular Pumps.

Operating Principles: The pump uses one or more drums rotating at speeds as high as 90,000 rpm inside stationary, coaxial housings. The clearance between drum and housing is 0.3 mm. Gas is dragged in the direction of rotation by momentum transfer to the pump exit along helical grooves machined in the housing. The bearings of these devices are similar to those in turbomolecular pumps (see discussion of Turbomolecular Pumps, below). An internal motor avoids the difficulties inherent in a high-speed vacuum seal. A typical pump uses two or more separate stages, arranged in series, providing a compression ratio as high as 10⁷:1 for air, but typically less than 10³:1 for hydrogen. It must be supported by a backing pump, often of the diaphragm type, that can maintain the forepressure below a critical value, typically 10 to 30 torr, depending upon the particular design. The much lower compression ratio for hydrogen, a characteristic shared by all turbomolecular pumps, will increase its percentage in a vacuum chamber, a factor to consider in rare cases where the presence of hydrogen affects the application.

Sorption Pumps

Applications: Sorption pumps were introduced for roughing down ultrahigh vacuum systems prior to turning on a sputter-ion pump (Welch, 1991). The pumping speed of a typical sorption pump is similar to that of a small oil-sealed mechanical pump, but they are rather awkward in application. This is of little concern in a vacuum system likely to run many months before venting to the atmosphere. Occasional inconvenience is a small price for the ultimate in contamination-free operation.

Operating Principles: A typical sorption pump is a canister containing 3 lb of a molecular sieve material that is cooled to liquid nitrogen temperature.
Under these conditions the molecular sieve can adsorb 7.6 × 10⁴ torr-liters of most atmospheric gases; exceptions are helium and hydrogen, which are not significantly adsorbed, and neon, which is adsorbed only to a limited extent. Together these gases, if not pumped, would leave a residual pressure in the 10⁻² torr range. This is too high to guarantee the trouble-free start of a sputter-ion pump, but the problem is readily avoided. For example, a sorption pump connected to a vacuum chamber of 100 L volume exhausts air to a pressure in the viscous flow region, say 5 torr, and then is valved off. The nonadsorbing gases are swept into the pump along with the adsorbed gases; the pump now contains a fraction (760 − 5)/760, or 99.3%, of the nonadsorbable gases originally present, leaving hydrogen, helium, and neon in the low 10⁻⁴ torr range in the vacuum chamber. A second sorption pump on the vacuum chamber will then readily achieve a base pressure below 5 × 10⁻⁴ torr, quite adequate to start even a recalcitrant ion pump.

High-Vacuum Pumps

Four types of high-vacuum pumps are in general use: diffusion, turbomolecular, cryosorption, and sputter-ion.
COMMON CONCEPTS
Each of these classes has advantages, and also some problems, and it is vital to consider both sides for a particular application. Any of these pumps can be used to reach ultimate pressures in the ultrahigh vacuum region and to maintain a working chamber that is substantially free from organic contamination. The choice of system rests primarily on the ease and reliability of operation in a particular environment, and inevitably on the capital and running costs.

Diffusion Pumps

Applications: The practical diffusion pump was invented by Langmuir in 1916, and it is the most common high-vacuum pump when all vacuum applications are considered. It is far less dominant where avoidance of organic contamination is essential. Diffusion pumps are available in a wide range of sizes, with speeds of up to 50,000 L/s; for such high-speed pumping only the cryopump seriously competes. A diffusion pump can give satisfactory service in a number of situations. One such case is a large system in which cleanliness is not critical. Contamination problems of diffusion-pumped systems have actually been somewhat overstated: commercial processes using highly reactive metals are routinely performed using diffusion pumps. When funds are scarce, a diffusion pump, which incurs the lowest capital cost of any of the high-vacuum alternatives, is often selected. The continuing costs of operation, however, are higher than for the other pumps, a factor not often considered. An excellent detailed discussion of diffusion pumps is available (Hablanian, 1995).

Operating Principles: A diffusion pump normally contains three or more oil jets operating in series. It can be operated at a maximum inlet pressure of 1 × 10⁻³ torr and maintains a stable pumping speed down to 10⁻¹⁰ torr or lower. As a transfer pump, the total amount of gas it can pump is limited only by its reliability, and accumulation of any hazardous gas is not a problem. However, there are a number of key requirements in maintaining its operation.
First, the outlet of the pump must be kept below some maximum pressure, which can, however, be as high as the mid-10⁻¹ torr range. If the pressure exceeds this limit, all oil jets in the pump collapse and pumping stops. Consequently the forepump (often called the backing pump) must operate continuously. Other services that must be maintained without interruption include water or air cooling, electrical power to the heater, and refrigeration, if a trap is used, to prevent oil backstreaming. A major drawback of this type of pump is the number of such criteria. The pump oil undergoes continuous thermal degradation. However, the extent of such degradation is small, and an oil charge can last for many years. Oil decomposition products have considerably higher vapor pressures than their parent molecules. Therefore modern pumps are designed to continuously purify the working fluid, ejecting decomposition products toward the forepump. In addition, any forepump oil reaching the diffusion pump has a much higher vapor pressure than the working fluid, and it too must be ejected. The purification mechanism
primarily involves the oil from the pump jet, which is cooled at the pump wall and returns, by gravity, to the boiler. The cooled region of the wall extends only to the lowest pumping jet, below which the returning oil is heated by conduction from the boiler, boiling off any volatile fraction so that it flows toward the forepump. This process is greatly enhanced if the pump is fitted with an ejector jet directed toward the foreline; the jet exhausts the volume directly over the boiler, where the decomposition fragments are vaporized. A second step to minimize the effect of oil decomposition is to design the heater and supply tubes to the jets so that the uppermost jet, i.e., that closest to the vacuum chamber, is supplied with the highest-boiling-point oil fraction. This oil, when condensed on the upper end of the pump wall, has the lowest possible vapor pressure. It is this film of oil that is a major source of backstreaming into the vacuum chamber. The selection of the oil used is important (O'Hanlon, 1989). If minimum backstreaming is essential, one can select an oil that has a very low vapor pressure at room temperature. A polyphenyl ether, such as Santovac 5, or a silicone oil, such as DC705, would be appropriate. However, for the most oil-sensitive applications, it is wise to use a liquid nitrogen (LN2) temperature trap between pump and vacuum chamber. Any cold trap will reduce the system base pressure, primarily by pumping water vapor, but to reduce oil to a partial pressure well below 10⁻¹¹ torr it is essential that molecules make at least two collisions with surfaces at LN2 temperature. Such traps are thermally isolated from ambient temperature and need cryogen refills only every 8 hr or more. With such a trap, the vapor pressure of the pump oil is secondary, and a less expensive oil may be used.
If a pump is exposed to substantial flows of reactive gases or to oxygen, either because of a process gas flow or because the chamber must be frequently pumped down after venting to air, the chemical stability of the oil is important. Silicone oils are very resistant to oxidation, while perfluorinated oils are stable against both oxygen and many reactive gases. When a vacuum chamber includes devices such as mass spectrometers, which depend upon maintaining uniform electrical potential on electrodes, silicone oils can be a problem, because on decomposition they may deposit insulating films on electrodes. Operating Procedures: A vacuum chamber free from organic contamination pumped by a diffusion pump requires stringent operating procedures. While the pump is warming, high backstreaming occurs until all jets are in full operation, so the chamber must be protected during this phase, either by a LN2 trap, before the pressure falls below the viscous flow regime, or by an isolation valve. The chamber must be roughed down to some predetermined pressure before opening to the diffusion pump. This cross-over pressure requires careful consideration. Procedures to minimize the backstreaming for the frequently used oil-sealed mechanical pump have already been discussed (see Oil-Sealed Pumps). If a trap is used, one can safely rough down the chamber to the ultimate pressure of the pump. Alternatively, backstreaming can be minimized
by limiting the exhaust to the viscous flow regime. This procedure presents a potential problem. The vacuum chamber will be left at a pressure in the 10⁻¹ torr range, but sustained operation of the diffusion pump must be avoided when its inlet pressure exceeds 10⁻³ torr. Clearly, the moment the isolation valve between the diffusion pump and the roughed-down vacuum chamber is opened, the pump will suffer an overload of at least two decades in pressure. In this condition, the upper jet of the pump will be overwhelmed and backstreaming will rise. If the diffusion pump is operated with a LN2 trap, this backstreaming will be intercepted. But even with an untrapped diffusion pump, the overload condition rarely lasts more than 10 to 20 s, because the pumping speed of a diffusion pump is very high even with one inoperative jet. Consequently, the backstreaming from roughing and high-vacuum pumps remains acceptable for many applications. Where large numbers of different operators use a system, fully automatic sequencing and safety interlocks are recommended to reduce the possibility of operator error. Diffusion pumps are best avoided if simplicity of operation is essential and freedom from organic contamination is paramount.

Turbomolecular Pumps

Applications: Turbomolecular pumps were introduced in 1958 (Becker, 1959) and were immediately hailed as the solution to all of the problems of the diffusion pump. Provided that recommended procedures are used, these pumps live up to the original high expectations. They are reliable, general-purpose pumps requiring simple operating procedures and capable of maintaining clean vacuum down to the 10⁻¹⁰ torr range. Pumping speeds up to 10,000 L/s are available.

Operating Principles: The pump is a multistage axial compressor, operating at rotational speeds from around 20,000 to 90,000 rpm. The drive motor is mounted inside the pump housing, avoiding the shaft seal needed with an external drive.
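The rotational speeds quoted above follow from the physics of momentum transfer: blade tip speeds must be comparable to the mean thermal speed of the gas molecules being pumped. A quick kinetic-theory check, using an assumed (illustrative) rotor radius of 5 cm:

```python
import math

def tip_speed(rpm, radius_m):
    """Blade tip speed (m/s) for a rotor of the given radius."""
    return 2 * math.pi * (rpm / 60.0) * radius_m

def mean_thermal_speed(molar_mass_kg, temp_k=293.0):
    """Mean molecular speed from kinetic theory: v = sqrt(8RT / (pi * M))."""
    R = 8.314  # J/(mol*K)
    return math.sqrt(8 * R * temp_k / (math.pi * molar_mass_kg))

# Illustrative 5-cm rotor radius at the high end of the quoted speed range
v_tip = tip_speed(90_000, 0.05)   # ~471 m/s
v_n2 = mean_thermal_speed(0.028)  # nitrogen, ~471 m/s
v_h2 = mean_thermal_speed(0.002)  # hydrogen, ~1760 m/s
```

The near match for nitrogen, and the large shortfall for hydrogen, is the kinetic-theory reason for the much poorer hydrogen compression ratios of turbomolecular pumps noted elsewhere in this unit.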
Modern power supplies sense excessive loading of the motor, as when operating at too high an inlet pressure, and reduce the motor speed to avoid overheating and possible failure. Occasional failure of the frequency control in the supply has resulted in excessive speeds and catastrophic failure of the rotor. At high speeds, the dominant problem is maintenance of the rotational bearings. Careful balancing of the rotor is essential; in some models bearings can be replaced in the field, if rigorous cleanliness is assured, preferably in a clean environment such as a laminar-flow hood. In other designs, the pump must be returned to the manufacturer for bearing replacement and rotor rebalancing. This service factor should be considered in selecting a turbomolecular pump, since few facilities can keep a replacement pump on hand. Several different types of bearings are common in turbomolecular pumps:

1. Oil Lubrication: All first-generation pumps used oil-lubricated bearings that often lasted several
years in continuous operation. These pumps were mounted horizontally with the gas inlet between two sets of blades. The bearings were at the ends of the rotor shaft, on the forevacuum side. This type of pump, and the magnetically levitated designs discussed below, offer minimum vibration. Second-generation pumps are vertically mounted and single-ended. This is more compact, facilitating easy replacement of a diffusion pump. Many of these pumps rely on gravity return of lubrication oil to the reservoir and thus require vertical orientation. Using a wick as the oil reservoir both localizes the liquid and allows more flexible pump orientation.

2. Grease Lubrication: A low-vapor-pressure grease lubricant was introduced to reduce transport of oil into the vacuum chamber (Osterstrom, 1979) and to permit orientation of the pump in any direction. Grease has lower frictional loss and allows a lower-power drive motor, with a consequent drop in operating temperature.

3. Ceramic Ball Bearings: Most bearings now use a ceramic-balls/steel-race combination; the lighter balls reduce centrifugal forces and the ceramic-to-steel interface minimizes galling. There appears to be a significant improvement in bearing life for both oil and grease lubrication systems.

4. Magnetic Bearings: Magnetic suspension systems have two advantages: a non-contact bearing with a potentially unlimited life, and very low vibration. First-generation pumps used electromagnetic suspension with a battery backup. When nickel-cadmium batteries were used, this backup was not continuously available; incomplete discharge before recharging cycles often reduced discharge capacity. A second generation using permanent magnets was more reliable and of lower cost. Some pumps now offer an improved electromagnetic suspension with better active balancing of the rotor on all axes. In some designs, the motor is used as a generator when power is interrupted, to assure safe shutdown of the magnetic suspension system.
Magnetic bearing pumps use a second set of "touch-down" bearings for support when the pump is stationary. These bearings use a solid, low-vapor-pressure lubricant (O'Hanlon, 1989) and further protect the pump in an emergency. The life of the touch-down bearings is limited, and their replacement may be a nuisance; it is, however, preferable to replacing a shattered pump rotor and stator assembly.

5. Combination Bearing Systems: Some designs use combinations of different types of bearings. One example uses a permanent-magnet bearing at the high-vacuum end and an oil-lubricated bearing at the forevacuum end. A magnetic bearing does not contaminate the system and is not vulnerable to damage by aggressive gases, as a lubricated bearing is. Therefore it can be located at the very end of the rotor shaft, while the oil-fed bearing is at the opposite, forevacuum end. This geometry has the advantage of minimizing vibration.
Problems with Pumping Reactive Gases: Very reactive gases, common in the semiconductor industry, can cause rapid bearing failure. A purge with nonreactive gas, in the viscous flow regime, can prevent the pumped gases from contacting the bearings. To permit access to the bearing for a purge, some pump designs move the upper bearing below the turbine blades, which often cantilevers the center of mass of the rotor beyond the bearings. This may have been a contributing factor to the premature failures seen in some pump designs.

The turbomolecular pump shares many of the performance characteristics of the diffusion pump. In the standard construction, it cannot exhaust to atmospheric pressure and must be backed at all times by a forepump. The critical backing pressure is generally in the 10⁻¹ torr region or lower, and an oil-sealed mechanical pump is the most common choice. Failure to recognize the problem of oil contamination from this pump was a major factor in the problems with early applications of the turbomolecular pump. But, as with the diffusion pump, an operating turbomolecular pump prevents significant backstreaming from the forepump and its own bearings; a typical turbomolecular pump compression ratio for heavy oil molecules, 10¹²:1, ensures this. The key to avoiding oil contamination during evacuation is for the pump to reach its operating speed as soon as possible. In general, turbomolecular pumps can operate continuously at pressures as high as 10⁻² torr and maintain constant pumping speed to at least 10⁻¹⁰ torr. As the turbomolecular pump is a transfer pump, there is no accumulation of hazardous gas, and less concern with an emergency shutdown situation. The compression ratio is 10⁸:1 for nitrogen, but frequently below 1000:1 for hydrogen. Some first-generation pumps managed only 50:1 for hydrogen. Fortunately, the newer compound pumps, which add an integral molecular drag backing stage, often have compression ratios for hydrogen in excess of 10⁵:1.
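The practical meaning of these compression ratios can be sketched with a zero-flow estimate: the lowest partial pressure of a species that the pump can sustain at its inlet is roughly the foreline partial pressure divided by the compression ratio. The foreline partial pressures below are invented for illustration:

```python
def inlet_partial_pressure(foreline_partial_torr, compression_ratio):
    """Zero-flow estimate of the minimum inlet partial pressure a turbo
    pump can sustain: p_inlet = p_foreline / K."""
    return foreline_partial_torr / compression_ratio

# Assumed foreline partial pressures (illustrative): 1e-6 torr of oil vapor
# and 1e-6 torr of hydrogen.
p_oil = inlet_partial_pressure(1e-6, 1e12)        # ~1e-18 torr: oil is negligible
p_h2_first_gen = inlet_partial_pressure(1e-6, 50) # ~2e-8 torr: H2 dominates residual gas
p_h2_compound = inlet_partial_pressure(1e-6, 1e5) # ~1e-11 torr with a compound pump
```

The huge ratio for heavy oil molecules is why an operating turbo pump blocks forepump oil, while the low hydrogen ratio of early pumps is why hydrogen dominates their residual gas spectra.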
The large difference between hydrogen (and to a lesser extent helium) and gases such as nitrogen and oxygen leaves the residual gas in the chamber enriched in the lighter species. If a low residual hydrogen pressure is an important consideration, it may be necessary to provide supplementary pumping for this gas, such as a sublimation pump or nonevaporable getter (NEG), or to use a different class of pump. The demand for negligible organic contamination has led to the compound pump, comprising a standard turbomolecular stage backed by a molecular drag stage, mounted on a common shaft. Typically, a backing pressure of only 10 torr or higher, conveniently provided by an oil-free ("dry") diaphragm pump, is needed (see discussion of Oil-Free Pumps). In some versions, greased or oil-lubricated bearings are used (on the high-pressure side of the rotor); magnetic bearings are also available. Compound pumps provide an extremely low risk of oil contamination and significantly higher compression ratios for light gases.

Operation of a Turbomolecular Pump System: Freedom from organic contamination demands care during both the evacuation and venting processes. However, if a pump is
contaminated with oil, the cleanup requires disassembly and the use of solvents. The following is a recommended procedure for a system in which an untrapped oil-sealed mechanical roughing/backing pump is combined with an oil-lubricated turbomolecular pump, and an isolation valve is provided between the vacuum chamber and the turbomolecular pump.

1. Startup: Begin roughing down and turn on the pump as soon as possible without overloading the drive motor. With a modern electronically controlled supply, no delay is necessary, because the supply will adjust power to prevent overload while the pressure is high. With older power supplies, the turbomolecular pump should be started as soon as the pressure reaches a tolerable level, as given by the manufacturer, probably in the 10 torr region. A rapid startup ensures that the turbomolecular pump reaches at least 50% of its operating speed while the pressure in the foreline is still in the viscous flow regime, so that no oil backstreaming can enter the system through the turbomolecular pump. Before opening to the turbomolecular pump, the vacuum chamber should be roughed down using a procedure that avoids oil contamination, as was described for diffusion pump startup (see discussion above).

2. Venting: When the entire system is to be vented to atmospheric pressure, it is essential that the venting gas enter the turbomolecular pump at a point on the system side of any lubricated bearings in the pump. This ensures that oil, liquid or vapor, is swept away from the system toward the backing system. Some pumps have a vent midway along the turbine blades, while others have vents just above the upper, system-side bearings. If neither of these vent points is available, a valve must be provided on the vacuum chamber itself. Never vent the system from a point on the foreline of the turbomolecular pump; that can flush both mechanical pump oil and turbomolecular pump oil into the turbine rotor and stator blades and the vacuum chamber.
Venting is best started immediately after turning off the power to the turbomolecular pump, with the flow adjusted so that the chamber pressure rises into the viscous flow region within a minute or two. Too-rapid venting exposes the turbine blades to excessive pressure in the viscous flow regime, with an unnecessarily high upward force on the bearing assembly (often called the "helicopter" effect). When the chamber is vented frequently, the turbomolecular pump is usually left running, isolated from the chamber but connected to the forepump. The major maintenance is checking the oil or grease lubrication, as recommended by the pump manufacturer, and replacing the bearings as required. The stated life of bearings is often 2 years of continuous operation, though an actual life of 5 years is not uncommon. In some facilities, where multiple pumps are used in production, bearings are checked by monitoring the amplitude of the
vibration frequency associated with the bearings. A marked increase in amplitude indicates the approaching end of bearing life, and the pump is removed for maintenance.

Cryopumps

Applications: Cryopumping was first extensively used in the space program, where test chambers modeled the conditions encountered in outer space, notably the condition that any gas molecule leaving the vehicle rarely returns. This required all inside surfaces of the chamber to function as a pump, and led to liquid-helium-cooled shrouds in the chambers, on which gases condensed. This is very effective, but is not easily applicable to individual systems, given the expense and difficulty of handling liquid helium. However, the advent of reliable closed-cycle mechanical refrigeration systems achieving temperatures in the 10 to 20 K range allows reliable, contamination-free pumps, with a wide range of pumping speeds, capable of maintaining pressures as low as the 10⁻¹⁰ torr range (Welch, 1991). Cryopumps are general purpose and available with very high pumping speeds (using internally mounted cryopanels), so they work for all chamber sizes. These are capture pumps and, once operating, are totally isolated from the atmosphere. All pumped gas is stored in the body of the pump. They must be regenerated on a regular basis, but the quantity of gas pumped before regeneration is very large for all gases that are captured by condensation. Only helium, hydrogen, and neon are not effectively condensed; they must be captured by adsorption, for which the capacity is far smaller. Indeed, if any significant quantity of helium is pumped, regeneration would have to be so frequent that another type of pump should be selected. If the refrigeration fails, due to a power interruption or a mechanical failure, the pumped gas will be released within minutes. All pumps are fitted with a pressure relief valve to avoid explosion, but provision must be made for the safe disposal of any hazardous gases released.
Operating Principles: A cryopump uses a closed-cycle refrigeration system with helium as the working gas. An external compressor, incorporating a heat exchanger that is usually water-cooled, supplies helium at 300 psi to the cold head, which is mounted on the vacuum system. The helium is cooled by passing through a pair of regenerative heat exchangers in the cold head and then allowed to expand, a process which cools the incoming gas and, in turn, cools the heat exchangers as the low-pressure gas returns to the compressor. Over a period of several hours, the system develops two cold zones, nominally 80 and 15 K. The 80 K zone is used to cool a shroud through which gas molecules pass into the pump's interior; water vapor is pumped by this shroud, which also minimizes the heat load on the second-stage array from ambient-temperature radiation. Inside the shroud is an array at 15 K, on which most other gases are condensed. The energy available to maintain the 15 K temperature is just a few watts. The second stage should typically remain in the range 10 to 20 K, low enough to pump most common gases to well below 10⁻¹⁰ torr. In order to remove helium,
hydrogen, and neon, the modern cryopump incorporates a bed of charcoal, having a very large surface area, cooled by the second-stage array. This bed is so positioned that most gases are first removed by condensation, leaving only these three to be physically adsorbed. As already noted, the total pumping capacity of a cryopump is very different for the gases that are condensed, as compared to those that are adsorbed. The capacity of a pump is frequently quoted for argon, commonly used in sputtering systems. For example, a pump with a speed of 1000 L/s will have the capability of pumping 3 × 10⁵ torr-liters of argon before requiring regeneration. This implies that a 200-L volume could be pumped down from a typical roughing pressure of 2.5 × 10⁻¹ torr 6000 times. The pumping speed of a cryopump remains constant for all gases that are condensable at 20 K, down to the 10⁻¹⁰ torr range, so long as the temperature of the second-stage array does not exceed 20 K. At this temperature the vapor pressure of nitrogen is 1 × 10⁻¹¹ torr, and that of all other condensable gases lies well below this figure. The capacity for adsorption-pumped gases is not nearly so well defined. The capacity increases both with decreasing temperature and with the pressure of the adsorbing gas. The temperature of the second-stage array is controlled by the balance between the refrigeration capacity and the generation of heat by both condensation and adsorption of gases. Of necessity, the heat input must be limited so that the second-stage array never exceeds 20 K, and this translates into a maximum permissible gas flow into the pump. The lowest temperature of operation is set by the pump design, nominally 10 K. Consequently the capacity for adsorption of a gas such as hydrogen can vary by a factor of four or more between these two temperature extremes.
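The argon capacity figures above reduce to simple P·V bookkeeping, with gas quantities measured in torr-liters:

```python
def gas_load_torr_l(volume_l, pressure_torr):
    """Quantity of gas in a chamber, Q = P * V (torr-liters)."""
    return volume_l * pressure_torr

def pumpdowns_before_regen(capacity_torr_l, volume_l, crossover_torr):
    """Number of chamber pumpdowns a capture pump can absorb before
    its stored-gas capacity is reached."""
    return capacity_torr_l / gas_load_torr_l(volume_l, crossover_torr)

# Figures from the text: 3e5 torr-liter argon capacity; a 200-L chamber
# roughed to 2.5e-1 torr before cross-over.
per_cycle = gas_load_torr_l(200, 0.25)           # 50 torr-liters per pumpdown
cycles = pumpdowns_before_regen(3e5, 200, 0.25)  # 6000 pumpdowns
```

The same bookkeeping, run in reverse, gives the cross-over pressure from a manufacturer's impulse-load rating discussed later in this unit: the cross-over pressure is simply the impulse load divided by the chamber volume.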
For a given flow of hydrogen, if this is the only gas being pumped, the heat input will be low, permitting a higher pumping capacity; but if a mixture of gases is involved, the capacity for hydrogen will be reduced, simply because the equilibrium operating temperature will be higher. A second factor is the pressure of hydrogen that must be maintained in a particular process. Because the adsorption capacity is determined by this pressure, a low hydrogen pressure translates into a reduced adsorptive capacity, and therefore a shorter operating time before the pump must be regenerated. The effect of these factors is very significant for helium pumping, because the adsorption capacity for this gas is so limited. A cryopump may be quite impractical for any system in which there is a deliberate and significant inlet of helium as a process gas.

Operating Procedure: Before startup, a cryopump must first be roughed down to some recommended pressure, often 1 × 10⁻¹ torr. This serves two functions. First, the vacuum vessel surrounding the cold head functions as a Dewar, thermally isolating the cold zone. Second, any gas remaining must be pumped by the cold head as it cools down; because adsorption is always effective at a much higher temperature than condensation, this gas is adsorbed in the charcoal bed of the 20 K array, partially saturating it and limiting the capacity for subsequently adsorbing helium, hydrogen, and neon. It is essential to
avoid oil contamination when roughing down, because oil vapors adsorbed on the charcoal of the second-stage array cannot be removed by regeneration and irreversibly reduce the adsorptive capacity. Once the required pressure is reached, the cryopump is isolated from the roughing line and the refrigeration system is turned on. When the temperature of the second-stage array reaches 20 K, the pump is ready for operation, and can be opened to the vacuum chamber, which has previously been roughed down to a selected cross-over pressure. This cross-over pressure can readily be calculated from the figure for the impulse gas load, specified by the manufacturer, and the volume of the chamber. The impulse load is simply the quantity of gas to which the pump can be exposed without increasing the temperature of the second-stage array above 20 K. When the quantity of gas that has been pumped is close to the limiting capacity, the pump must be regenerated. This procedure involves isolation from the system, turning off the refrigeration unit, and warming the first- and second-stage arrays until all condensed and adsorbed gas has been removed. The most common method is to purge these gases using a warm (60°C) dry gas, such as nitrogen, at atmospheric pressure. Internal heaters were deliberately avoided for many years, to avoid an ignition source in the event that explosive gas mixtures, such as hydrogen and oxygen, were released during regeneration. To the same end, the use of any pressure sensor having a hot surface was, and still is, avoided in the regeneration procedure. Current practice has changed, and many pumps now incorporate a means of independently heating each of the refrigerated surfaces. This provides the flexibility to heat the cold surfaces only to the extent that adsorbed or condensed gases are rapidly removed, greatly reducing the time needed to cool back to the operating temperature. Consider, for example, the case where argon is the predominant gas load.
At the maximum operating temperature of 20 K, its vapor pressure is well below 10⁻¹¹ torr, but warming to 90 K raises the vapor pressure to 760 torr, facilitating rapid removal. In certain cases, the pumping of argon can cause a problem commonly referred to as argon hangup. This occurs after a high pressure of argon, e.g., >1 × 10⁻³ torr, has been pumped for some time. When the argon influx stops, the argon pressure remains comparatively high instead of falling to the background level. This happens when the temperature of the pump shroud is too low. At 40 K, in contrast to 80 K, argon condenses on the outer shroud instead of being pumped by the second-stage array. Evaporation from the shroud at the argon vapor pressure of 1 × 10⁻³ torr keeps the partial pressure high until all of the gas has desorbed. The problem arises when the refrigeration capacity is too large, for example, when several pumps are served by a single compressor and the helium supply is improperly proportioned. An internal heater to increase the shroud temperature is an easy solution. A cryopump is an excellent general-purpose device. It can provide an extremely clean environment at base pressures in the low 10⁻¹⁰ torr range. Care must be taken to ensure that the pressure-relief valve is always operable, and to ensure that any hazardous gases are safely handled
in the event of an unscheduled regeneration. There is some possibility of energetic chemical reactions during regeneration. For example, ozone, which is generated in some processes, may react with combustible materials. The use of a nonreactive purge gas will minimize hazardous conditions if the flow is sufficient to dilute the gases released during regeneration. The pump has a high capital cost and fairly high running costs for power and cooling. Maintenance of a cryopump is normally minimal. Seals in the displacer piston in the cold head must be replaced as required (at intervals of one year or more, depending on the design); an oil-adsorber cartridge in the compressor housing requires a similar replacement schedule.

Sputter-Ion Pumps

Applications: These pumps were originally developed for ultrahigh vacuum (UHV) systems and are admirably suited to this application, especially if the system is rarely vented to atmospheric pressure. Their main advantages are as follows.

1. High reliability, because of no moving parts.
2. The ability to bake the pump up to 400°C, facilitating outgassing and rapid attainment of UHV conditions.
3. Fail-safe operation if on a leak-tight UHV system. If the power is interrupted, a moderate pressure rise will occur; the pump retains some pumping capacity by gettering. When power is restored, the base pressure is normally reestablished rapidly.
4. The pump ion current indicates the pressure in the pump itself, which is useful as a monitor of performance.

Sputter-ion pumps are not suitable for the following uses.

1. On systems with a high, sustained gas load or frequent venting to atmosphere.
2. Where a well-defined pumping speed for all gases is required. This limitation can be circumvented with a severely conductance-limited pump, so the speed is defined by conductance rather than by the characteristics of the pump itself.

Operating Principles: The operating mechanisms of sputter-ion pumps are very complex indeed (Welch, 1991).
Crossed electrostatic and magnetic fields produce a confined discharge using a geometry originally devised by Penning (1937) to measure pressure in a vacuum system. A trapped cloud of electrons is produced, the density of which is highest in the 10⁻⁴ torr region, and falls off as the pressure decreases. High-energy ions, produced by electron collision, impact on the pump cathodes, sputtering reactive cathode material (titanium and, to a lesser extent, tantalum), which is deposited on all surfaces within line-of-sight of the impact area. The pumping mechanisms include the following.

1. Chemisorption on the sputtered cathode material, which is the predominant pumping mechanism for reactive gases.
GENERAL VACUUM TECHNIQUES
2. Burial in the cathodes, which is mainly a transient contributor to pumping. With the exception of hydrogen, the atoms remain close to the surface and are released as pumping/sputtering continues. This is the source of the ‘‘memory’’ effect in diode ion pumps; previously pumped species show up as minor impurities when a different gas is pumped.
3. Burial of ions back-scattered as neutrals, in all surfaces within line-of-sight of the impact area. This is a crucial mechanism in the pumping of argon and other noble gases (Jepsen, 1968).
4. Dissociation of molecules by electron impact. This is the mechanism for pumping methane and other organic molecules.

The pumping speed of these pumps is variable. Typical performance curves show the pumping of a single gas under steady-state conditions. Figure 1 shows the general characteristic as a function of pressure. Note the pronounced drop with falling pressure. The original commercial pumps used anode cells on the order of 1.2 cm in diameter and had very low pumping speeds even in the 10⁻⁹ torr range. However, newer pumps incorporate at least some larger anode cells, up to 2.5 cm diameter, and the useful pumping speed is extended into the 10⁻¹¹ torr range (Rutherford, 1963). The pumping speed of hydrogen can change very significantly with conditions, falling off drastically at low pressures and increasing significantly at high pressures (Singleton, 1969, 1971; Welch, 1994). The pumped hydrogen can be released under some conditions, primarily during the startup phase of a pump. When the pressure is 10⁻³ torr or higher, the internal temperatures can readily reach 500°C (Snouse, 1971). Hydrogen is released, increasing the pressure and frequently stalling the pumpdown. Rare gases are not chemisorbed, but are pumped by burial (Jepsen, 1968). Argon is of special importance, because it can cause problems even when pumping air.
The release of argon, buried as atoms in the cathodes, sometimes causes a sudden increase in pressure of as much as three decades, followed by renewed pumping, and a concomitant drop in pressure. The unstable behavior
is repeated at regular intervals, once initiated (Brubaker, 1959). This problem can be avoided in two ways.

1. By use of the ‘‘differential ion’’ or DI pump (Tom and James, 1969), which is a standard diode pump in which a tantalum cathode replaces one titanium cathode.
2. By use of the triode sputter-ion pump, in which a third electrode is interposed between the ends of the cylindrical anode and the pump walls. The additional electrode is maintained at a high negative potential, serving as a sputter cathode, while the anode and walls are maintained at ground potential. This pump has the additional advantage that the ‘‘memory’’ effect of the diode pump is almost completely suppressed.

The operating life of a sputter-ion pump is inversely proportional to the operating pressure. It terminates when the cathodes are completely sputtered through at a small area on the axis of each anode cell where the ions impact. The life therefore depends upon the thickness of the cathodes at the point of ion impact. For example, a conventional triode pump has relatively thin cathodes as compared to a diode pump, and this is reflected in the expected life at an operating pressure of 1 × 10⁻⁶ torr, i.e., 35,000 hr as compared to 50,000 hr. The fringing magnetic field in older pumps can be very significant. Some newer pumps greatly reduce this problem. A vacuum chamber can be exposed to ultraviolet and x radiation, as well as ions and electrons produced by an ion pump, so appropriate electrical and optical shielding may be required.

Operating Procedures: A sputter-ion pump must be roughed down before it can be started. Sorption pumps or any other clean technique can be used. For a diode pump, a pressure in the 10⁻⁴ torr range is recommended, so that the Penning discharge (and associated pumping mechanisms) will be immediately established.
A triode pump can safely be started at pressures about a decade higher than the diode, because the electrostatic fields are such that the walls are not subjected to ion bombardment
Figure 1. Schematic representation of the pumping speed of a diode sputter-ion pump as a function of pressure.
COMMON CONCEPTS
(Snouse, 1971). An additional problem develops in pumps that have operated in hydrogen or water vapor. Hydrogen accumulates in the cathodes, and this gas is released when the cathode temperatures increase during startup. The higher the pressure, the greater the temperature; temperatures as high as 900°C have been measured at the center of cathodes under high gas loads (Jepsen, 1967). An isolation valve should be used to avoid venting the pump to atmospheric pressure. The sputtered deposits on the walls of a pump adsorb gas with each venting, and the bonding of subsequently sputtered material will be reduced, eventually causing flaking of the deposits. The flakes can serve as electron emitters, sustaining localized (non-pumping) discharges, and can also short out the electrodes.

Getter Pumps

Getter pumps depend upon the reaction of gases with reactive metals as a pumping mechanism; such metals were widely used in electronic vacuum tubes, being described as getters (Reimann, 1952). Production techniques for the tubes did not allow proper outgassing of tube components, and the getter completed the initial pumping of the new tube. It also provided continuous pumping for the life of the device. Some practical getters used a ‘‘flash getter,’’ a stable compound of barium and aluminum that could be heated, using an RF coil, once the tube had been sealed, to evaporate a mirror-like barium deposit on the tube wall. This provided a gettering surface that operated close to ambient temperature. Such films initially offer rapid pumping, but once the surface is covered, a much slower rate of pumping is sustained by diffusion into the bulk of the film. These getters are the forerunners of the modern sublimation pump.
A second type of getter used a reactive metal, such as titanium or zirconium wire, operated at elevated temperature; gases react at the metal surface to produce stable, low-vapor-pressure compounds that then diffuse into the interior, allowing a sustained reaction at the surface. These getters are the forerunners of the modern nonevaporable getter (NEG).

Sublimation Pumps

Applications: Sublimation pumps are frequently used in combination with a sputter-ion pump, to provide high-speed pumping for reactive gases with a minimum investment (Welch, 1991). They are more suitable for ultrahigh vacuum applications than for handling large pumping loads. These pumps have been used in combination with turbomolecular pumps to compensate for the limited hydrogen-pumping performance of older designs. The newer, compound turbomolecular pumps avoid this need.

Operating Principles: Most sublimation pumps use a heated titanium surface to sublime a layer of atomically clean metal onto a surface, commonly the wall of a vacuum chamber. In the simplest version, a wire, commonly 85% Ti/15% Mo (McCracken and Pashley, 1966; Lawson and Woodward, 1967), is heated electrically; typical filaments deposit 1 g before failure. It is normal to mount two or
three filaments on a common flange for longer use before replacement. Alternatively, a hollow sphere of titanium is radiantly heated by an internal incandescent lamp filament, providing as much as 30 g of titanium. In either case, a temperature of 1500°C is required to establish a usable sublimation rate. Because each square centimeter of a titanium film provides a pumping speed of several liters per second at room temperature (Harra, 1976), one can obtain large pumping speeds for reactive gases such as oxygen and nitrogen. The speed falls dramatically as the surface is covered by even one monolayer. Although the sublimation process must be repeated periodically to compensate for saturation, in an ultrahigh vacuum system the time between sublimation cycles can be many hours. With higher gas loads the sublimation cycles become more frequent, and continuous sublimation is required to achieve maximum pumping speed. A sublimator can only pump reactive gases and must always be used in combination with a pump for the remaining gases, such as the rare gases and methane. Do not heat a sublimator when the pressure is too high, e.g., 10⁻³ torr; pumping will start on the heated surface, and can suppress the rate of sublimation completely. In this situation the sublimator surface becomes the only effective pump, functioning as a nonevaporable getter, and the effective speed will be very small (Kuznetsov et al., 1969).

Nonevaporable Getter Pumps (NEGs)

Applications: In vacuum systems, NEGs can provide supplementary pumping of reactive gases, being particularly effective for hydrogen, even at ambient temperature. They are most suitable for maintaining low pressures. A niche application is the removal of reactive impurities from rare gases such as argon. NEGs find wide application in maintaining low pressures in sealed-off devices, in some cases at ambient temperature (Giorgi et al., 1985; Welch, 1991).
Operating Principles: In one form of NEG, the reactive metal is carried as a thin surface layer on a supporting substrate. An example is an alloy of Zr/16% Al supported on either a soft iron or nichrome substrate. The getter is maintained at a temperature of around 400°C, by either indirect or ohmic heating. Gases are chemisorbed at the surface and diffuse into the interior. When a getter has been exposed to the atmosphere, for example, when initially installed in a system, it must be activated by heating under vacuum to a high temperature, 600° to 800°C. This permits adsorbed gases such as nitrogen and oxygen to diffuse into the bulk. With use, the speed falls off as the near-surface getter becomes saturated, but the getter can be reactivated several times by heating. Hydrogen is evolved during reactivation; consequently, reactivation is most effective when hydrogen can be pumped away. In a sealed device, however, the hydrogen is readsorbed on cooling. A second type of getter, which has a porous structure with far higher accessible surface area, effectively pumps reactive gases at temperatures as low as ambient. In many cases, an integral heater is embedded in the getter.
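To put rough numbers on the getter-film behavior described in the preceding sections, the sketch below uses the several-liters-per-second-per-cm² specific speed quoted earlier (Harra, 1976); the monolayer density and gas-density constants are assumed order-of-magnitude values, not figures from this article.

```python
MOLECULES_PER_TORR_L = 3.3e19   # molecules in 1 torr-L at ~295 K (assumed)
MONOLAYER_PER_CM2 = 1e15        # ~one monolayer, molecules/cm^2 (assumed)

def film_pumping_speed(area_cm2, s_per_cm2=3.0):
    """Total speed (L/s) of a fresh getter film; 3 L/s per cm^2 is an
    assumed mid-range value of the 'several L/s per cm^2' figure."""
    return area_cm2 * s_per_cm2

def monolayer_time_s(area_cm2, pressure_torr, s_per_cm2=3.0):
    """Time (s) for the film to accumulate about one monolayer, the
    time scale on which its pumping speed starts to fall."""
    speed = film_pumping_speed(area_cm2, s_per_cm2)             # L/s
    molecules_per_s = speed * pressure_torr * MOLECULES_PER_TORR_L
    return MONOLAYER_PER_CM2 * area_cm2 / molecules_per_s

# At 1e-10 torr the film lasts ~1e5 s (many hours) between sublimation
# cycles; at 1e-6 torr it saturates in roughly 10 s.
```

Note that the saturation time is independent of film area, since both the monolayer capacity and the pumping speed scale with it; only the pressure matters, consistent with the observation above that UHV systems need sublimation cycles only every few hours.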
Figure 2. Approximate pressure ranges of total and partial pressure gauges. Note that only the capacitance manometer is an absolute gauge. Based, with permission, on Short Course Notes of the American Vacuum Society.
Total and Partial Pressure Measurement

Figure 2 provides a summary of the approximate range of pressure measurement for modern gauges. Note that only the capacitance diaphragm manometers are absolute gauges, having the same calibration for all gases. In all other gauges, the response depends on the specific gas or mixture of gases present, making it impossible to determine the absolute pressure without knowing the gas composition.

Capacitance Diaphragm Manometers. A very wide range of gauges is available. The simplest are signal or switching devices with limited accuracy and reproducibility. The most sophisticated can measure over a range of 1:10⁴, with an accuracy exceeding 0.2% of reading, and a long-term stability that makes them valuable for calibration of other pressure gauges (Hyland and Shaffer, 1991). For vacuum applications, they are probably the most reliable gauge for absolute pressure measurement. The most sensitive can measure pressures from 1 torr down to the 10⁻⁴ torr range and can sense changes in the 10⁻⁵ torr range. Another advantage is that some models use stainless-steel and Inconel parts, which resist corrosion and cause negligible contamination.

Operating Principles: These gauges use a thin metal or, in some cases, ceramic diaphragm, which separates two chambers, one connected to the vacuum system and the other providing the reference pressure. The reference chamber is commonly evacuated to well below the lowest pressure range of the gauge, and has a getter to maintain that pressure. The deflection of the diaphragm is measured using a very sensitive electrical capacitance bridge circuit that can detect changes of 2 × 10⁻¹⁰ m. In the most sensitive gauges the device is thermostatted to avoid drifts due to temperature change; in less sensitive instruments there is no temperature control.
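To see why so sensitive a bridge is needed, consider an idealized parallel-plate model of the electrode-diaphragm gap (a sketch only: the 1 cm² electrode area and 0.1-mm gap are hypothetical; only the 2 × 10⁻¹⁰ m deflection figure comes from the text).

```python
EPS0 = 8.854e-12  # vacuum permittivity, F/m

def parallel_plate_capacitance(area_m2, gap_m):
    """Idealized parallel-plate capacitance, C = eps0 * A / d."""
    return EPS0 * area_m2 / gap_m

c0 = parallel_plate_capacitance(1e-4, 1e-4)           # ~8.9 pF at rest
c1 = parallel_plate_capacitance(1e-4, 1e-4 - 2e-10)   # diaphragm deflected

# A 2e-10 m deflection changes the capacitance by only ~2 parts per
# million, hence the need for a very sensitive bridge circuit.
fractional_change = c1 / c0 - 1.0
```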
Operation: The bridge must be periodically zeroed by evacuating the measuring side of the diaphragm to a pressure below the lowest pressure to be measured. Any gauge that is not thermostatically controlled should be placed so as to avoid drastic temperature changes, such as periodic exposure to direct sunlight. The simplest form of the capacitance manometer uses a capacitance electrode on both the reference and measurement sides of the diaphragm. In applications involving sources of contamination, or a radioactive gas such as tritium, this can lead to inaccuracies, and a manometer with capacitance probes only on the reference side should be used. When a gauge is used for precision measurements, it must be corrected for the pressure differential that results when the thermostatted gauge head operates at a different temperature than the vacuum system (Hyland and Shaffer, 1991).

Gauges Using Thermal Conductivity for the Measurement of Pressure

Applications: Thermal conductivity gauges are relatively inexpensive. Many operate in a range of 1 × 10⁻³ to 20 torr. This range has been extended to atmospheric pressure in some modifications of the ‘‘traditional’’ gauge geometry. They are valuable for monitoring and control, for example, during the processes of roughing down from atmospheric pressure and for the cross-over from roughing pump to high-vacuum pump. Some are subject to drift over time, for example, as a result of contamination from mechanical pump oil, but others remain surprisingly stable under common system conditions.

Operating Principles: In most gauges, a ribbon or filament serves as the heated element. Heat loss from this element to the wall is measured either by the change in element temperature, in the thermocouple gauge, or as a change in electrical resistance, in the Pirani gauge.
Heat is lost from a heated surface in a vacuum system by energy transfer to individual gas molecules at low pressures (Peacock, 1998). This process has been used in the ‘‘traditional’’ types of gauges. At pressures well above 20 torr, convection currents develop. Heat loss in this mode has recently been used to extend the pressure measurement range up to atmospheric. Thermal radiation heat loss from the heated element is independent of the presence of gas, setting a lower limit to the measurement of pressure. For most practical gauges this limit is in the mid- to upper-10⁻⁴ torr range. Two common sources of drift in the pressure indication are changes in ambient temperature and contamination of the heated element. The first is minimized by operating the heated element at 300°C or higher. However, this increases chemical interactions at the element, such as the decomposition of organic vapors into deposits of tars or carbon; such deposits change the thermal accommodation coefficient of gases on the element, and hence the gauge sensitivity. More satisfactory solutions to drift in the ambient temperature include a thermostatically controlled envelope temperature or a temperature-sensing element that compensates for ambient temperature changes. The problem of changes in the accommodation coefficient is reduced by using chemically stable heating elements, such as the noble metals or gold-plated tungsten. Thermal conductivity gauges are commonly calibrated for air, and the calibration changes significantly with the gas: the gauge sensitivity is higher for hydrogen and lower for argon. Thus, if the gas composition is unknown, the gauge reading may be in error by a factor of two or more.

Thermocouple Gauge. In this gauge, the element is heated at constant power, and its change in temperature as the pressure changes is directly measured using a thermocouple.
In many geometries the thermocouple is spot-welded directly at the center of the element; the additional thermal mass of the couple increases the response time to pressure changes. In an ingenious modification, the thermocouple itself (Benson, 1957) becomes the heated element, and the response time is improved.

Pirani Gauge. In this gauge, the element is heated electrically, but the temperature is sensed by measuring its resistance. The absence of a thermocouple permits a faster time constant. A further improvement in response results if the element is maintained at constant temperature, and the power required becomes the measure of pressure. Gauges capable of measurement over a range extending to atmospheric pressure use the Pirani principle. Those relying on convection are sensitive to gauge orientation, and the recommendation of the manufacturer must be observed if calibration is to be maintained. A second point, of great importance for safe operation, arises from the difference in gauge calibration with different gases. Such gauges have been used to control the flow of argon into a sputtering system by measuring the pressure on the high-pressure side of a flow restriction. If pressure is set close
to atmospheric, it is crucial to use a gauge calibrated for argon, or to apply the appropriate correction; using a gauge reading calibrated for air to adjust the argon to atmospheric results in an actual argon pressure well above one atmosphere, and the danger of explosion becomes significant. A second technique that extends the measurement range to atmospheric pressure is drastic reduction of gauge dimensions, so that the spacing between the heated element and the room-temperature gauge wall is only 5 mm (Alvesteffer et al., 1995).

Ionization Gauges: Hot Cathode Type. The Bayard-Alpert gauge (Redhead et al., 1968) is the principal gauge used for accurate indication of pressure from 10⁻⁴ to 10⁻¹⁰ torr. Over this range, a linear relationship exists between the measured ion current and pressure. The gauge has a number of problems, but they are fairly well understood and to some extent can be avoided. Modifications of the gauge structure, such as the Redhead Extractor Gauge (Redhead et al., 1968), permit measurement into the high 10⁻¹³ torr region, and minimize errors due to electron-stimulated desorption (see below).

Operating Principles: In a typical Bayard-Alpert gauge configuration, shown in Figure 3A, a current of electrons, between 1 and 10 mA, from a heated cathode, is accelerated towards an anode grid by a potential of 150 V. Ions produced by electron collision are collected on an axial, fine-wire ion collector, which is maintained 30 V negative with respect to the cathode. The electron energy of 150 V is selected for the maximum ionization probability with most common gases. The equation describing the gauge operation is

P = i⁺/(i⁻ K)    (3)
where P is the pressure in torr, i⁺ is the ion current, i⁻ is the electron current, and K, in torr⁻¹, is the gauge constant for the specific gas. The original design of the ionization gauge, the triode gauge, shown in Figure 3B, cannot read below 1 × 10⁻⁸ torr because of a spurious current, known as the
Figure 3. Comparison of the (A) Bayard-Alpert and (B) triode ion gauge geometries. Based, with permission, on Short Course Notes of the American Vacuum Society.
x-ray effect. The electron impact on the grid produces soft x rays, many of which strike the coaxial ion collector cylinder, generating a flux of photoelectrons; an electron ejected from the ion collector cannot be distinguished from an arriving ion by the current-measuring circuit. The existence of the x-ray effect was first proposed by Nottingham (1947), and studies stimulated by his proposal led directly to the development of the Bayard-Alpert gauge, which simply inverted the geometry of the triode gauge. The sensitivity of the gauge is little changed from that of the triode, but the area of the ion collector, and presumably the x-ray-induced spurious current, is reduced by a factor of 300, extending the usable range of the gauge to the order of 1 × 10⁻¹⁰ torr. The gauge and associated electronics are normally calibrated for nitrogen gas, but, as with the thermal conductivity gauge, the sensitivity varies with gas, so the gas composition must be known for an absolute pressure reading. Gauge constants for various gases can be found in many texts (Redhead et al., 1968). A gauge can affect the pressure in a system in three important ways.

1. An operating gauge functions as a small pump; at an electron emission of 10 mA the pumping speed is on the order of 0.1 L/s. In a small system this can be a significant part of the pumping. In systems that are pumped at relatively large speeds, the gauge has negligible effect, but if the gauge is connected to the system by a long tube of small diameter, the limited conductance of the connection will result in a pressure drop, and the gauge will record a pressure lower than that in the system. For example, a gauge pumping at 0.1 L/s, connected to a chamber by a 100-cm-long, 1-cm-diameter tube, with a conductance of 0.2 L/s for air, will give a reading 33% lower than the actual chamber pressure. The solution is to connect all gauges using short and fat (i.e., high-conductance) tubes, and/or to run the gauge at a lower emission current.
2. A new gauge is a source of significant outgassing, which increases further when turned on as its temperature increases. Whenever a well-outgassed gauge is exposed to the atmosphere, gas adsorption occurs, and once again significant outgassing will result after system evacuation. This affects measurements in any part of the pressure range, but is more significant at very low pressures. Provision is made for outgassing all ionization gauges. For gauges especially suitable for pressures higher than the low 10⁻⁷ torr range, the grid of the gauge is a heavy non-sag tungsten or molybdenum wire that can be heated using a high-current, low-voltage supply. Temperatures of 1300°C can be achieved, but higher temperatures, desirable for UHV applications, can cause grid sagging; the radiation from the grid accelerates the outgassing of the entire gauge structure, including the envelope. The gauge remains in operation throughout the outgassing, and when the system pressure falls well below that
existing before starting the outgas, the process can be terminated. For a system operating in the 10⁻⁷ torr range, 30 to 60 min should be adequate. The higher the operating pressure, the lower is the importance of outgassing. For pressures in the ultrahigh vacuum region (…)

The conductance of a long tube in molecular flow (L/D > 50) is given by

C = 12.1 D³/L    (6)
where D is the diameter and L the length, both in centimeters, and C is in L/s. Molecular flow occurs in virtually all high-vacuum systems. Note that the conductance in this regime is independent of pressure. The performance of pumping systems is frequently limited by practical conductance limits. For any component, conductance in the low-pressure regime is lower than in any other pressure regime, so careful design consideration is necessary.
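Equation 6 is easy to apply; the sketch below (assuming air at room temperature, as the constant 12.1 implies) shows how small the conductance of a typical narrow line is.

```python
def molecular_conductance_air(d_cm, l_cm):
    """Eq. 6: molecular-flow conductance C = 12.1 * D**3 / L (in L/s)
    of a long tube, with D and L in cm, for air at room temperature."""
    if l_cm / d_cm <= 50:
        raise ValueError("long-tube formula assumes L/D > 50")
    return 12.1 * d_cm ** 3 / l_cm

# A 2-cm-diameter, 2-m-long line has a conductance of only ~0.5 L/s,
# far below typical high-vacuum pump speeds, so the line rather than
# the pump would set the system's effective speed.
c = molecular_conductance_air(2.0, 200.0)   # ~0.48 L/s
```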
At higher pressures (PD > 0.5, with P in torr and D in cm) the flow becomes viscous. For long tubes, where laminar flow is fully developed (L/D > 100), the conductance is given by

C = 182 P D⁴/L    (7)

where P is the average pressure in the tube.
As can be seen from this equation, in viscous flow the conductance depends on the fourth power of the diameter, and also on the average pressure in the tube. Because the vacuum pumps used in the higher-pressure range normally have significantly smaller pumping speeds than do those for high vacuum, the problems associated with the vacuum plumbing are much simpler. The only time that one must pay careful attention to the higher-pressure performance is when system cycling time is important, or when the entire process operates in the viscous flow regime. When a group of components is connected in series, the net conductance of the group can be approximated by the expression

1/C_total = 1/C1 + 1/C2 + 1/C3 + · · ·    (8)
From this expression, it is clear that the limiting factor in the conductance of any string of components is the smallest conductance of the set. It is not possible to compensate for low conductance, e.g., in a small valve, by increasing the conductance of the remaining components. This simple fact has escaped many casual assemblers of vacuum systems. The vacuum system shown in Figure 5 is assumed to be operating with a fixed input of gas from an external source, which dominates all other sources of gas such as outgassing or leakage. Once flow equilibrium is established, the throughput of gas, Q, will be identical at any plane drawn through the system, since the only source of gas is the external source and the only sink for gas is the pump. The pressure at the mouth of the pump is given by

P2 = Q/S_pump    (9)

and the pressure in the chamber will be given by

P1 = Q/S_chamber    (10)
Figure 5. Pressures and pumping speeds developed by a steady throughput of gas (Q) through a vacuum chamber, conductance (C) and pump.
Combining this with Equation 4, to eliminate pressure, we have

1/S_chamber = 1/S_pump + 1/C    (11)
For the case where there are a series of separate components in the pumping line, the expression becomes

1/S_chamber = 1/S_pump + 1/C1 + 1/C2 + 1/C3 + · · ·    (12)
The above discussion is intended only to provide an understanding of the basic principles involved and the type of calculations necessary to specify system components. It does not address the significant deviations from this simple framework that must be corrected for in a precise calculation (O’Hanlon, 1989). The estimation of the base pressure requires a determination of the gas influx from all sources and the speed of the high-vacuum pump at the base pressure. The outgassing contributed by samples introduced into a vacuum system should not be neglected. The critical sources are outgassing and permeation. Leaks can be reduced to negligible levels using good assembly techniques. Published outgassing and permeation rates for various materials can vary by as much as a factor of two (O’Hanlon, 1989; Redhead et al., 1968; Santeler et al., 1966). Several computer programs, such as that described by Santeler (1987), are available for more precise calculation.
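The base-pressure estimate described above can be sketched as follows; the outgassing rates, areas, and pumping speed are hypothetical illustrative values, not figures from this article.

```python
def base_pressure(loads, s_eff_l_s):
    """Base pressure (torr) = total gas influx / effective pumping speed.
    loads: iterable of (q, area) pairs, with q the specific outgassing
    rate in torr-L/s/cm^2 and area in cm^2; s_eff_l_s in L/s."""
    q_total = sum(q * area for q, area in loads)
    return q_total / s_eff_l_s

# Hypothetical chamber: 2000 cm^2 of well-baked stainless steel
# (assumed 1e-11 torr-L/s/cm^2) plus 10 cm^2 of elastomer gasket
# (assumed 1e-8 torr-L/s/cm^2), pumped at an effective 200 L/s.
p_base = base_pressure([(1e-11, 2000.0), (1e-8, 10.0)], 200.0)
# The small gasket dominates the gas load: p_base is 6e-10 torr.
```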
LEAK DETECTION IN VACUUM SYSTEMS

Before assuming that a vacuum system leaks, it is useful to consider whether any other problem is present. The most important tool in such a consideration is a properly maintained log book of the operation of the system. This is particularly the case if several people or groups use a single system. If key check points in system operation are recorded weekly, or even monthly, then the task of detecting a slow change in performance is far easier. Leaks develop in cracked braze joints, or in torch-brazed joints once the flux has finally been removed. Demountable joints leak if the sealing surfaces are badly scratched, or if a gasket has been scuffed by allowing the flange to rotate relative to the gasket as it is compressed. Cold flow of Teflon or other gaskets slowly reduces the compression, and leaks develop. These are the easy leaks to detect, since the leak path is from the atmosphere into the vacuum chamber, and a trace gas can be used for detection. A second class of leaks arises from faulty construction techniques; they are known as virtual leaks. In all of these, a volume or void on the inside of a vacuum system communicates with that system only through a small leak path. Every time the system is vented to the atmosphere, the void fills with venting gas; then, in the pumpdown, this gas flows back into the chamber with a slowly decreasing throughput as the pressure in the void falls. This extends
the system pumpdown. A simple example of such a void is a screw placed in a blind tapped hole. A space always remains at the bottom of the hole, and the void is filled by gas flowing along the threads of the screw. The simplest solution is a screw with a vent hole through the body, providing rapid pumpout. Other examples include a double O-ring in which the inside O-ring is defective, and a double weld on the system wall with a defective inner weld. A mass spectrometer is required to confirm that a virtual leak is present. The pressure is recorded during a routine exhaust, and the residual gas composition is determined as the pressure is approaching equilibrium. The system is again vented using the same procedure as in the preceding vent, but the vent uses a gas that is not significant in the residual gas composition; the gas used should preferably be nonadsorbing, such as a rare gas. After a typical time at atmospheric pressure, the system is again pumped down. If gas analysis now shows significant vent gas in the residual gas composition, then a virtual leak is probably present, and one can only look for the culprit in faulty construction. Leaks most often fall in the range of 10⁻⁴ to 10⁻⁶ torr-L/s. The traditional leak-rate unit is the atmospheric cubic centimeter per second, which is 0.76 torr-L/s (equivalently, 1 torr-L/s is 1.3 atm-cc/s). A variety of leak detectors are available, with practical sensitivities varying from around 1 × 10⁻³ to 2 × 10⁻¹¹ torr-L/s. The simplest leak detection procedure is to slightly pressurize the system and apply a detergent solution, similar to that used by children to make soap bubbles, to the outside of the system. With a leak of 1 × 10⁻³ torr-L/s, bubbles should be detectable in a few seconds. Although the lower limit of detection is at least one decade lower than this figure, successful use at this level demands considerable patience. A similar inside-out method of detection is to use the kind of halogen leak detector commonly available for refrigeration work.
The vacuum system is partially backfilled with a freon and the outside is examined using a sniffer hose connected to the detector. Leaks on the order of 1 × 10⁻⁵ torr-L/s can be detected. It is important to avoid any significant drafts during the test, and the response time can be many seconds, so the sniffer must be moved quite slowly over the suspect area of the system. A far more sensitive instrument for this procedure is a dedicated helium leak detector (see below) with a sniffer hose testing a system partially back-filled with helium. A pressure gauge on the vacuum system can be used in the search for leaks. The most productive approach applies if the system can be segmented by isolation valves. By appropriate manipulation, the section of the system containing the leak can be identified. A second technique is not so straightforward, especially in a nonbaked system. It relies on the response of ion or thermal conductivity gauges differing from gas to gas. For example, if the flow of gas through a leak is changed from air to helium by covering the suspected area with helium, then the reading of an ionization gauge will change, since the helium sensitivity is only 16% of that for air. Unfortunately, the flow of helium through the leak is likely to be 2.7 times that for air, assuming a molecular flow leak, which partially offsets the change in gauge sensitivity. A much greater problem is that the search for a leak is often started just after exposure to the atmosphere and pumpdown. Consequently outgassing is an ever-changing factor, decreasing with time. Thus, one must detect a relatively small decrease in a gauge reading, due to the leak, against a decreasing background pressure. This is not a simple process; the odds are greatly improved if the system has been baked out, so that outgassing is a much smaller contributor to the system pressure. A far more productive approach is possible if a mass spectrometer is available on the system. The spectrometer is tuned to the helium-4 peak, and a small helium probe is moved around the system, taking the precautions described later in this section. The maximum sensitivity is obtained if the pumping speed of the system can be reduced by partially closing the main pumping valve to increase the pressure, but no higher than the mid-10⁻⁵ torr range, so that the full mass spectrometer resolution is maintained. Leaks in the 1 × 10⁻⁸ torr-L/s range should be readily detected. The preferred method of leak detection uses a stand-alone helium mass spectrometer leak detector (HMSLD). Such instruments are readily available with detection limits of 2 × 10⁻¹⁰ torr-L/s or better. They can be routinely calibrated so the absolute size of a leak can be determined. In many machines this calibration is automatically performed at regular intervals. Given this, and the effective pumping speed, one can find, using Equation 1, whether the leak detected is the source of the observed deterioration in the system base pressure. In an HMSLD, a small mass spectrometer tuned to detect helium is connected to a dedicated pumping system, usually a diffusion or turbomolecular pump. The system or device to be checked is connected to a separately pumped inlet system, and once a satisfactory pressure is achieved, the inlet system is connected directly to the detector and the inlet pump is valved off. In this mode, all of the gas from the test object passes directly to the helium leak detector.
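A calibrated leak reading can be checked against the observed base pressure using the effective pumping speed; assuming Equation 1 is the usual throughput relation Q = S × P, a minimal sketch in Python (the numbers are illustrative, not from the text):

```python
def pressure_contribution(leak_rate_torr_l_s, pumping_speed_l_s):
    """Steady-state pressure contribution of a leak, P = Q/S."""
    return leak_rate_torr_l_s / pumping_speed_l_s

# A measured 1e-8 torr-L/s leak pumped at an effective 100 L/s
# contributes 1e-10 torr; if the base pressure has deteriorated by
# much more than this, another leak (or outgassing) must be present.
delta_p = pressure_contribution(1e-8, 100.0)
```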
The test object is then probed with helium, and if a leak is detected, and is covered entirely with a helium blanket, the reading of the detector will provide an absolute indication of the leak size. In this detection mode, the pressure in the leak detector module cannot exceed 10⁻⁴ torr, which places a limit on the gas influx from the test object. If that influx exceeds some critical value, the flow of gas to the helium mass spectrometer must be restricted, and the sensitivity for detection will be reduced. This mode of leak detection is not suitable for dirty systems, since the gas flows from the test object directly to the detector, although some protection is usually provided by interposing a liquid nitrogen cold trap. An alternative technique using the HMSLD is the so-called counterflow mode. In this, the mass spectrometer tube is pumped by a diffusion or turbomolecular pump which is designed to be an ineffective pump for helium (and for hydrogen), while still operating at normal efficiency for all higher-molecular-weight gases. The gas from the object under test is fed to the roughing line of the mass spectrometer high-vacuum pump, where a higher pressure can be tolerated (on the order of 0.5 torr). Contaminant gases, such as hydrocarbons, as well as air, cannot reach the spectrometer tube. The sensitivity of an
HMSLD in this mode is reduced about an order of magnitude from the conventional mode, but it provides an ideal method of examining quite dirty items, such as metal drums or devices with a high outgassing load. The procedures for helium leak detection are relatively simple. The HMSLD is connected to the test object for maximum possible pumping speed. The time constant for the buildup of a leak signal is proportional to V/S, where V is the volume of the test system and S the effective pumping speed. A small time constant allows the helium probe to be moved more rapidly over the system. For very large systems, pumped by either a turbomolecular or diffusion pump, the response time can be improved by connecting the HMSLD to the foreline of the system, so the response is governed by the system pump rather than the relatively small pump of the HMSLD. With pumping systems that use a capture-type pump, this procedure cannot be used, so a long time constant is inevitable. In such cases, use of an HMSLD and helium sniffer to probe the outside of the system, after partially venting to helium, may be a better approach. Further, a normal helium leak check is not possible with an operating cryopump; the limited capacity for pumping helium can result in the pump serving as a low-level source of helium, confounding the test. Rubber tubing must be avoided in the connection between system and HMSLD, since helium from a large leak will quickly permeate into the rubber and thereafter emit a steadily declining flow of helium, thus preventing use of the most sensitive detection scale. Modern leak detectors can offset such background signals, if they are relatively constant with time. With the HMSLD operating at maximum sensitivity, a probe, such as a hypodermic needle with a very slow flow of helium, is passed along any suspected leak locations, starting at the top of the system, and avoiding drafts. 
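The V/S time constant quoted above sets how fast the helium signal develops; a short sketch, assuming simple exponential signal buildup (the volume and speed values are illustrative):

```python
import math

def time_constant(volume_l, pumping_speed_l_s):
    """Helium signal time constant, tau = V/S, in seconds."""
    return volume_l / pumping_speed_l_s

def time_to_fraction(tau_s, fraction):
    """Time for the signal to reach a given fraction of its
    steady-state value, assuming exponential buildup."""
    return -tau_s * math.log(1.0 - fraction)

tau = time_constant(50.0, 5.0)     # 50-L system pumped at 5 L/s: tau = 10 s
t95 = time_to_fraction(tau, 0.95)  # ~30 s before moving the probe onward
```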
Whenever a leak signal is first heard, and the presence of a leak is quite apparent, the probe is removed, allowing the signal to decay; checking is resumed, using the probe with no significant helium flow, to pinpoint the exact location of the leak. Ideally, the leak should be fixed before probing is continued, but in practice the leak is often plugged with a piece of vacuum wax (sometimes making the subsequent repair more difficult), and the probing is completed before any repair is attempted. One option, already noted, is to blanket the leak site with helium to obtain a quantitative measure of its size, and then calculate whether this is the entire problem. This is not always the preferred procedure, because a large slug of helium can lead to a lingering background in the detector, precluding a check for further leaks at maximum detector sensitivity. A number of points need to be made with regard to the detection of leaks:
1. Bellows should be flexed while covered with helium.
2. Leaks in water lines are often difficult to locate. If the water is drained, evaporative cooling may cause ice to plug a leak, and helium will permeate through the plug only slowly. Furthermore, the evaporating water may leave mineral deposits that plug the hole. A flow of warm gas through the line, overnight, will often open up the leak and allow helium leak detection. Where the water lines are internal to the system, the chamber must be opened so that the entire line is accessible for a normal leak check. However, once the lines can be viewed, the location of the leak is often signaled by the presence of discoloration.
3. Do not leave a helium probe near an O-ring for more than a few seconds; if too much helium goes into solution in the elastomer, the delayed permeation that develops will cause a slow flow of helium into the system, giving a background signal which will make further leak detection more difficult.
4. A system with a high background of hydrogen may produce a false signal in the HMSLD because of inadequate resolution of the helium and hydrogen peaks. A system that is used for the hydrogen isotopes deuterium or tritium will also give a false signal because of the presence of D2 or HT, both of which have their major peaks at mass 4. In such systems an alternate probe gas such as argon must be used, together with a mass spectrometer which can be tuned to the mass 40 peak.
Finally, if a leak is found in a system, it is wise to fix it properly the first time lest it come back to haunt you!
LITERATURE CITED
Alpert, D. 1959. Advances in ultrahigh vacuum technology. In Advances in Vacuum Science and Technology, vol. 1: Proceedings of the 1st International Conference on Vacuum Technology (E. Thomas, ed.) pp. 31–38. Pergamon Press, London.
Alvesteffer, W. J., Jacobs, D. C., and Baker, D. H. 1995. Miniaturized thin film thermal vacuum sensor. J. Vac. Sci. Technol. A13:2980–298.
Arnold, P. C., Bills, D. G., Borenstein, M. D., and Borichevsky, S. C. 1994. Stable and reproducible Bayard-Alpert ionization gauge. J. Vac. Sci. Technol. A12:580–586.
Becker, W. 1959. Über eine neue Molekularpumpe. In Advances in Ultrahigh Vacuum Technology: Proc. 1st Int. Cong. on Vac. Tech. (E. Thomas, ed.) pp. 173–176. Pergamon Press, London.
Benson, J. M. 1957. Thermopile vacuum gauges having transient temperature compensation and direct reading over extended ranges. In National Symp. on Vac. Technol. Trans. (E. S. Perry and J. H. Durrant, eds.) pp. 87–90. Pergamon Press, London.
Bills, D. G. and Allen, F. G. 1955. Ultra-high vacuum valve. Rev. Sci. Instrum. 26:654–656.
Brubaker, W. M. 1959. A method of greatly enhancing the pumping action of a Penning discharge. In Proc. 6th Nat. AVS Symp. pp. 302–306. Pergamon Press, London.
Coffin, D. O. 1982. A tritium-compatible high-vacuum pumping system. J. Vac. Sci. Technol. 20:1126–1131.
Dawson, P. T. 1995. Quadrupole Mass Spectrometry and its Applications. AVS Classic Series in Vacuum Science and Technology. Springer-Verlag, New York.
Drinkwine, M. J. and Lichtman, D. 1980. Partial pressure analyzers and analysis. American Vacuum Society Monograph Series, American Vacuum Society, New York.
Dobrowolski, Z. C. 1979. Fore-vacuum pumps. In Methods of Experimental Physics, Vol. 14 (G. L. Weissler and R. W. Carlson, eds.) pp. 111–140. Academic Press, New York.
Filippelli, A. R. and Abbott, P. J. 1995. Long-term stability of Bayard-Alpert gauge performance: Results obtained from repeated calibrations against the National Institute of Standards and Technology primary vacuum standard. J. Vac. Sci. Technol. A13:2582–2586.
Fulker, M. J. 1968. Backstreaming from rotary pumps. Vacuum 18:445–449.
Giorgi, T. A., Ferrario, B., and Storey, B. 1985. An updated review of getters and gettering. J. Vac. Sci. Technol. A3:417–423.
Hablanian, M. H. 1997. High-Vacuum Technology, 2nd ed. Marcel Dekker, New York.
Hablanian, M. H. 1995. Diffusion pumps: Performance and operation. American Vacuum Society Monograph, American Vacuum Society, New York.
Peacock, R. N. 1998. Vacuum gauges. In Foundations of Vacuum Science and Technology (J. M. Lafferty, ed.) pp. 403–406. John Wiley & Sons, New York.
Peacock, R. N., Peacock, N. T., and Hauschulz, D. S. 1991. Comparison of hot cathode and cold cathode ionization gauges. J. Vac. Sci. Technol. A9:1977–1985.
Penning, F. M. 1937. High vacuum gauges. Philips Tech. Rev. 2:201–208.
Penning, F. M. and Nienhuis, K. 1949. Construction and applications of a new design of the Philips vacuum gauge. Philips Tech. Rev. 11:116–122.
Redhead, P. A. 1960. Modulated Bayard-Alpert gauge. Rev. Sci. Instrum. 31:343–344.
Harra, D. J. 1976. Review of sticking coefficients and sorption capacities of gases on titanium films. J. Vac. Sci. Technol. 13: 471–474.
Redhead, P. A., Hobson, J. P., and Kornelsen, E. V. 1968. The Physical Basis of Ultrahigh Vacuum. AVS Classic Series in Vacuum Science and Technology. Springer-Verlag, New York.
Hoffman, D. M. 1979. Operation and maintenance of a diffusion-pumped vacuum system. J. Vac. Sci. Technol. 16:71–74.
Reimann, A. L. 1952. Vacuum Technique. Chapman & Hall, London.
Holland, L. 1971. Vacua: How they may be improved or impaired by vacuum pumps and traps. Vacuum 21:45–53.
Rosebury, F., 1965. Handbook of Electron Tube and Vacuum Technique. AVS Classic Series in Vacuum Science and Technology. Springer-Verlag, New York.
Hyland, R. W. and Shaffer, R. S. 1991. Recommended practices for the calibration and use of capacitance diaphragm gages as transfer standards. J. Vac. Sci. Technol. A9:2843–2863.
Jepsen, R. L. 1967. Cooling apparatus for cathode getter pumps. U.S. patent 3,331,975, July 16, 1967.
Jepsen, R. L. 1968. The physics of sputter-ion pumps. Proc. 4th Int. Vac. Congr.: Inst. Phys. Conf. Ser. No. 5, pp. 317–324. The Institute of Physics and the Physical Society, London.
Kendall, B. R. F. and Drubetsky, E. 1997. Cold cathode gauges for ultrahigh vacuum measurements. J. Vac. Sci. Technol. A15:740–746.
Kohl, W. H. 1967. Handbook of Materials and Techniques for Vacuum Devices. AVS Classic Series in Vacuum Science and Technology. Springer-Verlag, New York.
Kuznetsov, M. V., Nazarov, A. S., and Ivanovsky, G. F. 1969. New developments in getter-ion pumps in the U.S.S.R. J. Vac. Sci. Technol. 6:34–39.
Lange, W. J., Singleton, J. H., and Eriksen, D. P. 1966. Calibration of low-pressure Penning discharge type gauges. J. Vac. Sci. Technol. 3:338–344.
Lawson, R. W. and Woodward, J. W. 1967. Properties of titanium-molybdenum alloy wire as a source of titanium for sublimation pumps. Vacuum 17:205–209.
Lewin, G. 1985. A quantitative appraisal of the backstreaming of forepump oil vapor. J. Vac. Sci. Technol. A3:2212–2213.
Li, Y., Ryding, D., Kuzay, T. M., McDowell, M. W., and Rosenburg, R. A. 1995. X-ray photoelectron spectroscopy analysis of cleaning procedures for synchrotron radiation beamline materials at the Advanced Photon Source. J. Vac. Sci. Technol. A13:576–580.
Lieszkovszky, L., Filippelli, A. R., and Tilford, C. R. 1990. Metrological characteristics of a group of quadrupole partial pressure analyzers. J. Vac. Sci. Technol. A8:3838–3854.
McCracken, G. M. and Pashley, N. A. 1966. Titanium filaments for sublimation pumps. J. Vac. Sci. Technol. 3:96–98.
Nottingham, W. B. 1947. 7th Annual Conf. on Physical Electronics, M.I.T.
Rosenburg, R. A., McDowell, M. W., and Noonan, J. R. 1994. X-ray photoelectron spectroscopy analysis of aluminum and copper cleaning procedures for the Advanced Photon Source. J. Vac. Sci. Technol. A12:1755–1759.
Rutherford, 1963. Sputter-ion pumps for low pressure operation. In Proc. 10th Nat. AVS Symp. pp. 185–190. The Macmillan Company, New York.
Santeler, D. J. 1987. Computer design and analysis of vacuum systems. J. Vac. Sci. Technol. A5:2472–2478.
Santeler, D. J., Jones, W. J., Holkeboer, D. H., and Pagano, F. 1966. Vacuum Technology and Space Simulation. AVS Classic Series in Vacuum Science and Technology. Springer-Verlag, New York.
Sasaki, Y. T. 1991. A survey of vacuum material cleaning procedures: A subcommittee report of the American Vacuum Society Recommended Practices Committee. J. Vac. Sci. Technol. A9:2025–2035.
Singleton, J. H. 1969. Hydrogen pumping speed of sputter-ion pumps. J. Vac. Sci. Technol. 6:316–321.
Singleton, J. H. 1971. Hydrogen pumping speed of sputter-ion pumps and getter pumps. J. Vac. Sci. Technol. 8:275–282.
Snouse, T. 1971. Starting mode differences in diode and triode sputter-ion pumps. J. Vac. Sci. Technol. 8:283–285.
Tilford, C. R. 1994. Process monitoring with residual gas analyzers (RGAs): Limiting factors. Surface and Coatings Technol. 68/69:708–712.
Tilford, C. R., Filippelli, A. R., and Abbott, P. J. 1995. Comments on the stability of Bayard-Alpert ionization gages. J. Vac. Sci. Technol. A13:485–487.
Tom, T. and James, B. D. 1969. Inert gas ion pumping using differential sputter-yield cathodes. J. Vac. Sci. Technol. 6:304–307.
Welch, K. M. 1991. Capture Pumping Technology. Pergamon Press, Oxford, U.K.
O’Hanlon, J. F. 1989. A User’s Guide to Vacuum Technology. John Wiley & Sons, New York.
Welch, K. M. 1994. Pumping of helium and hydrogen by sputterion pumps. II. Hydrogen pumping. J. Vac. Sci. Technol. A12:861–866.
Osterstrom, G. 1979. Turbomolecular vacuum pumps. In Methods of Experimental Physics, Vol. 14 (G. L. Weissler and R. W. Carlson, eds.) pp. 111–140. Academic Press, New York.
Wheeler, W. R. 1963. Theory and application of metal gasket seals. Trans. 10th Nat. Vac. Symp. pp. 159–165. Macmillan, New York.
KEY REFERENCES
Dushman, 1962. See above.
Provides the scientific basis for all aspects of vacuum technology.
Hablanian, 1997. See above.
Excellent general practical guide to vacuum technology.
Kohl, 1967. See above.
A wealth of information on materials for vacuum use, and on electron sources.
Lafferty, J. M. (ed.). 1998. Foundations of Vacuum Science and Technology. John Wiley & Sons, New York.
Provides the scientific basis for all aspects of vacuum technology.
O’Hanlon, 1989. See above.
Probably the best general text for vacuum technology; SI units are used throughout.
Redhead et al., 1968. See above.
The classic text on UHV; a wealth of information.
Rosebury, 1965. See above.
An exceptional practical book covering all aspects of vacuum technology and the materials used in system construction.
Santeler et al., 1966. See above.
A very practical approach, including a unique treatment of outgassing problems; suffers from lack of an index.

JACK H. SINGLETON
Consultant
Monroeville, Pennsylvania

MASS AND DENSITY MEASUREMENTS

INTRODUCTION

The precise measurement of mass is one of the more challenging measurement requirements that materials scientists must deal with. The use of electronic balances has become so widespread and routine that the accurate measurement of mass is often taken for granted. While government institutions such as the National Institute of Standards and Technology (NIST) and state metrology offices enforce controls in the industrial and legal sectors, no such rigor generally applies in the research laboratory. The process of peer review seldom assesses the accuracy of an underlying measurement unless an egregious problem is brought to the surface by the reported results. In order to ensure reproducibility, any measurement process in a laboratory should be subjected to a rigorous and frequent calibration routine. This unit will describe the options available to the investigator for establishing and executing such a routine; it will define the underlying terms, conditions, and standards, and will suggest appropriate reporting and documenting practices. The measurement of mass, which is a fundamental measurement of the amount of material present, will constitute the bulk of the discussion. However, the measurement of derived properties, particularly density, will also be discussed, as well as some indirect techniques used particularly by materials scientists in the determination of mass and density, such as the quartz crystal microbalance for mass measurement and the analysis of diffraction data for density determination.

INDIRECT MASS MEASUREMENT TECHNIQUES

A number of differential and equivalence methods are frequently used to measure mass, or obtain an estimate of the change in mass during the course of a process or analysis. Given knowledge of the system under study, it is often possible to ascertain with reasonable accuracy the quantity of material using chemical or physical equivalence, such as the evolution of a measurable quantity of liquid or vapor by a solid upon phase transition, or the titrimetric oxidation of the material. Electroanalytical techniques can provide quantitative numbers from coulometry during an electrodeposition or electrodissolution of a solid material. Magnetometry can provide quantitative information on the amount of material when the magnetic susceptibility of the material is known. A particularly important indirect mass measurement tool is the quartz crystal microbalance (QCM). The QCM is a piezoelectric quartz crystal routinely incorporated in vacuum deposition equipment to monitor the buildup of films. The QCM is operated at a resonance frequency that shifts as the mass on the crystal changes, allowing mass changes on the order of 10⁻⁹ to 10⁻¹⁰ g/cm² to be estimated and giving these devices a special niche in the differential mass measurement arena (Baltes et al., 1998). QCMs may also be coupled with analytical techniques such as electrochemistry or differential thermal analysis to monitor the simultaneous buildup or removal of a material under study.
DEFINITION OF MASS, WEIGHT, AND DENSITY Mass has already been defined as a measure of the amount of material present. Clearly, there is no direct way to answer the fundamental question ‘‘what is the mass of this material?’’ Instead, the question must be answered by employing a tool (a balance) to compare the mass of the material to be measured to a known mass. While the SI unit of mass is the kilogram, the convention in the scientific community is to report mass or weight measurements in the metric unit that more closely yields a whole number for the amount of material being measured (e.g., grams, milligrams, or micrograms). Many laboratory balances contain ‘‘internal standards,’’ such as metal rings of calibrated mass or an internally programmed electronic reference in the case of magnetic force compensation balances. To complicate things further, most modern electronic balances apply a set of empirically derived correction factors to the differential measurement (of the sample versus the internal standard) to display a result on the readout of the balance. This readout, of course, is what the investigator is to take on faith, and record the amount of material present
Figure 1. Schematic diagram of an equal-arm two-pan balance.
to as many decimal places as appeared on the display. In truth one must consider several concerns: What is the actual accuracy of the balance? How many of the figures in the display are significant? What are the tolerances of the internal standards? These and other relevant issues will be discussed in the sections to follow. One type of balance does not cloak its modus operandi in internal standards and digital circuitry: the equal-arm balance. A schematic diagram of an equal-arm balance is shown in Figure 1. This instrument is at the origin of the term ‘‘balance,’’ which is derived from a Latin word meaning having two pans. This elegantly simple device clearly compares the mass of the unknown to a known mass standard (see discussion of Weight Standards, below) by accurately indicating the deflection of the lever from the equilibrium state (the ‘‘balance point’’). We quickly draw two observations from this arrangement. First, the lever is affected by a force, not a mass, so the balance can only operate in the presence of a gravitational field. Second, if the sample and reference mass are in a gaseous atmosphere, then each will have buoyancy characterized by the mass of the air displaced by each object. The amount of displaced air will depend on such factors as sample porosity, but for simplicity we assume here (for definition purposes) that neither the sample nor the reference mass is porous and the volume of displaced air equals the volume of the object. We are now in a position to define the weight of an object. The weight (W) is effectively the force exerted by a mass (M) under the influence of a gravitational field, i.e., W = Mg, where g is the acceleration due to gravity (9.80665 m/s²). Thus, a mass of exactly 1 g has a weight in centimeter–gram–second (cgs) units of 1 g × 980.665 cm/s² = 980.665 dyn, neglecting buoyancy due to atmospheric displacement.
It is common to state that the object ‘‘weighs’’ 1 g (colloquially equating the gram to the force exerted by gravity on one gram), and to do so neglects any effect due to atmospheric buoyancy. The American Society for Testing and Materials (ASTM, 1999) further defines the force (F) exerted by a weight measured in air as

F = (Mg/9.80665)(1 − dA/D)    (1)
where dA is the density of air, and D is the density of the weight (standard E4). The ASTM goes on to define a set of units to use in reporting force measurements as mass-force quantities, and presents a table of correction factors that take into account the variation of the Earth’s gravitational field as a function of altitude above (or below) sea level and geographic latitude. Under this custom, forces are reported by relation to the newton, and by definition, one kilogram-force (kgf) unit is equal to 9.80665 N. The kgf unit is commonly encountered in the mechanical testing literature (see HARDNESS TESTING). It should be noted that the ASTM table considers only the changes in gravitational force and the density of dry air; i.e., the influence of humidity and temperature, for example, on the density of air is not provided. The Chemical Rubber Company’s Handbook of Chemistry and Physics (Lide, 1999) tabulates the density of air as a function of these parameters. The International Committee for Weights and Measures (CIPM) provides a formula for air density for use in mass calibration. The CIPM formula accounts for temperature, pressure, humidity, and carbon dioxide concentration. The formula and description can be found in the International Organization for Legal Metrology (OIML) recommendation R 111 (OIML, 1994). The ‘‘balance condition’’ in Figure 1 is met when the forces on both pans are equivalent. Taking M to be the mass of the standard, V to be the volume of the standard, m to be the mass of the sample, and v to be the volume of the sample, then the balance condition is met when mg − dA·v·g = Mg − dA·V·g. The equation simplifies to m − dA·v = M − dA·V as long as g remains constant. Taking the density of the sample to be d (equal to m/v) and that of the standard to be D (equal to M/V), it is easily shown that m = M(1 − dA/D)/(1 − dA/d) (Kupper, 1997).
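The buoyancy correction above is easy to apply numerically; a minimal sketch (densities in g/cm³; the wood and air densities below are the example values used in the text):

```python
def true_mass(weight_g, d_sample, d_standard=8.0, d_air=0.0012):
    """Air-buoyancy correction m = M(1 - dA/D)/(1 - dA/d),
    with all densities in g/cm^3 (steel standard, sea-level air
    by default)."""
    return weight_g * (1.0 - d_air / d_standard) / (1.0 - d_air / d_sample)

# Wood (0.373 g/cm^3) weighed against steel at sea level
# (air density 0.0012 g/cm^3) and in Denver (0.00098 g/cm^3):
m_sea = true_mass(1.0, 0.373)                    # ~1.003077 g
m_denver = true_mass(1.0, 0.373, d_air=0.00098)  # ~1.002511 g
```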
This equation illustrates the dependence of a mass measurement on the air density: only when the density of the sample is identical to that of the standard (or when no atmosphere is present at all) is the measured weight representative of the sample’s actual mass. To put the issue into perspective, a dry atmosphere at sea level has a density of 0.0012 g/cm³, while that in Denver, Colorado (1 mile above sea level) has a density of 0.00098 g/cm³ (Kupper, 1997). If we take an extreme example, the measurement of the mass of wood (density 0.373 g/cm³) against steel (density 8.0 g/cm³) versus the weight of wood against a steel weight, we find that a 1-g weight of wood measured at sea level corresponds to a 1.003077-g mass of wood, whereas a 1-g weight of wood measured in Denver corresponds to a 1.002511-g mass of wood. The error in reporting that the weight of wood (neglecting air buoyancy) did not change would then be (1.003077 g − 1.002511 g)/1 g = 0.06%, whereas the error in misreporting the mass of the wood at sea level to be 1 g would be (1.003077 g − 1 g)/1.003077 g = 0.3%. It is better to assume that the variation in weight as a function of air buoyancy is negligible than to assume that the weighed amount is synonymous with the mass (Kupper, 1990). We have not mentioned the variation in g with altitude, nor as influenced by solar and lunar tidal effects. We have already seen that g is factored out of the balance condition as long as it is held constant, so the problem will not be
encountered unless the balance is moved significantly in altitude and latitude without recalibrating. The calibration of a balance should nevertheless be validated any time it is moved, to verify proper function. The effect of tidal variations on g has been determined to be of the order of 0.1 ppm (Kupper, 1997), arguably a negligible quantity considering the tolerance levels available (see discussion of Weight Standards). Density is a derived quantity defined as the mass per unit volume. Obviously, an accurate measure of both mass and volume is necessary to effect a measurement of the density. In metric units, density is typically reported in g/cm³. A related property is the specific gravity, defined as the weight of a substance divided by the weight of an equal volume of water (the water standard is taken at 4°C, where its density is 1.000 g/cm³). In metric units, the specific gravity has the same numerical value as the density, but is dimensionless. In practice, density measurements of solids are made in the laboratory by taking advantage of Archimedes’ principle of displacement. A fluid material, usually a liquid or gas, is used as the medium to be displaced by the material whose volume is to be measured. Precise density measurements require the material to be scrupulously clean, perhaps even degassed in vacuo to eliminate errors associated with adsorbed or absorbed species. The surface of the material may be porous in nature, so that a certain quantity of the displacement medium actually penetrates into the material. The resulting measured density will be intermediate between the ‘‘true’’ or absolute density of the material and the apparent measured density of the material containing, for example, air in its pores. Mercury is useful for the measurement of volumes of relatively smooth materials, as the viscosity of liquid mercury at room temperature precludes the penetration of the liquid into pores smaller than 5 µm at ambient pressure.
On the other hand, liquid helium may be used to obtain a more faithful measurement of the absolute density, as the fluid will more completely penetrate voids in the material through pores of atomic dimension. The true density of a material may be ascertained from the analysis of the lattice parameters obtained experimentally using diffraction techniques (see Parts X, XI, and XIII). The analysis of x-ray diffraction data elucidates the content of the unit cell in a pure crystalline material by providing lattice parameters that can yield information on vacant lattice sites versus free space in the arrangement of the unit cell. As many metallic crystals are heated, the population of vacant sites in the lattice is known to increase, resulting in a disproportionate decrease in density. Techniques for the measurement of true density have been reported by Feder and Nowick (1957) and by Simmons and Balluffi (1959, 1961, 1962).
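As an illustration of obtaining density from diffraction data, the x-ray (theoretical) density of a cubic crystal follows from the cell contents and the lattice parameter, ρ = Z·M/(N_A·a³); a short sketch (the copper values are our illustrative example, not taken from the text):

```python
AVOGADRO = 6.02214e23  # atoms per mole

def xray_density(z, molar_mass_g, a_cm):
    """Theoretical (x-ray) density of a cubic crystal in g/cm^3:
    rho = Z * M / (N_A * a^3), where Z is atoms per unit cell,
    M the molar mass (g/mol), and a the lattice parameter (cm)."""
    return z * molar_mass_g / (AVOGADRO * a_cm**3)

# FCC copper: Z = 4 atoms/cell, M = 63.546 g/mol, a = 3.615 Angstrom
rho_cu = xray_density(4, 63.546, 3.615e-8)  # ~8.9 g/cm^3
```

Comparing such an x-ray density against a measured bulk density is one way to quantify the vacancy and void content discussed above.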
WEIGHT STANDARDS

Researchers who make an effort to establish a meaningful mass measurement assurance program quickly become embroiled in a sea of acronyms and jargon. While only certain weight standards are germane to the user of the precision laboratory balance, all categories of mass standards may be encountered in the literature, and so we briefly list them here. In the United States, the three most common sources of weight standard classifications are NIST (formerly the National Bureau of Standards or NBS), ASTM, and the OIML. A 1954 publication of the NBS (NBS Circular 547) established seven classes of standards: J (covering denominations from 0.05 to 50 mg), M (covering 0.05 mg to 25 kg), S (covering 0.05 mg to 25 kg), S-1 (covering 0.1 mg to 50 kg), P (covering 1 mg to 1000 kg), Q (covering 1 mg to 1000 kg), and T (covering 10 mg to 1000 kg). These classifications were all replaced in 1978 by the ASTM standard E617 (ASTM, 1997), which recognizes the OIML recommendation R 111 (OIML, 1994); this standard was updated in 1997. NIST Handbook 105-1 further establishes class F, covering 1 mg to 5000 kg, primarily for the purpose of setting standards for field standards used in commerce. The ASTM standard E617 establishes eight classes (generally with tighter tolerances in the earlier classes): classes 0, 1, 2, and 3 cover the range from 1 mg to 50 kg, classes 4 and 5 cover the range from 1 mg to 5000 kg, class 6 covers 100 mg to 500 kg, and a special class, class 1.1, covers the range from 1 to 500 mg with the lowest set tolerance level (0.005 mg). The OIML R 111 establishes seven classes (also with more stringent tolerances associated with the earlier classes): E1, E2, F1, F2, and M1 cover the range from 1 mg to 50 kg, M2 covers 200 mg to 50 kg, and M3 covers 1 g to 50 kg. The ASTM classes 1 and 1.1 or OIML classes F1 and E2 are the most relevant to the precision laboratory balance. Only OIML class E1 sets stricter tolerances; this class is applied to primary calibration laboratories for establishing reference standards. The most common material used in mass standards is stainless steel, with a density of 8.0 g/cm³.
Routine laboratory masses are often made of brass, with a density of 8.4 g/cm³. Aluminum, with a density of 2.7 g/cm³, is often the material of choice for very small mass standards (50 mg). The international mass standard is a 1-kg cylinder made of platinum-iridium (density 21.5 g/cm³); this cylinder is housed in Sèvres, France. Weight standard manufacturers should furnish a certificate that documents the traceability of the standard to the Sèvres standard. A separate certificate may be issued that documents the calibration process for the weight, and may include a statement to the effect "weights adjusted to an apparent density of 8.0 g/cm³." Such weights may have a true density different from 8.0 g/cm³, depending on the material used, as the specification implies only that the weights have been adjusted so as to counterbalance a steel weight in an atmosphere of density 0.0012 g/cm³. In practice, the variation of apparent density as a function of local atmospheric density is less than 0.1%, which is lower than the tolerances for all but the most exacting reference standards. Test procedures for weight standards are detailed in annex B of the latest OIML R 111 Committee Draft (1994). The magnetic susceptibility of steel weights, which may affect the calibration of balances based on the electromagnetic force compensation principle, is addressed in these procedures. A few words about the selection of appropriate weight standards for a calibration routine are in order. A
fundamental consideration is the so-called 3:1 transfer ratio rule, which mandates that the error of the standard should be less than 1/3 the tolerance of the device being tested (ASTM, 1997). Two types of weights are typically used during a calibration routine: test weights and standard weights. Test weights are usually made of brass and have less stringent tolerances. These are useful for repetitive measurements such as those that test repeatability and off-center error (see Types of Balances). Standard weights are usually manufactured from steel and have tight, NIST-traceable tolerances. The standard weights are used to establish the accuracy of a measurement process, and must be handled with meticulous care to avoid unnecessary wear, surface contamination, and damage. Recalibration of weight standards is a somewhat nebulous issue, as no standard intervals are established. The investigator must factor in such considerations as the requirements of the particular program, historical data on the weight set, and the requirements of the measurement assurance program being used (see Mass Measurement Process Assurance).
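The 3:1 transfer ratio rule can be sketched as a simple check; the function name is ours:

```python
# Sketch of the 3:1 transfer ratio rule; the function name is ours.

def standard_is_adequate(standard_tolerance, device_tolerance):
    """True if the standard's tolerance is under one-third of the
    tolerance of the device being tested."""
    return standard_tolerance < device_tolerance / 3.0

# Checking a balance to a 0.3-mg tolerance:
print(standard_is_adequate(0.05, 0.3))  # True: 0.05 mg < 0.1 mg
print(standard_is_adequate(0.20, 0.3))  # False: 0.20 mg > 0.1 mg
```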
TYPES OF BALANCES

NIST Handbook 44 (NIST, 1999) defines five classes of weighing devices. Class I balances are precision laboratory weighing devices. Class II balances are used for laboratory weighing, precious metal and gem weighing, and grain testing. Class III, III L, and IIII balances are larger-capacity scales used in commerce, including everything from postal scales to highway vehicle-weighing scales. Calibration and verification procedures defined in NIST Handbook 44 have been adopted by all state metrology offices in the U.S.

Laboratory balances are chiefly available in three configurations: dual-pan equal-arm, mechanical single-pan, and top-loading. The equal-arm balance is in essence that shown schematically in Figure 1. The single-pan balance replaces the second pan with a set of sliders (masses mounted on the lever itself) or, in some cases, a dial with a coiled spring that applies an adjustable and quantifiable counter-force. The most common laboratory balance is the top-loading balance. These normally employ an internal mechanism by which a series of internal masses (usually in the form of steel rings) or a system of mechanical flexures counters the applied load (Kupper, 1999). However, a spring load may be used in certain routine top-loading balances. Such concerns as changes in the force constant of the spring, hysteresis in the material, etc., preclude the use of spring-loaded balances for all but the most routine measurements. At the other extreme are balances that employ electromagnetic force compensation in lieu of internal masses or mechanical flexures. These latter balances are becoming the most common laboratory balance due to their stability and durability, but it is important to note that the magnetic fields of the balance and sample may interact. Standard test methods for evaluating the performance of each of the three types of balance are set forth in ASTM standards E1270 (ASTM, 1988a; for equal-arm balances), E319 (ASTM,
1985; for mechanical single-pan balances), and E898 (ASTM, 1988b; for top-loading direct-reading balances). A number of salient terms are defined in the ASTM standards; these terms are worth repeating here, as they are often associated with the specifications that a balance manufacturer may apply to its products. The application of the principles set forth by these definitions in the establishment of a mass measurement process calibration will be summarized in the next section (see Mass Measurement Process Assurance).

Accuracy. The degree to which a measured value agrees with the true value.

Capacity. The maximum load (mass) that a balance is capable of measuring.

Linearity. The degree to which the measured values of a successive set of standard masses weighed on the balance across the entire operating range of the balance approximate a straight line. Some balances are designed to improve the linearity of a measurement by operating in two or more separately calibrated ranges. The user selects the range of operation before conducting a measurement.

Off-center Error. Any difference in the measured mass as a function of distance from the center of the balance pan.

Hysteresis. Any difference in the measured mass as a function of the history of the balance operation (e.g., a difference in measured mass when the last measured mass was larger than the present measurement versus when the prior measured mass was smaller).

Repeatability. The closeness of agreement for successive measurements of the same mass.

Reproducibility. The closeness of agreement of measured values when measurements of a given mass are repeated over a period of time (but not necessarily successively). Reproducibility may be affected by, e.g., hysteresis.

Precision. The smallest amount of mass difference that a balance is capable of resolving.

Readability. The value of the smallest mass unit that can be read from the readout without estimation.
In the case of digital instruments, the smallest displayed digit does not always have a unit increment. Some balances increment the last digit by two or five, for example. Other balances incorporate a vernier or micrometer to subdivide the smallest scale division. In such cases, the smallest graduation on such devices represents the balance’s readability. Since the most common balance encountered in a research laboratory is of the electronic top-loading type, certain peculiar characteristics of this balance will be highlighted here. Balance manufacturers may refer to two categories of balance: those with versus those without internal calibration capability. In essence, an internal calibration capability indicates that a set of traceable standard masses is integrated into the mechanism of the counterbalance.
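The last-digit increments described above can be sketched numerically; the function name is ours:

```python
# Sketch of display quantization: the readability is the size of the
# smallest display increment, whether the last digit steps by 1, 2, or 5.
# Function name is ours.

def displayed(value, readability):
    """Round a reading to the nearest display increment."""
    return round(value / readability) * readability

# A 5.247-g load on a 0.01-g display versus one whose last digit steps by 2:
print(round(displayed(5.247, 0.01), 2))  # 5.25
print(round(displayed(5.247, 0.02), 2))  # 5.24
```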
Table 1. Typical Types of Balance Available, by Capacity and Divisions

Name                       Capacity (range)    Divisions Displayed
Ultramicrobalance          2 g                 0.1 µg
Microbalance               3–20 g              1 µg
Semimicrobalance           30–200 g            10 µg
Macroanalytical balance    50–400 g            0.1 mg
Precision balance          100 g–30 kg         0.1 mg–1 g
Industrial balance         30–6000 kg          1 g–0.1 kg
A key choice in selecting a balance for the laboratory is the operating range. A market survey of commercially available laboratory balances reveals several recognized categories; Table 1 presents the common names applied to balances operating in a variety of weight measurement capacities. The choice of a balance or a set of balances to support a specific project is thus the responsibility of the investigator. Electronically controlled balances usually include a calibration routine documented in the operation manual. Where they differ, the routine set forth in the relevant ASTM reference (E1270, E319, or E898) should be considered while the investigator identifies the control standard for the measurement process. Another consideration is the comparison of the operating range of the balance with the requirements of the measurement. An improvement in linearity and precision can be realized if the calibration routine is run over a range suitable for the measurement, rather than the entire operating range. However, the balance's integral software may not afford the flexibility to perform such a limited-range calibration. Also, the actual operation of an electronically controlled balance involves the use of a tare setting to offset such weights as that of the container used for a sample measurement. The tare offset necessitates an extended range that is frequently significantly larger than the range of weights to be measured.
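The screening of balance categories against a measurement requirement can be sketched as follows. The capacities follow Table 1; the readability figures are typical values for each class, supplied here for illustration rather than taken verbatim from the table:

```python
# Sketch: screening balance categories against a measurement requirement.
# Capacities follow Table 1; readabilities are typical illustrative values.

BALANCES = [
    # (name, max capacity in g, typical readability in g)
    ("Ultramicrobalance", 2.0, 1e-7),
    ("Microbalance", 20.0, 1e-6),
    ("Semimicrobalance", 200.0, 1e-5),
    ("Macroanalytical balance", 400.0, 1e-4),
    ("Precision balance", 30000.0, 1e-4),
]

def suitable(sample_mass_g, required_resolution_g):
    """Names of balance types whose capacity and readability both suffice."""
    return [name for name, capacity, readability in BALANCES
            if capacity >= sample_mass_g and readability <= required_resolution_g]

# A 50-g sample weighed to 0.1 mg: semimicro, macroanalytical, and
# precision balances all qualify.
print(suitable(50.0, 1e-4))
```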
MASS MEASUREMENT PROCESS ASSURANCE

An instrument is characterized by its capability to deliver a result reproducibly with a given readability. We often refer to a calibrated instrument; in reality, however, there is no such thing as a calibrated balance per se. There are weight standards that are used to calibrate a weight measurement procedure, but that procedure can and should include everything from operator behavior patterns to systematic instrumental responses. In other words, it is the measurement process, not the balance itself, that must be calibrated. The basic maxims for evaluating and calibrating a mass measurement process are easily translated to any quantitative measurement in the laboratory to which some standards should be attached for purposes of reporting, quality assurance, and reproducibility. While certain industrial, government, and even university programs have established measurement assurance programs [e.g., as required for International Standards Organization
(ISO) 9000/9001 certification], the application of established standards to research laboratories is not always well defined. In general, it is the investigator who bears the responsibility for applying standards when making measurements and reporting results. Where no quality assurance programs are mandated, some laboratories may wish to institute a voluntary accreditation program. NIST operates the National Voluntary Laboratory Accreditation Program [NVLAP, telephone number (301) 975-4042] to assist such laboratories in achieving self-imposed accreditation (Harris, 1993).

The ISO Guide on the Expression of Uncertainty in Measurements (1992) identified and recommended a standardized approach for expressing the uncertainty of results. The standard was adopted by NIST and is published in NIST Technical Note 1297 (1994). The NIST publication simplifies the technical aspects of the standard. It establishes two categories of uncertainty, type A and type B. Type A uncertainty contains factors associated with random variability, and is identified solely by statistical analysis of measurement data. Type B uncertainty consists of all other sources of variability; scientific judgment alone quantifies this uncertainty type (Clark, 1994). The process uncertainty under the ISO recommendation is defined as the square root of the sum of the squares of the standard deviations due to all contributing factors. At a minimum, the process variation uncertainty should consist of the standard deviation of the mass standards used (s), the standard deviation of the measurement process (sP), and the estimated standard deviation due to type B uncertainty (uB). The overall process uncertainty (combined standard uncertainty) is then uc = [(s)² + (sP)² + (uB)²]^1/2 (Everhart, 1995). This combined uncertainty value is multiplied by the coverage factor (usually 2) to report the expanded uncertainty (U) at the 95% (2-sigma) confidence level.
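The combined and expanded uncertainty computation can be sketched directly; the example numbers are illustrative only:

```python
# Sketch of the combined and expanded uncertainty defined above:
# u_c = [(s)^2 + (s_P)^2 + (u_B)^2]^(1/2), U = k * u_c. Example
# numbers are illustrative, not from the source.
import math

def combined_uncertainty(s_standard, s_process, u_type_b):
    """Root-sum-of-squares combination of the contributing standard deviations."""
    return math.sqrt(s_standard**2 + s_process**2 + u_type_b**2)

def expanded_uncertainty(u_c, coverage_factor=2.0):
    """Expanded uncertainty U; k = 2 gives roughly the 95% confidence level."""
    return coverage_factor * u_c

u_c = combined_uncertainty(0.03, 0.04, 0.12)  # all in mg
print(f"u_c = {u_c:.2f} mg, U = {expanded_uncertainty(u_c):.2f} mg")  # 0.13 and 0.26
```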
NIST adopted the 2-sigma level for stating uncertainties in January 1994; uncertainty statements from NIST prior to this date were based on the 3-sigma (99%) confidence level. More detailed guidelines for the computation and expression of uncertainty, including a discussion of scatter analysis and error propagation, are provided in NIST Technical Note 1297 (1994). This document has been adopted by the CIPM, and it is available online (see Internet Resources).

J. Everhart (JTI Systems, Inc.) has proposed a process measurement assurance program that affords a powerful, systematic tool for accumulating meaningful data and insight on the uncertainties associated with a measurement procedure, and further helps to improve measurement procedures and data quality by integrating a calibration program with day-to-day measurements (Everhart, 1988). An added advantage of adopting such an approach is that procedural errors, instrument drift or malfunction, or other quality-reducing factors are more likely to be caught quickly. The essential points of Everhart's program are summarized here.

1. Initial measurements are made by metrology specialists or experts in the measurement using a control standard to establish reference confidence limits.
Figure 2. The process measurement assurance program control chart, identifying contributions to the errors associated with any measurement process. (After Everhart, 1988.)
2. Technicians or operators measure the control standard prior to each significant event as determined by the principal investigator (say, an experiment or even a workday).

3. Technicians or operators measure the control standard again after each significant event.

4. The data are recorded and the control measurements checked against the reference confidence limits. The errors are analyzed and categorized as systematic (bias), random variability, and overall measurement system error. The results are plotted over time to yield a chart like that shown in Figure 2.

It is clear how adopting such a discipline and monitoring the charted standard measurement data will quickly identify problems. Essential practices to institute with any measurement assurance program are to apply an external check on the program (e.g., a round robin), to have weight standards recalibrated periodically while surveillance programs are in place, and to maintain a separate calibrated weight standard that is not used as frequently as the working standards (Harris, 1996). These practices will ensure both accuracy and traceability in the measurement process. The knowledge of error in measurements and uncertainty estimates can immediately improve a quantitative measurement process. By establishing and implementing standards, higher-quality data and greater confidence in the measurements result. Standards established in industrial or federal settings should be applied in a research environment to improve data quality. The measurement of mass is central to the analysis of materials properties, so the importance of establishing and reporting uncertainties and confidence limits along with the measured results cannot be overstated. Accurate record keeping and data analysis can help investigators identify and correct such problems as bias, operator error, and instrument malfunction before they do any significant harm.
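The limit-checking step of such a program can be sketched as follows. The names and the three-sigma limits are our own illustration, not Everhart's published procedure:

```python
# Sketch of the limit check: derive reference confidence limits from the
# initial expert measurements of the control standard, then flag later
# control measurements that fall outside them. Names and limits are our
# own illustration.
from statistics import mean, stdev

def control_limits(reference_measurements, k=3.0):
    """Center line and lower/upper limits (mean +/- k standard deviations)."""
    m = mean(reference_measurements)
    s = stdev(reference_measurements)
    return m - k * s, m, m + k * s

def out_of_control(measurement, limits):
    """True if a control measurement falls outside the reference limits."""
    lower, _, upper = limits
    return not (lower <= measurement <= upper)

# Expert reference measurements of a 100-g control standard (g):
reference = [100.0002, 100.0001, 99.9999, 100.0000, 100.0003, 99.9998]
limits = control_limits(reference)
print(out_of_control(100.0004, limits))  # within limits -> False
print(out_of_control(100.0030, limits))  # drifted process -> True
```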
ACKNOWLEDGMENTS

The authors gratefully acknowledge the contribution of Georgia Harris of the NIST Office of Weights and Measures for providing information, resources, guidance, and direction in the preparation of this unit and for reviewing the completed manuscript for accuracy. We also wish to thank Dr. John Clark for providing extensive resources that were extremely valuable in preparing the unit.
LITERATURE CITED

ASTM. 1985. Standard Practice for the Evaluation of Single-Pan Mechanical Balances, Standard E319 (reapproved 1993). American Society for Testing and Materials, West Conshohocken, Pa.

ASTM. 1988a. Standard Test Method for Equal-Arm Balances, Standard E1270 (reapproved 1993). American Society for Testing and Materials, West Conshohocken, Pa.

ASTM. 1988b. Standard Method of Testing Top-Loading, Direct-Reading Laboratory Scales and Balances, Standard E898 (reapproved 1993). American Society for Testing and Materials, West Conshohocken, Pa.

ASTM. 1997. Standard Specification for Laboratory Weights and Precision Mass Standards, Standard E617 (originally published 1978). American Society for Testing and Materials, West Conshohocken, Pa.

ASTM. 1999. Standard Practices for Force Verification of Testing Machines, Standard E4-99. American Society for Testing and Materials, West Conshohocken, Pa.

Baltes, H., Gopel, W., and Hesse, J. (eds.) 1998. Sensors Update, Vol. 4. Wiley-VCH, Weinheim, Germany.

Clark, J. P. 1994. Identifying and managing mass measurement errors. In Proceedings of the Weighing, Calibration, and Quality Standards Conference in the 1990s, Sheffield, England, 1994.

Everhart, J. 1988. Process Measurement Assurance Program. JTI Systems, Albuquerque, N.M.
Everhart, J. 1995. Determining mass measurement uncertainty. Cal. Lab. May/June 1995.

Feder, R. and Nowick, A. S. 1957. Use of Thermal Expansion Measurements to Detect Lattice Vacancies Near the Melting Point of Pure Lead and Aluminum. Phys. Rev. 109(6): 1959–1963.

Harris, G. L. 1993. Ensuring accuracy and traceability of weighing instruments. ASTM Standardization News, April 1993.

Harris, G. L. 1996. Answers to commonly asked questions about mass standards. Cal. Lab. Nov./Dec. 1996.

Kupper, W. E. 1990. Honest weight—limits of accuracy and practicality. In Proceedings of the 1990 Measurement Science Conference, Anaheim, Calif.

Kupper, W. E. 1997. Laboratory balances. In Analytical Instrumentation Handbook, 2nd ed. (G. W. Ewing, ed.). Marcel Dekker, New York.

Kupper, W. E. 1999. Verification of high-accuracy weighing equipment. In Proceedings of the 1999 Measurement Science Conference, Anaheim, Calif.

Lide, D. R. 1999. CRC Handbook of Chemistry and Physics, 80th ed. CRC Press, Boca Raton, Fla.

NIST. 1994. Guidelines for Evaluating and Expressing the Uncertainty of NIST Measurement Results, NIST Technical Note 1297. U.S. Department of Commerce, Gaithersburg, Md.

NIST. 1999. Specifications, Tolerances, and Other Technical Requirements for Weighing and Measuring Devices, NIST Handbook 44. U.S. Department of Commerce, Gaithersburg, Md.

OIML. 1994. Weights of Classes E1, E2, F1, F2, M1, M2, M3: Recommendation R 111, Edition 1994(E). Bureau International de Métrologie Légale, Paris.

Simmons, R. O. and Balluffi, R. W. 1959. Measurements of Equilibrium Vacancy Concentrations in Aluminum. Phys. Rev. 117(1): 52–61.

Simmons, R. O. and Balluffi, R. W. 1961. Measurement of Equilibrium Concentrations of Lattice Vacancies in Gold. Phys. Rev. 125(3): 862–872.

Simmons, R. O. and Balluffi, R. W. 1962. Measurement of Equilibrium Concentrations of Vacancies in Copper. Phys. Rev. 129(4): 1533–1544.

KEY REFERENCES

ASTM, 1985, 1988a, 1988b (as appropriate for the type of balance used). See above.

These documents delineate the recommended procedure for the actual calibration of balances used in the laboratory.

OIML, 1994. See above.

This document is the basis for the establishment of an international standard for metrological control. A draft document designated TC 9/SC 3/N 1 is currently under review for consideration as an international standard for testing weight standards.

Everhart, 1988. See above.

The comprehensive yet easy-to-implement program described in this reference is a valuable suggestion for the implementation of a quality assurance program for everything from laboratory research to industrial production.

INTERNET RESOURCES

http://www.oiml.org OIML Home Page. General information on the International Organization for Legal Metrology.

http://www.usp.org United States Pharmacopeia Home Page. General information about the program used by many disciplines to establish standards.

http://www.nist.gov/owm Office of Weights and Measures Home Page. Information on the National Conference on Weights and Measures and laboratory metrology.

http://www.nist.gov/metric NIST Metric Program Home Page. General information on the metric program, including on-line publications.

http://physics.nist.gov/Pubs/guidelines/contents.html NIST Technical Note 1297: Guidelines for Evaluating and Expressing the Uncertainty of NIST Measurement Results.

http://www.astm.org American Society for Testing and Materials Home Page. Information on ASTM committees and standards and ASTM publication-ordering services.

http://iso.ch International Standards Organization (ISO) Home Page. Information and calendar on the ISO committee and certification programs.

http://www.ansi.org American National Standards Institute Home Page. Information on ANSI programs, standards, and committees.

http://www.quality.org/ Quality Resources On-line. Resource for quality-related information and groups.

http://www.fasor.com/iso25 ISO Guide 25. International list of accreditation bodies, standards organizations, and measurement and testing laboratories.

DAVID DOLLIMORE
The University of Toledo
Toledo, Ohio

ALAN C. SAMUELS
Edgewood Chemical Biological Center
Aberdeen Proving Ground, Maryland
THERMOMETRY

DEFINITION OF THERMOMETRY AND TEMPERATURE: THE CONCEPT OF TEMPERATURE

Thermometry is the science of measuring temperature, and thermometers are the instruments used to measure temperature. Temperature must be regarded as the scientific measure of "hotness" or "coldness." This unit is concerned with the measurement of temperatures in materials of interest to materials science, and the notion of temperature is thus limited in this discussion to that which applies to materials in the solid, liquid, or gas state (as opposed to the so-called temperature associated with ionized gases and plasmas, which is no longer limited to a measure of the internal kinetic energy of the constituent atoms).

A brief excursion into the history of temperature measurement will reveal that the measurement of temperature actually preceded the modern definition of temperature and a temperature scale. Galileo in 1594 is usually credited with the invention of a thermometer, in a form that indicated the expansion of air as the environment became hotter (Middleton, 1966). This instrument was called a thermoscope, and consisted of air trapped in a bulb by a column of liquid (Galileo used water) in a long tube attached to the bulb. It can properly be called an air thermometer when a scale is added to measure the expansion, and such an instrument was described by Telioux in 1611. Variation in atmospheric pressure would cause the thermoscope to develop different readings, as the liquid was not sealed into the tube and one surface of the liquid was open to the atmosphere. The simple expedient of sealing the instrument, so that the liquid and gas were contained in the tube, really marks the invention of the glass thermometer. By making the diameter of the tube small, so that the volume of the gas was considerably reduced, the liquid dilation in these sealed instruments could be used to indicate the temperature. Such a thermometer was used by Ferdinand II, Grand Duke of Tuscany, about 1654. Fahrenheit eventually substituted mercury for the "spirits of wine" earlier used as the working fluid, because mercury's thermal expansion with temperature is more nearly linear. Temperature scales were then invented using two selected fixed points, usually the ice point and the blood point or the ice point and the boiling point.
THE THERMODYNAMIC TEMPERATURE SCALE

The starting point for the thermodynamic treatment of temperature is to state that it is the property that determines in which direction energy will flow when an object is in contact with another object. Heat flows from a higher-temperature object to a lower-temperature object. When two objects have the same temperature, there is no flow of heat between them and the objects are said to be in thermal equilibrium. This forms the basis of the Zeroth Law of thermodynamics. The First Law of thermodynamics stipulates that energy must be conserved during any process. The Second Law introduces the concepts of spontaneity and reversibility; for example, heat flows spontaneously from a higher-temperature system to a lower-temperature one. By considering the direction in which processes occur, the Second Law implicitly demands the passage of time, leading to the definition of entropy. Entropy, S, is defined as the thermodynamic state function of a system for which dS ≥ dq/T, where q is the heat and T is the temperature. When the equality holds, the process is said to be reversible, whereas the inequality holds in all known processes (i.e., all known processes occur irreversibly). It should be pointed out that dS (and hence the ratio dq/T in a reversible process) is an exact differential, whereas dq is not; the flow of heat in an irreversible process is path dependent. The Third Law defines the absolute zero point of the thermodynamic temperature scale, and further stipulates that no process can reduce the temperature of a macroscopic system to this point. The laws of thermodynamics are defined in detail in all introductory texts on the topic (e.g., Rock, 1983).

A consideration of the efficiency of heat engines leads to a definition of the thermodynamic temperature scale. A heat engine is a device that converts heat into work. Such a process of producing work in a heat engine must be spontaneous. It is necessary, then, that the flow of energy from the hot source to the cold sink be accompanied by an overall increase in entropy. Suppose first that in such a hypothetical engine, heat |q| is extracted from a hot source of temperature Th and converted completely to work; this is depicted in Figure 1.

Figure 1. Change in entropy when heat is completely converted into work.

The change in entropy, ΔS, is then:

ΔS = −|q|/Th    (1)
This value of ΔS is negative, and the process is nonspontaneous. With the addition of a cold sink (see Fig. 2), the removal of |qh| from the hot source changes its entropy by −|qh|/Th, and the transfer of |qc| to the cold sink increases its entropy by |qc|/Tc.

Figure 2. Change in entropy when some heat from the hot sink is converted into work and some into a cold sink.

The overall entropy change is:

ΔS = −|qh|/Th + |qc|/Tc    (2)

ΔS is then greater than zero if:

|qc| ≥ (Tc/Th)|qh|    (3)
and the process is spontaneous. The maximum work of the engine can then be given by:

|wmax| = |qh| − |qc,min| = |qh| − (Tc/Th)|qh| = [1 − (Tc/Th)]|qh|    (4)

The maximum possible efficiency of the engine is:

erev = |wmax|/|qh|    (5)

whence, by the previous relationship, erev = 1 − (Tc/Th). If the engine is working reversibly, ΔS = 0 and:

|qc|/|qh| = Tc/Th    (6)

Kelvin used this relationship to define the thermodynamic temperature scale using the ratio of the heat withdrawn from the hot sink to the heat supplied to the cold sink. The zero of the thermodynamic temperature scale is the value of Tc at which the Carnot efficiency equals 1 and the work output equals the heat supplied (see, e.g., Rock, 1983); then, for erev = 1, T = 0. If now a fixed point, such as the triple point of water, is chosen for convenience, and this temperature T3 is set as 273.16 K to make the Kelvin equivalent in magnitude to the currently used Celsius degree, then:

Tc = (|qc|/|qh|) T3    (7)
The importance of this definition is that the temperature is defined independently of the working substance. The perfect-gas temperature scale is independent of the identity of the gas and is identical to the thermodynamic temperature scale. This result is due to the observation of Charles that, for a sample of gas subjected to a constant low pressure, the volume, V, varies linearly with temperature whatever the identity of the gas (see, e.g., Rock, 1983; McGee, 1988). Thus V = constant × (θ + 273.15°C) at constant pressure, where θ denotes the temperature on the Celsius scale and T = (θ + 273.15) is the temperature on the Kelvin, or absolute, scale. The volume extrapolates to zero on cooling at θ = −273.15°C; this is termed the absolute zero. It should be noted that it is not possible to cool real gases to zero volume, because they condense to a liquid or a solid before absolute zero is reached.
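Equations 4 to 7 admit a short worked sketch. The function names and numerical values here are illustrative, not from the source:

```python
# Worked sketch of Equations 4 to 7: the maximum (Carnot) efficiency and
# Kelvin's definition of temperature from heat ratios. Names and numbers
# are illustrative.

def carnot_efficiency(t_cold, t_hot):
    """Reversible-engine efficiency, e_rev = 1 - (Tc/Th)."""
    return 1.0 - t_cold / t_hot

def temperature_from_heats(q_cold, q_hot, t_ref=273.16):
    """Kelvin's definition, T = (|q_c|/|q_h|) * T3, with the reference
    (triple point of water) fixed at 273.16 K."""
    return (q_cold / q_hot) * t_ref

print(carnot_efficiency(300.0, 600.0))      # 0.5
print(temperature_from_heats(50.0, 100.0))  # 136.58
```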
DEVELOPMENT OF THE INTERNATIONAL TEMPERATURE SCALE OF 1990

In 1927, the International Conference of Weights and Measures approved the establishment of an International Temperature Scale (ITS, 1927). In 1960, the name was changed to the International Practical Temperature Scale (IPTS). Revisions of the scale took place in 1948, 1954, 1960, 1968, and 1990. Six international conferences under the title "Temperature, Its Measurement and Control in Science and Industry" were held in 1941, 1955, 1962, 1972, 1982, and 1992 (Wolfe, 1941; Herzfeld, 1955, 1962; Plumb, 1972; Billing and Quinn, 1975; Schooley, 1982, 1992). The latest revision, in 1990, again changed the name of the scale, to the International Temperature Scale of 1990 (ITS-90). The necessary features required by a temperature scale are (Hudson, 1982):

1. Definition;
2. Realization;
3. Transfer; and
4. Utilization.
The ITS-90 temperature scale is the best available approximation to the thermodynamic temperature scale. The following points deal with the definition and realization of the IPTS and ITS scales as they were developed over the years.

1. Fixed points are used based on thermodynamic invariant points, such as freezing points and triple points of specific systems. They are classified as (a) defining (primary) and (b) secondary, depending on the system of measurement and/or the precision to which they are known.

2. The instruments used in interpolating temperatures between the fixed points are specified.

3. The equations used to calculate the intermediate temperatures between the defining temperatures are agreed upon; such equations should pass through the defining fixed points.

In given applications, use of the agreed IPTS instruments is often not practicable. It is then necessary to transfer the measurement from the IPTS and ITS method to another, more practical temperature-measuring instrument. In such a transfer, accuracy is necessarily reduced, but such transfers are required to allow utilization of the scale.

The IPTS-27 Scale

The IPTS scale as originally set out defined four temperature ranges, specified three instruments for measuring temperatures, and provided an interpolating equation for each range. It should be noted that this is now called the IPTS-27 scale even though the 1927 version was called ITS, and two revisions took place before the name was changed to IPTS.
Range I: The oxygen point to the ice point (−182.97° to 0°C);
Range II: The ice point to the aluminum point (0° to 660°C);
Range III: The aluminum point to the gold point (660° to 1063.0°C);
Range IV: Above the gold point (1063.0°C and higher).

The oxygen point is defined as the temperature at which liquid and gaseous oxygen are in equilibrium, and the ice and metal points are defined as the temperatures at which the solid and liquid phases of the material are in equilibrium. The platinum resistance thermometer was used for ranges I and II, the platinum versus platinum (90%)/rhodium (10%) thermocouple was used for range III, and the optical pyrometer was used for range IV. In subsequent IPTS scales (and ITS-90), these instruments are still used, but the interpolating equations and limits are modified. The IPTS-27 was based on the ice point and the steam point as true temperatures on the thermodynamic scale.

Subsequent Revision of the Temperature Scales Prior to That of 1990

In 1954, the triple point of water and the absolute zero were used as the two points defining the thermodynamic scale. This followed a proposal originally advocated by Kelvin. In 1975, the Kelvin was adopted as the standard temperature unit. The symbol T represents the thermodynamic temperature with the unit Kelvin. The Kelvin is given the symbol K, and is defined by setting the triple point of water equal to 273.16 K. In practice, for historical reasons, the relationship between T and the Celsius temperature (t) is defined as t = T − 273.15 (the fixed points on the Celsius scale are water's melting point and boiling point). By definition, the degree Celsius (°C) is equal in magnitude to the Kelvin. The IPTS-68 was amended in 1975, and a set of defining fixed points (involving hydrogen, neon, argon, oxygen, water, tin, zinc, silver, and gold) was listed. The IPTS-68 had four ranges:

Range I: 13.8 to 273.15 K, measured using a platinum resistance thermometer. This range was divided into four parts.
Part A: 13.81 to 17.042 K, determined using the triple point of equilibrium hydrogen and the boiling point of equilibrium hydrogen.
Part B: 17.042 to 54.361 K, determined using the boiling point of equilibrium hydrogen, the boiling point of neon, and the triple point of oxygen.
Part C: 54.361 to 90.188 K, determined using the triple point of oxygen and the boiling point of oxygen.
Part D: 90.188 to 273.15 K, determined using the boiling point of oxygen and the boiling point of water.
Range II: 273.15 to 903.89 K, measured using a platinum resistance thermometer, using the triple point
of water, the boiling point of water, the freezing point of tin, and the freezing point of zinc.
Range III: 903.89 to 1337.58 K, measured using a platinum versus platinum (90%)/rhodium (10%) thermocouple, using the antimony point and the freezing points of silver and gold, with cross-reference to the platinum resistance thermometer at the antimony point.
Range IV: All temperatures above 1337.58 K. This is the gold point (at 1064.43°C), but no particular thermal radiation instrument is specified.
It should be noted that temperatures in range IV are defined by fundamental relationships, whereas the other three IPTS defining equations are not fundamental. It must also be stated that the IPTS-68 scale is not defined below the triple point of hydrogen (13.81 K).

The International Temperature Scale of 1990
The latest scale is the International Temperature Scale of 1990 (ITS-90). Figure 3 shows the various temperature ranges set out in ITS-90. T90 (meaning the temperature defined according to ITS-90) is stipulated from 0.65 K to the highest temperature using various fixed points and the helium vapor-pressure relations. In Figure 3, these "fixed points" are set out in the diagram to the nearest integer. A review has been provided by Swenson (1992). In ITS-90 an overlap exists between the major ranges for three of the four interpolation instruments, and in addition there are eight overlapping subranges. This represents a change from the IPTS-68. Alternative definitions of the scale exist for different temperature ranges and types of interpolation instruments. Aspects of the temperature ranges and the interpolating instruments can now be discussed.

Helium Vapor Pressure (0.65 to 5 K)
The helium isotopes 3He and 4He have normal boiling points of 3.2 K and 4.2 K, respectively, and both remain liquids to T = 0. The helium vapor pressure-temperature relationship provides a convenient thermometer in this range.
Interpolating Gas Thermometer (3 to 24.5561 K)
An interpolating constant-volume gas thermometer (CVGT) using 4He as the gas is suggested, with calibrations at three fixed points (the triple point of neon, 24.6 K; the triple point of equilibrium hydrogen, 13.8 K; and the normal boiling point of 4He, 4.2 K).

Platinum Resistance Thermometer (13.8033 K to 961.78°C)
It should be noted that temperatures above 0°C are typically recorded in °C and not on the absolute scale. Figure 3 indicates the fixed points and the range over which the platinum resistance thermometer can be used. Swenson (1992) gives some details regarding common practice in using this thermometer. The physical requirements for platinum thermometers used at high temperatures and at low temperatures are different, and no single thermometer can be used over the entire range.
COMMON CONCEPTS
Figure 3. The International Temperature Scale of 1990 with some key temperatures noted. Ge diode range is shown for reference only; it is not defined in the ITS-90 standard. (See text for further details and explanation.)
Optical Pyrometry (above 961.78°C)
In this temperature range the silver, gold, or copper freezing point can be used as a reference temperature. The silver point at 961.78°C is at the upper end of the platinum resistance scale, because thermometers have stability problems (due to such effects as phase changes, changes in heat capacity, and degradation of welds and joints between constituent materials) above this temperature.
TEMPERATURE FIXED POINTS AND DESIRED CHARACTERISTICS OF TEMPERATURE MEASUREMENT PROBES

The Fixed Points
The ITS-90 scale utilizes various "defining fixed points." These are given below in order of increasing temperature. (Note: all substances except 3He are defined to be of natural isotopic composition.)

The vapor point of 3He between 3 and 5 K;
The triple point of equilibrium hydrogen at 13.8033 K (equilibrium hydrogen is defined as the equilibrium concentrations of the ortho and para forms of the substance);
An intermediate equilibrium hydrogen vapor point at 17 K;
The normal boiling point of equilibrium hydrogen at 20.3 K;
The triple point of neon at 24.5561 K;
The triple point of oxygen at 54.3584 K;
The triple point of argon at 83.8058 K;
The triple point of mercury at 234.3156 K;
The triple point of water at 273.16 K (0.01°C);
The melting point of gallium at 302.9146 K (29.7646°C);
The freezing point of indium at 429.7485 K (156.5985°C);
The freezing point of tin at 505.078 K (231.928°C);
The freezing point of zinc at 692.677 K (419.527°C);
The freezing point of aluminum at 933.473 K (660.323°C);
The freezing point of silver at 1234.93 K (961.78°C);
The freezing point of gold at 1337.33 K (1064.18°C);
The freezing point of copper at 1357.77 K (1084.62°C).

The reproducibility of the measurement (and/or known difficulties with the occurrence of systematic errors in measurement) dictates the property to be measured in each case on the scale. There remains, however, the question of choosing the temperature measurement probe: in other words, the best thermometer to be used. Each thermometer consists of a temperature-sensing device, an interpretation and display device, and a method of connecting one to the other.

The Sensing Element of a Thermometer
The desired characteristics for a sensing element always include:
1. An Unambiguous (Monotonic) Response with Temperature. This type of response is shown in Figure 4A. Note that the appropriate response need not be linear with temperature. Figure 4B shows an ambiguous response of the sensing property to temperature, where there may be more than one temperature at which a particular value of the sensing property occurs.
2. Sensitivity. There must be a high sensitivity (d) of the temperature-sensing property to temperature. Sensitivity is defined as the first derivative of the property (X) with respect to temperature: d = ∂X/∂T.
3. Stability. It is necessary for the sensing element to remain stable, with the same sensitivity, over a long time.
4. Cost. A relatively low cost (with respect to the project budget) is desirable.
5. Range. A wide range of temperature measurements makes for easy instrumentation.
6. Size. The element should be small (i.e., with respect to the sample size, to minimize heat transfer between the sample and the sensor).
7. Heat Capacity. A relatively small heat capacity is desirable, i.e., the amount of heat required to change the temperature of the sensor must not be too large.
8. Response. A rapid response is required (this is achieved in part by minimizing sensor size and heat capacity).
9. Usable Output. A usable output signal is required over the temperature range to be measured (i.e., one should maximize the dynamic range of the signal for optimal temperature resolution).

Naturally, all items are defined relative to the components of the system under study or the nature of the experiment being performed. It is apparent that in any single instrument, compromises must be made.

Figure 4. (A) An acceptable, unambiguous response of temperature-sensing element (X) versus temperature (T). (B) An unacceptable, ambiguous response of temperature-sensing element (X) versus temperature.

Readout Interpretation
Resolution of the temperature cannot be improved beyond that of the sensing device. Desirable features on the signal-receiving side include:
1. Sufficient sensitivity for the sensing element;
2. Stability with respect to time;
3. Automatic response to the signal;
4. Possession of data logging and archiving capabilities;
5. Low cost.

TYPES OF THERMOMETER

Liquid-Filled Thermometers
The discussion is limited to practical considerations. It should be noted, however, that the most common type of thermometer in this class is the mercury-filled glass thermometer. Overall there are two types of liquid-filled systems:
1. Systems filled with a liquid other than mercury;
2. Mercury-filled systems.
Both rely on the temperature being indicated by a change in volume. The lower range is dictated by the freezing point of the fill liquid; the upper range must be below the point at which the liquid is unstable, or where the expansion takes place in an unacceptably nonlinear fashion. The nature of the construction material is important. In many instances, the container is a glass vessel with expansion read directly from a scale. In other cases, it is a metal or ceramic holder attached to a capillary, which drives a Bourdon tube (a diaphragm or bellows device). In general, organic liquids have a coefficient of expansion some 8 times that of mercury, and temperature spans for accurate work of some 10° to 25°C are dictated by the size of the container bulb for the liquid. Organic fluids are available up to 250°C, while mercury-filled systems can operate up to 650°C. Liquid-filled systems are unaffected by barometric pressure changes. Certificates of performance can generally be obtained from the instrument manufacturers.

Gas-Filled Thermometers
In vapor-pressure thermometers, the container is partially filled with a volatile liquid. Temperature measurement at the bulb is conditioned by the fact that the interface between the liquid and the gas must be located at the point of measurement, and the container must represent the coolest point of the system. Notwithstanding the previous considerations applying to the temperature scale, in practice vapor-filled systems can be operated from −40° to 320°C. A further class of gas-filled systems is one that simply relies on the expansion of a gas. Such systems are based on Charles' Law:

P = KT/V     (8)

where P is the pressure, T is the temperature (in Kelvin), V is the volume, and K is a constant. The range of practical application is the widest of any filled system. The lowest temperature is that at which the gas becomes liquid. The
highest temperature depends on the thermal stability of the gas or the construction material.

Electrical-Resistance Thermometers
A resistance thermometer depends upon the electrical resistance of a conducting metal changing with the temperature. In order to minimize the size of the equipment, the resistance of the wire or film should be relatively high so that the resistance can easily be measured. The change in resistance with temperature should also be large. The most common material used is a platinum wire-wound element with a resistance of 100 Ω at 0°C. Calibration standards are essential (see the previous discussion on the international temperature standards). Commercial manufacturers will provide such instruments together with calibration details. Thin-film platinum elements may provide an alternative design feature and are priced competitively with the more common wire-wound elements. Nickel and copper resistance elements are also commercially available.

Thermocouple Thermometers
A thermocouple is an assembly of two wires of unlike metals joined at one end, where the temperature is to be measured. If the other end of one of the thermocouple wires leads to a second, similar junction that is kept at a constant reference temperature, then a temperature-dependent voltage, called the Seebeck voltage, develops (see, e.g., McGee, 1988). The constant-temperature junction is often kept at 0°C and is referred to as the cold junction. Tables are available of the EMF generated versus temperature when one junction is kept as a cold junction at 0°C for specified metal/metal thermocouple junctions. These scales may show a nonlinear variation with temperature, so it is essential that calibration be carried out using phase transitions (i.e., melting points or solid-solid transitions). Typical thermocouple systems are summarized in Table 1.
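For the platinum resistance elements described above, resistance is commonly converted to temperature with the Callendar-Van Dusen equation; the sketch below uses the standard IEC 60751 coefficients for a 100-Ω element above 0°C (the equation and coefficients are standard practice, not taken from this text):

```python
import math

# IEC 60751 Callendar-Van Dusen relation for platinum, valid 0 to 850 degC:
#   R(t) = R0 * (1 + A*t + B*t^2)
A = 3.9083e-3    # degC^-1
B = -5.775e-7    # degC^-2
R0 = 100.0       # resistance at 0 degC, ohms

def pt100_resistance(t):
    """Resistance (ohms) of a Pt100 element at t degC (t >= 0)."""
    return R0 * (1.0 + A * t + B * t * t)

def pt100_temperature(R):
    """Recover t from a measured resistance by solving the quadratic
    B*t^2 + A*t + (1 - R/R0) = 0 for its physical root."""
    disc = A * A - 4.0 * B * (1.0 - R / R0)
    return (-A + math.sqrt(disc)) / (2.0 * B)

R = pt100_resistance(100.0)   # about 138.5 ohms at 100 degC
t = pt100_temperature(R)      # recovers 100.0 degC
```

The quadratic inversion applies only above 0°C; below 0°C the standard relation gains an additional quartic term and is usually inverted numerically.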
Thermistors and Semiconductor-Based Thermometers
There are various types of semiconductor-based thermometers. Some semiconductors used for temperature measurements are called thermistors or resistive temperature detectors (RTDs). Materials can be classified as electrical conductors, semiconductors, or insulators depending on their electrical conductivity. Semiconductors have resistivities in the range of 10 to 10⁶ Ω-cm. The resistivity changes with temperature, and the logarithm of the resistance plotted against the reciprocal of the absolute temperature is often linear. The actual value for the thermistor can be fixed by deliberately introducing impurities. Typical materials used are oxides of nickel, manganese, copper, titanium, and other metals that are sintered at high temperatures. Most thermistors have a negative temperature coefficient, but some are available with a positive temperature coefficient. A typical thermistor with a resistance of 1200 Ω at 40°C will have a 120-Ω resistance at 110°C. This represents a decrease in resistance by a factor of about 2 for every 20°C increase in temperature, which makes it very useful for measuring very small temperature spans. Thermistors are available in a wide variety of styles, such as small beads, discs, washers, or rods, and may be encased in glass or plastic or used bare as required by their intended application. Typical temperature ranges are from −30° to 120°C, with generally a much greater sensitivity than for thermocouples. The germanium diode is an important thermistor due to its responsivity range: germanium has a well-characterized response from 0.05 to 100 K, making it well suited to extremely low-temperature measurement applications. Germanium diodes are also employed in emissivity measurements (see the next section) due to their extremely fast response time; nanosecond time resolution has been reported (Xu et al., 1996).
Ruthenium oxide RTDs are also suitable for extremely low-temperature measurement and are found in many cryostat applications.
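The linear relation between the logarithm of resistance and reciprocal absolute temperature described above is commonly captured by a beta-parameter model. In the sketch below, the beta constant is fitted so that the model reproduces the 1200-Ω/120-Ω example quoted in the text; the model form is standard, but the fitted constant is illustrative:

```python
import math

# Beta-parameter model for an NTC thermistor: ln R is linear in 1/T,
#   R(T) = R0 * exp(B * (1/T - 1/T0)),  T in kelvin.
# R0 and T0 are taken from the example in the text (1200 ohms at 40 degC);
# B is fitted so that R falls to about 120 ohms at 110 degC.
R0, T0 = 1200.0, 40.0 + 273.15   # reference resistance (ohms) and temperature (K)
B = 3947.0                        # fitted material constant, kelvin

def thermistor_resistance(t_celsius):
    """Model resistance (ohms) at a given Celsius temperature."""
    T = t_celsius + 273.15
    return R0 * math.exp(B * (1.0 / T - 1.0 / T0))

def thermistor_temperature(R):
    """Invert the model: 1/T = 1/T0 + ln(R/R0)/B."""
    invT = 1.0 / T0 + math.log(R / R0) / B
    return 1.0 / invT - 273.15

r = thermistor_resistance(110.0)   # ~120 ohms, as quoted in the text
```

The steep exponential dependence is what gives thermistors their high sensitivity over small temperature spans.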
Table 1. Some Typical Thermocouple Systems (a)

System                  Use
Iron-Constantan         Used in dry reducing atmospheres up to 400°C;
                        general temperature scale: 0° to 750°C
Copper-Constantan       Used in slightly oxidizing or reducing atmospheres;
                        for low-temperature work: −200° to 50°C
Chromel-Alumel          Used only in oxidizing atmospheres;
                        temperature range: 0° to 1250°C
Chromel-Constantan      Not to be used in atmospheres that are strongly
                        reducing or contain sulfur compounds;
                        temperature range: −200° to 900°C
Platinum-Rhodium        Can be used as components for thermocouples
(or suitable alloys)    operating up to 1700°C

(a) Operating conditions and calibration details should always be sought from instrument manufacturers.

Radiation Thermometers
Radiation incident on matter must be reflected, transmitted, or absorbed to comply with the First Law of Thermodynamics. Thus the reflectance, r, the transmittance, t, and the absorbance, a, sum to unity (the reflectance [transmittance, absorbance] is defined as the ratio of the reflected [transmitted, absorbed] intensity to the incident intensity). This forms the basis of Kirchhoff's law of optics. Kirchhoff recognized that if an object were a perfect absorber, then in order to conserve energy, the object must also be a perfect emitter. Such a perfect absorber/emitter is called a "black body." Kirchhoff further recognized that the absorbance of a black body must equal its emittance, and that a black body would thus be characterized by a certain brightness that depends upon its temperature (Wolfe, 1998). Max Planck identified the quantized nature of the black body's emittance as a function of frequency by treating the emitted radiation as though it were the result of a linear field of oscillators with quantized energy states. Planck's famous black-body law relates the radiant
Figure 5. Spectral distribution of radiant intensity as a function of temperature.
intensity to the temperature as follows:

Iν = (2hν³n²/c²) · 1/[exp(hν/kT) − 1]     (9)
where h is Planck’s constant, v is the frequency, n is the refractive index of the medium into which the radiation is emitted, c is the velocity of light, k is Boltzmann’s constant, and T is the absolute temperature. Planck’s law is frequently expressed in terms of the wavelength of the radiation since in practice the wavelength is the measured quantity. Planck’s law is then expressed as Il ¼
C1 l5 ðeC2 =lT 1Þ
ð10Þ
where C1 (¼ 2hc2 =n2 ) and C2 (¼ hc=nk) are known as the first and second radiation constant, respectively. A plot of the Planck’s law intensity at various temperatures (Fig. 5) demonstrates the principle by which radiation thermometers operate. The radiative intensity of a black-body surface depends upon the viewing angle according to the Lambert cosine law, Iy ¼ I cos y. Using the projected area in a given direction given by dAcos y, the radiant emission per unit of projected area, or radiance (L), for a black body is given by L ¼ I cos y/cos y ¼ I. For real objects, L 6¼ I because factors such as surface shape, roughness, and composition affect the radiance. An important consequence of this is that emittance is a not an intrinsic materials property. The emissivity, emittance from a perfect material under ideal conditions (pure, atomically smooth and flat surface free of pores or oxide coatings), is a fundamental materials property defined as the ratio of the radiant flux density of the material to that of a black body under the same conditions (likewise, absorptivity and reflectivity are materials properties). Emissivity is extremely difficult to measure accurately, and emittance is often erroneously reported as emissivity in the literature (McGee, 1988). Both emittance and emissivity can be taken at a single
wavelength (spectral emittance), over a range of wavelengths (partial emittance), or over all wavelengths (total emittance). It is important to properly determine the emittance of any real material being measured in order to convert radiant intensity to temperature. For example, the spectral emittance of polished brass is 0.05, while that of oxidized brass is 0.61 (McGee, 1988). An important ramification of this effect is the fact that it is not generally possible to accurately measure the emissivity of polished, shiny surfaces, especially metallic ones, whose signal is dominated by reflectivity. Radiation thermometers measure the amount of radiation (in a selected spectral band; see below) emitted by the object whose temperature is to be measured. Such radiation can be measured from a distance, so there is no need for contact between the thermometer and the object. Radiation thermometers are especially suited to the measurement of moving objects, or of objects inside vacuum or pressure vessels. The types of radiation thermometers commonly available are: broadband thermometers, bandpass thermometers, narrow-band thermometers, ratio thermometers, optical pyrometers, and fiberoptic thermometers. Broadband thermometers have a response from 0.3 μm optical wavelength to an upper limit of 2.5 to 20 μm, governed by the lens or window material. Bandpass thermometers have lenses or windows selected to view only selected portions of the spectrum. Narrow-band thermometers respond to an extremely narrow range of wavelengths. A ratio thermometer measures radiated energy in two narrow bands and calculates the ratio of intensities at the two energies. Optical pyrometers are really a special form of narrow-band thermometer, measuring radiation from a target in a narrow band of visible wavelengths centered at 0.65 μm in the red portion of the spectrum. A fiberoptic thermometer uses a light guide to guide the radiation from the target to the detector.
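The conversion from measured radiance to temperature, and the importance of using the correct emittance, can be sketched from Equation 10. The brass emittance values (0.05 polished, 0.61 oxidized) are those quoted above; the inversion assumes a gray body with refractive index n = 1:

```python
import math

h = 6.62607015e-34   # Planck constant, J s
c = 2.99792458e8     # speed of light, m/s
k = 1.380649e-23     # Boltzmann constant, J/K

def planck_spectral_intensity(wavelength, T, n=1.0):
    """Equation 10: I_lambda = C1 / (lambda^5 * (exp(C2/(lambda*T)) - 1))."""
    C1 = 2.0 * h * c * c / (n * n)
    C2 = h * c / (n * k)
    return C1 / (wavelength**5 * math.expm1(C2 / (wavelength * T)))

def inferred_temperature(measured, wavelength, emittance):
    """Gray-body correction: solve I_bb(T) = measured/emittance for T
    by inverting Equation 10 (n = 1)."""
    C1 = 2.0 * h * c * c
    C2 = h * c / k
    I_bb = measured / emittance
    return C2 / (wavelength * math.log(1.0 + C1 / (I_bb * wavelength**5)))

# A 1000 K oxidized-brass surface viewed at 10 um: assuming the wrong
# (polished) emittance grossly overestimates the temperature.
lam = 10e-6
signal = 0.61 * planck_spectral_intensity(lam, 1000.0)
T_correct = inferred_temperature(signal, lam, 0.61)   # ~1000 K
T_wrong = inferred_temperature(signal, lam, 0.05)     # far too high
```

The closed-form inversion is possible here because a single wavelength is used; broadband instruments must integrate Equation 10 over their spectral response.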
An important consideration when making distance measurements is the so-called "atmospheric window." The 8- to 14-μm range is the most common region selected for optical pyrometric temperature measurement. The constituents of the atmosphere are relatively transparent in this region (that is, there are no infrared absorption bands from the most common atmospheric constituents, so no absorption or emission from the atmosphere is observed in this range).

TEMPERATURE CONTROL

It is not necessarily sufficient to measure temperature. In many fields it is also necessary to control the temperature. In an oven, a furnace, or a water bath, a constant temperature may be required. In other uses in analytical instrumentation, a more sophisticated temperature program may be required. This may take the form of a constant heating rate or may be much more complicated.
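Such a temperature program can be sketched as a setpoint schedule built from ramp and isothermal segments; the segment values here are hypothetical:

```python
def program_setpoint(t, segments, T_start=25.0):
    """Piecewise temperature program: each segment is (rate, duration),
    with rate in degC/min and duration in min. A rate of 0 is an
    isothermal hold. Returns the setpoint at time t (min)."""
    T = t_remaining = None  # placeholder names resolved below
    T, t_remaining = T_start, t
    for rate, duration in segments:
        if t_remaining <= duration:
            return T + rate * t_remaining
        T += rate * duration
        t_remaining -= duration
    return T  # program finished; hold at the final temperature

# Hypothetical program: ramp at 10 degC/min for 20 min, hold 30 min,
# then ramp at 5 degC/min for 10 min.
program = [(10.0, 20.0), (0.0, 30.0), (5.0, 10.0)]
print(program_setpoint(0.0, program))    # 25.0
print(program_setpoint(20.0, program))   # 225.0
print(program_setpoint(45.0, program))   # 225.0 (isothermal hold)
print(program_setpoint(60.0, program))   # 275.0
```

A controller (next section) then drives the measured temperature to follow this schedule.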
A temperature controller must:
1. receive a signal from which a temperature can be deduced;
2. compare it with the desired temperature; and
3. produce a means of correcting the actual temperature to move it towards the desired temperature.

The control action can take several forms. The simplest form is an on-off control. Power to a heater is turned on to reach a desired temperature but turned off when a certain temperature limit is reached. This cycling mode of control results in temperatures that oscillate between two set points once the desired temperature has been reached. A proportional control, in which the amount of temperature-correction power depends on the magnitude of the "error" signal, provides a better system. This may be based on a proportional band with integral control or derivative control. The use of a dedicated computer allows observations to be set (and corrected) at desired intervals and corresponding real-time plots of temperature versus time to be obtained.

The Bureau International des Poids et Mesures (BIPM) ensures worldwide uniformity of measurements and their traceability to the Système International (SI), and carries out measurement-related research. It is the proponent for the ITS-90, and the latest definitions can be found (in French and English) at http://www.bipm.fr. The ITS-90 site, at http://www.its-90.com, also has some useful information. The text of the document is reproduced here with the permission of Metrologia (Springer-Verlag).
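The on-off and proportional control actions described above can be sketched with a toy thermal model (all constants are hypothetical, chosen only to make the behavior visible):

```python
def on_off_power(T, setpoint):
    """On-off action: heater fully on below the setpoint, off above it.
    The controlled temperature therefore cycles about the setpoint."""
    return 1.0 if T < setpoint else 0.0

def proportional_power(T, setpoint, gain=0.05, max_power=1.0):
    """Proportional action: the correction scales with the error signal."""
    error = setpoint - T
    return min(max_power, max(0.0, gain * error))

def simulate(controller, setpoint, T0=20.0, steps=200):
    """Toy thermal model: each step the heater adds heat in proportion to
    the applied power, and Newtonian cooling relaxes T toward ambient."""
    T, ambient, history = T0, 20.0, []
    for _ in range(steps):
        power = controller(T, setpoint)
        T += 5.0 * power - 0.02 * (T - ambient)
        history.append(T)
    return history

# On-off control keeps crossing the setpoint; proportional control settles
# smoothly, though slightly below it (the steady-state offset inherent to
# pure proportional action, which integral action is added to remove).
oscillating = simulate(on_off_power, setpoint=100.0)
settled = simulate(proportional_power, setpoint=100.0)
```

The two traces illustrate why proportional (and, in practice, proportional-integral-derivative) control is preferred for analytical instrumentation.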
ACKNOWLEDGMENTS The series editor gratefully acknowledges Christopher Meyer, of the Thermometry group of the National Institute of Standards and Technology (NIST), for discussions and clarifications concerning the ITS-90 temperature standards.
LITERATURE CITED
Billing, B. F. and Quinn, T. J. 1975. Temperature Measurement, Conference Series No. 26. Institute of Physics, London.
Herzfeld, C. M. (ed.) 1955. Temperature: Its Measurement and Control in Science and Industry, Vol. 2. Reinhold, New York.
Herzfeld, C. M. (ed.) 1962. Temperature: Its Measurement and Control in Science and Industry, Vol. 3. Reinhold, New York.
Hudson, R. P. 1982. In Temperature: Its Measurement and Control in Science and Industry, Vol. 5, Part 1 (J. F. Schooley, ed.). Reinhold, New York.
ITS. 1927. International Committee of Weights and Measures. Conf. Gen. Poids. Mes. 7:94.
McGee, T. D. 1988. Principles and Methods of Temperature Measurement. John Wiley & Sons, New York.
Middleton, W. E. K. 1966. A History of the Thermometer and Its Use in Meteorology. Johns Hopkins University Press, Baltimore.
Plumb, H. H. (ed.) 1972. Temperature: Its Measurement and Control in Science and Industry, Vol. 4. Instrument Society of America, Pittsburgh.
Rock, P. A. 1983. Chemical Thermodynamics. University Science Books, Mill Valley, Calif.
Schooley, J. F. (ed.) 1982. Temperature: Its Measurement and Control in Science and Industry, Vol. 5. American Institute of Physics, New York.
Schooley, J. F. (ed.) 1992. Temperature: Its Measurement and Control in Science and Industry, Vol. 6. American Institute of Physics, New York.
Swenson, C. A. 1992. In Temperature: Its Measurement and Control in Science and Industry, Vol. 6 (J. F. Schooley, ed.). American Institute of Physics, New York.
Wolfe, Q. C. (ed.) 1941. Temperature: Its Measurement and Control in Science and Industry, Vol. 1. Reinhold, New York.
Wolfe, W. L. 1998. Introduction to Radiometry. SPIE Optical Engineering Press, Bellingham, Wash.
Xu, X., Grigoropoulos, C. P., and Russo, R. E. 1996. Nanosecond-time-resolution thermal emission measurement during pulsed excimer. Appl. Phys. A 62:51–59.
APPENDIX: TEMPERATURE-MEASUREMENT RESOURCES
Several manufacturers offer a significant level of expertise in the practical aspects of temperature measurement and can assist researchers in the selection of the most appropriate instrument for their specific task. A noteworthy source for thermometers, thermocouples, thermistors, and pyrometers is Omega Engineering (www.omega.com). Omega provides a detailed catalog of products interlaced with descriptive essays on the underlying principles and practical considerations. The Mikron Instrument Company (www.mikron.com) manufactures an extensive line of infrared temperature measurement instruments and calibration black-body sources. Graesby is also an excellent resource for extended-area calibration black-body sources. Inframetrics (www.inframetrics.com) manufactures a line of highly configurable infrared radiometers. All temperature measurement manufacturers offer calibration services with NIST-traceable certification. Germanium and ruthenium oxide RTDs are available from most manufacturers specializing in cryostat applications. Representative companies include Quantum Technology (www.quantum-technology.com) and Scientific Instruments (www.scientificinstruments.com). An extremely useful source for instrument interfacing (for automation and digital data acquisition) is National Instruments (www.natinst.com).

DAVID DOLLIMORE
The University of Toledo
Toledo, Ohio
ALAN C. SAMUELS Edgewood Chemical Biological Center Aberdeen Proving Ground, Maryland
SYMMETRY IN CRYSTALLOGRAPHY
INTRODUCTION
The study of crystals has fascinated humanity for centuries, with motivations ranging from scientific curiosity to the belief that they had magical powers. Early crystal science was devoted to descriptive efforts, limited to measuring interfacial angles and determining optical properties. Some investigators, such as Haüy, attempted to deduce the underlying atomic structure from the external morphology. These efforts were successful in determining the symmetry operations relating crystal faces and led to the theory of point groups, the assignment of all known crystals to only seven crystal systems, and extensive compilations of axial ratios and optical indices. Roentgen's discovery of x rays and Laue's subsequent discovery of the scattering of x rays by crystals revolutionized the study of crystallography: crystal structures, i.e., the relative location of atoms in space, could now be determined unequivocally. The benefits derived from this knowledge have enhanced fundamental science, technology, and medicine ever since and have directly contributed to the welfare of human society. This chapter is designed to introduce those with limited knowledge of space groups to a topic that many find difficult.

SYMMETRY OPERATORS
A crystalline material contains a periodic array of atoms in three dimensions, in contrast to the random arrangement of atoms in an amorphous material such as glass. The periodic repetition of a motif along a given direction in space within a fixed length t parallel to that direction constitutes the most basic symmetry operation. The motif may be a single atom, a simple molecule, or even a large, complex molecule such as a polymer or a protein. The periodic repetition in space along three noncollinear, noncoplanar vectors describes a unit parallelepiped, the unit cell, with periodically repeated lengths a, b, and c, the metric unit cell parameters (Fig. 1). The atomic content of this unit cell is the fundamental building block of the crystal structure. The macroscopic crystalline material results from the periodic repetition of this unit cell. The atoms within a unit cell may be related by additional symmetry operators.
Figure 1. A unit cell.
Proper Rotation Axes
A proper rotation axis, n, repeats an object every 2π/n radians. Only 1-, 2-, 3-, 4-, and 6-fold axes are consistent with the periodic, space-filling repetition of the unit cell. In contrast, molecular symmetry axes can have any value of n. Figure 2A illustrates the appearance of space that results from the action of a proper rotation axis on a given motif. Note that a 1-fold axis, i.e., rotation by 360°, is a legitimate symmetry operation. These objects retain their handedness. The reversal of the direction of rotation will superimpose the objects without removing them from the plane perpendicular to the rotation axis. After 2π radians the rotated motif superimposes directly on the initial object. The repetition of motifs by a proper rotation axis forms congruent objects.

Improper Rotation Axes
Improper rotation axes are compound symmetry operations consisting of rotation followed by inversion or mirror reflection. Two conventions are used to designate symmetry operators. The International or Hermann-Mauguin symbols are based on rotoinversion operations, and the Schönflies notation is based on rotoreflection operations. The former is the standard in crystallography, while the latter is usually employed in molecular spectroscopy. Consider the origin of a coordinate system, a b c, and an object located at coordinates x y z. Atomic coordinates are expressed as dimensionless fractions of the three-dimensional periodicities. From the origin, draw a vector to every point on the object at x y z, extend this vector the same length through the origin in the opposite direction, and mark off this length. Thus, for every x y z there will be a −x, −y, −z (written x̄, ȳ, z̄ in standard notation). This mapping creates a center of inversion, or center of symmetry, at the origin. This operation changes the handedness of an object, and the two symmetry-related objects are enantiomorphs. Figure 3A illustrates this operator.
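The restriction to 1-, 2-, 3-, 4-, and 6-fold axes (the crystallographic restriction) can be checked numerically: a rotation that maps a lattice onto itself must be representable by an integer matrix in the lattice basis, so its trace, 1 + 2 cos(2π/n), must be an integer. A sketch:

```python
import math

def rotation_trace(n):
    """Trace of a rotation by 2*pi/n about a fixed axis: 1 + 2*cos(2*pi/n).
    The trace is invariant under any change of basis."""
    return 1.0 + 2.0 * math.cos(2.0 * math.pi / n)

# In a lattice basis a symmetry rotation is an integer matrix, so its
# trace must be an integer; only n = 1, 2, 3, 4, 6 qualify:
allowed = [n for n in range(1, 13)
           if abs(rotation_trace(n) - round(rotation_trace(n))) < 1e-9]
print(allowed)  # [1, 2, 3, 4, 6]
```

A 5-fold axis, for example, has trace 1 + 2 cos 72° ≈ 1.618 and so cannot tile space periodically.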
It has the International or Hermann-Mauguin symbol 1̄, read as "one bar." This symbol can be interpreted as a 1-fold rotoinversion axis: i.e., an object is rotated 360° and is then inverted. Similarly, there are 2̄, 3̄, 4̄, and 6̄ axes (Fig. 2B). Consider the 2̄ operation: a 2-fold axis perpendicular to the ac plane rotates an object 180° and immediately inverts it through the origin, defined as the point of intersection of the plane and the 2-fold axis. The two objects are related by a mirror, and 2̄ is usually given the special symbol m. The object at x y z is reproduced at x, −y, z (Fig. 3B). The two objects created by inversion or mirror reflection cannot be superimposed by a proper rotation axis operation. They are related as the right hand is to the left. Such objects are known as enantiomorphs. The Schönflies notation is based on the compound operation of rotation and reflection, and the operation is designated ñ or Sn. The subscript n denotes the rotation 2π/n and S denotes the reflection operation (the German
Figure 2. Symmetry of space around the five proper rotation axes, giving rise to congruent objects (A), and the five improper rotation axes, giving rise to enantiomorphic objects (B). Note the symbols in the center of the circles. The dark circles for 2̄ and 6̄ denote mirror planes in the plane of the paper. Filled triangles are above the plane of the paper and open ones below. Courtesy of Buerger (1970).
word for mirror is spiegel). Consider a 2̃ axis perpendicular to the ac plane that contains an object above that plane at x y z. Rotate the object 180° and immediately reflect it through the plane. The positional parameters of this object are −x, −y, −z. The point of intersection of the 2-fold rotor and the plane is an inversion point, and the two objects are enantiomorphs. The special symbol i is assigned to 2̃ or S2, and is equivalent to 1̄ in the Hermann-Mauguin system. In this discussion only the International (Hermann-Mauguin) symbols will be used.

Screw Axes, Glide Planes
A rotation axis that is combined with translation is called a screw axis and is given the symbol nₜ. The subscript t denotes the fractional translation of the periodicity
Figure 3. (A) A center of symmetry; (B) mirror reflection. Courtesy of Buerger (1970).
SYMMETRY IN CRYSTALLOGRAPHY
parallel to the rotation axis n, where t = m/n, m = 1, . . ., n − 1. Consider a 2-fold screw axis parallel to the b-axis of a coordinate system. The 2-fold rotor acts on an object by rotating it 180°, immediately followed by a translation of ½ the b-axis periodicity. An object at x y z is generated at x̄, y + ½, z̄ by this 2₁ screw axis. All crystallographic symmetry operations must operate on a motif a sufficient number of times that the motif eventually coincides with the original object. This is not yet the case here. The screw operation has to be repeated again, resulting in an object at x, y + 1, z. Now the object is located one b-axis translation from the original object. Since this constitutes a periodic translation, b, the two objects are identical and the space has the proper symmetry 2₁. The possible screw axes are 2₁, 3₁, 3₂, 4₁, 4₂, 4₃, 6₁, 6₂, 6₃, 6₄, and 6₅ (Fig. 4). Note that the screw axes 3₁ and 3₂ are related as a right-handed thread is to a left-handed one. Similarly, this relationship holds for spaces exhibiting 4₁ and 4₃, 6₁ and 6₅, etc., symmetries. Note the symbols above the axes in Figure 4 that indicate the type of screw axis.

The combination of a mirror plane with translation parallel to the reflecting plane is known as a glide plane. Consider a coordinate system a b c in which the bc plane is a mirror. An object located at x y z is reflected, which would
Figure 5. A b glide plane perpendicular to the a-axis.
bring it temporarily to the position x̄ y z. However, it does not remain there but is translated by ½ of the b-axis periodicity to the point x̄, y + ½, z (Fig. 5). This operation must be repeated to satisfy the requirement of periodicity, so that the next operation brings the object to x, y + 1, z, which is identical to the starting position but one b-axis periodicity away. Note that the first operation produces an enantiomorphic object and the second operation reverses this handedness, making it congruent with the initial object. This glide operation is designated a b glide plane and has the symbol b. We could have moved the object parallel to the c-axis by ½ of the c-axis periodicity, as well as by the vector ½(b + c). The former symmetry operator is a c glide plane, denoted by c, and the latter is an n glide plane, symbolized by n. Note that in this example an a glide plane operation is meaningless. If the glide plane is perpendicular to the b-axis, then a, c, and n = ½(a + c) glide planes can exist. The extension to a glide plane perpendicular to the c-axis is obvious. One other glide operation needs to be described, the diamond glide d. It is characterized by the operations ¼(a + b), ¼(a + c), and ¼(b + c). The diagonal glide with translation ¼(a + b + c) can be considered part of the diamond glide operation. All of these operations must be applied repeatedly until the last object generated is identical with the object at the origin but one periodicity away. Symmetry-related positions, or equivalent positions, can be generated from geometrical considerations. However, the operations can also be represented by matrices operating on a given position. In general, one can write the matrix equation X′ = RX + T
Figure 4. The eleven screw axes. The pure rotation axes are also shown. Note the symbols above the axes. Courtesy of Azaroff (1968).
where X′ (x′ y′ z′) are the transformed coordinates, R is a rotation operator applied to X (x y z), and T is the translation operator. For the 1̄ operation, x y z → x̄ ȳ z̄, and in matrix formulation the transformation becomes

( x' )   ( -1  0  0 ) ( x )
( y' ) = (  0 -1  0 ) ( y )    (1)
( z' )   (  0  0 -1 ) ( z )
For an a glide plane perpendicular to the c-axis, x y z → x + ½, y, z̄, or in matrix notation

( x' )   ( 1  0  0 ) ( x )   ( 1/2 )
( y' ) = ( 0  1  0 ) ( y ) + (  0  )    (2)
( z' )   ( 0  0 -1 ) ( z )   (  0  )
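Symmetry operations of this affine form, X′ = RX + T, are easy to experiment with numerically. The sketch below (my own illustration, not from the text) applies the inversion of Equation 1 and the a glide of Equation 2 to a point in fractional coordinates, and confirms that applying a glide or a 2₁ screw twice produces a pure lattice translation:

```python
import numpy as np

def apply_op(R, T, x):
    """Apply the symmetry operation x' = R @ x + T in fractional coordinates."""
    return R @ x + T

x = np.array([0.10, 0.20, 0.30])

# Equation 1: inversion through the origin, x y z -> -x, -y, -z
print(apply_op(-np.eye(3), np.zeros(3), x))       # [-0.1 -0.2 -0.3]

# Equation 2: a glide perpendicular to c, x y z -> x + 1/2, y, -z
R = np.diag([1.0, 1.0, -1.0])
T = np.array([0.5, 0.0, 0.0])
x2 = apply_op(R, T, apply_op(R, T, x))
print(x2 - x)                                     # [1. 0. 0.]: a lattice translation

# A 2-fold screw axis along b, x y z -> -x, y + 1/2, -z, behaves the same way:
R21, T21 = np.diag([-1.0, 1.0, -1.0]), np.array([0.0, 0.5, 0.0])
print(apply_op(R21, T21, apply_op(R21, T21, x)) - x)  # [0. 1. 0.]
```

Applying each operator twice returns the motif to a point one full periodicity away, which is exactly the closure requirement stated in the text.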
(Hall, 1969; Burns and Glazer, 1978; Stout and Jensen, 1989; Giacovazzo et al., 1992).

POINT GROUPS

The symmetry of space about a point can be described by a collection of symmetry elements that intersect at that point. The point of intersection of the symmetry axes is the origin. The collection of crystallographic symmetry operators constitutes the 32 crystallographic point groups. The external morphology of three-dimensional crystals can be described by one of these 32 crystallographic point groups. Since they describe the interrelationship of faces on a crystal, the symmetry operators cannot contain translational components that refer to atomic-scale relations such as screw axes or glide planes. The point groups can be divided into (1) simple rotation groups and (2) higher symmetry groups. In (1), there exist only 2-fold axes or one unique symmetry axis higher than a 2-fold axis. There are 27 such point groups. In (2), no single unique axis exists, but more than one n-fold axis is present, n > 2. The simple rotation groups contain only one single n-fold axis. Thus, the point groups 1, 2, 3, 4, and 6 constitute the five pure rotation groups (Fig. 2). There are four distinct, unique rotoinversion groups: 1̄, 2̄ = m, 3̄, and 4̄. It has been shown that 2̄ is equivalent to a mirror, m, perpendicular to that axis, and the standard symbol for 2̄ is m. Group 6̄ is usually labeled 3/m and is assigned to group n/m. This last symbol will be encountered frequently. It means that there is an n-fold axis parallel to a given direction and that perpendicular to that direction a mirror plane or some other symmetry plane exists. Next are four unique point groups that contain a mirror perpendicular to the rotation axis: 2/m, 3/m = 6̄, 4/m, 6/m. There are four groups that contain mirrors parallel to a rotation axis: 2mm, 3m, 4mm, 6mm. An interesting change in notation has occurred. Why is 2mm and not simply 2m used, while 3m is correct?
Consider the intersection of two orthogonal mirrors. It is easy to show by geometry that the line of intersection is a 2-fold axis. It is particularly easy with matrix algebra (Stout and Jensen, 1989; Giacovazzo et al., 1992). Let the ab and ac coordinate planes be orthogonal mirror planes. The line of intersection is the a-axis. The multiplication of the respective mirror matrices yields the matrix representation of the 2-fold axis parallel to the a-axis:

( 1  0  0 ) ( 1  0  0 )   ( 1  0  0 )
( 0 -1  0 ) ( 0  1  0 ) = ( 0 -1  0 )    (3)
( 0  0  1 ) ( 0  0 -1 )   ( 0  0 -1 )

Thus, a combination of two intersecting orthogonal mirrors yields a 2-fold axis of symmetry, and similarly
the combination of a 2-fold axis lying in a mirror plane produces another mirror orthogonal to it. Let us examine 3m in a similar fashion. Let the 3-fold axis be parallel to the c-axis. A 3-fold symmetry axis demands that the a and b axes be of equal length and at 120° to each other. Let the mirror plane contain the c-axis and the direction perpendicular to the b-axis. The respective matrices are

( 0 -1  0 ) ( 1  0  0 )   ( -1  1  0 )
( 1 -1  0 ) ( 1 -1  0 ) = (  0  1  0 )    (4)
( 0  0  1 ) ( 0  0  1 )   (  0  0  1 )

and the product matrix represents a mirror plane containing the c-axis and the direction perpendicular to the a-axis. These two directions are at 60° to each other. Since 3-fold symmetry requires a mirror every 120°, this is not a new symmetry operator. In general, when n of an n-fold rotor is odd, no additional symmetry operators are generated, but when n is even a new symmetry operator comes into existence. One can combine two symmetry operators in an arbitrary manner with a third symmetry operator. Will this combination be a valid crystallographic point group? This complex problem was solved by Euler (Buerger, 1956; Azaroff, 1968; McKie and McKie, 1986). He derived the relation

cos A = [cos(β/2) cos(γ/2) + cos(α/2)] / [sin(β/2) sin(γ/2)]    (5)
where A is the angle between two rotation axes with rotation angles β and γ, and α is the rotation angle of the third axis. Consider the combination of two 2-fold axes with one 3-fold axis. We must determine the angle between a 2-fold and the 3-fold axis and the angle between the two 2-fold axes. Let A be the angle between the 2-fold and 3-fold axes. Applying the formula yields cos A = 0, or A = 90°. Similarly, let B be the angle between the two 2-fold axes. Then cos B = 1/2 and B = 60°. Thus, the 2-fold axes are orthogonal to the 3-fold axis and at 60° to each other, consistent with 3-fold symmetry. The crystallographic point group is 32. Note again that the symbol is 32, while for a 4-fold axis combined with an orthogonal 2-fold axis the symbol is 422. So far 17 point groups have been derived. The next 10 groups are known as the dihedral point groups. There are four point groups containing n 2-fold axes perpendicular to a principal axis: 222, 32, 422, and 622. (Note that for n = 3 only one unique 2-fold axis is shown.) These groups can be combined with diagonal mirrors that bisect the 2-fold axes, yielding the two additional groups 4̄2m and 3̄m. Four additional groups result from the addition of a mirror perpendicular to the principal axis, 2/m 2/m 2/m, 6̄m2, 4/m 2/m 2/m, and 6/m 2/m 2/m, making a total of 27. We now consider the five groups that contain more than one axis of at least 3-fold symmetry: 23, 4̄3m, m3̄, 432, and m3̄m. Note that the position of the 3-fold axis is in the second position of the symbols. This indicates that these point groups belong to the cubic crystal system. The stereographic projections of the 32 point groups are shown in Figure 6 (International Union of Crystallography, 1952).
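Both the mirror-product argument of Equation 3 and Euler's relation, Equation 5, can be checked with a few lines of code. The sketch below is an illustration added here (the helper name and the angle convention, with A the angle between the axes of rotation angles β and γ, are assumed as read from the text); it reproduces cos A = 0 and cos B = 1/2 for point group 32:

```python
from math import acos, cos, degrees, radians, sin

import numpy as np

# Equation 3: the product of two orthogonal mirrors is a 2-fold axis.
m_perp_c = np.diag([1, 1, -1])          # mirror in the ab plane
m_perp_b = np.diag([1, -1, 1])          # mirror in the ac plane
print(np.diag(m_perp_b @ m_perp_c))     # [ 1 -1 -1]: a 2-fold axis along a

# Equation 5: angle A between the axes with rotation angles beta and gamma,
# given a third axis with rotation angle alpha (all angles in degrees).
def inter_axial_angle(alpha, beta, gamma):
    a, b, g = (radians(v) for v in (alpha, beta, gamma))
    return degrees(acos((cos(b / 2) * cos(g / 2) + cos(a / 2))
                        / (sin(b / 2) * sin(g / 2))))

# Point group 32: one 3-fold (120 deg) axis and two 2-fold (180 deg) axes.
print(round(inter_axial_angle(180, 180, 120)))  # 90: 2-fold to 3-fold
print(round(inter_axial_angle(120, 180, 180)))  # 60: 2-fold to 2-fold
```

Trying an inadmissible combination, such as a 5-fold axis with two 2-fold axes, puts the argument of acos outside the point-group geometry the text derives, which is one way to see why such groups are excluded.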
Figure 6. The stereographic projections of the 32 point groups. From the International Tables for X-Ray Crystallography (International Union of Crystallography, 1952).
CRYSTAL SYSTEMS

The presence of a symmetry operator imparts a distinct appearance to a crystal. A 3-fold axis means that crystal faces around the axis must be identical every 120°. A 4-fold axis must show a 90° relationship among faces. On the basis of the appearance of crystals due to symmetry, classical crystallographers could divide them into seven groups, as shown in Table 1. The symbol ≠ should be read as "not necessarily equal." The relationship among the metric parameters is determined by the presence of the symmetry operators among the atoms of the unit cell, but the metric parameters do not determine the crystal system. Thus, one could have a metrically cubic cell, but if only 1-fold axes are present among the atoms of the unit cell, then the crystal system is triclinic. This is the case for hexamethylbenzene, but, admittedly, this is a rare occurrence. Frequently, the rhombohedral unit cell is reoriented so that it can be described on the basis of a hexagonal unit cell (Azaroff, 1968). It can be considered a subsystem of the hexagonal system, and then one speaks of only six crystal systems. We can now assign the various point groups to the six crystal systems. Point groups 1 and 1̄ obviously belong to the triclinic system. All point groups with only one unique 2-fold axis belong to the monoclinic system. Thus, 2, 2̄ = m, and 2/m are monoclinic. Point groups like 222, mmm, etc., are orthorhombic; 32, 6, 6/mmm, etc., are hexagonal (point groups with a 3-fold axis are also labeled trigonal); 4, 4/m, 422, etc., are tetragonal; and 23, m3̄m, and 432 are cubic. Note the position of the 3-fold axis in the sequence of symbols for the cubic system. The distribution of the 32 point groups among the six crystal systems is shown in Figure 6. The rhombohedral and trigonal systems are not counted separately.
LATTICES

When the unit cell is translated along three periodically repeated noncoplanar, noncollinear vectors, a three-dimensional lattice of points is generated (Fig. 7). Looking at such an array, one can select an infinite number of periodically repeated, noncollinear, noncoplanar vectors t₁, t₂, and t₃ in three dimensions, connecting two lattice points, that will constitute the basis vectors of a unit cell for such an array. The choice of a unit cell is one of convenience, but usually the unit cell is chosen to reflect the
Figure 7. A point lattice with the outline of several possible unit cells.
symmetry operators present. Each lattice point at the eight corners of the unit cell is shared by eight unit cells. Thus, a unit cell has 8 × 1/8 = 1 lattice point. Such a unit cell is called primitive and is given the symbol P. One can choose nonprimitive unit cells that contain more than one lattice point. The array of atoms around one lattice point is identical to the array around every other lattice point. This array of atoms may consist either of molecules or of individual atoms, as in NaCl. In the latter case the Na⁺ and Cl⁻ ions are actually located on lattice points. However, lattice points are usually not occupied by atoms. Confusing lattice points with atoms is a common beginners' error. In Table 1 the unit cells for the various crystal systems are listed with the restrictions on the parameters that result from the presence of symmetry operators. Let the b-axis of a coordinate system be a 2-fold axis. The ac coordinate plane can have axes at any angle to each other: i.e., β can have any value. But the b-axis must be perpendicular to the ac plane, or else the parallelogram defined by the vectors a and c will not be repeated periodically. The presence of the 2-fold axis imposes the restriction that α = γ = 90°. Similarly, the presence of three 2-fold axes requires an orthogonal unit cell. Other symmetry operators impose further restrictions, as shown in Table 1. Consider now a 2-fold b-axis perpendicular to a parallelogram lattice ac. How can this two-dimensional lattice be
Table 1. The Seven Crystal Systems

Triclinic (anorthic): minimum symmetry only a 1-fold axis; a ≠ b ≠ c; α ≠ β ≠ γ
Monoclinic: one 2-fold axis chosen to be the unique b-axis, [010]; a ≠ b ≠ c; α = γ = 90°, β ≠ 90°
Orthorhombic: three mutually perpendicular 2-fold axes, [100], [010], [001]; a ≠ b ≠ c; α = β = γ = 90°
Rhombohedral(a): one 3-fold axis parallel to the long axis of the rhomb, [111]; a = b = c; α = β = γ ≠ 90°
Tetragonal: one 4-fold axis parallel to the c-axis, [001]; a = b ≠ c; α = β = γ = 90°
Hexagonal(b): one 6-fold axis parallel to the c-axis, [001]; a = b ≠ c; α = β = 90°, γ = 120°
Cubic (isometric): four 3-fold axes parallel to the four body diagonals of a cube, [111]; a = b = c; α = β = γ = 90°

(a) Usually transformed to a hexagonal unit cell.
(b) Point groups or space groups that are not rhombohedral but contain a 3-fold axis are labeled trigonal.
Figure 8. The stacking of parallelogram lattices in the monoclinic system. (A) Shifting the zero level up along the 2-fold axis located at the origin of the parallelogram. (B) Shifting the zero level up on the 2-fold axis at the center of the parallelogram. (C) Shifting the zero level up on the 2-fold axis located at ½c. (D) Shifting the zero level up on the 2-fold axis located at ½a. Courtesy of Azaroff (1968).
Figure 9. A periodic repetition of a 2-fold axis creates new, crystallographically independent 2-fold axes. Courtesy of Azaroff (1968).
repeated along the third dimension? It cannot be along some arbitrary direction, because such a unit cell would violate the 2-fold axis. However, the plane net can be stacked along the b-axis at some definite interval to complete the unit cell (Fig. 8A). This symmetry operation produces a primitive cell, P. Is this the only possibility? When a 2-fold axis is periodically repeated in space with period a, the 2-fold axis at the origin is repeated at x = 1, 2, . . ., n, but such a repetition also gives rise to new 2-fold axes at x = ½, 3/2, . . ., z = ½, 3/2, . . ., etc., and along the plane diagonal (Fig. 9). Thus, there are three additional 2-fold axes and three additional stacking possibilities, along the 2-fold axes located at x = ½, z = 0; x = 0, z = ½; and x = ½, z = ½. However, the first layer stacked along the 2-fold axis located at x = ½, z = 0 does not result in a unit cell that incorporates the 2-fold axis. The vector from 0, 0, 0 to that lattice point is not along a 2-fold axis. The stacking sequence has to be repeated once more, and now a lattice point on the second parallelogram lattice will lie above the point 0, 0, 0. The vector length from the origin to that point is the periodicity along b. An examination of this unit cell shows that there is a lattice point at ½, ½, 0, so that the ab face is centered. Such a unit cell is labeled C-face centered and given the symbol C (Fig. 8D); it contains two lattice points: the origin lattice point shared among eight cells and the face-centered point shared between two cells. Stacking along the other 2-fold axes produces an A-face-centered cell, given the symbol A (Fig. 8C), and a body-centered cell, given the label I (Fig. 8B). Since every direction in the monoclinic system is a 1-fold axis except for the 2-fold b-axis, the labeling of the a and c directions in the plane perpendicular to b is arbitrary. Interchanging the a and c axial labels changes the A-face centering to C-face centering.
By convention, C-face centering is the standard orientation. Similarly, an I-centered lattice can be reoriented to a C-centered cell by drawing a diagonal in the old cell as a new axis. Thus, there are only two unique Bravais lattices in the monoclinic system (Figure 10). The systematic investigation of the combinations of the 32 point groups with periodic translation in space generates 14 different space lattices. The 14 Bravais lattices are shown in Figure 10 (Azaroff, 1968; McKie and McKie, 1986).

Miller Indices

To explain x-ray scattering from atomic arrays it is convenient to think of atoms as populating planes in the unit cell. The diffraction intensities are considered to arise from reflections off these planes. The planes are labeled by the inverses of their intercepts on the unit cell axes. If a plane intercepts the a-axis at ½, the b-axis at ½, and the c-axis at 2/3 of their respective periodicities, then the Miller indices are h = 4, k = 4, l = 3, the reciprocals cleared of fractions (Fig. 11), and the plane is denoted (443). The round brackets enclosing the (hkl) indices indicate that this is a plane. Figure 11 illustrates this convention for several planes. Planes that are related by a symmetry operator, such as a 4-fold axis in the tetragonal system, are part of a common form. Thus, (100), (010), (1̄00), and (01̄0) are designated by ((100)) or {100}. A direction in the unit cell, e.g., the vector from the origin 0, 0, 0 to the lattice point 1, 1, 1, is represented by square brackets as [111]. Note that the above four planes intersect in a common line, the c-axis or the zone axis, which is designated [001]. A family of zone axes is indicated by angle brackets, ⟨111⟩. For the tetragonal system, this symbol denotes the symmetry-equivalent directions [111], [1̄11], [11̄1], and
Figure 10. The 14 Bravais lattices. Courtesy of Cullity (1978).
[1̄1̄1]. The plane (hkl) and the zone axis [uvw] obey the relationship hu + kv + lw = 0. A complication arises in the hexagonal system. There are three equivalent a-axes due to the presence of a 3-fold or 6-fold symmetry axis perpendicular to the ab plane. If the three axes are the vectors a₁, a₂, and a₃, then a₁ + a₂ = −a₃. To remove any ambiguity about which of the axes are cut by a plane, four symbols are used. These are the Miller-Bravais indices hkil, where i = −(h + k) (Azaroff, 1968). Thus, what is ordinarily written as (111) becomes in the hexagonal system (112̄1), or (11·1), the dot replacing the 2̄. The Miller indices for the unique directions in the unit cells of the seven crystal systems are listed in Table 1.
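The recipe "take the reciprocals of the intercepts and clear fractions" is mechanical enough to script. A small sketch (the helper name miller is my own) that reproduces the (443) example using exact rational arithmetic:

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def miller(intercepts):
    """Miller indices from fractional intercepts on the a, b, c axes.

    Use None for an axis the plane never cuts (reciprocal taken as 0).
    """
    recips = [Fraction(0) if t is None else 1 / Fraction(t) for t in intercepts]
    # Clear fractions: scale by the least common multiple of the denominators.
    lcm = reduce(lambda p, q: p * q // gcd(p, q), (r.denominator for r in recips))
    hkl = [int(r * lcm) for r in recips]
    common = reduce(gcd, (abs(v) for v in hkl if v), 0) or 1
    return tuple(v // common for v in hkl)

print(miller([Fraction(1, 2), Fraction(1, 2), Fraction(2, 3)]))  # (4, 4, 3)
print(miller([1, None, None]))                                   # (1, 0, 0)

# Zone law: a plane (hkl) belongs to zone [uvw] when h*u + k*v + l*w = 0.
h, k, l = miller([1, None, None])
print(h * 0 + k * 0 + l * 1)  # 0: the (100) plane lies in the [001] zone
```

The zone-law check at the end mirrors the relationship hu + kv + lw = 0 quoted in the text.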
SPACE GROUPS

The combination of the 32 crystallographic point groups with the 14 Bravais lattices produces the 230 space groups. The atomic arrangement of every crystalline material displaying three-dimensional periodicity can be assigned to one of
Figure 11. Miller indices of crystallographic planes.
the space groups. Consider the combination of point group 1 with a P lattice. An atom located at position x y z in the unit cell is periodically repeated in all unit cells. There is only one general position x y z in this unit cell. Of course, the values for x y z can vary and there can be many atoms in the unit cell, but usually additional symmetry relations will not exist among them. Now consider the combination of point group 1̄ with the Bravais lattice P. Every atom at x y z must have a symmetry-related atom at x̄ ȳ z̄. Again, these positional parameters can have different values, so that many atoms may be present in the unit cell. But for every atom A there must be an identical atom A′ related by a center of symmetry. The two positions are known as equivalent positions, and the atom is said to be located in the general position x y z. If an atom is located at the special position 0, 0, 0, then no additional atom is generated. There are eight such special positions, each a center of symmetry, in the unit cell of space group P1̄. We have just derived the first two triclinic space groups, P1 and P1̄. The first position in this nomenclature refers to the Bravais lattice. The second refers to a symmetry operator, in this case 1 or 1̄. Knowledge of the symmetry operators relating atomic positions is very helpful when determining crystal structures. As soon as the spatial positions x y z of the atoms of a motif have been determined, e.g., the hand in Figure 3, all atomic positions of the symmetry-related motif(s) are known. The problem of determining all spatial parameters has been halved in the above example. The
motif determined by the minimum number of atomic parameters is known as the asymmetric unit of the unit cell. Let us investigate the combination of point group 2/m with a P Bravais lattice. The presence of one unique 2-fold axis means that this is a monoclinic crystal system, and by convention the unique axis is labeled b. The 2-fold axis operating on x y z generates the symmetry-related position x̄ y z̄. The mirror plane perpendicular to the 2-fold axis operating on these two positions generates the additional locations x ȳ z and x̄ ȳ z̄, for a total of four general equivalent positions. Note that the combination of 2/m gives rise to a center of symmetry. Special positions, such as a location on a mirror at x ½ z, permit only the 2-fold axis to produce the related equivalent position x̄ ½ z̄. Similarly, the special position at 0, 0, 0 generates no further symmetry-related positions. A total of 14 special positions exist in space group P2/m. Again, the symbol shows the presence of only one 2-fold axis; therefore, the space group belongs to the monoclinic crystal system. The Bravais lattice is primitive, the 2-fold axis is parallel to the b-axis, and the mirror plane is perpendicular to the b-axis (International Union of Crystallography, 1983). Let us consider one more example, the more complicated space group Cmmm (Fig. 12). We notice that the Bravais lattice is C-face centered and that there are three mirrors. We remember that m = 2̄, so that there are essentially three 2-fold axes present. This makes the crystal system orthorhombic. The three 2-fold axes are orthogonal to
each other. We also know that the line of intersection of two orthogonal mirrors is a 2-fold axis. This space group, therefore, should have as its complete symbol C 2/m 2/m 2/m, but the crystallographer knows that the 2-fold axes are there because they are the intersections of the three orthogonal mirrors. It is customary to omit them and write the space group as Cmmm. The three symbols after the Bravais lattice refer to the three orthogonal axes of the unit cell, a, b, c. The letters m are really in the denominator, so that the three mirrors are located perpendicular to the a-axis, perpendicular to the b-axis, and perpendicular to the c-axis. For the sake of consistency it is wise to consider any numeral as an axis parallel to a direction and any letter as a symmetry operator perpendicular to a direction.
C-face centering means that there is a lattice point at the position ½, ½, 0 in the unit cell. The atomic environment around any one lattice point is identical to that of any other lattice point. Therefore, as soon as the position x y z is occupied, there must be an identical occupant at x + ½, y + ½, z. Let us now develop the general equivalent positions, or equipoints, for this space group. The symmetry-related point due to the mirror perpendicular to the a-axis operating on x y z is x̄ y z. The mirror perpendicular to the b-axis operating on these two equipoints yields x ȳ z and x̄ ȳ z. The mirror perpendicular to the c-axis operates on these four equipoints to yield x y z̄, x̄ y z̄, x ȳ z̄, and x̄ ȳ z̄. Note that this space group contains a center
Figure 12. The space group Cmmm. (A) List of general and special equivalent positions. (B) Changes in space groups resulting from relabeling of the coordinate axes. From International Union of Crystallography (1983).
Figure 12 (Continued)
of symmetry. To every one of these eight equipoints must be added ½, ½, 0 to account for the C-face centering. This yields a total of 16 general equipoints. When an atom is placed on one of the symmetry operators, the number of equipoints is reduced (Fig. 12A). Clearly, once one has derived all the positions of a space group there is no point in doing it again. Figure 12 is a copy of the space group information for Cmmm found in the International Tables for Crystallography, Vol. A (International Union of Crystallography, 1983). Note the diagrams in Figure 12B: they represent the changes in the space group symbols that result from relabeling the unit cell axes. This is permitted in the orthorhombic crystal system, since the a, b, and c axes are all 2-fold, so that no label is unique.
The rectangle in Figure 12B represents the ab plane of the unit cell. The origin is in the upper left corner, with the a-axis pointing down and the b-axis pointing to the right; the c-axis points out of the plane of the paper. Note the symbols for the symmetry elements and their locations in the unit cell. A complete description of these symbols can be found in the International Tables for Crystallography, Vol. A, pages 4–10 (International Union of Crystallography, 1983). In addition to the space group information there is an extensive discussion of many crystallographic topics. No x-ray diffraction laboratory should be without this volume. The determination of the space group of a crystalline material is obtained from x-ray diffraction data.
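The bookkeeping in the Cmmm derivation, eight sign combinations from the three mirrors plus the C-centering translation, can be cross-checked programmatically. A sketch (the function name is my own; it encodes only the operators named in the text) that counts general and special equipoints:

```python
import itertools

def cmmm_equipoints(x, y, z):
    """Equivalent positions of Cmmm generated from three mutually
    perpendicular mirrors plus the C-face-centering translation."""
    points = set()
    # The three mirrors close into all eight sign combinations of (x, y, z).
    for sx, sy, sz in itertools.product((1, -1), repeat=3):
        for tx, ty in ((0.0, 0.0), (0.5, 0.5)):   # C centering: add 1/2, 1/2, 0
            points.add((round((sx * x + tx) % 1.0, 6),
                        round((sy * y + ty) % 1.0, 6),
                        round((sz * z) % 1.0, 6)))  # reduce into the unit cell
    return points

print(len(cmmm_equipoints(0.10, 0.20, 0.30)))  # 16 general equipoints
print(len(cmmm_equipoints(0.10, 0.20, 0.00)))  # 8: atom on the mirror at z = 0
print(len(cmmm_equipoints(0.00, 0.00, 0.00)))  # 2: special position at the origin
```

Placing the atom on a symmetry operator collapses coincident images, reproducing the reduction in multiplicity that the space group table records for the special positions.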
Space Group Symbols

In general, space group notations consist of four symbols. The first symbol always refers to the Bravais lattice. Why, then, do we have P1 or P2/m? The full symbols are P111 and P 1 2/m 1. But in the triclinic system there is no unique direction, since every direction is a 1-fold axis of symmetry. It is therefore sufficient just to write P1. In the monoclinic system there is only one unique direction, by convention the b-axis, and so only the symmetry elements related to that direction need be specified. In the orthorhombic system there are three unique 2-fold axes parallel to the lattice parameters a, b, and c. Thus, Pnma means that the crystal system is orthorhombic, the Bravais lattice is P, and there is an n glide plane perpendicular to the a-axis, a mirror perpendicular to the b-axis, and an a glide plane perpendicular to the c-axis. The complete symbol for this space group is P 2₁/n 2₁/m 2₁/a. Again, the 2₁ screw axes are a result of the other symmetry operators and are not expressly indicated in the standard symbol. The letter symbols are considered to be in the denominator, and the interpretation is that the operators are perpendicular to the axes. In the tetragonal system there is one unique direction, the 4-fold c-axis. The next unique directions are the equivalent a and b axes, and the third directions are the four equivalent C-face diagonals, the ⟨110⟩ directions. The symbol I4cm means that the space group is tetragonal, the Bravais lattice is body centered, there is a 4-fold axis parallel to the c-axis, there are c glide planes perpendicular to the equivalent a and b axes, and there are mirrors perpendicular to the C-face diagonals. Note that one can say just as well that the symmetry operators c and m are parallel to those directions.
The symbol P3̄c1 tells us that the space group belongs to the trigonal system, with a primitive Bravais lattice, a 3-fold rotoinversion axis parallel to the c-axis, and a c glide plane perpendicular to the equivalent a and b axes (or parallel to the [210] and [120] directions); the third symbol refers to the face diagonal [110]. Why the 1 in this case? It serves to distinguish this space group from the space group P3̄1c, which is different. As before, it is part of the trigonal system. A 3̄ rotoinversion axis parallel to the c-axis is present, but now the c glide plane is perpendicular to the [110] or parallel to the [11̄0] directions. Since 6-fold symmetry must be maintained, there are also c glide planes parallel to the a and b axes. The symbol R denotes the rhombohedral Bravais lattice, but the lattice is usually reoriented so that the unit cell is hexagonal. The symbol Fm3̄m, full symbol F 4/m 3̄ 2/m, tells us that this space group belongs to the cubic system (note the position of the 3̄), that the Bravais lattice is all-face centered, and that there are mirrors perpendicular to the three equivalent a, b, and c axes, a 3̄ rotoinversion axis parallel to the four body diagonals of the cube, the ⟨111⟩ directions, and a mirror perpendicular to the six equivalent face diagonals of the cube, the ⟨110⟩ directions. In this space group additional symmetry elements are generated, such as 4-fold, 2-fold, and 2₁ axes.

Simple Example of the Use of Crystal Structure Knowledge

Of what use is knowledge of the space group for a crystalline material? The understanding of the physical and che-
mical properties of a material ultimately depends on knowledge of the atomic architecture, i.e., the location of every atom in the unit cell with respect to the coordinate axes. The density of a material is ρ = M/V, where ρ is density, M the mass, and V the volume (see MASS AND DENSITY MEASUREMENTS). These macroscopic quantities can also be expressed in terms of the atomic content of the unit cell. The mass in the unit cell volume V is the formula weight M multiplied by the number of formula weights z in the unit cell, divided by Avogadro's number N. Thus, ρ = Mz/VN. Consider NaCl. It is cubic, the space group is Fm3̄m, the unit cell parameter is a = 5.640 Å, its density is 2.165 g/cm³, and its formula weight is 58.44, so that z = 4. There are four Na⁺ and four Cl⁻ ions in the unit cell. The general position x y z in space group Fm3̄m gives rise to a total of 191 additional equivalent positions. Obviously, one cannot place an Na atom in a general position. An examination of the space group table shows that there are two special positions with four equipoints, labeled 4a, at 0, 0, 0, and 4b, at ½, ½, ½. Remember that F means that ½, ½, 0; 0, ½, ½; and ½, 0, ½ must be added to these positions. Thus, the four Na⁺ ions can be located at the 4a position and the four Cl⁻ ions at the 4b position, and since the positional parameters are fixed, the crystal structure of NaCl has been determined. Of course, this is a very simple case. In general, the determination of the space group from x-ray diffraction data is the first essential step in a crystal structure determination.
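The z = 4 result can be checked by turning ρ = Mz/VN around to z = ρVN/M. A quick sketch, taking the literature cell edge of NaCl as a = 5.640 Å (so that V = a³):

```python
AVOGADRO = 6.022e23   # formula units per mole

a = 5.640e-8          # unit cell edge of NaCl in cm (5.640 angstroms)
density = 2.165       # g/cm^3
M = 58.44             # formula weight of NaCl, g/mol

z = density * a**3 * AVOGADRO / M
print(round(z, 2))    # 4.0 formula units per unit cell
```

The result rounds cleanly to 4, consistent with four Na⁺ and four Cl⁻ ions per cell.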
CONCLUSION

It is hoped that this discussion of symmetry will ease the introduction of the novice to this admittedly arcane topic, or serve as a review for those who want to extend their expertise in the area of space groups.
ACKNOWLEDGMENTS

The author gratefully acknowledges the support of the Robert A. Welch Foundation of Houston, Texas.
LITERATURE CITED

Azaroff, L. A. 1968. Elements of X-Ray Crystallography. McGraw-Hill, New York.

Buerger, M. J. 1956. Elementary Crystallography. John Wiley & Sons, New York.

Buerger, M. J. 1970. Contemporary Crystallography. McGraw-Hill, New York.

Burns, G. and Glazer, A. M. 1978. Space Groups for Solid State Scientists. Academic Press, New York.

Cullity, B. D. 1978. Elements of X-Ray Diffraction, 2nd ed. Addison-Wesley, Reading, Mass.

Giacovazzo, C., Monaco, H. L., Viterbo, D., Scordari, F., Gilli, G., Zanotti, G., and Catti, M. 1992. Fundamentals of Crystallography. International Union of Crystallography, Oxford University Press, Oxford.

Hall, L. H. 1969. Group Theory and Symmetry in Chemistry. McGraw-Hill, New York.
International Union of Crystallography (Henry, N. F. M. and Lonsdale, K., eds.). 1952. International Tables for Crystallography, Vol. I: Symmetry Groups. The Kynoch Press, Birmingham, UK.

International Union of Crystallography (Hahn, T., ed.). 1983. International Tables for Crystallography, Vol. A: Space-Group Symmetry. D. Reidel, Dordrecht, The Netherlands.

McKie, D. and McKie, C. 1986. Essentials of Crystallography. Blackwell Scientific Publications, Oxford.

Stout, G. H. and Jensen, L. H. 1989. X-Ray Structure Determination: A Practical Guide, 2nd ed. John Wiley & Sons, New York.

KEY REFERENCES

Burns and Glazer, 1978. See above.

An excellent text for self-study of symmetry operators, point groups, and space groups; makes the International Tables for Crystallography understandable.

Hahn, 1983. See above.

Deals with space groups and related topics, and contains a wealth of crystallographic information.

Stout and Jensen, 1989. See above.

Meets its objective as a "practical guide" to single-crystal x-ray structure determination, and includes introductory chapters on symmetry.

INTERNET RESOURCES

http://www.hwi.buffalo.edu/aca
American Crystallographic Association. Site directory has links to numerous topics, including Crystallographic Resources.

http://www.iucr.ac.uk
International Union of Crystallography. Provides links to many databases and other information about worldwide crystallographic activities.

HUGO STEINFINK
University of Texas
Austin, Texas

PARTICLE SCATTERING

INTRODUCTION

Atomic scattering lies at the heart of numerous materials-analysis techniques, especially those that employ ion beams as probes. The concepts of particle scattering apply quite generally to objects ranging in size from nucleons to billiard balls, at classical as well as relativistic energies, and for both elastic and inelastic events. This unit summarizes two fundamental topics in collision theory: kinematics, which governs energy and momentum transfer, and central-field theory, which accounts for the strength of particle interactions. For definitions of the symbols used throughout this unit, see the Appendix.

KINEMATICS

The kinematics of two-body collisions are the key to understanding atomic scattering. It is most convenient to consider such binary collisions as occurring between a moving projectile and an initially stationary target. It is sufficient here to assume only that the particles act upon each other with equal repulsive forces, described by some interaction potential. The form of the interaction potential and its effects are discussed below (see Central-Field Theory). A binary collision results in a change in the projectile's trajectory and energy after it scatters from a target atom. The collision transfers energy to the target atom, which recoils away from its rest position. The essential parameters describing a binary collision are defined in Figure 1. These are the masses (m1 and m2) and the initial and final velocities (v0, v1, and v2) of the projectile and target, the scattering angle (θs) of the projectile, and the recoiling angle (θr) of the target. Applying the laws of conservation of energy and momentum establishes fundamental relationships among these parameters.

Figure 1. Binary collision diagram in a laboratory reference frame: a projectile (m1, v0) strikes a target (m2, initially at rest); the projectile leaves with velocity v1 at the scattering angle θs, and the target recoils with velocity v2 at the recoiling angle θr. The initial kinetic energy of the incident projectile is E0 = ½m1v0². The initial kinetic energy of the target is assumed to be zero. The final kinetic energy of the scattered projectile is E1 = ½m1v1², and that of the recoiled particle is E2 = ½m2v2². Particle energies (E) are typically expressed in units of electron volts (eV) and velocities (v) in units of m/s. The conversion between these units is E ≈ mv²/(1.9297 × 10⁸), where m is the mass of the particle in amu.

Binary Collisions

Elastic Scattering and Recoiling

In an elastic collision, the total kinetic energy of the particles is unchanged. The law of energy conservation dictates that

E0 = E1 + E2    (1)

where E = ½mv² is a particle's kinetic energy. The law of conservation of momentum, in the directions parallel and perpendicular to the incident particle's direction, requires that

m1v0 = m1v1 cos θs + m2v2 cos θr    (2)

for the parallel direction, and

0 = m1v1 sin θs − m2v2 sin θr    (3)

for the perpendicular direction. Eliminating the recoil angle and target recoil velocity from the above equations yields the fundamental elastic scattering relation for projectiles:

2 cos θs = (1 + A)vs + (1 − A)/vs    (4)

where A = m2/m1 is the target-to-projectile mass ratio and vs = v1/v0 is the normalized final velocity of the scattered particle after the collision. In a similar manner, eliminating the scattering angle and projectile velocity from Equations 1, 2, and 3 yields the fundamental elastic recoiling relation for targets:

2 cos θr = (1 + A)vr    (5)
where vr = v2/v0 is the normalized recoil velocity of the target particle.

Inelastic Scattering and Recoiling

If the internal energy of the particles changes during their interaction, the collision is inelastic. Denoting the change in internal energy by Q, the energy conservation law is stated as

E0 = E1 + E2 + Q    (6)
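Equation 4 is quadratic in vs and can be solved in closed form; taking the positive root gives the elastic scattered-energy ratio Es = E1/E0 = vs². A small numerical sketch (the function name is ours, not from the text):

```python
import math

def elastic_scattered_energy(A, theta_s_deg):
    """Es = E1/E0 from Equation 4: 2*cos(theta_s) = (1+A)*vs + (1-A)/vs.

    Solving the quadratic for vs and taking the positive root (the only
    physical root when A > 1) gives
    vs = (cos(theta_s) + sqrt(A**2 - sin(theta_s)**2)) / (1 + A).
    """
    t = math.radians(theta_s_deg)
    radicand = A * A - math.sin(t) ** 2
    if radicand < 0:
        raise ValueError("no scattering past the limiting angle arcsin(A)")
    vs = (math.cos(t) + math.sqrt(radicand)) / (1.0 + A)
    return vs * vs

# 4He backscattered (theta_s = 180 deg) from 197Au: Es = ((A-1)/(A+1))**2
A = 197.0 / 4.0
print(elastic_scattered_energy(A, 180.0))   # ~0.922
```

At 180° this reduces to the familiar head-on result Es = [(A − 1)/(A + 1)]²; for A < 1 the second root (minus sign) also becomes physical, as discussed below.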
It is possible to extend the fundamental elastic scattering and recoiling relations (Equations 4 and 5) to inelastic collisions in a straightforward manner. A kinematic analysis like that given above (see Elastic Scattering and Recoiling) shows the inelastic scattering relation to be

2 cos θs = (1 + A)vs + [1 − A(1 − Qn)]/vs    (7)
where Qn = Q/E0 is the normalized inelastic energy factor. Comparison with Equation 4 shows that incorporating the factor Qn accounts for the inelasticity in a collision. When Q > 0, it is referred to as an inelastic energy loss; that is, some of the initial kinetic energy E0 is converted into internal energy of the particles, and the total kinetic energy of the system is reduced following the collision. Here Qn is assumed to have a constant value that is independent of the trajectories of the collision partners, i.e., its value does not depend on θs. This is a simplifying assumption, which clearly breaks down if the particles do not collide (θs = 0). The corresponding inelastic recoiling relation is

2 cos θr = (1 + A)vr + Qn/(A vr)    (8)
In this case, inelasticity adds a second term to the elastic recoiling relation (Equation 5).

A common application of the above kinematic relations is in identifying the mass of a target particle by measuring the kinetic energy loss of a scattered probe particle. For example, if the mass and initial velocity of the probe particle are known and its elastic energy loss is measured at a particular scattering angle, then Equation 4 can be solved in terms of m2. Or, if both the projectile and target masses are known and the collision is inelastic, Q can be found from Equation 7. A number of useful forms of the fundamental scattering and recoiling relations for both the elastic and inelastic cases are listed in the Appendix at the end of this unit (see Solutions of the Fundamental Scattering and Recoiling Relations in Terms of v, E, θ, A, and Qn for Nonrelativistic Collisions).

A General Binary Collision Formula

It is possible to collect all the kinematic expressions of the preceding sections and cast them into a single fundamental form that applies to all nonrelativistic, mass-conserving binary collisions. This general formula, in which the particles scatter or recoil through the laboratory angle θ, is

2 cos θ = (1 + A)vn + h/vn    (9)

where vn = v/v0, and h = 1 − A(1 − Qn) for scattering and h = Qn/A for recoiling. In the above expression, v is a particle's velocity after the collision (v1 or v2) and the other symbols have their usual meanings. Equation 9 is the essence of binary collision kinematics. In experimental work, the measured quantity is often the energy of the scattered or recoiled particle, E1 or E2. Expressing Equation 9 in terms of energy yields

E/E0 = (A/g)[(cos θ ± √(f²g² − sin²θ))/(1 + A)]²    (10)

where f² = 1 − Qn(1 + A)/A, and g = A for scattering and g = 1 for recoiling. The positive sign is taken when A > 1, and both signs are taken when A < 1.

Scattering and Recoiling Diagrams

A helpful and instructive way to become familiar with the fundamental scattering and recoiling relations is to look at their geometric representations. The traditional approach is to plot the relations in center-of-mass coordinates, but an especially clear way of depicting these relations, particularly for materials-analysis applications, is to use the laboratory frame with a properly scaled polar coordinate system. This approach will be used extensively throughout the remainder of this unit.

The fundamental scattering and recoiling relations (Equations 4, 5, 7, and 8) describe circles in polar coordinates (√E, θ). The radial coordinate is taken as the square root of the normalized energy (Es or Er), and the angular coordinate, θ, is the laboratory observation angle (θs or θr). These curves provide considerable insight into the collision kinematics. Figure 2 shows a typical elastic scattering circle. Here √E is √Es, where Es = E1/E0, and θ is θs. Note that the radial coordinate is simply vs, so the circle traces out the velocity/angle relationship for scattering. Projectiles can be viewed as having initial velocity vectors of unit magnitude traveling from left to right along the horizontal axis, striking the target at the origin, and leaving at angles and energies indicated by the scattering circle. The circle passes through the point (1, 0°), corresponding to the situation where no collision occurs. Of course, when there is no scattering (θs = 0°), there is no change in the incident particle's energy or velocity (Es = 1 and vs = 1). The maximum energy loss occurs at θ = 180°, when a head-on collision occurs. The circle center and radius are a function of the target-to-projectile mass ratio. The center is located along the 0° direction a distance xs from the origin, given by

xs = 1/(1 + A)    (11)

while the radius for elastic scattering is

rs = 1 − xs = Axs = A/(1 + A)    (12)

For inelastic scattering, the scattering circle center is also given by Equation 11, but the radius is given by

rs = fAxs = fA/(1 + A)    (13)

where f is defined as in Equation 10. Equation 11 and Equation 12 or 13 make it easy to construct the appropriate scattering circle for any given mass ratio A. One simply uses Equation 11 to find the circle center at (xs, 0°) and then draws a circle of radius rs. The resulting scattering circle can then be used to find the energy of the scattered particle at any scattering angle by drawing a line from the origin to the circle at the selected angle. The length of the line is √Es.

Figure 2. An elastic scattering circle plotted in polar coordinates (√E, θ), where E is Es = E1/E0 and θ is the laboratory scattering angle, θs. The length of the line segment from the origin to a point on the circle gives the relative scattered particle velocity, vs, at that angle. Note that √Es = vs = v1/v0. Scattering circles are centered at (xs, 0°), where xs = (1 + A)⁻¹ and A = m2/m1. All elastic scattering circles pass through the point (1, 0°). The circle shown is for the case A = 4. The triangle illustrates the relationship sin(θc − θs)/sin θs = xs/rs = 1/A.

Similarly, the polar coordinates for recoiling are (√Er, θr), where Er = E2/E0. A typical elastic recoiling circle is shown in Figure 3. The recoiling circle passes through the origin, corresponding to the case where no collision occurs and the target remains at rest. The circle center, xr, is located at

xr = √A/(1 + A)    (14)

and its radius is rr = xr for elastic recoiling, or

rr = f·xr = f√A/(1 + A)    (15)

for inelastic recoiling. Recoiling circles can be readily constructed for any collision partners using the above equations. For elastic collisions (f = 1), the construction is trivial, as the recoiling circle radius equals its center distance.

Figure 3. An elastic recoiling circle plotted in polar coordinates (√E, θ), where E is Er = E2/E0 and θ is the laboratory recoiling angle, θr. The length of the line segment from the origin to a point on the circle gives √Er at that angle. Recoiling circles are centered at (xr, 0°), where xr = √A/(1 + A). Note that xr = rr. All elastic recoiling circles pass through the origin. The circle shown is for the case A = 4. The triangle illustrates the relationship θr = (π − θc)/2.

Figure 4 shows elastic scattering and recoiling circles for a variety of mass ratio A values. Since the circles are symmetric about the horizontal (0°) direction, only semicircles are plotted (scattering curves in the upper half-plane and recoiling curves in the lower quadrant). Several general properties of binary collisions are evident. First, for mass ratio A > 1 (i.e., light projectiles striking heavy targets), scattering at all angles 0° < θs ≤ 180° is permitted. When A = 1 (as in billiards, for instance), the scattering and recoil circles are the same. A head-on collision brings the projectile to rest, transferring the full projectile energy to the target. When A < 1 (i.e., heavy projectiles striking light targets), only forward scattering is possible and there is a limiting scattering angle, θmax, which is found by drawing a tangent line from the origin to the scattering circle. The value of θmax is arcsin A, because sin θmax = rs/xs = A. Note that there is a single scattering energy at each scattering angle when A ≥ 1, but two energies are possible when A < 1 and θs < θmax. This is illustrated in Figure 5. For all A, recoiling particles have only one energy, and the recoiling angle θr < 90°. It is interesting to note that the recoiling circles are the same for A and A⁻¹, so it is not always possible to unambiguously identify the target mass by measuring its recoil energy. For example, using He projectiles, the energies of elastic H and O recoils at any selected recoiling angle are identical.

Figure 4. Elastic scattering and recoiling diagram for various values of A. For scattering, √Es versus θs is plotted for A values of 0.2, 0.4, 0.6, 1, 1.5, 2, 3, 5, 10, and 100 in the upper half-plane. When A < 1, only forward scattering is possible. For recoiling, √Er versus θr is plotted for A values of 0.2, 1, 10, 25, and 100 in the lower quadrant. Recoiling particles travel only in the forward direction.

Figure 5. Elastic scattering (A) and recoiling (B) diagrams for the case A = 1/2. Note that in this case scattering occurs only for θs ≤ 30°. In general, θs ≤ θmax = arcsin A when A < 1. Below θmax, two scattered particle energies are possible at each laboratory observing angle. The relationships among θc1, θc2, and θs are shown at θs = 20°.

Center-of-Mass and Relative Coordinates

In some circumstances, such as when calculating collision cross-sections (see Central-Field Theory), it is useful to evaluate the scattering angle in the center-of-mass reference frame, where the total momentum of the system is zero. This is an inertial reference frame with its origin located at the center of mass of the two particles. The center of mass moves in a straight line at constant velocity in the laboratory frame. The relative reference frame is also useful. It is a noninertial frame whose origin is located on the target. The relative frame is equivalent to a situation where a single particle of mass μ interacts with a fixed-point scattering center with the same potential as in the laboratory frame. In both these alternative frames of reference, the two-body collision problem reduces to a one-body problem. The relevant parameters are the reduced mass, μ, the relative energy, Erel, and the center-of-mass scattering angle, θc. The term reduced mass originates from the fact that μ < m1 + m2. The reduced mass is

μ = m1m2/(m1 + m2)    (16)

and the relative energy is

Erel = E0·A/(1 + A)    (17)

The scattering angle θc is the same in the center-of-mass and relative reference frames. Scattering and recoiling circles show clearly the relationship between laboratory and center-of-mass scattering angles. In fact, the circles can readily be generated by parametric equations involving θc. These are simply x = R cos θc + C and y = R sin θc, where R is the circle radius (R = rs for scattering, R = rr for recoiling) and (C, 0°) is the location of its center (C = xs for scattering, C = xr for recoiling). Figures 2 and 3 illustrate the relationships among θs, θr, and θc. The relationship between θc and θs can be found by examining the triangle in Figure 2 containing θs and having sides of lengths xs, rs, and vs. Applying the law of sines gives, for elastic scattering,

tan θs = sin θc/(A⁻¹ + cos θc)    (18)

Inspection of the elastic recoiling circle in Figure 3 shows that

θr = (π − θc)/2    (19)

The relationship is apparent after noting that the triangle including θc and θr is isosceles. The various conversions among these three angles for elastic and inelastic collisions are listed in the Appendix at the end of this unit (see Conversions among θs, θr, and θc for Nonrelativistic Collisions). Two special cases are worth mentioning. If A = 1, then θc = 2θs; and as A → ∞, θc → θs. These, along with intermediate cases, are illustrated in Figure 6.

Figure 6. Correspondence between the center-of-mass scattering angle, θc, and the laboratory scattering angle, θs, for elastic collisions having various values of A: 0.5, 1, 2, and 100. For A = 1, θs = θc/2. For A ≫ 1, θs ≈ θc. When A < 1, θmax = arcsin A.

Relativistic Collisions

When the velocity of the projectile is a substantial fraction of the speed of light, relativistic effects occur. The effect most clearly seen as the projectile's velocity increases is distortion of the scattering circle into an ellipse. The relativistic parameter or Lorentz factor, γ, is defined as

γ = (1 − β²)^(−1/2)    (20)

where β, called the reduced velocity, is v0/c, and c is the speed of light. For atomic projectiles with kinetic energies well below the relativistic regime, γ ≈ 1. The center of the scattering ellipse is located at

xs = (1 + Aγ)/(1 + 2Aγ + A²)    (21)

This definition of xs is consistent with the earlier, nonrelativistic definition since, when γ = 1, the center is as given in Equation 11. The major axis of the ellipse for elastic collisions is

a = A(γ + A)/(1 + 2Aγ + A²)    (22)

and the minor axis is

b = A/(1 + 2Aγ + A²)^(1/2)    (23)

When γ = 1,

a = b = rs = A/(1 + A)    (24)

which indicates that the ellipse turns into the familiar elastic scattering circle under nonrelativistic conditions. The foci of the ellipse are located at positions xs − d and xs + d along the horizontal axis of the scattering diagram, where d is given by

d = A(γ² − 1)^(1/2)/(1 + 2Aγ + A²)    (25)

The eccentricity of the ellipse, e, is

e = d/a = (γ² − 1)^(1/2)/(A + γ)    (26)

Examples of relativistic scattering curves are shown in Figure 7. For a given A > 1, one finds that xs(e) > xs(c), where xs(e) is the location of the ellipse center and xs(c) is the location of the circle center. When A = 1, it is always true that xs(e) = xs(c) = 1/2. And finally, for a given A < 1, one finds that xs(e) < xs(c). This last inequality has an interesting consequence. As γ increases when A < 1, the center of the ellipse moves towards the origin and the ellipse itself becomes more eccentric, but θmax does not change: the maximum allowed scattering angle when A < 1 is always arcsin A. This effect is diagrammed in Figure 7.

For inelastic relativistic collisions, the center of the scattering ellipse remains unchanged from the elastic case (Equation 21). However, the major and minor axes are reduced. A practical way of plotting the ellipse is to use its parametric definition, which is x = rs(a) cos θc + xs and y = rs(b) sin θc, where

rs(a) = a√(1 − Qn/a)    (27)

and

rs(b) = b√(1 − Qn/b)    (28)

As in the nonrelativistic classical case, the relativistic scattering curves allow one to easily determine the scattered particle velocity and energy at any allowed scattering angle. In a similar manner, the recoiling curves for relativistic particles can be stated as a straightforward extension of the classical recoiling curves.

Nuclear Reactions

Scattering and recoiling circle diagrams can also depict the kinematics of simple nuclear reactions in which the colliding particles undergo a mass change. A nuclear reaction of the form A(b,c)D can be written as A + b → c + D + Qmass/c², where the mass difference is accounted for by Qmass, usually referred to as the "Q value" for the reaction. The sign of the Q value is conventionally taken as positive for a kinetic energy-producing (exoergic) reaction and negative for a kinetic energy-driven (endoergic) reaction. It is important to distinguish between Qmass and the inelastic energy factor Q introduced in Equation 6. The difference is that Qmass balances the mass in the above equation for the nuclear reaction, while Q balances the kinetic energy in Equation 6. These values are of opposite sign, i.e., Q = −Qmass. To illustrate, for an exoergic reaction (Qmass > 0), some of the internal energy (e.g., mass) of the reactant particles is converted to kinetic energy. Hence the internal energy of the system is reduced and Q is negative in sign.

For the reaction A(b,c)D, A is considered to be the target nucleus, b the incident particle (projectile), c the outgoing particle, and D the recoil nucleus. Let mA, mb, mc, and mD be the corresponding masses and vb, vc, and vD be the corresponding velocities (vA is assumed to be zero). We now define the mass ratios A1 and A2 as A1 = mc/mb and A2 = mD/mb, and the velocity ratios vn1 and vn2 as vn1 = vc/vb and vn2 = vD/vb. Note that A2 is equivalent to the previously defined target-to-projectile mass ratio A. Applying the energy and momentum conservation laws yields the fundamental kinematic relation for particle c, which is

2 cos θ1 = (A1 + A2)vn1 + [1 − A2(1 − Qn)]/(A1vn1)    (29)

where θ1 is the emission angle of particle c with respect to the incident particle direction. As mentioned above, the normalized inelastic energy factor, Qn, is Q/E0, where E0 is the incident particle kinetic energy. In a similar manner, the fundamental relation for particle D is found to be

2 cos θ2 = (A1 + A2)vn2/√A2 + [1 − A1(1 − Qn)]/(√A2·vn2)    (30)

where θ2 is the emission angle of particle D. Equations 29 and 30 can be combined into a single expression for the energy of the products:

E = Ai[(cos θ ± √(cos²θ − (A1 + A2)[1 − Aj(1 − Qn)]/Ai))/(A1 + A2)]²    (31)
where the variables are assigned according to Table 1. Equation 31 is a generalization of Equation 10, and its symmetry with respect to the two product particles is noteworthy. The symmetry arises from the common origin of the products at the instance of the collision, at which point they are indistinguishable.

Table 1. Assignment of Variables in Equation 31

              Product Particle
Variable      c          D
E             E1/E0      E2/E0
θ             θ1         θ2
Ai            A1         A2
Aj            A2         A1

In analogy to the results discussed above (see discussion of Scattering and Recoiling Diagrams), the expressions of Equation 31 describe circles in polar coordinates (√E, θ). Here the circle center x1 is given by

x1 = √A1/(A1 + A2)    (32)

and the circle center x2 is given by

x2 = √A2/(A1 + A2)    (33)

The circle radius r1 is

r1 = x2·√((A1 + A2)(1 − Qn) − 1)    (34)

and the circle radius r2 is

r2 = x1·√((A1 + A2)(1 − Qn) − 1)    (35)

Polar (√E, θ) diagrams can be easily constructed using these definitions. In this case, the terms light product and heavy product should be used instead of scattering and recoiling, since the reaction products originate from a compound nucleus and are distinguished only by their mass. Note that Equations 29 and 30 become equivalent to Equations 7 and 8 if no mass change occurs, since if mb = mc, then A1 = 1. Similarly, Equations 32 and 33 and Equations 34 and 35 are extensions of Equations 11, 14, 13, and 15, respectively. It is also worth noting that the initial target mass, mA, does not enter into any of the kinematic expressions, since a body at rest has no kinetic energy or momentum.
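The reduction of Equations 32 to 35 to the elastic-scattering circle parameters (Equations 11, 12, and 14) when no mass change occurs can be checked numerically; a small sketch (the function name is ours):

```python
import math

def reaction_circles(A1, A2, Qn):
    """Circle centers and radii (Equations 32-35) for the reaction A(b,c)D.

    A1 = mc/mb, A2 = mD/mb, Qn = Q/E0. Returns (x1, x2, r1, r2) for the
    polar (sqrt(E), theta) diagrams of the two product particles.
    """
    s = A1 + A2
    x1 = math.sqrt(A1) / s
    x2 = math.sqrt(A2) / s
    root = math.sqrt(s * (1.0 - Qn) - 1.0)
    return x1, x2, x2 * root, x1 * root

# Elastic limit A1 = 1, Qn = 0 recovers xs = 1/(1+A), rs = A/(1+A),
# and xr = rr = sqrt(A)/(1+A) from Equations 11, 12, and 14.
A = 4.0
x1, x2, r1, r2 = reaction_circles(1.0, A, 0.0)
print(x1, r1)   # -> 0.2 0.8
print(x2, r2)   # -> 0.4 0.4
```

With A1 = 1 and Qn = 0, the square-root factor equals √A2, so r1 = A2/(1 + A2) and r2 = √A2/(1 + A2), exactly the elastic circle radii.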
Figure 8. Geometry for hard-sphere collisions in a laboratory reference frame. The spheres have radii R1 and R2. The impact parameter, p, is the minimum separation distance between the particles along the projectile's path if no deflection were to occur. The example shown is for R1 = R2 and A = 1 at the moment of impact. From the triangle, it is apparent that p/D = cos(θc/2), where D = R1 + R2.
CENTRAL-FIELD THEORY

While kinematics tells us how energy is apportioned between two colliding particles for a given scattering or recoiling angle, it tells us nothing about how the alignment between the collision partners determines their final trajectories. Central-field theory provides this information by considering the interaction potential between the particles. This section begins with a discussion of interaction potentials, then introduces the notion of an impact parameter, which leads to the formulation of the deflection function and the evaluation of collision cross-sections.

Interaction Potentials

The form of the interaction potential is of prime importance to the accurate representation of atomic scattering. The appropriate form depends on the incident kinetic energy of the projectile, E0, and on the collision partners. When E0 is on the order of the energy of chemical bonds (~1 eV), a potential that accounts for chemical interactions is required. Such potentials frequently consist of a repulsive term that operates at short distances and a long-range attractive term. At energies above the thermal and hyperthermal regimes (>100 eV), atomic collisions can be modeled using screened Coulomb potentials, which consider the Coulomb repulsion between nuclei attenuated by electronic screening effects. This energy regime extends up to some tens or hundreds of keV. At higher E0, the interaction potential becomes practically Coulombic in nature and can be cast in a simple analytic form. At still higher energies, nuclear reactions can occur and must be considered. For particles with very high velocities, relativistic effects can dominate. Table 2 summarizes the potentials commonly used in various energy regimes. In the following, we will consider only central potentials, V(r), which are spherically symmetric and depend only upon the distance between nuclei. In many materials-analysis applications, the energy of the interacting particles is such that pure or screened Coulomb central potentials prove highly useful.

Table 2. Interatomic Potentials Used in Various Energy Regimes (a,b)

Regime                    Applicable Potential      Comments
Thermal                   Attractive/repulsive      Van der Waals attraction
Hyperthermal              Many body                 Chemical reactions
Low                       Screened Coulomb          Binary collisions
Medium                    Screened/pure Coulomb     Rutherford scattering
High                      Coulomb                   Nuclear reactions
Relativistic (>100 MeV)   Liénard-Wiechert          Pair production

(a) Boundaries between regimes are approximate and depend on the characteristics of the collision partners.
(b) Below the Bohr electron velocity, e²/ℏ = 2.2 × 10⁶ m/s, ionization and neutralization effects can be significant.

Impact Parameters

The impact parameter is a measure of the alignment of the collision partners and is the distance of closest approach between the two particles in the absence of any forces. Its measure is the perpendicular distance between the projectile's initial direction and the parallel line passing through the target center. The impact parameter p is shown in Figure 8 for a hard-sphere binary collision. The impact parameter can be defined in a similar fashion for any binary collision; the particles can be point-like or have a physical extent. When p = 0, the collision is head on. For hard spheres, when p is greater than the sum of the spheres' radii, no collision occurs.

The impact parameter is similarly defined for scattering in the relative reference frame. This is illustrated in Figure 9 for a collision between two particles with a repulsive central force acting on them. For a given impact parameter, the final trajectory, as defined by θc, depends on the strength of the potential field. Also shown in the figure is the actual distance of closest approach, or apsis, which is larger than p for any collision involving a repulsive potential.

Figure 9. Geometry for scattering in the relative reference frame between a particle of mass μ and a fixed point target with a repulsive force acting between them. The impact parameter p is defined as in Figure 8. The actual minimum separation distance is larger than p and is referred to as the apsis of the collision. Also shown are the orientation angle φ and the separation distance r of the projectile as seen by an observer situated on the target particle. The apsis, r0, occurs at the orientation angle φ0. The relative scattering angle, shown as θc, is identical to the center-of-mass scattering angle. The relationship θc = |π − 2φ0| is apparent by summing the angles around the projectile asymptote at the apsis.

Shadow Cones

Knowing the interaction potential, it is straightforward, though perhaps tedious, to calculate the trajectories of a projectile and target during a collision, given the initial state of the system (coordinates and velocities). One does this by solving the equations of motion incrementally. With a sufficiently small value for the time step between calculations and a large number of time steps, the correct trajectory emerges. This is shown in Figure 10 for a representative atomic collision at a number of impact parameters. Note the appearance of a shadow cone, a region inside of which the projectile is excluded regardless of the impact parameter. Many weakly deflected projectile trajectories pass near the shadow cone boundary, leading to a flux-focusing effect. This is a general characteristic of collisions with A > 1. The shape of the shadow cone depends on the incident particle energy and the interaction potential. For a pure Coulomb interaction, the shadow cone (in an axial plane) forms a parabola whose radius, r̂, is given by r̂ = 2(b̂·l̂)^(1/2), where b̂ = Z1Z2e²/E0 and l̂ is the distance beyond the target particle. The shadow cone narrows as the energy of the incident particles increases. Approximate expressions for the shadow-cone radius can be used for screened Coulomb interaction potentials, which are useful at lower particle energies. Shadow cones can be utilized by ion-beam analysis methods to determine the surface structure of crystalline solids.

Deflection Functions

A general objective is to establish the relationship between the impact parameter and the scattering and recoiling angles. This relationship is expressed by the deflection function, which gives the center-of-mass scattering angle in terms of the impact parameter. The deflection function is of considerable practical use. It enables one to calculate collision cross-sections and thereby relate the intensity of scattering or recoiling with the interaction potential and the number of particles present in a collision system.

Deflection Function for Hard Spheres
The deflection function can be most simply illustrated for the case of hard-sphere collisions. Hard-sphere collisions have an interaction potential of the form

V(r) = ∞ when 0 ≤ r ≤ D;  V(r) = 0 when r > D   (36)

where D, called the collision diameter, is the sum of the projectile and target sphere radii R1 and R2. When p is greater than D, no collision occurs. A diagram of a hard-sphere collision at the moment of impact is shown in Figure 8. From the geometry, it is seen that the deflection function for hard spheres is

θc(p) = 2 arccos(p/D) when 0 ≤ p ≤ D;  θc(p) = 0 when p > D   (37)

For elastic billiard-ball collisions (A = 1), the deflection function expressed in laboratory coordinates using Equations 18 and 19 is particularly simple. For the projectile it is

θs = arccos(p/D),  0 ≤ p ≤ D,  A = 1   (38)

and for the target

θr = arcsin(p/D),  0 ≤ p ≤ D   (39)
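As a concrete illustration, the hard-sphere deflection functions of Equations 37 to 39, together with the general-A laboratory formula given just below (Equation 40), can be evaluated directly. The following Python sketch is mine, not part of the unit; angles are in radians and the function names are arbitrary.

```python
import math

def theta_c_hard_sphere(p, D):
    """Center-of-mass deflection angle for hard spheres (Eq. 37)."""
    if p > D:
        return 0.0          # impact parameter exceeds collision diameter: no collision
    return 2.0 * math.acos(p / D)

def theta_s_hard_sphere(p, D, A=1.0):
    """Laboratory scattering angle (Eq. 40); reduces to arccos(p/D) when A = 1
    (Eq. 38).  Undefined at the singular head-on point p = 0 when A = 1."""
    x = (p / D) ** 2
    num = 1.0 - A + 2.0 * A * x
    den = math.sqrt(1.0 - 2.0 * A + A**2 + 4.0 * A * x)
    return math.acos(num / den)

def theta_r_hard_sphere(p, D):
    """Laboratory recoil angle (Eq. 39); independent of A."""
    return math.asin(p / D)
```

For A = 1 the scattered and recoiled trajectories are perpendicular, so theta_s_hard_sphere(p, D, 1.0) + theta_r_hard_sphere(p, D) equals π/2 for any allowed p.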
Figure 10. A two-dimensional representation of a shadow cone. The trajectories for a 1-keV 4He atom scattering from a 197Au target atom are shown for impact parameters ranging from +3 to -3 Å in steps of 0.1 Å. The ZBL interaction potential was used. The trajectories lie outside a parabolic shadow region. The full shadow cone is three dimensional and has rotational symmetry about its axis. Trajectories for the target atom are not shown. The dot marks the initial target position, but does not represent the size of the target nucleus, which would appear much smaller at this scale.
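The incremental equation-of-motion solution described above can be sketched as follows. This illustrative code is mine, not from the unit: it uses a velocity-Verlet integrator and, for simplicity, a bare Coulomb repulsion V(r) = k/r in arbitrary units rather than the ZBL potential used for the figure. The starting distance, time step, and step count are arbitrary choices, and starting at a finite distance makes the result approximate.

```python
import math

def trajectory_deflection(p, E0, k, m=1.0, x0=-300.0, dt=2e-3, steps=300000):
    """Integrate a projectile past a fixed target with repulsive V(r) = k/r
    (velocity Verlet).  Returns the asymptotic deflection angle in radians.
    The fixed target corresponds to the relative-frame picture of Figure 9."""
    v0 = math.sqrt(2.0 * E0 / m)
    x, y = x0, p                 # begin far away at impact parameter p
    vx, vy = v0, 0.0

    def acc(x, y):
        r = math.hypot(x, y)
        a = k / (m * r * r)      # |F| = k/r**2 for V = k/r
        return a * x / r, a * y / r

    ax, ay = acc(x, y)
    for _ in range(steps):
        x += vx * dt + 0.5 * ax * dt * dt
        y += vy * dt + 0.5 * ay * dt * dt
        ax2, ay2 = acc(x, y)
        vx += 0.5 * (ax + ax2) * dt
        vy += 0.5 * (ay + ay2) * dt
        ax, ay = ax2, ay2
    return math.atan2(vy, vx)
```

For a pure Coulomb potential the result can be checked against the analytic deflection function given later in this unit (Equation 43).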
PARTICLE SCATTERING
Here r is the particle separation distance; r0 is the distance of closest approach (the turning point, or apsis); V(r) is the interaction potential; and Erel is the kinetic energy of the particles in the center-of-mass and relative coordinate systems (the relative energy).
Figure 11. Deflection function for hard-sphere collisions. The center-of-mass deflection angle, θc, is given by 2 cos⁻¹(p/D), where p is the impact parameter (see Fig. 8) and D is the collision diameter (sum of particle radii). The scattering angle in the laboratory frame, θs, is given by Equation 40 and is plotted for A values of 0.5, 1, 2, and 100. When A = 1, it equals cos⁻¹(p/D). At large A, it converges to the center-of-mass function. The recoiling angle in the laboratory frame, θr, is given by sin⁻¹(p/D) and does not depend on A.
When A = 1, the projectile and target trajectories after the collision are perpendicular. For A ≠ 1, the laboratory deflection function for the projectile is not as simple:

θs = arccos{[1 - A + 2A(p/D)²]/√[1 - 2A + A² + 4A(p/D)²]},  0 ≤ p ≤ D   (40)

In the limit, as A → ∞, θs → 2 arccos(p/D). In contrast, the laboratory deflection function for the target recoil is independent of A, and Equation 39 applies for all A. The hard-sphere deflection function is plotted in Figure 11 in both coordinate systems for selected A values.

Deflection Function for Other Central Potentials

The classical deflection function, which gives the center-of-mass scattering angle θc as a function of the impact parameter p, is

θc(p) = π - 2p ∫_{r0}^{∞} dr/{r² [f(r)]^{1/2}}   (41)

where

f(r) = 1 - p²/r² - V(r)/Erel   (42)
and f(r0) = 0. The physical meanings of the variables used in these expressions, for the case of two interacting particles, are given in the list above.
We will examine how the deflection function can be evaluated for various central potentials. When V(r) is a simple central potential, the deflection function can be evaluated analytically. For example, suppose V(r) = k/r, where k is a constant. If k < 0, then V(r) represents an attractive potential, such as gravity, and the resulting deflection function is useful in celestial mechanics. For example, in Newtonian gravity, k = -Gm1m2, where G is the gravitational constant and the masses of the celestial bodies are m1 and m2. If k > 0, then V(r) represents a repulsive potential, such as the Coulomb field between like-charged atomic particles. For example, in Rutherford scattering, k = Z1Z2e², where Z1 and Z2 are the atomic numbers of the nuclei and e is the unit of elementary charge. Then the deflection function is exactly given by

θc(p) = π - 2 arctan(2pErel/k) = 2 arctan[k/(2pErel)]   (43)
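Equation 43 is simple to evaluate numerically. The following small Python helper is my own sketch, using lengths in Ångstroms and energies in eV so that e² = 14.40 eV·Å (the value quoted later in this unit); atan2 is used so that the head-on case p = 0 correctly returns π.

```python
import math

def theta_c_coulomb(p, E_rel, Z1, Z2, e2=14.40):
    """Center-of-mass deflection angle for a bare Coulomb potential (Eq. 43).
    Lengths in Angstroms, energies in eV; k = Z1*Z2*e2 with e2 = 14.40 eV*A."""
    k = Z1 * Z2 * e2
    return 2.0 * math.atan2(k, 2.0 * p * E_rel)
```

The two forms in Equation 43 are equivalent: π - 2 arctan(2pErel/k) equals 2 arctan[k/(2pErel)] for p > 0.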
Another central potential for which the deflection function can be exactly solved is the inverse-square potential. In this case, V(r) = k/r², and the corresponding deflection function is

θc(p) = π[1 - p/√(p² + k/Erel)]   (44)

Although the inverse-square potential is not, strictly speaking, encountered in nature, it is a rough approximation to a screened Coulomb field when k = Z1Z2e². Realistic screened Coulomb fields decrease even more strongly with distance than the k/r² field.

Approximation of the Deflection Function

In cases where V(r) is a more complicated function, sometimes no analytic solution for the integral exists and the function must be approximated. This is the situation for atomic scattering at intermediate energies, where the appropriate form for V(r) is given by

V(r) = (k/r)Φ(r)   (45)

Φ(r) is referred to as a screening function. This form for V(r) with k = Z1Z2e² is the screened Coulomb potential. The constant term e² has a value of 14.40 eV·Å (e² = ℏcα, where ℏ = h/2π, h is Planck's constant, and α is the fine-structure constant). Although the screening function is not usually known exactly, several approximations appear to be reasonably accurate. These approximate functions have the form

Φ(r) = Σ_{i=1}^{n} ai exp(-bi r/l)   (46)
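The screened Coulomb potential of Equations 45 and 46 can be sketched in code. The following is illustrative only: it uses the Molière coefficients and the Firsov screening length quoted just below in the text, with approximate numeric values for the Bohr radius and e²; the constant and function names are mine.

```python
import math

A0 = 0.529   # Bohr radius in Angstroms (approximate)
E2 = 14.40   # e^2 in eV*Angstrom

def firsov_length(Z1, Z2):
    """Firsov screening length (Eq. 48), in Angstroms."""
    return ((9.0 * math.pi**2 / 128.0) ** (1.0 / 3.0) * A0
            / (math.sqrt(Z1) + math.sqrt(Z2)) ** (2.0 / 3.0))

def v_moliere(r, Z1, Z2):
    """Screened Coulomb potential (Eq. 45) with the Moliere screening
    function (Eqs. 46 and 47), in eV for r in Angstroms."""
    l = firsov_length(Z1, Z2)
    a = (0.35, 0.55, 0.10)
    b = (0.3, 1.2, 6.0)
    phi = sum(ai * math.exp(-bi * r / l) for ai, bi in zip(a, b))
    return Z1 * Z2 * E2 / r * phi
```

As r → 0 the screening function tends to 1 (the coefficients ai sum to unity), so the potential approaches the bare Coulomb form, as it should.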
COMMON CONCEPTS
where ai, bi, and l are all constants. Two of the better-known approximations are due to Molière and to Ziegler, Biersack, and Littmark (ZBL). For the Molière approximation, n = 3, with

a1 = 0.35, b1 = 0.3;  a2 = 0.55, b2 = 1.2;  a3 = 0.10, b3 = 6.0   (47)

and

l = (9π²/128)^{1/3} a0 (Z1^{1/2} + Z2^{1/2})^{-2/3}   (48)

In the above, l is referred to as the screening length (the form shown is the Firsov screening length), a0 is the Bohr radius,

a0 = ℏ²/(me e²)   (49)

and me is the rest mass of the electron. For the ZBL approximation, n = 4, with

a1 = 0.18175, b1 = 3.19980;  a2 = 0.50986, b2 = 0.94229;  a3 = 0.28022, b3 = 0.40290;  a4 = 0.02817, b4 = 0.20162   (50)

and

l = (9π²/128)^{1/3} a0/(Z1^{0.23} + Z2^{0.23})   (51)

If no analytic form for the deflection integral exists, two types of approximations are popular. In many cases, analytic approximations can be devised. Otherwise, the function can still be evaluated numerically. Gauss-Mehler quadrature (also called Gauss-Chebyshev quadrature) is useful in such situations. To apply it, the change of variable x = r0/r is made. This gives

θc(p) = π - 2p̂ ∫_0^1 dx/√[1 - (p̂x)² - V/E]   (52)

where p̂ = p/r0. The Gauss-Mehler quadrature relation is

∫_{-1}^{1} g(x)/√(1 - x²) dx ≈ Σ_{i=1}^{n} wi g(xi)   (53)

where

wi = π/n and xi = cos[π(2i - 1)/(2n)]   (54)

Letting

g(x) = p̂ √{(1 - x²)/[1 - (p̂x)² - V/E]}   (55)

it can be shown that

θc(p) ≈ π - (2π/n) Σ_{i=1}^{n/2} g(xi)   (56)

This is a useful approximation, as it allows the deflection function for an arbitrary central potential to be calculated to any desired degree of accuracy.

Cross-Sections

The concept of a scattering cross-section is used to relate the number of particles scattered into a particular angle to the number of incident particles. Accordingly, the scattering cross-section is dσ(θc) = dN/n, where dN is the number of particles scattered per unit time between the angles θc and θc + dθc, and n is the incident flux of projectiles. With knowledge of the scattering cross-section, it is possible to relate, for a given incident flux, the number of scattered particles to the number of target particles. The value of the scattering cross-section depends upon the interaction potential and is expressed most directly using the deflection function. The differential cross-section for scattering into a differential solid angle dΩ is

dσ(θc)/dΩ = [p/sin(θc)] |dp/dθc|   (57)

Here the solid and plane angle elements are related by dΩ = 2π sin(θc) dθc. Hard-sphere collisions provide a simple example. Using the hard-sphere potential (Equation 36) and deflection function (Equation 37), one obtains dσ(θc)/dΩ = D²/4. Hard-sphere scattering is isotropic in the center-of-mass reference frame and independent of the incident energy. For the case of a Coulomb interaction potential, one obtains the Rutherford formula:

dσ(θc)/dΩ = [Z1Z2e²/(4Erel)]² [1/sin⁴(θc/2)]   (58)

This formula has proven to be exceptionally useful for ion-beam analysis of materials. For the inverse-square potential (k/r²), the differential cross-section is given by

dσ(θc)/dΩ = (k/Erel) π²(π - θc)/[θc²(2π - θc)² sin(θc)]   (59)

For other potentials, numerical techniques (e.g., Equation 56) are typically used for evaluating collision cross-sections. Equivalent forms of Equation 57, such as

dσ(θc)/dΩ = [p/sin(θc)](dθc/dp)⁻¹ = dp²/[2 d(cos θc)]   (60)
show that computing the cross-section can be accomplished by differentiating the deflection function or its inverse. Cross-sections are converted to laboratory coordinates using Equations 18 and 19. This gives, for elastic collisions,

dσ(θs)/dω = {[1 + 2A cos θc + A²]^{3/2}/(A²|A + cos θc|)} dσ(θc)/dΩ   (61)

for scattering and

dσ(θr)/dω = 4 sin(θc/2) dσ(θc)/dΩ   (62)

for recoiling. Here, the differential solid angle element in the laboratory reference frame, dω, is 2π sin(θ) dθ, where θ is the laboratory observing angle, θs or θr. For conversions to laboratory coordinates for inelastic collisions, see Conversions among θs, θr, and θc for Nonrelativistic Collisions, in the Appendix at the end of this unit. Examples of differential cross-sections in laboratory coordinates for elastic collisions are shown in Figures 12 and 13 as a function of the laboratory observing angle. Some general observations can be made. When A > 1, scattering is possible at all angles (0° to 180°), and the scattering cross-sections decrease uniformly as the projectile energy and laboratory scattering angle increase. Elastic recoiling particles are emitted only in the forward direction regardless of the value of A. Recoiling cross-sections decrease as the projectile energy increases, but increase with recoiling angle. When A < 1, there are two branches in the scattering cross-section curve. The upper branch (i.e., the one with the larger cross-sections) results from collisions with the larger p. The two branches converge at θmax.
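The numerical route of Equations 52 to 60 can be sketched as follows. This is an illustrative Python implementation of my own, not part of the unit: the apsis r0 is found by bisection on Equation 42, the deflection function follows the Gauss-Mehler sum of Equation 56, and the cross-section is obtained by numerically differentiating the deflection function as in Equation 60. The bisection bracket, node count, and differentiation step are arbitrary choices, and a purely repulsive potential is assumed.

```python
import math

def turning_point(p, E, V, r_hi=1e6):
    """Apsis r0: the outermost root of f(r) = 1 - (p/r)**2 - V(r)/E (Eq. 42),
    found by bisection, assuming a repulsive potential."""
    f = lambda r: 1.0 - (p / r) ** 2 - V(r) / E
    lo, hi = 1e-12, r_hi
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            hi = mid          # still outside the turning point
        else:
            lo = mid          # inside the classically forbidden region
    return hi

def theta_c(p, E, V, n=64):
    """Center-of-mass deflection angle by Gauss-Mehler quadrature (Eqs. 52-56)."""
    r0 = turning_point(p, E, V)
    ph = p / r0               # p-hat = p/r0
    total = 0.0
    for i in range(1, n // 2 + 1):
        x = math.cos(math.pi * (2 * i - 1) / (2 * n))      # nodes (Eq. 54)
        g = ph * math.sqrt((1.0 - x * x) /
                           (1.0 - (ph * x) ** 2 - V(r0 / x) / E))  # Eq. 55
        total += g
    return math.pi - (2.0 * math.pi / n) * total            # Eq. 56

def dsigma_domega(p, E, V, dp=1e-4):
    """Differential cross-section from the deflection function (Eqs. 57, 60)."""
    dtheta = theta_c(p + dp, E, V) - theta_c(p - dp, E, V)
    return p / math.sin(theta_c(p, E, V)) * abs(2.0 * dp / dtheta)
```

With V(r) = k/r the scheme reproduces the analytic Coulomb deflection function (Equation 43) and the Rutherford cross-section (Equation 58); with V = 0 the deflection is zero.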
Figure 13. Differential atomic collision cross-sections in the laboratory reference frame for 20Ne projectiles striking 63Cu and 16O target atoms, calculated using the ZBL interaction potential. Cross-sections are plotted for both the scattered projectiles (solid lines) and the recoils (dashed lines). The limiting angle for 20Ne scattering from 16O is 53.1°.
Applications to Materials Analysis

There are two general ways in which particle scattering theory is utilized in materials analysis. First, kinematics provides the connection between measurements of particle scattering parameters (velocity or energy, and angle) and the identity (mass) of the collision partners. A number of techniques analyze the energy of scattered or recoiled particles in order to deduce the elemental or isotopic identity of a substance. Second, central-field theory enables one to relate the intensity of scattering or recoiling to the amount of a substance present. When combined, kinematics and central-field theory provide exactly the tools needed to accomplish, with the proper measurements, compositional analysis of materials. This is the primary goal of many ion-beam methods, where proper selection of the analysis conditions enables a variety of extremely sensitive and accurate materials-characterization procedures to be conducted. These include elemental and isotopic composition analysis, structural analysis of ordered materials, two- and three-dimensional compositional profiles of materials, and detection of trace quantities of impurities in materials.
Figure 12. Differential atomic collision cross-sections in the laboratory reference frame for 1-, 10-, and 100-keV 4He projectiles striking 197Au target atoms as a function of the laboratory observing angle. Cross-sections are plotted for both the scattered projectiles (solid lines) and the recoils (dashed lines). The cross-sections were calculated using the ZBL screened Coulomb potential and Gauss-Mehler quadrature of the deflection function.

KEY REFERENCES

Behrisch, R. (ed.). 1981. Sputtering by Particle Bombardment I. Springer-Verlag, Berlin.

Eckstein, W. 1991. Computer Simulation of Ion-Solid Interactions. Springer-Verlag, Berlin.
Eichler, J. and Meyerhof, W. E. 1995. Relativistic Atomic Collisions. Academic Press, San Diego.

Feldman, L. C. and Mayer, J. W. 1986. Fundamentals of Surface and Thin Film Analysis. Elsevier Science Publishing, New York.

Goldstein, H. G. 1959. Classical Mechanics. Addison-Wesley, Reading, Mass.
Hagedorn, R. 1963. Relativistic Kinematics. Benjamin/Cummings, Menlo Park, Calif.
Johnson, R. E. 1982. Introduction to Atomic and Molecular Collisions. Plenum, New York.

Landau, L. D. and Lifshitz, E. M. 1976. Mechanics. Pergamon Press, Elmsford, N.Y.

Lehmann, C. 1977. Interaction of Radiation with Solids and Elementary Defect Production. North-Holland Publishing, Amsterdam.

Levine, R. D. and Bernstein, R. B. 1987. Molecular Reaction Dynamics and Chemical Reactivity. Oxford University Press, New York.

Mashkova, E. S. and Molchanov, V. A. 1985. Medium-Energy Ion Reflection from Solids. North-Holland Publishing, Amsterdam.

Parilis, E. S., Kishinevsky, L. M., Turaev, N. Y., Baklitzky, B. E., Umarov, F. F., Verleger, V. K., Nizhnaya, S. L., and Bitensky, I. S. 1993. Atomic Collisions on Solid Surfaces. North-Holland Publishing, Amsterdam.

Robinson, M. T. 1970. Tables of Classical Scattering Integrals. ORNL-4556, UC-34 Physics. Oak Ridge National Laboratory, Oak Ridge, Tenn.

Satchler, G. R. 1990. Introduction to Nuclear Reactions. Oxford University Press, New York.

Sommerfeld, A. 1952. Mechanics. Academic Press, New York.

Ziegler, J. F., Biersack, J. P., and Littmark, U. 1985. The Stopping and Range of Ions in Solids. Pergamon Press, Elmsford, N.Y.

ROBERT BASTASZ
Sandia National Laboratories
Livermore, California

WOLFGANG ECKSTEIN
Max-Planck-Institut für Plasmaphysik
Garching, Germany

APPENDIX

Solutions of Fundamental Scattering and Recoiling Relations in Terms of v, E, θ, A, and Qn for Nonrelativistic Collisions

For elastic scattering:

vs = √Es = [cos θs ± √(A² - sin² θs)]/(1 + A)   (63)

θs = arccos[(1 + A)vs/2 + (1 - A)/(2vs)]   (64)

A = [2(1 - vs cos θs)/(1 - vs²)] - 1   (65)

For inelastic scattering:

vs = √Es = [cos θs ± √(A²f² - sin² θs)]/(1 + A)   (66)

θs = arccos[(1 + A)vs/2 + (1 - A(1 - Qn))/(2vs)]   (67)

A = (1 + vs² - 2vs cos θs)/(1 - vs² - Qn) = (1 + Es - 2√Es cos θs)/(1 - Es - Qn)   (68)

Qn = 1 - (1/A)[1 - vs(2 cos θs - (1 + A)vs)]   (69)

In the above relations,

f² = 1 - [(1 + A)/A]Qn   (70)

and A = m2/m1, vs = v1/v0, Es = E1/E0, Qn = Q/E0, and θs is the laboratory scattering angle as defined in Figure 1.

For elastic recoiling:

vr = √(Er/A) = 2 cos θr/(1 + A)   (71)

θr = arccos[(1 + A)vr/2]   (72)

A = (2 cos θr/vr) - 1   (73)

For inelastic recoiling:

vr = √(Er/A) = [cos θr ± √(f² - sin² θr)]/(1 + A)   (74)

θr = arccos[(1 + A)vr/2 + Qn/(2Avr)]   (75)

A = [2 cos θr - vr ± √((2 cos θr - vr)² - 4Qn)]/(2vr) = [cos θr ± √(cos² θr - Er - Qn)]²/Er   (76)

Qn = Avr[2 cos θr - (1 + A)vr]   (77)

In the above relations,

f² = 1 - [(1 + A)/A]Qn   (78)

and A = m2/m1, vr = v2/v0, Er = E2/E0, Qn = Q/E0, and θr is the laboratory recoiling angle as defined in Figure 1.

Conversions among θs, θr, and θc for Nonrelativistic Collisions

θs = arctan[sin 2θr/((Af)⁻¹ - cos 2θr)] = arctan[sin θc/((Af)⁻¹ + cos θc)]   (79)

θr = arctan[sin θc/(f⁻¹ - cos θc)]   (80)

θr = (1/2)(π - θc) for f = 1   (81)

θc1 = θs + arcsin[(Af)⁻¹ sin θs]   (82)

θc2 = 2θs - θc1 + π  for sin θs < Af < 1   (83)

dσ(θs)/dω = {[1 + 2Af cos θc + (Af)²]^{3/2}/(A²f²|Af + cos θc|)} dσ(θc)/dΩ   (84)

dσ(θr)/dω = {(1 - 2f cos θc + f²)^{3/2}/(f²|cos θc - f|)} dσ(θc)/dΩ   (85)
In the above relations,

f = √(1 - [(1 + A)/A]Qn)   (86)
Note that: (1) f = 1 for elastic collisions; (2) when A < 1 and sin θs ≤ A, two values of θc are possible for each θs; and (3) when A = 1 and f = 1, (tan θs)(tan θr) = 1.

Glossary of Terms and Symbols

α: Fine-structure constant (≈7.3 × 10⁻³)
A: Target-to-projectile mass ratio (m2/m1)
A1: Ratio of product c mass to projectile b mass (mc/mb)
A2: Ratio of product D mass to projectile b mass (mD/mb)
a: Major axis of scattering ellipse
a0: Bohr radius (≈5.29 × 10⁻¹¹ m)
β: Reduced velocity (v0/c)
b: Minor axis of scattering ellipse
c: Velocity of light (≈3.0 × 10⁸ m/s)
d: Distance from scattering ellipse center to focal point
D: Collision diameter
dσ: Scattering cross-section
ε: Eccentricity of scattering ellipse
e: Unit of elementary charge (≈1.602 × 10⁻¹⁹ C)
E0: Initial kinetic energy of projectile
E1: Final kinetic energy of scattered projectile or product c
E2: Final kinetic energy of recoiled target or product D
Er: Normalized energy of the recoiled target (E2/E0)
Erel: Relative energy
Es: Normalized energy of the scattered projectile (E1/E0)
γ: Relativistic parameter (Lorentz factor)
h: Planck constant (4.136 × 10⁻¹⁵ eV·s)
l: Screening length
m: Reduced mass (m⁻¹ = m1⁻¹ + m2⁻¹)
m1, mb: Mass of projectile
m2: Mass of recoiling particle (target)
mA: Initial target mass
mc: Light product mass
mD: Heavy product mass
me: Electron rest mass (≈9.109 × 10⁻³¹ kg)
p: Impact parameter
φ: Particle orientation angle in the relative reference frame
φ0: Particle orientation angle at the apsis
Φ: Electron screening function
Q: Inelastic energy factor
Qmass: Energy equivalent of particle mass change (Q value)
Qn: Normalized inelastic energy factor (Q/E0)
r: Particle separation distance
r0: Distance of closest approach (apsis or turning point)
r1: Radius of product c circle
r2: Radius of product D circle
rr: Radius of recoiling circle or ellipse
rs: Radius of scattering circle or ellipse
R1: Hard-sphere radius of projectile
R2: Hard-sphere radius of target
θ1: Emission angle of product c particle in the laboratory frame
θ2: Emission angle of product D particle in the laboratory frame
θc: Center-of-mass scattering angle
θmax: Maximum permitted scattering angle
θr: Recoiling angle of target in the laboratory frame
θs: Scattering angle of projectile in the laboratory frame
V: Interatomic interaction potential
v0, vb: Initial velocity of projectile
v1: Final velocity of scattered projectile
v2: Final velocity of recoiled target
vc: Velocity of light product
vD: Velocity of heavy product
vn1: Normalized final velocity of product c particle (vc/vb)
vn2: Normalized final velocity of product D particle (vD/vb)
vr: Normalized final velocity of target particle (v2/v0)
x1: Position of product c circle or ellipse center
x2: Position of product D circle or ellipse center
xr: Position of recoiling circle or ellipse center
xs: Position of scattering circle or ellipse center
Z1: Atomic number of projectile
Z2: Atomic number of target
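The appendix relations can be exercised with a short sketch. This Python code is mine, not part of the unit: it implements the inelastic energy factor f (Equation 86), the angle conversions of Equations 79 and 80 (using atan2 to keep the angles in the correct quadrant), and the elastic recoil energy fraction implied by Equation 71.

```python
import math

def f_factor(A, Qn):
    """Inelastic energy factor f (Eq. 86); f = 1 for an elastic collision (Qn = 0)."""
    return math.sqrt(1.0 - (1.0 + A) * Qn / A)

def theta_s_from_theta_c(theta_c, A, Qn=0.0):
    """Laboratory scattering angle from the center-of-mass angle (Eq. 79)."""
    f = f_factor(A, Qn)
    return math.atan2(math.sin(theta_c), 1.0 / (A * f) + math.cos(theta_c))

def theta_r_from_theta_c(theta_c, A, Qn=0.0):
    """Laboratory recoiling angle from the center-of-mass angle (Eq. 80)."""
    f = f_factor(A, Qn)
    return math.atan2(math.sin(theta_c), 1.0 / f - math.cos(theta_c))

def recoil_energy_fraction(theta_r, A):
    """E2/E0 for an elastic recoil, from Eq. 71 (Er = A * vr**2)."""
    vr = 2.0 * math.cos(theta_r) / (1.0 + A)
    return A * vr * vr
```

For A = 1 and f = 1 these functions satisfy θr = (π - θc)/2 and (tan θs)(tan θr) = 1, as stated in the note above.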
SAMPLE PREPARATION FOR METALLOGRAPHY

INTRODUCTION

Metallography, the study of metal and metallic alloy structure, began at least 150 years ago with early investigations of the science behind metalworking. According to Rhines (1968), the earliest recorded use of metallography was in 1841 (Anosov, 1954). Its first systematic use can be traced to Sorby (1864). Since these early beginnings, metallography has come to play a central role in metallurgical studies: a recent (1998) search of the literature revealed over 20,000 references listing metallography as a keyword! Metallographic sample preparation has evolved from a black art to the highly precise scientific technique it is today. Its principal objective is the preparation of artifact-free representative samples suitable for microstructural examination. The particular choice of a sample preparation procedure depends on the alloy system and also on the focus of the examination, which could include process optimization, quality assurance, alloy design, deformation studies, failure analysis, and reverse engineering. The details of how to make the most appropriate choice and perform the sample preparation are the subject of this unit. Metallographic sample preparation is divided broadly into two stages. The aim of the first stage is to obtain a planar, specularly reflective surface, where the scale of the artifacts (e.g., scratches, smears, and surface deformation)
is smaller than that of the microstructure. This stage commonly comprises three or four steps: sectioning, mounting (optional), mechanical abrasion, and polishing. The aim of the second stage is to make the microstructure more visible by enhancing the difference between various phases and microstructural features. This is generally accomplished by selective chemical dissolution or film formation (etching). The procedures discussed in this unit are also suitable (with slight modifications) for the preparation of metal and intermetallic matrix composites as well as for semiconductors. The modifications are primarily dictated by the specific applications, e.g., the use of coupled chemical-mechanical polishing for semiconductor junctions. The basic steps in metallographic sample preparation are straightforward, although for each step there may be several options in terms of the techniques and materials used. Also, depending on the application, one or more of the steps may be elaborated or eliminated. This unit provides guidance on choosing a suitable path for sample preparation, including advice on recognizing and correcting an unsuitable choice. This discussion assumes access to a laboratory equipped with the requisite equipment for metallographic sample preparation. Listings of typical equipment and supplies (see Table 1) and World Wide Web addresses for major commercial suppliers (see Internet Resources) are provided for readers wishing to start or upgrade a metallography laboratory.
STRATEGIC PLANNING

Before devising a procedure for metallographic sample preparation, it is essential to define the scope and objectives of the metallographic analysis and to determine the requirements of the sample. Clearly defined objectives
Table 1. Typical Equipment and Supplies for Preparation of Metallographic Samples

Sectioning: Band saw; consumable-abrasive cutoff saw; low-speed diamond saw or continuous-loop wire saw; silicon-carbide wheels (coarse and fine grade); alumina wheels (coarse and fine grade); diamond saw blades or wire-saw wires; abrasive powders for wire saw (silicon carbide, silicon nitride, boron nitride, alumina); electric-discharge cutter (optional)

Mounting: Hot mounting press; epoxy and hardener dispenser; vacuum impregnation setup (optional); thermosetting resins; thermoplastic resins; castable resins; special conductive mounting compounds; edge-retention additives; electroless-nickel plating solutions

Mechanical abrasion and polishing: Belt sander; two-wheel mechanical abrasion and polishing station; automated polishing head (medium-volume laboratory) or automated grinding and polishing system (high-volume laboratory); vibratory polisher (optional); paper-backed emery and silicon-carbide grinding disks (120, 180, 240, 320, 400, and 600 grit); polishing cloths (napless and with nap); polishing suspensions (15-, 9-, 6-, and 1-µm diamond; 0.3-µm α-alumina and 0.05-µm γ-alumina; colloidal silica; colloidal magnesia; 1200- and 3200-grit emery); metal- and resin-bonded-diamond grinding disks (optional)

Electropolishing (optional): Commercial electropolisher (recommended); chemicals for electropolishing

Etching: Fume hood and chemical storage cabinets; etching chemicals

Miscellaneous: Ultrasonic cleaner; stir/heat plates; compressed air supply (filtered); specimen dryer; multimeter; acetone; ethyl and methyl alcohol; first aid kit; access to the Internet; material safety data sheets for all applicable chemicals; appropriate reference books (see Key References)
SAMPLE PREPARATION FOR METALLOGRAPHY
may help to avoid many frustrating and unrewarding hours of metallography. It is also important to search the literature to see if a sample preparation technique has already been developed for the application of interest. It is usually easier to fine-tune an existing procedure than to develop a new one.

Defining the Objectives

Before proceeding with sample preparation, the metallographer should formulate a set of questions, the answers to which will lead to a definition of the objectives. The list below is not exhaustive, but it illustrates the level of detail required.

1. Will the sample be used only for general microstructural evaluation?
2. Will the sample be examined with an electron microscope?
3. Is the sample being prepared for reverse engineering purposes?
4. Will the sample be used to analyze the grain flow pattern that may result from deformation or solidification processing?
5. Is the procedure to be integrated into a new alloy design effort, where phase identification and quantitative microscopy will be used?
6. Is the procedure being developed for quality assurance, where a large number of similar samples will be processed on a regular basis?
7. Will the procedure be used in failure analysis, requiring special techniques for crack preservation?
8. Is there a requirement to evaluate the composition and thickness of any coating or plating?
9. Is the alloy susceptible to deformation-induced damage such as mechanical twinning?

Answers to these and other pertinent questions will indicate the information that is already available and the additional information needed to devise the sample preparation procedure. This leads to the next step, a literature survey.

Surveying the Literature

In preparing a metallographic sample, it is usually easier to fine-tune an existing procedure, particularly in the final polishing and etching steps, than to develop a new one.
Moreover, the published literature on metallography is exhaustive, and for a given application there is a high probability that a sample preparation procedure has already been developed; hence, a thorough literature search is essential. References provided later in this unit will be useful for this purpose (see Key References; see Internet Resources).

PROCEDURES

The basic procedures used to prepare samples for metallographic analysis are discussed below. For more detail, see
ASM Metals Handbook, Volume 9: Metallography and Microstructures (ASM Handbook Committee, 1985) and Vander Voort (1984).

Sectioning

The first step in sample preparation is to remove a small representative section from the bulk piece. Many techniques are available, and they are discussed below in order of increasing mechanical damage.

Cutting with a continuous-loop wire saw causes the least amount of mechanical damage to the sample. The wire may have an embedded abrasive, such as diamond, or may deliver an abrasive slurry, such as alumina, silicon carbide, or boron nitride, to the root of the cut. It is also possible to use a combination of chemical attack and abrasive slurry. This cutting method does not generate a significant amount of heat, and it can be used with very thin components. Another important advantage is that the correct use of this technique reduces the time needed for mechanical abrasion, as it allows the metallographer to eliminate the first three abrasion steps. The main drawback is low cutting speed. Also, the proper cutting pressure often must be determined by trial and error.

Electric-discharge machining is extremely useful when cutting superhard alloys but can be used with practically any alloy. The damage is typically low and occurs primarily by surface melting. However, the equipment is not commonly available. Moreover, its improper use can result in melted surface layers, microcracking, and a zone of damage several millimeters below the surface.

Cutting with a nonconsumable abrasive wheel, such as a low-speed diamond saw, is a very versatile sectioning technique that results in minimal surface deformation. It can be used for specimens containing constituents with widely differing hardnesses. However, the correct use of an abrasive wheel is a trial-and-error process, as too much pressure can cause seizing and smearing.

Cutting with a consumable abrasive wheel is especially useful when sectioning hard materials.
It is important to use copious amounts of coolant. However, when cutting specimens containing constituents with widely differing hardnesses, the softer constituents are likely to undergo selective ablation, which increases the time required for the mechanical abrasion steps.

Sawing is very commonly used and yields satisfactory results in most instances. However, it generates heat, so it is necessary to use copious amounts of cooling fluid when sectioning hard alloys; failure to do so can result in localized "burns" and microstructure alterations. Also, sawing can damage delicate surface coatings and cause "peel-back." It should not be used when an analysis of coated materials or coatings is required.

Shearing is typically used for sheet materials and wires. Although it is a fast procedure, shearing causes extremely heavy deformation, which may result in artifacts. Alternative techniques should be used if possible.

Fracturing, and in particular cleavage fracturing, may be used for certain alloys when it is necessary to examine a
crystallographically specific surface. In general, fracturing is used only as a last resort.

Mounting

After sectioning, the sample may be placed in a plastic mounting material for ease of handling, automated grinding and polishing, edge retention, and selective electropolishing and etching. Several types of plastic mounting materials are available; which type should be used depends on the application and the nature of the sample. Thermosetting molding resins (e.g., bakelite, diallyl phthalate, and compression-mounting epoxies) are used when ease of mounting is the primary consideration. Thermoplastic molding resins (e.g., methyl methacrylate, PVC, and polystyrene) are used for fragile specimens, as the molding pressure is lower for these resins than for the thermosetting ones. Castable resins (e.g., acrylics, polyesters, and epoxies) are used when good edge retention and resistance to etchants are required. Additives can be included in the castable resins to make the mounts electrically conductive for electropolishing and electron microscopy. Castable resins also facilitate vacuum impregnation, which is sometimes required for powder metallurgical and failure analysis specimens.

Mechanical Abrasion

Mechanical abrasion typically uses abrasive particles bonded to a substrate, such as waterproof paper. Typically, the abrasive paper is placed on a platen that is rotated at 150 to 300 rpm. The particles cut into the specimen surface upon contact, forming a series of "vee" grooves. Successively finer grits (smaller particle sizes) of abrasive material are used to reduce the mechanically damaged layer and produce a surface suitable for polishing. The typical sequence is 120-, 180-, 240-, 320-, 400-, and 600-grit material, corresponding approximately to particle sizes of 106, 75, 52, 34, 22, and 14 µm. The principal abrasive materials are silicon carbide, emery, and diamond.
In metallographic practice, mechanical abrasion is commonly called grinding, although there are distinctions between mechanical abrasion techniques and traditional grinding. Metallographic mechanical abrasion uses considerably lower surface speeds (between 150 and 300 rpm) and a copious amount of fluid, both for lubrication and for removing the grinding debris. Thus, frictional heating of the specimen and surface damage are significantly lower in mechanical abrasion than in conventional grinding. In mechanical abrasion, the specimen typically is held perpendicular to the grinding platen and moved from the edge to the center. The sample is rotated 90° with each change in grit size, in order to ensure that scratches from the previous operation are completely removed. With each move to finer particle sizes, the rule of thumb is to grind for twice the time used in the previous step. Consequently, it is important to start with the finest possible grit in order to minimize the time required. The size and grinding time for the first grit depend on the sectioning technique used. For semiautomatic or automatic operations, it is best to start with the manufacturer’s recommended procedures and fine-tune them as needed.
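The doubling rule of thumb above can be made concrete with a short numerical sketch. The grit-to-particle-size table follows the text; the 4-min starting time is a hypothetical value chosen only for illustration, not a recommendation for any particular alloy:

```python
# Sketch of the "double the time at each finer grit" rule of thumb.
# Grit -> approximate particle size (micrometers), from the text above.
GRIT_SIZE_UM = {120: 106, 180: 75, 240: 52, 320: 34, 400: 22, 600: 14}

def grinding_schedule(start_grit, first_step_minutes):
    """Return (grit, minutes) pairs, doubling the time at each finer grit."""
    grits = sorted(g for g in GRIT_SIZE_UM if g >= start_grit)
    schedule, t = [], first_step_minutes
    for g in grits:
        schedule.append((g, t))
        t *= 2
    return schedule

# Starting coarse costs far more total bench time than starting fine,
# which is why the text advises starting with the finest workable grit:
coarse = grinding_schedule(120, 4)  # 4 + 8 + 16 + 32 + 64 + 128 = 252 min
fine = grinding_schedule(320, 4)    # 4 + 8 + 16 = 28 min
print(sum(t for _, t in coarse), sum(t for _, t in fine))
```

The exponential growth of total time with each extra grit step is the whole argument: every unnecessary coarse step roughly doubles the remaining work.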
Mechanical abrasion operations can also be carried out by rubbing the specimen on a series of stationary abrasive strips arranged in increasing fineness. This method is not recommended because of the difficulty in maintaining a flat surface.
Polishing

After mechanical abrasion, a sample is polished so that the surface is specularly reflective and suitable for examination with an optical or scanning electron microscope (SEM). Metallographic polishing is carried out both mechanically and electrolytically. In some cases, where etching is unnecessary or even undesirable—for example, the study of porosity distribution, the detection of cracks, the measurement of plating or coating thickness, and microlevel compositional analysis—polishing is the final step in sample preparation. Mechanical polishing is essentially an extension of mechanical abrasion; however, in mechanical polishing, the particles are suspended in a liquid within the fibers of a cloth, and the wheel rotation speed is between 150 and 600 rpm. Because of the manner in which the abrasive particles are suspended, less force is exerted on the sample surface, resulting in shallower grooves. The choice of polishing cloth depends on the particular application. When the specimen is particularly susceptible to mechanical damage, a cloth with high nap is preferred. On the other hand, if surface flatness is a concern (e.g., edge retention) or if problems such as second-phase ‘‘pullout’’ are encountered, a napless cloth is the proper choice. Note that in selected applications a high-nap cloth may be used as a ‘‘backing’’ for a napless cloth to provide a limited amount of cushioning and retention of polishing medium. Typically, the sample is rotated continuously around the central axis of the wheel, counter to the direction of wheel rotation, and the polishing pressure is held constant until nearly the end, when it is greatly reduced for the finishing touches. The abrasive particles used are typically diamond (6 μm and 1 μm), alumina (0.5 μm and 0.03 μm), and colloidal silica and colloidal magnesia. When very high quality samples are required, rotating-wheel polishing is usually followed by vibratory polishing.
This also uses an abrasive slurry with diamond, alumina, and colloidal silica and magnesia particles. The samples, usually in weighted holders, are placed on a platen which vibrates in such a way that the samples track a circular path. This method can be adapted for chemo-mechanical polishing by adding chemicals either to attack selected constituents or to suppress selective attack. The end result of vibratory polishing is a specularly reflective surface that is almost free of deformation caused by the previous steps in the sample preparation process. Once the procedure is optimized, vibratory polishing allows a large number of samples to be polished simultaneously with reproducibly excellent quality. Electrolytic polishing is used on a sample after mechanical abrasion to a 400- or 600-grit finish. It too produces a specularly reflective surface that is nearly free of deformation.
SAMPLE PREPARATION FOR METALLOGRAPHY
Electropolishing is commonly used for alloys that are hard to prepare or particularly susceptible to deformation artifacts, such as mechanical twinning in Mg, Zr, and Bi. Electropolishing may be used when edge retention is not required or when a large number of similar samples is expected, for example, in process control and alloy development. The use of electropolishing is not widespread, however, as it (1) has a long development time; (2) requires special equipment; (3) often requires the use of highly corrosive, poisonous, or otherwise dangerous chemicals; and (4) can cause accelerated edge attack, resulting in an enlargement of cracks and porosity, as well as preferential attack of some constituent phases. In spite of these disadvantages, electropolishing may be considered because of its processing speed; once the technique is optimized for a particular application, there is none better or faster. In electropolishing, the sample is set up as the anode in an electrolytic cell. The cathode material depends on the alloy being polished and the electrolyte: stainless steel, graphite, copper, and aluminum are commonly used. Direct current, usually from a rectified current source, is supplied to the electrolytic cell, which is equipped with an ammeter and voltmeter to monitor electropolishing conditions. Typically, the voltage-current characteristics of the cell are complex. After an initial rise in current, an ‘‘electropolishing plateau’’ is observed. This plateau results from the formation of a ‘‘polishing film,’’ which is a stable, high-resistance viscous layer formed near the anode surface by the dissolution of metal ions. The plateau represents optimum conditions for electropolishing: at lower voltages etching takes place, while at higher voltages there is film breakdown and gas evolution. The mechanism of electropolishing is not well understood, but is generally believed to occur in two stages: smoothing and brightening.
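The voltage-current behavior described above lends itself to a simple numerical sketch: given measured (voltage, current) pairs from the cell's voltmeter and ammeter, the electropolishing plateau can be located as the longest stretch over which the current barely changes with voltage. The data and slope tolerance below are invented for illustration only; they are not measurements for any real electrolyte:

```python
# Locate the "electropolishing plateau" in a voltage-current curve as the
# longest run of points where |dI/dV| stays below a chosen tolerance.
# Voltages must be strictly increasing; all values here are fabricated.

def find_plateau(volts, amps, slope_tol=0.05):
    """Return (V_start, V_end) of the longest low-slope run, or None."""
    best = None
    start = 0
    for i in range(1, len(volts)):
        slope = (amps[i] - amps[i - 1]) / (volts[i] - volts[i - 1])
        if abs(slope) > slope_tol:
            start = i  # slope too steep: any plateau ended before point i
        elif best is None or volts[i] - volts[start] > best[1] - best[0]:
            best = (volts[start], volts[i])
    return best

volts = [2, 4, 6, 8, 10, 12, 14, 16, 18]
amps = [0.1, 0.5, 0.9, 1.0, 1.02, 1.03, 1.05, 1.6, 2.4]  # rise, plateau, breakdown
print(find_plateau(volts, amps))  # low-slope region between the rise and breakdown
```

In practice the plateau is usually judged by eye from the meters, but the same idea (a region of near-zero slope between the etching rise and the gas-evolution breakdown) is what the operator is looking for.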
The smoothing stage is characterized by a preferential dissolution of the ridges formed by mechanical abrasion (primarily because the resistance at the peak is lower than in the valley). This results in the formation of the viscous polishing film. The brightening phase is characterized by the elimination of extremely small ridges, on the order of 0.01 μm. Electropolishing requires the optimization of many parameters, including electrolyte composition, cathode material, current density, bath temperature, bath agitation, anode-to-cathode distance, and anode orientation (horizontal, vertical, etc.). Other factors, such as whether the sample should be removed before or after the current is switched off, must also be considered. During the development of an electropolishing procedure, the microstructure should first be prepared by more conventional means so that any electropolishing artifacts can be identified.

Etching

After the sample is polished, it may be etched to enhance the contrast between various constituent phases and microstructural features. Chemical, electrochemical, and physical methods are available. Contrast on as-polished
surfaces may also be enhanced by nondestructive methods, such as dark-field illumination and backscattered electron imaging (see GENERAL VACUUM TECHNIQUES). In chemical and electrochemical etching, the desired contrast can be achieved in a number of ways, depending on the technique employed. Contrast-enhancement mechanisms include selective dissolution; formation of a film, whose thickness varies with the crystallographic orientation of grains; formation of etch pits and grooves, whose orientation and density depend on grain orientation; and precipitation etching. A variety of chemical mixtures are used for selective dissolution. Heat tinting—the formation of oxide film—and anodizing both produce films that are sensitive to polarized light. Physical etching techniques, such as ion etching and thermal etching, depend on the selective removal of atoms. When developing a particular etching procedure, it is important to determine the ‘‘etching limit,’’ below which some microstructural features are masked and above which parts of the microstructure may be removed due to excessive dissolution. For a given etchant, the most important factor is the etching time. Consequently, it is advisable to etch the sample in small time increments and to examine the microstructure between each step. Generally the optimum etching program is evident only after the specimen has been over-etched, so at least one polishing-etching-polishing iteration is usually necessary before a properly etched sample is obtained.
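The etch-examine-repeat routine recommended above can be expressed as a small loop. The step size, the upper limit, and the contrast check below are hypothetical stand-ins for the judgment an operator applies at the microscope between increments:

```python
# Sketch of incremental etching: etch in small steps, examine after each,
# and stop once adequate contrast develops or the over-etch limit is hit.
# All times and the contrast criterion are hypothetical illustrations.

def incremental_etch(step_s, max_total_s, contrast_ok):
    """Etch in step_s increments until contrast_ok(total) or max_total_s."""
    total = 0
    while total < max_total_s:
        total += step_s          # etch one more increment
        if contrast_ok(total):   # examine the microstructure between steps
            return total
    return None  # limit reached without adequate contrast: repolish and retry

# Hypothetical case: adequate contrast develops at 30 s of cumulative etching.
result = incremental_etch(step_s=10, max_total_s=50, contrast_ok=lambda t: t >= 30)
print(result)
```

The `None` branch corresponds to the polishing-etching-polishing iteration the text describes: if the limit is reached, the sample is repolished and the program adjusted.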
ILLUSTRATIVE EXAMPLES

The particular combination of steps used in metallographic sample preparation depends largely on the application. A thorough literature survey undertaken before beginning sample preparation will reveal techniques used in similar applications. The four examples below illustrate the development of a successful metallographic sample preparation procedure.

General Microstructural Evaluation of 4340 Steel

Samples are to be prepared for the general microstructural evaluation of 4340 steel. Fewer than three samples per day of 1-in. (2.5-cm) material are needed. The general microstructure is expected to be tempered martensite, with a bulk hardness of HRC 40 (hardness on the Rockwell C scale). An evaluation of decarburization is required, but a plating-thickness measurement is not needed. The following sample preparation procedure is suggested, based on past experience and a survey of the metallographic literature for steel.

Sectioning. An important objective in sectioning is to avoid ‘‘burning,’’ which can temper the martensite and cause some decarburization. Based on the hardness and required thickness in this case, sectioning is best accomplished using a 60-grit, rubber-resin-bonded alumina wheel and cutting the section while it is submerged in a coolant. The cutting pressure should be such that the 1-in. samples can be cut in 1 to 2 min.
Mounting. When mounting the sample, the aim is to retain the edge and to facilitate SEM examination. A conducting epoxy mount is suggested, using an appropriate combination of temperature, pressure, and time to ensure that the specimen-mount separation is minimized.

Mechanical Abrasion. To minimize the time needed for mechanical abrasion, a semiautomatic polishing head with a three-sample holder should be used. The wheel speed should be 150 rpm. Grinding should begin with 180-grit silicon carbide and continue in the sequence 240, 320, 400, and 600 grit. Water should be used as a lubricant, and the sample should be rinsed between each change in grit. Rotation of the sample holder should be in the sense counter to the wheel rotation. This process takes 35 min.

Mechanical Polishing. The objective is to produce a deformation-free and specularly reflective surface. After mechanical abrasion, the sample-holder assembly should be cleaned in an ultrasonicator. Polishing is done with medium-nap cloths, using a 6-μm diamond abrasive followed by a 1-μm diamond abrasive. The holder should be cleaned in an ultrasonicator between these two steps. A wheel speed of 300 rpm should be used and the specimen should be rotated counter to the wheel rotation. Polishing requires 10 min for the first step and 5 min for the second step. (Duration decreases because successively lighter damage from previous steps requires shorter removal times in subsequent steps.)

Etching. The aim is to reveal the structure of the tempered martensite as well as any evidence of decarburization. Etching should begin with super picral for 30 s. The sample should be examined and then etched for an additional 10 s, if required. In developing this procedure, the samples were found to be over-etched at 50 s.

Measurement of Cadmium Plating Composition and Thickness on 4340 Steel

This is an extension of the previous example.
It illustrates the manner in which an existing procedure can be modified slightly to provide a quick and reliable technique for a related application. The measurement of plating composition and thickness requires a special edge-retention treatment due to the difference in the hardness of the cadmium plating and the bulk specimen. Minor modifications are also required to the polishing procedure due to the possibility of a selective chemical attack. Based on a literature survey and past experience, the previous sample preparation procedure was modified to accommodate a measurement of plating composition and thickness.

Sectioning. When the sample is cut with an alumina wheel, several millimeters of the cadmium plating will be damaged below the cut. Hand grinding at 120 grit will quickly reestablish a sound layer of cadmium at the surface. An alternative would be to use a diamond saw for sectioning, but this would require a significantly longer cutting time. After sectioning and before mounting, the sample should be plated with electroless nickel. This will surround the cadmium plating with a hard layer of nickel-sulfur alloy (HRC 60) and eliminate rounding of the cadmium plating during grinding.

Polishing. A buffered solution should be used during polishing to reduce the possibility of selective galvanic attack at the steel-cadmium interface.

Etching. Etching is not required, as the examination will be more exact on an unetched surface. The evaluation, which requires both thickness and compositional measurements, is best carried out with a scanning electron microscope equipped with an energy-dispersive spectroscope (EDS, see SYMMETRY IN CRYSTALLOGRAPHY).

Microstructural Evaluation of 7075-T6 Anodized Aluminum Alloy

Samples are required for the general microstructural evaluation of the aluminum alloy 7075-T6. The bulk hardness is HRB 80 (Rockwell B scale). A single 1/2-in.-thick (1.25-cm) sample will be prepared weekly. The anodized thickness is specified as 1 to 2 μm, and a measurement is required. The following sample preparation procedure is suggested, based on past experience and a survey of the metallographic literature for aluminum.

Sectioning. The aim is to avoid excessive deformation. Based on the hardness and because the aluminum is anodized, sectioning should be done with a low-speed diamond saw, using copious quantities of coolant. This will take 20 min.

Mounting. The goal is to retain the edge and to facilitate SEM examination. In order to preserve the thin anodized layer, electroless nickel plating is required before mounting. The anodized surface should first be brushed with an intermediate layer of colloidal silver paint and then plated with electroless nickel for edge retention.
A conducting epoxy mount should be used, with an appropriate combination of temperature, pressure, and time to ensure that the specimen-mount separation is minimized.

Mechanical Abrasion. Manual abrasion is suggested, with water as a lubricant. The wheel should be rotated at 150 rpm, and the specimen should be held perpendicular to the platen and moved from the outer edge to the center of the grinding paper. Grinding should begin with 320-grit silicon carbide and continue with 400- and 600-grit paper. The sample should be rinsed and turned 90° between each grit. The time needed is 15 min.

Mechanical Polishing. The aim is to produce a deformation-free and specularly reflective surface. After mechanical abrasion, the holder should be cleaned in an
ultrasonicator. Polishing is accomplished using medium-nap cloths, first with a 0.5-μm α-alumina abrasive and then with a 0.03-μm γ-alumina abrasive. The holder should be cleaned in an ultrasonicator between these two steps. A wheel speed of 300 rpm should be used, and the specimen should be rotated counter to the wheel rotation. Polishing requires 10 min for the first step and 5 min for the second step.

SEM Examination. The objective is to image the anodized layer in backscattered electron mode and measure its thickness. This step is best accomplished using an as-polished surface.

Etching. Etching is required to reveal the microstructure in the T6 state (solution heat treated and artificially aged). Keller’s reagent (2 mL 48% HF/3 mL concentrated HCl/5 mL concentrated HNO3/190 mL H2O) can be used to distinguish between T4 (solution heat treated and naturally aged to a substantially stable condition) and T6 heat treatments; supplementary electrical conductivity measurements will also aid in distinguishing between T4 and T6. The microstructure should also be checked against standard sources in the literature, however.

Microstructural Evaluation of Deformed High-Purity Aluminum

A sample preparation procedure is needed for a high volume of extremely soft samples that were previously deformed and partially recrystallized. The objective is to produce samples with no artifacts and to reveal the fine substructure associated with the thermomechanical history. There was no in-house experience, and an initial survey of the metallographic literature for high-purity aluminum did not reveal a previously developed technique. A broader literature search that included Ph.D. dissertations uncovered a successful procedure (Connell, 1972). The methodology is sufficiently detailed that only slight in-house adjustments are needed to develop a fast and highly reliable sample preparation procedure.

Sectioning. The aim is to avoid excessive deformation of the extremely soft samples.
A continuous-loop wire saw should be used with a silicon-carbide abrasive slurry. The 1/4-in. (0.6-cm) section will be cut in 10 min.

Mounting. In order to avoid any microstructural recovery effects, the sample should be mounted at room temperature. An electrical contact is required for subsequent electropolishing; an epoxy mount with an embedded electrical contact could be used. Multiple epoxy mounts should be cured overnight in a cool chamber.

Mechanical Abrasion. Semiautomatic abrasion and polishing is suggested. Grinding begins with 600-grit silicon carbide, using water as lubricant. The wheel is rotated at 150 rpm, and the sample is held counter to wheel rotation and rinsed after grinding. This step takes 5 min.

Mechanical Polishing. The objective is to produce a surface suitable for electropolishing and etching. After mechanical abrasion, and between the two polishing steps, the holder and samples should be cleaned in an ultrasonicator. Polishing is accomplished using medium-nap cloths, first with 1200- and then 3200-mesh emery in soap solution. A wheel speed of 300 rpm should be used, and the holder should be rotated counter to the wheel rotation. Mechanical polishing requires 10 min for the first step and 5 min for the second step.

Electrolytic Polishing and Etching. The aim is to reveal the microstructure without metallographic artifacts. An electrolyte containing 8.2 cm3 HF, 4.5 g boric acid, and 250 cm3 deionized water is suggested. A chlorine-free graphite cathode should be used, with an anode-cathode spacing of 2.5 cm and low agitation. The open circuit voltage should be 20 V. The time needed for polishing is 30 to 40 s, with an additional 15 to 25 s for etching.

COMMENTARY

These examples emphasize two points. The first is the importance of conducting a thorough literature search before developing a new sample preparation procedure. The second is that any attempt to categorize metallographic procedures through a series of simple steps can misrepresent the field. Instead, an attempt has been made to give the reader an overview with selected examples of various complexity. While these metallographic sample preparation procedures were written with the layman in mind, the literature and Internet sources should be useful for practicing metallographers.

LITERATURE CITED

Anosov, P.P. 1954. Collected Works. Akademiya Nauk SSSR, Moscow.

ASM Handbook Committee. 1985. ASM Metals Handbook Volume 9: Metallography and Microstructures. ASM International, Metals Park, Ohio.

Connell, R.G. Jr. 1972. The Microstructural Evolution of Aluminum During the Course of High-Temperature Creep. Ph.D. thesis, University of Florida, Gainesville.

Rhines, F.N. 1968. Introduction. In Quantitative Microscopy (R.T. DeHoff and F.N. Rhines, eds.) pp. 1-10. McGraw-Hill, New York.

Sorby, H.C. 1864. On a new method of illustrating the structure of various kinds of steel by nature printing. Sheffield Lit. Phil. Soc., Feb. 1864.

Vander Voort, G. 1984. Metallography: Principles and Practice. McGraw-Hill, New York.

KEY REFERENCES

Books
Huppmann, W.J. and Dalal, K. 1986. Metallographic Atlas of Powder Metallurgy. Verlag Schmid. [Order from Metal Powder Industries Foundation, Princeton, N.J.] Comprehensive compendium of powder metallurgical microstructures.
ASM Handbook Committee, 1985. See above.
The single most complete and authoritative reference on metallography. No metallographic sample preparation laboratory should be without a copy.

Petzow, G. 1978. Metallographic Etching. American Society for Metals, Metals Park, Ohio.
Comprehensive reference for etching recipes.

Samuels, L.E. 1982. Metallographic Polishing by Mechanical Methods, 3rd ed. American Society for Metals, Metals Park, Ohio.
Complete description of mechanical polishing methods.

Smith, C.S. 1960. A History of Metallography. University of Chicago Press, Chicago.
Excellent account of the history of metallography for those desiring a deeper understanding of the field’s development.

Vander Voort, 1984. See above.
One of the most popular and thorough books on the subject.

Periodicals

Praktische Metallographie/Practical Metallography (bilingual German-English, monthly). Carl Hanser Verlag, Munich.
Metallography (English, bimonthly). Elsevier, New York.
Structure (English, German, French editions; twice yearly). Struers, Rodovre, Denmark.
Microstructural Science (English, yearly). Elsevier, New York.

INTERNET RESOURCES

NOTE: *Indicates a ‘‘must browse’’ site.

Metallography: General Interest

http://www.metallography.com/ims/info.htm
*International Metallographic Society. Membership information, links to other sites, including the virtual metallography laboratory, gallery of metallographic images, and more.

http://www.metallography.com/index.htm
*The Virtual Metallography Laboratory. Extremely informative and useful; probably the most important site to visit.

http://kelvin.seas.virginia.edu/jaw/mse3101/w4/mse40.htm#Objectives
*Optical Metallography of Steel. Excellent exposition of the general concepts, by J.A. Wert.

http://www.kaker.com
*Kaker d.o.o. Database of metallographic etches, database of vendors of microscopy products, and excellent links to other sites.

http://www.ozelink.com/metallurgy
Metallurgy Books. Good site to search for metallurgy books online.

Microscopy and Microstructures

http://microstructure.copper.org
Copper Development Association. Excellent site for copper alloy microstructures. Few links to other sites.

http://www.microscopy-online.com
Microscopy Online. Forum for information exchange, links to vendors, and general information on microscopy.

http://www.mwrn.com
MicroWorld Resources and News. Annotated guide to online resources for microscopists and microanalysts.

http://www.precisionimages.com/gatemain.htm
Digital Imaging. Good background information on digital imaging technologies and links to other imaging sites.

Commercial Producers of Metallographic Equipment and Supplies

http://www.2spi.com/spihome.html
Structure Probe. Good site for finding out about the latest in electron microscopy supplies, and useful for contacting SPI’s technical personnel. Good links to other microscopy sites.

http://www.lamplan.fr/ or [email protected]
LAM PLAN SA. Good site to search for Lam Plan products.

http://www.struers.com/default2.htm
*Struers. Excellent site with useful resources, online guide to metallography, literature sources, subscriptions, and links.

http://www.buehlerltd.com/index2.html
Buehler. Good site to locate the latest Buehler products.

http://www.southbaytech.com
Southbay. Excellent site with many links to useful Internet resources, and good search engine for Southbay products.

Archaeometallurgy

http://masca.museum.upenn.edu/sections/met_act.html
Museum Applied Science Center for Archeology, University of Pennsylvania. Fair presentation of archaeometallurgical data. Few links to other sites.

http://users.ox.ac.uk/salter
*Materials Science-Based Archeology Group, Oxford University. Excellent presentation of archaeometallurgical data, and very good links to other sites.

Standards

http://www.astm.org/COMMIT/e-4.htm
*ASTM E-4 Committee on Metallography. Excellent site for understanding the ASTM metallography committee activities. Good exposition of standards related to metallography and the philosophy behind the standards. Good links to other sites.

http://www2.arnes.si/sgszmera1/standard.html#main
*Academic and Research Network of Slovenia. Excellent site for list of worldwide standards related to metallography and microscopy. Good links to other sites.
ATUL B. GOKHALE
MetConsult, Inc.
New York, New York
COMPUTATION AND THEORETICAL METHODS

INTRODUCTION
Traditionally, the design of new materials has been driven primarily by phenomenology, with theory and computation providing only general guiding principles and, occasionally, the basis for rationalizing and understanding the fundamental principles behind known materials properties. Whereas these are undeniably important contributions to the development of new materials, the direct and systematic application of these general theoretical principles and computational techniques to the investigation of specific materials properties has been less common. However, there is general agreement within the scientific and technological community that modeling and simulation will be of critical importance to the advancement of scientific knowledge in the 21st century, becoming a fundamental pillar of modern science and engineering. In particular, we are currently at the threshold of quantitative and predictive theories of materials that promise to significantly alter the role of theory and computation in materials design. The emerging field of computational materials science is likely to become a crucial factor in almost every aspect of modern society, impacting industrial competitiveness, education, science, and engineering, and significantly accelerating the pace of technological developments. At present, a number of physical properties, such as cohesive energies, elastic moduli, and expansion coefficients of elemental solids and intermetallic compounds, are routinely calculated from first principles, i.e., by solving the celebrated equations of quantum mechanics: either Schrödinger’s equation, or its relativistic version, Dirac’s equation, which provide a complete description of electrons in solids. Thus, properties can be predicted using only the atomic numbers of the constituent elements and the crystal structure of the solid as input.

These achievements are a direct consequence of a mature theoretical and computational framework in solid-state physics, which, to be sure, has been in place for some time. Furthermore, the ever-increasing availability of midlevel and high-performance computing, high-bandwidth networks, and high-volume data storage and management has pushed the development of efficient and computationally tractable algorithms to tackle increasingly more complex simulations of materials. The first-principles computational route is, in general, more readily applicable to solids that can be idealized as having a perfect crystal structure, devoid of grain boundaries, surfaces, and other imperfections. The realm of engineering materials, be it for structural, electronics, or other applications, is, however, that of ‘‘defective’’ solids. Defects and their control dictate the properties of real materials. There is, at present, an impressive body of work in materials simulation, which is aimed at understanding properties of real materials. These simulations rely heavily on either a phenomenological or semiempirical description of atomic interactions.

The units in this chapter of Methods in Materials Research have been selected to provide the reader with a suite of theoretical and computational tools, albeit at an introductory level, that begins with the microscopic description of electrons in solids and progresses towards the prediction of structural stability, phase equilibrium, and the simulation of microstructural evolution in real materials. The chapter also includes units devoted to the theoretical principles of well-established characterization techniques that are best suited to provide exacting tests to the predictions emerging from computation and simulation. It is envisioned that the topics selected for publication will accurately reflect significant and fundamental developments in the field of computational materials science. Due to the nature of the discipline, this chapter is likely to evolve as new algorithms and computational methods are developed, providing not only an up-to-date overview of the field, but also an important record of its evolution.
JUAN M. SANCHEZ
INTRODUCTION TO COMPUTATION

Although the basic laws that govern the atomic interactions and dynamics in materials are conceptually simple and well understood, the remarkable complexity and variety of properties that materials display at the macroscopic level seem unpredictable and are poorly understood. Such a situation of basic well-known governing principles but complex outcomes is highly suited for a computational approach. This ultimate ambition of materials science—to predict macroscopic behavior from microscopic information (e.g., atomic composition)—has driven the impressive development of computational materials science. As is demonstrated by the number and range of articles in this volume, predicting the properties of a material from atomic interactions is by no means an easy task! In many cases it is not obvious how the fundamental laws of physics conspire with the chemical composition and structure of a material to determine a macroscopic property that may be of interest to an engineer. This is not surprising given that on the order of 10^26 atoms may participate in an observed property. In some cases, properties are simple ‘‘averages’’ over the contributions of these atoms, while for other properties only extreme deviations from the mean may be important. One of the few fields in which a well-defined and justifiable procedure to go from the
atomic level to the macroscopic level exists is the equilibrium thermodynamics of homogeneous materials. In this case, all atoms ‘‘participate’’ in the properties of interest and the macroscopic properties are determined by fairly straightforward averages of microscopic properties. Even with this benefit, the prediction of alloy phase diagrams is still a formidable challenge, as is nicely illustrated in PREDICTION OF PHASE DIAGRAMS. Unfortunately, for many other properties (e.g., fracture), the macroscopic evolution of the material is strongly influenced by singularities in the microscopic distribution of atoms: for instance, a few atoms that surround a void or a cluster of impurity atoms. This dependence of a macroscopic property on small details of the microscopic distribution makes defining a predictive link between the microscopic and macroscopic much more difficult. Placing some of these difficulties aside, the advantages of computational modeling for the properties that can be determined in this fashion are significant. Computational work tends to be less costly and much more flexible than experimental research. This makes it ideally suited for the initial phase of materials development, where the flexibility of switching between many different materials can be a significant advantage. However, the ultimate advantage of computing methods, both in basic materials research and in applied materials design, is the level of control one has over the system under study. Whereas in an experimental situation nature is the arbiter of what can be realized, in a computational setting only creativity limits the constraints that can be forced onto a material. A computational model usually offers full and accurate control over structure, composition, and boundary conditions. This allows one to perform computational ‘‘experiments’’ that separate out the influence of a single factor on the property of the material. 
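The ‘‘fairly straightforward averages’’ of equilibrium thermodynamics mentioned above can be made concrete with a toy sketch: a macroscopic property estimated as the Boltzmann-weighted mean over microscopic states. The energies and per-atom volumes below are invented for illustration and are not data for any real material:

```python
import math

# Toy equilibrium-thermodynamics average: a macroscopic expectation value
# as the Boltzmann-weighted mean over microscopic states of a model system.
# All energies and property values below are fabricated for illustration.

def boltzmann_average(energies_eV, values, temperature_K):
    """<A> = sum_i A_i exp(-E_i/kT) / sum_i exp(-E_i/kT)."""
    k_B = 8.617e-5  # Boltzmann constant, eV/K
    weights = [math.exp(-E / (k_B * temperature_K)) for E in energies_eV]
    Z = sum(weights)  # partition function
    return sum(a * w for a, w in zip(values, weights)) / Z

energies = [0.0, 0.05, 0.20]   # hypothetical state energies (eV)
volumes = [10.0, 10.5, 11.2]   # hypothetical per-atom volumes (cubic angstroms)

# At low temperature the ground state dominates the average;
# at high temperature the excited states contribute appreciably.
print(boltzmann_average(energies, volumes, 100))
print(boltzmann_average(energies, volumes, 2000))
```

This is the sense in which ‘‘all atoms participate’’: every microscopic state contributes in proportion to its Boltzmann weight, and the macroscopic property follows from the sum. Real phase-diagram calculations apply the same principle to vastly larger configuration spaces, which is what makes them a formidable challenge.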
An interesting example may be taken from this author’s research on lithium metal oxides for rechargeable Li batteries. These materials are crystalline oxides that can reversibly absorb and release Li ions through a mechanism called intercalation. Because they can do this at low chemical potential for Li, they are used on the cathode side of a rechargeable Li battery. In the discharge cycle of the battery, Li ions arrive at the cathode and are stored in the crystal structure of the lithium metal oxide. This process is reversed upon charging. One of the key properties of these materials is the electrochemical potential at which they intercalate Li ions, as it directly determines the battery voltage. Figure 1A shows the potential range at which many transition metal oxides intercalate Li as a function of the number of d electrons in the metal (Ohzuku and Atsushi, 1994). While the graph indicates some upward trend of potential with the number of d electrons, this relation may be perturbed by several other parameters that change as one goes from one material to the other: many of the transition metal oxides in Figure 1A are in different crystal structures, and it is not clear to what extent these structural variations affect the intercalation potential. An added complexity in oxides comes from the small variation in average valence state of the cations, which may result in different oxygen composition, even when the
Figure 1. (A) Intercalation potential curves for lithium in various metal oxides as a function of the number of d electrons on the transition metal in the compound. (Taken from Ohzuku and Atsushi, 1994.) (B) Calculated intercalation potential for lithium in various LiMO2 compounds as a function of the structure of the compound and the choice of metal M. The structures are denoted by their prototype.
chemical formula (based on conventional valences) would indicate the stoichiometry to be the same. These factors convolute the dependence of intercalation potential on the choice of transition metal, making it difficult to separate the roles of each independent factor. Computational methods are better suited to separating the influence of these different factors. Once a method for calculating the intercalation potential has been established, it can be applied to any system, in any crystal structure or oxygen
stoichiometry, whether such conditions correspond to the equilibrium structure of the material or not. By varying only one variable at a time in a calculation of the intercalation potential, a systematic study of each variable (e.g., structure, composition, stoichiometry) can be performed. Figure 1B, the result of a series of ab initio calculations (Aydinol et al., 1997), clearly shows the independent effects of structure and of the metal in the oxide. Within the 3d transition metals, the effect of structure is almost as large as the effect of the number of d electrons. Only for the non-d metals (Zn, Al) is the effect of metal choice dramatic. The calculation also shows that among the 3d metal oxides, LiCoO2 in the spinel structure (Al2MgO4) would display the highest potential. Clearly, the advantage of the computational approach is not merely that one can predict the property of interest (in this case the intercalation potential) but also that the factors that may affect it can be controlled systematically. Whereas the links between atomic-level phenomena and macroscopic properties form the basis for the control and predictive capabilities of computational modeling, they also constitute its disadvantages. The fact that properties must be derived from microscopic energy laws (often quantum mechanics) gives a method its predictive character but also holds the potential for substantial errors in the result of the calculation. It is not currently possible to calculate exactly the quantum mechanical energy of even a perfect crystalline array of atoms. Any errors in the description of the energetics of a system will ultimately show up in the derived macroscopic results. Many computational models are therefore still not fully quantitative. In some cases, it has not even been possible to identify an explicit link between the microscopic and the macroscopic, so quantitative materials studies are not yet possible.
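The quantity computed in such studies is the average intercalation voltage, which follows from the total-energy difference between the lithiated host, the delithiated host, and metallic Li (the approach of Aydinol et al., 1997). A minimal sketch, with illustrative energies that are not real calculated values:

```python
def average_voltage(E_lithiated, E_delithiated, E_Li, x=1.0):
    """Average intercalation voltage (V) for inserting x Li per formula unit:
    V = -[E(Li_x Host) - E(Host) - x * E(Li)] / (x * e).
    With energies in eV, the electron charge e drops out numerically."""
    delta_E = E_lithiated - E_delithiated - x * E_Li
    return -delta_E / x

# Illustrative numbers only (eV per formula unit), not calculated values:
V = average_voltage(E_lithiated=-30.0, E_delithiated=-24.1, E_Li=-1.9)
# delta_E = -30.0 + 24.1 + 1.9 = -4.0 eV, so V = 4.0 V
```

The sign convention makes a more negative reaction energy (stronger Li binding in the host) correspond to a higher cell voltage.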
The units in this chapter deal with a large variety of physical phenomena: for example, prediction of physical properties and phase equilibria, simulation of microstructural evolution, and simulation of chemical engineering processes. Readers may notice that these areas are at different stages in their evolution in applying computational modeling. The most advanced field is probably the prediction of physical properties and phase equilibria in alloys, where a well-developed formalism exists to go from the microscopic to the macroscopic. Combining quantum mechanics and statistical mechanics, a full ab initio theory has developed in this field to predict physical properties and phase equilibria, with no more input than the chemical constitution of the system (Ducastelle, 1991; Ceder, 1993; de Fontaine, 1994; Zunger, 1994). Such a theory is predictive, and is well suited to the development and study of novel materials for which little or no experimental information is known and to the investigation of materials under extreme conditions. In many other fields, such as the study of microstructure or mechanical properties, computational models are still at a stage where they are mainly used to investigate the qualitative behavior of model systems, and system-specific results are usually minimal or nonexistent. This lack of an ab initio theory reflects the very complex relation between these properties and the behavior of the constituent atoms. An example may be given from the
molecular dynamics work on fracture in materials (Abraham, 1997). Typically, such fracture simulations are performed on systems with idealized interactions and under somewhat restrictive boundary conditions. At this time, the value of such modeling techniques is that they can provide complete and detailed information on a well-controlled system and thereby advance the science of fracture in general. Calculations that discern the specific differences between alloys (say, Ti-6Al-4V and TiAl) are currently not possible but may be derived from schemes in which the link between the microscopic and the macroscopic is established more heuristically (Eberhart, 1996). Many of the mesoscale models (grain growth, film deposition) described in the papers in this chapter are also in this stage of "qualitative modeling." In many cases, however, some agreement with experiments can be obtained for suitable values of the input parameters. One may expect that many of these computational methods will slowly evolve toward a more predictive nature as methods are linked in a systematic way. The future of computational modeling in materials science is promising. Many of the trends that have contributed to the rapid growth of this field are likely to continue into the next decade. Figure 2 shows the exponential increase in computational speed over the last 50 years. The true situation is even better than what is depicted in Figure 2, as computer resources have also become less expensive. Over the last 15 years the ratio of computational power to price has increased by a factor of 10^4. Clearly, no other tool in materials science and engineering can boast such a dramatic improvement in performance.
Figure 2. Peak performance of the fastest computer models built as a function of time. The performance is in floating-point operations per second (FLOPS). Data from Fox and Coddington (1993) and from manufacturers' information sheets.
However, it would be unwise to chalk up the rapid progress of computational modeling solely to the availability of cheaper and faster computers. Even more significant for the progress of this field may be the algorithmic development for simulation and quantum mechanical techniques. Highly accurate implementations of the local density approximation (LDA) to quantum mechanics [and its extension to the generalized gradient approximation (GGA)] are now widely available. They are considerably faster and much more accurate now than only a few years ago. The Car-Parrinello method and related algorithms have significantly improved the equilibration of quantum mechanical systems (Car and Parrinello, 1985; Payne et al., 1992). There is no reason to expect this trend to stop, and it is likely that the most significant advances in computational materials science will be realized through novel methods development rather than from ultra-high-performance computing. Significant challenges remain. In many cases the accuracy of ab initio methods is orders of magnitude less than that of experimental methods. For example, in the calculation of phase diagrams an error of 10 meV, not large at all by ab initio standards, corresponds to an error of more than 100 K. The time and size scales over which materials phenomena occur remain the most significant challenge. Although the smallest size scale in a first-principles method is always that of the atom and electron, the largest size scale at which individual features matter for a macroscopic property may be many orders of magnitude larger. For example, microstructure formation ultimately originates from atomic displacements, but the system becomes inhomogeneous on the scale of micrometers through sporadic nucleation and growth of distinct crystal orientations or phases. 
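The statement above that a 10-meV energy error corresponds to an error of more than 100 K follows directly from dividing by the Boltzmann constant; a quick check:

```python
KB_MEV_PER_K = 8.617333e-2  # Boltzmann constant in meV/K

def energy_error_in_kelvin(delta_E_meV):
    """Temperature scale corresponding to an energy error: T = dE / kB."""
    return delta_E_meV / KB_MEV_PER_K

T_err = energy_error_in_kelvin(10.0)  # about 116 K, i.e., more than 100 K
```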
Whereas statistical mechanics provides guidance on how to obtain macroscopic averages for properties in homogeneous systems, there is no general theory for coarse-graining (averaging) inhomogeneous materials. Unfortunately, most real materials are inhomogeneous. Finally, all the power of computational materials science is worth little without a general understanding of its basic methods by all materials researchers. The rapid development of computational modeling has not been paralleled by its integration into educational curricula. Few undergraduate or even graduate programs incorporate computational methods into their curriculum, and their absence from traditional textbooks in materials science and engineering is noticeable. As a result, modeling is still a highly undervalued tool that so far has gone largely unnoticed by much of the materials science and engineering community in universities and industry. Given its potential, however, computational modeling may be expected to become an efficient and powerful research tool in materials science and engineering.
LITERATURE CITED Abraham, F. F. 1997. On the transition from brittle to plastic failure in breaking a nanocrystal under tension (NUT). Europhys. Lett. 38:103–106.
Aydinol, M. K., Kohan, A. F., Ceder, G., Cho, K., and Joannopoulos, J. 1997. Ab-initio study of lithium intercalation in metal oxides and metal dichalcogenides. Phys. Rev. B 56:1354–1365.

Car, R. and Parrinello, M. 1985. Unified approach for molecular dynamics and density functional theory. Phys. Rev. Lett. 55:2471–2474.

Ceder, G. 1993. A derivation of the Ising model for the computation of phase diagrams. Comput. Mater. Sci. 1:144–150.

de Fontaine, D. 1994. Cluster approach to order-disorder transformations in alloys. In Solid State Physics (H. Ehrenreich and D. Turnbull, eds.). pp. 33–176. Academic Press, San Diego.

Ducastelle, F. 1991. Order and Phase Stability in Alloys. North-Holland Publishing, Amsterdam.

Eberhart, M. E. 1996. A chemical approach to ductile versus brittle phenomena. Philos. Mag. A 73:47–60.

Fox, G. C. and Coddington, P. D. 1993. An overview of high performance computing for the physical sciences. In High Performance Computing and Its Applications in the Physical Sciences: Proceedings of the Mardi Gras '93 Conference (D. A. Browne et al., eds.). pp. 1–21. World Scientific, Louisiana State University.

Ohzuku, T. and Atsushi, U. 1994. Why transition metal (di)oxides are the most attractive materials for batteries. Solid State Ionics 69:201–211.

Payne, M. C., Teter, M. P., Allan, D. C., Arias, T. A., and Joannopoulos, J. D. 1992. Iterative minimization techniques for ab-initio total energy calculations: Molecular dynamics and conjugate gradients. Rev. Mod. Phys. 64:1045.

Zunger, A. 1994. First-principles statistical mechanics of semiconductor alloys and intermetallic compounds. In Statics and Dynamics of Alloy Phase Transformations (P. E. A. Turchi and A. Gonis, eds.). pp. 361–419. Plenum, New York.
GERBRAND CEDER Massachusetts Institute of Technology Cambridge, Massachusetts
SUMMARY OF ELECTRONIC STRUCTURE METHODS

INTRODUCTION

Most physical properties of interest in the solid state are governed by the electronic structure—that is, by the Coulombic interactions of the electrons with themselves and with the nuclei. Because the nuclei are much heavier, it is usually sufficient to treat them as fixed. Under this Born-Oppenheimer approximation, the Schrödinger equation reduces to an equation of motion for the electrons in a fixed external potential, namely, the electrostatic potential of the nuclei (additional interactions, such as an external magnetic field, may be added). Once the Schrödinger equation has been solved for a given system, many kinds of materials properties can be calculated. Ground-state properties include the cohesive energy, or heats of compound formation, elastic constants or phonon frequencies (Giannozzi and de Gironcoli, 1991), atomic and crystalline structure, defect formation energies, diffusion and catalysis barriers (Blöchl et al., 1993), and even nuclear tunneling rates (Katsnelson et al.,
1995), magnetic structure (van Schilfgaarde et al., 1996), work functions (Methfessel et al., 1992), and the dielectric response (Gonze et al., 1992). Excited-state properties are accessible as well; however, the reliability of the properties tends to degrade—or requires more sophisticated approaches—the larger the perturbing excitation. Because of the obvious advantage in being able to calculate a wide range of materials properties, there has been an intense effort to develop general techniques that solve the Schrödinger equation from "first principles" for much of the periodic table. An exact, or nearly exact, theory of the ground state in condensed matter is immensely complicated by the correlated behavior of the electrons. Unlike Newton's equation, the Schrödinger equation is a field equation; its solution is equivalent to solving Newton's equation along all paths, not just the classical path of minimum action. For materials with wide-band or itinerant electronic motion, a one-electron picture is adequate, meaning that to a good approximation the electrons (or quasiparticles) may be treated as independent particles moving in a fixed effective external field. The effective field consists of the electrostatic interaction of electrons plus nuclei, plus an additional effective (mean-field) potential that originates in the fact that by correlating their motion, electrons can avoid each other and thereby lower their energy. The effective potential must be calculated self-consistently, such that the effective one-electron potential created from the electron density generates the same charge density through the eigenvectors of the corresponding one-electron Hamiltonian. The other possibility is to adopt a model approach that assumes some model form for the Hamiltonian and has one or more adjustable parameters, which are typically determined by a fit to some experimental property such as the optical spectrum.
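The self-consistency requirement described above (the potential built from the density must reproduce that same density) is, abstractly, a fixed-point problem, usually solved by damped iteration. A toy scalar analogue with simple linear mixing; the map n → 2/(1 + n) is purely illustrative, standing in for the density-to-density map of a real calculation:

```python
def scf_fixed_point(update, n0, mixing=0.5, tol=1e-10, max_iter=200):
    """Solve n = update(n) by linearly mixed (damped) iteration,
    the same strategy used to stabilize real self-consistency loops."""
    n = n0
    for _ in range(max_iter):
        n_out = update(n)
        if abs(n_out - n) < tol:
            return n_out
        # Mix input and output to damp oscillations between iterations.
        n = (1.0 - mixing) * n + mixing * n_out
    raise RuntimeError("SCF did not converge")

# Toy "density" map with fixed point n = 1 (since 1 = 2 / (1 + 1)):
n_sc = scf_fixed_point(lambda n: 2.0 / (1.0 + n), n0=0.2)
```

With no mixing (mixing = 1) such maps can oscillate or diverge; the damped update is the scalar version of the charge-mixing schemes used in practice.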
Today such Hamiltonians are particularly useful in cases beyond the reach of first-principles approaches, such as calculations of systems with large numbers of atoms, or for strongly correlated materials, for which the (approximate) first-principles approaches do not adequately describe the electronic structure. In this unit, the discussion will be limited to the first-principles approaches. Summaries of Approaches The local-density approximation (LDA) is the ‘‘standard’’ solid-state technique, because of its good reliability and relative simplicity. There are many implementations and extensions of the LDA. As shown below (see discussion of The Local Density Approximation) it does a good job in predicting ground-state properties of wide-band materials where the electrons are itinerant and only weakly correlated. Its performance is not as good for narrow-band materials where the electron correlation effects are large, such as the actinide metals, or the late-period transitionmetal oxides. Hartree-Fock (HF) theory is one of the oldest approaches. Because it is much more cumbersome than the LDA, and its accuracy much worse for solids, it is used mostly in chemistry. The electrostatic interaction is called the ‘‘Hartree’’ term, and the Fock contribution that approximates the correlated motion of the electrons is
called "exchange." For historical reasons, the additional energy beyond the HF exchange energy is often called "correlation" energy. As we show below (see discussion of Hartree-Fock Theory), the principal failing of Hartree-Fock theory stems from the fact that the potential entering into the exchange interaction should be screened out by the other electrons. For narrow-band systems, where the electrons reside in atomic-like orbitals, Hartree-Fock theory has some important advantages over the LDA. Its nonlocal exchange serves as a better starting point for more sophisticated approaches. Configuration-interaction theory is an extension of the HF approach that attempts to solve the Schrödinger equation with high accuracy. Computationally, it is very expensive and is feasible only for small molecules with 10 atoms or fewer. Because it is only applied to solids in the context of model calculations (Grant and McMahan, 1992), it is not considered further here. The so-called GW approximation may be thought of as an extension to Hartree-Fock theory, as described below (see discussion under Dielectric Screening, the Random-Phase, GW, and SX Approximations). The GW method incorporates a representation of the Green's function (G) and the Coulomb interaction (W). It is a Hartree-Fock-like theory for which the exchange interaction is properly screened. GW theory is computationally very demanding, but it has been quite successful in predicting, for example, bandgaps in semiconductors. To date, it has only been possible to apply the theory to optical properties, because of difficulties in reliably integrating the self-energy to obtain a total energy. The LDA+U theory is a hybrid approach that uses the LDA for the "itinerant" part and Hartree-Fock theory for the "local" part. It has been quite successful in calculating both ground-state and excited-state properties in a number of correlated systems.
One criticism of this theory is that there exists no unique prescription to renormalize the Coulomb interaction between the local orbitals, as will be described below. Thus, while the method is ab initio, it retains the flavor of a model approach. The self-interaction correction (Svane and Gunnarsson, 1990) is similar to LDA+U theory, in that a subset of the orbitals (such as the f-shell orbitals) are partitioned off and treated in a HF-like manner. It offers a unique and well-defined functional, but tends to be less accurate than the LDA+U theory, because it does not screen the local orbitals. The quantum Monte Carlo approach is not a mean-field approach. It is an ostensibly exact, or nearly exact, approach to determine the ground-state total energy. In practice, some approximations are needed, as described below (see discussion of Quantum Monte Carlo). The basic idea is to evaluate the Schrödinger equation by brute force, using a Monte Carlo approach. While applications to real materials so far have been limited, because of the immense computational requirements, this approach holds much promise with the advent of faster computers.

Implementation

Apart from deciding what kind of mean-field (or other) approximation to use, there remains the problem of
implementation in some kind of practical method. Many different approaches have been employed, especially for the LDA. Both the single-particle orbitals and the electron density and potential are invariably expanded in some basis set, and the various methods differ in the basis set employed. Figure 1 depicts schematically the general types of approaches commonly used. One principal distinction is whether a method employs plane waves for a basis, or atom-centered orbitals. The other primary distinction
Figure 1. Illustration of different methods, as described in the text. The pseudopotential (PP) approaches can employ either plane waves (PW) or local atom-centered orbitals; similarly, the augmented-wave approach employing PW becomes APW or LAPW; using atom-centered Hankel functions it is the KKR method or the method of linear muffin-tin orbitals (LMTO). The PAW (Blöchl, 1994) is a variant of the APW method, as described in the text. LMTO, LSTO, and LCGO are atom-centered augmented-wave approaches with Hankel, Slater, and Gaussian orbitals, respectively, used for the envelope functions.
among methods is the treatment of the core. Valence electrons must be orthogonalized to the inert core states. The various methods address this either (1) by replacing the core with an effective (pseudo)potential, so that the (pseudo)wave functions near the core are smooth and nodeless, or (2) by "augmenting" the wave functions near the nuclei with numerical solutions of the radial Schrödinger equation. It turns out that there is a connection between "pseudizing" and augmenting the core; some of the recently developed methods, such as the projector augmented-wave (PAW) method of Blöchl (1994) and the pseudopotential method of Vanderbilt (1990), may be thought of as a kind of hybrid of the two (Dong, 1998). The augmented-wave basis sets are "intelligently" chosen in that they are tailored to solutions of the Schrödinger equation for a "muffin-tin" potential. A muffin-tin potential is flat in the interstitial region and spherically symmetric inside nonoverlapping spheres centered at each nucleus, and, for close-packed systems, it is a fairly good representation of the true potential. But because the resulting Hamiltonian is energy dependent, both the augmented plane-wave (APW) and augmented atom-centered (Korringa, Kohn, Rostoker; KKR) methods result in a nonlinear algebraic eigenvalue problem. Andersen and Jepsen (1984; see also Andersen, 1975) showed how to linearize the augmented-wave Hamiltonian, and both the APW (now LAPW) and KKR (renamed linear muffin-tin orbitals, LMTO) methods are vastly more efficient. The choice of implementation introduces further approximations, though some techniques now have enough machinery to solve a given one-electron Hamiltonian nearly exactly.
Today the LAPW method is regarded as the "industry standard" high-precision method, though some implementations of the LMTO method produce a corresponding accuracy, as does the plane-wave pseudopotential approach, provided the core states are sufficiently deep and enough plane waves are chosen to make the basis reasonably complete. It is not always feasible to generate a well-converged pseudopotential; for example, the high-lying d cores in Ga can be a little too shallow to be "pseudized" out, but are difficult to treat explicitly in the valence band using plane waves. Traditionally, the augmented-wave approaches have introduced shape approximations to the potential, "spheridizing" the potential inside the augmentation spheres. This is often still done today; the approximation usually tends to be adequate for energy bands in reasonably close-packed systems and for relatively coarse total energy differences. This approximation, combined with enlarging the augmentation spheres and overlapping them so that their volume equals the unit cell volume, is known as the atomic spheres approximation (ASA). Extensions, such as retaining the correction to the spherical part of the electrostatic potential from the nonspherical part of the density (Skriver and Rosengaard, 1991), eliminate most of the errors in the ASA.

Extensions

The "standard" implementations of, for example, the LDA generate electron eigenstates through diagonalization of the one-electron Hamiltonian. As noted before, the
one-electron potential itself must be determined self-consistently, so that the eigenstates generate the same potential that creates them. Some information, such as the total energy and internuclear forces, can be directly calculated as a byproduct of the standard self-consistency cycle. Many other properties require extensions of the "standard" approach. Linear-response techniques (Baroni et al., 1987; Savrasov et al., 1994) have proven particularly fruitful for calculation of a number of properties, such as phonon frequencies (Giannozzi and de Gironcoli, 1991), dielectric response (Gonze et al., 1992), and even alloy heats of formation (de Gironcoli et al., 1991). Linear response can also be used to calculate exchange interactions and spin-wave spectra in magnetic systems (Antropov et al., unpub. observ.). Often the LDA is used as a parameter generator for other methods. Structural energies for phase diagrams are one prime example. Another recent example is the use of precise energy band structures in GaN, where small details in the band structure are critical to how the material behaves under high-field conditions (Krishnamurthy et al., 1997). Numerous techniques have been developed to solve the one-electron problem more efficiently, thus making it accessible to larger-scale problems. Iterative diagonalization techniques have become indispensable for plane-wave basis sets. Though it was not described this way in their original paper, the most important contribution of Car and Parrinello's (1985) seminal work was their demonstration that special features of the plane-wave basis can be exploited to render a very efficient iterative diagonalization scheme. For layered systems, both the eigenstates and the Green's function (Skriver and Rosengaard, 1991) can be calculated in O(N) time, with N being the number of layers (the computational effort in a straightforward diagonalization technique scales as the cube of the size of the basis).
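The flavor of iterative diagonalization can be seen in its simplest member, power iteration, which refines one eigenvector by repeatedly applying the operator instead of diagonalizing it outright. This is only a sketch; production schemes (Davidson, conjugate-gradient minimization, etc.) are far more elaborate, but they share the key property that only the action of H on a vector is needed:

```python
def power_iteration(matvec, dim, n_steps=500):
    """Dominant eigenvalue/eigenvector of a symmetric operator given only
    its action on a vector. For plane-wave bases, applying H to a vector
    costs far less than building and diagonalizing the full matrix."""
    v = [1.0] * dim
    lam = 0.0
    for _ in range(n_steps):
        w = matvec(v)
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
        # Rayleigh quotient estimate of the eigenvalue for the current vector.
        lam = sum(x * y for x, y in zip(v, matvec(v)))
    return lam, v

# Small symmetric test matrix, applied only as an operator:
A = [[4.0, 1.0], [1.0, 3.0]]
mv = lambda v: [sum(A[i][j] * v[j] for j in range(2)) for i in range(2)]
lam, _ = power_iteration(mv, dim=2)  # largest eigenvalue, (7 + sqrt(5))/2
```

Each step costs one operator application, which is the reason such schemes scale so much better than full diagonalization when only a few eigenstates are needed.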
Highly efficient techniques for layered systems are possible in this way. Several other general-purpose O(N) methods have been proposed. A recent class of these methods computes the ground-state energy in terms of the density matrix, but not spectral information (Ordejón et al., 1995). This class of approaches has important advantages for large-scale calculations involving 100 or more atoms, and a recent implementation using the LDA has been reported (Ordejón et al., 1996); however, these methods are mainly useful for insulators. A Green's function approach suitable for metals has been proposed (Wang et al., 1995), and a variant of it (Abrikosov et al., 1996) has proven to be very efficient for studying metallic systems with several hundred atoms.
HARTREE-FOCK THEORY

In Hartree-Fock theory, one constructs a Slater determinant of one-electron orbitals ψ_j. Such a construct makes the total wave function antisymmetric and better enables the electrons to avoid one another, which leads to a lowering of the total energy. The additional lowering is reflected in the emergence of an additional effective (exchange) potential v^x (Ashcroft and Mermin, 1976). The resulting one-electron Hamiltonian has a local part from the direct electrostatic (Hartree) interaction v^H and the external (nuclear) potential v^ext, and a nonlocal part from v^x:

$$ \left[ -\frac{\hbar^2}{2m}\nabla^2 + v^{\mathrm{ext}}(\mathbf{r}) + v^{H}(\mathbf{r}) \right] \psi_i(\mathbf{r}) + \int d^3r'\, v^{x}(\mathbf{r},\mathbf{r}')\, \psi_i(\mathbf{r}') = \epsilon_i\, \psi_i(\mathbf{r}) \qquad (1) $$

$$ v^{H}(\mathbf{r}) = \int d^3r'\, \frac{e^2\, n(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|} \qquad (2) $$

$$ v^{x}(\mathbf{r},\mathbf{r}') = -\frac{e^2}{|\mathbf{r}-\mathbf{r}'|} \sum_j \psi_j^{*}(\mathbf{r}')\, \psi_j(\mathbf{r}) \qquad (3) $$
where e is the electronic charge and n(r) is the electron density. Thanks to Koopmans' theorem, the change in energy from one state to another is simply the difference between the Hartree-Fock eigenvalues ε in the two states. This provides a basis for interpreting the ε in solids as energy bands. In comparison to the LDA (see discussion of The Local Density Approximation), Hartree-Fock theory is much more cumbersome to implement, because the nonlocal exchange potential v^x(r, r′) requires a convolution of v^x and ψ. Moreover, the neglect of correlations beyond the exchange renders it a much poorer approximation to the ground state than the LDA. Hartree-Fock theory also usually describes the optical properties of solids rather poorly. For example, it rather badly overestimates the bandgap in semiconductors. The Hartree-Fock gaps in Si and GaAs are both about 5 eV (Hott, 1991), in comparison to the observed 1.1 and 1.5 eV, respectively.
THE LOCAL-DENSITY APPROXIMATION

The LDA actually originates in the X-α method of Slater (1951), who sought a simplifying approximation to the HF exchange potential. By assuming that the exchange varies in proportion to n^{1/3}, with n the electron density, the HF exchange becomes local, which vastly simplifies the computational effort. Thus, as it was envisioned by Slater, the LDA is an approximation to Hartree-Fock theory, because the exact exchange is approximated by a simple functional of the density n, essentially proportional to n^{1/3}. Modern functionals go beyond Hartree-Fock theory because they include correlation energy as well. Slater's X-α method was put on a firm foundation with the advent of density-functional theory (Hohenberg and Kohn, 1964). It established that the ground-state energy is strictly a functional of the total density. But the energy functional, while formally exact, is unknown. The LDA (Kohn and Sham, 1965) assumes that the exchange plus correlation part of the energy, E^{xc}, is a strictly local functional of the density:

$$ E^{xc}[n] \cong \int d^3r\, n(\mathbf{r})\, \epsilon^{xc}[n(\mathbf{r})] \qquad (4) $$

This ansatz leads, as in the Hartree-Fock case, to an equation of motion for electrons moving independently in
an effective field, except that now the potential is strictly local:

$$ \left[ -\frac{\hbar^2}{2m}\nabla^2 + v^{\mathrm{ext}}(\mathbf{r}) + v^{H}(\mathbf{r}) + v^{xc}(\mathbf{r}) \right] \psi_i(\mathbf{r}) = \epsilon_i\, \psi_i(\mathbf{r}) \qquad (5) $$

This one-electron equation follows directly from a functional derivative of the total energy. In particular, v^{xc}(r) is the functional derivative of E^{xc}:

$$ v^{xc}(\mathbf{r}) \equiv \frac{\delta E^{xc}}{\delta n} \qquad (6) $$

$$ \stackrel{\mathrm{LDA}}{=} \frac{\delta}{\delta n} \int d^3r\, n(\mathbf{r})\, \epsilon^{xc}[n(\mathbf{r})] \qquad (7) $$

$$ = \epsilon^{xc}[n(\mathbf{r})] + n(\mathbf{r})\, \frac{d\, \epsilon^{xc}[n(\mathbf{r})]}{dn} \qquad (8) $$
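For the exchange part, the jellium energy density is known in closed form (the Dirac/Slater expression, proportional to n^{1/3} in Hartree atomic units), and the relation of Equation 8 then gives v^x = (4/3) ε^x. A quick numerical check of that derivative relation:

```python
import math

def eps_x(n):
    """Jellium exchange energy per electron (hartree), Dirac/Slater form:
    eps_x = -(3/4) * (3/pi)^(1/3) * n^(1/3), with density n in bohr^-3."""
    return -0.75 * (3.0 / math.pi) ** (1.0 / 3.0) * n ** (1.0 / 3.0)

def v_x(n, h=1e-6):
    """Exchange potential via Equation 8: v_x = eps_x + n * d(eps_x)/dn,
    with the density derivative taken numerically (central difference)."""
    deriv = (eps_x(n + h) - eps_x(n - h)) / (2.0 * h)
    return eps_x(n) + n * deriv

# Analytically, n * d(eps_x)/dn = eps_x / 3, so v_x = (4/3) * eps_x:
n = 0.5
ratio = v_x(n) / eps_x(n)  # should be close to 4/3
```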
Table 1. Heats of Formation, in eV, for the Hydroxyl (OH + H2 → H2O + H), Tetrazine (H2C2N4 → 2 HCN + N2), and Vinyl Alcohol (C2OH3 → Acetaldehyde) Reactions, Calculated with Different Methods^a

Method   Hydroxyl   Tetrazine   Vinyl Alcohol
HF       0.07       3.41        0.54
LDA      0.69       1.99        0.34
Becke    0.44       2.15        0.45
PW-91    0.64       1.73        0.45
QMC      0.65       2.65        0.43
Expt.    0.63       —           0.42

^a Abbreviations: HF, Hartree-Fock; LDA, local-density approximation; Becke, GGA functional (Becke, 1993); PW-91, GGA functional (Perdew, 1997); QMC, quantum Monte Carlo. Calculations by Grossman and Mitas (1997).
Both exchange and correlation are calculated by evaluating the exact ground state of jellium (in which the discrete nuclear charge is smeared out into a constant background). This is accomplished either by Monte Carlo techniques (Ceperley and Alder, 1980) or by an expansion in the random-phase approximation (von Barth and Hedin, 1972). When the exchange is calculated exactly, the self-interaction terms (the interaction of the electron with itself) in the exchange and direct Coulomb terms cancel exactly. Approximation of the exact exchange by a local density functional means that this is no longer so, and this is one key source of error in the LDA approach. For example, near surfaces, or for molecules, the asymptotic decay of the electron potential is exponential, whereas it should decay as 1/r, where r is the distance to the nucleus. Thus, molecules are less well described in the LDA than are solids. In Hartree-Fock theory, the opposite is the case. The self-interaction terms cancel exactly, but the operator 1/|r − r′| entering into the exchange should in effect be screened out. Thus, Hartree-Fock theory does a reasonable job in small molecules, where the screening is less important, while for solids it fails rather badly. Consequently, the LDA generates much better total energies in solids than Hartree-Fock theory. Indeed, on the whole, the LDA predicts, with rather good accuracy, ground-state properties such as crystal structures and phonon frequencies in itinerant materials and even in many correlated materials.

Gradient Corrections

Gradient corrections extend slightly the ansatz of the local density approximation. The idea is to assume that E^{xc} is a functional not only of the density, but of the density and its gradient. It turns out that the leading correction term can be obtained exactly in the limit of a small, slowly varying density, but it is divergent.
To render the approach practicable, a wave-vector analysis is carried out and the divergent, low wave-vector part of the functional is cut off; the resulting functionals are called "generalized gradient approximations" (GGAs). Calculations using gradient corrections have produced mixed results. It was hoped that, since the LDA does quite well in predicting many ground-state properties, gradient corrections would introduce the
small corrections needed, particularly in systems in which the density is slowly varying. On the whole, the GGA tends to improve some properties, though not consistently so. This is probably not surprising, since the main ingredients missing from the LDA (e.g., exact cancellation of the self-interaction, and nonlocal potentials) are also missing from gradient-corrected functionals. One of the first approximations was that of Langreth and Mehl (1981); many of the results in the next section were produced with their functional. Some newer functionals, most notably the so-called "PBE" functional (named after Perdew, Burke, and Ernzerhof; Perdew, 1997), improve results for some properties of solids while worsening others. One recent calculation of the heat of formation for three molecular reactions offers a detailed comparison of the HF, LDA, and GGA to (nearly) exact quantum Monte Carlo results. As shown in Table 1, all of the different mean-field approaches have approximately similar accuracy in these small molecules. Excited-state properties, such as the energy bands in itinerant or correlated systems, are generally not improved at all by gradient corrections. Again, this is to be expected, since the gradient corrections do not redress the essential ingredients missing from the LDA, namely, the cancellation of the self-interaction and a proper treatment of the nonlocal exchange.

LDA Structural Properties

Figure 2 compares predicted atomic volumes for the elemental transition metals and some sp-bonded semiconductors to the corresponding experimental values. The errors shown are typical for the LDA: it underestimates the volume by 0% to 5% for sp-bonded systems, by 0% to 10% for d-bonded systems (with the worst agreement in the 3d series), and by somewhat more for f-shell metals (not shown). The error also tends to be rather severe for the extremely soft, weakly bound alkali metals. The crystal structure of Se and Te poses a more difficult test for the LDA.
These elements form an open, low-symmetry crystal with 90° bond angles. The electronic structure is approximately described by pure atomic p orbitals linked together in one-dimensional chains, with a weak interaction between the chains. The weak
SUMMARY OF ELECTRONIC STRUCTURE METHODS
Figure 2. Unit cell volume for the elemental transition metals (left) and semiconductors (right). Left: triangles, squares, and pentagons refer to 3d, 4d, and 5d metals, respectively. Right: squares, pentagons, and hexagons refer to group IV, III-V, and II-VI compounds. Upper panel: volume per unit cell; middle panel: relative error predicted by the LDA; lower panel: relative error predicted by the LDA + GGA of Langreth and Mehl (1981), except for light symbols, which are errors in the PBE functional (Perdew et al., 1996, 1997).
inter-chain interaction, combined with the low symmetry and open structure, makes this a difficult test for the local-density approximation. The crystal structure of Se and Te is hexagonal with three atoms per unit cell, and may be specified by the a and c parameters of the hexagonal cell and one internal displacement parameter, u. Table 2 shows that the LDA predicts the strong intra-chain bond length rather well, but reproduces the inter-chain bond length rather poorly. One of the largest effects of gradient-corrected functionals is to increase systematically, and on the average improve, the equilibrium bond lengths (Fig. 2). The GGA of Langreth and Mehl (1981) significantly improves on the transition metal lattice constants; it similarly significantly improves on the predicted inter-chain bond length in Se (Table 2). In the case of the semiconductors, there is a tendency to overcorrect for the heavier elements. The newer GGA of Perdew et al. (1996, 1997) rather badly overestimates lattice constants in the heavy semiconductors.

Table 2. Crystal Structure of Se, Comparing the LDA to GGA Results (Perdew, 1991), as Taken from Dal Corso and Resta (1994)^a

        LDA     GGA     Expt
a       7.45    8.29    8.23
c       9.68    9.78    9.37
u       0.256   0.224   0.228
d1      4.61    4.57    4.51
d2      5.84    6.60    6.45

^a Lattice parameters a and c are in atomic units (i.e., units of the Bohr radius a0), as are the intra-chain bond length d1 and the inter-chain bond length d2. The parameter u is an internal displacement parameter as described in Dal Corso and Resta (1994).

LDA Heats of Formation and Cohesive Energies

One of the largest systematic errors in the LDA is in the cohesive energy, i.e., the energy of formation of the crystal from the separated elements. Unlike Hartree-Fock theory, the LDA functional has no variational principle that guarantees its ground-state energy is less than the true one; the LDA usually overestimates binding energies. As expected, and as Figure 3 illustrates, the errors tend to be greater for transition metals than for sp-bonded compounds. Much of the error in the transition metals can be traced to errors in the spin multiplet structure of the atom (Jones and Gunnarsson, 1989); hence the sudden change in the average overbinding for elements to the left of Cr and to the right of Mn. For reasons explained above, errors in the heats of formation between molecules and solid phases, or between different solid phases (Pettifor and Varma, 1979), tend to be much smaller than those in the cohesive energies. This is especially true when the formation involves atoms arranged on similar lattices. Figure 4 shows the errors typically encountered in solid-solid reactions between a
COMPUTATION AND THEORETICAL METHODS
Figure 3. Heats of formation for elemental transition metals (left) and semiconductors (right). Left: triangles, squares, and pentagons refer to 3d, 4d, and 5d metals, respectively. Right: squares, pentagons, and hexagons refer to group IV, III-V, and II-VI compounds. Upper panel: heat of formation per atom (Ry); middle panel: error predicted by the LDA; lower panel: error predicted by the LDA + GGA of Langreth and Mehl (1981).
Figure 4. Cohesive energies (top), and heats of formation (bottom) of compounds from the elemental solids. The MoSi data are taken from McMahan et al. (1994). The other data were calculated by Berding and van Schilfgaarde using the FP-LMTO method (unpub. observ.).
wide range of dissimilar phases. Al is face-centered cubic (fcc), P and Si are open structures, and the other elements form a range of structures intermediate in their packing densities. This figure also encapsulates the relative merits of the LDA and GGA as predictors of binding energies. The GGA generally predicts the cohesive energies significantly better than the LDA, because the cohesive energies involve free atoms; but when solid-solid reactions are considered, the improvement disappears. Compounds of Mo and Si make an interesting test case for the GGA (McMahan et al., 1994). The LDA tends to overbind, but the GGA of Langreth and Mehl (1981) actually fares considerably worse, because its overbinding is less systematic, leading to a prediction of the wrong ground state for some parts of the phase diagram. When recalculated using Perdew's PBE functional, the difficulty disappears (J. Klepis, pers. comm.). The uncertainties are further reduced when reactions involve atoms rearranged on similar or identical crystal structures. One testimony to this is the calculation of structural energy differences of the elemental transition metals in different crystal structures. Figure 5 compares the local-density hexagonal close packed-body centered cubic (hcp-bcc) and fcc-bcc energy differences in the 3d transition metals, calculated nonmagnetically (Paxton et al., 1990; Skriver, 1985; Hirai, 1997). As the figure shows, there is a trend to stabilize bcc for elements with the d bands less than half-full, and to stabilize a close-packed structure for the late transition metals. This trend can be attributed to the two-peaked structure in the bcc d contribution to the density of states, which gains energy
Figure 5. Hexagonal close packed-face centered cubic (hcp-fcc; circles) and body centered cubic-face centered cubic (bcc-fcc; squares) structural energy differences, in meV, for the 3d transition metals, as calculated in the LDA using the full-potential LMTO method. Original calculations are nonmagnetic (Paxton et al., 1990; Skriver, 1985); white circles are recalculations by the present author, with spin polarization included.
when the lower (bonding) portion is filled and the upper (antibonding) portion is empty. Except for Fe (and with the mild exception of Mn, which has a complex structure with noncollinear magnetic moments and was not considered here), each structure is correctly predicted, including resolution of the experimentally observed sign of the hcp-fcc energy difference. Even when calculated magnetically, Fe is incorrectly predicted to be fcc. The GGAs of Langreth and Mehl (1981) and of Perdew and Wang (Perdew, 1991) rectify this error (Bagno et al., 1989), possibly because the bcc magnetic moment, and thus the magnetic exchange energy, is overestimated by those functionals. In an early calculation of structural energy differences (Pettifor, 1970), Pettifor compared his results to differences inferred by Kaufman, who made "judicious use of thermodynamic data and observations of phase equilibria in binary systems." Pettifor found that his calculated differences were two to three times larger than what Kaufman inferred (Paxton's newer calculations produce still larger discrepancies). There is no easy way to determine which is more correct. Figure 6 shows some calculated heats of formation for the Ti-Al alloy. From the point of view of the electronic structure, the alloy potential may be thought of as a rather weak perturbation on the crystalline one, namely, a permutation of nuclear charges into different arrangements on the same lattice (plus some small distortions about the ideal lattice positions). The deviation from the regular solution model is properly reproduced by the LDA, but there is a tendency to overbind, which leads to an overestimate of the critical temperatures in the alloy phase diagram (Asta et al., 1992).

LDA Elastic Constants

Because of the strong volume dependence of the elastic constants, the accuracy with which the LDA predicts them depends on whether they are evaluated at the observed volume or at the LDA volume.
Figure 7 shows both for the elemental transition metals and some sp-bonded compounds. Overall, the GGA of Langreth and Mehl (1981) improves on the LDA; how much improvement depends on which lattice constant one takes. The accuracy of other
Figure 6. Heat of formation of compounds of Ti and Al from the fcc elemental states. Circles and hexagons are experimental data, taken from Kubaschewski and Dench (1955) and Kubaschewski and Heymer (1960). Light squares are heats of formation of compounds from the fcc elemental solids, as calculated from the LDA. Dark squares are the minimum-energy structures and correspond to experimentally observed phases. The dashed line is the estimated heat of formation of a random alloy. Calculated values are taken from Asta et al. (1992).
elastic constants and phonon frequencies is similar (Baroni et al., 1987; Savrasov et al., 1994); typically they are predicted to within 20% for d-shell metals and somewhat better than that for sp-bonded compounds. See Figure 8 for a comparison of c44, or its hexagonal analog.

LDA Magnetic Properties

Magnetic moments in the itinerant magnets (e.g., the 3d transition metals) are generally well predicted by the LDA. The upper right panel of Figure 9 compares the LDA moments to experiment, both at the LDA minimum-energy volume and at the observed volume. For magnetic properties, it is most sensible to fix the volume to experiment, since for the magnetic structure the nuclei may be viewed as an external potential. The classical GGA functionals of Langreth and Mehl (1981) and of Perdew and Wang (Perdew, 1991) tend to overestimate the moments and worsen agreement with experiment. This is less the case with the recent PBE functional, however, as Figure 9 shows. Cr is an interesting case because it is antiferromagnetic along the [001] direction, with a spin-density wave, as Figure 9 shows. The spin-density wave originates as a consequence of a nesting vector in the Cr Fermi surface (also shown in the figure), which is incommensurate with the lattice. The half-period is approximately the reciprocal of the difference between the length of the nesting vector in the figure and the half-width of the Brillouin zone. It is experimentally 21.2 monolayers (ML) (Fawcett, 1988), corresponding to a nesting vector q = 1.047. Recently, Hirai (1997) calculated the
Figure 7. Bulk modulus for the elemental transition metals (left) and semiconductors (right). Left: triangles, squares, and pentagons refer to 3d, 4d, and 5d metals, respectively. Right: squares, pentagons, and hexagons refer to group IV, III-V, and II-VI compounds. Top panel: bulk modulus; second panel from top: relative error predicted by the LDA at the observed volume; third panel from top: same, but for the LDA + GGA of Langreth and Mehl (1981), except for light symbols, which are errors in the PBE functional (Perdew et al., 1996, 1997); fourth and fifth panels from top: same as the second and third panels, but evaluated at the minimum-energy volume.
Figure 8. Elastic constant R for the elemental transition metals (left) and semiconductors (right), and the experimental atomic volume. For cubic structures, R = c44. For hexagonal structures, R = (c11 + 2c33 + c12 − 4c13)/6, which is analogous to c44. Left: triangles, squares, and pentagons refer to 3d, 4d, and 5d metals, respectively. Right: squares, pentagons, and hexagons refer to group IV, III-V, and II-VI compounds. Upper panel: volume per unit cell; middle panel: relative error predicted by the LDA; lower panel: relative error predicted by the LDA + GGA of Langreth and Mehl (1981), except for light symbols, which are errors in the PBE functional (Perdew et al., 1996, 1997).
Figure 9. Upper left: LDA Fermi surface of nonmagnetic Cr. Arrows mark the nesting vectors connecting large, nearly parallel sheets in the Brillouin zone. Upper right: magnetic moments of the 3d transition metals, in Bohr magnetons, calculated in the LDA at the LDA volume, at the observed volume, and using the PBE GGA (Perdew et al., 1996, 1997) at the observed volume. The Cr data are taken from Hirai (1997). Lower left: the magnetic moments in successive atomic layers along [001] in Cr, showing the antiferromagnetic spin-density wave. The observed period is 21.2 lattice spacings. Lower right: spin-wave spectrum in Fe, in meV, calculated in the LDA (Antropov et al., unpub. observ.) for different band fillings, as discussed in the text.
period in Cr by constructing long supercells and evaluating the total energy, using a layer KKR technique, as a function of the cell dimensions. The calculated moment amplitude (Fig. 9) and period were both in good agreement with experiment; Hirai's calculated period was 20.8 ML, in perhaps fortuitously good agreement. This offers an especially rigorous test of the LDA, because small inaccuracies in the Fermi surface are greatly magnified as errors in the period. Finally, Figure 9 shows the spin-wave spectrum in Fe as calculated in the LDA using the atomic spheres approximation and a Green's function technique, plotted along high-symmetry lines in the Brillouin zone. The spin stiffness D is the curvature of $\omega$ at $\Gamma$, and is calculated to be 330 meV-Å², in good agreement with the measured 280-310 meV-Å². The four lines also show how the spectrum would change with different band fillings (as defined in the legend); this is a "rigid band" approximation to alloying of Fe with Mn ($\Delta E_F < 0$) or Co ($\Delta E_F > 0$). It is seen that $\omega$ is positive everywhere for the normal Fe case (black line), and this represents a triumph for the LDA, since it demonstrates that the global ground state of bcc Fe is the ferromagnetic one. $\omega$ remains positive when the Fermi level is shifted in the "Co alloy" direction, as is observed experimentally. However, changing the filling by only 0.2 eV ("Mn alloy") is sufficient to produce an instability at H, driving the system to an antiferromagnetic structure in the [001] direction, as is experimentally observed.

Optical Properties

In the LDA, in contradistinction to Hartree-Fock theory, there is no formal justification for associating the eigenvalues $\epsilon$ of Equation 5 with energy bands. However, because the LDA is related to Hartree-Fock theory, it is reasonable to expect that the LDA eigenvalues bear a close resemblance to energy bands, and they are widely interpreted that way. There have been a few "proper" local-density calculations of energy gaps, calculated as the total energy difference of a neutral and a singly charged molecule; see, for example, Cappellini et al. (1997) for such a calculation in C60 and Na4. The LDA systematically underestimates bandgaps by 1 to 2 eV in the itinerant semiconductors; the situation dramatically worsens in more correlated materials, notably the f-shell metals and some of the late-transition-metal oxides. In Hartree-Fock theory, the nonlocal exchange potential is too large, because the theory neglects the ability of the host to screen out the bare Coulomb interaction $1/|\mathbf{r}-\mathbf{r}'|$. In the LDA, the nonlocal character of the interaction is simply missing. In semiconductors, the long-ranged part of this interaction should be present but screened by the dielectric constant $\epsilon_\infty$. Since $\epsilon_\infty \gg 1$, the LDA does better by ignoring the nonlocal interaction altogether than Hartree-Fock theory does by putting it in unscreened. Harrison's model of the gap underestimate provides a clear physical picture of the missing ingredient in the LDA and a semiquantitative estimate of the correction (Harrison, 1985). The LDA uses a fixed one-electron potential for all the energy bands; that is, the effective one-electron potential is unchanged for an electron excited across the gap. Thus, it neglects the electrostatic energy cost associated with the separation of electron and hole in such an excitation. Harrison modeled this by noting a Coulombic repulsion U between the local excess charge and the excited electron. An estimate of this
Coulombic repulsion U can be made from the difference between the ionization potential and the electron affinity of the free atom (Harrison, 1985); it is ~10 eV. U is screened by the surrounding medium, so that an estimate for the additional energy cost, and therefore a rigid shift for the entire conduction band, including a correction to the bandgap, is $U/\epsilon_\infty$. For a dielectric constant of 10, one obtains a constant shift to the LDA conduction bands of ~1 eV, with the correction larger for wider-gap, smaller-$\epsilon_\infty$ materials.
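Harrison's estimate can be turned into a one-line calculation. The sketch below is purely illustrative: it uses the rough U ~ 10 eV quoted above, and the sample dielectric constants are hypothetical values chosen for the example, not data from the text.

```python
# Harrison's semiquantitative gap correction: a rigid upward shift of the
# LDA conduction bands of roughly U / eps_inf, where U ~ 10 eV is the
# atomic Coulomb repulsion (ionization potential minus electron affinity)
# and eps_inf is the electronic dielectric constant of the host.

def gap_correction(U_eV=10.0, eps_inf=10.0):
    """Estimated rigid shift (eV) to add to the LDA conduction bands."""
    return U_eV / eps_inf

# For eps_inf = 10 the shift is ~1 eV, of the same order as the typical
# 1-2 eV LDA gap underestimate; smaller eps_inf gives a larger correction.
for eps in (12.0, 10.0, 6.0):   # illustrative dielectric constants
    print(f"eps_inf = {eps:5.1f} -> shift ~ {gap_correction(10.0, eps):.2f} eV")
```

The trend matches the closing remark of the paragraph: wider-gap materials, which have smaller dielectric constants, receive a larger correction.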
DIELECTRIC SCREENING, THE RANDOM-PHASE, GW, AND SX APPROXIMATIONS

The way in which screening affects the Coulomb interaction in the Fock exchange operator is similar to the screening of an external test charge. Let us then consider a simple model of static screening of a test charge in the random-phase approximation. Consider a lattice of points (spheres), with the electron density in equilibrium. We wish to calculate the screening response, i.e., the electron charge $\delta q_j$ at site j induced by the addition of a small external potential $\delta V_i^0$ at site i. Suppose the screening charge did not interact with itself; let us call this the noninteracting screening charge $\delta q_j^0$. This quantity is related to $\delta V_j^0$ by the noninteracting response function $P_{kj}^0$:

$$\delta q_k^0 = \sum_j P_{kj}^0\, \delta V_j^0 \qquad (9)$$
$P_{kj}^0$ can be calculated directly in first-order perturbation theory from the eigenvectors of the one-electron Schrödinger equation (see Equation 5 under The Local Density Approximation), or directly from the induced change in the Green's function $G^0$ calculated from the one-electron Hamiltonian. By linearizing the Dyson equation, one obtains an explicit representation of $P_{kj}^0$ in terms of $G^0$:

$$\delta G = G^0\, \delta V^0\, G \approx G^0\, \delta V^0\, G^0 \qquad (10)$$
$$\delta q_k^0 = -\frac{1}{\pi}\, \mathrm{Im} \int_{-\infty}^{E_F} dz\, \delta G_{kk} \qquad (11)$$
$$\phantom{\delta q_k^0} = -\frac{1}{\pi}\, \mathrm{Im} \int_{-\infty}^{E_F} dz \sum_j G_{kj}^0\, G_{jk}^0\, \delta V_j^0 \qquad (12)$$
$$\phantom{\delta q_k^0} = \sum_j P_{kj}^0\, \delta V_j^0 \qquad (13)$$
It is straightforward to see how the full screening proceeds in the random-phase approximation (RPA). The RPA assumes that the screening charge does not induce further correlations; that is, the potential induced by the screening charge is simply the classical electrostatic potential corresponding to that charge. Thus, $\delta q_j^0$ induces a new electrostatic potential $\delta V_i^1$. In the discrete lattice model we consider here, the electrostatic potential is linearly related to a collection of charges by some matrix M, i.e.,

$$\delta V_i^1 = \sum_k M_{ik}\, \delta q_k^0 \qquad (14)$$
If the $q_k$ correspond to spherical charges on a discrete lattice, $M_{ij}$ is $e^2/|\mathbf{r}_i - \mathbf{r}_j|$ (omitting the on-site term); given periodic boundary conditions, M is the Madelung matrix (Slater, 1967). Equation 14 can be Fourier transformed, and that is typically done when the $q_k$ are occupations of plane waves; in that case $M_{jk} = 4\pi e^2 V^{-1} k^{-2}\, \delta_{jk}$. Now $\delta V^1$ induces a corresponding additional screening charge $\delta q_j^1$, which induces another screening charge $\delta q_j^2$, and so on. The total perturbing potential is the sum of the external potential and the screening potentials, and the total screening charge is the sum of the $\delta q^n$. Carrying out the sums, one arrives at the screened charge, the screened potential, and an explicit representation for the dielectric constant $\epsilon$:

$$\delta q = \sum_n \delta q^n = (1 - MP^0)^{-1} P^0\, \delta V^0 \qquad (15)$$
$$\delta V = \delta V^0 + \delta V^{scr} = \sum_n \delta V^n \qquad (16)$$
$$\phantom{\delta V} = (1 - MP^0)^{-1}\, \delta V^0 \qquad (17)$$
$$\phantom{\delta V} = \epsilon^{-1}\, \delta V^0 \qquad (18)$$
In practical implementations for crystals with periodic boundary conditions, $\epsilon$ is computed in reciprocal space. The formulas above assume static screening, but the generalization is obvious if the screening is dynamic; that is, $P^0$ and $\epsilon$ become functions of energy. The screened Coulomb interaction W is obtained just as in the screening of an external test charge. In the lattice model, the continuous variable $\mathbf{r}$ is replaced with the matrix $M_{ij}$ connecting discrete lattice points; it is the Madelung matrix for a lattice with periodic boundary conditions. Then:

$$W_{ij}(E) = [\epsilon^{-1}(E)\, M]_{ij} \qquad (19)$$
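The iterated screening series and its closed form can be checked numerically on a toy lattice. In the sketch below, the matrices standing in for $P^0$ and M are arbitrary illustrative numbers (small enough that the series converges), not a real response function or Madelung matrix; the point is only that the summed series of induced potentials reproduces $\delta V = (1 - MP^0)^{-1}\,\delta V^0$ and that $W = \epsilon^{-1} M$.

```python
import numpy as np

# Toy RPA screening on a lattice of N sites. All matrices are illustrative
# stand-ins for the quantities in the text, chosen so the series converges.
rng = np.random.default_rng(0)
N = 4
M = rng.uniform(0.0, 0.2, (N, N))        # "Coulomb" matrix (illustrative)
M = 0.5 * (M + M.T)                      # symmetric, like e^2/|ri - rj|
P0 = -0.5 * np.eye(N)                    # noninteracting response (illustrative)
dV0 = rng.uniform(-1.0, 1.0, N)          # external perturbing potential

# Closed form: eps = 1 - M P0, dV = eps^-1 dV0
eps = np.eye(N) - M @ P0
dV_closed = np.linalg.solve(eps, dV0)

# Iterated screening: each induced charge makes a new potential dV^{n+1},
# and the total potential is the sum of the series.
dV_sum, term = np.zeros(N), dV0.copy()
for _ in range(200):
    dV_sum += term
    term = M @ (P0 @ term)

print(np.allclose(dV_sum, dV_closed))    # geometric series matches closed form

# Screened Coulomb interaction: W = eps^-1 M
W = np.linalg.solve(eps, M)
```

This mirrors the derivation term by term: the loop builds $\sum_n \delta V^n$, while `np.linalg.solve(eps, dV0)` evaluates $(1 - MP^0)^{-1}\,\delta V^0$ directly.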
The GW Approximation

Formally, the GW approximation is the first term in the series expansion of the self-energy in the screened Coulomb interaction W. However, the series is not necessarily convergent, and in any case such a viewpoint offers little insight. It is more useful to think of the GW approximation as a generalization of Hartree-Fock theory, with an energy-dependent, nonlocal screened interaction W replacing the bare Coulomb interaction M entering into the exchange (see Equation 3). The one-electron equation may be written generally in terms of the self-energy $\Sigma$:

$$\left[-\frac{\hbar^2 \nabla^2}{2m} + v^{ext}(\mathbf{r}) + v^{H}(\mathbf{r})\right] \psi_i(\mathbf{r}) + \int d^3r'\, \Sigma(\mathbf{r},\mathbf{r}';E_i)\, \psi_i(\mathbf{r}') = E_i\, \psi_i(\mathbf{r}) \qquad (20)$$
In the GW approximation,

$$\Sigma^{GW}(\mathbf{r},\mathbf{r}';E) = \frac{i}{2\pi} \int_{-\infty}^{\infty} d\omega\, e^{+i0^+\omega}\, G(\mathbf{r},\mathbf{r}';E+\omega)\, W(\mathbf{r},\mathbf{r}';\omega) \qquad (21)$$
The connection between GW theory (Equations 20 and 21) and HF theory (Equations 1 and 3) is obvious once we identify $v^x$ with the self-energy $\Sigma$. This we can do by expressing the density matrix in terms of the Green's function:

$$\sum_j \psi_j^*(\mathbf{r}')\, \psi_j(\mathbf{r}) = -\frac{1}{\pi} \int_{-\infty}^{E_F} d\omega\, \mathrm{Im}\, G(\mathbf{r}',\mathbf{r};\omega) \qquad (22)$$

$$\Sigma^{HF}(\mathbf{r},\mathbf{r}';E) = v^x(\mathbf{r},\mathbf{r}') = -\frac{1}{\pi} \int_{-\infty}^{E_F} d\omega\, [\mathrm{Im}\, G(\mathbf{r},\mathbf{r}';\omega)]\, M(\mathbf{r},\mathbf{r}') \qquad (23)$$
Comparison of Equations 21 and 23 shows immediately that the GW approximation is a Hartree-Fock-like theory, but with the bare Coulomb interaction replaced by an energy-dependent screened interaction W. Note also that in HF theory, $\Sigma$ is calculated from occupied states only, while in GW theory the quasiparticle spectrum requires a summation over unoccupied states as well. GW calculations proceed essentially along these lines; in practice G, $\epsilon^{-1}$, W, and $\Sigma$ are generated in Fourier space. The LDA is used to create the starting wave functions that generate them; however, once they are made, the LDA does not enter into the Hamiltonian. Usually in semiconductors $\epsilon^{-1}$ is calculated only for $\omega = 0$, and the $\omega$ dependence is taken from a plasmon-pole approximation. The latter is not adequate for metals (Quong and Eguiluz, 1993). The GW approximation has been used with excellent results in the calculation of optical excitations, such as
the calculation of energy gaps. It is difficult to say at this time precisely how accurate the GW approximation is in semiconductors, because only recently has a proper treatment of the semicore states been formulated (Shirley et al., 1997). Table 3 compares some excitation energies of a few semiconductors to experiment and to the LDA. Because of the numerical difficulty of working with products of four wave functions, nearly all GW calculations are carried out using plane waves. There has been, however, an all-electron GW method developed in the spirit of the augmented wave (Aryasetiawan and Gunnarsson, 1994). This implementation permits GW calculations of narrow-band systems. One early application to Ni showed that GW narrowed the valence d band by ~1 eV relative to the LDA, in agreement with experiment. The GW approximation is structurally relatively simple; as mentioned above, it assumes a generalized HF form. It does not possess higher-order (most notably, vertex) corrections. These are needed, for example, to reproduce the multiple plasmon satellites in the photoemission of the alkali metals. Recently, Aryasetiawan and coworkers introduced a beyond-GW "cumulant expansion" (Aryasetiawan et al., 1996), and very recently an ab initio T-matrix technique (Springer et al., 1998), which they needed to account for the spectra in Ni. GW calculations to date usually use the LDA to generate G, W, etc. The procedure can be made self-consistent, i.e., G and W can be remade with the GW self-energy; in fact, this was essential in the highly correlated case of NiO (Aryasetiawan and Gunnarsson, 1995). Recently, Holm and von Barth (1998) investigated properties of the homogeneous electron gas with a G and W calculated self-consistently, i.e., from a GW potential. Remarkably, they found that self-consistency worsened the optical properties with respect to experiment, though the total energy did improve.

Table 3. Energy Bandgaps (eV) in the LDA, the GW Approximation with the Core Treated in the LDA, and the GW Approximation for Both Valence and Core^a

               Expt      LDA    GW + LDA Core  GW + QP Core    SX    SX + P0(SX)
Si
 Γ8v→Γ6c       3.45      2.55       3.31          3.28        3.59      3.82
 Γ8v→X         1.32      0.65       1.44          1.31        1.34      1.54
 Γ8v→L         2.1, 2.4  1.43       2.33          2.11        2.25      2.36
 Eg            1.17      0.52       1.26          1.13        1.25      1.45
Ge
 Γ8v→Γ6c       0.89      0.26       0.53          0.85        0.68      0.73
 Γ8v→X         1.10      0.55       1.28          1.09        1.19      1.21
 Γ8v→L         0.74      0.05       0.70          0.73        0.77      0.83
GaAs
 Γ8v→Γ6c       1.52      0.13       1.02          1.42        1.22      1.39
 Γ8v→X         2.01      1.21       2.07          1.95        2.08      2.21
 Γ8v→L         1.84      0.70       1.56          1.75        1.74      1.90
AlAs
 Γ8v→Γ6c       3.13      1.76       2.74          2.93        2.82      3.03
 Γ8v→X         2.24      1.22       2.09          2.03        2.15      2.32
 Γ8v→L                   1.91       2.80          2.91        2.99      3.14

^a After Shirley et al. (1997). SX calculations are by the present author, using either the LDA G and W, or recalculating G and W with the LDA + SX potential.

Comparison of the data in Table 3 shows that the self-consistency procedure overcorrects the gap widening in the semiconductors as well. It may be possible in principle to calculate ground-state properties in the GW approximation, but this is extremely difficult in practice, and there has been no successful attempt to date for real materials. Thus, the LDA remains the "industry standard" for total-energy calculations.

The SX Approximation
Because the calculations are computationally very heavy and thus unsuited to complex systems, there have been several attempts to introduce approximations to GW theory. Very recently, Rücker (unpub. observ.) introduced a generalization of the LDA functional to account for excitations (van Schilfgaarde et al., 1997). His approach, which he calls "screened exchange" (SX) theory, differs from the usual GW approach in that the latter does not use the LDA at all except to generate the trial wave functions needed to make quantities such as G, $\epsilon^{-1}$, and W. His scheme was implemented in the LMTO-atomic spheres approximation (LMTO-ASA; see the Appendix), and promises to be extremely efficient for the calculation of excited-state properties, with an accuracy approaching
that of GW theory. The principal idea is to calculate the difference between the screened exchange and the contribution to the screened exchange from the local part of the response function. The difference in $\Sigma$ may be similarly calculated:

$$\delta W = W[P^0] - W[P^{0,LDA}] \qquad (24)$$
$$\delta\Sigma = G\, \delta W \qquad (25)$$
$$\delta v^{SX} = \delta\Sigma \qquad (26)$$

The energy bands are generated as in GW theory, except that the (small) correction $\delta\Sigma$ is added to the local $v^{XC}$, instead of being substituted for it. The analog with GW theory is that

$$\Sigma(\mathbf{r},\mathbf{r}';E) = v^{SX}(\mathbf{r},\mathbf{r}') + \delta(\mathbf{r}-\mathbf{r}')\,[v^{xc}(\mathbf{r}) - v^{sx,DFT}(\mathbf{r})] \qquad (27)$$
Although it is not essential to the theory, Rücker's implementation uses only the static response function, so that the one-electron equations have the Hartree-Fock form (see Equation 1). The theory is formulated in terms of a generalization of the LDA functional, so that the N-particle LDA ground state is exactly reproduced, and the (N + 1)-particle ground state is generated with a corresponding accuracy, provided the interaction of the additional electron with the N-particle ground state is correctly depicted by $v^{XC} + \delta\Sigma$. In some sense, Rücker's approach is a formal and more rigorous embodiment of Harrison's model. Some results using this theory are shown in Table 3 along with Shirley's results. The closest points of comparison are the calculations marked "GW + LDA core" and "SX"; both use the LDA to generate the GW self-energy $\Sigma$. Also shown is the result of a partially self-consistent calculation, in which G and W were remade using the LDA + SX potential. It is seen that self-consistency widens the gaps, as found by Holm and von Barth (1998) for jellium.
the components of an electron-hole pair are infinitely separated. A motivation for the LDA + U functional can already be seen in the humble H atom. The total energy of the H$^+$ ion is 0, the energy of the H atom is $-1$ Ry, and the H$^-$ ion is barely bound, so its total energy is also approximately $-1$ Ry. Let us assume the LDA correctly predicts these total energies (as it does in practice; the LDA binding of H is 0.97 Ry). In the LDA, the number of electrons, N, is a continuous variable, and the one-electron term value is, by definition, $\epsilon \simeq dE/dN$. Drawing a parabola through these three energies, it is evident that $\epsilon \approx -0.5$ Ry in the LDA. Interpreting $\epsilon$ as the energy needed to ionize H (this is the atomic analog of using energy bands in a solid for the excitation spectrum), one obtains a factor-of-two error. On the other hand, using the LDA total energy difference $E(1) - E(0) \approx -0.97$ Ry predicts the ionization of the H atom rather well. Essentially the same point was made in the discussion of Optical Properties, above. In the solid, this difficulty persists, but the error is much reduced, because the other electrons present screen out much of the effect, and the error depends on the context. In semiconductors, the bandgap is underestimated because the LDA misses the Coulomb repulsion associated with separating an electron-hole pair (this is almost exactly analogous to the ionization of H).

Furthermore, the gas is treated as an ideal gas and the flow is assumed to be laminar, the Reynolds number being well below values at which turbulence might be expected. In CVD we have to deal with multicomponent gas mixtures. The composition of an N-component gas mixture can be described in terms of the dimensionless mass fractions $\omega_i$ of its constituents, which sum to unity:

$$\sum_{i=1}^{N} \omega_i = 1 \qquad (15)$$
SIMULATION OF CHEMICAL VAPOR DEPOSITION PROCESSES
Their diffusive fluxes can be expressed as mass fluxes $\vec{j}_i$ with respect to the mass-averaged velocity $\vec{v}$:

$$\vec{j}_i = \rho\,\omega_i(\vec{v}_i - \vec{v}) \tag{16}$$
The transport of momentum, heat, and chemical species is described by a set of coupled partial differential equations (Bird et al., 1960; Kleijn and Werner, 1993; Kleijn, 1995; Kleijn and Kuijlaars, 1995). The conservation of mass is given by the continuity equation:

$$\frac{\partial\rho}{\partial t} = -\nabla\cdot(\rho\vec{v}) \tag{17}$$
where ρ is the gas density and t the time. The conservation of momentum is given for Newtonian fluids by:

$$\frac{\partial(\rho\vec{v})}{\partial t} = -\nabla\cdot(\rho\vec{v}\vec{v}) + \nabla\cdot\left\{\mu\left[\nabla\vec{v} + (\nabla\vec{v})^{T}\right] - \tfrac{2}{3}\mu(\nabla\cdot\vec{v})\,\mathbf{I}\right\} - \nabla p + \rho\vec{g} \tag{18}$$
where μ is the viscosity, I the unit tensor, p the pressure, and $\vec{g}$ the gravity vector. The transport of thermal energy can be expressed in terms of temperature T. Apart from convection, conduction, and pressure terms, its transport equation comprises a term denoting the Dufour effect (transport of heat due to concentration gradients), a term representing the transport of enthalpy through diffusion of gas species, and a term representing production of thermal energy through chemical reactions, as follows:

$$c_p\frac{\partial(\rho T)}{\partial t} = -c_p\nabla\cdot(\rho\vec{v}T) + \nabla\cdot(\lambda\nabla T) + \frac{DP}{Dt} + \underbrace{\nabla\cdot\left(RT\sum_{i=1}^{N}\frac{D_i^T}{M_i}\nabla(\ln x_i)\right)}_{\text{Dufour}} - \underbrace{\sum_{i=1}^{N}\frac{H_i}{M_i}\,\nabla\cdot\vec{j}_i}_{\text{inter-diffusion}} - \underbrace{\sum_{i=1}^{N}H_i\sum_{k=1}^{K}\nu_{ik}R_k^g}_{\text{heat of reaction}} \tag{19}$$

where c_p is the specific heat, λ the thermal conductivity, P the pressure, R the universal gas constant, x_i the mole fraction of gas species i, D_i^T its thermal diffusion coefficient, H_i its enthalpy, $\vec{j}_i$ its diffusive mass flux, ν_ik its stoichiometric coefficient in reaction k, M_i its molar mass, and R_k^g the net reaction rate of gas-phase reaction k. The transport equation for the ith gas species is given by:

$$\frac{\partial(\rho\omega_i)}{\partial t} = \underbrace{-\,\nabla\cdot(\rho\vec{v}\omega_i)}_{\text{convection}}\;\underbrace{-\;\nabla\cdot\vec{j}_i}_{\text{diffusion}}\;+\;\underbrace{M_i\sum_{k=1}^{K}\nu_{ik}R_k^g}_{\text{reaction}} \tag{20}$$
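To make the structure of these conservation equations concrete, the following is a deliberately simplified sketch (not a production CVD solver): an explicit finite-volume update of the one-dimensional analog of the species equation (Equation 20) with constant density and velocity, first-order upwind convection, central diffusion, and an optional first-order depletion reaction. All parameter values are illustrative placeholders.

```python
import numpy as np

# Illustrative parameter values (placeholders, not taken from the text)
nx, L = 50, 0.1                 # number of cells, domain length (m)
dx = L / nx
rho, v, D = 1.0, 0.05, 1.0e-4   # density (kg/m^3), velocity (m/s), diffusivity (m^2/s)
k = 0.0                         # first-order reaction rate (1/s); 0 = inert species
dt = 0.2 * min(dx / v, dx**2 / (2.0 * D))   # stable explicit time step

xc = np.linspace(dx / 2, L - dx / 2, nx)    # cell centers
w0 = np.exp(-((xc - 0.03) / 0.01) ** 2)     # initial mass-fraction profile
total0 = w0.sum()

def step(w):
    """One explicit finite-volume update of Eq. 20 in 1-D (periodic domain):
    d(rho*w)/dt = -d(rho*v*w)/dx + d/dx(rho*D*dw/dx) - k*rho*w."""
    wm, wp = np.roll(w, 1), np.roll(w, -1)
    conv = -rho * v * (w - wm) / dx                 # first-order upwind, v > 0
    diff = rho * D * (wp - 2.0 * w + wm) / dx**2    # central diffusion
    reac = -k * rho * w                             # first-order consumption
    return w + dt * (conv + diff + reac) / rho

w = w0.copy()
for _ in range(100):
    w = step(w)
```

With k = 0 and a periodic domain the scheme conserves the total species mass exactly, which is a useful sanity check on any discretization of Equation 20.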
where $\vec{j}_i$ represents the diffusive mass flux of species i. In an N-component gas mixture, there are N−1 independent species equations of the type of Equation 20, since the mass fractions must sum to unity (see Equation 15).

Two phenomena of minor importance in many other processes may be specifically prominent in CVD, i.e., multicomponent effects and thermal diffusion (Soret effect). The most commonly applied theory for modeling gas diffusion is Fick's law, which, however, is valid for isothermal, binary mixtures only. In the rigorous kinetic theory of N-component gas mixtures, the following expression for the diffusive mass flux vector is found (Hirschfelder et al., 1967):

$$\frac{M}{\rho}\sum_{j=1,\,j\neq i}^{N}\frac{\omega_i\vec{j}_j - \omega_j\vec{j}_i}{M_j D_{ij}} = \nabla\omega_i + \omega_i\frac{\nabla M}{M} - \frac{M}{\rho}\sum_{j=1,\,j\neq i}^{N}\frac{\omega_i D_j^T - \omega_j D_i^T}{M_j D_{ij}}\,\nabla(\ln T) \tag{21}$$

where M is the average molar mass of the mixture, D_ij the binary diffusion coefficient of a gas pair, and D_i^T the thermal diffusion coefficient of a gas species. In general, D_i^T > 0 for large, heavy molecules (which therefore are driven toward cold zones in the reactor) and D_i^T < 0 for small, light molecules (which therefore are driven toward hot zones in the reactor); since the diffusive mass fluxes sum to zero, $\sum_{i=1}^{N} D_i^T = 0$. Equation 21 can be rewritten by separating the diffusive mass flux vector $\vec{j}_i$ into a flux driven by concentration gradients, $\vec{j}_i^C$, and a flux driven by temperature gradients, $\vec{j}_i^T$:

$$\vec{j}_i = \vec{j}_i^C + \vec{j}_i^T \tag{22}$$

with

$$\frac{M}{\rho}\sum_{j=1,\,j\neq i}^{N}\frac{\omega_i\vec{j}_j^C - \omega_j\vec{j}_i^C}{M_j D_{ij}} = \nabla\omega_i + \omega_i\frac{\nabla M}{M} \tag{23}$$

and

$$\vec{j}_i^T = -D_i^T\,\nabla(\ln T) \tag{24}$$

Equation 23 relates the N diffusive mass flux vectors $\vec{j}_i^C$ to the N mass fractions and mass fraction gradients. In many numerical schemes, however, it is desirable that the species transport equation (Equation 20) contain a gradient-driven "Fickian" diffusion term. This can be obtained by rewriting Equation 23 as:

$$\vec{j}_i^C = \underbrace{-\,\rho D_i^{SM}\nabla\omega_i}_{\text{Fick term}}\;\underbrace{-\;\rho\omega_i D_i^{SM}\frac{\nabla M}{M}}_{\text{multicomponent 1}}\;+\;\underbrace{\omega_i D_i^{SM} M\sum_{j=1,\,j\neq i}^{N}\frac{\vec{j}_j^C}{M_j D_{ij}}}_{\text{multicomponent 2}} \tag{25}$$

and defining a diffusion coefficient $D_i^{SM}$:

$$D_i^{SM} = \left(\sum_{j=1,\,j\neq i}^{N}\frac{x_j}{D_{ij}}\right)^{-1} \tag{26}$$

The divergence of the last two terms in Equation 25 is treated as a source term. Within an iterative solution scheme, the unknown diffusive fluxes $\vec{j}_j^C$ can be taken from a previous iteration.

The above transport equations are supplemented with the usual boundary conditions at the inlets and outlets and at the nonreacting walls. On reacting walls, there will be a net gaseous mass production, which leads to a velocity component normal to the wafer surface:

$$\vec{n}\cdot(\rho\vec{v}) = \sum_{i=1}^{N} M_i\sum_{l=1}^{L} s_{il}R_l^s \tag{27}$$

where $\vec{n}$ is the outward-directed unit vector normal to the surface, ρ the local density of the gas mixture, R_l^s the rate of the lth surface reaction, and s_il the stoichiometric coefficient of species i in this reaction. The net total mass flux of the ith species normal to the wafer surface equals its net mass production:

$$\vec{n}\cdot(\rho\omega_i\vec{v} + \vec{j}_i) = M_i\sum_{l=1}^{L} s_{il}R_l^s \tag{28}$$
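For an isothermal mixture, the Stefan-Maxwell relations of Equation 23, together with the constraint that the diffusive fluxes sum to zero, can also be solved directly as a linear system, and the effective diffusivity of Equation 26 evaluated alongside. The following sketch does this for a hypothetical two-component mixture (all values are placeholders, chosen so that the known binary limit can be checked):

```python
import numpy as np

# Hypothetical binary mixture (placeholder values: one light, one heavy species)
M_i = np.array([0.002, 0.028])        # molar masses (kg/mol)
w = np.array([0.1, 0.9])              # mass fractions (sum to 1)
D12 = 1.0e-4                          # binary diffusion coefficient (m^2/s)
D = np.array([[0.0, D12], [D12, 0.0]])
rho = 1.0                             # mixture density (kg/m^3)
grad_w = np.array([10.0, -10.0])      # 1-D mass-fraction gradients (1/m)

N = len(w)
M = 1.0 / np.sum(w / M_i)             # average molar mass
x = w * M / M_i                       # mole fractions
grad_M = -M**2 * np.sum(grad_w / M_i) # gradient of M, from 1/M = sum(w_i/M_i)

# Effective Stefan-Maxwell diffusion coefficients, Eq. 26
D_sm = np.array([1.0 / sum(x[j] / D[i, j] for j in range(N) if j != i)
                 for i in range(N)])

# Assemble the isothermal Stefan-Maxwell relations, Eq. 23:
# (M/rho) * sum_{j != i} (w_i j_j - w_j j_i)/(M_j D_ij) = grad_w_i + w_i grad_M/M,
# plus the constraint that the diffusive fluxes sum to zero.
A = np.zeros((N + 1, N))
b = np.zeros(N + 1)
for i in range(N):
    for j in range(N):
        if j != i:
            A[i, j] += (M / rho) * w[i] / (M_i[j] * D[i, j])
            A[i, i] -= (M / rho) * w[j] / (M_i[j] * D[i, j])
    b[i] = grad_w[i] + w[i] * grad_M / M
A[N, :] = 1.0                          # constraint row: fluxes sum to zero
j_C, *_ = np.linalg.lstsq(A, b, rcond=None)
```

For a binary mixture the result must reduce to Fick's law, $\vec{j}_1 = -\rho D_{12}\nabla\omega_1$, which provides a convenient consistency check on the signs in Equations 23 and 26.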
Kinetic Theory

The modeling of transport phenomena and chemical reactions in CVD processes requires knowledge of the thermochemical properties (specific heat, heat of formation, and entropy) and transport properties (viscosity, thermal conductivity, and diffusivities) of the gas mixture in the reactor chamber. Thermochemical properties of gases as a function of temperature can be found in various publications (Svehla, 1962; Coltrin et al., 1986, 1989; Giunta et al., 1990a,b; Arora and Pollard, 1991) and databases (Gordon and McBride, 1971; Barin and Knacke, 1977; Barin et al., 1977; Stull and Prophet, 1982; Wagman et al., 1982; Kee et al., 1990). In the absence of experimental data, thermochemical properties may be obtained from ab initio molecular structure calculations (Melius et al., 1997). Only for the most common gases can transport properties be found in the literature (Maitland and Smith, 1972; l'Air Liquide, 1976; Weast, 1984). The transport properties of less common gas species may be calculated from kinetic theory (Svehla, 1962; Hirschfelder et al., 1967; Reid et al., 1987; Kleijn and Werner, 1993; Kleijn, 1995; Kleijn and Kuijlaars, 1995). Assumptions have to be made for the form of the intermolecular potential energy function φ(r). For nonpolar molecules, the most commonly used intermolecular potential energy function is the Lennard-Jones potential:

$$\phi(r) = 4\epsilon\left[\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6}\right] \tag{29}$$

where r is the distance between the molecules, σ the collision diameter of the molecules, and ε their maximum energy of attraction. Lennard-Jones parameters for many CVD gases can be found in Svehla (1962), Coltrin et al. (1986), Coltrin et al. (1989), Arora and Pollard (1991), and Kee et al. (1991), or can be estimated from properties of the gas at the critical point or at the normal boiling point (Bird et al., 1960):

$$\frac{\epsilon}{k_B} = 0.77\,T_c \quad\text{or}\quad \frac{\epsilon}{k_B} = 1.15\,T_b \tag{30}$$

$$\sigma = 0.841\,V_c^{1/3} \quad\text{or}\quad \sigma = 2.44\left(\frac{T_c}{P_c}\right)^{1/3} \quad\text{or}\quad \sigma = 1.166\,V_{b,l}^{1/3} \tag{31}$$

where T_c and T_b are the critical temperature and normal boiling point temperature (K), P_c is the critical pressure (atm), V_c and V_{b,l} are the molar volume at the critical point and the liquid molar volume at the normal boiling point (cm³ mol⁻¹), and k_B is the Boltzmann constant; σ is then obtained in Å. For most CVD gases, only rough estimates of Lennard-Jones parameters are available. Together with the inaccuracies in the assumptions made in kinetic theory, this leads to an accuracy of predicted transport properties of typically 10% to 25%. When the transport properties of its constituent gas species are known, the properties of a gas mixture can be calculated from semiempirical mixture rules (Reid et al., 1987; Kleijn and Werner, 1993; Kleijn, 1995; Kleijn and Kuijlaars, 1995). The inaccuracy in predicted mixture properties may well be as large as 50%.
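The estimates of Equations 30 and 31 are simple enough to script. The following sketch assumes the unit conventions above (T_c in K, P_c in atm, molar volumes in cm³ mol⁻¹, σ returned in Å) and uses rounded argon critical-point data purely for illustration:

```python
def lj_from_critical(Tc, Pc=None, Vc=None):
    """Estimate Lennard-Jones parameters from critical-point data (Eqs. 30-31).
    Tc in K, Pc in atm, Vc in cm^3/mol; returns (eps/kB in K, sigma in angstrom)."""
    eps_over_kB = 0.77 * Tc
    if Vc is not None:
        sigma = 0.841 * Vc ** (1.0 / 3.0)
    elif Pc is not None:
        sigma = 2.44 * (Tc / Pc) ** (1.0 / 3.0)
    else:
        raise ValueError("need Vc or Pc")
    return eps_over_kB, sigma

def lj_from_boiling(Tb, Vb_liq):
    """Estimate Lennard-Jones parameters from normal-boiling-point data (Eqs. 30-31).
    Tb in K, Vb_liq in cm^3/mol; returns (eps/kB in K, sigma in angstrom)."""
    return 1.15 * Tb, 1.166 * Vb_liq ** (1.0 / 3.0)

# Example: argon, Tc = 150.9 K, Pc = 48.3 atm (rounded handbook values)
eps, sigma = lj_from_critical(150.9, Pc=48.3)
```

The two routes give parameter sets that typically agree only to within the 10% to 25% accuracy quoted above, which is a useful reminder that these are order-of-magnitude correlations rather than precise data.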
Radiation

CVD reactor walls, windows, and substrates adopt a certain temperature profile as a result of their conjugate heat exchange. These temperature profiles may have a large influence on the deposition process. This is even more true for lamp-heated reactors, such as rapid thermal CVD (RTCVD) reactors, in which the energy bookkeeping of the reactor system is mainly determined by radiative heat exchange. The transient temperature distribution in solid parts of the reactor is described by the Fourier equation (Bird et al., 1960):

$$\rho_s c_{p,s}\frac{\partial T_s}{\partial t} = \nabla\cdot(\lambda_s\nabla T_s) + q''' \tag{32}$$
where q''' is the heat-production rate in the solid material, e.g., due to inductive heating, and ρ_s, c_{p,s}, λ_s, and T_s are the solid density, specific heat, thermal conductivity, and temperature. The boundary conditions at the solid-gas interfaces take the form:

$$\vec{n}\cdot(\lambda_s\nabla T_s) = q''_{conv} + q''_{rad} \tag{33}$$

where q''_conv and q''_rad are the convective and radiative heat fluxes to the solid and $\vec{n}$ is the outward-directed unit vector normal to the surface. For the interface between a solid and the reactor gases (the temperature distribution of which is known) we have:

$$q''_{conv} = \vec{n}\cdot(\lambda_g\nabla T_g) \tag{34}$$

where λ_g and T_g are the thermal conductivity and temperature of the reactor gases. Usually, we do not have detailed information on the temperature distribution outside the reactor. Therefore, we have to use heat-transfer relations like:

$$q''_{conv} = \alpha_{conv}(T_s - T_{ambient}) \tag{35}$$
to model the convective heat losses to the ambient, where aconv is a heat-transfer coefficient. The most challenging part of heat-transfer modeling is the radiative heat exchange inside the reactor chamber,
which is complicated by the complex geometry, the spectral and temperature dependence of the optical properties (Ordal et al., 1985; Palik, 1985), and the occurrence of specular reflections. An extensive treatment of all these aspects of the modeling of radiative heat exchange can be found, e.g., in Siegel and Howell (1992), Kersch and Morokoff (1995), Kersch (1995a), or Kersch (1995b). An approach that can be used if the radiating surfaces are diffuse-gray (i.e., their absorptivity and emissivity are independent of direction and wavelength) is the so-called Gebhart absorption-factor method (Gebhart, 1958, 1971). The reactor walls are divided into small surface elements, across which a uniform temperature is assumed. Exchange factors G_ij between pairs (i, j) of surface elements are evaluated, which are determined by geometrical line-of-sight factors and optical properties. The net radiative heat transfer to surface element j then equals:

$$q''_{rad,j} = \frac{1}{A_j}\left(\sum_i G_{ij}\,\epsilon_i\sigma_B T_i^4 A_i - \epsilon_j\sigma_B T_j^4 A_j\right) \tag{36}$$
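One standard construction of the Gebhart factors (not spelled out in the text) obtains them from the geometrical view factors F_ij and the emissivities via the linear relations G_ij = ε_j F_ij + Σ_k (1−ε_k) F_ik G_kj, after which Equation 36 gives the net fluxes. The following sketch applies this to the textbook case of two large parallel plates; the emissivities, temperatures, and areas are placeholders:

```python
import numpy as np

SIGMA_B = 5.670374419e-8  # Stefan-Boltzmann constant (W m^-2 K^-4)

def gebhart_factors(F, eps):
    """Gebhart absorption factors for a gray-diffuse enclosure.
    G[i, j] = fraction of the radiation emitted by element i that is finally
    absorbed by element j, from G_ij = eps_j*F_ij + sum_k F_ik*(1-eps_k)*G_kj."""
    n = len(eps)
    lhs = np.eye(n) - F @ np.diag(1.0 - eps)
    rhs = F @ np.diag(eps)
    return np.linalg.solve(lhs, rhs)

def net_radiative_flux(G, eps, T, A):
    """Net radiative heat flux to each surface element, Eq. 36."""
    emit = eps * SIGMA_B * T**4 * A      # total emitted power of each element
    return (G.T @ emit - emit) / A

# Two large parallel plates: F_12 = F_21 = 1 (placeholder enclosure)
F = np.array([[0.0, 1.0], [1.0, 0.0]])
eps = np.array([0.8, 0.3])
A = np.array([1.0, 1.0])
T = np.array([1000.0, 300.0])

G = gebhart_factors(F, eps)
q = net_radiative_flux(G, eps, T, A)
```

For this geometry the net flux to the cold plate can be checked against the classical two-plate result, $\sigma_B(T_1^4 - T_2^4)/(1/\epsilon_1 + 1/\epsilon_2 - 1)$, and energy conservation requires that the area-weighted net fluxes sum to zero.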
where e is the emissivity, sB the Stefan-Boltzmann constant, and A the surface area of the element. In order to incorporate further refinements, such as wavelength, temperature and directional dependence of the optical properties, and specular reflections, Monte Carlo methods (Howell, 1968) are more powerful than the Gebhart method. The emissive power is partitioned into a large number of rays of energy leaving each surface, which are traced through the reactor as they are being reflected, transmitted, or absorbed at various surfaces. By choosing the random distribution functions of the emission direction and the wavelength for each ray appropriately, the total emissive properties from the surface may be approximated. By averaging over a large number of rays, the total heat exchange fluxes may be computed (Coronell and Jensen, 1994; Kersch and Morokoff, 1995; Kersch, 1995a,b).
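The core of such a Monte Carlo scheme is sampling ray directions and wavelengths from the appropriate distributions. As a minimal illustration (not taken from the text), diffuse emission from a surface element follows Lambert's cosine law, whose polar angle can be sampled by inverting its cumulative distribution:

```python
import math
import random

def sample_diffuse_direction(rng=random):
    """Sample a ray direction for diffuse (Lambertian) emission from a surface
    whose outward normal is +z.  The polar angle has density
    p(theta) = 2*sin(theta)*cos(theta), sampled by inversion: theta = asin(sqrt(u))."""
    u = rng.random()
    theta = math.asin(math.sqrt(u))       # polar angle from the normal
    phi = 2.0 * math.pi * rng.random()    # azimuth, uniform
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

# For cosine-law emission the expected value of cos(theta) is 2/3;
# averaging over many rays should reproduce it.
random.seed(1)
n_rays = 200_000
mean_cos = sum(sample_diffuse_direction()[2] for _ in range(n_rays)) / n_rays
```

Tracing each sampled ray through the reactor geometry until it is absorbed, and tallying absorption per surface element, then yields the heat-exchange fluxes described above.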
PRACTICAL ASPECTS OF THE METHOD

In the previous section (see Principles of the Method), the seven main aspects of CVD simulation (i.e., surface chemistry, gas-phase chemistry, free molecular flow, plasma physics, hydrodynamics, kinetic theory, and thermal radiation) were discussed. An ideal CVD simulation tool should integrate models for all these aspects of CVD. Such a comprehensive tool is not available yet. However, powerful software tools for each of these aspects can be obtained commercially, and some CVD simulation software combines several of the necessary models.
Surface Chemistry

For modeling surface processes at the interface between a solid and a reactive gas, the SURFACE CHEMKIN package (available from Reaction Design; also see Coltrin et al., 1991a,b) is undoubtedly the most flexible and powerful
tool available at present. It is a suite of FORTRAN codes allowing for easily setting up surface reaction simulations. It defines a formalism for describing surface processes between various gaseous, adsorbed, and solid bulk species and performs bookkeeping on concentrations of all these species. In combination with the SURFACE PSR (Moffat et al., 1991b), SPIN (Coltrin et al., 1993a), and CRESLAF (Coltrin et al., 1993b) codes (all programs available from Reaction Design), it can be used to model the surface reactions in a perfectly stirred tank reactor, a rotating disk reactor, or a developing boundary layer flow along a reacting surface. Simple problems can be run on a personal computer in a few minutes; more complex problems may take dozens of minutes on a powerful workstation. No models or software are available for routinely predicting surface reaction kinetics. In fact, this is as yet probably the most difficult and unsolved issue in CVD modeling. Surface reaction kinetics are estimated based on bond dissociation enthalpies, transition state theory, and analogies with similar gas phase reactions; the success of this approach largely depends on the skills and expertise of the chemist performing the analysis.

Gas-Phase Chemistry

Similarly, for modeling gas-phase reactions, the CHEMKIN package (available from Reaction Design; also see Kee et al., 1989) is the de facto standard modeling tool. It is a suite of FORTRAN codes allowing for easily setting up reactive gas flow problems, which computes production/destruction rates and performs bookkeeping on concentrations of gas species. In combination with the CHEMKIN THERMODYNAMIC DATABASE (Reaction Design) it allows for the self-consistent evaluation of species thermochemical data and reverse reaction rates.
In combination with the SURFACE PSR (Moffat et al., 1991a), SPIN (Coltrin et al., 1993a), and CRESLAF (Coltrin et al., 1993b) codes (all programs available from Reaction Design) it can be used to model reactive flows in a perfectly stirred tank reactor, a rotating disk reactor, or a developing boundary layer flow, and it can be used together with SURFACE CHEMKIN (Reaction Design) to simulate problems with both gas and surface reactions. Simple problems can be run on a personal computer in a few minutes; more complex problems may take dozens of minutes on a powerful workstation. Various proprietary and shareware software programs are available for predicting gas phase rate constants by means of theoretical chemical kinetics (Hase and Bunker, 1973) and for evaluating molecular and transition state structures and electronic energies (available from Biosym Technologies and Gaussian Inc.). These programs, especially the latter, require significant computing power. Once the possible reaction paths have been identified and reaction rates have been estimated, sensitivity analysis with, e.g., the SENKIN package (available from Reaction Design, also see Lutz et al., 1993) can be used to eliminate insignificant reactions and species. As in surface chemistry, setting up a reaction model that can confidently be used in predicting gas phase chemistry is still far from trivial, and the success largely depends on the skills and expertise of the chemist performing the analysis.
Free Molecular Transport

As described in the previous section (see Principles of the Method), the "ballistic transport-reaction" model is probably the most powerful and flexible approach to modeling free molecular gas transport and chemical reactions inside very small surface structures (i.e., much smaller than the gas molecules' mean free path length). This approach has been implemented in the EVOLVE code. EVOLVE 4.1a is a low-pressure transport, deposition, and etch-process simulator developed by T. S. Cale at Arizona State University, Tempe, Ariz., and Motorola Inc., with support from the Semiconductor Research Corporation, the National Science Foundation, and Motorola, Inc. It allows for the prediction of the evolution of film profiles and composition inside small two-dimensional and three-dimensional holes of complex geometry as functions of operating conditions, and requires only moderate computing power, as provided by a personal computer. A Monte Carlo model for microscopic film growth has been integrated into the computational fluid dynamics (CFD) code CFD-ACE (available from CFDRC).

Plasma Physics

The modeling of plasma physics and chemistry in CVD is probably not yet mature enough to be done routinely by nonexperts. Relatively powerful and user-friendly continuum plasma simulation tools have been incorporated into some tailored CFD codes, such as Phoenics-CVD from Cham, Ltd. and CFD-PLASMA (ICP) from CFDRC. They allow for two- and three-dimensional plasma modeling on powerful workstations at typical CPU times of several hours. However, the accurate modeling of plasma properties requires considerable expert knowledge. This is even more true for the relation between plasma physics and plasma-enhanced chemical reaction rates.

Hydrodynamics

General-purpose CFD packages for the simulation of multi-dimensional fluid flow have become available in the last two decades.
These codes have mostly been based on either the finite volume method (Patankar, 1980; Minkowycz et al., 1988) or the finite element method (Taylor and Hughes, 1981; Zienkiewicz and Taylor, 1989). Generally, these packages offer easy grid generation for complex two-dimensional and three-dimensional geometries, a large variety of physical models (including models for gas radiation, flow in porous media, turbulent flow, two-phase flow, non-Newtonian liquids, etc.), integrated graphical post-processing, and menu-driven user-interfaces allowing the packages to be used without detailed knowledge of fluid dynamics and computational techniques. Obviously, CFD packages are powerful tools for CVD hydrodynamics modeling. It should, however, be realized that they have not been developed specifically for CVD modeling. As a result: (1) the input data must be formulated in a way that is not very compatible with common CVD practice; (2) many features are included that are not needed for CVD modeling, which makes the packages
bulky and slow; (3) the numerical solvers are generally not very well suited for the solution of the stiff equations typical of CVD chemistry; (4) some output that is of specific interest in CVD modeling is not provided routinely; and (5) the codes do not include modeling features that are needed for accurate CVD modeling, such as gas species thermodynamic and transport property databases, solids thermal and optical property databases, chemical reaction mechanism and rate constants databases, gas mixture property calculation from kinetic theory, multicomponent ordinary and thermal diffusion, multiple chemical species in the gas phase and at the surface, multiple chemical reactions in the gas phase and at the surface, plasma physics and plasma chemistry models, and non-gray, non-diffuse wall-to-wall radiation models. The modifications required to include these features in general-purpose fluid dynamics codes are not trivial, especially when the source codes are not available. Nevertheless, promising results in the modeling of CVD reactors have been obtained with general purpose CFD codes. A few CFD codes have been specially tailored for CVD reactor scale simulations including the following. PHOENICS-CVD (Phoenics-CVD, 1997), a finite-volume CVD simulation tool based on the PHOENICS flow simulator by Cham Ltd. (developed under EC-ESPRIT project 7161). It includes databases for thermal and optical solid properties and thermodynamic and transport gas properties (CHEMKIN-format), models for multicomponent (thermal) diffusion, kinetic theory models for gas properties, multireaction gas-phase and surface chemistry capabilities, the effective drift diffusion plasma model and an advanced wall-to-wall thermal radiation model, including spectral dependent optical properties, semitransparent media and specular reflection. Its modeling capabilities and examples of its applications have been described in Heritage (1995). 
CFD-ACE, a finite-volume CFD simulation tool by CFD Research Corporation. It includes models for multicomponent (thermal) diffusion, efficient algorithms for stiff multistep gas and surface chemistry, a wall-to-wall thermal radiation model (both gray and non-gray) including semitransparent media, and integrated Monte Carlo models for free molecular flow phenomena inside small structures. The code can be coupled to CFD-PLASMA to perform plasma CVD simulations. FLUENT, a finite-volume CFD simulation tool by Fluent Inc. It includes models for multicomponent (thermal) diffusion, kinetic theory models for gas properties, and (limited) multistep gas-phase chemistry and simple surface chemistry. These codes have been available for a relatively short time now and are still continuously evolving. They allow for the two-dimensional modeling of gas flow with simple chemistry on powerful personal computers in minutes. For three-dimensional simulations and simulations including, e.g., plasma, complex chemistry, or radiation effects, a powerful workstation is needed, and CPU times may be several hours. Potential users should compare the capabilities, flexibility, and user-friendliness of the codes to their own needs.
Kinetic Theory

Kinetic gas theory models for predicting transport properties of multicomponent gas mixtures have been incorporated into the PHOENICS-CVD, CFD-ACE, and FLUENT flow simulation codes. The CHEMKIN suite of codes contains a library of routines as well as databases (Kee et al., 1990) for evaluating transport properties of multicomponent gas mixtures as well.

Thermal Radiation

Wall-to-wall thermal radiation models, including essential features for CVD modeling such as spectrally dependent (non-gray) optical properties, semitransparent media (e.g., quartz), and specular reflection on, e.g., polished metal surfaces, have been incorporated in the PHOENICS-CVD and CFD-ACE flow simulation codes. In addition, many stand-alone thermal radiation simulators are available, e.g., ANSYS (from Swanson Analysis Systems).
PROBLEMS

CVD simulation is a powerful tool in reactor design and process optimization. With commercially available CFD software, rather straightforward and reliable hydrodynamic modeling studies can be performed, which give valuable information on, e.g., flow recirculations, dead zones, and other important reactor design issues. Thermal radiation simulations can also be performed relatively easily, and they provide detailed insight into design parameters such as heating uniformities and peak thermal load. However, as soon as one wishes to perform a comprehensive CVD simulation to predict issues such as film deposition rate and uniformity, conformality, and purity, several problems arise.

The first problem is that every available numerical simulation code to be used for CVD simulation has some limitations or drawbacks. The most powerful and tailored CVD simulation models (i.e., the CHEMKIN suite from Reaction Design) allow for the hydrodynamic simulation of highly idealized and simplified flow systems only, and do not include models for thermal radiation and molecular behavior in small structures. CFD codes (even the ones that have been tailored for CVD modeling, such as PHOENICS-CVD, CFD-ACE, and FLUENT) have limitations with respect to modeling stiff multi-reaction chemistry. The coupling to molecular flow models (if any) is only one-way, and incorporated plasma models have a limited range of validity and require specialist knowledge. The simulation of complex three-dimensional reactor geometries with detailed chemistry, plasma, and/or radiation can be very cumbersome, and may require long CPU times.

A second and perhaps even more important problem is the lack of detailed, reliable, and validated chemistry models. Models have been proposed for some important CVD processes (see Principles of the Method), but their testing and validation has been rather limited.
Also, the ‘‘translation’’ of published models to the input format required by various software is error-prone. In fact, the unknown
chemistry of many processes is the most important bottleneck in CVD simulation. The CHEMKIN codes come with detailed chemistry models (including rate constants) for a range of processes. To a lesser extent, detailed chemistry models for some CVD processes have been incorporated in the databases of PHOENICS-CVD as well. Lumped chemistry models should be used with the greatest care, since they are unlikely to hold for process conditions different from those for which they have been developed, and it is sometimes even doubtful whether they can be used in a different reactor than that for which they have been tested. Even detailed chemistry models based on elementary processes have a limited range of validity, and the use of these models in a different pressure regime is especially dangerous. Without fitting, their accuracy in predicting growth rates may well be off by 100%. The use of theoretical tools for predicting rate constants in the gas phase requires specialized knowledge and is not completely straightforward. This is even more the case for surface reaction kinetics, where theoretical tools are just beginning to be developed. A third problem is the lack of accurate input data, such as Lennard-Jones parameters for gas property prediction, thermal and optical solid properties (especially for coated surfaces), and plasma characteristics. Some scattered information and databases containing relevant parameters are available, but almost every modeler setting up CVD simulations for a new process will find that important data are lacking. Furthermore, the coupling between the macro-scale (hydrodynamics, plasma, and radiation) parts of CVD simulations and meso-scale models for molecular flow and deposition in small surface structures is difficult (Jensen et al., 1996; Gobbert et al., 1997), and this, in particular, is what one is interested in for most CVD modeling. Finally, CVD modeling does not, at present, predict film structure, morphology, or adhesion.
It does not predict mechanical, optical, or electrical film properties. It does not lead to the invention of new processes or the prediction of successful precursors. It does not predict optimal reactor configurations or processing conditions (although it can be used in evaluating the process performance as a function of reactor configuration and processing conditions). It does not generally lead to quantitatively correct growth rate or step coverage predictions without some fitting aided by prior experimental knowledge of the deposition kinetics. However, in spite of all these limitations, carefully set up CVD simulations can provide reactor designers and process developers with a wealth of information, as shown in many studies (Kleijn, 1995). CVD simulations predict general trends in the process characteristics and deposition properties in relation to process conditions and reactor geometry and can provide fundamental insight into the relative importance of various phenomena. As such, they can be an important tool in process optimization and reactor design, pointing out bottlenecks in the design and issues that need to be studied more carefully. All of this leads to more efficient, faster, and less expensive process design, in which less trial and error is involved. Thus,
successful attempts have been made in using simulation, for example, to optimize hydrodynamic reactor design and eliminate flow recirculations (Evans and Greif, 1987; Visser et al., 1989; Fotiadis et al., 1990), to predict and optimize deposition rate and uniformity (Jensen and Graves, 1983; Kleijn and Hoogendoorn, 1991; Biber et al., 1992), to optimize temperature uniformity (Badgwell et al., 1994; Kersch and Morokoff, 1995), to scale up existing reactors to large wafer diameters (Badgwell et al., 1992), to optimize process operation and processing conditions with respect to deposition conformality (Hasper et al., 1991; Kristof et al., 1997), to predict the influence of processing conditions on doping rates (Masi et al., 1992), to evaluate loading effects on selective deposition rates (Holleman et al., 1993), and to study the influence of operating conditions on self-limiting effects (Leusink et al., 1992) and selectivity loss (Werner et al., 1992; Kuijlaars, 1996). The success of these exercises largely depends on the skills and experience of the modeler. Generally, all available CVD simulation software leads to erroneous results when used by careless or inexperienced modelers.
LITERATURE CITED

Allendorf, M. and Kee, R. 1991. A model of silicon carbide chemical vapor deposition. J. Electrochem. Soc. 138:841–852.
Allendorf, M. and Melius, C. 1992. Theoretical study of the thermochemistry of molecules in the Si-C-H system. J. Phys. Chem. 96:428–437.
Arora, R. and Pollard, R. 1991. A mathematical model for chemical vapor deposition influenced by surface reaction kinetics: Application to low-pressure deposition of tungsten. J. Electrochem. Soc. 138:1523–1537.
Badgwell, T., Edgar, T., and Trachtenberg, I. 1992. Modeling and scale-up of multiwafer LPCVD reactors. AIChE J. 38:926–938.
Badgwell, T., Trachtenberg, I., and Edgar, T. 1994. Modeling the wafer temperature profile in a multiwafer LPCVD furnace. J. Electrochem. Soc. 141:161–171.
Barin, I. and Knacke, O. 1977. Thermochemical Properties of Inorganic Substances. Springer-Verlag, Berlin.
Barin, I., Knacke, O., and Kubaschewski, O. 1977. Thermochemical Properties of Inorganic Substances. Supplement. Springer-Verlag, Berlin.
Benson, S. 1976. Thermochemical Kinetics (2nd ed.). John Wiley & Sons, New York.
Biber, C., Wang, C., and Motakef, S. 1992. Flow regime map and deposition rate uniformity in vertical rotating-disk OMVPE reactors. J. Crystal Growth 123:545–554.
Bird, R. B., Stewart, W., and Lightfoot, E. 1960. Transport Phenomena. John Wiley & Sons, New York.
Birdsall, C. 1991. Particle-in-cell charged particle simulations, plus Monte Carlo collisions with neutral atoms. IEEE Trans. Plasma Sci. 19:65–85.
Birdsall, C. and Langdon, A. 1985. Plasma Physics via Computer Simulation. McGraw-Hill, New York.
Boenig, H. 1988. Fundamentals of Plasma Chemistry and Technology. Technomic Publishing Co., Lancaster, Pa.
Brinkmann, R., Vogg, G., and Werner, C. 1995a. Plasma enhanced deposition of amorphous silicon. Phoenics J. 8:512–522.
Brinkmann, R., Werner, C., and Fürst, R. 1995b. The effective drift-diffusion plasma model and its implementation into PHOENICS-CVD. Phoenics J. 8:455–464.
Bryant, W. 1977. The fundamentals of chemical vapour deposition. J. Mat. Science 12:1285–1306.
Bunshah, R. 1982. Deposition Technologies for Films and Coatings. Noyes Publications, Park Ridge, N.J.
Cale, T. and Raupp, G. 1990a. Free molecular transport and deposition in cylindrical features. J. Vac. Sci. Technol. B 8:649–655.
Cale, T. and Raupp, G. 1990b. A unified line-of-sight model of deposition in rectangular trenches. J. Vac. Sci. Technol. B 8:1242–1248.
Cale, T., Raupp, G., and Gandy, T. 1990. Free molecular transport and deposition in long rectangular trenches. J. Appl. Phys. 68:3645–3652.
Cale, T., Gandy, T., and Raupp, G. 1991. A fundamental feature scale model for low pressure deposition processes. J. Vac. Sci. Technol. A 9:524–529.
Chapman, B. 1980. Glow Discharge Processes. John Wiley & Sons, New York.
Chatterjee, S. and McConica, C. 1990. Prediction of step coverage during blanket CVD tungsten deposition in cylindrical pores. J. Electrochem. Soc. 137:328–335.
Clark, T. 1985. A Handbook of Computational Chemistry. John Wiley & Sons, New York.
Coltrin, M., Kee, R., and Miller, J. 1986. A mathematical model of silicon chemical vapor deposition. Further refinements and the effects of thermal diffusion. J. Electrochem. Soc. 133:1206–1213.
Coltrin, M., Kee, R., and Evans, G. 1989. A mathematical model of the fluid mechanics and gas-phase chemistry in a rotating disk chemical vapor deposition reactor. J. Electrochem. Soc. 136:819–829.
Coltrin, M., Kee, R., and Rupley, F. 1991a. Surface Chemkin: A general formalism and software for analyzing heterogeneous chemical kinetics at a gas-surface interface. Int. J. Chem. Kinet. 23:1111–1128.
Coltrin, M., Kee, R., and Rupley, F. 1991b. Surface Chemkin (Version 4.0). Technical Report SAND90-8003. Sandia National Laboratories, Albuquerque, N.M./Livermore, Calif.
Coltrin, M., Kee, R., Evans, G., Meeks, E., Rupley, F., and Grcar, J. 1993a. SPIN (Version 3.83): A FORTRAN program for modeling one-dimensional rotating-disk/stagnation-flow chemical vapor deposition reactors. Technical Report SAND91-8003.UC-401. Sandia National Laboratories, Albuquerque, N.M./Livermore, Calif.
Coltrin, M., Moffat, H., Kee, R., and Rupley, F. 1993b. CRESLAF (Version 4.0): A FORTRAN program for modeling laminar, chemically reacting, boundary-layer flow in cylindrical or planar channels. Technical Report SAND93-0478.UC-401. Sandia National Laboratories, Albuquerque, N.M./Livermore, Calif.
Cooke, M. and Harris, G. 1989. Monte Carlo simulation of thin-film deposition in a rectangular groove. J. Vac. Sci. Technol. A 7:3217–3221.
Coronell, D. and Jensen, K. 1993. Monte Carlo simulations of very low pressure chemical vapor deposition. J. Comput. Aided Mater. Des. 1:1–12.
Coronell, D. and Jensen, K. 1994. Monte Carlo simulation study of radiation heat transfer in the multiwafer LPCVD reactor. J. Electrochem. Soc. 141:496–501.
Dapkus, P. 1982. Metalorganic chemical vapor deposition. Annu. Rev. Mater. Sci. 12:243–269.
SIMULATION OF CHEMICAL VAPOR DEPOSITION PROCESSES

Dewar, M., Healy, E., and Stewart, J. 1984. Location of transition states in reaction mechanisms. J. Chem. Soc. Faraday Trans. II 80:227–233.
Evans, G. and Greif, R. 1987. A numerical model of the flow and heat transfer in a rotating disk chemical vapor deposition reactor. J. Heat Transfer 109:928–935.
Forst, W. 1973. Theory of Unimolecular Reactions. Academic Press, New York.
Fotiadis, D. 1990. Two- and Three-Dimensional Finite Element Simulations of Reacting Flows in Chemical Vapor Deposition of Compound Semiconductors. Ph.D. thesis. University of Minnesota, Minneapolis, Minn.
Fotiadis, D., Kieda, S., and Jensen, K. 1990. Transport phenomena in vertical reactors for metalorganic vapor phase epitaxy: I. Effects of heat transfer characteristics, reactor geometry, and operating conditions. J. Crystal Growth 102:441–470.
Frenklach, M. and Wang, H. 1991. Detailed surface and gas-phase chemical kinetics of diamond deposition. Phys. Rev. B 43:1520–1545.
Gebhart, B. 1958. A new method for calculating radiant exchanges. Heating, Piping, Air Conditioning 30:131–135.
Gebhart, B. 1971. Heat Transfer (2nd ed.). McGraw-Hill, New York.
Gilbert, R., Luther, K., and Troe, J. 1983. Theory of unimolecular reactions in the fall-off range. Ber. Bunsenges. Phys. Chem. 87:169–177.
Giunta, C., McCurdy, R., Chapple-Sokol, J., and Gordon, R. 1990a. Gas-phase kinetics in the atmospheric pressure chemical vapor deposition of silicon from silane and disilane. J. Appl. Phys. 67:1062–1075.
Giunta, C., Chapple-Sokol, J., and Gordon, R. 1990b. Kinetic modeling of the chemical vapor deposition of silicon dioxide from silane or disilane and nitrous oxide. J. Electrochem. Soc. 137:3237–3253.
Gobbert, M. K., Ringhofer, C. A., and Cale, T. S. 1996. Mesoscopic scale modeling of micro loading during low pressure chemical vapor deposition. J. Electrochem. Soc. 143(8):524–530.
Gobbert, M., Merchant, T., Borucki, L., and Cale, T. 1997. Vertical integration of CVD process models. In Chemical Vapor Deposition: Proceedings of the 14th International Conference and EUROCVD-II (M. Allendorf and C. Bernard, eds.) pp. 254–261. Electrochemical Society, Pennington, N.J.
Gogolides, E. and Sawin, H. 1992. Continuum modeling of radio-frequency glow discharges. I. Theory and results for electropositive and electronegative gases. J. Appl. Phys. 72:3971–3987.
Gordon, S. and McBride, B. 1971. Computer Program for Calculation of Complex Chemical Equilibrium Compositions, Rocket Performance, Incident and Reflected Shocks and Chapman-Jouguet Detonations. Technical Report SP-273. National Aeronautics and Space Administration, Washington, D.C.
Granneman, E. 1993. Thin films in the integrated circuit industry: Requirements and deposition methods. Thin Solid Films 228:1–11.
Graves, D. 1989. Plasma processing in microelectronics manufacturing. AIChE Journal 35:1–29.
Hase, W. and Bunker, D. 1973. Quantum Chemistry Program Exchange (QCPE) 11, 234. Quantum Chemistry Program Exchange, Department of Chemistry, Indiana University, Bloomington, Ind.
Hasper, A., Kleijn, C., Holleman, J., Middelhoek, J., and Hoogendoorn, C. 1991. Modeling and optimization of the step coverage of tungsten LPCVD in trenches and contact holes. J. Electrochem. Soc. 138:1728–1738.
Hebb, J. B. and Jensen, K. F. 1996. The effect of multilayer patterns on temperature uniformity during rapid thermal processing. J. Electrochem. Soc. 143(3):1142–1151.
Hehre, W., Radom, L., Schleyer, P., and Pople, J. 1986. Ab Initio Molecular Orbital Theory. John Wiley & Sons, New York.
Heritage, J. R. (ed.) 1995. Special issue on PHOENICS-CVD and its applications. PHOENICS J. 8(4):402–552.
Hess, D., Jensen, K., and Anderson, T. 1985. Chemical vapor deposition: A chemical engineering perspective. Rev. Chem. Eng. 3:97–186.
Hirschfelder, J., Curtiss, C., and Bird, R. 1967. Molecular Theory of Gases and Liquids. John Wiley & Sons, New York.
Hitchman, M. and Jensen, K. (eds.) 1993. Chemical Vapor Deposition: Principles and Applications. Academic Press, London.
Ho, P. and Melius, C. 1990. A theoretical study of the thermochemistry of SiFn and SiHnFm compounds and Si2F6. J. Phys. Chem. 94:5120–5127.
Ho, P., Coltrin, M., Binkley, J., and Melius, C. 1985. A theoretical study of the heats of formation of SiHn, SiCln, and SiHnClm compounds. J. Phys. Chem. 89:4647–4654.
Ho, P., Coltrin, M., Binkley, J., and Melius, C. 1986. A theoretical study of the heats of formation of Si2Hn (n = 0–6) compounds and trisilane. J. Phys. Chem. 90:3399–3406.
Hockney, R. and Eastwood, J. 1981. Computer Simulations Using Particles. McGraw-Hill, New York.
Holleman, J., Hasper, A., and Kleijn, C. 1993. Loading effects on kinetical and electrical aspects of silane-reduced low-pressure chemical vapor deposited selective tungsten. J. Electrochem. Soc. 140:818–825.
Holstein, W., Fitzjohn, J., Fahy, E., Golmour, P., and Schmelzer, E. 1989. Mathematical modeling of cold-wall channel CVD reactors. J. Crystal Growth 94:131–144.
Hopfmann, C., Werner, C., and Ulacia, J. 1991. Numerical analysis of fluid flow and non-uniformities in a polysilicon LPCVD batch reactor. Appl. Surf. Sci. 52:169–187.
Howell, J. 1968. Application of Monte Carlo to heat transfer problems. In Advances in Heat Transfer (J. Hartnett and T. Irvine, eds.), Vol. 5. Academic Press, New York.
Ikegawa, M. and Kobayashi, J. 1989. Deposition profile simulation using the direct simulation Monte Carlo method. J. Electrochem. Soc. 136:2982–2986.
Jansen, A., Orazem, M., Fox, B., and Jesser, W. 1991. Numerical study of the influence of reactor design on MOCVD with a comparison to experimental data. J. Crystal Growth 112:316–336.
Jensen, K. 1987. Micro-reaction engineering applications of reaction engineering to processing of electronic and photonic materials. Chem. Eng. Sci. 42:923–958.
Jensen, K. and Graves, D. 1983. Modeling and analysis of low pressure CVD reactors. J. Electrochem. Soc. 130:1950–1957.
Jensen, K., Mihopoulos, T., Rodgers, S., and Simka, H. 1996. CVD simulations on multiple length scales. In CVD XIII: Proceedings of the 13th International Conference on Chemical Vapor Deposition (T. Besman, M. Allendorf, M. Robinson, and R. Ulrich, eds.) pp. 67–74. Electrochemical Society, Pennington, N.J.
Jensen, K. F., Einset, E., and Fotiadis, D. 1991. Flow phenomena in chemical vapor deposition of thin films. Annu. Rev. Fluid Mech. 23:197–232.
Jones, A. and O'Brien, P. 1997. CVD of Compound Semiconductors. VCH, Weinheim, Germany.
COMPUTATION AND THEORETICAL METHODS
Kalindindi, S. and Desu, S. 1990. Analytical model for the low pressure chemical vapor deposition of SiO2 from tetraethoxysilane. J. Electrochem. Soc. 137:624–628.
Kee, R., Rupley, F., and Miller, J. 1989. Chemkin-II: A Fortran chemical kinetics package for the analysis of gas-phase chemical kinetics. Technical Report SAND89-8009B.UC-706. Sandia National Laboratories, Albuquerque, N.M.
Kee, R., Rupley, F., and Miller, J. 1990. The Chemkin thermodynamic data base. Technical Report SAND87-8215B.UC-4. Sandia National Laboratories, Albuquerque, N.M./Livermore, Calif.
Kee, R., Dixon-Lewis, G., Warnatz, J., Coltrin, M., and Miller, J. 1991. A FORTRAN computer code package for the evaluation of gas-phase multicomponent transport properties. Technical Report SAND86-8246.UC-401. Sandia National Laboratories, Albuquerque, N.M./Livermore, Calif.
Kersch, A. 1995a. Radiative heat transfer modeling. Phoenics J. 8:421–438.
Kersch, A. 1995b. RTP reactor simulations. Phoenics J. 8:500–511.
Kersch, A. and Morokoff, W. 1995. Transport Simulation in Microelectronics. Birkhäuser, Basel.
Kleijn, C. 1991. A mathematical model of the hydrodynamics and gas-phase reactions in silicon LPCVD in a single-wafer reactor. J. Electrochem. Soc. 138:2190–2200.
Kleijn, C. 1995. Chemical vapor deposition processes. In Computational Modeling in Semiconductor Processing (M. Meyyappan, ed.) pp. 97–229. Artech House, Boston.
Kleijn, C. and Hoogendoorn, C. 1991. A study of 2- and 3-D transport phenomena in horizontal chemical vapor deposition reactors. Chem. Eng. Sci. 46:321–334.
Kleijn, C. and Kuijlaars, K. 1995. The modeling of transport phenomena in CVD reactors. Phoenics J. 8:404–420.
Kleijn, C. and Werner, C. 1993. Modeling of Chemical Vapor Deposition of Tungsten Films. Birkhäuser, Basel.
Kline, L. and Kushner, M. 1989. Computer simulations of materials processing plasma discharges. Crit. Rev. Solid State Mater. Sci. 16:1–35.
Knudsen, M. 1934. Kinetic Theory of Gases. Methuen and Co. Ltd., London.
Kodas, T. and Hampden-Smith, M. 1994. The Chemistry of Metal CVD. VCH, Weinheim, Germany.
Koh, J. and Woo, S. 1990. Computer simulation study on atmospheric pressure CVD process for amorphous silicon carbide. J. Electrochem. Soc. 137:2215–2222.
Kristof, J., Song, L., Tsakalis, T., and Cale, T. 1997. Programmed rate and optimal control chemical vapor deposition of tungsten. In Chemical Vapor Deposition: Proceedings of the 14th International Conference and EUROCVD-II (M. Allendorf and C. Bernard, eds.) pp. 1566–1573. Electrochemical Society, Pennington, N.J.
Kuijlaars, K. 1996. Detailed Modeling of Chemistry and Transport in CVD Reactors—Application to Tungsten LPCVD. Ph.D. thesis, Delft University of Technology, The Netherlands.
Laidler, K. 1987. Chemical Kinetics (3rd ed.). Harper and Row, New York.
l'Air Liquide, D. S. 1976. Encyclopédie des Gaz. Elsevier Scientific Publishing, Amsterdam.
Leusink, G., Kleijn, C., Oosterlaken, T., Janssen, G., and Radelaar, S. 1992. Growth kinetics and inhibition of growth of chemical vapor deposited thin tungsten films on silicon from tungsten hexafluoride. J. Appl. Phys. 72:490–498.
Liu, B., Hicks, R., and Zinck, J. 1992. Chemistry of photo-assisted organometallic vapor-phase epitaxy of cadmium telluride. J. Crystal Growth 123:500–518.
Lutz, A., Kee, R., and Miller, J. 1993. SENKIN: A FORTRAN program for predicting homogeneous gas phase chemical kinetics with sensitivity analysis. Technical Report SAND87-8248.UC-401. Sandia National Laboratories, Albuquerque, N.M.
Maitland, G. and Smith, E. 1972. Critical reassessment of viscosities of 11 common gases. J. Chem. Eng. Data 17:150–156.
Masi, M., Simka, H., Jensen, K., Kuech, T., and Potemski, R. 1992. Simulation of carbon doping of GaAs during MOVPE. J. Crystal Growth 124:483–492.
Meeks, E., Kee, R., Dandy, D., and Coltrin, M. 1992. Computational simulation of diamond chemical vapor deposition in premixed C2H2/O2/H2 and CH4/O2 strained flames. Combust. Flame 92:144–160.
Melius, C., Allendorf, M., and Coltrin, M. 1997. Quantum chemistry: A review of ab initio methods and their use in predicting thermochemical data for CVD processes. In Chemical Vapor Deposition: Proceedings of the 14th International Conference and EUROCVD-II (M. Allendorf and C. Bernard, eds.) pp. 1–14. Electrochemical Society, Pennington, N.J.
Meyyappan, M. (ed.) 1995a. Computational Modeling in Semiconductor Processing. Artech House, Boston.
Meyyappan, M. 1995b. Plasma process modeling. In Computational Modeling in Semiconductor Processing (M. Meyyappan, ed.) pp. 231–324. Artech House, Boston.
Minkowycz, W., Sparrow, E., Schneider, G., and Pletcher, R. 1988. Handbook of Numerical Heat Transfer. John Wiley & Sons, New York.
Moffat, H. and Jensen, K. 1986. Complex flow phenomena in MOCVD reactors. I. Horizontal reactors. J. Crystal Growth 77:108–119.
Moffat, H., Jensen, K., and Carr, R. 1991a. Estimation of the Arrhenius parameters for SiH4 ⇌ SiH2 + H2 and ΔHf(SiH2) by a nonlinear regression analysis of the forward and reverse reaction rate data. J. Phys. Chem. 95:145–154.
Moffat, H., Glarborg, P., Kee, R., Grcar, J., and Miller, J. 1991b. SURFACE PSR: A FORTRAN Program for Modeling Well-Stirred Reactors with Gas and Surface Reactions. Technical Report SAND91-8001.UC-401. Sandia National Laboratories, Albuquerque, N.M./Livermore, Calif.
Motz, H. and Wise, H. 1960. Diffusion and heterogeneous reaction. III. Atom recombination at a catalytic boundary. J. Chem. Phys. 31:1893–1894.
Mountziaris, T. and Jensen, K. 1991. Gas-phase and surface reaction mechanisms in MOCVD of GaAs with trimethyl-gallium and arsine. J. Electrochem. Soc. 138:2426–2439.
Mountziaris, T., Kalyanasundaram, S., and Ingle, N. 1993. A reaction-transport model of GaAs growth by metal organic chemical vapor deposition using trimethyl-gallium and tertiary-butylarsine. J. Crystal Growth 131:283–299.
Okkerse, M., Klein-Douwel, R., de Croon, M., Kleijn, C., ter Meulen, J., Marin, G., and van den Akker, H. 1997. Simulation of a diamond oxy-acetylene combustion torch reactor with a reduced gas-phase and surface mechanism. In Chemical Vapor Deposition: Proceedings of the 14th International Conference and EUROCVD-II (M. Allendorf and C. Bernard, eds.) pp. 163–170. Electrochemical Society, Pennington, N.J.
Ordal, M., Bell, R., Alexander, R., Long, L., and Querry, M. 1985. Optical properties of fourteen metals in the infrared and far infrared. Appl. Optics 24:4493.
Palik, E. 1985. Handbook of Optical Constants of Solids. Academic Press, New York.
Park, H., Yoon, S., Park, C., and Chun, J. 1989. Low pressure chemical vapor deposition of blanket tungsten using a gaseous mixture of WF6, SiH4 and H2. Thin Solid Films 181:85–93.
Patankar, S. 1980. Numerical Heat Transfer and Fluid Flow. Hemisphere Publishing, Washington, D.C.
Peev, G., Zambov, L., and Yanakiev, Y. 1990a. Modeling and optimization of the growth of polycrystalline silicon films by thermal decomposition of silane. J. Crystal Growth 106:377–386.
Peev, G., Zambov, L., and Nedev, I. 1990b. Modeling of low pressure chemical vapour deposition of Si3N4 thin films from dichlorosilane and ammonia. Thin Solid Films 190:341–350.
Pierson, H. 1992. Handbook of Chemical Vapor Deposition. Noyes Publications, Park Ridge, N.J.
Raupp, G. and Cale, T. 1989. Step coverage prediction in low-pressure chemical vapor deposition. Chem. Mater. 1:207–214.
Rees, W. Jr. (ed.) 1996. CVD of Nonmetals. VCH, Weinheim, Germany.
Reid, R., Prausnitz, J., and Poling, B. 1987. The Properties of Gases and Liquids (2nd ed.). McGraw-Hill, New York.
Rey, J., Cheng, L., McVittie, J., and Saraswat, K. 1991. Monte Carlo low pressure deposition profile simulations. J. Vac. Sci. Technol. A 9:1083–1087.
Robinson, P. and Holbrook, K. 1972. Unimolecular Reactions. Wiley-Interscience, London.
Rodgers, S. T. and Jensen, K. F. 1998. Multiscale modeling of chemical vapor deposition. J. Appl. Phys. 83(1):524–530.
Roenigk, K. and Jensen, K. 1987. Low pressure CVD of silicon nitride. J. Electrochem. Soc. 132:448–454.
Roenigk, K., Jensen, K., and Carr, R. 1987. Rice-Ramsperger-Kassel-Marcus theoretical prediction of high-pressure Arrhenius parameters by non-linear regression: Application to silane and disilane decomposition. J. Phys. Chem. 91:5732–5739.
Schmitz, J. and Hasper, A. 1993. On the mechanism of the step coverage of blanket tungsten chemical vapor deposition. J. Electrochem. Soc. 140:2112–2116.
Sherman, A. 1987. Chemical Vapor Deposition for Microelectronics. Noyes Publications, New York.
Siegel, R. and Howell, J. 1992. Thermal Radiation Heat Transfer (3rd ed.). Hemisphere Publishing, Washington, D.C.
Simka, H., Hierlemann, M., Utz, M., and Jensen, K. 1996. Computational chemistry predictions of kinetics and major reaction pathways for germane gas-phase reactions. J. Electrochem. Soc. 143:2646–2654.
Slater, N. 1959. Theory of Unimolecular Reactions. Cornell University Press, Ithaca, N.Y.
Steinfeld, J., Francisco, J., and Hase, W. 1989. Chemical Kinetics and Dynamics. Prentice-Hall, Englewood Cliffs, N.J.
Stewart, J. 1983. Quantum Chemistry Program Exchange (QCPE), no. 455. Quantum Chemistry Program Exchange, Department of Chemistry, Indiana University, Bloomington, Ind.
Stull, D. and Prophet, H. (eds.) 1974–1982. JANAF Thermochemical Tables (2nd ed.), NSRDS-NBS 37. National Bureau of Standards, Washington, D.C. Supplements by Chase, M. W., Curnutt, J. L., Hu, A. T., Prophet, H., Syverud, A. N., Walker, A. C., McDonald, R. A., Downey, J. R., and Valenzuela, E. A., J. Phys. Chem. Ref. Data 3:311 (1974); 4:1 (1975); 7:793 (1978); 11:695 (1982).
Svehla, R. 1962. Estimated Viscosities and Thermal Conductivities of Gases at High Temperatures. Technical Report R-132. National Aeronautics and Space Administration, Washington, D.C.
Taylor, C. and Hughes, T. 1981. Finite Element Programming of the Navier-Stokes Equations. Pineridge Press Ltd., Swansea, U.K.
Tirtowidjojo, M. and Pollard, R. 1988. Elementary processes and rate-limiting factors in MOVPE of GaAs. J. Crystal Growth 77:108–114.
Visser, E., Kleijn, C., Govers, C., Hoogendoorn, C., and Giling, L. 1989. Return flows in horizontal MOCVD reactors studied with the use of TiO2 particle injection and numerical calculations. J. Crystal Growth 94:929–946 (Erratum 96:732–735).
Vossen, J. and Kern, W. (eds.) 1991. Thin Film Processes II. Academic Press, Boston.
Wagman, D., Evans, W., Parker, V., Schumm, R., Halow, I., Bailey, S., Churney, K., and Nuttall, R. 1982. The NBS tables of chemical thermodynamic properties. J. Phys. Chem. Ref. Data 11 (Suppl. 2).
Wahl, G. 1977. Hydrodynamic description of CVD processes. Thin Solid Films 40:13–26.
Wang, Y. and Pollard, R. 1993. A mathematical model for CVD of tungsten from tungsten hexafluoride and silane. In Advanced Metallization for ULSI Applications in 1992 (T. Cale and F. Pintchovski, eds.) pp. 169–175. Materials Research Society, Pittsburgh.
Wang, Y.-F. and Pollard, R. 1994. A method for predicting the adsorption energetics of diatomic molecules on metal surfaces. Surface Sci. 302:223–234.
Wang, Y.-F. and Pollard, R. 1995. An approach for modeling surface reaction kinetics in chemical vapor deposition processes. J. Electrochem. Soc. 142:1712–1725.
Weast, R. (ed.) 1984. Handbook of Chemistry and Physics. CRC Press, Boca Raton, Fla.
Werner, C., Ulacia, J., Hopfmann, C., and Flynn, P. 1992. Equipment simulation of selective tungsten deposition. J. Electrochem. Soc. 139:566–574.
Wulu, H., Saraswat, K., and McVittie, J. 1991. Simulation of mass transport for deposition in via holes and trenches. J. Electrochem. Soc. 138:1831–1840.
Zachariah, M. and Tsang, W. 1995. Theoretical calculation of thermochemistry, energetics, and kinetics of high-temperature SixHyOz reactions. J. Phys. Chem. 99:5308–5318.
Zienkiewicz, O. and Taylor, R. 1989. The Finite Element Method (4th ed.). McGraw-Hill, London.

KEY REFERENCES

Hitchman and Jensen, 1993. See above.
Extensive treatment of the fundamental and practical aspects of CVD processes, including experimental diagnostics and modeling.

Meyyappan, 1995. See above.
Comprehensive review of the fundamentals and numerical aspects of CVD, crystal growth, and plasma modeling. Extensive literature review up to 1993.

The Phoenics Journal, Vol. 8(4), 1995. See above.
Various articles on the theory of CVD and PECVD modeling. Nice illustrations of the use of modeling in reactor and process design.
CHRIS R. KLEIJN
Delft University of Technology
Delft, The Netherlands
MAGNETISM IN ALLOYS

INTRODUCTION

The human race has used magnetic materials for well over 2000 years. Today, magnetic materials power the world, for they are at the heart of energy conversion devices such as generators, transformers, and motors, and are major components in automobiles. Furthermore, these materials will be important components in the energy-efficient vehicles of tomorrow. More recently, besides the obvious advances in semiconductors, the computer revolution has been fueled by advances in magnetic storage devices, and will continue to be affected by the development of new multicomponent high-coercivity magnetic alloys and multilayer coatings. Many magnetic materials are important for some of their other properties, which are superficially unrelated to their magnetism. Iron steels and iron-nickel (so-called "Invar," or volume INVARiant) alloys are two important examples from a long list. Thus, to understand a wide range of materials, the origins of magnetism, as well as the interplay with alloying, must be uncovered. A quantum-mechanical description of the electrons in the solid is needed for such understanding, so as to describe, on an equal footing and without bias, as many key microscopic factors as possible. Additionally, many aspects, such as magnetic anisotropy and hence permanent magnetism, need the full power of relativistic quantum electrodynamics to expose their underpinnings.

From Atoms to Solids

Experiments on atomic spectra, and the resulting highly abundant data, led to several empirical rules which we now know as Hund's rules. These rules describe the filling of the atomic orbitals with electrons as the atomic number changes. Electrons occupy the orbitals in a shell in such a way as to maximize both the total spin and the total angular momentum. In the transition metals and their alloys, the orbital angular momentum is almost "quenched"; thus the spin Hund's rule is the most important.
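Hund's first (spin) rule is simple enough to state operationally. The sketch below (our own illustration, not taken from the text) fills a free atom's d shell by singly occupying all five orbitals with parallel spins before pairing, and counts the resulting spin-only moment:

```python
def hund_spin_moment(n_d):
    """Spin-only moment (in Bohr magnetons) of a free atom's d shell filled
    according to Hund's first rule: occupy all five d orbitals singly with
    parallel spins before any electron is forced to pair."""
    if not 0 <= n_d <= 10:
        raise ValueError("a d shell holds between 0 and 10 electrons")
    n_up = min(n_d, 5)       # majority spins: one electron per orbital first
    n_down = n_d - n_up      # any excess must pair with opposite spin
    return n_up - n_down     # number of unpaired electrons ~ moment in mu_B

# Free-atom d-shell moments for d^5, d^6, and d^8 configurations
print([hund_spin_moment(n) for n in (5, 6, 8)])  # [5, 4, 2]
```

In the metals themselves, the itinerant character of the 3d electrons reduces these free-atom values; bcc Fe, for example, carries about 2.2 Bohr magnetons per atom rather than 4, as discussed later in this unit.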
The quantum-mechanical reasons behind this rule are neatly summarized as a combination of the Pauli exclusion principle and the electron-electron (Coulomb) repulsion. These two effects lead to the so-called "exchange" interaction, which forces electrons with the same spin state to occupy states with different spatial distributions, i.e., with different angular momentum quantum numbers. Thus the exchange interaction has its origins in minimizing the Coulomb energy locally, i.e., the intrasite Coulomb energy, while satisfying the other constraints of quantum mechanics. As we will show later in this unit, this minimization in crystalline metals can result in a competition between intrasite (local) effects and intersite (extended) effects, i.e., the kinetic energy stemming from the curvature of the wave functions. When the overlaps between the orbitals are small, the intrasite effects dominate, and magnetic moments can form. When the overlaps are large, electrons hop from site to site in the lattice at such a rate that a local moment cannot be sustained. Most solids are described by this latter picture. Only in
a few places in the periodic table does the former picture closely reflect reality. We will explore one of these places here, namely the 3d transition metals. In the 3d transition metals, the states that are derived from the 3d and 4s atomic levels are primarily responsible for a metal's physical properties. The 4s states, being more spatially extended (higher principal quantum number), determine the metal's overall size and its compressibility. The 3d states are more (but not totally) localized and give rise to a metal's magnetic behavior. Since the 3d states are not totally localized, the electrons are considered to be mobile, giving rise to the name "itinerant" magnetism for such cases. At this point, we want to emphasize that moment formation and the alignment of these moments with each other have different origins. For example, magnetism plays an important role in the stability of stainless steel, FeNiCr. Although it is not ferromagnetic (having zero net magnetization), the moments on its individual constituents have not disappeared; they are simply not aligned. Moments may exist on the atomic scale, but they might not point in the same direction, even at near-zero temperatures. The mechanisms that are responsible for the moments and for their alignment depend on different aspects of the electronic structure. The former effect depends on the gross features, while the latter depends on the very detailed structure of the electronic states. The itinerant nature of the electrons makes magnetism and related properties difficult to model in transition metal alloys. On the other hand, in magnetic insulators the exchange interactions causing magnetism can be represented rather simply. Electrons are appropriately associated with particular atomic sites so that "spin" operators can be specified and the famous Heisenberg-Dirac Hamiltonian can then be used to describe the behavior of these systems. The Hamiltonian takes the following form,

    H = Σ_ij J_ij Ŝ_i · Ŝ_j    (1)
in which J_ij is an "exchange" integral, measuring the size of the electrostatic and exchange interaction, and Ŝ_i is the spin vector on site i. In metallic systems, it is not possible to allocate the itinerant electrons in this way, and such pairwise intersite interactions cannot be easily identified. In such metallic systems, magnetism is a complicated many-electron effect to which Hund's rules contribute. Many have labored with significant effort over a long period to understand and describe it. One common approach involves a mapping of this problem onto one involving independent electrons moving in the fields set up by all the other electrons. It is this aspect that gives rise to the spin-polarized band structure, an often-used basis to explain the properties of metallic magnets. However, this picture is not always sufficient. Herring (1966), among others, noted that certain components of metallic magnetism can also be discussed using concepts of localized spins which are, strictly speaking, only relevant to magnetic insulators. Later on in this unit, we discuss how the two pictures have been
combined to explain the temperature dependence of the magnetic properties of bulk transition metals and their alloys. In certain metals, such as stainless steel, magnetism is subtly connected with other properties via the behavior of the spin-polarized electronic structure. Dramatic examples are those materials which show a small thermal expansion coefficient below the Curie temperature, Tc, a large forced expansion in volume when an external magnetic field is applied, a sharp decrease of spontaneous magnetization and of the Curie temperature when pressure is applied, and large changes in the elastic constants as the temperature is lowered through Tc. These are the famous "Invar" materials, so called because these properties were first found to occur in the fcc alloys Fe-Ni (65% Fe), Fe-Pd, and Fe-Pt (Wassermann, 1991). The compositional order of an alloy is often intricately linked with its magnetic state, and this can also reveal physically interesting and technologically important new phenomena. Indeed, some alloys, such as Ni75Fe25, develop directional chemical order when annealed in a magnetic field (Chikazumi and Graham, 1969). Magnetic short-range correlations above Tc, and the magnetic order below, weaken and alter the chemical ordering in iron-rich Fe-Al alloys, so that a ferromagnetic Fe80Al20 alloy forms a DO3 ordered structure at low temperatures, whereas paramagnetic Fe75Al25 forms a B2 ordered phase at comparatively higher temperatures (Stephens, 1985; McKamey et al., 1991; Massalski et al., 1990; Staunton et al., 1997). The magnetic properties of many alloys are sensitive to the local environment. For example, ordered Ni-Pt (50%) is an antiferromagnetic alloy (Kuentzler, 1980), whereas its disordered counterpart is ferromagnetic (MAGNETIC MOMENT AND MAGNETIZATION, MAGNETIC NEUTRON SCATTERING). The main part of this unit is devoted to a discussion of the basis underlying such magneto-compositional effects.
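For the localized-spin picture, the Heisenberg-Dirac energy of Equation 1 is straightforward to evaluate for classical unit spins. The toy sketch below (our own example; the sign convention, with J > 0 favoring ferromagnetic alignment, is an assumption) contrasts aligned and alternating configurations on a periodic chain:

```python
import numpy as np

def heisenberg_energy(spins, J):
    """Nearest-neighbor Heisenberg energy E = -J * sum_i S_i . S_{i+1}
    for classical unit spins on a periodic chain; spins is an (N, 3) array."""
    neighbors = np.roll(spins, -1, axis=0)  # site i+1, with periodic wrap
    return -J * np.sum(np.einsum("ij,ij->i", spins, neighbors))

N = 6
ferro = np.tile([0.0, 0.0, 1.0], (N, 1))             # all moments parallel
antiferro = ferro * (-1.0) ** np.arange(N)[:, None]  # alternating moments

print(heisenberg_energy(ferro, J=1.0))       # -6.0: aligned state is lowest
print(heisenberg_energy(antiferro, J=1.0))   # +6.0 for J > 0
```

Reversing the sign of J makes the alternating (antiferromagnetic) configuration the lower-energy state, which is why the sign and magnitude of the exchange integrals control the magnetic order in this picture.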
Since the fundamental electrostatic exchange interactions are isotropic and do not couple the direction of magnetization to any spatial direction, they provide no basis for describing magnetic anisotropic effects, which lie at the root of technologically important magnetic properties, including domain wall structure, linear magnetostriction, and permanent magnetism in general. A description of these effects requires a relativistic treatment of the electrons' motions. A section of this unit is devoted to this aspect as it touches the properties of transition metal alloys.
PRINCIPLES OF THE METHOD

The Ground State of Magnetic Transition Metals: Itinerant Magnetism at Zero Temperature

Hohenberg and Kohn (1964) proved a remarkable theorem stating that the ground state energy of an interacting many-electron system is a unique functional of the electron density n(r). This functional is a minimum when evaluated at the true ground-state density no(r). Later Kohn and Sham (1965) extended various aspects of this theorem, providing a basis for practical applications of the density functional theory. In particular, they derived a set of
single-particle equations which could include all the effects of the correlations between the electrons in the system. These theorems provided the basis of the modern theory of the electronic structure of solids. In the spirit of Hartree and Fock, these ideas form a scheme for calculating the ground-state electron density by considering each electron as moving in an effective potential due to all the others. This potential is not easy to construct, since all the many-body quantum-mechanical effects have to be included. As such, approximate forms of the potential must be generated. The theorems and methods of the density functional (DF) formalism were soon generalized (von Barth and Hedin, 1972; Rajagopal and Callaway, 1973) to include the freedom of having different densities for each of the two spin quantum numbers. Thus the energy becomes a functional of the particle density, n(r), and the local magnetic density, m(r). The former is the sum of the spin densities; the latter, the difference. Each electron can now be pictured as moving in an effective magnetic field, B(r), as well as a potential, V(r), generated by the other electrons. This spin density functional theory (SDFT) is important in systems where spin-dependent properties play an important role, and provides the basis for the spin-polarized electronic structure mentioned in the introduction. The proofs of the basic theorems are provided in the original papers and in the many formal developments since then (Lieb, 1983; Dreizler and da Providencia, 1985). The many-body effects of the complicated quantum-mechanical problem are hidden in the exchange-correlation functional Exc[n(r), m(r)]. The exact solution is intractable; thus some sort of approximation must be made. The local approximation (LSDA) is the most widely used, where the energy (and corresponding potential) is taken from the uniformly spin-polarized homogeneous electron gas (see SUMMARY OF ELECTRONIC STRUCTURE METHODS and PREDICTION OF PHASE DIAGRAMS).
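Numerically, the local approximation turns the exchange-correlation energy into a pointwise quadrature of a homogeneous-gas energy density over the grid on which n(r) and m(r) are represented. The sketch below uses only the exchange part (the exactly spin-scaled Dirac exchange, in Hartree atomic units) as a stand-in for the full parameterized functionals cited below; it is an illustration of the structure of an LSDA evaluation, not a production functional:

```python
import numpy as np

def eps_x(n, m):
    """Exchange energy per electron (Hartree) of a homogeneous electron gas
    with density n and spin magnetization m, via the exact spin scaling of
    Dirac exchange: E_x[n_up, n_down] = (E_x[2 n_up] + E_x[2 n_down]) / 2."""
    c = -(3.0 / 4.0) * (3.0 / np.pi) ** (1.0 / 3.0)
    n_up, n_down = (n + m) / 2.0, (n - m) / 2.0
    return c * ((2.0 * n_up) ** (4.0 / 3.0)
                + (2.0 * n_down) ** (4.0 / 3.0)) / (2.0 * n)

def lsda_energy(n_grid, m_grid, voxel):
    """E_xc ~ sum over grid points of eps_xc(n, m) * n * voxel volume."""
    return np.sum(eps_x(n_grid, m_grid) * n_grid) * voxel

# Uniform, unpolarized density n = 1 on 8 voxels of total volume 1:
n = np.ones(8)
print(lsda_energy(n, np.zeros(8), voxel=0.125))  # about -0.7386 Hartree
```

Spin polarization lowers the exchange energy further (eps_x(1, 1) is deeper than eps_x(1, 0) by a factor of 2^(1/3)); it is this gain, competing against the kinetic-energy cost, that decides whether a moment forms.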
Point by point, the functional is set equal to the exchange and correlation energies of a homogeneously polarized electron gas, εxc, with the density and magnetization taken to be the local values, Exc[n(r), m(r)] = ∫ εxc[n(r), m(r)] n(r) dr (von Barth and Hedin, 1972; Hedin and Lundqvist, 1971; Gunnarsson and Lundqvist, 1976; Ceperley and Alder, 1980; Vosko et al., 1980). Since the "landmark" papers on Fe and Ni by Callaway and Wang (1977), it has been established that spin-polarized band theory, within this Spin Density Functional formalism (see reviews by Rajagopal, 1980; Kohn and Vashishta, 1982; Dreizler and da Providencia, 1985; Jones and Gunnarsson, 1989), provides a reliable quantitative description of the magnetic properties of transition metal systems at low temperatures (Gunnarsson, 1976; Moruzzi et al., 1978; Koelling, 1981). In this modern version of the Stoner-Wohlfarth theory (Stoner, 1939; Wohlfarth, 1953), the magnetic moments are assumed to originate predominantly from itinerant d electrons. The exchange interaction, as defined above, correlates the spins on a site, thus creating a local moment. In a ferromagnetic metal, these moments are aligned so that the system possesses a finite magnetization per site (see GENERATION AND MEASUREMENT OF MAGNETIC FIELDS, MAGNETIC MOMENT AND MAGNETIZATION,
COMPUTATION AND THEORETICAL METHODS
and THEORY OF MAGNETIC PHASE TRANSITIONS). This theory provides a basis for the observed non-integer moments as well as for the underlying many-electron nature of magnetic moment formation at T = 0 K. Within the approximations inherent in the LSDA, electronic structure (band theory) calculations for the pure crystalline state are routinely performed. Although most include some sort of shape approximation for the charge density and potentials, these calculations give a good representation of the electronic density of states (DOS) of these metals. To calculate the total energy to a precision of a few millielectron volts and to reveal fine details of the charge and moment density, the shape approximation must be eliminated. Better agreement with experiment is found when using extensions of the LSDA. Nonetheless, the LSDA calculations are important in that the ground-state properties of the elements are reproduced to a remarkable degree of accuracy. In the following, we look at a typical LSDA calculation for bcc iron and fcc nickel. Band theory calculations for bcc iron have been done for decades, with the results of Moruzzi et al. (1978) being the first of the more accurate LSDA calculations. The figure on p. 170 of their book (see Literature Cited) shows the electronic density of states (DOS) as a function of the energy. The densities of states for the two spins are almost (but not quite) rigidly shifted with respect to one another. As is typical of bcc structures, the d band has two major peaks. The Fermi energy resides near the top of the d bands for the majority spins, and in the trough between the two uppermost peaks for the minority spins. The iron moment extracted from this first-principles calculation is 2.2 Bohr magnetons per atom, which is in good agreement with experiment.
Further refinements, such as adding the spin-orbit contributions, eliminating the shape approximation of the charge densities and potentials, and modifying the exchange-correlation functional, push the calculations into better agreement with experiment. The equilibrium volume determined within the LSDA is more delicate, the error being about 3%, roughly twice that for a typical nonmagnetic transition metal. The total energy of the ferromagnetic bcc phase was also found to be close to that of the nonmagnetic fcc phase, and only when improvements to the LSDA were incorporated did the calculations correctly find the former phase the more stable. On the whole, the calculated properties for nickel are reproduced to about the same degree of accuracy. As seen in the plot of the DOS on p. 178 of Moruzzi et al. (1978), the Fermi energy lies above the top of the majority-spin d bands, but in the large peak in the minority-spin d bands. The width of the d band has been a matter of a great deal of scrutiny over the years, since the width as measured in photoemission experiments is much smaller than that extracted from band-theory calculations. It is now realized that the experiments measure the energies of various excited states of the metal, whereas the LSDA remains a good theory of the ground state. A more comprehensive theory of the photoemission process has resulted in better, but by no means complete, agreement with experiment. The magnetic moment, a ground-state quantity extracted from such calculations, comes out to be 0.6 Bohr magnetons per atom, close to the experimental measurements. The equilibrium volume and other such
quantities are in good agreement with experiment, i.e., on the same order as for iron. In both cases, the electronic bands, which result from the solution of the one-electron Kohn-Sham Schrödinger equations, are nearly rigidly exchange split. This rigid shift is in accord with the simple picture of Stoner-Wohlfarth theory, which was based on a simple Hubbard model with a single tight-binding d band treated in the Hartree-Fock approximation. The model Hamiltonian is

$$
\hat{H} = \sum_{ij,\sigma} \left( \varepsilon_0^{\sigma} \delta_{ij} + t_{ij}^{\sigma} \right) a^{\dagger}_{i,\sigma} a_{j,\sigma} + \frac{I}{2} \sum_{i,\sigma} a^{\dagger}_{i,\sigma} a_{i,\sigma}\, a^{\dagger}_{i,-\sigma} a_{i,-\sigma} \qquad (2)
$$

in which $a^{\dagger}_{i,\sigma}$ and $a_{i,\sigma}$ are respectively the creation and annihilation operators, $\varepsilon_0^{\sigma}$ a site energy (with spin index σ), $t_{ij}$ a hopping parameter that sets the d-band width, and I the many-body Hubbard parameter representing the intrasite Coulomb interactions. Within the Hartree-Fock approximation, a pair of operators is replaced by its average value, ⟨...⟩, i.e., its quantum-mechanical expectation value. In particular,

$$
a^{\dagger}_{i,\sigma} a_{i,\sigma}\, a^{\dagger}_{i,-\sigma} a_{i,-\sigma} \rightarrow a^{\dagger}_{i,\sigma} a_{i,\sigma} \left\langle a^{\dagger}_{i,-\sigma} a_{i,-\sigma} \right\rangle, \qquad \text{where} \quad \left\langle a^{\dagger}_{i,\sigma} a_{i,\sigma} \right\rangle = \tfrac{1}{2} \left( \bar{n}_i + \bar{m}_i \sigma \right)
$$

On each site, the average particle numbers are $\bar{n}_i = n_{i,+1} + n_{i,-1}$ and the moments are $\bar{m}_i = n_{i,+1} - n_{i,-1}$. Thus the Hartree-Fock Hamiltonian is given by

$$
\hat{H} = \sum_{ij,\sigma} \left[ \left( \varepsilon_0^{\sigma} + \frac{1}{2} I \bar{n}_i - \frac{1}{2} I \bar{m}_i \sigma \right) \delta_{ij} + t_{ij}^{\sigma} \right] a^{\dagger}_{i,\sigma} a_{j,\sigma} \qquad (3)
$$

where $\bar{n}_i$ is the number of electrons on site i and $\bar{m}_i$ the magnetization. The terms $I\bar{n}_i/2$ and $I\bar{m}_i/2$ are the effective potential and magnetic field, respectively. The main omission of this approximation is the neglect of the spin-flip particle-hole excitations and the associated correlations. This rigidly exchange-split band-structure picture is actually valid only for the special cases of the elemental ferromagnetic transition metals Fe, Co, and Ni, in which the d bands are nearly filled, i.e., towards the end of the 3d transition-metal series.
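The Hartree-Fock (Stoner) mean field of equation (3) can be solved self-consistently even in a toy setting. The sketch below iterates the moment equation for a single rectangular d band at T = 0; the bandwidth, filling, and interaction strength I are invented illustrative numbers, not parameters for any real metal:

```python
import numpy as np

W = 5.0         # d-band width (eV), invented
N0 = 5.0 / W    # rectangular DOS per spin (5 d states per spin)
n_d = 7.0       # total d electrons per atom, invented

def occupations(mu, m, I):
    """T = 0 occupations of two rigidly exchange-split rectangular bands;
    the spin-sigma band spans [-W/2 - sigma*I*m/2, W/2 - sigma*I*m/2]."""
    n = {}
    for sigma in (+1, -1):
        bottom = -W / 2.0 - sigma * I * m / 2.0
        n[sigma] = np.clip(N0 * (mu - bottom), 0.0, 5.0)
    return n[+1], n[-1]

def solve_moment(I, tol=1e-10, mix=0.5):
    """Damped fixed-point iteration of m = n_up - n_dn with the chemical
    potential fixed each step by bisection on the total d count."""
    m = 0.01                       # seed a small moment
    for _ in range(10000):
        lo, hi = -W, W
        for _ in range(200):       # bisect for mu at this m
            mu = 0.5 * (lo + hi)
            n_up, n_dn = occupations(mu, m, I)
            if n_up + n_dn < n_d:
                lo = mu
            else:
                hi = mu
        m_new = n_up - n_dn
        if abs(m_new - m) < tol:
            break
        m = (1.0 - mix) * m + mix * m_new
    return m
```

For I·N(εF) < 1 the moment iterates back to zero (the Stoner criterion is not met); for larger I it grows until the majority band fills, a caricature of the strong ferromagnet described in the text.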
Some of the effects that are to be extracted from the electronic structure of the alloys can be gauged within the framework of simple, single-d-band, tight-binding models. In the middle of the series, the metals Cr and Mn are antiferromagnetic; those at the end, Fe, Co, and Ni, are ferromagnetic. This trend can be understood from a band-filling point of view. It has been shown (e.g., Heine and Samson, 1983) that the exchange splitting in a nearly filled tight-binding d band lowers the system's energy and hence promotes ferromagnetism. On the other hand, the imposition of an exchange field that alternates in sign from site to site in the crystal lattice lowers the energy of a system with a half-filled d band, and hence drives antiferromagnetism. In the alloy analogy, almost-filled bands lead to phase separation, i.e., k = 0 ordering; half-filled bands lead to ordering with a zone-boundary wavevector. This latter case is the analog of antiferromagnetism. Although the electronic structure of SDF theory, which provides such good quantitative estimates of the magnetic properties of metals when compared to experimentally measured values, is somewhat more complicated than this, the gross features can be usefully discussed in this manner.
MAGNETISM IN ALLOYS
Another aspect of the calculations on pure magnetic metals (reviewed comprehensively by Moruzzi and Marcus, 1993, for example) that will prove topical for the discussion of 3d metallic alloys is the variation of the magnetic properties of the 3d metals as the crystal lattice spacing is altered. Moruzzi et al. (1986) have carried out a systematic study of this phenomenon with their "fixed spin moment" (FSM) scheme. The most striking example is iron on an fcc lattice (Bagayoko and Callaway, 1983). The total energy of fcc Fe is found to have a global minimum for the nonmagnetic state and a lattice spacing of 6.5 atomic units (a.u.), or 3.44 Å. However, for an increased lattice spacing of 6.86 a.u., or 3.63 Å, the energy is minimized for a ferromagnetic state, with a magnetization per site m ≈ 1 μB (Bohr magneton). With a marginal expansion of the lattice from this point, the ferromagnetic state strengthens, with m ≈ 2.4 μB. These trends have also been found by LMTO calculations for non-collinear magnetic structures (Mryasov et al., 1992). There is a hint, therefore, that the magnetic properties of fcc iron alloys are likely to be connected to the alloy's equilibrium lattice spacing, and vice versa. Moreover, these properties are sensitive to both thermal expansion and applied pressure. This apparently is the origin of the "low spin-high spin" picture frequently cited in the many discussions of iron Invar alloys (Wassermann, 1991). In their review article, Moruzzi and Marcus (1993) have also summarized calculations on other 3d metals, noting similar connections between magnetic structure and lattice spacing. As the lattice spacing is increased beyond the equilibrium value, the electronic bands narrow, and thus the magnetic tendencies are enhanced. More discussion of this aspect is included with respect to Fe-Ni alloys, below.
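The FSM scheme evaluates the total energy with the moment held fixed and then scans the constrained moment. A hedged illustration on an invented Landau-type energy surface (not a DFT energy; every coefficient here is made up solely to mimic the low-spin/high-spin behavior described above):

```python
import numpy as np

def fsm_energy(m, a):
    """Toy constrained-moment energy E(m; a): an even Landau expansion whose
    quadratic coefficient depends on the lattice parameter a (in angstroms).
    All coefficients are invented to mimic low-spin/high-spin fcc Fe."""
    c2 = 0.5 * (3.58 - a)        # quadratic term softens as the lattice expands
    return c2 * m**2 - 0.05 * m**4 + 0.01 * m**6

def fsm_minimum(a, m_grid=np.linspace(0.0, 3.0, 3001)):
    """Moment that minimizes the constrained energy at lattice parameter a,
    found by a simple grid scan, as in an FSM total-energy sweep."""
    return m_grid[np.argmin(fsm_energy(m_grid, a))]
```

On this surface the minimizing moment is zero at the smaller spacing, jumps to a finite value once the lattice is expanded, and then grows with further expansion, which is the qualitative behavior the FSM calculations display for fcc Fe.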
We now consider methods used to calculate the spin-polarized electronic structure of the ferromagnetic 3d transition metals when they are alloyed with other metallic components. Later we will see the effects on the magnetic properties of these materials where, once again, the rigidly split band structure picture is an inappropriate starting point.

Solid-Solution Alloys

The self-consistent Korringa-Kohn-Rostoker coherent-potential approximation (KKR-CPA; Stocks et al., 1978; Stocks and Winter, 1982; Johnson et al., 1990) is a mean-field adaptation of the LSDA to systems with substitutional disorder, such as solid-solution alloys, and has been discussed in COMPUTATION OF DIFFUSE INTENSITIES IN ALLOYS. To describe the theory, we begin by recalling what the SDFT-LDA means for random alloys. The straightforward but computationally intractable track along which one could proceed involves solving the usual self-consistent Kohn-Sham single-electron equations for all configurations, and averaging the relevant expectation values over the appropriate ensemble of configurations to obtain the desired observables. To be specific, we introduce an occupation variable, xi, which takes on the value 1 if there is an A atom at the lattice site i, or 0 if the site is occupied by a B atom. To specify a configuration, we must
then assign a value to these variables xi at each site. Each configuration can be fully described by a set of these variables {xi}. For an atom of type a on site k, the potential and magnetic field that enter the Kohn-Sham equations are not independent of its surroundings and depend on all the occupation variables, i.e., Vk,a(r, {xi}), Bk,a(r, {xi}). To find the ensemble average of an observable, for each configuration we must first solve (self-consistently) the Kohn-Sham equations. Then, for each configuration, we are able to calculate the relevant quantity. Finally, by summing these results, weighted by the correct probability factor, we find the required ensemble average. It is impossible to implement the above sequence of calculations as described, and the KKR-CPA was invented to circumvent these computational difficulties. The first premise of this approach is that the occupation of a site, by an A atom or a B atom, is independent of the occupants of any other site. This means that we neglect short-range order for the purposes of calculating the electronic structure and approximate the solid solution by a random substitutional alloy. A second premise is that we can invert the order of solving the Kohn-Sham equations and averaging over atomic configurations, i.e., find a set of Kohn-Sham equations that describe an appropriate "average" medium. The first step is to replace, in the spirit of a mean-field theory, the local potential function Vk,a(r, {xi}) and magnetic field Bk,a(r, {xi}) with Vk,a(r) and Bk,a(r), the averages over all the occupation variables except the one referring to the site k, at which the occupying atom is known to be of type a. The motion of an electron, on the average, through a lattice of these potentials and magnetic fields, randomly distributed with probability c that a site is occupied by an A atom and 1 − c that it is occupied by a B atom, is obtained from the solution of the Kohn-Sham equations using the CPA (Soven, 1967).
Here, a lattice of identical effective potentials and magnetic fields is constructed such that the motion of an electron through this ordered array closely resembles the motion of an electron, on the average, through the disordered alloy. The CPA determines the effective medium by insisting that the substitution of a single site of the CPA lattice by either an A or a B atom produces no further scattering of the electron on the average. It is then possible to develop a spin density functional theory and calculational scheme in which the partially averaged electronic densities, nA(r) and nB(r), the magnetization densities, mA(r) and mB(r), associated with the A and B sites respectively, the total energies, and other equilibrium quantities are evaluated (Stocks and Winter, 1982; Johnson et al., 1986, 1990; Johnson and Pinski, 1993). The data from both x-ray and neutron scattering in solid solutions show the existence of Bragg peaks, which define an underlying "average" lattice (see Chapters 10 and 13). This symmetry is evident in the average electronic structure given by the CPA. The Bloch wave vector is still a useful quantum number, but the average Bloch states also have a finite lifetime as a consequence of the disorder. Probably the strongest evidence for the accuracy of the calculated electron lifetimes (and velocities) are the results for the residual resistivity of Ag-Pd alloys (Butler and Stocks, 1984; Swihart et al., 1986).
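The single-site CPA condition, that replacing one site of the effective medium by an A or a B atom produces no extra scattering on average, reduces in the simplest scalar model to one complex fixed-point equation. The following sketch treats a binary alloy on a Bethe lattice, whose pure band has a semicircular DOS; all parameters are invented, and this is a cartoon of the idea, not the full KKR-CPA:

```python
import numpy as np

D = 1.0                      # half-bandwidth of the pure semicircular band
eps_A, eps_B = 0.4, -0.4     # site energies of the two species (toy values)
c = 0.5                      # concentration of A atoms

def cpa_green(E, eta=1e-3, n_iter=2000, mix=0.3):
    """Configurationally averaged Green's function G(E + i*eta) from the
    single-site CPA for a binary alloy on a Bethe lattice.  The effective
    medium is fixed by the damped iteration
        G <- c/(z - eps_A - (D/2)**2 G) + (1-c)/(z - eps_B - (D/2)**2 G),
    i.e., embedding an A or B site in the medium and averaging."""
    z = E + 1j * eta
    G = 1.0 / z              # free-atom starting guess
    for _ in range(n_iter):
        G_new = (c / (z - eps_A - (D / 2.0) ** 2 * G)
                 + (1.0 - c) / (z - eps_B - (D / 2.0) ** 2 * G))
        if abs(G_new - G) < 1e-12:
            break
        G = (1.0 - mix) * G + mix * G_new
    return G

# averaged density of states:  rho(E) = -Im G(E) / pi
```

The averaged DOS from this G integrates to one state per site (per spin) and broadens, or splits into two subbands, as |eps_A − eps_B| grows relative to the bandwidth, the split-band versus common-band behavior discussed below.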
The electron movement through the lattice can be described using multiple-scattering theory, a Green's-function method that is sometimes called the Korringa-Kohn-Rostoker (KKR) method. In this merger of multiple-scattering theory with the coherent potential approximation (CPA), the ensemble-averaged Green's function is calculated, its poles defining the averaged energy eigenvalue spectrum. For systems without disorder, such energy eigenvalues can be labeled by a Bloch wavevector, k, are real, and thus can be related to states with a definite momentum and infinite lifetimes. The KKR-CPA method provides a solution for the averaged electronic Green's function in the presence of a random placement of potentials, corresponding to the random occupation of the lattice sites. The poles now occur at complex values; k is usually still a useful quantum number, but in the presence of this disorder the (discrete) translational symmetry is not perfect, and electrons in these states are scattered as they traverse the lattice. The useful result of the KKR-CPA method is that it provides a configurationally averaged Green's function, from which the ensemble averages of various observables can be calculated (Faulkner and Stocks, 1980). Recently, super-cell versions of approximate ensemble averaging have been explored, owing to advances in computers and algorithms (Faulkner et al., 1997). Strictly speaking, however, such averaging is limited by the size of the cell and the shape approximation for the potentials and charge density. Several interesting results have been obtained from such an approach (Abrikosov et al., 1995; Faulkner et al., 1998). Neither the single-site CPA nor the super-cell approach is exact; they give complementary information about the electronic structure in alloys.
Alloy Electronic Structure and Slater-Pauling Curves

Before the reasons for the loss of the conventional Stoner picture of rigidly exchange-split bands can be laid out, we describe some typical features of the electronic structure of alloys. A great deal has been written on this subject, which demonstrates clearly how these features are also connected with the phase stability of the system. An insight into this subject can be gained from many books and articles (Johnson et al., 1987; Pettifor, 1995; Ducastelle, 1991; Gyorffy et al., 1989; Connolly and Williams, 1983; Zunger, 1994; Staunton et al., 1994). Consider two elemental d-electron densities of states, each with approximate width W, one centered at energy εA, the other at εB, related to atomic-like d-energy levels. If |εA − εB| ≫ W, then the alloy's density of states will be "split band" in nature (Stocks et al., 1978) and, in Pettifor's language, an ionic bond is established as charge flows from the A atoms to the B atoms in order to equilibrate the chemical potentials. The virtual bound states associated with impurities in metals are rough examples of split-band behavior. On the other hand, if |εA − εB| ≲ W, then the alloy's electronic structure can be categorized as "common-band"-like. Large-scale hybridization now occurs between states associated with the A and B atoms. Each site in the alloy is nearly charge-neutral as an individual
ion is efficiently screened by the metallic response function of the alloy (Ziman, 1964). Of course, the actual interpretation of the detailed electronic structure involving many bands is often a complicated mixture of these two models. In either case, half-filling of the bands lowers the total energy of the system as compared to the phase-separated case (Heine and Samson, 1983; Pettifor, 1995; Ducastelle, 1991), and an ordered alloy will form at low temperatures. When magnetism is added to the problem, an extra ingredient appears, namely the difference between the exchange fields associated with the two types of atomic species. For the majority-spin electrons, a rough measure of the degree of "split-band" or "common-band" nature of the density of states is given by (ε↑A − ε↑B)/W, and a similar measure, (ε↓A − ε↓B)/W, applies for the minority-spin electrons. If the exchange fields differ to any large extent, then for electrons of one spin polarization the bands are common-band-like, while for the other a "split-band" label may be more appropriate. The outcome is a spin-polarized electronic structure that cannot be described by a rigid exchange splitting. Hund's rules dictate that it is frequently energetically favorable for the majority-spin d states to be fully occupied. In many cases, at the cost of a small charge transfer, this is accomplished. Nickel-rich nickel-iron alloys provide such examples (Staunton et al., 1987), as shown in Figure 1. A schematic energy level diagram is shown in Figure 2. One of the first tasks of theories or explanations based on electronic structure calculations is to provide a simple explanation of why the average magnetic moments per atom, M, of so many alloys fall on the famous Slater-Pauling curve when plotted against the alloys' valence electron per atom ratio. The usual Slater-Pauling curve for the 3d row (Chikazumi, 1964) consists of two straight lines.
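The electron counting behind the two straight-line branches of the Slater-Pauling curve is simple enough to script. In the sketch below, the relations M = 10 − Z + 2N↑sp (filled majority d band) and M = Z − 6 − 2N↓sp (minority d occupation pinned near three) follow the discussion in this section; the numerical sp occupations passed in are illustrative guesses only, not fitted values:

```python
def moment_strong_fm(Z, n_sp_up):
    """Negative-gradient branch: filled majority d band (N_d_up = 5), so
    M = 2*N_up - Z = 10 - Z + 2*n_sp_up  (in Bohr magnetons per atom)."""
    return 10.0 - Z + 2.0 * n_sp_up

def moment_pinned_minority(Z, n_sp_dn):
    """Positive-gradient branch: chemical potential pinned so N_d_dn ~ 3, so
    M = Z - 2*N_d_dn - 2*n_sp_dn = Z - 6 - 2*n_sp_dn."""
    return Z - 6.0 - 2.0 * n_sp_dn

# e.g., for Ni (Z = 10) with roughly 0.3 majority sp electrons the first
# branch gives a non-integer moment of about 0.6 Bohr magnetons per atom
```

The two functions reproduce the slopes of −1 and +1 of the two segments when M is plotted against Z with the sp occupations held fixed.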
The plot rises from the beginning of the 3d row, abruptly changes the sign of its gradient, and then drops smoothly to zero at the end of the row. There are some important groups of compounds and alloys whose parameters do not fall on this line but, for these systems also, there appears to be some simple pattern. For those ferromagnetic alloys of late transition metals characterized by completely filled majority-spin d states, it is easy to see why they are located on the negative-gradient straight line. The magnetization per atom is M = N↑ − N↓, where N↑(↓) describes the occupation of the majority (minority) spin states; this can be trivially re-expressed in terms of the number of electrons per atom, Z, so that M = 2N↑ − Z. The occupation of the s and p states changes very little across the 3d row, and thus M = 2N↑d − Z + 2N↑sp, which, with the majority d states filled (N↑d = 5), gives M = 10 − Z + 2N↑sp. Many other systems, most commonly bcc-based alloys, are not strong ferromagnets in this sense of filled majority-spin d bands, but possess a similar attribute: the chemical potential (or Fermi energy at T = 0 K) is pinned in a deep valley in the minority-spin density of states (Johnson et al., 1987; Kubler, 1984). Pure bcc iron itself is a case in point, the chemical potential sitting in a trough in the minority-spin density of states (Moruzzi et al., 1978, p. 170). Figure 1B shows another example in an iron-rich iron-vanadium alloy. The other major segment of the Slater-Pauling curve, a positive-gradient straight line, can be explained by
Figure 2. (A) Schematic energy level diagram for Ni-Fe alloys. (B) Schematic energy level diagram for Fe-V Alloys.
Figure 1. (A) The electronic density of states for ferromagnetic Ni75Fe25. The upper half displays the density of states for the majority-spin electrons; the lower half, for the minority-spin electrons. Note, in the lower half, the axis for the abscissa is inverted. These curves were calculated within the SCF-KKR-CPA; see Johnson et al. (1987). (B) The electronic density of states for ferromagnetic Fe87V13. The upper half displays the density of states for the majority-spin electrons; the lower half, for the minority-spin electrons. Note, in the lower half, the axis for the abscissa is inverted. These curves were calculated within the SCF-KKR-CPA (see Johnson et al., 1987).
using this feature of the electronic structure. The pinning of the chemical potential in a trough of the minority-spin d density of states constrains N↓d to be roughly three in all these alloys. In this circumstance the magnetization per atom is M = Z − 2N↓d − 2N↓sp = Z − 6 − 2N↓sp. Further discussion of this topic is given by Malozemoff et al. (1984), Williams et al. (1984), Kubler (1984), Gubanov et al. (1992), and others. Later in this unit, to illustrate some of the remarks made, we will describe electronic structure calculations of three compositionally disordered alloys, together with the ramifications for the understanding of their properties.

Competitive and Related Techniques: Beyond the Local Spin-Density Approximation

Over the past few years, improved approximations for Exc have been developed which maintain all the best features
of the local approximation. A stimulus has been the work of Langreth and Mehl (1981, 1983), who supplied corrections to the local approximation in terms of the gradient of the density. Hu and Langreth (1986) have specified a spin-polarized generalization. Perdew and co-workers (Perdew and Yue, 1986; Wang and Perdew, 1991) contributed several improvements by ensuring that the generalized gradient approximation (GGA) functional satisfies some relevant sum rules. Calculations of the ground-state properties of ferromagnetic iron and nickel were carried out (Bagno et al., 1989; Singh et al., 1991; Haglund, 1993) and compared to LSDA values. The theoretically estimated lattice constants from these calculations are slightly larger and are therefore more in line with the experimental values. When the GGA is used instead of the LSDA, one removes a major embarrassment for LSDA calculations: nonmagnetic fcc iron is no longer energetically stable over ferromagnetic bcc iron. Further applications of the SDFT-GGA include one on the magnetic and cohesive properties of manganese in various crystal structures (Asada and Terakura, 1993) and another on the electronic and magnetic structure of the ordered B2 FeCo alloy (Liu and Singh, 1992). In addition, Perdew et al. (1992) have presented a comprehensive study of the GGA for a range of systems and have also given a review of the GGA (Perdew et al., 1996; Ernzerhof et al., 1996). Notwithstanding the remarks made above, SDF theory within the local spin-density approximation (LSDA) provides a good quantitative description of the low-temperature properties of magnetic materials containing simple and transition metals, which are the main interests of this unit, and the Kohn-Sham electronic structure also gives a reasonable description of the quasi-particle
spectral properties of these systems. But it is not nearly so successful in its treatment of systems where some states are fairly localized, such as many rare-earth systems (Brooks and Johansson, 1993) and Mott insulators. Much work is currently being carried out to address the shortcomings found for these fascinating materials. Anisimov et al. (1997) noted that in exact density functional theory, the derivative of the total energy with respect to the number of electrons, ∂E/∂N, should have discontinuities at integral values of N, and that therefore the effective one-electron potential of the Kohn-Sham equations should also possess appropriate discontinuities. They therefore added an orbital-dependent correction to the usual LDA potentials and achieved an adequate description of the photoemission spectrum of NiO. As an example of other work in this area, Severin et al. (1993) have carried out self-consistent electronic structure calculations of rare-earth(R)-Co2 and R-Co2H4 compounds within the LDA, but in which the effect on the conduction band of the localized open 4f shell associated with the rare-earth atoms was treated by constraining the number of 4f electrons to be fixed. Brooks et al. (1997) have extended this work, describing crystal-field quasiparticle excitations in rare-earth compounds and extracting parameters for effective spin Hamiltonians. Another approach related to this constrained LSDA theory is the so-called "LSDA+U" method (Anisimov et al., 1997), which is also used to account for the orbital dependence of the Coulomb and exchange interactions in strongly correlated electronic materials. It has been recognized for some time that some of the shortcomings of the LDA in describing the ground-state properties of some strongly correlated systems may be due to an unphysical interaction of an electron with itself (Jones and Gunnarsson, 1989).
If the exact form of the exchange-correlation functional Exc were known, this self-interaction would be exactly canceled. In the LDA, the cancellation is not perfect. Several efforts improve the cancellation by incorporating a self-interaction correction (SIC; Perdew and Zunger, 1981; Pederson et al., 1985). Using a cluster technique, Svane and Gunnarsson (1990) applied the SIC to transition metal oxides, where the LDA is known to be particularly defective and where the GGA does not bring any significant improvement. They found that this new approach corrected some of the major discrepancies. Similar improvements were noted by Szotek et al. (1993) in an LMTO implementation in which the occupied and unoccupied states were split by a large on-site Coulomb interaction. For Bloch states extending throughout the crystal, the SIC is small and the LDA is adequate; for localized states, however, the SIC becomes significant. SIC calculations have been carried out for La2CuO4, the parent compound of the high-Tc superconducting ceramics (Temmerman et al., 1993), and have been used to explain the γ-α transition in the strongly correlated metal cerium (Szotek et al., 1994; Svane, 1994; Beiden et al., 1997). Spin density functional theory within the local exchange and correlation approximation also has some serious shortcomings when straightforwardly extended to finite temperatures and applied to itinerant magnetic
materials of all types. In the following section, we discuss ways in which improvements to the theory have been made.

Magnetism at Finite Temperatures: The Paramagnetic State

As long ago as 1965, Mermin (1965) published the formal structure of a finite-temperature density functional theory. Once again, one considers a many-electron system in an external potential, V_ext(r), and an external magnetic field, B_ext(r), described by the (non-relativistic) Hamiltonian. Mermin proved that, in the grand canonical ensemble at a given temperature T and chemical potential ν, the equilibrium particle density n(r) and magnetization density m(r) are determined by the external potential and magnetic field. The correct equilibrium particle and magnetization densities minimize the Gibbs grand potential,

$$
\Omega[n,\mathbf{m}] = \int V^{\rm ext}(\mathbf{r})\, n(\mathbf{r})\, d\mathbf{r} - \int \mathbf{B}^{\rm ext}(\mathbf{r}) \cdot \mathbf{m}(\mathbf{r})\, d\mathbf{r} + \frac{e^2}{2} \iint \frac{n(\mathbf{r})\, n(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\, d\mathbf{r}\, d\mathbf{r}' + G[n,\mathbf{m}] - \nu \int n(\mathbf{r})\, d\mathbf{r} \qquad (4)
$$

where G is a unique functional of the charge and magnetization densities at a given T and ν. The variational principle now states that Ω is a minimum for the equilibrium n and m. The functional G can be written as

$$
G[n,\mathbf{m}] = T_s[n,\mathbf{m}] - T S_s[n,\mathbf{m}] + \Omega_{\rm xc}[n,\mathbf{m}] \qquad (5)
$$

with T_s and S_s being, respectively, the kinetic energy and entropy of a system of noninteracting electrons with densities n and m at a temperature T. The exchange and correlation contribution to the Gibbs free energy is Ω_xc. The minimum principle can be shown to be identical to that for a system of noninteracting electrons moving in an effective potential

$$
\tilde{V}[n,\mathbf{m}] = \left( V^{\rm ext} + e^2 \int \frac{n(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\, d\mathbf{r}' + \frac{\delta \Omega_{\rm xc}}{\delta n(\mathbf{r})} \right) \tilde{1} + \left( \mathbf{B}^{\rm ext} + \frac{\delta \Omega_{\rm xc}}{\delta \mathbf{m}(\mathbf{r})} \right) \cdot \tilde{\boldsymbol{\sigma}} \qquad (6)
$$

whose single-particle states satisfy the following set of equations:

$$
\left( -\frac{\hbar^2}{2m} \nabla^2\, \tilde{1} + \tilde{V} \right) \phi_i(\mathbf{r}) = \varepsilon_i\, \phi_i(\mathbf{r}) \qquad (7)
$$

$$
n(\mathbf{r}) = \sum_i f(\varepsilon_i - \nu)\, {\rm tr} \left[ \phi_i^{\dagger}(\mathbf{r})\, \phi_i(\mathbf{r}) \right] \qquad (8)
$$

$$
\mathbf{m}(\mathbf{r}) = \sum_i f(\varepsilon_i - \nu)\, {\rm tr} \left[ \phi_i^{\dagger}(\mathbf{r})\, \tilde{\boldsymbol{\sigma}}\, \phi_i(\mathbf{r}) \right] \qquad (9)
$$

where f(ε − ν) is the Fermi-Dirac function. Rewriting Ω as

$$
\Omega = -\int f(\varepsilon - \nu)\, N(\varepsilon)\, d\varepsilon - \frac{e^2}{2} \iint \frac{n(\mathbf{r})\, n(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\, d\mathbf{r}\, d\mathbf{r}' + \Omega_{\rm xc} - \int \left( \frac{\delta \Omega_{\rm xc}}{\delta n(\mathbf{r})}\, n(\mathbf{r}) + \frac{\delta \Omega_{\rm xc}}{\delta \mathbf{m}(\mathbf{r})} \cdot \mathbf{m}(\mathbf{r}) \right) d\mathbf{r} \qquad (10)
$$

involves a sum over the effective single-particle states, since the integrated density of states N(ε) is a sum of unit steps at the eigenvalues ε_i. Here tr represents the trace over the components of the Dirac spinors, which are represented by φ_i(r), with conjugate transpose φ†_i(r). The nonmagnetic part of the potential is diagonal in this spinor space, being proportional to the 2 × 2 unit matrix, 1̃. The Pauli spin matrices σ̃ provide the coupling between the components of the spinors, and thus give rise to the spin-orbit terms in the Hamiltonian. Formally, the exchange-correlation part of the Gibbs free energy can be expressed in terms of spin-dependent pair correlation functions (Rajagopal, 1980), specifically

$$
\Omega_{\rm xc}[n,\mathbf{m}] = \frac{e^2}{2} \iint \sum_{s,s'} \int_0^1 d\lambda\; g_{\lambda}(s, s'; \mathbf{r}, \mathbf{r}')\, \frac{n_s(\mathbf{r})\, n_{s'}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\, d\mathbf{r}\, d\mathbf{r}' \qquad (11)
$$

The next logical step in the implementation of this theory is to form the finite-temperature extension of the local approximation (LSDA) in terms of the exchange-correlation part of the Gibbs free energy of a homogeneous electron gas. This assumption, however, severely underestimates the effects of the thermally induced spin-wave excitations: the calculated Curie temperatures are much too high (Gunnarsson, 1976), local moments do not exist in the paramagnetic state, and the uniform static paramagnetic susceptibility does not follow the Curie-Weiss behavior seen in many metallic systems. Part of the pair correlation function g_λ(s, s′; r, r′) is related by the fluctuation-dissipation theorem to the magnetic susceptibilities that contain the information about these excitations. These spin fluctuations interact with each other as the temperature is increased; Ω_xc should then deviate significantly from the local approximation, and, as a consequence, the effective single-electron states are modified. Over the past decade or so, many attempts have been made to model the effects of the spin fluctuations while maintaining the spin-polarized single-electron basis, and hence to describe the properties of magnetic metals at finite temperatures. Evidently, the straightforward extension of spin-polarized band theory to finite temperatures misses the dominant thermal fluctuations of the magnetization; the thermally averaged magnetization, M, can vanish only along with the "exchange splitting" of the electronic bands (which is destroyed by particle-hole, "Stoner," excitations across the Fermi surface). An important piece of this neglected component can be pictured as orientational fluctuations of "local moments," which are the magnetizations within each unit cell of the underlying crystalline lattice and are set up by the collective behavior of all the electrons.
At low temperatures, these effects have their origins in the transverse part of the magnetic susceptibility. Another related ingredient involves the fluctuations in the magnitudes of these "moments," and the concomitant charge fluctuations, which are connected with the longitudinal magnetic response at low temperatures. The magnetization M now vanishes as the disorder of the "local moments" grows. From this broad consensus (Moriya, 1981), several approaches exist, which differ only according to the aspects of the fluctuations deemed to be the most important for the materials studied.

Competitive and Related Techniques: Fluctuating "Local Moments"

Some fifteen years ago, work on the ferromagnetic 3d transition metals, Fe, Co, and Ni, could be roughly partitioned into two categories. In the main, the Stoner excitations were neglected, and the orientations of the "local moments," which were assumed to have fixed magnitudes independent of their orientational environment, corresponded to the degrees of freedom over which one thermally averaged. Firstly, the picture of Fluctuating Local Band (FLB) theory was constructed (Korenman et al., 1977a,b,c; Capellman, 1977; Korenman, 1985), which included a large amount of short-range magnetic order in the paramagnetic phase. Large spatial regions contained many atoms, each with its own moment. These moments had sizes equivalent to the magnetization per site in the ferromagnetic state at T = 0 K and were assumed to be nearly aligned, so that their orientations vary gradually. In such a state, the usual spin-polarized band theory can be applied, and the consequences of the gradual change in the orientations can be added perturbatively. Quasielastic neutron scattering experiments (Ziebeck et al., 1983) on the paramagnetic phases of Fe and Ni, later reproduced by Shirane et al. (1986), were given a simple though not uncontroversial (Edwards, 1984) interpretation in terms of this picture. In the case of inelastic neutron scattering, however, even the basic observations were controversial, let alone their interpretation in terms of the "spin waves" above Tc that may be present in such a model. Realistic calculations (Wang et al., 1982) in which the magnetic and electronic structures are mutually consistent are difficult to perform. Consequently, examining the full implications of the FLB picture, and making systematic improvements to it, has not made much headway. The second type of approach is labeled the "disordered local moment" (DLM) picture (Hubbard, 1979; Hasegawa, 1979; Edwards, 1982; Liu, 1978).
Here, the local moment entities associated with each lattice site are commonly assumed (at the outset) to fluctuate independently, with an apparent total absence of magnetic short-range order (SRO). Early work was based on the Hubbard Hamiltonian. The procedure had the advantage of being fairly straightforward and more specific than in the case of FLB theory. Many calculations were performed which gave a reasonable description of experimental data. Its drawbacks were its simple parameter-dependent basis and the fact that it could not provide a realistic description of the electronic structure, which must support the important magnetic fluctuations. The dominant mechanisms therefore might not be correctly identified. Furthermore, it is difficult to improve this approach systematically. Much work has focused on the paramagnetic state of body-centered cubic iron. It is generally agreed that "local moments" exist in this material at all temperatures, although the relevance of a Heisenberg Hamiltonian to a description of their behavior has been debated in depth. In suitable limits, both the FLB and DLM approaches can be cast into a form from which an effective classical Heisenberg Hamiltonian can be extracted,
Σ_{ij} J_ij ê_i · ê_j   (12)
COMPUTATION AND THEORETICAL METHODS

The "exchange interaction" parameters J_ij are specified in terms of the electronic structure, owing to the itinerant nature of the electrons in this metal. In the former FLB model, the lattice Fourier transform of the J_ij's,

J(q) = Σ_{ij} J_ij (exp(iq · R_ij) − 1)   (13)
is equal to A v q², where v is the unit cell volume and A is the Bloch wall stiffness, itself proportional to the spin-wave stiffness constant D (Wang et al., 1982). Unfortunately, the J_ij's determined from this approach turn out to be too short ranged to be consistent with the initial assumption of substantial magnetic SRO above Tc. In the DLM model for iron, the interactions J_ij can be obtained from consideration of the energy of an interacting electron system in which the local moments are constrained to be oriented along directions ê_i and ê_j on sites i and j, averaging over all the possible orientations on the other sites (Györffy et al., 1985; Oguchi et al., 1983), albeit in some approximate way. The J_ij's calculated in this way are suitably short ranged, and a mutual consistency between the electronic and magnetic structures can be achieved. A scenario between these two limiting cases has been proposed (Heine and Joynt, 1988; Samson, 1989). This was also motivated by the apparent substantial magnetic SRO above Tc in Fe and Ni, deduced from neutron scattering data, and emphasized how the orientational magnetic disorder involves a balance in the free energy between energy and entropy. This balance is delicate, and it was shown that it is possible for the magnetic and electronic structures to disorder on a scale coarser than the atomic spacing. The length scale is, however, not as large as that initially proposed by the FLB theory.
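The quadratic small-q behaviour of the lattice Fourier transform quoted above can be checked with a toy numerical evaluation of Eq. (13). The bcc geometry and the exponentially damped coupling J(R) used here are illustrative assumptions only, not quantities from the FLB or DLM calculations; a sketch:

```python
import numpy as np

# Toy evaluation of Eq. (13), J(q) = sum_ij J_ij (exp(i q.R_ij) - 1), on a bcc
# lattice.  J0 and the decay length r0 are hypothetical illustration values.
a = 2.87            # bcc lattice constant (Angstrom, Fe-like)
J0, r0 = 10.0, a    # assumed coupling strength (meV) and exponential decay

# neighbour vectors R_ij out to a few shells (bcc = cubic cell + body centre)
pts = []
for i in range(-3, 4):
    for j in range(-3, 4):
        for k in range(-3, 4):
            for basis in (np.zeros(3), np.array([0.5, 0.5, 0.5])):
                R = a * (np.array([i, j, k], float) + basis)
                r = np.linalg.norm(R)
                if 1e-9 < r < 3.0 * a:
                    pts.append(R)
pts = np.array(pts)
Jr = J0 * np.exp(-np.linalg.norm(pts, axis=1) / r0)

def J_of_q(q):
    """J(q) of Eq. (13); real because the lattice has inversion symmetry."""
    return float(np.sum(Jr * (np.cos(pts @ q) - 1.0)))

# near q = 0 the sum behaves quadratically, J(q) ~ const * q^2, which is the
# A v q^2 form quoted in the text: halving q should quarter J(q)
ratio = J_of_q(np.array([0.01, 0.0, 0.0])) / J_of_q(np.array([0.005, 0.0, 0.0]))
```

Halving the wavevector reduces J(q) by a factor of four to within a fraction of a percent, confirming the quadratic form; the coefficient plays the role of A v in the text.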
MAGNETISM IN ALLOYS

"First-Principles" Theories

These pictures can be put onto a "first-principles" basis by grafting the effects of these orientational spin fluctuations onto SDF theory (Györffy et al., 1985; Staunton et al., 1985; Staunton and Györffy, 1992). This is achieved by making the assumption that it is possible to identify and to separate fast and slow motions. On a time scale long in comparison with an electronic hopping time but short when compared with a typical spin fluctuation time, the spin orientations of the electrons leaving a site are sufficiently correlated with those arriving so that a non-zero magnetization exists when the appropriate quantity is averaged on this time scale. These are the "local moments," which can change their orientations {ê_i} slowly with respect to this time scale, whereas their magnitudes {m_i({ê_j})} fluctuate rapidly. Note that, in principle, the magnitude of a moment on a site depends on its orientational environment. The standard SDF theory for studying electrons in spin-polarized metals can be adapted to describe the states of the system for each orientational configuration {ê_i}, in a similar way as in the case of noncollinear magnetic systems (Uhl et al., 1992; Sandratskii and Kubler, 1993; Sandratskii, 1998). Such a description holds the possibility to yield the magnitudes of the local moments m_k = m_k({ê_j}) and the electronic grand potential Ω({ê_i}) for the constrained system.

In the implementation of this theory, the moments for bcc Fe and fictitious bcc Co are fairly independent of their orientational environment, whereas for those in fcc Fe, Co, and Ni, the moments are further away from being local quantities. The long-time averages can be replaced by ensemble averages with the Gibbsian measure P({ê_j}) = e^{−βΩ({ê_j})}/Z, where the partition function is

Z = ∏_i ∫ dê_i e^{−βΩ({ê_j})}   (14)

where β is the inverse of kB T, with Boltzmann's constant kB. The thermodynamic free energy, which accounts for the entropy associated with the orientational fluctuations as well as the creation of electron-hole pairs, is given by F = −kB T ln Z. The role of a classical "spin" (local moment) Hamiltonian, albeit a highly complicated one, is played by Ω({ê_i}). By choosing a suitable reference "spin" Hamiltonian Ω_0({ê_i}) and expanding about it using the Feynman-Peierls inequality (Feynman, 1955), an approximation to the free energy is obtained, F ≤ F_0 + ⟨Ω − Ω_0⟩_0 = F̃, with

F_0 = −kB T ln [ ∏_i ∫ dê_i e^{−βΩ_0({ê_i})} ]   (15)

and

⟨X⟩_0 = ∏_i ∫ dê_i X e^{−βΩ_0} / ∏_i ∫ dê_i e^{−βΩ_0} = ∏_i ∫ dê_i P_0({ê_i}) X({ê_i})   (16)

With Ω_0 expressed as

Ω_0 = Σ_i ω_i^(1)(ê_i) + Σ_{i≠j} ω_ij^(2)(ê_i, ê_j) + …   (17)

a scheme is set up that can, in principle, be systematically improved. Minimizing F̃ to obtain the best estimate of the free energy gives ω_i^(1), ω_ij^(2), etc., as expressions involving restricted averages of Ω({ê_i}) over the orientational configurations. A mean-field-type theory, which turns out to be equivalent to a "first principles" formulation of the DLM picture, is established by taking the first term only in the equation above. Although the SCF-KKR-CPA method (Stocks et al., 1978; Stocks and Winter, 1982; Johnson et al., 1990) was developed originally for coping with compositional disorder in alloys, using it in explicit calculations for bcc Fe and fcc Ni gave some interesting results. The average magnitude of the local moments, ⟨m_i({ê_j})⟩_{ê_i} = m_i(ê_i) = m̄, in the paramagnetic phase of iron was 1.91 μB. (The total magnetization is zero since ⟨m_i({ê_j})⟩ = 0.) This value is roughly the same magnitude as the magnetization per atom in
the low-temperature ferromagnetic state. The uniform paramagnetic susceptibility χ(T) followed a Curie-Weiss dependence upon temperature, as observed experimentally, and the estimate of the Curie temperature Tc was found to be 1280 K, also comparing well with the experimental value of 1040 K. In nickel, however, m̄ was found to be zero and the theory reduced to the conventional LDA version of the Stoner model with all its shortcomings. This mean-field DLM picture of the paramagnetic state was improved by including, to some extent, the effects of correlations between the local moments. This was achieved by incorporating the consequences of Onsager cavity fields into the theory (Brout and Thomas, 1967; Staunton and Györffy, 1992). The Curie temperature Tc for Fe is shifted downward to 1015 K, and the theory gives a reasonable description of neutron scattering data (Staunton and Györffy, 1992). This approach has also been generalized to alloys (Ling et al., 1994a,b). A first application to the paramagnetic phase of the "spin-glass" alloy Cu85Mn15 revealed exponentially damped oscillatory magnetic interactions in agreement with extensive neutron scattering data and was also able to identify the underlying electronic mechanisms. An earlier application to fcc Fe showed how the magnetic correlations change from antiferromagnetic to ferromagnetic as the lattice is expanded (Pinski et al., 1986). This study complemented total energy calculations for fcc Fe in both ferromagnetic and antiferromagnetic states at absolute zero for a range of lattice spacings (Moruzzi and Marcus, 1993). For nickel, the theory has the form of the static, high-temperature limit of the approaches of Murata and Doniach (1972), Moriya (1979), and Lonzarich and Taillefer (1985), as well as others, used to describe itinerant ferromagnets.
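The Curie-Weiss behaviour of the paramagnetic susceptibility described above can be illustrated with a minimal mean-field sketch of the DLM idea: a classical unit-vector moment in a Weiss field, whose orientational average gives the Langevin function. This is not the SCF-KKR-CPA calculation; the coupling J0 is a hypothetical number chosen only so that the mean-field Tc lands near 1040 K.

```python
import numpy as np

kB = 8.617e-5    # Boltzmann constant, eV/K
J0 = 0.27        # hypothetical Weiss coupling (eV); Tc_mf = J0/(3 kB) ~ 1044 K

def langevin(x):
    """Classical orientational average <cos theta> of a unit moment."""
    return 1.0 / np.tanh(x) - 1.0 / x if abs(x) > 1e-8 else x / 3.0

def magnetization(T, h, iters=5000):
    """Solve the self-consistency m = L(beta (J0 m + h)) by iteration."""
    beta = 1.0 / (kB * T)
    m = 0.1
    for _ in range(iters):
        m = langevin(beta * (J0 * m + h))
    return m

Tc_mf = J0 / (3.0 * kB)    # mean-field Curie temperature

# paramagnetic susceptibility from a weak probing field; in mean field
# 1/chi = 3 kB (T - Theta), so a linear fit recovers the Weiss temperature
h = 1e-6
Ts = np.array([1.3, 1.5, 1.7]) * Tc_mf
inv_chi = np.array([h / magnetization(T, h) for T in Ts])
slope, intercept = np.polyfit(Ts, inv_chi, 1)
Theta = -intercept / slope     # close to Tc_mf: Curie-Weiss behaviour
```

Below Tc_mf the self-consistency equation has a nonzero solution (a finite average moment); above it the zero-field magnetization vanishes while 1/χ rises linearly with temperature, which is the Curie-Weiss form recovered by the first-principles theory.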
Nickel is still described in terms of exchange-split spin-polarized bands which converge as Tc is approached, but where the spin fluctuations have drastically renormalized the exchange interaction and lowered Tc from 3000 K (Gunnarsson, 1976) to 450 K. The neglect of the dynamical aspects of these spin fluctuations has led to a slight overestimation of this renormalization, but χ(T) again shows Curie-Weiss behavior, as found experimentally, and an adequate description of neutron scattering data is also provided (Staunton and Györffy, 1992). Moreover, recent inverse photoemission measurements (von der Linden et al., 1993) have confirmed the collapse of the "exchange splitting" of the electronic bands of nickel as the temperature is raised towards the Curie temperature, in accord with this Stoner-like picture, although spin-resolved, resonant photoemission measurements (Kakizaki et al., 1994) indicate the presence of spin fluctuations. The above approach is parameter-free, being set up within the confines of SDF theory, and represents a fairly well defined stage of approximation. But there are still some obvious shortcomings in this work (as exemplified by the discrepancy between the theoretically determined and experimentally measured Curie constants). It is worth highlighting the key omission: the neglect of the dynamical effects of the spin fluctuations, as emphasized by Moriya (1981) and others.
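The Feynman-Peierls (Gibbs-Bogoliubov) inequality F ≤ F0 + ⟨Ω − Ω0⟩0 that underpins the variational scheme above is easy to verify numerically on a toy model. Here a two-site Ising pair stands in for the orientational configurations, and the single-site trial Hamiltonian plays the role of the first, mean-field term of Eq. (17); the coupling J, the temperature, and the trial fields are all illustrative choices.

```python
import numpy as np
from itertools import product

beta, J = 1.0, 0.8
states = list(product([-1, 1], repeat=2))   # the four configurations

def H(s):
    """'True' Hamiltonian: a single pair coupling."""
    return -J * s[0] * s[1]

def free_energy(energies):
    z = sum(np.exp(-beta * e) for e in energies)
    return -np.log(z) / beta

F_exact = free_energy([H(s) for s in states])

# single-site trial Hamiltonian H0(s) = -h (s1 + s2); scan the trial field h
gaps = []
for h in np.linspace(-1.5, 1.5, 7):
    e0 = np.array([-h * (s[0] + s[1]) for s in states])
    w = np.exp(-beta * e0); w /= w.sum()          # trial Gibbs weights
    F0 = free_energy(e0)
    F_tilde = F0 + float(np.dot(w, [H(s) for s in states] - e0))
    gaps.append(F_tilde - F_exact)                # must never be negative
```

Whatever trial field is chosen, the variational estimate F̃ stays above the exact free energy; minimizing F̃ over the trial parameters is exactly the step that yields the mean-field (DLM-like) theory in the text.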
Competitive and Related Techniques for a "First-Principles" Treatment of the Paramagnetic States of Fe, Ni, and Co

Uhl and Kubler (1996) have also set up an ab initio approach for dealing with the thermally induced spin fluctuations, and they also treat these excitations classically. They calculate total energies of systems constrained to have spin-spiral {ê_i} configurations with a range of different propagation vectors q of the spiral, polar angles θ, and spiral magnetization magnitudes m, using the non-collinear fixed-spin-moment method. A fit of the energies to an expression involving q, θ, and m is then made. The Feynman-Peierls inequality is also used, with a quadratic form taken for the "reference Hamiltonian" H0. Stoner particle-hole excitations are neglected. The functional integrations involved in the description of the statistical mechanics of the magnetic fluctuations then reduce to Gaussian integrals. Results similar to those of Staunton and Györffy (1992) have been obtained for bcc Fe and for fcc Ni. Uhl and Kubler (1997) have also studied Co and have recently generalized the theory to describe magnetovolume effects. Face-centered cubic Fe and Mn have been studied alongside the "Invar" ordered alloy Fe3Pt. One way of assessing the scope of validity of these sorts of ab initio theoretical approaches, and the severity of the approximations employed, is to compare their underlying electronic bases with suitable spectroscopic measurements.

"Local Exchange Splitting"

An early prediction from a "first principles" implementation of the DLM picture was that a "local exchange" splitting should be evident in the electronic structure of the paramagnetic state of bcc iron (Györffy et al., 1983; Staunton et al., 1985). Moreover, the magnitude of this splitting was expected to vary sharply as a function of wave-vector and energy.
At some wave-vectors, if the "bands" did not vary much as a function of energy, the local exchange splitting would be roughly of the same size as the rigid exchange splitting of the electronic bands of the ferromagnetic state, whereas at other points where the "bands" have greater dispersion, the splitting would vanish entirely. This local exchange splitting is responsible for the local moments. Photoemission (PES) experiments (Kisker et al., 1984, 1985) and inverse photoemission (IPES) experiments (Kirschner et al., 1984) observed these qualitative features. The experiments essentially focused on the electronic structure around the Γ and H points for a range of energies. Both the Γ25′ and Γ12 states were interpreted as being exchange-split, whereas the H25′ state was not, although all were broadened by the magnetic disorder. Among the DLM calculations of the electronic structure for several wave-vectors and energies (Staunton et al., 1985), those for the Γ and H points showed the Γ12 state as split, and both the Γ25′ and H25′ states to be substantially broadened by the local moment disorder, but not locally exchange-split. Haines et al. (1985, 1986) used a tight-binding model to describe the electronic structure, and employed the recursion method to average over various orientational configurations. They concluded that a
modest degree of SRO is compatible with spectroscopic measurements of the Γ25′ d state in paramagnetic iron. More extensive spectroscopic data on the paramagnetic states of the ferromagnetic transition metals would be invaluable in developing the theoretical work on the important spin fluctuations in these systems. As emphasized in the introduction to this unit, the state of magnetic order in an alloy can have a profound effect upon various other properties of the system. In the next subsection we discuss its consequences for the alloy's compositional order.

Interrelation of Magnetism and Atomic Short Range Order

A challenging problem to study in metallic alloys is the interplay between compositional order and magnetism, and the dependence of magnetic properties on the local chemical environment. Magnetism is frequently connected to the overall compositional ordering, as well as the local environment, in a subtle and complicated way. For example, there is an intriguing link between magnetic and compositional ordering in nickel-rich Ni-Fe alloys. Ni75Fe25 is paramagnetic at high temperatures; it becomes ferromagnetic at 900 K, and then, at a temperature just 100 K cooler, it chemically orders into the Ni3Fe L12 phase. The Fe-Al phase diagram shows that, if cooled from the melt, paramagnetic Fe80Al20 forms a solid solution (Massalski et al., 1990). The alloy then becomes ferromagnetic upon further cooling to 935 K, and then forms an apparent DO3 phase at 670 K. An alloy with just 5% more aluminum orders instead into a B2 phase directly from the paramagnetic state at roughly 1000 K, before ordering into a DO3 phase at lower temperatures. In this subsection, we examine this interrelation between magnetism and compositional order. To carry out this task, it is necessary to deal with the statistical mechanics of thermally induced compositional fluctuations.
COMPUTATION OF DIFFUSE INTENSITIES IN ALLOYS has described this in some detail (see also Györffy and Stocks, 1983; Györffy et al., 1989; Staunton et al., 1994; Ling et al., 1994b), so here we will simply recall the salient features and show how magnetic effects are incorporated. A first step is to construct (formally) the grand potential for a system of interacting electrons moving in the field of a particular distribution of nuclei on a crystal lattice of an AcB1−c alloy using SDF theory. (The nuclear diffusion times are very long compared with those associated with the electrons' movements, and thus the compositional and electronic degrees of freedom decouple.) For a site i of the lattice, the variable x_i is set to unity if the site is occupied by an A atom and zero if a B atom is located on it. In other words, an Ising variable is specified. A configuration of nuclei is denoted {x_i} and the associated electronic grand potential is expressed as Ω({x_i}). Averaging over the compositional fluctuations with the measure

P({x_i}) = exp(−βΩ({x_i})) / ∏_i Σ_{x_i} exp(−βΩ({x_i}))   (18)

gives an expression for the free energy of the system at temperature T,

F = −kB T ln [ ∏_i Σ_{x_i} exp(−βΩ({x_i})) ]   (19)

In essence, Ω({x_i}) can be viewed as a complicated concentration-fluctuation Hamiltonian determined by the electronic "glue" of the system. To proceed, some reasonable approximation needs to be made (see the review provided in COMPUTATION OF DIFFUSE INTENSITIES IN ALLOYS). A course of action analogous with our theory of spin fluctuations in metals at finite T is to expand about a suitable reference Hamiltonian Ω_0 and to make use of the Feynman-Peierls inequality (Feynman, 1955). A mean field theory is set up with the choice

Ω_0 = Σ_i V_i^eff x_i   (20)

in which V_i^eff = ⟨Ω⟩_0^{Ai} − ⟨Ω⟩_0^{Bi}, where ⟨Ω⟩_0^{A(B)i} is the grand potential averaged over all configurations with the restriction that an A(B) nucleus is positioned on the site i. These partial averages are, in principle, accessible from the SCF-KKR-CPA framework, and indeed this mean field picture has a satisfying correspondence with the single-site nature of the coherent potential approximation to the treatment of the electronic behavior. Hence at a temperature T, the chance of finding an A atom on a site i is given by

c_i = exp(−β(V_i^eff − ν)) / (1 + exp(−β(V_i^eff − ν)))   (21)

where ν is the chemical potential difference which preserves the relative numbers of A and B atoms overall. Formally, the probability of occupation can vary from site to site, but it is only the case of a homogeneous probability distribution c_i = c (the overall concentration) that can be tackled in practice. By setting up a response theory, however, and using the fluctuation-dissipation theorem, it is possible to write an expression for the compositional correlation function and to investigate the system's tendency to order or phase segregate. If a field which couples to the occupation variables {x_i} and varies from site to site is applied to the high-temperature homogeneously disordered system, it induces an inhomogeneous concentration distribution {c + δc_i}. As a result, the electronic charge rearranges itself (Staunton et al., 1994; Treglia et al., 1978) and, for those alloys which are magnetic in the compositionally disordered state, the magnetization density also changes, i.e., {δm_i^A}, {δm_i^B}. A theory for the compositional correlation function has been developed in terms of the SCF-KKR-CPA framework (Györffy and Stocks, 1983) and is discussed at length in COMPUTATION OF DIFFUSE INTENSITIES IN ALLOYS. In reciprocal "concentration-wave" vector space (Khachaturyan, 1983), this has the Ornstein-Zernike form

α(q) = βc(1 − c) / (1 − βc(1 − c) S^(2)(q))   (22)
in which the Onsager cavity fields have been incorporated (Brout and Thomas, 1967; Staunton and Györffy, 1992; Staunton et al., 1994), ensuring that the site-diagonal part of the fluctuation-dissipation theorem is satisfied. The key quantity S^(2)(q) is the direct correlation function and is determined by the electronic structure of the disordered alloy. In this way, an alloy's tendency to order depends crucially on the magnetic state of the system and upon whether or not the electronic structure is spin-polarized. If the system is paramagnetic, then the presence of "local moments" and the resulting "local exchange splitting" will have consequences. In the next section, we describe three case studies where we show the extent to which an alloy's compositional structure is dependent on whether the underlying electronic structure is "globally" or "locally" spin-polarized, i.e., whether the system is quenched from a ferromagnetic or a paramagnetic state. We look at nickel-iron alloys, including those in the "Invar" concentration range, iron-rich Fe-V alloys, and finally gold-rich AuFe alloys. The value of q for which S^(2)(q), the direct correlation function, has its greatest value signifies the wavevector of the static concentration wave to which the system is unstable at a low enough temperature. For example, if this occurs at q = 0, phase segregation is indicated, whilst for an A75B25 alloy a maximum value at q = (1, 0, 0) points to an L12 (Cu3Au) ordered phase at low temperatures. An important part of S^(2)(q) derives from an electronic state-filling effect and ties in neatly with the notion that half-filled bands promote ordered structures whilst nearly filled or nearly empty states are compatible with systems that cluster when cooled (Ducastelle, 1991; Heine and Samson, 1983). This propensity can be totally different depending on whether the electronic structure is spin-polarized or not, and hence whether the compositionally disordered state is ferromagnetic or paramagnetic, as is the case for nickel-rich Ni75Fe25, for example (Staunton et al., 1987).
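The way Eq. (22) signals an ordering instability can be sketched with a model direct correlation function. The nearest-neighbour, simple cubic form for S^(2)(q) below and the values of V and c are purely illustrative stand-ins for KKR-CPA output; the point is only that α(q) diverges first at the wavevector where S^(2)(q) peaks, which defines a spinodal temperature.

```python
import numpy as np

kB = 8.617e-5   # eV/K
c = 0.75        # overall A concentration, as in an A75B25 alloy

def S2(q, V=0.05):
    """Hypothetical direct correlation function (eV): a nearest-neighbour,
    ordering-type interaction on a simple cubic lattice."""
    return -2.0 * V * (np.cos(q[0]) + np.cos(q[1]) + np.cos(q[2]))

def alpha(q, T):
    """Ornstein-Zernike form of Eq. (22)."""
    b = 1.0 / (kB * T)
    return b * c * (1 - c) / (1.0 - b * c * (1 - c) * S2(q))

# scan the zone for the most unstable concentration-wave vector
grid = np.linspace(0.0, np.pi, 11)
qs = [np.array([x, y, z]) for x in grid for y in grid for z in grid]
q_star = max(qs, key=S2)

# alpha(q*) diverges when 1 - beta c(1-c) S2(q*) = 0
T_spinodal = c * (1 - c) * S2(q_star) / kB
```

For this ordering-type choice the peak sits at the zone-corner vector (π, π, π), so the model alloy orders on cooling rather than phase-segregating; with the opposite sign of V the divergence would move to q = 0, signalling clustering.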
The remarks made earlier in this unit about bonding in alloys and spin polarization are clearly relevant here. For example, majority-spin electrons in strongly ferromagnetic alloys like Ni75Fe25, which completely occupy the majority-spin d states, "see" very little difference between the two types of atomic site (Fig. 2) and hence contribute little to S^(2)(q); it is the filling of the minority-spin states which determines the eventual compositional structure. A contrasting picture describes those alloys, usually bcc-based, in which the Fermi energy is pinned in a valley in the minority density of states (Fig. 2, panel B) and where the ordering tendency is largely governed by the majority-spin electronic structure (Staunton et al., 1990). For a ferromagnetic alloy, an expression for the magneto-compositional cross-correlation function Υ_ik = ⟨m_i x_k⟩ − ⟨m_i⟩⟨x_k⟩ can be written down and evaluated (Staunton et al., 1990; Ling et al., 1995a). Its lattice Fourier transform turns out to be a simple product involving the compositional correlation function, Υ(q) = α(q)γ(q), so that Υ_ik is a convolution of γ_ik = d⟨m_i⟩/dc_k and α_kj. The quantity γ_ik has components

γ_ik = (m^A − m^B)δ_ik + c dm_i^A/dc_k + (1 − c) dm_i^B/dc_k   (23)
The last two quantities, dm_i^A/dc_k and dm_i^B/dc_k, can also be evaluated in terms of the spin-polarized electronic structure of the disordered alloy. They describe the changes to the magnetic moment m_i on a site i in the lattice, occupied by either an A or a B atom, when the probability of occupation is altered on another site k. In other words, γ_ik quantifies the chemical environment effect on the sizes of the magnetic moments. We have studied the dependence of the magnetic moments on their local environments in FeV and FeCr alloys in detail within this framework (Ling et al., 1995a). If the application of a small external magnetic field along the direction of the magnetization is considered, expressions dependent upon the electronic structure can similarly be found for the magnetic correlation function. These are related to the static longitudinal susceptibility χ(q). The quantities α(q), Υ(q), and χ(q) can be straightforwardly compared with information obtained from x-ray (Krivoglaz, 1969; also see Chapter 10, section b) and neutron scattering (Lovesey, 1984; also see MAGNETIC NEUTRON SCATTERING), nuclear magnetic resonance (NUCLEAR MAGNETIC RESONANCE IMAGING), and Mössbauer spectroscopy (MOSSBAUER SPECTROMETRY) measurements. In particular, the cross-sections obtained from diffuse polarized neutron scattering can be written

dσ^ε/dΩ = dσ^N/dΩ + ε dσ^NM/dΩ + dσ^M/dΩ   (24)

where ε = +1 (−1) if the neutrons are polarized (anti)parallel to the magnetization (see MAGNETIC NEUTRON SCATTERING). The nuclear component dσ^N/dΩ is proportional to the compositional correlation function α(q) (closely related to the Warren-Cowley short-range order parameters). The magnetic component dσ^M/dΩ is proportional to χ(q). Finally, dσ^NM/dΩ describes the magneto-compositional correlation function γ(q)α(q) (Marshall, 1968; Cable and Medina, 1976). By interpreting such experimental measurements with such calculations, the electronic mechanisms which underlie the correlations can be extracted (Staunton et al., 1990; Cable et al., 1989). Up to now, everything has been discussed with respect to spin-polarized but non-relativistic electronic structure. We now touch briefly on the relativistic extension of this approach to describe the important magnetic property of magnetocrystalline anisotropy.
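In practice, Eq. (24) is used by measuring the two polarization channels and taking their half-sum and half-difference, which isolates the magneto-compositional interference term. The cross-sections below are synthetic toy curves, not data; a sketch:

```python
import numpy as np

q = np.linspace(0.1, 2.0, 5)          # scattering vectors (arbitrary units)
ds_N  = 1.0 / (1.0 + q**2)            # synthetic nuclear part   ~ alpha(q)
ds_M  = 0.5 / (1.0 + 2.0 * q**2)      # synthetic magnetic part  ~ chi(q)
ds_NM = 0.2 * ds_N                    # synthetic interference   ~ gamma(q) alpha(q)

# the two measured channels of Eq. (24), eps = +1 and eps = -1
ds_up   = ds_N + ds_NM + ds_M
ds_down = ds_N - ds_NM + ds_M

recovered_NM  = 0.5 * (ds_up - ds_down)   # isolates the interference term
recovered_sum = 0.5 * (ds_up + ds_down)   # = nuclear + magnetic parts
```

The difference channel recovers the magneto-compositional term exactly, while separating the nuclear and magnetic parts of the sum channel requires the additional modelling (or unpolarized data) discussed in the cited work.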
MAGNETIC ANISOTROPY

At this stage, we recall that the fundamental "exchange" interactions causing magnetism in metals are intrinsically isotropic, i.e., they do not couple the direction of magnetization to any spatial direction. As a consequence, they are unable to provide any sort of description of the magnetic anisotropic effects which lie at the root of technologically important magnetic properties such as domain wall structure, linear magnetostriction, and permanent magnetic properties in general. A fully relativistic treatment of the electronic effects is needed to get a handle on these phenomena. We consider that aspect in this subsection. In a solid with
an underlying lattice, symmetry dictates that the equilibrium direction of the magnetization lie along one of the crystallographic directions. The energy required to alter the magnetization direction is called the magnetocrystalline anisotropy energy (MAE). The origin of this anisotropy is the interaction of the magnetization with the crystal field (Brooks, 1940), i.e., the spin-orbit coupling.

Competitive and Related Techniques for Calculating MAE

Most present-day theoretical investigations of magnetocrystalline anisotropy use standard band structure methods within scalar-relativistic local spin-density functional theory, and then include, perturbatively, the effects of spin-orbit coupling, a relativistic effect. Then, by using the force theorem (Mackintosh and Anderson, 1980; Weinert et al., 1985), the difference in total energy of two solids with the magnetization in different directions is given by the difference in the Kohn-Sham single-electron energy sums. In practice, this usually refers only to the valence electrons, the core electrons being ignored. There are several investigations in the literature using this approach for transition metals (e.g., Gay and Richter, 1986; Daalderop et al., 1993), as well as for ordered transition metal alloys (Sakuma, 1994; Solovyev et al., 1995) and layered materials (Guo et al., 1991; Daalderop et al., 1993; Victora and MacLaren, 1993), with varying degrees of success. Some controversy surrounds such perturbative approaches regarding the method of summing over all the "occupied" single-electron energies for the perturbed state, which is not calculated self-consistently (Daalderop et al., 1993; Wu and Freeman, 1996). Freeman and coworkers (Wu and Freeman, 1996) argued that this "blind Fermi filling" is incorrect and proposed the state-tracking approach, in which the occupied set of perturbed states is determined according to their projections back onto the occupied set of unperturbed states. More recently, Trygg et al.
(1995) included spin-orbit coupling self-consistently in the electronic structure calculations, although still within a scalar-relativistic theory. They obtained good agreement with experimental magnetic anisotropy constants for bcc Fe, fcc Co, and hcp Co, but failed to obtain the correct magnetic easy axis for fcc Ni.

Practical Aspects of the Method

The MAE in many cases is of the order of meV, which is several (as many as 10) orders of magnitude smaller than the total energy of the system. With this in mind, one has to be very careful in assessing the precision of the calculations. In many of the previous works, fully relativistic approaches have not been used, but it is possible that only a fully relativistic framework may be capable of the accuracy needed for reliable calculations of the MAE. Moreover, either the total energy or the single-electron contribution to it (if using the force theorem) has been calculated separately for each of the two magnetization directions, and the MAE then obtained by a straight subtraction of one from the other. For this reason, in our work, some of which we outline below, we treat relativity and magnetization (spin polarization) on an equal footing. We also calculate the energy difference directly, removing many systematic errors.
Strange et al. (1989a, 1991) have developed a relativistic spin-polarized version of the Korringa-Kohn-Rostoker (SPR-KKR) formalism to calculate the electronic structure of solids, and Ebert and coworkers (Ebert and Akai, 1992) have extended this formalism to disordered alloys by incorporating the coherent-potential approximation (SPR-KKR-CPA). This formalism has successfully described the electronic structure and other related properties of disordered alloys (see Ebert, 1996, for a recent review), such as magnetic circular x-ray dichroism (X-RAY MAGNETIC CIRCULAR DICHROISM), hyperfine fields, and the magneto-optic Kerr effect (SURFACE MAGNETO-OPTIC KERR EFFECT). Strange et al. (1989a, 1989b) and more recently Staunton et al. (1992) have formulated a theory to calculate the MAE of elemental solids within the SPR-KKR scheme, and this theory has been applied to Fe and Ni (Strange et al., 1989a, 1989b). They have also shown that, in the nonrelativistic limit, the MAE is identically equal to zero, indicating that the origin of magnetic anisotropy is indeed relativistic. We have recently set up a robust scheme (Razee et al., 1997, 1998) for calculating the MAE of compositionally disordered alloys and have applied it to NicPt1−c and CocPt1−c alloys; we will describe our results for the latter system in a later section. Full details of our calculational method are found elsewhere (Razee et al., 1997), and we give only a bare outline here. The basis of the magnetocrystalline anisotropy calculation is the relativistic spin-polarized version of density functional theory (see, e.g., MacDonald and Vosko, 1979; Rajagopal, 1978; Ramana and Rajagopal, 1983; Jansen, 1988). This, in turn, is based on the theory for a many-electron system in the presence of a "spin-only" magnetic field (ignoring the diamagnetic effects), and leads to the relativistic Kohn-Sham-Dirac single-particle equations. These can be solved using spin-polarized, relativistic, multiple scattering theory (SPR-KKR-CPA).
From the key equations of the SPR-KKR-CPA formalism, an expression for the magnetocrystalline anisotropy energy of disordered alloys is derived, starting from the total energy of the system within the local approximation of the relativistic spin-polarized density functional formalism. The magnetocrystalline anisotropy energy is defined as the change in the total energy of the system due to a change in the direction of the magnetization, i.e.,

\Delta E = E[n(\mathbf{r}), \mathbf{m}(\mathbf{r}; \hat{e}_1)] - E[n(\mathbf{r}), \mathbf{m}(\mathbf{r}; \hat{e}_2)]

with m(r; ê1) and m(r; ê2) being the magnetization densities pointing along the two directions ê1 and ê2, respectively; their magnitudes are identical. Owing to the stationarity of the energy functional and the local density approximation, the contribution to ΔE comes predominantly from the single-particle term in the total energy. Thus we have

\Delta E = \int^{\varepsilon_{F1}} \varepsilon\, n(\varepsilon; \hat{e}_1)\, d\varepsilon - \int^{\varepsilon_{F2}} \varepsilon\, n(\varepsilon; \hat{e}_2)\, d\varepsilon \qquad (25)

where εF1 and εF2 are the respective Fermi levels for the two orientations. This expression can be manipulated into one involving the integrated density of states N(ε; ê), in which a cancellation of a large part has already taken place, i.e.,

\Delta E = -\int^{\varepsilon_{F1}} d\varepsilon\, \left[ N(\varepsilon; \hat{e}_1) - N(\varepsilon; \hat{e}_2) \right] - \frac{1}{2}\, n(\varepsilon_{F2}; \hat{e}_2)\,(\varepsilon_{F1} - \varepsilon_{F2})^2 + O\!\left((\varepsilon_{F1} - \varepsilon_{F2})^3\right) \qquad (26)
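Equations (25) and (26) can be checked numerically on a toy model density of states. The sketch below is purely illustrative (a Gaussian DOS with a small rigid shift standing in for the change of magnetization direction, not the SPR-KKR-CPA densities of states): it fixes the electron count, locates the two Fermi levels, and verifies that the direct band-energy difference of Equation (25) agrees with the cancellation form of Equation (26) to O((εF1 − εF2)³).

```python
import numpy as np

# Toy model: Gaussian DOS for two magnetization directions e1, e2.
# A rigid shift `delta` stands in for the tiny anisotropy of the bands;
# all numbers are illustrative, not from any SPR-KKR-CPA calculation.
eps = np.linspace(-6.0, 6.0, 24001)
de = eps[1] - eps[0]
delta = 0.02                                              # small e1 -> e2 shift
n1 = np.exp(-eps**2 / 2) / np.sqrt(2 * np.pi)             # n(eps; e1)
n2 = np.exp(-(eps - delta)**2 / 2) / np.sqrt(2 * np.pi)   # n(eps; e2)

def cumtrapz(y):
    """Cumulative trapezoidal integral on the eps grid, starting at 0."""
    return np.concatenate(([0.0], np.cumsum(0.5 * (y[1:] + y[:-1]) * de)))

N1, N2 = cumtrapz(n1), cumtrapz(n2)    # integrated DOS N(eps; e)
Ne = 0.6                               # fixed electron count
eF1 = np.interp(Ne, N1, eps)           # Fermi level for direction e1
eF2 = np.interp(Ne, N2, eps)           # Fermi level for direction e2

# Eq. (25): direct difference of the two band energies.
dE_direct = np.interp(eF1, eps, cumtrapz(eps * n1)) \
          - np.interp(eF2, eps, cumtrapz(eps * n2))

# Eq. (26): integrated-DOS form plus the (eF1 - eF2)^2 correction term.
dE_eq26 = -np.interp(eF1, eps, cumtrapz(N1 - N2)) \
          - 0.5 * np.interp(eF2, eps, n2) * (eF1 - eF2)**2

print(dE_direct, dE_eq26)   # agree to O((eF1 - eF2)^3)
```

For a rigid shift δ of the DOS at fixed electron count Ne, the exact answer is ΔE = −δNe, which both expressions reproduce; in the real calculations it is the large cancellation inside the first term of Equation (26) that makes an accurate evaluation of the integrated density of states so important.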
MAGNETISM IN ALLOYS
In most cases, the second term is very small compared to the first term. The first term must be evaluated accurately, and it is convenient to use the Lloyd formula for the integrated density of states (Staunton et al., 1992; Gubanov et al., 1992).

MAE of the Pure Elements Fe, Ni, and Co

Several groups, including ours, have estimated the MAE of the magnetic 3d transition metals. We found that the Fermi energy for the [001] direction of magnetization calculated within the SPR-KKR-CPA is 1 to 2 mRy above the scalar-relativistic value for all three elements (Razee et al., 1997). We also estimated the order of magnitude of the second term in Equation 26 for these three elements, and found that it is of the order of 10^-2 meV, one order of magnitude smaller than the first term. We compared our results for bcc Fe, fcc Co, and fcc Ni with the experimental results, as well as with the results of previous calculations (Razee et al., 1997). Among previous calculations, the results of Trygg et al. (1995) are closest to experiment, and we therefore gauged our results against theirs. Their results for bcc Fe and fcc Co are in good agreement with experiment if orbital polarization is included. However, in the case of fcc Ni, their prediction of the magnitude of the MAE, as well as of the magnetic easy axis, is not in accord with experiment, and even the inclusion of orbital polarization fails to improve the result. Our results for bcc Fe and fcc Co are also in good agreement with experiment, predicting the correct easy axis, although the magnitude of the MAE is somewhat smaller than the experimental value. Considering that orbital polarization is not included in our calculations, our results are quite satisfactory. In the case of fcc Ni, we obtain the correct easy axis of magnetization, but the magnitude of the MAE is far too small compared to the experimental value, although in line with other calculations.
As noted earlier, convergence with regard to the Brillouin zone integration is very important in the calculation of the MAE, and these integrations had to be done with great care.
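The sensitivity referred to here can be illustrated with a toy two-dimensional tight-binding band (illustrative only; the real calculations integrate the SPR-KKR-CPA spectral functions of the actual alloy). A small symmetry-breaking term λ stands in for the change of magnetization direction: its first-order contribution to the band energy cancels by symmetry, leaving an MAE-like difference orders of magnitude below the bandwidth, and the k mesh must be dense before this tiny difference settles. A modest Fermi smearing tames the integrand:

```python
import numpy as np

def band_energy(nk, lam, mu=-1.0, kT=0.1):
    """Smeared band energy per site on an nk x nk shifted k-mesh.

    eps(k) is a square-lattice tight-binding band; `lam` adds a small
    x/y symmetry-breaking term standing in for a change of magnetization
    direction. All parameters are illustrative.
    """
    k = 2.0 * np.pi * (np.arange(nk) + 0.5) / nk
    kx, ky = np.meshgrid(k, k)
    eps = -2.0 * (np.cos(kx) + np.cos(ky)) + lam * (np.cos(2 * kx) - np.cos(2 * ky))
    occ = 1.0 / (1.0 + np.exp((eps - mu) / kT))   # Fermi-Dirac occupation
    return (eps * occ).mean()

lam = 1e-2   # tiny 'anisotropy'; its first-order band-energy change cancels by symmetry
dE = {nk: band_energy(nk, lam) - band_energy(nk, 0.0) for nk in (8, 64, 256)}
for nk in (8, 64, 256):
    print(nk, dE[nk])   # the tiny difference settles only on dense meshes
```

Because the converged difference is orders of magnitude smaller than the band energy itself, the coarse-mesh error can easily exceed the quantity being computed, which is exactly the danger in MAE work.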
DATA ANALYSIS AND INITIAL INTERPRETATION

The Energetics and Electronic Origins for Atomic Long- and Short-Range Order in NiFe Alloys

The electronic states of iron and nickel are similar in that for both elements the Fermi energy lies near or at the top of the majority-spin d bands. The larger moment of Fe as compared to Ni, however, manifests itself via a larger exchange-splitting. To obtain a rough idea of the electronic structure of NicFe1–c alloys, we imagine aligning the Fermi energies of the electronic structures of the pure elements. The atomic-like d levels of the two, marking the centers of the bands, would then be at the same energy for the majority-spin electrons, whereas for the minority-spin electrons the levels would be at rather different energies, reflecting the differing exchange fields associated with each sort of atom (Fig. 2). In Figure 1, we show the density of states of Ni75Fe25 calculated by the SCF-KKR-CPA, which we interpret along these lines. The majority spin density
of states possesses very sharp structure, which indicates that in this compositionally disordered alloy the majority-spin electrons ''see'' very little difference between the two types of atom, with the DOS exhibiting ''common-band'' behavior. For the minority-spin electrons the situation is reversed. The density of states becomes ''split-band''-like, owing to the large separation (in energy) of the levels and the resulting strong effect of the compositional disorder. As pointed out earlier, the majority-spin d states are fully occupied, and this feature persists over a wide range of concentrations of fcc NicFe1–c alloys: for c greater than 40%, the alloys' average magnetic moments fall nicely on the negative-slope branch of the Slater-Pauling curve. For concentrations less than 35%, and prior to the martensitic transition into the bcc structure at around 25% (the famous ''Invar'' alloys), the Fermi energy is pushed into the peak of the majority-spin d states, propelling these alloys away from the Slater-Pauling curve. Evidently the interplay of magnetism and chemistry (Staunton et al., 1987; Johnson et al., 1989) gives rise to most of the thermodynamic and concentration-dependent properties of Ni-Fe alloys. The ferromagnetic DOS of fcc Ni-Fe, given in Figure 1A, indicates that the majority-spin d electrons cannot contribute to chemical ordering in Ni-rich Ni-Fe alloys, since the states in this spin channel are filled. In addition, because the majority-spin d electrons ''see'' little difference between Ni and Fe, they provide no driving force for chemical order or for clustering (Staunton et al., 1987; Johnson et al., 1989). However, the difference in the exchange splitting of Ni and Fe leads to a very different picture for the minority-spin d electrons (Fig. 2). The bonding-like states in the minority-spin DOS are mostly Ni, whereas the antibonding-like states are predominantly Fe. The Fermi level of the electrons lies between these bonding and antibonding states.
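The ''common-band'' versus ''split-band'' behavior can be caricatured with a toy concentration-weighted DOS in which each component contributes a single d resonance of width γ (all numbers illustrative; a real treatment requires the CPA). When the two site levels are close on the scale of γ, as in the majority-spin channel, one common band results; when they are far apart, as in the exchange-split minority-spin channel, the averaged DOS splits into bonding-like and antibonding-like humps:

```python
import numpy as np

def avg_dos(eps, c, eA, eB, gamma):
    """Concentration-weighted DOS of two Lorentzian d resonances.

    A crude caricature of a disordered-alloy average DOS: component A at
    level eA with weight c, component B at eB with weight 1 - c. `gamma`
    plays the role of the d bandwidth. Illustrative only (no CPA here).
    """
    lor = lambda e0: (gamma / np.pi) / ((eps - e0)**2 + gamma**2)
    return c * lor(eA) + (1 - c) * lor(eB)

def count_peaks(n):
    """Number of interior local maxima of a sampled curve."""
    return int(np.sum((n[1:-1] > n[:-2]) & (n[1:-1] > n[2:])))

eps = np.linspace(-4, 4, 4001)
c = 0.75
# Majority spin: nearly coincident levels -> one 'common band'.
n_maj = avg_dos(eps, c, eA=0.0, eB=0.1, gamma=0.5)
# Minority spin: exchange-split levels far apart -> 'split bands'.
n_min = avg_dos(eps, c, eA=-1.5, eB=1.5, gamma=0.5)
print(count_peaks(n_maj), count_peaks(n_min))
```

In the minority-spin channel the Fermi level then falls in the valley between the two humps, which is the situation described above for Ni75Fe25.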
This leads to the Cu-Au-type atomic short-range order and to the long-range order found in the region of Ni75Fe25 alloys. As the Ni concentration is reduced, the minority-spin bonding states are slowly depopulated, reducing the stability of the alloy, as seen in the heats of formation (Johnson and Shelton, 1997). Ultimately, when enough electrons are removed (by adding more iron), the Fermi level enters the majority-spin d band and the anomalous behavior of Ni-Fe alloys occurs: increases in resistivity and specific heat, collapse of moments (Johnson et al., 1987), and competing magnetic states (Johnson et al., 1989; Abrikosov et al., 1995). Moment Alignment Versus Moment Formation in fcc Fe. Before considering the last aspect, that of competing magnetic states and their connection to volume effects, it is instructive to consider the magnetic properties of Fe on an fcc lattice, even though it exists only at high temperatures. Moruzzi and Marcus (1993) have reviewed the calculations of the energetics and moments of fcc Fe in both antiferromagnetic (AFM) and ferromagnetic (FM) states for a range of lattice spacings. Here we refer to a comparison with the DLM paramagnetic state (PM; Pinski et al., 1986; Johnson et al., 1989). For large volumes (lattice spacings), the FM state has large moments and is lowest in energy. At small volumes, the PM state is lowest in energy and is the global energy minimum. At intermediate
COMPUTATION AND THEORETICAL METHODS
Figure 3. The volume dependence of the total energy of various magnetic states of Ni25Fe75. The total energies of the fcc Ni25Fe75 states designated FM (moments aligned), DLM (moments disordered), and NM (zero moments) are plotted as a function of the fcc lattice parameter. See Johnson et al. (1989) and Johnson and Shelton (1997).
volumes, however, the AFM and PM states have similar-size moments and energies, although at a lattice constant of 6.6 a.u. (3.49 Å) the local moments in the PM state collapse. These results suggest that the Fe-Fe magnetic correlations on an fcc lattice are extremely sensitive to volume and evolve from FM to AFM as the lattice is compressed. This suggestion was confirmed by explicit calculations of the magnetic correlations in the PM state (Pinski et al., 1986). In Figure 9 of Johnson et al. (1989), the energetics of fcc Ni35Fe65 were a particular focal point. This alloy composition is well within the Invar region, near the magnetic collapse, and exhibits the famous negative thermal expansion. The energies of four magnetic states, i.e., non-magnetic (NM), ferromagnetic (FM), paramagnetic (PM, represented by the disordered local moment (DLM) state), and anti-ferromagnetic (AFM), were within 1.5 mRy, or 250 K, of each other (Fig. 3). The Ni35Fe65 calculations were a subset of many calculations that were done for various Ni compositions and magnetic states. As questions still remained regarding the true equilibrium phase diagram of Ni-Fe, Johnson and Shelton (1997) calculated the heats of formation, Ef, or mixing energies, for various Ni compositions and for several magnetic fcc and bcc Ni-Fe phases relative to the pure endpoints, NM-fcc Fe and FM-fcc Ni. For the NM-fcc Ni-rich alloys, they found the function Ef (as a function of composition) to be positive and convex everywhere, indicating that these alloys should cluster. While this argument is not always true, we have shown that the calculated ASRO for NM-fcc Ni-Fe does indeed show clustering (Staunton et al., 1987; Johnson et al., 1989). This is a consequence of the absence of exchange-splitting in a Stoner paramagnet and the filling of unfavorable antibonding d-electron states. This, at best, would be a state seen only at extreme temperatures, possibly near melting.
Thermochemical measurements at high temperatures in Ni-rich Ni-Fe alloys appear to support this hypothesis (Chuang et al., 1986).
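The Maxwell construction mentioned in the Figure 4 caption is equivalent to testing each phase against the lower convex hull of the Ef-versus-concentration points: a phase lying above the hull (the common tangent) is metastable. A minimal sketch, with invented Ef values chosen only to illustrate the construction (not the calculated numbers of Johnson and Shelton, 1997):

```python
import numpy as np

def lower_hull(points):
    """Lower convex hull of (c, E) points, sorted by concentration c."""
    pts = sorted(points)
    hull = []
    for p in pts:
        while len(hull) >= 2:
            (x0, y0), (x1, y1) = hull[-2], hull[-1]
            # keep hull[-1] only if the turn to p is strictly convex
            if (x1 - x0) * (p[1] - y0) - (y1 - y0) * (p[0] - x0) > 0:
                break
            hull.pop()
        hull.append(p)
    return hull

# Hypothetical heats of formation (mRy) vs Ni concentration: invented
# numbers, with ordered phases at c = 0.25 and c = 0.5 placed slightly
# above the tie line, mimicking what the Maxwell construction in Fig. 4 finds.
phases = {
    "fcc-Fe (NM)":   (0.00, 0.0),
    "ordered Fe3Ni": (0.25, -1.8),
    "ordered FeNi":  (0.50, -3.8),
    "L12 Ni3Fe":     (0.75, -6.0),
    "fcc-Ni (FM)":   (1.00, 0.0),
}
hull = lower_hull(phases.values())
hx, hy = zip(*hull)
tol = 1e-9
stable = {name for name, (c, E) in phases.items()
          if E <= np.interp(c, hx, hy) + tol}
print(sorted(stable))
```

With these illustrative numbers the ordered c = 0.25 and c = 0.5 phases sit above the tie line spanned by the end members and the c = 0.75 phase, and are therefore flagged metastable.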
Figure 4. The concentration dependence of the total energy of various magnetic states of Ni-Fe alloys. The total energies of some magnetic states of Ni-Fe alloys are plotted as a function of concentration. Note that the Maxwell construction indicates that the ordered fcc phases, Fe50Ni50 and Fe75Ni25, are metastable. Adapted from Johnson and Shelton (1997).
In the past, the NM state (possessing zero local moments) has been used as an approximate PM state, and the energy difference between the FM and NM states seems to reflect well the observed non-symmetric behavior of the Curie temperature when viewed as a function of Ni concentration. However, this agreement is fortuitous, and the lack of exchange-splitting in the NM state actually suppresses ordering. As shown in figure 2 of Johnson and Shelton (1997) and in Figure 4 of this unit, the PM-DLM state, with its local exchange-splitting on the Fe sites, is lower in energy, and is therefore a more relevant (but still approximate) PM state. Even in the Invar region, where the energy differences are very small, the exchange-splitting has important consequences for ordering. While the DLM state is much more representative of the PM state, it does not contain any of the magnetic short-range order (MSRO) that exists above the Curie temperature. This shortcoming of the model is relevant because the ASRO calculated from this approximate PM state yields very weak ordering (spinodal-ordering temperature below 200 K) for Ni75Fe25, which is not, however, of L12 type. The ASRO calculated for fully-polarized FM Ni75Fe25 is L12-like, with a spinodal around 475 K, well below the actual chemical-ordering temperature of 792 K (Staunton et al., 1987; Johnson et al., 1989). Recent diffuse scattering measurements by Jiang et al. (1996) find weak L12-like ASRO in Ni3Fe samples quenched from 1273 K, which is above the Curie temperature of 800 K. It appears that some degree of magnetic order (either short- or long-range) is required for the ASRO to have k = (1,0,0) wavevector instabilities (or L12-type chemical ordering tendencies). Nonetheless, the local exchange splitting in the DLM state, which exists only on the Fe sites (the Ni moments are quenched), does lead to weak ordering, as compared
to the tendency to phase-separate that is found when local exchange splitting is absent, as in the NM case. Importantly, this indicates that sample preparation (whether above or below the Curie point) and the details of the measuring procedure (e.g., whether data are taken in situ or after a quench) affect what is measured. Two time scales are important: roughly speaking, the electron hopping time is 10^-15 s, whereas the chemical hopping (or diffusion) time is 10^-3 to 10^+6 s. Now we consider diffuse-scattering experiments. For samples prepared in the Ni-rich alloys, but below the Curie temperature, it is most likely that little difference would be found between data taken in situ and data taken on quenched samples, because the (global) FM exchange-split state has helped establish the chemical correlations in both cases. On the other hand, in the Invar region, the Curie temperature is much lower than that for Ni-rich alloys and lies in the two-phase region. Samples annealed in the high-temperature, PM, fcc solid-solution phase and then quenched should have (at best) very weak ordering tendencies. The electronic and chemical degrees of freedom respond differently to the quench. Jiang et al. (1996) have recently measured the ASRO versus composition in the Ni-Fe system using anomalous x-ray scattering techniques. No evidence for ASRO is found in the Invar region, and the measured diffuse intensity can be completely interpreted in terms of static-displacement (size-effect) scattering. These results are in contrast to those found in the 50% and 75% Ni samples, which were annealed closer to, but still above, the Curie point before being quenched. The calculated ASRO intensities in 35%, 50%, and 75% Ni FM alloys are very similar in magnitude and show the Cu-Au-type ordering tendencies. Figure 2 of Johnson and Shelton (1997) shows that the Cu-Au-type T = 0 K ordering energies lie close to one another.
While this appears to contradict the experimental findings (Jiang et al., 1996), recall that the calculated ASRO for PM-DLM Ni3Fe shows ordering to be suppressed. The scattering data obtained from the Invar alloy were from a sample quenched well above the Curie temperature. Theory and experiment may then be in agreement: the ASRO is very weak, allowing size-effect scattering to dominate. Notably, volume fluctuations and size effects have been suggested as being responsible for, or at least contributing to, many of the anomalous Invar properties (Wassermann, 1991; Mohn et al., 1991; Entel et al., 1993, 1998). In all of our calculations, including the ASRO ones, we have ignored lattice distortions and kept an ideal lattice described by a single lattice parameter. From anomalous x-ray scattering data, Jiang et al. (1996) find that for the differing alloy compositions in the fcc phase, the Ni-Ni nearest-neighbor (NN) distance follows a linear concentration dependence (i.e., Vegard's rule), the Fe-Fe NN distance is almost independent of concentration, and the Ni-Fe NN distance is actually smaller than that of Ni-Ni. This latter measurement is obviously contrary to hard-sphere packing arguments. Basically, Fe-Fe pairs prefer a larger ''local volume'' to increase the local moments, whereas for Ni-Fe pairs the bonding is promoted (smaller distance), with a concomitant increase in the local Ni moment. Indeed, experiment and our calculations find
about a 5% increase in the average moment upon chemical ordering in Ni3Fe. These small local displacements in the Invar region actively contribute to the diffuse intensity (discussed above) when the ASRO is suppressed in the PM phase. The Negative Thermal Expansion Effect. While many of the thermodynamic behaviors and anomalous properties of Ni-Fe Invar have been explained, questions remain regarding the origin of the negative thermal expansion. It is difficult to incorporate the displacement fluctuations (thermal phonons) on the same footing as magnetic and compositional fluctuations, especially within a first-principles approach. Progress on this front has been made by Mohn et al. (1991) and others (Entel et al., 1993, 1998; Uhl and Kubler, 1997). Recently, a possible cause of the negative thermal-expansion coefficient in Ni-Fe has been given within an effective Grüneisen theory (Abrikosov et al., 1995). Yet, this explanation is not definitive because the effect of phonons was not considered, i.e., only the electronic part of the Grüneisen constant was calculated. For example, at 35% Ni, we find within the ASA calculations that the Ni and Fe moments, respectively, are 0.62 μB and 2.39 μB in the T = 0 K FM state, and 0.00 μB and 1.56 μB in the DLM state, in contrast to the NM state (zero moments). From neutron-scattering data (Collins, 1966), the PM state contains moments of 1.42 μB on iron, similar to that found in the DLM calculations (Johnson et al., 1989). Now we move on to consider a purely electronic explanation. In Figure 3, we show a plot of energy versus lattice parameter for a 25% Ni alloy in the NM, PM, and FM states. The FM curve has a double-well feature, i.e., two solutions: one with a large lattice parameter and high moments; the other, at a smaller volume, with smaller moments. For the spin-restricted NM calculation (i.e., zero moments), a significant energy difference exists, even near the low-spin FM minimum.
The FM moments at smaller lattice constants are smaller than 0.001 Bohr magnetons, but finite. As Abrikosov et al. (1995) discuss, this double solution of the energy-versus-lattice-parameter curve of the T = 0 K FM state produces an anomaly in the Grüneisen constant that leads to a negative thermal expansion effect. They argue that this is the only possible electronic origin of a negative thermal expansion coefficient. However, if temperature effects are considered, in particular thermally induced phonons and local-moment disorder, then it is not clear that this double-solution behavior is relevant near room temperature, where the lattice measurements are made. Specifically, calculations of the heats of formation as in Figure 4 indicate that already at T = 0 K, neglecting the large entropy of such a state, the DLM state (or an AFM state) is slightly more energetically stable than the FM state at 25% Ni, and is intermediate to the NM and FM states at 35% Ni. Notice that the energy differences for 25% Ni are about 0.5 mRy. Because of the high symmetry of the DLM state, in contrast to the FM case, a double-well feature is not seen in the energy-versus-volume curve (see Fig. 3). As the Ni content is increased from 25%, the
low-spin solution rises in energy relative to the high-spin solution before vanishing for alloys with more than 35% Ni (see figure 9 of Johnson et al., 1989). Thus, there appears to be less of a possibility of having a negative thermal expansion from this double-solution electronic effect as the temperature is raised, since thermal effects disorder the orientations of the moments (i.e., the magnetization versus T should become Brillouin-like) and destroy, or lessen, this double-well feature. In support of this argument, consider that the Invar alloys do have signatures like those of spin glasses (e.g., in the magnetic susceptibility), and the DLM state at T = 0 K could be supposed to be an approximate uncorrelated spin glass (see discussion in Johnson et al., 1989). Thus, at elevated temperatures, both electronic and phonon effects contribute in some way, or, as one would think intuitively, phonons dominate. The data from figure 3 and figure 9 of Johnson et al. (1989) show that a small energy is associated with disordering the orientations of the moments at 35% Ni (an energy gain at 25% Ni) and that this yields a volume contraction of 2%, from the high-spin FM state to the (low-spin) DLM-PM state. On the other hand, raising the temperature gives rise to a lattice expansion due to phonon effects of 1% to 2%. Therefore, the competition of these two effects leads to a small, perhaps negative, thermal expansion. This can only occur in the Invar region (for compositions greater than 25% and less than 40% Ni), because only here are the states sufficiently close in energy, with the DLM state being higher in energy. A Maxwell construction including these four states rules out the low-spin FM solution. A more quantitative explanation remains outstanding. Only by merging the effects of phonons with the magnetic disorder at elevated temperatures can one balance the expansion due to the former with the contraction due to the latter, and form a complete theory of the Invar effect.
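The double-well FM curve and single-well DLM curve described above can be caricatured by simple model functions of the lattice parameter (illustrative shapes only, not fits to the calculated curves of Fig. 3); locating the minima numerically exhibits the two FM solutions and the single DLM solution:

```python
import numpy as np

def local_minima(a, E):
    """Indices of interior local minima of a sampled curve E(a)."""
    i = np.arange(1, len(E) - 1)
    return i[(E[1:-1] < E[:-2]) & (E[1:-1] < E[2:])]

a = np.linspace(6.4, 7.0, 6001)   # lattice parameter (a.u.), illustrative range
# FM: quartic double well, tilted so the high-spin (large-a) minimum is lower.
E_fm = 40.0 * (a - 6.62)**2 * (a - 6.86)**2 - 0.02 * (a - 6.74)
# DLM: a single parabolic well in between (no double-well feature).
E_dlm = 0.8 * (a - 6.76)**2 + 0.05

m_fm = local_minima(a, E_fm)
m_dlm = local_minima(a, E_dlm)
print(len(m_fm), len(m_dlm), a[m_fm])
```

The low-spin (small-a) FM minimum sits slightly above the high-spin one in this toy curve; raising the tilt or washing out the barrier, as moment disorder does, removes the double-well feature, which is the qualitative point made above.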
Magnetic Moments and Bonding in FeV Alloys

A simple schematic energy level diagram is shown in Figure 2B for FecV1–c. The d energy levels of Fe are exchange-split, showing that it is energetically favorable for pure bcc iron to have a net magnetization. Exchange-splitting is absent in pure vanadium. As in the case of the NicFe1–c alloys, we assume charge neutrality and align the two Fermi energies. The vanadium d levels lie much closer in energy to the minority-spin d levels of iron than to its majority-spin ones. Upon alloying the two metals in a bcc structure, the bonding interactions have a larger effect on the minority-spin levels than on those of the majority spin, owing to the smaller energy separation. In other words, Fe induces an exchange-splitting on the V sites to lower the kinetic energy, which results in the formation of bonding and antibonding minority-spin alloy states. More minority-spin V-related d states are occupied than majority-spin d states, with the consequence of a moment on the vanadium sites antiparallel to the larger moment on the Fe sites. The moments are not sustained for concentrations of iron less than 30%, since the Fe-induced exchange-splitting on the vanadium sites diminishes along with the average number of Fe atoms surrounding a vanadium site in the alloy. As for the majority-spin levels, well separated in energy, ''split bands'' form, i.e., states which reside mostly on one constituent or the other.

In Figure 1B, we show the spin-polarized density of states of an iron-rich FeV alloy determined by the SCF-KKR-CPA method, where all these features can be identified. Since the Fe and V majority-spin d states are well separated in energy, we expect a very smeared DOS in the majority-spin channel, due to the large disorder that the majority-spin electrons ''see'' as they travel through the lattice. On the other hand, the minority (spin-down) electron DOS should have peaks associated with the lower-energy, bonding states, as well as other peaks associated with the higher-energy, antibonding states. Note that the majority-spin DOS is indeed very smeared due to chemical disorder, while the minority-spin DOS is much sharper, with the bonding states fully occupied and the antibonding states unoccupied. The vertical line indicates the Fermi level, or chemical potential, of the electrons, below which the states are occupied. The Fermi level lies in this trough of the minority density of states for almost the entire concentration range. As discussed earlier, it is this positioning of the Fermi level, holding the minority-spin electrons at a fixed number, which gives rise to the mechanism for the straight 45° line on the left-hand side of the Slater-Pauling curve. In general, the DOS depends on the underlying symmetry of the lattice and the subtle interplay between bonding and magnetism. Once again, we emphasize that the rigidly-split spin densities of states seen in the ferromagnetic elemental metals clearly do not describe the electronic structure in alloys. The variation of the moments on the Fe and V sites, as well as of the average moments per site versus concentration, as described by SCF-KKR-CPA calculations, is in good agreement with experimental measurement (Johnson et al., 1987).

ASRO and Magnetism in FeV
In this subsection we describe our investigation of the atomic short-range order in iron-vanadium alloys at (or rapidly quenched from) temperatures T0 above any compositional ordering temperature. For these systems we find the ASRO to be rather insensitive to whether T0 is above or below the alloy's magnetic Curie temperature Tc, owing to the presence of ''local exchange-splitting'' in the electronic structure of the paramagnetic state. Iron-rich FeV alloys have several attributes that make them suitable systems in which to investigate both ASRO and magnetism. Firstly, their Curie temperatures (~1000 K) lie in a range where it is possible to compare and contrast the ASRO set up in both the ferromagnetic and paramagnetic states. The large difference in the coherent neutron scattering lengths, bFe − bV ≈ 10 fm, together with the small size difference, makes them good candidates for neutron diffuse scattering analyses. In figure 1 of Cable et al. (1989), the neutron-scattering cross-sections are displayed along three symmetry directions, as measured in the presence of a saturating magnetic field for a Fe87V13 single crystal quenched from a ferromagnetically ordered state. The structure of the curves is attributed to nuclear scattering connected with the ASRO,
c(1 − c)(bFe − bV)^2 α(q). The most intense peaks occur at (1,0,0) and (1,1,1), indicative of a β-CuZn (B2) ordering tendency. Substantial intensity lies in a double-peak structure around (1/2,1/2,1/2). We showed (Staunton et al., 1990, 1997) how our ASRO calculations for ferromagnetic Fe87V13 could reproduce all the details of the data. With the chemical potential being pinned in a trough of the minority-spin density of states (Fig. 1B), the states associated with the two different atomic species are substantially hybridized. Thus, the tendency to order is governed principally by the majority-spin electrons. These split-band states are roughly half-filled, producing the strong ordering tendency. The calculations also showed that part of the structure around (1/2,1/2,1/2) could be traced back to the majority-spin Fermi surface of the alloy. By fitting the direct correlation function S^{(2)}(q) in terms of real-space parameters,

S^{(2)}(\mathbf{q}) = S^{(2)}_0 + \sum_n \sum_{i \in n} S^{(2)}_n \exp(i\, \mathbf{q} \cdot \mathbf{R}_i) \qquad (27)
we found the fit to be dominated by the first two parameters, which determine the large peak at (1,0,0). However, the fit also showed a long-range component deriving from the Fermi-surface effect. The real-space fit of the data produced by Cable et al. (1989) showed large negative values for the first two shells, likewise followed by a weak long-ranged tail. Cable et al. (1989) claimed that the effective temperature, for at least part of the sample, was indeed below its Curie temperature. To investigate this aspect, we carried out calculations for the ASRO of paramagnetic (DLM) Fe87V13 (Staunton et al., 1997). Once again, we found the largest peaks to be located at (1,0,0) and (1,1,1), but careful scrutiny found less structure around (1/2,1/2,1/2) than in the ferromagnetic alloy. The ordering correlations are also weaker in this state. For the paramagnetic DLM state, the local exchange-splitting also pushes many antibonding states above the chemical potential ν (see Fig. 5). This happens even though ν is no longer wedged in a valley in the density of states. The compositional ordering mechanism is similar to, although weaker than, that of the ferromagnetic alloy. The real-space fit of S^{(2)}(q) also showed a smaller long-ranged tail. Evidently the ''local-moment'' spin fluctuation disorder has broadened the alloy's Fermi surface and diminished its effect upon the ASRO. Figure 3 of Pierron-Bohnes et al. (1995) shows measured neutron diffuse scattering intensities from Fe80V20 in its paramagnetic state at 1473 K and 1133 K (the Curie temperature is 1073 K) for scattering vectors in both the (1,0,0) and (1,1,0) planes, following a standard correction for instrumental background and multiple scattering. Maximal intensity lies near (1,0,0) and (1,1,1), without subsidiary structure about (1/2,1/2,1/2). Our calculations of the ASRO of paramagnetic Fe80V20, very similar to those of Fe87V13, are consistent with these features.
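A fit of the form of Equation (27) is a linear least-squares problem: each coordination shell n contributes a lattice sum of cosines, and the parameters S_n^(2) follow directly from the sampled S^(2)(q). The sketch below works on a bcc lattice (unit cube edge) with invented parameter values, negative for the first two shells as in the real-space fits quoted above, and recovers them from synthetic data:

```python
import numpy as np

# bcc coordination shells (cube edge a = 1): 8 vectors (+-1/2,+-1/2,+-1/2),
# then 6 vectors of the (1,0,0) family.
shell1 = 0.5 * np.array([(sx, sy, sz) for sx in (-1, 1)
                         for sy in (-1, 1) for sz in (-1, 1)], float)
shell2 = np.array([(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                   (0, -1, 0), (0, 0, 1), (0, 0, -1)], float)

def shell_sum(q, shell):
    """sum_i exp(i q.R_i) over one shell; real by inversion symmetry."""
    return np.cos(2 * np.pi * q @ shell.T).sum(axis=1)

rng = np.random.default_rng(0)
q = rng.random((200, 3))                   # sample q points (units of 2*pi/a)
A = np.column_stack([np.ones(len(q)),      # S0 column
                     shell_sum(q, shell1), # first-shell lattice sum
                     shell_sum(q, shell2)])# second-shell lattice sum
true = np.array([0.1, -0.8, -0.4])         # invented S0, S1, S2 (illustrative)
S2q = A @ true                             # synthetic S^(2)(q) "data"
fit, *_ = np.linalg.lstsq(A, S2q, rcond=None)
print(fit)                                 # recovers [0.1, -0.8, -0.4]
```

At q = (1,0,0) the first-shell sum equals −8, so a negative S_1^(2) produces the large (1,0,0) peak characteristic of the B2 ordering tendency.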
We also studied the type and extent of magnetic correlations in the paramagnetic state. Ferromagnetic correlations were found, which grow in intensity as T is reduced. These lead to an estimate of Tc = 980 K, which agrees well
Figure 5. (A) The local electronic density of states for Fe87V13 with the moment directions disordered. The upper half displays the density of states for the majority-spin electrons; the lower half, for the minority-spin electrons. Note that in the lower half the axis for the abscissa is inverted. These curves were calculated within the SCF-KKR-CPA (see Staunton et al., 1997). The solid line indicates contributions on the iron sites; the dashed line, the vanadium sites. (B) The total electronic density of states for Fe87V13 with the moment directions disordered. These curves were calculated within the SCF-KKR-CPA; see Johnson et al. (1989) and Johnson and Shelton (1997). The solid line indicates contributions on the iron sites; the dashed line, the vanadium sites.
with the measured value of 1073 K. (The calculated Tc for Fe87V13 of 1075 K also compares well with the measured value of 1180 K.) We also examined the importance of modeling the paramagnetic alloy in terms of local moments by repeating the calculations of ASRO, assuming a Stoner paramagnetic (NM) state in which there are no local moments and hence zero exchange splitting of the electronic structure, local or otherwise. The maximum intensity is now found at about (1/2,1/2,0) in striking contrast to both the DLM calculations and the experimental data. In summary, we concluded that experimental data on FeV alloys are well interpreted by our calculations of ASRO and magnetic correlations. ASRO is evidently strongly affected by the local moments associated with the iron sites in the paramagnetic state, leading to only small differences between the topologies of the ASRO established in samples quenched from above and below Tc. The principal difference is the growth of structure around (1/2,1/2,1/2) for the ferromagnetic state. The ASRO strengthens quite sharply as the system orders magnetically, and it would be interesting if an in situ,
polarized-neutron scattering experiment could be carried out to investigate this.

The ASRO of Gold-Rich AuFe Alloys: Dependence Upon Magnetic State

In sharp contrast to FeV, this study shows that magnetic order, i.e., alignment of the local moments, has a profound effect upon the ASRO of AuFe alloys. In Chapter 18 we discussed the electronic hybridization (size) effect which gives rise to the q = (1,0,0) ordering in NiPt. This is actually a more ubiquitous effect than one may at first imagine. In this subsection we show that the observed q = (1,1/2,0) short-range order in paramagnetic AuFe alloys that have been fast quenched from high temperature results partially from such an effect. Here we point out how magnetic effects also have an influence upon this unusual q = (1,1/2,0) short-range order (Ling et al., 1995b). We note that there has been a lengthy controversy over whether these alloys form a Ni4Mo-type, or (1,1/2,0) special-point, ASRO when fast-quenched from high temperatures, or whether the observed x-ray and neutron diffuse scattering intensities (or electron micrograph images) around (1,1/2,0) are merely the result of clusters of iron atoms arranged so as to produce this unusual type of ASRO. The issue was further complicated by the presence of intensity peaks around small q = (0,0,0) in diffuse x-ray scattering measurements and electron micrographs of some heat-treated AuFe alloys. The uncertainty about the ASRO in these alloys arises from its strong dependence on thermal history. For example, when cooled from high temperatures, AuFe alloys in the concentration range of 10% to 30% Fe first form solid solutions on an underlying fcc lattice at around 1333 K. Upon further cooling below 973 K, α-Fe clusters begin to precipitate, coexisting with the solid solution and revealing their presence in the form of subsidiary peaks at q = (0,0,0) in the experimental scattering data.
The number of α-Fe clusters formed within the fcc AuFe alloy, however, depends strongly on its thermal history and the time scale of annealing (Anderson and Chen, 1994; Fratzl et al., 1991). The miscibility gap appears to have a profound effect on the precipitation of α-Fe clusters, with the maximum precipitation occurring if the alloys had been annealed in the miscibility gap, i.e., between 573 and 773 K (Fratzl et al., 1991). Interestingly, all the AuFe crystals that reveal q = (0,0,0) correlations have been annealed at temperatures below both the experimental and our theoretical spinodal temperatures. On the other hand, if the alloys were homogenized at high temperatures outside the miscibility gap and then fast quenched, no α-Fe nucleation was found. We have modeled the paramagnetic state of Au-Fe alloys in terms of disordered local moments in accord with the theoretical background described earlier. We calculated both α(q) and χ(q) in DLM-paramagnetic Au75Fe25 and for comparison have also investigated the ASRO in ferromagnetic Au75Fe25 (Ling et al., 1995b). Our calculations of α(q) for Au75Fe25 in the paramagnetic state show peaks at (1,1/2,0) with a spinodal ordering temperature of 780 K. This is in excellent agreement with experiment.
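In the mean-field concentration-wave framework used in this work (e.g., Ling et al., 1994b), the ASRO intensity takes the Krivoglaz-Clapp-Moss-like form α(q) = c(1−c)/[1 − βc(1−c)S⁽²⁾(q)], which diverges at the spinodal temperature where the denominator vanishes at the peak wave vector. The sketch below illustrates this with a model direct-correlation function S⁽²⁾(q); its shape and energy scale are illustrative assumptions, tuned only to reproduce the ~780 K scale quoted above, not output of the actual KKR-CPA calculation.

```python
import numpy as np

kB = 8.617333e-5  # Boltzmann constant, eV/K

def alpha_q(S2, c, T):
    """Mean-field (Krivoglaz-Clapp-Moss-type) ASRO intensity,
    alpha(q) = c(1-c) / (1 - beta*c*(1-c)*S2(q)); valid above the spinodal."""
    beta = 1.0 / (kB * T)
    return c * (1 - c) / (1 - beta * c * (1 - c) * S2)

def spinodal_temperature(S2, c):
    """T_sp: temperature at which the denominator vanishes at the peak of S2(q)."""
    return c * (1 - c) * S2.max() / kB

# Model S2(q) along a line through (1,k,0), peaked at k = 1/2 -- the
# peak height (eV) is an illustrative choice, not a computed quantity.
q = np.linspace(0.0, 1.0, 101)
S2 = 0.36 * np.exp(-((q - 0.5) ** 2) / 0.02)

c = 0.25  # Au75Fe25
Tsp = spinodal_temperature(S2, c)
print(f"spinodal ordering temperature ~ {Tsp:.0f} K")

# Just above Tsp, alpha(q) is finite and peaks where S2(q) peaks:
a = alpha_q(S2, c, 1.1 * Tsp)
print("alpha(q) peaks at k =", q[int(np.argmax(a))])
```

The intensity sharpens as T approaches T_sp from above, mirroring the statement that the ASRO strengthens on cooling toward the ordering instability.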
Remarkably, as the temperature is lowered below 1600 K the peaks in α(q) shift to the (1,0.45,0) position with a gradual decrease towards (1,0,0) (Ling et al., 1995b). This streaking of the (1,1/2,0) intensities along the (1,1,0) direction is also observed in electron micrograph measurements (van Tendeloo et al., 1985). The magnetic SRO in this alloy is found to be clearly ferromagnetic, with χ(q) peaking at (0,0,0). As such, we explored the ASRO in the "fictitious" FM alloy and find that α(q) shows peaks at (1,0,0). Next, we show that the special-point ordering in paramagnetic Au75Fe25 has its origins in the inherent "locally exchange-split" electronic structure of the disordered alloy. This is most easily understood from the calculated compositionally averaged densities of states (DOS), shown in Figure 5. Note that the double peak in the paramagnetic DLM Fe density of states in Figure 5A arises from the "local" exchange splitting, which sets up the "local moments" on the Fe sites. Similar features exist in the DOS of DLM Fe87V13. Within the DLM picture of the paramagnetic phase, it is important to note that this local DOS is obtained with respect to the local quantization axis on a given site, set by the direction of that site's moment. All compositional and moment-orientational configurations contributing to the DOS must be averaged over, since the moments point randomly in all directions. In comparison to the density of states of a ferromagnetic alloy, which has one global axis of quantization, the peaks in the DLM density of states are reminiscent of the more usual FM exchange splitting in Fe, as shown in Figure 5B. What is evident from the DOS is that the chemical potential in the paramagnetic DLM state is located in an "antibonding"-like, exchange-split Fe peak. In addition, the "hybridized" bonding states that are created below the Fe d band are due to interaction with the wider-band Au (just as in NiPt).
As a result of these two electronic effects, one arising from hybridization and the other from electronic exchange splitting, a competition arises between (1,0,0)-type ordering from the t2g hybridization states well below the Fermi level and (0,0,0)-type "ordering" (i.e., clustering) from the filling of unfavorable antibonding states. Recall again that the filling of bonding-type states favors chemical ordering, while the filling of antibonding-type states opposes chemical ordering, i.e., favors clustering. The competition between (1,0,0)- and (0,0,0)-type ordering from the two electronic effects yields a (1,1/2,0)-type ASRO. We can check this interpretation by artificially changing the chemical potential (or Fermi energy at T = 0 K) and then performing the calculation at a slightly different band filling, or e/a. As the Fermi level is lowered below the higher-energy, exchange-split Fe peak, we find that the ASRO rapidly becomes (1,0,0)-type, simply because the unfavorable antibonding states are being depopulated and thus the clustering behavior suppressed. As we have already stated, the ferromagnetic alloy exhibits (1,0,0)-type ASRO. In Figure 5B, at the Fermi level, the large antibonding, exchange-split Fe peak is absent in the majority-spin manifold of the DOS, although it remains in the minority-spin manifold. In other words, half of the states that were giving rise to the clustering behavior have been removed from consideration.
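The numerical experiment just described (shifting the chemical potential to change the band filling e/a) can be caricatured with a model DOS: a hybridized bonding peak well below the Fermi level plus an exchange-split antibonding peak near it. Lowering the Fermi level in such a model depopulates mainly the antibonding peak, i.e., the clustering-driving weight. The peak shapes and the −2 eV dividing line are illustrative assumptions, not results of the KKR-CPA calculation.

```python
import numpy as np

def gaussian(E, center, width):
    """Normalized Gaussian used as a model DOS peak."""
    return np.exp(-((E - center) ** 2) / (2 * width ** 2)) / (width * np.sqrt(2 * np.pi))

E = np.linspace(-8.0, 4.0, 2001)   # energy grid (eV); Fermi level near 0
dE = E[1] - E[0]
# Model DOS: hybridized bonding states well below E_F plus an
# exchange-split "antibonding" Fe peak just below E_F (illustrative).
dos = 4.0 * gaussian(E, -5.0, 0.8) + 2.0 * gaussian(E, -0.2, 0.4)

def filling(Ef):
    """Band filling n(Ef): integrated DOS up to the (shifted) Fermi level."""
    return (dos[E <= Ef] * dE).sum()

def antibonding_filling(Ef):
    """Occupation of the antibonding peak alone (states above -2 eV here)."""
    sel = (E <= Ef) & (E > -2.0)
    return (dos[sel] * dE).sum()

for Ef in (0.5, -0.5, -1.5):
    print(f"Ef = {Ef:+.1f} eV: filling = {filling(Ef):.2f}, "
          f"antibonding weight filled = {antibonding_filling(Ef):.2f}")
```

As Ef drops below the antibonding peak its occupation collapses while the bonding states are untouched, mirroring the switch from (1,1/2,0)- to (1,0,0)-type ASRO described above.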
MAGNETISM IN ALLOYS
This happens because of the global exchange splitting in the FM alloy; that is, a larger exchange splitting forms and the majority-spin states become filled. Thus, rather than changing the ASRO by changing the electronic band filling, one is able to alter the ASRO by changing the distribution of electronic states via the magnetic properties. Because the paramagnetic susceptibility χ(q) suggests that the local moments in the PM state are ferromagnetically correlated (Ling et al., 1995b), the alloy is already susceptible to FM ordering. This can be readily accomplished, for example, by magnetically annealing the Au75Fe25 samples when preparing them at high temperatures, i.e., by placing the samples in situ in a strong magnetic field to align the moments. After the alloy is thermally annealed, the chemical response of the alloy is dictated by the electronic DOS of the FM disordered alloy, rather than that of the PM alloy, with the resulting ASRO being of (1,0,0)-type. In summary, we have described two competing electronic mechanisms responsible for the unusual (1,1/2,0) ordering propensity observed in fast-quenched gold-rich AuFe alloys. We find this special-point ordering to be determined by the inherent nature of the disordered alloy's electronic structure. Because the magnetic correlations in paramagnetic Au75Fe25 are found to be clearly ferromagnetic, we proposed that AuFe alloys homogenized at high temperature in a magnetic field and then fast quenched will exhibit a novel (1,0,0)-type ASRO (Ling et al., 1995b). We now move on and describe our studies of magnetocrystalline anisotropy in compositionally disordered alloys and hence show the importance of relativistic spin-orbit coupling upon the spin-polarized electronic structure.
Magnetocrystalline Anisotropy of CocPt1–c Alloys
CocPt1–c alloys are interesting for many reasons.
A large magnetocrystalline anisotropy energy (MAE; Hadjipanayis and Gaunt, 1979; Lin and Gorman, 1992) and large magneto-optic Kerr effect (SURFACE MAGNETO-OPTIC KERR EFFECT) signals, compared to those of Co/Pt multilayers over the whole range of wavelengths (820 to 400 nm; Weller et al., 1992, 1993), make these alloys potential magneto-optical recording materials. The chemical stability of these alloys, a suitable Curie temperature, and the ease of manufacturing enhance their usefulness in commercial applications. Furthermore, study of these alloys may lead to an improved understanding of the fundamental physics of magnetic anisotropy: the spin polarization in the alloys is induced by the presence of Co, whereas a large spin-orbit coupling effect can be associated with the Pt atoms. Most experimental work on Co-Pt alloys has been on the ordered tetragonal phase, which has a very large magnetic anisotropy of about 400 μeV and a magnetic easy axis along the c axis (Hadjipanayis and Gaunt, 1979; Lin and Gorman, 1992). We are not aware of any experimental work on the bulk disordered fcc phase of these alloys. However, some results have been reported for the disordered fcc phase in the form of thin films (Weller et al., 1992, 1993; Suzuki et al., 1994; Maret et al., 1996; Tyson et al., 1996). It is found that the magnitude of the MAE is more than one order
of magnitude smaller than that of the bulk ordered phase, and that the magnetic easy axis varies with film thickness. These data suggest that a theoretical study of the MAE of the bulk disordered alloys can provide insight into the mechanism of magnetic anisotropy in the ordered phase as well as in thin films. We investigated the magnetic anisotropy of the disordered fcc phase of CocPt1–c alloys for c = 0.25, 0.50, and 0.75 (as well as the pure elements Fe, Ni, and Co, and also NicPt1–c). In our calculations, we used self-consistent potentials from spin-polarized scalar-relativistic KKR-CPA calculations and predicted that the easy axis of magnetization is along the ⟨111⟩ direction of the crystal for all three compositions, and that the anisotropy is largest for c = 0.50. In this first calculation of the MAE of disordered alloys we started with atomic sphere potentials generated from the self-consistent spin-polarized scalar-relativistic KKR-CPA for CocPt1–c alloys and constructed spin-dependent potentials. We recalculated the Fermi energy within the SPR-KKR-CPA method for magnetization along the ⟨001⟩ direction. This was necessary since earlier studies of the MAE of the 3d transition metal magnets were found to be quite sensitive to the position of the Fermi level (Daalderop et al., 1993; Strange et al., 1991). For all three compositions of the alloy, the difference in the Fermi energies of the scalar-relativistic and fully relativistic cases was of the order of 5 mRy, which is quite large compared to the magnitude of the MAE. The second term in the expression above for the MAE was indeed small in comparison with the first, which needed to be evaluated very accurately. Details of the calculation can be found elsewhere (Razee et al., 1997). In Figure 6, we show the MAE of disordered fcc-CocPt1–c alloys for c = 0.25, 0.5, and 0.75 as a function of temperature between 0 K and 1500 K.
We note that for all three compositions, the MAE is positive at all temperatures, implying that the magnetic easy axis is always along the ⟨111⟩ direction of the crystal, although the magnitude of the MAE decreases with increasing temperature. The magnetic easy axis of fcc Co is also along the ⟨111⟩ direction, but the magnitude of its MAE is smaller. Thus, alloying
Figure 6. Magneto-anisotropy energy of disordered fcc-CocPt1–c alloys for c = 0.25, 0.5, and 0.75 as a function of temperature. Adapted from Razee et al. (1997).
Figure 7. (A) The spin-resolved density of states on Co and Pt atoms in the Co0.50Pt0.50 alloy magnetized along the ⟨001⟩ direction. (B) The density of states difference between the two magnetization directions for Co0.50Pt0.50. Adapted from Razee et al. (1997).
with Pt does not alter the magnetic easy axis. The equiatomic composition has the largest MAE, which is 3.0 μeV at 0 K. In these alloys, one component (Co) has a large magnetic moment but weak spin-orbit coupling, while the other component (Pt) has strong spin-orbit coupling but a small magnetic moment. Adding Pt to Co results in a monotonic decrease in the average magnetic moment of the system, while the spin-orbit coupling becomes stronger. At c = 0.50, both the magnetic moment and the spin-orbit coupling are significant; for other compositions either the magnetic moment or the spin-orbit coupling is weaker. This trade-off between spin polarization and spin-orbit coupling is the main reason for the MAE being largest around this equiatomic composition. In finer detail, the magnetocrystalline anisotropy of a system can be understood in terms of its electronic structure. In Figure 7A, we show the spin-resolved density of states on Co and Pt atoms in the Co0.50Pt0.50 alloy magnetized along the ⟨001⟩ direction. The Pt density of states is
rather structureless, except around the Fermi energy where there is spin splitting due to hybridization with the Co d bands. When the direction of magnetization is oriented along the ⟨111⟩ direction of the crystal, the electronic structure also changes due to redistribution of the electrons, but the difference is quite small in comparison with the overall density of states. So in Figure 7B, we have plotted the density of states difference for the two magnetization directions. In the lower part of the band, which is Pt-dominated, the difference between the two is small, whereas it is quite oscillatory in the upper part, dominated by the Co d-band complex. There are also spikes at energies where there are peaks in the Co-related part of the density of states. Due to the oscillatory nature of this curve, the magnitude of the MAE is quite small; the two large peaks around 2 eV and 3 eV below the Fermi energy almost cancel each other, leaving only the smaller peaks to contribute to the MAE. Also, due to this oscillatory behavior, a shift in the Fermi level will alter the magnitude as well as the sign of the MAE. This curve also tells us that states far removed from the Fermi level (in this case, 4 eV below it) can also contribute to the MAE, and not just the electrons near the Fermi surface. In contrast to what we have found for the disordered fcc phase of CocPt1–c alloys, in the ordered tetragonal CoPt alloy the MAE is quite large (400 μeV), two orders of magnitude greater than what we find for the disordered Co0.50Pt0.50 alloy. Moreover, the magnetic easy axis is along the c axis (Hadjipanayis and Gaunt, 1979). Theoretical calculations of the MAE for the ordered tetragonal CoPt alloy (Sakuma, 1994; Solovyev et al., 1995), based on scalar-relativistic methods, do reproduce the correct easy axis but overestimate the MAE by a factor of 2.
Currently, it is not clear whether it is the atomic ordering or the loss of cubic symmetry of the crystal in the tetragonal phase which is responsible for the altogether different magnetocrystalline anisotropies in disordered and ordered CoPt alloys. A combined effect of the two is more likely; we are studying the effect of atomic short-range order on the magnetocrystalline anisotropy of alloys.
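The sign convention used above (a positive MAE implying a ⟨111⟩ easy axis) can be made concrete with the lowest-order cubic anisotropy expression E = K1(α1²α2² + α2²α3² + α3²α1²), for which E⟨001⟩ = 0 and E⟨111⟩ = K1/3, so MAE = E⟨001⟩ − E⟨111⟩ > 0 exactly when K1 < 0. The sketch below uses an arbitrary negative K1 as a generic illustration of the cubic form; it is not the alloy's computed anisotropy constant.

```python
import numpy as np

def cubic_anisotropy_energy(n, K1):
    """Lowest-order cubic magnetocrystalline anisotropy energy
    E = K1*(a1^2*a2^2 + a2^2*a3^2 + a3^2*a1^2) for a magnetization
    direction with direction cosines (a1, a2, a3)."""
    a = np.asarray(n, dtype=float)
    a = a / np.linalg.norm(a)          # normalize to direction cosines
    x, y, z = a ** 2
    return K1 * (x * y + y * z + z * x)

K1 = -1.0                              # arbitrary units; K1 < 0 favors <111>
E001 = cubic_anisotropy_energy([0, 0, 1], K1)   # = 0
E111 = cubic_anisotropy_energy([1, 1, 1], K1)   # = K1/3
mae = E001 - E111
print(f"MAE = E<001> - E<111> = {mae:.4f} (> 0, so the easy axis is <111>)")
```

A positive MAE in this convention simply says the ⟨111⟩ direction has the lower anisotropy energy, as found for the disordered fcc alloys at all three compositions.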
PROBLEMS AND CONCLUSIONS
Magnetism in transition metal materials can be described in quantitative detail by spin-density functional theory (SDFT). At low temperatures, the magnetic properties of a material are characterized in terms of its spin-polarized electronic structure. It is on this aspect of magnetic alloys that we have concentrated. From this basis, the early Stoner-Wohlfarth picture of rigidly exchange-split, spin-polarized bands is shown to be peculiar to the elemental ferromagnets only. We have identified and shown the origins of two commonly occurring features of ferromagnetic alloy electronic structures, and the simple structure of the Slater-Pauling curve for these materials (average magnetic moment versus electron-per-atom ratio) can be traced back to the spin-polarized electronic structure. The details of the electronic basis of the theory can, with care, be compared to results from modern spectroscopic
experiment. Much work is ongoing to make this comparison as rigorous as possible. Indeed, our understanding of metallic magnets and their scope for technological application are developing via the growing sophistication of some experiments, together with improvements in quantitative theory. Although SDFT is "first principles," most applications resort to the local approximation (LSDA) for the many-electron exchange and correlation effects. This approximation is widely used and delivers good results in many calculations. It does have shortcomings, however, and there are many efforts aimed at trying to improve it. We have referred to some of this work, mentioning the generalized gradient approximation (GGA) and the self-interaction correction (SIC) in particular. In magnetic materials, the LSDA fails when it is straightforwardly adapted to high temperatures. This failure can be redressed by a theory that includes the effects of thermally induced magnetic excitations, but which still maintains the spin-polarized electronic structure basis of standard SDFT. "Local moments," which are set up by the collective behavior of all the electrons, and are associated with atomic sites, change their orientations on a time scale that is long compared to the time that itinerant d electrons take to progress from site to site. Thus, we have a picture of electrons moving through a lattice of effective magnetic fields set up by particular orientations of these "local moments." At high temperatures, the orientations are thermally averaged, so that in the paramagnetic state there is zero magnetization overall. Although not spin-polarized "globally" (i.e., when averaged over all orientational configurations), the electronic structure is modified by the local-moment fluctuations, so that "local spin polarization" is evident. We have described a mean-field theory of this approach and have described its successes for the elemental ferromagnetic metals and for some iron alloys.
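The local-moment picture above has a simple statistical caricature: assign every site a moment of fixed magnitude whose orientation is random, as in the paramagnetic DLM state. Averaging over orientations kills the net magnetization while the local moment magnitude survives. A minimal sketch follows (the moment value 2.2 μB is just the familiar bcc Fe scale, used here for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# DLM caricature of the paramagnetic state: each of N sites carries a
# moment of fixed magnitude mu pointing in a random direction.
N, mu = 100_000, 2.2
v = rng.normal(size=(N, 3))
e = v / np.linalg.norm(v, axis=1, keepdims=True)   # uniform random unit vectors
m = mu * e                                         # site moments

M_net = np.linalg.norm(m.mean(axis=0))             # net magnetization per site
m_local = np.linalg.norm(m, axis=1).mean()         # mean local-moment magnitude

print(f"net magnetization ~ {M_net:.3f} mu_B (-> 0 as N grows)")
print(f"local moment      = {m_local:.3f} mu_B (remains finite)")
```

The net magnetization scales as 1/sqrt(N) and vanishes in the thermodynamic limit, while each site retains its full moment; this is the sense in which the paramagnet is not "globally" spin-polarized yet remains "locally" spin-polarized.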
The dynamical effects of these spin fluctuations in a first-principles theory remain to be included. We have also emphasized how the state of magnetic order of an alloy can have a major effect on various other properties of the system, and we have dealt at length with its effect upon atomic short-range order by describing case studies of NiFe, FeV, and AuFe alloys. We have linked the results of our calculations with details of "globally" and "locally" spin-polarized electronic structure. The full consequences of lattice displacement effects have yet to be incorporated. We have also discussed the relativistic generalization of SDFT and covered its implications for the magnetocrystalline anisotropy of disordered alloys, with specific illustrations for CoPt alloys. In summary, the magnetic properties of transition metal alloys are fundamentally tied up with the behavior of their electronic "glue." As factors like composition and temperature are varied, the underlying electronic structure can change and thus modify an alloy's magnetic properties. Likewise, as the magnetic order transforms, the electronic structure is affected and this, in turn, leads to changes in other properties. Here we have focused upon the effect on ASRO, but much could also have been written about the fascinating link between magnetism and elastic
properties, "Invar" phenomena being a particularly dramatic example. The electronic mechanisms that thread all these properties together are very subtle, both to understand and to uncover. Consequently, it is often necessary to attempt a study that is as parameter-free as possible, so as to remove any pre-existing bias. This calculational approach can be very fruitful, provided it is followed alongside suitable experimental measurements as a check of its correctness.
ACKNOWLEDGMENTS
This work has been supported in part by the National Science Foundation (U.S.), the Engineering and Physical Sciences Research Council (U.K.), and the Department of Energy (U.S.) at the Frederick Seitz Materials Research Laboratory at the University of Illinois under grants DE-FG02-96ER45439 and DE-AC04-94AL85000.
LITERATURE CITED
Abrikosov, I. A., Eriksson, O., Soderlind, P., Skriver, H. L., and Johansson, B., 1995. Theoretical aspects of the FecNi1–c Invar alloy. Phys. Rev. B 51:1058–1066. Anderson, J. P. and Chen, H., 1994. Determination of the short-range order structure of Au-25 at.% Fe using wide-angle diffuse synchrotron x-ray scattering. Metall. Mater. Trans. A 25A:1561. Anisimov, V. I., Aryasetiawan, F., and Liechtenstein, A. I., 1997. First-principles calculations of the electronic structure and spectra of strongly correlated systems: the LDA+U method. J. Phys.: Condens. Matter 9:767–808. Asada, T. and Terakura, K., 1993. Generalized-gradient-approximation study of the magnetic and cohesive properties of bcc, fcc, and hcp Mn. Phys. Rev. B 47:15992–15995. Bagayoko, D. and Callaway, J., 1983. Lattice-parameter dependence of ferromagnetism in bcc and fcc iron. Phys. Rev. B 28:5419–5422. Bagno, P., Jepsen, O., and Gunnarsson, O., 1989. Ground-state properties of 3rd-row elements with nonlocal density functionals. Phys. Rev. B 40:1997–2000. Beiden, S. V., Temmerman, W. M., Szotek, Z., and Gehring, G. A., 1997. Self-interaction free relativistic local spin density approximation: equivalent of Hund's rules in γ-Ce. Phys. Rev. Lett. 79:3970–3973. Brooks, H., 1940. Ferromagnetic anisotropy and the itinerant electron model. Phys. Rev. 58:909. Brooks, M. S. S. and Johansson, B., 1993. Density functional theory of the ground state properties of rare earths and actinides. In Handbook of Magnetic Materials (K. H. J. Buschow, ed.). p. 139. Elsevier/North-Holland, Amsterdam. Brooks, M. S. S., Eriksson, O., Wills, J. M., and Johansson, B., 1997. Density functional theory of crystal field quasiparticle excitations and the ab-initio calculation of spin Hamiltonian parameters. Phys. Rev. Lett. 79:2546. Brout, R. and Thomas, H., 1967. Molecular field theory: the Onsager reaction field and the spherical model. Physics 3:317. Butler, W. H. and Stocks, G. M., 1984.
Calculated electrical conductivity and thermopower of silver-palladium alloys. Phys. Rev. B 29:4217.
Cable, J. W. and Medina, R. A., 1976. Nonlinear and nonlocal moment disturbance effects in Ni-Cr alloys. Phys. Rev. B 13:4868. Cable, J. W., Child, H. R., and Nakai, Y., 1989. Atom-pair correlations in Fe-13.5-percent-V. Physica B 156 & 157:50. Callaway, J. and Wang, C. S., 1977. Energy bands in ferromagnetic iron. Phys. Rev. B 16:2095–2105. Capellmann, H., 1977. Theory of itinerant ferromagnetism in the 3-d transition metals. Z. Phys. B 34:29. Ceperley, D. M. and Alder, B. J., 1980. Ground state of the electron gas by a stochastic method. Phys. Rev. Lett. 45:566–569. Chikazumi, S., 1964. Physics of Magnetism. Wiley, New York. Chikazumi, S. and Graham, C. D., 1969. Directional order. In Magnetism and Metallurgy (A. E. Berkowitz and E. Kneller, eds.). Vol. II, pp. 577–619. Academic Press, New York. Chuang, Y.-Y., Hsieh, K.-C., and Chang, Y. A., 1986. A thermodynamic analysis of the phase equilibria of the Fe-Ni system above 1200 K. Metall. Trans. A 17:1373. Collins, M. F., 1966. Paramagnetic scattering of neutrons by an iron-nickel alloy. J. Appl. Phys. 37:1352. Connolly, J. W. D. and Williams, A. R., 1983. Density-functional theory applied to phase transformations in transition-metal alloys. Phys. Rev. B 27:5169. Daalderop, G. H. O., Kelly, P. J., and Schuurmans, M. F. H., 1993. Comment on state-tracking first-principles determination of magnetocrystalline anisotropy. Phys. Rev. Lett. 71:2165. Dreizler, R. M. and da Providencia, J. (eds.), 1985. Density Functional Methods in Physics. Plenum, New York. Ducastelle, F., 1991. Order and Phase Stability in Alloys. Elsevier/North-Holland, Amsterdam. Ebert, H., 1996. Magneto-optical effects in transition metal systems. Rep. Prog. Phys. 59:1665. Ebert, H. and Akai, H., 1992. Spin-polarised relativistic band structure calculations for dilute and concentrated disordered alloys. In Applications of Multiple Scattering Theory to Materials Science, Mat. Res. Soc. Symp. Proc. 253 (W. H. Butler, P. H. Dederichs, A.
Gonis, and R. L. Weaver, eds.). p. 329. Materials Research Society, Pittsburgh. Edwards, D. M., 1982. The paramagnetic state of itinerant electron systems with local magnetic moments. I. Static properties. J. Phys. F: Metal Physics 12:1789–1810. Edwards, D. M., 1984. On the dynamics of itinerant electron magnets in the paramagnetic state. J. Magn. Magn. Mater. 45:151–156. Entel, P., Hoffmann, E., Mohn, P., Schwarz, K., and Moruzzi, V. L., 1993. First-principles calculations of the instability leading to the Invar effect. Phys. Rev. B 47:8706–8720. Entel, P., Kadau, K., Meyer, R., Herper, H. C., Acet, M., and Wassermann, E. F., 1998. Numerical simulation of martensitic transformations in magnetic transition-metal alloys. J. Magn. Magn. Mater. 177:1409–1410. Ernzerhof, M., Perdew, J. P., and Burke, K., 1996. Density functionals: where do they come from, why do they work? Topics in Current Chemistry 180:1–30. Faulkner, J. S. and Stocks, G. M., 1980. Calculating properties with the coherent-potential approximation. Phys. Rev. B 21:3222. Faulkner, J. S., Wang, Y., and Stocks, G. M., 1997. Coulomb energies in alloys. Phys. Rev. B 55:7492. Faulkner, J. S., Moghadam, N. Y., Wang, Y., and Stocks, G. M., 1998. Evaluation of isomorphous models of alloys. Phys. Rev. B 57:7653. Feynman, R. P., 1955. Slow electrons in a polar crystal. Phys. Rev. 97:660.
Fratzl, P., Langmayr, F., and Yoshida, Y., 1991. Defect-mediated nucleation of α-iron in Au-Fe alloys. Phys. Rev. B 44:4192. Gay, J. G. and Richter, R., 1986. Spin anisotropy of ferromagnetic films. Phys. Rev. Lett. 56:2728. Gubanov, V. A., Liechtenstein, A. I., and Postnikov, A. V., 1992. Magnetism and Electronic Structure of Crystals, Springer Series in Solid State Physics Vol. 98. Springer-Verlag, New York. Gunnarsson, O., 1976. Band model for magnetism of transition metals in the spin-density-functional formalism. J. Phys. F: Metal Physics 6:587–606. Gunnarsson, O. and Lundqvist, B. I., 1976. Exchange and correlation in atoms, molecules, and solids by the spin-density-functional formalism. Phys. Rev. B 13:4274–4298. Guo, G. Y., Temmerman, W. M., and Ebert, H., 1991. First-principles determination of the magnetization direction of an Fe monolayer in noble metals. J. Phys.: Condens. Matter 3:8205. Györffy, B. L. and Stocks, G. M., 1983. Concentration waves and Fermi surfaces in random metallic alloys. Phys. Rev. Lett. 50:374. Györffy, B. L., Kollar, J., Pindor, A. J., Staunton, J. B., Stocks, G. M., and Winter, H., 1983. In The Proceedings from Workshop on 3d Metallic Magnetism, Grenoble, France, March 1983, pp. 121–146. Györffy, B. L., Pindor, A. J., Staunton, J. B., Stocks, G. M., and Winter, H., 1985. A first-principles theory of ferromagnetic phase transitions in metals. J. Phys. F: Metal Physics 15:1337–1386. Györffy, B. L., Johnson, D. D., Pinski, F. J., Nicholson, D. M., and Stocks, G. M., 1989. The electronic structure and state of compositional order in metallic alloys. In Alloy Phase Stability (G. M. Stocks and A. Gonis, eds.). NATO-ASI Series Vol. 163. Kluwer Academic Publishers, Boston. Hadjipanayis, G. and Gaunt, P., 1979. An electron microscope study of the structure and morphology of a magnetically hard PtCo alloy. J. Appl. Phys. 50:2358. Haglund, J., 1993. Fixed-spin-moment calculations on bcc and fcc iron using the generalized gradient approximation.
Phys. Rev. B 47:566–569. Haines, E. M., Clauberg, R., and Feder, R., 1985. Short-range magnetic order near the Curie temperature in iron from spin-resolved photoemission. Phys. Rev. Lett. 54:932. Haines, E. M., Heine, V., and Ziegler, A., 1986. Photoemission from ferromagnetic metals above the Curie temperature. II. Cluster calculations for Ni. J. Phys. F: Metal Physics 16:1343. Hasegawa, H., 1979. Single-site functional-integral approach to itinerant-electron ferromagnetism. J. Phys. Soc. Jpn. 46:1504. Hedin, L. and Lundqvist, B. I., 1971. Explicit local exchange-correlation potentials. J. Phys. C: Solid State Physics 4:2064–2083. Heine, V. and Joynt, R., 1988. Coarse-grained magnetic disorder above Tc in iron. Europhys. Lett. 5:81–85. Heine, V. and Samson, J. H., 1983. Magnetic, chemical and structural ordering in transition metals. J. Phys. F: Metal Physics 13:2155–2168. Herring, C., 1966. Exchange interactions among itinerant electrons. In Magnetism IV (G. T. Rado and H. Suhl, eds.). Academic Press, New York. Hohenberg, P. and Kohn, W., 1964. Inhomogeneous electron gas. Phys. Rev. 136:B864–B872. Hu, C. D. and Langreth, D. C., 1986. Beyond the random-phase approximation in nonlocal-density-functional theory. Phys. Rev. B 33:943–959. Hubbard, J., 1979. Magnetism of iron. II. Phys. Rev. B 20:4584. Jansen, H. J. F., 1988. Magnetic anisotropy in density-functional theory. Phys. Rev. B 38:8022–8029.
Jiang, X., Ice, G. E., Sparks, C. J., Robertson, L., and Zschack, P., 1996. Local atomic order and individual pair displacements of Fe46.5Ni53.5 and Fe22.5Ni77.5 from diffuse x-ray scattering studies. Phys. Rev. B 54:3211. Johnson, D. D. and Pinski, F. J., 1993. Inclusion of charge correlations in the calculation of the energetics and electronic structure for random substitutional alloys. Phys. Rev. B 48:11553. Johnson, D. D. and Shelton, W. A., 1997. The energetics and electronic origins for atomic long- and short-range order in Ni-Fe Invar alloys. In The Invar Effect: A Centennial Symposium (J. Wittenauer, ed.). p. 63. The Minerals, Metals, and Materials Society, Warrendale, Pa. Johnson, D. D., Pinski, F. J., and Staunton, J. B., 1987. The Slater-Pauling curve: First-principles calculations of the moments of Fe1–cNic and V1–cFec. J. Appl. Phys. 61:3715–3717. Johnson, D. D., Pinski, F. J., Staunton, J. B., Györffy, B. L., and Stocks, G. M., 1989. Theoretical insights into the underlying mechanisms responsible for their properties. In Physical Metallurgy of Controlled Expansion "Invar-type" Alloys (K. C. Russell and D. Smith, eds.). The Minerals, Metals, and Materials Society, Warrendale, Pa. Johnson, D. D., Nicholson, D. M., Pinski, F. J., Stocks, G. M., and Györffy, B. L., 1986. Density-functional theory for random alloys: Total energy with the coherent potential approximation. Phys. Rev. Lett. 56:2096. Johnson, D. D., Nicholson, D. M., Pinski, F. J., Stocks, G. M., and Györffy, B. L., 1990. Total energy and pressure calculations for random substitutional alloys. Phys. Rev. B 41:9701. Jones, R. O. and Gunnarsson, O., 1989. The density functional formalism, its applications and prospects. Rev. Mod. Phys. 61:689–746. Kakizaki, A., Fujii, J., Shimada, K., Kamata, A., Ono, K., Park, K. H., Kinoshita, T., Ishii, T., and Fukutani, H., 1994. Fluctuating local magnetic moments in ferromagnetic Ni observed by the spin-resolved resonant photoemission. Phys. Rev.
Lett. 72:2781–2784. Khachaturyan, A. G., 1983. Theory of Structural Transformations in Solids. John Wiley & Sons, New York. Kirschner, J., Globl, M., Dose, V., and Scheidt, H., 1984. Wave-vector dependent temperature behavior of empty bands in ferromagnetic iron. Phys. Rev. Lett. 53:612–615. Kisker, E., Schroder, K., Campagna, M., and Gudat, W., 1984. Temperature dependence of the exchange splitting of Fe by spin-resolved photoemission spectroscopy with synchrotron radiation. Phys. Rev. Lett. 52:2285–2288. Kisker, E., Schroder, K., Campagna, M., and Gudat, W., 1985. Spin-polarised, angle-resolved photoemission study of the electronic structure of Fe(100) as a function of temperature. Phys. Rev. B 31:329–339. Koelling, D. D., 1981. Self-consistent energy-band calculations. Rep. Prog. Phys. 44:139–212. Kohn, W. and Sham, L. J., 1965. Self-consistent equations including exchange and correlation effects. Phys. Rev. 140:A1133. Kohn, W. and Vashishta, P., 1982. Physics of Solids and Liquids (B. I. Lundqvist and N. March, eds.). Plenum, New York. Korenman, V., 1985. Theories of itinerant magnetism. J. Appl. Phys. 57:3000–3005. Korenman, V., Murray, J. L., and Prange, R. E., 1977a. Local-band theory of itinerant ferromagnetism. I. Fermi-liquid theory. Phys. Rev. B 16:4032. Korenman, V., Murray, J. L., and Prange, R. E., 1977b. Local-band theory of itinerant ferromagnetism. II. Spin waves. Phys. Rev. B 16:4048.
Korenman, V., Murray, J. L., and Prange, R. E., 1977c. Local-band theory of itinerant ferromagnetism. III. Nonlinear Landau-Lifshitz equations. Phys. Rev. B 16:4058. Krivoglaz, M., 1969. Theory of X-Ray and Thermal-Neutron Scattering by Real Crystals. Plenum Press, New York. Kubler, J., 1984. First principle theory of metallic magnetism. Physica B 127:257–263. Kuentzler, R., 1980. Ordering effects in the binary T-Pt alloys. In Physics of Transition Metals 1980, Institute of Physics Conference Series No. 55 (P. Rhodes, ed.). pp. 397–400. Institute of Physics, London. Langreth, D. C. and Mehl, M. J., 1981. Easily implementable nonlocal exchange-correlation energy functionals. Phys. Rev. Lett. 47:446. Langreth, D. C. and Mehl, M. J., 1983. Beyond the local density approximation in calculations of ground state electronic properties. Phys. Rev. B 28:1809. Lieb, E., 1983. Density functionals for Coulomb systems. Int. J. Quantum Chem. 24:243. Lin, C. J. and Gorman, G. L., 1992. Evaporated CoPt alloy films with strong perpendicular magnetic anisotropy. Appl. Phys. Lett. 61:1600. Ling, M. F., Staunton, J. B., and Johnson, D. D., 1994a. Electronic mechanisms for magnetic interactions in a Cu-Mn spin-glass. Europhys. Lett. 25:631–636. Ling, M. F., Staunton, J. B., and Johnson, D. D., 1994b. A first-principles theory for magnetic correlations and atomic short-range order in paramagnetic alloys. I. J. Phys.: Condens. Matter 6:5981–6000. Ling, M. F., Staunton, J. B., and Johnson, D. D., 1995a. All-electron, linear response theory of local environment effects in magnetic, metallic alloys and multilayers. J. Phys.: Condens. Matter 7:1863–1887. Ling, M. F., Staunton, J. B., Pinski, F. J., and Johnson, D. D., 1995b. Origin of the {1,1/2,0} atomic short-range order in Au-rich Au-Fe alloys. Phys. Rev. B 52:3816–3819 (Rapid Communications). Liu, A. Y. and Singh, D. J., 1992. General-potential study of the electronic and magnetic structure of FeCo. Phys. Rev. B 46:11145–11148. Liu, S. H., 1978.
Quasispin model for itinerant magnetism—effects of short-range order. Phys. Rev. B 17:3629–3638. Lonzarich, G. G. and Taillefer, L. 1985. Effect of spin fluctuations on the magnetic equation of state of ferromagnetic or nearly ferromagnetic metals. J.Phys.C: Solid State Physics 18:4339. Lovesey, S. W. 1984. Theory of neutron scattering from condensed matter. 1. Nuclear scattering, International series of monographs on physics. Clarendon Press, Oxford. MacDonald, A. H. and Vosko, S. H. 1979. A relativistic density functional formalism. J. Phys. C: Solid State Physics 12:6377. Mackintosh, A.R. and Andersen, O. K. 1980. The electronic structure of transition metals. In Electrons at the Fermi Surface (M. Springford, ed.). pp. 149–224. Cambridge University Press, Cambridge. Malozemoff, A. P., Williams, A. R., and Moruzzi, V. L. 1984. Bandgap theory of strong ferromagnetism: Application to concentrated crystalline and amorphous Fe-metalloid and Co-metalloid alloys. Phys. Rev. B 29:1620–1632. Maret, M., Cadeville, M.C., Staiger, W., Beaurepaire, E., Poinsot, R., and Herr, A. 1996. Perpendicular magnetic anisotropy in CoxPt1–x alloy films. Thin Solid Films 275:224. Marshall, W. 1968. Neutron elastic diffuse scattering from mixed magnetic systems. J. Phys. C: Solid State Physics 1:88.
204
COMPUTATION AND THEORETICAL METHODS
MAGNETISM IN ALLOYS
F. J. PINSKI
University of Cincinnati
Cincinnati, Ohio

J. B. STAUNTON
S. S. A. RAZEE
University of Warwick
Coventry, U.K.

D. D. JOHNSON
University of Illinois
Urbana-Champaign, Illinois
KINEMATIC DIFFRACTION OF X RAYS
INTRODUCTION

Diffraction by x rays, electrons, or neutrons has enjoyed great success in crystal structure determination (e.g., the structures of DNA, high-Tc superconductors, and reconstructed silicon surfaces). For a perfectly ordered crystal, diffraction results in arrays of sharp Bragg reflection spots periodically arranged in reciprocal space. Analysis of the Bragg peak locations and their intensities leads to the identification of the crystal lattice type, symmetry group, unit cell dimensions, and atomic configuration within a unit cell. On the other hand, for crystals containing lattice defects such as dislocations, precipitates, local ordered domains, surfaces, and interfaces, diffuse intensities are produced in addition to Bragg peaks. The distribution and magnitude of the diffuse intensities depend on the type of imperfection present and the x-ray energy used in a diffraction experiment. Diffuse scattering is usually weak, and thus more difficult to measure, but it is rich in structural information that often cannot be obtained by other experimental means.

Since real crystals are generally far from perfect, many of their properties are determined by the lattice imperfections present. Consequently, understanding the atomic structures of these lattice imperfections (e.g., atomic short-range order, extended vacancy defect complexes, phonon properties, composition fluctuations, charge density waves, static displacements, and superlattices) and the roles these imperfections play (e.g., precipitation hardening, residual stresses, phonon softening, and phase transformations) is of paramount importance if these materials properties are to be exploited for optimal use.

This unit addresses the fundamental principles of diffraction based upon the kinematic diffraction theory for x rays. (Nevertheless, the diffraction principles described in this unit may be extended to kinematic diffraction events involving thermal neutrons or electrons.) The accompanying DYNAMICAL DIFFRACTION is concerned with dynamical diffraction theory, which applies to diffraction from single crystals of such high quality that multiple scattering becomes significant and kinematic diffraction theory becomes invalid. In practice, most x-ray diffraction experiments are carried out on crystals containing a sufficiently large number of defects that kinematic theory is generally applicable.

This unit is divided into two major sections. In the first section, the fundamental principles of kinematic diffraction of x rays are discussed and a systematic treatment of the theory is given. In the second section, the practical aspects of the method are discussed; specific expressions for kinematically diffracted x-ray intensities are described and used to interpret diffraction behavior from real crystals containing lattice defects. Neither specific diffraction techniques and analysis nor sample preparation methods are described in this unit. Readers may refer to X-RAY TECHNIQUES for experimental details and specific applications.

PRINCIPLES OF THE METHOD

Overview of Scattering Processes

When a stream of radiation (e.g., photons or neutrons) strikes matter, various interactions can take place, one of which is the scattering process, which may be best described using the wave properties of radiation. Depending on the energy, or wavelength, of the incident radiation, scattering may occur on different levels: at the atomic, molecular, or microscopic scale. While some scattering events are noticeable in our daily routines (e.g., scattering of visible light off the earth's atmosphere to give a blue sky, and scattering from tiny air bubbles or particles in a glass of water to give it a hazy appearance), others are more difficult to observe directly with human eyes, especially those scattering events that involve x rays or neutrons.

X rays are electromagnetic waves, or photons, that travel at the speed of light. They are no different from visible light, but have wavelengths ranging from a few hundredths of an angstrom (Å) to a few hundred angstroms. The conversion from wavelength to energy for all photons is given in the following equation, with wavelength λ in angstroms and energy in kilo-electron volts (keV):
    λ(Å) = c/ν = 12.40 (Å·keV)/E(keV)    (1)
in which c is the speed of light (3 × 10^18 Å/s) and ν is the frequency. It is customary to classify x rays with wavelengths longer than a few angstroms as "soft x rays," as opposed to "hard x rays" with shorter wavelengths (≲1 Å) and higher energies (≳10 keV).

In what follows, a general scattering theory will be presented. We shall concentrate on the kinematic scattering theory, which involves the following assumptions:

1. The traveling-wave model is utilized, so that the x-ray beam may be represented by a plane-wave formula.

2. The source-to-specimen and specimen-to-detector distances are considered to be far greater than the distances separating the various scattering centers. Therefore, both the incident and the scattered beam can be represented by a set of parallel rays with no divergence.

3. Interference between x-ray beams scattered by elements at different positions is a result of superposition of the scattered traveling waves with different paths.

4. No multiple scattering is allowed; that is, the once-scattered beam inside a material will not rescatter. (This assumption is most important, since it separates kinematic scattering theory from dynamic scattering theory.)

5. Only the elastically scattered beam is considered; conservation of x-ray energy applies.

The above assumptions form the basis of the kinematic scattering/diffraction theory; they are generally valid
assumptions in the most widely used methods for studying scattering and diffraction from materials. In some cases, such as diffraction from perfect or nearly perfect single crystals, dynamical scattering theory must be employed to explain the nature of the diffraction events (DYNAMICAL DIFFRACTION). In other cases, such as Compton scattering, where energy exchanges occur in addition to momentum transfers, inelastic scattering theories must be invoked.

While the word "scattering" refers to a deflection of the beam from its original direction by scattering centers, which could be electrons, atoms, molecules, voids, precipitates, composition fluctuations, dislocations, and so on, the word "diffraction" is generally defined as the constructive interference of coherently scattered radiation from regularly arranged scattering centers such as gratings, crystals, superlattices, and so on. Diffraction generally results in strong intensity in specific, fixed directions in reciprocal (momentum) space, which depend on the translational symmetry of the diffracting system. Scattering, however, often generates weak and diffuse intensities that are widely distributed in reciprocal space. A simple picture may be drawn to clarify this point: interaction of radiation with an amorphous substance is a "scattering" process that reveals broad and diffuse intensity maxima, whereas with a crystal it is a "diffraction" event, as sharp and distinct peaks appear. Sometimes the two words are interchangeable, as the two events may occur concurrently or indistinguishably.
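The wavelength-energy relation of Equation 1 is straightforward to apply numerically. In the sketch below, the function name and the Cu Kα test energy are illustrative choices, not values from the text:

```python
# Hedged sketch of Equation 1: lambda(Angstrom) = 12.40 / E(keV).
# The constant 12.40 Angstrom-keV is the value quoted in the text.

def wavelength_angstrom(energy_kev: float) -> float:
    """Photon wavelength in angstroms for a photon energy in keV (Equation 1)."""
    return 12.40 / energy_kev

# Cu K-alpha radiation is about 8.05 keV, giving roughly 1.54 Angstroms.
print(round(wavelength_angstrom(8.05), 2))
```

The same relation run in reverse (E = 12.40/λ) shows why a 1-Å x ray carries roughly 12 keV, i.e., squarely in the "hard x-ray" regime.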
Elementary Kinematic Scattering Theory

In Figure 1, an incoming plane wave P1, traveling in the direction specified by the unit vector s0, interacts with the specimen, and the scattered beam, another plane wave P2, travels along the direction s, again a unit vector. Taking into consideration increments of volume within the specimen, V1, waves scattered from different increments of volume will interfere with each other: that is, their instantaneous amplitudes will be additive (a list of symbols used is contained in the Appendix). Since the variation of amplitude with time will be sinusoidal, it is necessary to keep track of the phase of the wave scattered from individual volume elements. The scattered wave along s is therefore made up of components scattered from the individual volume elements, with phases set by the path differences traveled by the individual rays from P1 to P2. In reference to an arbitrary point O in the specimen, the path difference between a ray scattered from the volume element V1 and one scattered from O is

    δ1 = r1·s − r1·s0 = r1·(s − s0)    (2)

Thus, the difference in phase between waves scattered from the two points will be proportional to the difference in the distances that the two waves travel from P1 to P2; a path difference equal to the wavelength λ corresponds to a phase difference φ of 2π radians:

    δ1/λ = φ/2π    (3)

In general, the phase of the wave scattered from the jth increment of volume Vj, relative to the phase of the wave scattered from the origin O, will thus be

    φj = 2π(s − s0)·rj/λ    (4)

Equation 4 may be expressed as φj = 2πK·rj, where K is the scattering vector (Fig. 2),

    K = (s − s0)/λ    (5)

The phase of the scattered radiation is then expressed by the plane-wave factor e^(iφj), and the resultant amplitude is obtained by summing the complex amplitudes scattered from each incremental scattering center:

    A = Σj fj e^(2πi K·rj)    (6)

where fj is the scattering power, or scattering length, of the jth volume element (this scattering power will be further discussed a little later). For a continuous medium viewed on a larger scale, as is the case in small-angle scattering, the summation sign in Equation 6 may be replaced by an integral over the entire volume of the irradiated specimen. The scattered intensity, I(K), written in absolute units, commonly known as electron units, is proportional to the square of the amplitude in Equation 6:

    I(K) = AA* = |Σj fj e^(2πi K·rj)|²    (7)

Figure 1. Schematic showing a diffracting element V1 at a distance r1 from an arbitrarily chosen origin O in the crystal. The incident and diffracted beam directions are indicated by the unit vectors s0 and s, respectively.

Figure 2. The diffraction condition is determined by the incident and scattered beam direction unit vectors normalized by the specified wavelength (λ). The diffraction vector K is defined as the difference of the two vectors s/λ and s0/λ. The diffraction angle, 2θ, is defined by these two vectors as well.
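Equations 6 and 7 can be checked with a short numerical sketch. The two scattering centers, their unit scattering powers, and the K vectors below are illustrative assumptions, not values from the text:

```python
# Minimal sketch of Equations 6 and 7: the kinematic amplitude is a
# phase-weighted sum over scattering centers, and the intensity is |A|^2.
import cmath

def kinematic_intensity(f, positions, K):
    """I(K) = |sum_j f_j exp(2 pi i K.r_j)|^2 (Equations 6 and 7)."""
    A = sum(fj * cmath.exp(2j * cmath.pi * sum(k * x for k, x in zip(K, r)))
            for fj, r in zip(f, positions))
    return abs(A) ** 2

f = [1.0, 1.0]                                  # equal scattering powers
r = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0)]          # two centers, in units of 1/K
print(kinematic_intensity(f, r, (1.0, 0.0, 0.0)))  # K.r = 1/2 for atom 2: destructive, near 0
print(kinematic_intensity(f, r, (2.0, 0.0, 0.0)))  # K.r = 1 for atom 2: constructive, near 4
```

The half-wavelength path difference in the first case gives a π phase shift and full cancellation; the full-wavelength difference in the second gives in-phase addition, so the intensity is (Σ fj)² = 4.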
Diffraction from a Crystal

For a crystalline material devoid of defects, the atomic arrangement may be represented by a primitive unit cell with lattice vectors a1, a2, and a3 that display a particular set of translational symmetries. Since every unit cell is identical, the above summation over the diffracting volume within a crystal can be replaced by a summation over a single unit cell followed by a summation over the unit cells contained in the diffraction volume:

    I(K) = |Σj^(u.c.) fj e^(2πi K·rj)|² |Σn e^(2πi K·rn)|² = |F(K)|² |G(K)|²    (8)

The first term, known as the structure factor, F(K), is a summation over all scattering centers within one unit cell (u.c.). The second term defines the interference function, G(K), which is a Fourier transformation of the real-space point lattice. The vector rn connects the origin to the nth lattice point and is written as

    rn = n1 a1 + n2 a2 + n3 a3    (9)

where n1, n2, and n3 are integers. Consequently, the single summation for the interference function may be replaced by a triple summation over n1, n2, and n3:

    |G(K)|² = |Σ(n1)^(N1) e^(2πi K·n1 a1)|² |Σ(n2)^(N2) e^(2πi K·n2 a2)|² |Σ(n3)^(N3) e^(2πi K·n3 a3)|²    (10)
where N1, N2, and N3 are the numbers of unit cells along the three lattice vector directions, respectively. For large Ni, Equation 10 reduces to

    |G(K)|² = [sin²(π K·N1 a1)/sin²(π K·a1)] [sin²(π K·N2 a2)/sin²(π K·a2)] [sin²(π K·N3 a3)/sin²(π K·a3)]    (11)

Figure 3. Schematic drawing of the interference function for N = 8, showing periodicity with angle β. The amplitude of the function equals N², while the width of the peak is proportional to 1/N, where N represents the number of unit cells contributing to diffraction. There are N − 1 zeroes in (D) and N − 2 subsidiary maxima besides the two large ones at β = 0° and 360°. Curves (C) and (D) have been normalized to unity. After Buerger (1960).

A general display of the above interference function is shown in Figure 3. First, the function is a periodic one. Maxima occur at specific K locations, followed by a series of secondary maxima with much reduced amplitudes. It is noted that the larger the Ni, the sharper the peak, because the width of the peak is inversely proportional to Ni while the peak height equals Ni². When Ni approaches infinity, the interference function is a delta function with a value Ni. Therefore, when Ni → ∞, Equation 11 becomes

    |G(K)|² = N1 N2 N3 = Nv    (12)

where Nv is the total number of unit cells in the diffracting volume. For diffraction to occur from such a three-dimensional (3D) crystal, the following three conditions must be satisfied simultaneously to give constructive interference, that is, to have significant values of G(K):

    K·a1 = h,  K·a2 = k,  K·a3 = l    (13)
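The peak behavior just described can be seen numerically from one factor of Equation 11; N = 8 below matches Figure 3 but is otherwise an arbitrary choice:

```python
# Sketch of one 1D factor of Equation 11, |G|^2 = sin^2(pi N x)/sin^2(pi x)
# with x = K.a: the main maxima at integer x have height N^2, and the function
# first falls to zero at x = 1/N away from a maximum (peak width ~ 1/N).
import math

def interference_1d(x: float, N: int) -> float:
    """One factor of Equation 11 for N unit cells along one lattice vector."""
    s = math.sin(math.pi * x)
    if abs(s) < 1e-12:            # at a main maximum the limiting value is N^2
        return float(N ** 2)
    return math.sin(math.pi * N * x) ** 2 / s ** 2

N = 8
print(interference_1d(1.0, N))      # main maximum: N^2 = 64
print(interference_1d(1.0 / N, N))  # first zero away from the maximum (near 0)
```

Increasing N sharpens the peak (width ~ 1/N) while the height grows as N², which is exactly the trend summarized in Figure 3 and in the Ni → ∞ limit of Equation 12.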
where h, k, and l are integers. These three conditions are known as the Laue conditions. Obviously, for diffraction from a lower-dimensional crystal, one or two of the conditions are removed. The Laue conditions, Equation 13, indicate that scattering is described by sets of planes spaced h/a1, k/a2, and l/a3 apart and perpendicular to a1, a2, and a3, respectively. Therefore, diffraction from a one-dimensional (1D) crystal with a periodicity a would result in sheets of intensity perpendicular to the crystal direction and separated by a distance 1/a. For a two-dimensional (2D) crystal, the diffracted intensities would be distributed along rods normal to the crystal plane. In three dimensions
(3D), the Laue conditions define arrays of points that form the reciprocal lattice. The reciprocal lattice may be defined by means of three reciprocal-space lattice vectors that are, in turn, defined from the real-space primitive unit cell vectors as in Equation 14:

    bi = (aj × ak)/Va    (14)

where i, j, and k are permutations of the three integers 1, 2, and 3, and Va is the volume of the primitive unit cell constructed by a1, a2, and a3. There exists an orthonormal relationship between the real-space and the reciprocal-space lattice vectors, as in Equation 15:

    ai·bj = δij    (15)

where δij is the Kronecker delta function, which is defined as

    δij = 1  for i = j
    δij = 0  for i ≠ j    (16)

A reciprocal-space vector H can thus be expressed as a summation of reciprocal-space lattice vectors:

    H = h b1 + k b2 + l b3    (17)

where h, k, and l are integers. The magnitude of this vector, H, can be shown to be equal to the inverse of the interplanar spacing, dhkl. It can also be shown that the vector H satisfies the three Laue conditions (Equation 13). Consequently, the interference function in Equation 11 has significant values when the following condition is satisfied:

    K = H    (18)

This is the vector form of Bragg's law. As shown in Equation 18 and Figure 2, when the scattering vector K, defined according to the incident and diffracted beam directions and the associated wavelength, matches one of the reciprocal-space lattice vectors H, the interference function has significant value, thereby showing constructive interference: Bragg diffraction. It can be shown by taking the magnitudes of the two vectors H and K that the familiar scalar form of Bragg's law is recovered:

    2 dhkl sin θ = nλ,  n = 1, 2, ...    (19)

By combining Equations 12 and 8, we now conclude that when Bragg's law is met (i.e., when K = H), the diffracted intensity becomes

    I(K) = Nv |F(K)|²    (20)

Structure Factor

The structure factor, designated by the symbol F, is obtained by adding together all the waves scattered from one unit cell; it therefore displays the interference effect among all scattering centers within one unit cell. Certain extinction conditions may appear for a combination of h, k, and l values as a result of the geometrical arrangement of atoms or molecules within the unit cell. If a unit cell contains N atoms, with fractional coordinates xi, yi, and zi for the ith atom in the unit cell, then the structure factor for the hkl reflection is given by

    F(hkl) = Σ(i)^(N) fi e^(2πi(h xi + k yi + l zi))    (21)
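The reciprocal-lattice relations of Equations 14 and 15 lend themselves to a direct numerical check; the non-orthogonal cell below is an illustrative assumption, not an example from the text:

```python
# Sketch checking Equations 14 and 15: b_i = (a_j x a_k)/Va and the
# orthonormality a_i . b_j = delta_ij, for an arbitrary primitive cell.

def cross(u, v):
    return (u[1]*v[2] - u[2]*v[1], u[2]*v[0] - u[0]*v[2], u[0]*v[1] - u[1]*v[0])

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def reciprocal_vectors(a1, a2, a3):
    """Return b1, b2, b3 per Equation 14, with Va = a1 . (a2 x a3)."""
    Va = dot(a1, cross(a2, a3))
    return tuple(tuple(c / Va for c in cross(aj, ak))
                 for aj, ak in ((a2, a3), (a3, a1), (a1, a2)))

# Illustrative non-orthogonal cell (rows are a1, a2, a3)
a = ((2.0, 0.0, 0.0), (1.0, 2.0, 0.0), (0.0, 1.0, 3.0))
b = reciprocal_vectors(*a)
for i in range(3):
    print([round(dot(a[i], b[j]), 10) for j in range(3)])  # rows of the identity
```

The printed 3 × 3 array of dot products is the identity matrix, which is exactly the statement of Equations 15 and 16.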
where the summation extends over all the N atoms of the unit cell. The parameter F is generally a complex number and expresses both the amplitude and phase of the resultant wave. Its absolute value gives the magnitude of the diffracting power as given in Equation 20. Some examples of structure-factor calculations are given as follows:

1. For all primitive cells with one atom per lattice point, the coordinates for this atom are 0 0 0. The structure factor is

    F = f    (22)

2. For a body-centered cell with two atoms of the same kind, their coordinates are 0 0 0 and ½ ½ ½, and the structure factor is

    F = f [1 + e^(πi(h+k+l))]    (23)

This expression may be evaluated for any combination of h, k, and l integers. Therefore,

    F = 2f  when (h + k + l) is even
    F = 0   when (h + k + l) is odd    (24)

3. Consider a face-centered cubic (fcc) structure with identical atoms at x, y, z = 0 0 0, ½ ½ 0, ½ 0 ½, and 0 ½ ½. The structure factor is

    F = f [1 + e^(πi(h+k)) + e^(πi(k+l)) + e^(πi(l+h))]
      = 4f  for h, k, l all even or all odd
      = 0   for mixed h, k, and l    (25)
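The body-centered extinction rule of Equations 23 and 24 follows directly from evaluating Equation 21; in the sketch below the scattering factor f = 1.0 is an illustrative placeholder:

```python
# Sketch of Equation 21 applied to the body-centered cell of example 2:
# F = f[1 + exp(pi*i*(h+k+l))], so reflections with odd h+k+l are extinct.
import cmath

def structure_factor(atoms, hkl):
    """F(hkl) = sum_i f_i exp(2 pi i (h x_i + k y_i + l z_i)) (Equation 21)."""
    h, k, l = hkl
    return sum(f * cmath.exp(2j * cmath.pi * (h * x + k * y + l * z))
               for f, (x, y, z) in atoms)

bcc = [(1.0, (0.0, 0.0, 0.0)), (1.0, (0.5, 0.5, 0.5))]
print(abs(structure_factor(bcc, (1, 1, 0))))  # h+k+l even: |F| = 2f
print(abs(structure_factor(bcc, (1, 0, 0))))  # h+k+l odd: extinct, near 0
```

The same function reproduces the fcc rule of Equation 25 if the four fcc positions of example 3 are supplied instead.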
4. Zinc blende (ZnS) has a common structure that is found in many Group III-V compounds such as GaAs and InSb; there are four Zn and four S atoms per fcc unit cell, with the coordinates shown below:

    Zn: 0 0 0,  ½ ½ 0,  ½ 0 ½,  and 0 ½ ½
    S:  ¼ ¼ ¼,  ¾ ¾ ¼,  ¾ ¼ ¾,  and ¼ ¾ ¾

The structure factor may be reduced to

    F = [fZn + fS e^((πi/2)(h+k+l))] [1 + e^(πi(h+k)) + e^(πi(k+l)) + e^(πi(l+h))]    (26)
The second term is equivalent to the fcc conditions of Equation 25, so h, k, and l must be unmixed integers. The first term further modifies the structure factor to yield

\[ F = \begin{cases} f_{\mathrm{Zn}} + f_{\mathrm{S}} & \text{when } h+k+l = 4n,\ n \text{ an integer} \\ f_{\mathrm{Zn}} + i f_{\mathrm{S}} & \text{when } h+k+l = 4n+1 \\ f_{\mathrm{Zn}} - f_{\mathrm{S}} & \text{when } h+k+l = 4n+2 \\ f_{\mathrm{Zn}} - i f_{\mathrm{S}} & \text{when } h+k+l = 4n+3 \end{cases} \tag{27} \]

Scattering Power and Scattering Length

X rays are electromagnetic waves; they interact readily with the electrons in an atom. In contrast, neutrons scatter most strongly from nuclei. This difference in contrast origin results in different scattering powers between x rays and neutrons, even for the same species (see Chapter 13). For a stream of unpolarized, or randomly polarized, x rays scattered from one electron, the scattered intensity, I_e, is known as the Thomson scattering per electron:

\[ I_e = \frac{I_0\, e^4}{m^2 r^2 c^4}\; \frac{1 + \cos^2 2\theta}{2} \tag{28} \]

where I_0 is the incident beam flux, e is the electron charge, m is the electron mass, c is the speed of light, r is the distance from the scattering center to the detector position, and 2θ is the scattering angle (Fig. 2). The factor (1 + cos² 2θ)/2 is often referred to as the polarization factor. If the beam is fully or partially polarized, the total polarization factor will naturally be different. For instance, in synchrotron storage rings the x rays are linearly polarized in the plane of the ring. Therefore, if the diffraction plane containing the vectors s₀ and s in Figure 2 is normal to the storage ring plane, the polarization is unchanged during scattering.

Scattering of x rays from atoms is predominantly from the electrons in the atom. Because electrons in an atom do not assume fixed positions but rather are described by a wave function that satisfies the Schrödinger equation of quantum mechanics, the scattering power for x rays from an atom may be expressed by an integration of all waves scattered from these electrons, as represented by an electron density function ρ(r):

\[ f(\mathbf{K}) = \int_{\text{atom}} \rho(\mathbf{r})\, e^{2\pi i \mathbf{K}\cdot\mathbf{r}}\, dV_r \tag{29} \]

where dV_r is the volume increment and the integration is taken over the entire volume of the atom. The quantity f in Equation 29 is the scattering amplitude of an atom relative to that of a single electron. It is commonly known as the atomic scattering factor for x rays. The magnitude of f for different atomic species can be found in many text and reference books. There are dispersion corrections to be made to f. These include a real and an imaginary component: the real part is related to the bonding of the negatively charged electrons to the positively charged nucleus, whereas the imaginary part concerns the absorption effect. Thus the true atomic scattering factor should be written

\[ f = f_0 + f' + i f'' \tag{30} \]

Tabulated values for these correction terms, often referred to as the Hönl corrections, can be found in the International Tables for X-ray Crystallography (1996) and other references.

In conclusion, the intensity expressions shown in Equations 7, 8, and 20 are written in electron units, an absolute unit independent of the incident beam flux and polarization factor. These intensity expressions represent the fundamental forms of kinematic diffraction. Applications of these fundamental diffraction principles to several specific examples of scattering and diffraction are discussed in the following section.

PRACTICAL ASPECTS OF THE METHOD

Lattice defects may be classified as follows: (1) intrinsic defects, such as phonons and magnetic spins; (2) point defects, such as vacancies and substitutional and interstitial solutes; (3) linear defects, such as dislocations, 1D superlattices, and charge density waves; (4) planar defects, such as twins, grain boundaries, surfaces, and interfaces; and (5) volume defects, such as voids, inclusions, precipitate particles, and magnetic clusters. In this section, kinematically scattered x-ray diffuse intensity expressions will be presented and correlated to lattice defects. Specific examples include: (1) thermal diffuse scattering from phonons; (2) short-range ordering or clustering in binary alloys; (3) surface/interface diffraction for reconstruction and interface structure; and (4) small-angle x-ray scattering from nanometer-sized particles dispersed in an otherwise uniform matrix. Not included in the discussion is the most fundamental use of the Bragg peak intensities for the determination of crystal structure from single crystals and for the analysis of lattice parameter, particle size distribution, preferred orientation, residual stress, and so on, from powder specimens. Discussion of these topics may be found in X-RAY POWDER DIFFRACTION and in many excellent books [e.g., Azaroff and Buerger (1958), Buerger (1960), Cullity (1978), Guinier (1994), Klug and Alexander (1974), Krivoglaz (1969), Noyan and Cohen (1987), Schultz (1982), Schwartz and Cohen (1987), and Warren (1969)].
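For orientation, the magnitude of the Thomson term and of the polarization factor in Equation 28 can be evaluated numerically. The sketch below uses SI constants, so the Gaussian-units factor e⁴/(m²c⁴) of Equation 28 is expressed through the classical electron radius r_e = e²/(4πε₀mc²); the chosen input values are hypothetical:

```python
import math

def polarization_factor(two_theta_deg):
    """(1 + cos^2(2*theta)) / 2 for an unpolarized incident beam (Equation 28)."""
    return (1 + math.cos(math.radians(two_theta_deg)) ** 2) / 2

def thomson_intensity(I0, r, two_theta_deg):
    """Thomson scattering per electron, Equation 28, rewritten in SI units."""
    e = 1.602176634e-19      # electron charge, C
    m = 9.1093837015e-31     # electron mass, kg
    c = 2.99792458e8         # speed of light, m/s
    eps0 = 8.8541878128e-12  # vacuum permittivity, F/m
    # classical electron radius r_e = e^2 / (4*pi*eps0*m*c^2), ~2.82e-15 m
    re = e**2 / (4 * math.pi * eps0 * m * c**2)
    return I0 * (re / r) ** 2 * polarization_factor(two_theta_deg)

print(polarization_factor(0))    # 1.0 (forward scattering)
print(polarization_factor(90))   # 0.5 (minimum for an unpolarized beam)
# hypothetical geometry: unit flux, detector 1 m away, 2*theta = 90 degrees
print(thomson_intensity(1.0, 1.0, 90))
```

The tiny value of (r_e/r)² is the reason scattering from a single electron is so weak, and why the electron-unit normalization of Equations 7, 8, and 20 is convenient.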
KINEMATIC DIFFRACTION OF X RAYS

Thermal Diffuse Scattering (TDS)

At any finite temperature, the atoms making up a crystal do not stay stationary but vibrate in a cooperative manner, and the vibrational amplitude usually becomes larger at higher temperatures. Because of the periodic nature of crystals and the interconnectivity of an atomic network coupled by force constants, the vibration of an atom at a given position is related to the vibrations of others via atomic displacement waves (known as phonons) traveling through the crystal. The displacement of each atom is the sum total of the effects of these waves. Atomic vibration is considered an "imperfection" or "defect" that is intrinsic to the crystal and is present at all times. The scattering process for phonons is basically inelastic and involves energy transfer as well as momentum transfer. For x rays, however, the energy exchange in such an inelastic scattering process is only a few hundredths of an electron volt, much too small compared with the energy of the x-ray photons used (typically in the neighborhood of thousands of electron volts) to allow it to be conveniently separated from the elastically scattered x rays in a normal diffraction experiment. As a result, thermal diffuse x-ray scattering may be treated in either a quasielastic or an elastic manner. Such is not the case with thermal neutron scattering, where the energy resolution is sufficient to separate the inelastic scattering due to phonons from the elastic parts. In this section, we shall discuss thermal diffuse x-ray scattering only. The development of the scattering theory of the effect of thermal vibration on x-ray diffraction in crystals is associated primarily with Debye (1913a,b,c, 1913-1914), Waller (1923), Faxen (1918, 1923), and James (1948). The whole subject was brought together for the first time in the book by James (1948). Warren (1969), who adopted the approach of James, has written a comprehensive chapter on this subject, on which this section is based. What follows is a short summary of the formulations used in thermal diffuse x-ray scattering analysis. Examples of TDS applications may be found in Warren (1969) and in papers by Dvorack and Chen (1983) and by Takesue et al. (1997). The most familiar effect of thermal vibration is the reduction of the Bragg reflections by the well-known Debye-Waller factor. This effect may be seen from the structure factor calculation:
\[ F(\mathbf{K}) = \sum_m^{\text{u.c.}} f_m\, e^{2\pi i \mathbf{K}\cdot\mathbf{r}_m} \tag{31} \]

where the upper limit, u.c., means summation over the unit cell. Let r_m = r⁰_m + u_m(t), where r⁰_m represents the average location of the mth atom and u_m is the dynamic displacement, a function of time t. Thus,

\[ F(\mathbf{K}) = \sum_m^{\text{u.c.}} f_m\, e^{2\pi i \mathbf{K}\cdot\mathbf{r}^0_m}\, e^{2\pi i \mathbf{K}\cdot\mathbf{u}_m} \tag{32} \]

As a first approximation, the second exponential term in Equation 32 may be expanded in a Taylor series up to the second-order terms:

\[ e^{2\pi i \mathbf{K}\cdot\mathbf{u}_m} \approx 1 + 2\pi i\, \mathbf{K}\cdot\mathbf{u}_m - 2\pi^2 (\mathbf{K}\cdot\mathbf{u}_m)^2 + \cdots \tag{33} \]

A time average may be performed on Equation 33, as the measuring interval of a typical TDS experiment is much longer than the phonon vibrational period, so that

\[ \langle e^{2\pi i \mathbf{K}\cdot\mathbf{u}_m} \rangle \approx 1 - 2\pi^2 \langle(\mathbf{K}\cdot\mathbf{u}_m)^2\rangle + \cdots \tag{34} \]

In arriving at Equation 34, the linear average of the displacement field is set to zero, as is true for a random thermal vibration. Thus,

\[ \langle e^{2\pi i \mathbf{K}\cdot\mathbf{u}_m} \rangle \approx e^{-2\pi^2 \langle(\mathbf{K}\cdot\mathbf{u}_m)^2\rangle} = e^{-M_m} \tag{35} \]

where M_m = 2π²⟨(K·u_m)²⟩ is known as the Debye-Waller temperature factor for the mth atom. Therefore, the total scattering intensity, which is proportional to the square of the structure factor, reduces to

\[ I(\mathbf{K}) \propto \langle |F(\mathbf{K})|^2 \rangle \approx \left| \sum_m^{\text{u.c.}} f_m\, e^{-M_m}\, e^{2\pi i \mathbf{K}\cdot\mathbf{r}^0_m} \right|^2 \tag{36} \]

It now becomes obvious that thermal vibrations of atoms reduce the x-ray scattering intensities through the Debye-Waller temperature factor, exp(−M), in which M is proportional to the mean-squared displacement of a vibrating atom and is 2θ dependent. The effect of the Debye-Waller factor is to decrease the amplitude of a given Bragg reflection but to keep the diffraction profile unaltered. The above approximation assumed that each individual atom vibrates independently of the others; this is naturally incorrect, as correlated vibrations of atoms by way of lattice waves (phonons) are present in crystals. This cooperative motion of atoms must be included in the TDS treatment. A more rigorous approach, in accord with the TDS treatment of Warren (1969), is now described for a cubic crystal with one atom per unit cell. Starting with a general intensity equation expressed in electron units and defining the time-dependent dynamic displacement vector u_m, one obtains

\[ I(\mathbf{K}) = \left\langle \left| \sum_m^{\text{u.c.}} f_m\, e^{2\pi i \mathbf{K}\cdot\mathbf{r}_m} \right|^2 \right\rangle = \left\langle \sum_m^{\text{u.c.}} \sum_n^{\text{u.c.}} f_m f_n\, e^{2\pi i \mathbf{K}\cdot\mathbf{r}_{mn}} \right\rangle = \sum_m^{\text{u.c.}} \sum_n^{\text{u.c.}} |f_m f_n|\, e^{2\pi i \mathbf{K}\cdot\mathbf{r}^0_{mn}}\, \langle e^{2\pi i \mathbf{K}\cdot\mathbf{u}_{mn}} \rangle \tag{37} \]

in which r_mn = r_m − r_n, r⁰_mn = r⁰_m − r⁰_n, and u_mn = u_m − u_n. Therefore, the coupling between atoms is kept in the term u_mn. Again, the approximation is applied under the assumption of a small vibrational amplitude, so that a Taylor expansion may be used and the linear average set to zero:

\[ \langle e^{2\pi i \mathbf{K}\cdot\mathbf{u}_{mn}} \rangle \approx 1 - 2\pi^2 \langle(\mathbf{K}\cdot\mathbf{u}_{mn})^2\rangle \approx e^{-2\pi^2 \langle(\mathbf{K}\cdot\mathbf{u}_{mn})^2\rangle} = e^{-\langle P^2_{mn}\rangle/2} \tag{38} \]

where

\[ \langle P^2_{mn} \rangle \equiv 4\pi^2 \langle (\mathbf{K}\cdot\mathbf{u}_{mn})^2 \rangle \tag{39} \]
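The size of the Debye-Waller attenuation in Equation 36 is easily estimated. The sketch below assumes an isotropic displacement, so that ⟨(K·u)²⟩ = |K|²⟨u²⟩/3 with |K| = 2 sin θ/λ, which combined with M = 2π²⟨(K·u)²⟩ gives 2M = (16π²/3)⟨u²⟩(sin θ/λ)²; the numerical values are hypothetical:

```python
import math

def debye_waller_2M(u2, sin_theta_over_lambda):
    """2M = (16*pi**2/3) * <u^2> * (sin(theta)/lambda)**2 for isotropic vibration.

    Follows M = 2*pi**2 * <(K.u)**2> (Equation 35) with |K| = 2*sin(theta)/lambda
    and <(K.u)**2> = |K|**2 * <u^2> / 3 for an isotropic displacement field.
    """
    return (16 * math.pi**2 / 3) * u2 * sin_theta_over_lambda**2

# Hypothetical numbers: <u^2> = 0.01 A^2 and sin(theta)/lambda = 0.5 1/A
two_M = debye_waller_2M(0.01, 0.5)
attenuation = math.exp(-two_M)   # factor multiplying the Bragg intensity
print(two_M, attenuation)        # roughly 0.13 and 0.88
```

The attenuation grows rapidly with scattering angle, which is why high-angle reflections are the most sensitive probes of thermal motion.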
The coupling between atomic vibrations may be expressed by traveling sinusoidal lattice waves, the concept of "phonons." Each lattice wave may be represented by a wave vector g and a frequency ω_gj, in which the subscript j denotes the jth branch (j = 1, 2, 3) of the g lattice wave. Therefore, the total dynamic displacement of the nth atom is the sum over all lattice waves,

\[ \mathbf{u}_n = \sum_{g,j} \mathbf{u}_n(\mathbf{g}, j) \tag{40} \]

where

\[ \mathbf{u}_n(\mathbf{g}, j) = a_{gj}\, \mathbf{e}_{gj} \cos(\omega_{gj} t - 2\pi \mathbf{g}\cdot\mathbf{r}^0_n - \delta_{gj}) \tag{41} \]

and a_gj is the vibrational amplitude; e_gj is the unit vector of the vibrating direction, that is, the polarization vector, for the gj wave; g is the propagation wave vector; δ_gj is an arbitrary phase factor; ω_gj is the frequency; and t is the time. Thus, Equation 39 may be rewritten

\[ \langle P^2_{mn} \rangle = 4\pi^2 \left\langle \left[ \sum_{gj} \mathbf{K}\cdot a_{gj}\mathbf{e}_{gj} \cos(\omega_{gj} t - 2\pi \mathbf{g}\cdot\mathbf{r}^0_m - \delta_{gj}) - \sum_{g'j'} \mathbf{K}\cdot a_{g'j'}\mathbf{e}_{g'j'} \cos(\omega_{g'j'} t - 2\pi \mathbf{g}'\cdot\mathbf{r}^0_n - \delta_{g'j'}) \right]^2 \right\rangle \tag{42} \]

After some mathematical manipulation, Equation 42 reduces to

\[ \langle P^2_{mn} \rangle = \sum_{gj} (2\pi \mathbf{K}\cdot\mathbf{e}_{gj})^2 \langle a^2_{gj} \rangle \left[ 1 - \cos(2\pi \mathbf{g}\cdot\mathbf{r}^0_{mn}) \right] \tag{43} \]

Defining G_gj = ½(2πK·e_gj)²⟨a²_gj⟩ and Σ_gj G_gj = 2M causes the scattering equation for a single-element system to reduce to

\[ I(\mathbf{K}) = \sum_m^{\text{u.c.}} \sum_n^{\text{u.c.}} |f e^{-M}|^2\, e^{2\pi i \mathbf{K}\cdot\mathbf{r}^0_{mn}}\, \exp\!\left[ \sum_{gj} G_{gj} \cos(2\pi \mathbf{g}\cdot\mathbf{r}^0_{mn}) \right] \tag{44} \]

where the first term in the product is equivalent to Equation 36, which represents scattering from the average lattice (that is, Bragg reflections) modified by the Debye-Waller temperature factor. The phonon coupling effect is contained in the second term of the product. The Debye-Waller factor 2M is the sum of the G_gj, which is given by

\[ 2M = \sum_{gj} G_{gj} = \sum_{gj} \tfrac{1}{2} (2\pi \mathbf{K}\cdot\mathbf{e}_{gj})^2 \langle a^2_{gj} \rangle = \left( \frac{4\pi \sin\theta}{\lambda} \right)^2 \left[ \sum_{gj} \tfrac{1}{2} \langle a^2_{gj} \rangle \cos^2(\mathbf{K}, \mathbf{e}_{gj}) \right] \tag{45} \]

where the term in brackets is the mean-square displacement projected along the direction of the diffraction vector K. Again assuming a small vibrational amplitude, the second term in the product of Equation 44 may be expanded in a series:

\[ e^x \approx 1 + x + \frac{x^2}{2} + \frac{x^3}{6} + \cdots \tag{46} \]

Therefore, Equation 44 becomes

\[ I(\mathbf{K}) = \sum_m^{\text{u.c.}} \sum_n^{\text{u.c.}} |f e^{-M}|^2\, e^{2\pi i \mathbf{K}\cdot\mathbf{r}^0_{mn}} \left[ 1 + \sum_{gj} G_{gj} \cos(2\pi \mathbf{g}\cdot\mathbf{r}^0_{mn}) + \frac{1}{2} \sum_{gj} \sum_{g'j'} G_{gj} G_{g'j'} \cos(2\pi \mathbf{g}\cdot\mathbf{r}^0_{mn}) \cos(2\pi \mathbf{g}'\cdot\mathbf{r}^0_{mn}) + \cdots \right] \tag{47} \]

The first term in Equation 47, the zeroth-order thermal effect, is the Debye-Waller factor-modified Bragg scattering; it is followed by the first-order TDS, the second-order TDS, and so on. The first-order TDS is a one-phonon scattering process in which one phonon interacts with the x ray, resulting in an energy and momentum exchange. The second-order TDS involves the interaction of one photon with two phonons. The expression for the first-order TDS may be further simplified and related to lattice dynamics, as described below. Higher-order TDS terms (for which force constants are required) usually become rather difficult to handle. Fortunately, they become important only at high temperatures (e.g., near and above the Debye temperature). The first-order TDS intensity may be rewritten as follows:

\[ I_{\mathrm{1TDS}}(\mathbf{K}) = \frac{1}{2} f^2 e^{-2M} \sum_{gj} G_{gj} \left[ \sum_m \sum_n e^{2\pi i (\mathbf{K}+\mathbf{g})\cdot\mathbf{r}^0_{mn}} + \sum_m \sum_n e^{2\pi i (\mathbf{K}-\mathbf{g})\cdot\mathbf{r}^0_{mn}} \right] \tag{48} \]

To obtain Equation 48, the following equivalence was used:

\[ \cos(x) = \frac{e^{ix} + e^{-ix}}{2} \tag{49} \]

The two double summations in the square bracket have the form of the 3D interference function, the same as G(K) in Equation 11, with wave vectors K + g and K − g, respectively. The interference function has a significant value only when its vector argument, K + g or K − g in this case, equals a reciprocal lattice vector H(hkl). Consequently, the first-order TDS reduces to

\[ I_{\mathrm{1TDS}}(\mathbf{K}) = \frac{1}{2} f^2 e^{-2M} \sum_{gj} G_{gj} \left[ G(\mathbf{K}+\mathbf{g}) + G(\mathbf{K}-\mathbf{g}) \right] = \frac{1}{2} N_v f^2 e^{-2M} \sum_j G_{gj} \tag{50} \]
when K ± g = H, where N_v is the total number of atoms in the irradiated volume of the crystal. Approximations may be applied to G_gj to relate it to more meaningful and practical parameters. For example, the mean kinetic energy of the lattice waves is

\[ \mathrm{K.E.} = \frac{1}{2}\, m \sum_n \left\langle \left( \frac{d\mathbf{u}_n}{dt} \right)^2 \right\rangle \tag{50a} \]
in which the displacement term u_n has been given in Equations 40 and 41, and m is the mass of a vibrating atom. Taking the first derivative of Equation 40 with respect to time t, the kinetic energy (K.E.) becomes

\[ \mathrm{K.E.} = \frac{1}{4}\, m N \sum_{gj} \omega^2_{gj} \langle a^2_{gj} \rangle \tag{51} \]
The total energy of the lattice waves is the sum of the kinetic and potential energies. For a harmonic oscillator, as assumed in the present case, the total energy is equal to twice the kinetic energy. That is,

\[ E_{\mathrm{total}} = 2[\mathrm{K.E.}] = \frac{1}{2}\, m N \sum_{gj} \omega^2_{gj} \langle a^2_{gj} \rangle = \sum_{gj} \langle E_{gj} \rangle \tag{52} \]
At high temperatures, the phonon energy for each gj component may be approximated by

\[ \langle E_{gj} \rangle \approx kT \tag{53} \]

where k is the Boltzmann constant. Thus, from Equation 52 we have

\[ \langle a^2_{gj} \rangle = \frac{2 \langle E_{gj} \rangle}{m N \omega^2_{gj}} \approx \frac{2kT}{m N \omega^2_{gj}} \tag{54} \]
Substituting Equation 54 for the term ⟨a²_gj⟩ in Equation 50, we obtain the following expression for the first-order TDS intensity:

\[ I_{\mathrm{1TDS}}(\mathbf{K}) = f^2 e^{-2M}\, \frac{N k T}{m} \left( \frac{4\pi \sin\theta}{\lambda} \right)^2 \sum_{j=1}^{3} \frac{\cos^2(\mathbf{K}, \mathbf{e}_{gj})}{\omega^2_{gj}} \tag{55} \]
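The essential content of Equation 55 is the 1/ω² weighting together with the geometrical factor cos²(K, e_gj). A sketch in arbitrary units, with hypothetical branch frequencies and polarization vectors for a wave vector g along [100]:

```python
import math

def first_order_tds(K, modes, kT_prefactor=1.0):
    """Relative first-order TDS at scattering vector K per Equation 55:
    sum over branches of cos^2(K, e_gj) / omega_gj^2 (arbitrary units)."""
    K2 = sum(a * a for a in K)
    total = 0.0
    for omega, e in modes:
        dot = sum(a * b for a, b in zip(K, e))
        cos2 = dot * dot / (K2 * sum(a * a for a in e))  # cos^2(K, e)
        total += cos2 / omega**2
    return kT_prefactor * total

# Hypothetical modes for g along [100]: one longitudinal, two softer transverse
modes = [(3.0, (1, 0, 0)), (2.0, (0, 1, 0)), (2.0, (0, 0, 1))]

# K parallel to g: only the longitudinal wave couples (cos^2 = 1 for it)
print(first_order_tds((2.0, 0, 0), modes))    # 1/9, about 0.111

# K inclined to g: a transverse branch contributes as well, and being softer
# (omega = 2 < 3) it is weighted more strongly by the 1/omega^2 factor
print(first_order_tds((2.0, 1.0, 0), modes))
```

The same geometry underlies the special points discussed below, where the cos² factor suppresses all but selected branches.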
in which the scattering vector satisfies K ± g = H, and the cosine function is determined by the angle spanned by the scattering vector K and the phonon eigenvector e_gj. In a periodic lattice, there is no need to consider elastic waves with wavelengths less than a certain minimum value, because there are equivalent waves of longer wavelength; the concept of the Brillouin zone is applied to restrict the range of g. The significance of a measurement of the first-order TDS at various positions in reciprocal space may be observed in Figure 4, which represents the (hk0) section of the reciprocal space of a body-centered cubic (bcc) crystal.

Figure 4. The (hk0) section of the reciprocal space corresponding to a bcc single crystal. At the general point P, there is a contribution from three phonon modes to the first-order TDS. At position Q, there is a contribution only from [100] longitudinal waves. At point R, there is a contribution from both longitudinal and transverse [100] waves.

At point P, the first-order TDS intensity is due only to elastic waves with wave vector equal to g, and hence only to waves propagating in the direction of g. There are generally three independent waves for a given g; even in the general case, one is approximately longitudinal and the other two are approximately transverse. The cosine term appearing in Equation 55 may be considered a geometrical extinction factor, which can further modify the contribution from the various elastic waves with wave vector g. With an appropriate strategy, it is possible to separate the phonon wave contributions from the different branches; one such example may be found in Dvorack and Chen (1983). From Equation 55, it is seen that the first-order TDS may be calculated for any given reciprocal space location K, so long as the eigenvalues ω_gj and the eigenvectors e_gj of the phonon branches are known for the system. In particular, the lower-frequency phonon branches contribute most to the TDS, since the TDS intensity is inversely proportional to the square of the phonon frequencies. Quite often, the TDS pattern can be utilized to study soft-mode behavior or to identify the soft modes. The TDS intensity analysis is seldom carried out to determine the phonon dispersion curves, although such an analysis is possible (Dvorack and Chen, 1983); it requires making the measurements in absolute units and separating the TDS intensities from the different phonon branches. Neutron inelastic scattering techniques are much more common when it comes to the determination of phonon dispersion relationships. With the advent of high-brilliance synchrotron radiation facilities with milli-electron-volt or better energy resolution, it is now also possible to perform inelastic x-ray scattering experiments.

The second- and higher-order TDS might be appreciable for crystal systems showing soft modes, or close to or above the Debye temperature. The contribution of second-order TDS represents the interaction of two phonon wave vectors with the x rays, and it can be calculated if the phonon dispersion relationship is known. The higher-order TDS can be significant and must be accounted for in the diffuse scattering analysis in some cases.

Figure 5. Equi-intensity contour maps on the (100) plane of a cubic BaTiO3 single crystal at 200°C: calculated first-order TDS in (A), second-order TDS in (B), the sum of (A) and (B) in (C), and observed TDS intensities in (D).

Figure 5 shows the calculated first- and second-order TDS along with the measured intensities for a BaTiO3 single crystal in its paraelectric cubic phase (Takesue et al., 1997). The calculated TDS pattern shows the general features present in the observed data, but a discrepancy exists near the Brillouin zone center, where the measured TDS is higher than the calculation. This discrepancy is attributed to the overdamped phonon modes that are known to exist in BaTiO3 due to anharmonicity.

Local Atomic Arrangement—Short-Range Ordering

A solid solution is thermodynamically defined as a single phase existing over a range of composition and temperature; it may exist over the full composition range of a binary system, be limited to a range near one of the pure constituents, or be based on some intermetallic compound. It is not required, however, that the atoms be distributed randomly on the lattice sites; some degree of atomic ordering or segregation is the rule rather than the exception. The local atomic correlation in the absence
of long-range order is the focus of interest in the present context. The mere presence of a second species of atom, the solute atoms, requires that scattering from a solid solution produce a component of diffuse scattering throughout reciprocal space, in addition to the fundamental Bragg reflections. This component of diffuse scattering is modulated by the way the solute atoms are dispersed on and about the lattice sites, and hence contains a wealth of information. An elegant theory has evolved that allows one to treat this problem quantitatively within certain approximations, as have related techniques for visualizing and characterizing the real-space, locally ordered atomic structure. More recently, it has been shown that pairwise interaction energies can be obtained from diffuse scattering studies on alloys at equilibrium. These energies offer great promise in allowing one to do realistic kinetic Ising modeling to understand how, for example, supersaturated solid solutions decompose. An excellent, detailed review of the theory and practice of the diffuse scattering method for studying local atomic order, predating the quadratic approximation, has been given by Sparks and Borie (1966). More recent reviews of this topic have been given by Chen et al. (1979) and by Epperson et al. (1994). In this section, the scattering principles for the extraction of pairwise interaction energies
are outlined for a binary solid solution showing local order. Readers may find more detailed experimental procedures and applications in XAFS SPECTROMETRY. This section is written in terms of x-ray experiments, since x rays have been used for most local order diffuse scattering investigations to date; however, neutron diffuse scattering is in reality a complementary method. Within the kinematic approximation, the coherent scattering from a binary solid-solution alloy with species A and B is given in electron units by

\[ I_{eu}(\mathbf{K}) = \sum_p \sum_q f_p f_q\, e^{i\mathbf{K}\cdot(\mathbf{R}_p - \mathbf{R}_q)} \tag{56} \]
where f_p and f_q are the atomic scattering factors of the atoms located at sites p and q, respectively, and (R_p − R_q) is the instantaneous interatomic vector. The interatomic vector can be written as

\[ (\mathbf{R}_p - \mathbf{R}_q) = \langle \mathbf{R}_p - \mathbf{R}_q \rangle + (\boldsymbol{\delta}_p - \boldsymbol{\delta}_q) \tag{57} \]

where δ_p and δ_q are vector displacements from the average lattice sites. The angle brackets ⟨⟩ indicate an average over time and space. Thus,

\[ I_{eu}(\mathbf{K}) = \sum_p \sum_q \langle f_p f_q\, e^{i\mathbf{K}\cdot(\boldsymbol{\delta}_p - \boldsymbol{\delta}_q)} \rangle\, e^{i\langle \mathbf{K}\cdot(\mathbf{R}_p - \mathbf{R}_q) \rangle} \tag{58} \]
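Equation 56 can be evaluated directly for a small model crystal. The sketch below sums the scattered amplitude for a random one-dimensional A-B chain (hypothetical scattering factors and composition) and recovers, away from the Bragg positions, the well-known Laue monotonic level X_A X_B (f_A − f_B)² per atom expected for a random solution:

```python
import cmath
import random

def diffuse_intensity(fs, Kx, positions):
    """Direct evaluation of Equation 56 (electron units), per atom."""
    amp = sum(f * cmath.exp(1j * Kx * x) for f, x in zip(fs, positions))
    return abs(amp) ** 2 / len(fs)

random.seed(0)
fA, fB, xA = 2.0, 1.0, 0.3        # hypothetical scattering factors, composition
N = 4000
species = [fA if random.random() < xA else fB for _ in range(N)]
positions = list(range(N))        # ideal 1D lattice, no static displacements

# Average I(K)/N over K values far from the Bragg positions (K = 2*pi*n):
# for a random solution this approaches xA*xB*(fA - fB)^2.
Ks = [0.5 + 0.001 * i for i in range(200)]
avg = sum(diffuse_intensity(species, K, positions) for K in Ks) / len(Ks)
laue = xA * (1 - xA) * (fA - fB) ** 2
print(avg, laue)   # comparable values, of order 0.2
```

Any deviation of the sampled level from the flat Laue monotonic would signal local order, which is exactly what the CW formalism below quantifies.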
In essence, the problem in treating local order diffuse scattering is to evaluate the factor ⟨f_p f_q e^{iK·(δ_p−δ_q)}⟩, taking into account all possible combinations of atom pairs: AA, AB, BA, and BB. The modern theory and study of local atomic order diffuse scattering had their origins in the classical work of Cowley (1950), in which the displacements were set to zero. Experimental observations by Warren et al. (1951), however, soon demonstrated the necessity of accounting for the atomic displacement effect, which tends to shift the local order diffuse maxima from positions of cosine symmetry in reciprocal space. Borie (1961) showed that a linear approximation of the exponential containing the displacements allows one to separate the local order and static atomic displacement contributions by making use of the fact that the various components of diffuse scattering have different symmetries in reciprocal space. This approach was extended to a quadratic approximation of the atomic displacements by Borie and Sparks (1971). All earlier diffuse scattering measurements were made using this separation method. Tibbals (1975) later argued that the theory could be cast so as to allow inclusion of the reciprocal space variation of the atomic scattering factors. This is included in the state-of-the-art formulation by Auvray et al. (1977), which is outlined here. Generally, for a binary substitutional alloy one can write
\[ \langle f_p f_q\, e^{i\mathbf{K}\cdot(\boldsymbol{\delta}_p - \boldsymbol{\delta}_q)} \rangle = X_A P^{AA}_{pq} f_A^2 \langle e^{i\mathbf{K}\cdot(\boldsymbol{\delta}^A_p - \boldsymbol{\delta}^A_q)} \rangle + X_A P^{BA}_{pq} f_A f_B \langle e^{i\mathbf{K}\cdot(\boldsymbol{\delta}^B_p - \boldsymbol{\delta}^A_q)} \rangle + X_B P^{AB}_{pq} f_A f_B \langle e^{i\mathbf{K}\cdot(\boldsymbol{\delta}^A_p - \boldsymbol{\delta}^B_q)} \rangle + X_B P^{BB}_{pq} f_B^2 \langle e^{i\mathbf{K}\cdot(\boldsymbol{\delta}^B_p - \boldsymbol{\delta}^B_q)} \rangle \tag{59} \]
where X_A and X_B are the atom fractions of species A and B, respectively, and P^{AB}_{pq} is the conditional probability of finding an A atom at site p provided there is a B atom at site q, and so on. There are certain relationships among the conditional probabilities for a binary substitutional solid solution:

\[ X_A P^{BA}_{pq} = X_B P^{AB}_{pq} \tag{60} \]
\[ P^{AA}_{pq} + P^{BA}_{pq} = 1 \tag{61} \]
\[ P^{BB}_{pq} + P^{AB}_{pq} = 1 \tag{62} \]

If one also introduces the Cowley-Warren (CW) order parameter (Cowley, 1950),

\[ \alpha_{pq} = 1 - \frac{P^{BA}_{pq}}{X_B} \tag{63} \]
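The identities in Equations 60 to 63 mean that a single α_pq fixes all four pair probabilities; a minimal sketch (composition and α values hypothetical):

```python
def pair_probabilities(xA, alpha):
    """Conditional pair probabilities for a binary alloy from the CW
    parameter, via Equations 60-63."""
    xB = 1 - xA
    P_BA = xB * (1 - alpha)   # B at p given A at q, inverted Equation 63
    P_AA = 1 - P_BA           # Equation 61
    P_AB = xA * P_BA / xB     # Equation 60: xA*P_BA = xB*P_AB
    P_BB = 1 - P_AB           # Equation 62
    return P_AA, P_BA, P_AB, P_BB

# alpha < 0: preference for unlike neighbors (short-range order);
# alpha > 0: preference for like neighbors (clustering)
P_AA, P_BA, P_AB, P_BB = pair_probabilities(xA=0.25, alpha=-0.1)
print(P_AA, P_BA, P_AB, P_BB)

# alpha = 0 recovers the random solution: P_BA = xB and P_AB = xA
print(pair_probabilities(0.25, 0.0))   # (0.25, 0.75, 0.25, 0.75)
```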
Equation 59 reduces to

\[ \langle f_p f_q\, e^{i\mathbf{K}\cdot(\boldsymbol{\delta}_p - \boldsymbol{\delta}_q)} \rangle = (X_A^2 + X_A X_B \alpha_{pq})\, f_A^2 \langle e^{i\mathbf{K}\cdot(\boldsymbol{\delta}^A_p - \boldsymbol{\delta}^A_q)} \rangle + 2 X_A X_B (1 - \alpha_{pq})\, f_A f_B \langle e^{i\mathbf{K}\cdot(\boldsymbol{\delta}^B_p - \boldsymbol{\delta}^A_q)} \rangle + (X_B^2 + X_A X_B \alpha_{pq})\, f_B^2 \langle e^{i\mathbf{K}\cdot(\boldsymbol{\delta}^B_p - \boldsymbol{\delta}^B_q)} \rangle \tag{64} \]
If one makes series expansions of the exponentials and retains only quadratic and lower-order terms, it follows that

\[ I_{eu}(\mathbf{K}) = \sum_p \sum_q (X_A f_A + X_B f_B)^2\, e^{i\mathbf{K}\cdot\mathbf{R}_{pq}} + \sum_p \sum_q X_A X_B (f_A - f_B)^2\, \alpha_{pq}\, e^{i\mathbf{K}\cdot\mathbf{R}_{pq}} \]
\[ + \sum_p \sum_q \Big[ (X_A^2 + X_A X_B \alpha_{pq}) f_A^2 \langle i\mathbf{K}\cdot(\boldsymbol{\delta}^A_p - \boldsymbol{\delta}^A_q) \rangle + 2 X_A X_B (1 - \alpha_{pq}) f_A f_B \langle i\mathbf{K}\cdot(\boldsymbol{\delta}^B_p - \boldsymbol{\delta}^A_q) \rangle + (X_B^2 + X_A X_B \alpha_{pq}) f_B^2 \langle i\mathbf{K}\cdot(\boldsymbol{\delta}^B_p - \boldsymbol{\delta}^B_q) \rangle \Big]\, e^{i\mathbf{K}\cdot\mathbf{R}_{pq}} \]
\[ - \frac{1}{2} \sum_p \sum_q \Big[ (X_A^2 + X_A X_B \alpha_{pq}) f_A^2 \big\langle [\mathbf{K}\cdot(\boldsymbol{\delta}^A_p - \boldsymbol{\delta}^A_q)]^2 \big\rangle + 2 X_A X_B (1 - \alpha_{pq}) f_A f_B \big\langle [\mathbf{K}\cdot(\boldsymbol{\delta}^B_p - \boldsymbol{\delta}^A_q)]^2 \big\rangle + (X_B^2 + X_A X_B \alpha_{pq}) f_B^2 \big\langle [\mathbf{K}\cdot(\boldsymbol{\delta}^B_p - \boldsymbol{\delta}^B_q)]^2 \big\rangle \Big]\, e^{i\mathbf{K}\cdot\mathbf{R}_{pq}} \tag{65} \]
where e^{iK·R_pq} denotes e^{i⟨K·(R_p−R_q)⟩}. The first double summation represents the fundamental Bragg reflections for the average lattice. The second summation is the atomic-order-modulated Laue monotonic, the term of primary interest here. The third sum is the so-called first-order atomic displacement term, and it is purely static in nature. The final double summation is the second-order atomic displacement term and contains both static and dynamic contributions. A detailed derivation would show that the second-order displacement series does not converge to zero. Rather, it represents a loss of intensity by the Bragg reflections; this is how TDS and Huang scattering originate. Henceforth, we shall use the term second-order displacement
scattering to denote this component, which is redistributed away from the Bragg positions. Note in particular that the second-order displacement component represents additional intensity, whereas the first-order size effect scattering represents only a redistribution that averages to zero. However, the quadratic approximation may not be adequate to account for the thermal diffuse scattering in a given experiment, especially for elevated-temperature measurements or for systems showing a soft phonon mode. The experimental temperature in comparison to the Debye temperature of the alloy is a useful guide for judging the adequacy of the quadratic approximation. For cubic alloys that exhibit only local ordering (i.e., short-range ordering or clustering), it is convenient to replace the double summations by N times single sums over lattice sites, now specified by triplets of integers (lmn) that denote occupied sites in the lattice; N is the number of atoms irradiated by the x-ray beam. One can express the average interatomic vector as

\[ \langle \mathbf{R}_{lmn} \rangle = l\mathbf{a}_1 + m\mathbf{a}_2 + n\mathbf{a}_3 \tag{66} \]
where a1, a2, and a3 are orthogonal vectors parallel to the cubic unit cell edges. The continuous variables (h1, h2, h3) in reciprocal space are related to the scattering vector by

\[ \mathbf{K} = \frac{2\pi}{\lambda}(\mathbf{S} - \mathbf{S}_0) = 2\pi (h_1\mathbf{b}_1 + h_2\mathbf{b}_2 + h_3\mathbf{b}_3) \tag{67} \]

where b1, b2, and b3 are the reciprocal space lattice vectors as defined in Equation 14. The coordinates used here are those conventionally employed in diffuse scattering work and are chosen in order that the occupied sites can be specified by a triplet of integers. Note that the 200 Bragg position becomes 100, and so on, in this notation. It is also convenient to represent the vector displacements in terms of components along the respective real-space axes:

\[ (\boldsymbol{\delta}^A_p - \boldsymbol{\delta}^A_q) \equiv \boldsymbol{\delta}^{AA}_{pq} = X^{AA}_{lmn}\mathbf{a}_1 + Y^{AA}_{lmn}\mathbf{a}_2 + Z^{AA}_{lmn}\mathbf{a}_3 \tag{68} \]

If one invokes the symmetry of the cubic lattice and simplifies the various expressions, the coherently scattered diffuse intensity that is observable becomes, in the quadratic approximation of the atomic displacements,

\[ \frac{I_D(h_1, h_2, h_3)}{N X_A X_B (f_A - f_B)^2} = \sum_l \sum_m \sum_n \alpha_{lmn} \cos 2\pi(h_1 l + h_2 m + h_3 n) \]
\[ + h_1 \eta Q^{AA}_x + h_1 \xi Q^{BB}_x + h_2 \eta Q^{AA}_y + h_2 \xi Q^{BB}_y + h_3 \eta Q^{AA}_z + h_3 \xi Q^{BB}_z \]
\[ + h_1^2 \eta^2 R^{AA}_x + 2 h_1^2 \eta \xi R^{AB}_x + h_1^2 \xi^2 R^{BB}_x + h_2^2 \eta^2 R^{AA}_y + 2 h_2^2 \eta \xi R^{AB}_y + h_2^2 \xi^2 R^{BB}_y + h_3^2 \eta^2 R^{AA}_z + 2 h_3^2 \eta \xi R^{AB}_z + h_3^2 \xi^2 R^{BB}_z \]
\[ + h_1 h_2 \eta^2 S^{AA}_{xy} + 2 h_1 h_2 \eta \xi S^{AB}_{xy} + h_1 h_2 \xi^2 S^{BB}_{xy} + h_1 h_3 \eta^2 S^{AA}_{xz} + 2 h_1 h_3 \eta \xi S^{AB}_{xz} + h_1 h_3 \xi^2 S^{BB}_{xz} + h_2 h_3 \eta^2 S^{AA}_{yz} + 2 h_2 h_3 \eta \xi S^{AB}_{yz} + h_2 h_3 \xi^2 S^{BB}_{yz} \tag{69} \]

where

\[ \eta = \frac{f_A}{f_A - f_B} \tag{70} \]
\[ \xi = \frac{f_B}{f_A - f_B} \tag{71} \]

The Q_i functions, which describe the first-order size effect scattering component, result from simplifying the third double summation in Equation 65 and are of the form

\[ Q^{AA}_x = 2\pi \sum_l \sum_m \sum_n \left( \frac{X_A}{X_B} + \alpha_{lmn} \right) \langle X^{AA}_{lmn} \rangle \sin 2\pi h_1 l \cos 2\pi h_2 m \cos 2\pi h_3 n \tag{72} \]

where ⟨X^{AA}_{lmn}⟩ is the mean component of displacement, relative to the average lattice, in the x direction of the A atom at site lmn when the site at the local origin is also occupied by an A-type atom. The second-order atomic displacement terms obtained by simplification of the fourth double summation in Equation 65 are given by expressions of the type

\[ R^{AA}_x = 4\pi^2 \sum_l \sum_m \sum_n \left( \frac{X_A}{X_B} + \alpha_{lmn} \right) \langle X^A_o X^A_{lmn} \rangle \cos 2\pi h_1 l \cos 2\pi h_2 m \cos 2\pi h_3 n \tag{73} \]

and

\[ S^{AB}_{xy} = 8\pi^2 \sum_l \sum_m \sum_n \left( \frac{X_A}{X_B} + \alpha_{lmn} \right) \langle X^A_o Y^A_{lmn} \rangle \sin 2\pi h_1 l \sin 2\pi h_2 m \cos 2\pi h_3 n \tag{74} \]
In Equations 73 and 74, the terms in angle brackets represent correlations of atomic displacements. For example, ⟨X^A_o Y^A_{lmn}⟩ represents the mean component of displacement in the Y direction of an A-type atom at a vector distance lmn from an A-type atom at the local origin. The first summation in Equation 69 (I_SRO) contains the statistical information about the local atomic ordering of primary interest; it is a 3D Fourier cosine series whose coefficients are the CW order parameters. The first term in this series (α_000) is a measure of the integrated local order diffuse intensity and, provided the data are normalized by the Laue monotonic unit X_A X_B (f_A − f_B)², should have the value of unity. A schematic representation of the various contributions of the I_SRO, Q, and R/S components to the total diffuse intensity along an [h00] direction is shown in Figure 6 for a system showing short-range clustering. As one can see, besides sharp Bragg peaks, there are I_SRO components concentrated near the fundamental Bragg peaks due to local clustering, oscillating diffuse intensity due to static displacements (Q), and TDS-like intensity (R and S) near the tails of the fundamental reflections. Each of these diffuse-intensity components can be separated and analyzed to reveal the local structure and the associated static displacement fields.

Figure 6. Schematic representation of the various contributions to diffuse x-ray scattering along an [h00] direction in reciprocal space, from an alloy with short-range ordering and displacement. Fundamental Bragg reflections have all even integers; the other sharp peak locations represent superlattice peaks when the system becomes ordered.

The coherent diffuse scattering thus consists of 25 components for the cubic binary substitutional alloy, each of which possesses a distinct functional dependence on the reciprocal space variables. This fact permits the components to be separated. To effect the separation, intensity measurements are made at a set of reciprocal lattice points referred to as the "associated set." These associated points follow from a suggestion of Tibbals (1975) and are selected according to crystallographic symmetry rules such that the corresponding 25 functions (I_SRO, Q, R, and S) in Equation 69 have the same absolute value. Note, however, that the intensities at the associated points need not be the same, because the functions are multiplied by various combinations of h_i, η, and ξ. An extended discussion of this topic has been given by Schwartz and Cohen (1987). An associated set is defined for each reciprocal lattice point in the required minimum volume for the local order component of diffuse scattering, and the corresponding intensities must be measured in order that the desired separation can be carried out. The theory outlined above has heretofore been used largely for studying short-range-ordering alloys (preference for unlike nearest neighbors); however, the theory is equally valid for alloy systems that undergo clustering (preference for like nearest neighbors), or even a combination of the two. If clustering occurs, local order diffuse scattering will be distributed near the fundamental Bragg positions, including the zeroth-order diffraction; that is, small-angle scattering (SAS) will be observed. Because of the more localized nature of the order diffuse scattering, the analysis is usually carried out with a rather different formalism; however, Hendricks and Borie (1965) considered some important aspects using the atomistic approach and the CW formalism.
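The Fourier-series relationship between the CW parameters and the short-range-order intensity (the first sum in Equation 69) can be illustrated in one dimension; the α values below are hypothetical:

```python
import math

def i_sro(h1, alphas):
    """1D analogue of the first sum in Equation 69:
    I_SRO(h1) = sum_l alpha_l * cos(2*pi*h1*l), in Laue monotonic units."""
    return sum(a * math.cos(2 * math.pi * h1 * l) for l, a in enumerate(alphas))

# Hypothetical CW parameters: alpha_0 = 1 (required), with alternating signs
# typical of short-range ORDER (unlike nearest neighbors preferred)
alphas = [1.0, -0.15, 0.08, -0.04]

# Ordering-type alphas pile diffuse intensity near the superlattice position
# h1 = 0.5 rather than near the fundamental position h1 = 0; clustering-type
# alphas (all positive) would do the opposite.
print(i_sro(0.5, alphas))   # 1 + 0.15 + 0.08 + 0.04 = 1.27
print(i_sro(0.0, alphas))   # 1 - 0.15 + 0.08 - 0.04 = 0.89
```

Inverting measured I_SRO by the same cosine transform is what yields the experimental α_lmn, which in turn feed the pairwise-interaction analyses discussed below.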
In some cases, both short-range ordering and clustering may coexist, as in the example of Anderson and Chen (1994), who used synchrotron x rays to investigate the short-range-order structure of an Au-25 at.% Fe single crystal at room temperature. Two heat treatments were investigated: aging at 400°C for 2 days and at 440°C for 5 days, both preceded by solution treatment in the single-phase field and water quenching to room temperature. Evolution of the SRO structure with aging was determined by fitting two sets of Cowley-Warren SRO parameters to a pair of 140,608-atom models. The microstructures, although quite disordered, showed a trend with aging toward an increasing volume fraction of an Fe-enriched and an Fe-depleted environment, indicating that short-range ordering and clustering coexist in this system. The Fe-enriched environment displayed a preference for Fe segregation to the {110} and {100} fcc matrix planes. A major portion of the Fe-depleted environment was found to contain elements (and variations of these elements) of the D1a ordered superstructure. The SRO contained in the Fe-depleted environment may best be described in terms of the standing wave packet model. This study was the first to provide a quantitative real-space view of the atomic arrangement in the spin-glass system Au-Fe.

Surface/Interface Diffraction

Surface science is a subject that has grown enormously in the last few decades, partly because of the availability of new electron-based tools. X-ray diffraction has also contributed to many advances in the field, particularly when synchrotron radiation is used. Interface science, on the other hand, is still in its infancy as far as structural analysis is concerned. Relatively crude techniques, such as dissolution and erosion of one-half of an interface, exist but have limited application.
Surfaces and interfaces may be considered a form of defect, because the uniform nature of a bulk crystal is abruptly terminated there, so that the properties of surfaces and interfaces often differ significantly from those of the bulk. In spite of the critical role they play in such diverse fields as catalysis, tribology, metallurgy, and electronic devices, and despite the expected richness of the 2D physics of melting, magnetism, and related phase transitions, only a few surface structures are known, and most of those only semiquantitatively (e.g., their symmetry; Somorjai, 1981). Our inability in many cases to determine atomic structure, and hence to make the structure/properties connection, in the 2D region of surfaces and interfaces has significantly inhibited progress in this rich area of science. X-ray diffraction has been an indispensable tool for 3D materials structure characterization despite the relatively low scattering cross-section of x-ray photons compared with electrons, but the smaller number of atoms involved at surfaces and interfaces has made structural experiments at best difficult and in most cases impossible. The advent of high-intensity synchrotron radiation sources has definitely facilitated surface/interface x-ray diffraction. The nondestructive nature of the technique together
COMPUTATION AND THEORETICAL METHODS
The term ‘‘interface’’ usually refers to the case when two bulk media of the same or different material are in contact, as Figure 7C shows. Either one or both may be crystalline, and therefore interfaces include grain boundaries as well. Rearrangement of atoms at interfaces may occur, giving rise to unique 2D diffraction patterns. By and large, the diffraction principles for scattering from surfaces or interfaces are considered identical. Consequently, the following discussion applies to both cases.
Figure 7. Real-space and reciprocal-space views of an ideal crystal surface reconstruction. (A) A single monolayer with twice the periodicity in one direction, producing featureless 2D Bragg rods whose reciprocal-space periodicity is halved in that direction. The grid in reciprocal space corresponds to a bulk (1 × 1) cell. (B) A (1 × 1) bulk-truncated crystal and the corresponding crystal truncation rods (CTRs). (C) An ideal reconstruction combining features from (A) and (B); note the overlap of one-half of the monolayer or surface rods with the bulk CTRs. In general, 2D Bragg rods arising from a surface periodicity unrelated to the bulk (1 × 1) cell in size or orientation will not overlap with the CTRs.
with its high penetration power and negligible effect due to multiple scattering should make x-ray diffraction a premier method for quantitative surface and interface structural characterization (Chen, 1996). Up to this point we have considered diffraction from 3D crystals based upon the fundamental kinematic scattering theory laid out in the section on Diffraction from a Crystal. For diffraction from surfaces or interfaces, modifications need to be made to the intensity formulas that we shall discuss below. Schematic pictures after Robinson and Tweet (1992) illustrating 2D layers existing at surfaces and interfaces are shown in Figure 7; there are three cases for consideration. Figure 7A is the case where an ideal 2D monolayer exists, free from interference of any other atoms. This case is hard to realize in nature. The second case is more realistic and is the one that most surface scientists are concerned with: the case of a truncated 3D crystal on top of which lies a 2D layer. This top layer could have a structure of its own, or it could be a simple continuation of the bulk structure with minor modifications. This top layer could also be of a different element or elements from the bulk. The surface structure may sometimes involve arrangement of atoms in more than one atomic layer, or may be less than one monolayer thick.
Rods from 2D Diffraction. Diffraction from the 2D structures in the above three cases can be described using Equations 8 through 12 and Equation 20. If we take a₃ to be along the surface/interface normal, the isolated monolayer is a 2D crystal with N₃ = 1. Consequently, one of the Laue conditions is relaxed; that is, there is no constraint on K·a₃, the component of momentum transfer perpendicular to the surface, and the diffraction is independent of it. As a result, in 3D reciprocal space the diffraction pattern from this 2D structure consists of rods perpendicular to the surface, as depicted in Figure 7A. Each rod is a line of scattering extending out to infinity along the surface-normal direction, but sharp in the two directions parallel to the surface. For the surface of a 3D crystal, the diffuse rods resulting from the scattering of the 2D surface structure will connect the discrete Bragg peaks of the bulk. If surface/interface reconstruction occurs, new diffuse rods will appear; these do not always run through the bulk Bragg peaks, as in the case shown in Figure 7C. The determination of a 2D structure can, in principle, be made by following the same methods developed for 3D crystals. The important point is that one has to scan across the diffuse rods; that is, the scattering vector K must lie in the plane of the surface, the commonly known "in-plane" scan. Only through such measurements can the total integrated intensities, after resolution function correction and background subtraction, be utilized for structure analysis. The grazing-incidence x-ray diffraction technique was developed to accomplish this goal (see SURFACE X-RAY DIFFRACTION). Other techniques, such as specular reflectivity and the standing-wave method, can also be utilized to aid in the determination of surface structure, surface roughness, and composition variation.
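The relaxation of the third Laue condition for a monolayer can be checked directly from the one-dimensional interference function (the N₃ factor in the interference function of Equation 11). The sketch below shows that with N₃ = 1 the scattered intensity is the same for any value of K·a₃, i.e., a featureless rod, while the in-plane factors with large N₁, N₂ remain sharply peaked:

```python
import math

def interference_1d(x, n):
    """1D interference function |G(x)|^2 = sin^2(pi*n*x) / sin^2(pi*x)
    for n unit cells, where x = K.a is the momentum transfer in units of
    the reciprocal cell edge; equals n^2 at the Laue condition (integer x)."""
    if abs(x - round(x)) < 1e-12:
        return float(n * n)
    return math.sin(math.pi * n * x) ** 2 / math.sin(math.pi * x) ** 2

# With N3 = 1 along the surface normal, the intensity is the same for every
# value of K.a3: a featureless Bragg rod perpendicular to the surface.
rod = [interference_1d(x, 1) for x in (0.05, 0.37, 0.81, 1.60)]
print(rod)  # [1.0, 1.0, 1.0, 1.0]

# In the two in-plane directions (large N1, N2) the peaks remain sharp:
print(interference_1d(0.0, 400), interference_1d(0.05, 400))
```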
Figure 7C represents schematically the diffraction pattern from the corresponding structure consisting of a 2D reconstructed layer on top of a 3D bulk crystal. We have simply superimposed the 3D bulk crystal diffraction pattern, in the form of localized Bragg peaks (dots), on the Bragg diffraction rods deduced from the 2D structure. One should be reminded that extra reflections, that is, extra rods, can occur if the 2D surface structure differs from that of the bulk. For a 2D structure involving one layer of atoms and one unit cell in thickness, the Bragg diffraction rods, if normalized against the decaying nature of the atomic scattering factors, are flat in intensity and extend to infinity in reciprocal space. When the 2D surface structure has a thickness of more than one unit cell, we are dealing with a pseudo-2D structure, or very thin layer, and the Bragg diffraction rods will no longer be flat in their intensity profiles but instead fade away monotonically
KINEMATIC DIFFRACTION OF X RAYS
from the zeroth-order plane normal to the sample surface in reciprocal space. The distance to which the diffraction rods extend is inversely dependent on the thickness of the thin layer.

Crystal Truncation Rods. In addition to the rods originating from the 2D structure, there is one other kind of diffuse rod contributing to the observed diffraction pattern that has a totally different origin: the abrupt termination of the underlying bulk single-crystal substrate, which produces the so-called crystal truncation rods (CTRs). This contribution further complicates the diffraction pattern, but it is rich in information concerning the surface termination sequence, relaxation, and roughness; therefore, it must be considered. The CTR intensity profiles are not flat but vary in ways determined by the detailed atomic arrangement and static displacement fields near the surface, as well as by the topology of the surface. The CTRs are always perpendicular to the surface of the substrate bulk single crystal and run through all Bragg peaks of the bulk and the surface. Therefore, for an inclined surface whose normal is not parallel to any crystallographic direction, CTRs do not connect all Bragg peaks, as shown in Figure 7B. Let us consider the interference function, Equation 11, along the surface-normal (a₃) direction. The numerator, sin²(πN₃ K·a₃), is an extremely rapidly varying function of K, at least for large N₃, and is in any case smeared out in a real experiment because of finite resolution. Since it is always positive, we can approximate it by its average value of 1/2. This gives a simpler form in the limit of large N₃ that is actually independent of N₃:

  |G₃(K)|² = 1 / [2 sin²(π K·a₃)]   (75)
Although the approximation is not useful at any of the Bragg peaks defined by the three Laue conditions, it does tell us that the intensity between Bragg peaks is actually nonzero along the surface-normal direction, giving rise to the CTRs. Another way of looking at CTRs comes from convolution theory. From the kinematic scattering theory presented earlier, we understand that the scattering cross-section is the product of two functions of the reciprocal-space vector K: the structure factor F(K) and the interference function G(K). This implies, in real space, that the scattering cross-section is related to a convolution of two real-space structural functions: one defining the positions of all atoms within one unit cell and the other covering all lattice points. For a crystal abruptly terminated at a well-defined surface, the crystal is semi-infinite, which can be represented by the product of a step function with an infinite lattice. The diffraction pattern is then, by Fourier transformation, the convolution of the reciprocal lattice with the function (2π K·a₃)⁻¹. It was originally shown by von Laue (1936), and more recently by Andrews and Cowley (1985) in a continuum approximation, that the external surface can thus give rise to streaks emanating from each Bragg peak of the bulk,
perpendicular to the terminating crystal surface. This is what we now call the CTRs. It is important to make the distinction between CTRs passing through bulk reciprocal lattice points and rods due to an isolated monolayer 2D structure at the surface. Both can exist together in the same sample, especially when the surface layer does not maintain lattice correspondence with the bulk crystal substrate. To illustrate the differences and similarities of the two cases, the following equations may be used to represent the rod intensities of the two kinds:

  I_2D = I₀ N₁N₂ |F(K)|²   (76)

  I_CTR = I₀ N₁N₂ |F(K)|² / [2 sin²(π K·a₃)]   (77)
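As a rough numerical comparison of Equations 76 and 77 (a sketch; the common prefactor I₀N₁N₂|F(K)|² is set to unity):

```python
import math

def i_2d(f2, l):
    """Eq. 76: monolayer rod intensity (per I0*N1*N2), flat in K.a3 = l."""
    return f2

def i_ctr(f2, l):
    """Eq. 77: crystal truncation rod intensity (per I0*N1*N2)."""
    return f2 / (2 * math.sin(math.pi * l) ** 2)

f2 = 1.0  # |F(K)|^2, arbitrary units

# Midway between bulk Bragg peaks (K.a3 = 0.5) the CTR is comparable to a
# single-monolayer rod:
print(i_ctr(f2, 0.5) / i_2d(f2, 0.5))  # 0.5

# Close to a Bragg peak (K.a3 = 0.99) the CTR is orders of magnitude stronger:
print(i_ctr(f2, 0.99) / i_2d(f2, 0.99))
```

This reproduces the statement in the text: the two kinds of rod have the same order-of-magnitude intensity in the valleys between Bragg peaks, while the CTR diverges (kinematically) as a bulk Bragg condition is approached.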
The two kinds of rod have the same order-of-magnitude intensity in the "valley" regions far from the Bragg peaks, which occur at integer values K·a₃ = l. The actual intensity observed there in a real experiment is several orders of magnitude weaker than at the Bragg peaks. For the 2D rods, integrated intensities at various (hk) reflections can be measured and Fourier inverted to reveal the real-space structure of the 2D ordering. Patterson function analysis and difference Patterson function analysis are commonly utilized, along with least-squares fitting, to obtain the structure information. For the CTRs, the stacking sequences and displacements of atomic layers near the surface, as well as the surface roughness factor, and so on, can be modeled through calculation of the structure factor in Equation 77. Experimental techniques and applications of surface/interface diffraction to various materials problems may be found in SURFACE X-RAY DIFFRACTION. Some of our own work may be found in the studies of buried semiconductor surfaces by Aburano et al. (1995) and by Hong et al. (1992a,b, 1993, 1996) and in the determination of the terminating stacking sequence of c-plane sapphire by Chung et al. (1997).

Small-Angle Scattering

The term "small-angle scattering" (SAS) is somewhat ambiguous as long as the sample, type of radiation, and incident wavelength are not specified. Clearly, Bragg reflections of all crystals investigated with high-energy radiation (e.g., γ rays) occur at small scattering angles (small 2θ) simply because the wavelength of the probing radiation is short. Conversely, crystals with large lattice constants can lead to small Bragg angles for a reasonable wavelength of the radiation used. These Bragg reflections, although they appear at small angles, can be treated in essentially the same way as the large-angle Bragg reflections, with their origins laid out in all previous sections.
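The point about small Bragg angles can be checked from the Bragg condition λ = 2d sin θ. The numbers below are illustrative choices only (0.03 Å for a hard γ ray, 1.54 Å for Cu Kα):

```python
import math

def two_theta_deg(wavelength, d):
    """First-order Bragg scattering angle 2*theta in degrees,
    from lambda = 2 d sin(theta)."""
    return 2 * math.degrees(math.asin(wavelength / (2 * d)))

# Hard gamma ray (illustrative 0.03 A) on an ordinary 2 A lattice plane:
print(two_theta_deg(0.03, 2.0))    # well under 1 degree
# Cu K-alpha x rays (1.54 A) on a large-period structure (100 A spacing):
print(two_theta_deg(1.54, 100.0))  # also under 1 degree
# The same x rays on the 2 A plane scatter at a conventional wide angle:
print(two_theta_deg(1.54, 2.0))
```

Both routes, short wavelength or large lattice constant, push Bragg reflections below one degree, yet these remain ordinary Bragg peaks rather than the diffuse SAS discussed next.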
However, in the more specific sense of the term, SAS is a scattering phenomenon related to the scattering properties at small scattering vectors K (with magnitude K = 2 sin θ/λ), or, in other words, diffuse scattering surrounding the direct beam. It is this form of diffuse SAS that is the center of discussion in this section. SAS is produced by the variation of scattering length density over distances exceeding the normal interatomic distances in condensed systems. Aggregates of small
particles (e.g., carbon black and catalysts) in air or vacuum, particles or macromolecules in liquid or solid solution (e.g., polymers and precipitates in alloys), and systems with smoothly varying concentration (or scattering length density) profiles (e.g., macromolecules, glasses, and spinodally decomposed systems) can be investigated with SAS methods. SAS intensity appears at low K values; that is, K should be small compared with the smallest reciprocal lattice vector in crystalline substances. Because the scattering intensity is related to the Fourier transform properties, as shown in Equation 7, it follows that measurements at low K will not allow one to resolve structural details in real space over distances smaller than d_min ≈ π/K_max, where K_max is the maximum value accessible in the SAS experiment. If, for example, K_max = 0.2 Å⁻¹, then d_min ≈ 16 Å, and the discrete arrangement of scattering centers in condensed matter can in most cases be replaced by a continuous distribution of scattering length, averaged over volumes of about d_min³. Consequently, summations over discrete scattering sites as represented in Equation 7 and subsequent equations can be replaced by integrals. If we replace the scattering length f_j by a locally averaged scattering length density ρ(r), where r is a continuously variable position vector, Equation 7 can be rewritten

  I(K) = |∫_V ρ(r) e^(2πi K·r) d³r|²   (78)
where the integration extends over the sample volume V. The scattering length density may vary over distances of the order of d_min as indicated earlier, and it is sometimes useful to express

  ρ(r) = Δρ(r) + ρ₀   (79)

where ρ₀ is averaged over a volume larger than the resolution volume of the instrument (determined by the minimum observable value of K). Therefore, by discounting the Bragg peak, the diffuse intensity originating from inhomogeneities is

  I(K) = |∫_V Δρ(r) e^(2πi K·r) d³r|²   (80)

Two-Phase Model. Let the sample contain N_p particles with a homogeneous scattering length density ρ_p, and let these particles be embedded in a matrix of homogeneous scattering length density ρ_m. From Equation 80, one obtains for the SAS intensity per atom

  I_a(K) = (1/N) |ρ_p − ρ_m|² |∫_V e^(2πi K·r) d³r|²   (81)

where N is the total number of atoms in the scattering volume and the integral extends over the volume V occupied by all particles in the irradiated sample. In the most general case, the above integral contains spatial and orientational correlations among particles, as well as effects due to size distributions. For a monodispersed system free of particle correlation, the single-particle form factor is

  F_p(K) = (1/V_p) ∫_(V_p) e^(2πi K·r) d³r   (82)

where V_p is the particle volume, so that F_p(0) = 1; we can then write for N_p identical particles

  I_a(K) = (N_p V_p²/N) |ρ_p − ρ_m|² |F_p(K)|²   (83)

The interference (correlation) term in Equation 81 that we have neglected to arrive at Equation 83 is the Fourier transform Φ(K) of the static pair correlation function,

  Φ(K) = (1/N_p) Σ_(i≠j) e^(2πi K·(r_i − r_j))   (84)

where r_i and r_j are the position vectors of the centers of particles labeled i and j. This function will be zero for all nonzero K values only if the interparticle distance distribution is completely random, as is approximately the case in very dilute systems. Equation 84 is also valid for oriented anisotropic particles if they are all identically oriented. In the more frequent cases of a random orientational distribution, or of discrete but multiple orientations of anisotropic particles, the appropriate averages of |F_p(K)|² have to be used.

Scattering Functions for Special Cases. Many different particle form factors have been calculated by Guinier and Fournet (1955), some of which are reproduced below for the isotropic and uncorrelated distribution, i.e., a spherically random distribution of identical particles.

Spheres. For a system of noninteracting identical spheres of radius R_S, the form factor is

  F_s(KR_S) = 3[sin(2πKR_S) − 2πKR_S cos(2πKR_S)] / [(2πK)³ R_S³]   (85)

Ellipsoids. Ellipsoids of revolution of axes 2a, 2a, and 2av yield the following form factor:

  |F_e(K)|² = ∫₀^(π/2) |F_s(2πKav / √(sin²α + v²cos²α))|² cos β dβ   (86)

where F_s is the function in Equation 85 and α = tan⁻¹(v tan β).

Cylinders. For a cylinder of diameter 2a and height 2H, the form factor becomes

  |F_c(K)|² = ∫₀^(π/2) [sin²(2πKH cos β) / ((2πK)²H² cos²β)] · [4J₁²(2πKa sin β) / ((2πK)²a² sin²β)] sin β dβ   (87)
where J₁ is the first-order Bessel function of the first kind. Porod (see Guinier and Fournet, 1955) has given an approximate form of Equation 87, valid for KH ≫ 1 and a ≪ H, which in the intermediate range Ka < 1 reduces to

  |F_c(K)|² ≈ [π/(4πKH)] e^(−(2πK)²a²/4)   (88)

For infinitesimally thin rods of length 2H, one can write

  |F_rod(K)|² = Si(4πKH)/(2πKH) − sin²(2πKH)/[(2πK)²H²]   (89)

where

  Si(x) = ∫₀^x (sin t / t) dt   (90)

For the case KH ≫ 1, Equation 89 reduces to

  |F_rod(K)|² ≈ 1/(4KH)   (91)

For flat disks (i.e., when H ≪ a), the scattering function for KH ≪ 1 is

  |F_disk(K)|² = [2/((2πK)²a²)] [1 − J₁(4πKa)/(2πKa)]   (92)

where J₁ is again the first-order Bessel function. For KH < 1 ≪ Ka, Equation 92 reduces to

  |F_disk(K)|² ≈ [2/((2πK)²a²)] e^(−4π²K²H²/3)   (93)

The expressions given above are isotropic averages for particles of various shapes. When preferred alignment of particles occurs, these expressions must be modified.

General Properties of the SAS Function. Some general behavior of the scattering functions shown above is described below.

Extrapolation to K = 0. If the measured scattering curve can be extrapolated to the origin of reciprocal space (i.e., K = 0), one obtains from Equation 83 a value for the factor V_p² N_p (ρ_p − ρ_m)²/N, which, for the case of N_p = 1, ρ_m = 0, and ρ_p = Nf/V_p, reduces to

  I_a(0) = Nf²   (94)

For a system of scattering particles with known contrast and size, Equation 94 will yield N, the total number of atoms in the scattering volume. In the general case of unknown V_p, N_p, and (ρ_p − ρ_m), the results at K = 0 have to be combined with information obtained from other parts of the SAS curve.

Guinier Approximation. Guinier has shown that at small values of Ka, where a is a linear dimension of the particles, the scattering function is approximately related to a simple geometrical parameter called the radius of gyration, R_G. For small angles, K ≈ 2θ/λ, the scattering amplitude, as shown in Equation 80, may be expressed by a Taylor expansion up to the quadratic term:

  A = ∫_v Δρ e^(2πi K·r) d³r ≈ Δρ [∫_v d³r + 2πi ∫_v K·r d³r − (4π²/2) ∫_v (K·r)² d³r]   (95)

If one chooses the center of gravity of the diffracting object as its origin, the second term is zero. The first term is the volume V of the object times Δρ. The third integral is the second moment of the diffracting object, related to R_G:

  K²R_G² = (1/V) ∫_v (K·r)² d³r   (96)

Thus the scattering amplitude in Equation 95 becomes

  A ≈ ΔρV − 2π² ΔρV R_G²K² ≈ ΔρV e^(−2π²R_G²K²)   (97)

and, for the SAS intensity for n independent, but identical, objects,

  I(K) = AA* ≈ n (Δρ)² V² e^(−4π²R_G²K²)   (98)

Equation 98 implies that in the small-angle approximation, that is, for small K or small 2θ, the intensity can be approximated by a Gaussian function of K². By plotting ln I(K) versus K² (known as the Guinier plot), a linear relationship is expected at small K, with a slope proportional to R_G²; R_G is also commonly referred to as the Guinier radius. The radius of gyration of a homogeneous particle has been defined in Equation 96. For a sphere of radius R_s, R_G = (3/5)^(1/2) R_s, and the Gaussian form of the SAS intensity function, as shown in Equation 98, coincides with the correct expression, Equations 83 and 85, up to the term proportional to K⁴. Subsequent terms in the two series expansions are in fair agreement, and corresponding terms have the same sign. For this case, the Guinier approximation is acceptable over a wide range of KR_G. For the oblate rotational ellipsoid with v = 0.24 and the prolate one with v = 1.88, the Guinier approximation coincides with the expansion of the scattering function even up to K⁶. In general, the concept of the radius of gyration is applicable to particles of any shape, but the K range where this parameter can be identified may vary with shape.

Porod Approximation. For homogeneous particles with sharp boundaries and a surface area A_p, Porod (see Guinier and Fournet, 1955) has shown that for large K

  I(K) ≈ 2πA_p (ρ_p − ρ_m)² / (2πK)⁴   (99)
describes the average decrease of the scattering function. Damped oscillations about this average curve may occur in systems with very uniform particle size.
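The sphere results can be verified numerically. The sketch below implements Equation 85 and checks the two limiting regimes, using the textbook Guinier convention I/I(0) ≈ exp[−(2πK)²R_G²/3] with R_G² = (3/5)R_s², and the Porod prediction that (2πK)⁴I(K) oscillates about a bounded constant at large K:

```python
import math

def f_sphere(k, r):
    """Eq. 85: form factor of a sphere of radius r at scattering-vector
    magnitude k (k = 2 sin(theta)/lambda), normalized so F(0) = 1."""
    x = 2 * math.pi * k * r
    if x < 1e-8:
        return 1.0
    return 3 * (math.sin(x) - x * math.cos(x)) / x ** 3

R = 10.0  # sphere radius, arbitrary length units

# Guinier regime (small K): I/I(0) ~ exp(-(2*pi*K)^2 * Rg^2 / 3), Rg^2 = 0.6 R^2
k = 0.004
exact = f_sphere(k, R) ** 2
guinier = math.exp(-(2 * math.pi * k) ** 2 * (0.6 * R * R) / 3)
print(exact, guinier)  # agree to better than 0.1% at this K*R

# Porod regime (large K): (2*pi*K)^4 * I(K) stays bounded, oscillating
# about a constant of order 9/R^4 for a single sphere:
tail = [(2 * math.pi * k) ** 4 * f_sphere(k, R) ** 2 for k in (0.53, 0.71, 0.97)]
print(tail)
```

The damped oscillations of the tail values about the Porod average are exactly the behavior noted above for systems with very uniform particle size.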
Integrated Intensity (The Small-Angle Invariant). Integration of the SAS intensity over all K values yields an invariant, Q. For a two-phase model system, this quantity is

  Q = V² C_p (1 − C_p)(ρ_p − ρ_m)²   (100)
where C_p is the volume fraction of the dispersed particles. What is noteworthy here is that this quantity enables one to determine either C_p or (ρ_p − ρ_m) if the other is known. Generally, the scattering contrast (ρ_p − ρ_m) is known or can be estimated, and thus measurement of the invariant permits a determination of the volume fraction of the dispersed particles.

Interparticle Interference Function. We deliberately neglected the interparticle interference term (cf. Equation 84) to obtain Equation 83; its applicability is therefore restricted to very dilute systems, typically N_pV_p/V < 0.01. As long as the interparticle distance remains much larger than the particle size, it will be possible to identify single-particle scattering properties in a somewhat restricted range, as interference effects will affect the scattering at lower K values only. In dense systems, however, one approaches the case of macromolecular liquids, and both single-particle and interparticle effects must be considered over the whole K range of interest. For randomly oriented identical particles of arbitrary shape, interference effects can be included by writing (cf. Equations 83 and 84)

  I(K) ∝ ⟨|F_p(K)|²⟩ − |⟨F_p(K)⟩|² + |⟨F_p(K)⟩|² W_i(K)   (101)

where the angle brackets ⟨···⟩ denote an average over all directions of K, and the interference function is

  W_i(K) = Φ(K) + 1   (102)

with Φ(K) the function given in Equation 84. The parameter W_i(K) is formally identical to the liquid structure factor, and there is no fundamental difference in the treatment of the two. It is possible to introduce thermodynamic relationships if one defines an interaction potential for the scattering particles. For applications in the solid state, hard-core interaction potentials with an adjustable interaction range exceeding the dimensions of the particle may be used to rationalize interparticle interference effects. It is also possible to model interference effects by assuming a specific model, or by using a statistical approach. As Φ(K) → 0 for large K, interference between particles is most prominently observed at the lower K values of the SAS curve. For spherical particles, the first two terms in Equation 101 are equal, so their difference vanishes, and the scattering cross-section, or the intensity, becomes

  I(K) = C₁ |F_s(KR_s)|² W_i(K; C_s)   (103)
where C1 is a constant factor appearing in Equation 83, Fs is the single-particle scattering form factor of a sphere, and
W_i is the interference function for rigid spheres at concentration C_s = N_pV_p/V. Interference effects become progressively more important with increasing C_s. At large C_s values, the SAS curve shows a peak characteristic of the interference function used. When particle interference modifies the SAS profile, the linear portion of the Guinier plot usually becomes inaccessible at small K. Therefore, any straight line found in a Guinier plot at relatively large K is accidental and gives less reliable information about particle size.

Size Distribution. Quite frequently, a distribution of particle sizes also complicates the interpretation of SAS patterns, and the single-particle characteristics such as R_G, L_p, V_p, A_p, and so on, defined previously for identical particles, have to be replaced by appropriate averages over the size distribution function. In many cases, particle interference and size distribution appear simultaneously, so that the SAS profile is modified by both effects. Simple expressions for the scattering from a group of nonidentical particles can only be expected if interparticle interference is neglected. By generalizing Equation 83, one can write for the scattering of a random system of nonidentical particles without orientational correlation
  I(K) = (1/N) Σ_v N_pv V_pv² ρ_v² ⟨|F_pv(K)|²⟩   (104)
where v labels particles with a particular size parameter and the angle brackets indicate orientational averaging. If the Guinier approximation is valid for even the largest particles in a size distribution, an experimental radius of gyration determined from the low-K end of the scattering curve in the Guinier representation will correspond to the largest sizes in the distribution. The Guinier plot will show positive curvature, similar to the scattering function of nonspherical particles. There is obviously no unique way to deduce the size distribution of particles of unknown shape from the measured scattering profile, although it is much easier to calculate the cross-section for a given model. For spherical particles, several attempts have been made to obtain the size distribution function, or certain characteristics of it, experimentally; but even under these simplified conditions, wide distributions are difficult to determine.
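The bias of an apparent Guinier radius toward the largest particles can be illustrated with Equation 104 for uncorrelated spheres. This is a sketch with a hypothetical bimodal distribution and all contrasts set to unity:

```python
import math

def f_sphere(k, r):
    """Eq. 85 sphere form factor, normalized so F(0) = 1."""
    x = 2 * math.pi * k * r
    if x < 1e-8:
        return 1.0
    return 3 * (math.sin(x) - x * math.cos(x)) / x ** 3

def i_poly(k, sizes):
    """Eq. 104 for uncorrelated spheres of unit contrast: sum over size
    classes of N_v * V_v^2 * |F_v(K)|^2 (the 1/N factor is omitted)."""
    total = 0.0
    for r, n in sizes:
        v = (4.0 / 3.0) * math.pi * r ** 3
        total += n * v * v * f_sphere(k, r) ** 2
    return total

# Hypothetical bimodal distribution: many small spheres plus a few large ones.
sizes = [(10.0, 1000), (30.0, 10)]

# Apparent Guinier radius from the low-K slope: Rg^2 = -3 dln(I)/d(q^2), q = 2*pi*K
k1, k2 = 1e-4, 2e-4
q1sq, q2sq = (2 * math.pi * k1) ** 2, (2 * math.pi * k2) ** 2
slope = (math.log(i_poly(k2, sizes)) - math.log(i_poly(k1, sizes))) / (q2sq - q1sq)
rg_apparent = math.sqrt(-3 * slope)
print(rg_apparent)  # ~22: far above the number-average radius (~10.2),
                    # because the V^2 weighting favors the largest particles
```

Even though the large spheres are only 1% of the particles by number, the V_p² weighting in Equation 104 makes them dominate the low-K intensity and the apparent Guinier radius.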
ACKNOWLEDGMENTS This chapter is dedicated to Professor Jerome B. Cohen of Northwestern University, who passed away suddenly on November 7, 1999. The author received his education on crystallography and diffraction under the superb teaching of Professor Cohen. The author treasures his over 27 years of collegial interaction and friendship with Jerry. The author also wishes to acknowledge J. B. Cohen, J. E. Epperson, J. P. Anderson, H. Hong, R. D. Aburano, N. Takesue, G. Wirtz, and T. C. Chiang for their direct or
indirect discussions, collaborations, and/or teaching over the past 25 years. The preparation of this unit is supported in part by the U.S. Department of Energy, Office of Basic Energy Science, under contract No. DEFH02-96ER45439, and in part by the state of Illinois Board of Higher Education, under a grant number NWU98 IBHE HECA through the Frederick Seitz Materials Research Laboratory at the University of Illinois at Urbana-Champaign.
LITERATURE CITED

Aburano, R. D., Hong, H., Roesler, J. M., Chung, K., Lin, D.-S., Chen, H., and Chiang, T.-C. 1995. Boundary structure determination of Ag/Si(111) interfaces by x-ray diffraction. Phys. Rev. B 52(3):1839–1847.

Anderson, J. P. and Chen, H. 1994. Determination of the short-range order structure of Au-25 at.% Fe using wide-angle diffuse synchrotron x-ray scattering. Metall. Mater. Trans. 25A:1561–1573.

Auvray, X., Georgopoulos, P., and Cohen, J. B. 1981. The structure of G.P.1 zones in Al-1.7 at.% Cu. Acta Metall. 29:1061–1075.

Azaroff, L. V. and Buerger, M. J. 1958. The Powder Method in X-ray Crystallography. McGraw-Hill, New York.

Borie, B. S. 1961. The separation of short range order and size effect diffuse scattering. Acta Crystallogr. 14:472–474.

Borie, B. S. and Sparks, C. J., Jr. 1971. The interpretation of intensity distributions from disordered binary alloys. Acta Crystallogr. A27:198–201.

Buerger, M. J. 1960. Crystal Structure Analysis. John Wiley & Sons, New York.

Chen, H. 1996. Review of surface/interface x-ray diffraction. Mater. Chem. Phys. 43:116–125.

Chen, H., Comstock, R. J., and Cohen, J. B. 1979. The examination of local atomic arrangements associated with ordering. Annu. Rev. Mater. Sci. 9:51–86.

Chung, K. S., Hong, H., Aburano, R. D., Roesler, J. M., Chiang, T. C., and Chen, H. 1997. Interface structure of Cu thin films on c-plane sapphire using x-ray truncation rod analysis. In Proceedings of the Symposium on Applications of Synchrotron Radiation to Materials Science III, Vol. 437. San Francisco, Calif.

Cowley, J. M. 1950. X-ray measurement of order in single crystals of Cu3Au. J. Appl. Phys. 21:24–30.

Cullity, B. D. 1978. Elements of X-ray Diffraction. Addison-Wesley, Reading, Mass.

Debye, P. 1913a. Über den Einfluß der Wärmebewegung auf die Interferenzerscheinungen bei Röntgenstrahlen. Verh. Deutsch. Phys. Ges. 15:678–689.

Debye, P. 1913b. Über die Intensitätsverteilung in den mit Röntgenstrahlen erzeugten Interferenzbildern. Verh. Deutsch. Phys. Ges. 15:738–752.

Debye, P. 1913c. Spektrale Zerlegung der Röntgenstrahlung mittels Reflexion und Wärmebewegung. Verh. Deutsch. Phys. Ges. 15:857–875.

Debye, P. 1913–1914. Interferenz von Röntgenstrahlen und Wärmebewegung. Ann. Phys. Ser. 4, 43:49.

Faxén, H. 1918. Die bei Interferenz von Röntgenstrahlen durch die Wärmebewegung entstehende zerstreute Strahlung. Ann. Phys. 54:615–620.

Faxén, H. 1923. Die bei Interferenz von Röntgenstrahlen infolge der Wärmebewegung entstehende Streustrahlung. Z. Phys. 17:266–278.

Guinier, A. 1994. X-ray Diffraction in Crystals, Imperfect Crystals, and Amorphous Bodies. Dover Publications, New York.

Guinier, A. and Fournet, G. 1955. Small-Angle Scattering of X-rays. John Wiley & Sons, New York.
Hendricks, R. W. and Borie, B. S. 1965. On the Determination of the Metastable Miscibility Gap From Integrated Small-Angle X-Ray Scattering Data. In Proc. Symp. On Small Angle X-Ray Scattering (H. Brumberger, ed.) pp. 319–334. Gordon and Breach, New York. Hong, H., Aburano, R. D., Chung, K., Lin, D.-S., Hirschorn, E. S., Chiang, T.-C., and Chen, H. 1996. X-ray truncation rod study of Ge(001) surface roughening by molecular beam homoepitaxial growth. J. Appl. Phys. 79:6858–6864. Hong, H., Aburano, R. D., Hirschorn, E. S., Zschack, P., Chen, H., and Chiang, T. C. 1993. Interaction of (1!2)-reconstructed Si(100) and Ag(110):Cs surfaces with C60 overlayers. Phys. Rev. B 47:6450–6454. Hong, H., Aburano, R. D., Lin, D. S., Chiang, T. C., Chen, H., Zschack, P., and Specht, E. D. 1992b. Change of Si(111) surface reconstruction under noble metal films. In MRS Proceeding Vol. 237 (K. S. Liang, M. P. Anderson, R. J. Bruinsma and G. Scoles, eds.) pp. 387–392. Materials Research Society, Warrendale, Pa. Hong, H., McMahon, W. E., Zschack, P., Lin, D. S., Aburano, R. D., Chen, H., and Chiang, T.C. 1992a. C60 Encapsulation of the Si(111)-(7!7) Surface. Appl. Phys. Lett. 61(26):3127– 3129. International Table for Crystallography 1996. International Union of Crystallography: Birmingham, England. James, R. W. 1948. Optical Principles of the Diffraction of X-rays. G. Bell and Sons, London. Klug, H. P. and Alexander, L. E. 1974. X-ray Diffraction Procedures. John Wiley & Sons, New York. Krivoglaz, M. A. 1969. Theory of X-ray and Thermal-Neutron Scattering by Real Crystals. Plenum, New York. Noyan, I. C. and Cohen, J. B. 1987. Residual Stress: Measurement by Diffraction and Interpretation. Springer-Verlag, New York. Robinson, I. K. and Tweet, D. J. 1992. Surface x-ray diffraction. Rep. Prog. Phys. 55:599–651. Schultz, J. M. 1982. Diffraction for Materials Science. PrenticeHall, Englewood Cliffs, N.J. Schwartz, L. H. and Cohen, J. B. 1987. Diffraction from Materials. 
Springer-Verlag, New York. Somorjai, G. A. 1981. Chemistry in Two Dimensions: Surfaces. Cornell University Press, Ithaca, N.Y. Sparks, C. J. and Borie, B. S. 1966. Methods of analysis for diffuse X-ray scatterling modulated by local order and atomic displacements. In Local Atomic Arrangement Studied by X-ray Diffraction (J. B. Cohen and J. E. Hilliard, eds.) pp. 5–50. Gordon and Breach, New York.
Dvorack, M. A. and Chen, H. 1983. Thermal diffuse x-ray scattering in b-phase Cu-Al-Ni alloy. Scr. Metall. 17:131–134.
Takesue, N., Kubo, H., and Chen, H. 1997. Thermal diffuse X-ray scattering study of anharmonicity in cubic barium titanate. J. Nucl. Instr. Methods Phys. Res. B133:28–33.
Epperson, J. E., Anderson, J. P., and Chen, H. 1994. The diffusescattering method for investigating locally ordered binary solid solution. Metal. Mater. Trans. 25A:17–35.
Tibbals, J. E. 1975. The separation of displacement and substitutional disorder scattering: a correction from structure-factor ratio variation. J. Appl. Crystallogr. 8:111–114.
COMPUTATION AND THEORETICAL METHODS
von Laue, M. 1936. Die äußere Form der Kristalle in ihrem Einfluß auf die Interferenzerscheinungen an Raumgittern. Ann. Phys. 5(26):55–68.

Waller, I. 1923. Zur Frage der Einwirkung der Wärmebewegung auf die Interferenz von Röntgenstrahlen. Z. Phys. 17:398–408.

Warren, B. E. 1969. X-Ray Diffraction. Addison-Wesley, Reading, Mass.

Warren, B. E., Averbach, B. L., and Roberts, B. W. 1951. Atomic size effect in the x-ray scattering in alloys. J. Appl. Phys. 22(12):1493–1496.
KEY REFERENCES

Cullity, 1978. See above.

The purpose of this book is to acquaint the reader who has little or no previous knowledge of the subject with the theory of x-ray diffraction, the experimental methods involved, and the main applications.

Guinier, 1994. See above.

Begins with the general theory of diffraction and then applies this theory to various atomic structures, amorphous bodies, crystals, and imperfect crystals. The author assumes that the reader is familiar with the elements of crystallography and x-ray diffraction. Should be especially useful for solid-state physicists, metallographers, chemists, and even biologists.

International Tables for Crystallography, 1996. See above.

The purpose of this series is to collect and critically evaluate modern, advanced tables and texts on well-established topics that are relevant to crystallographic research and to applications of crystallographic methods in all sciences concerned with the structure and properties of materials.

James, 1948. See above.

Intended to provide an outline of the general optical principles underlying the diffraction of x rays by matter, which may serve as a foundation on which to base subsequent discussions of actual methods and results. All details of actual techniques, and of their application to specific problems, are therefore considered as lying beyond the scope of the book.

Klug and Alexander, 1974. See above.

Contains details of many x-ray diffraction experimental techniques and analyses for powder and polycrystalline materials. Serves as textbook, manual, and teacher to plant workers, graduate students, research scientists, and others who seek to work in or understand the field.

Schultz, 1982. See above.

The thrust of this book is to convince the reader of the universality and utility of the scattering method in solving structural problems in materials science. This textbook is aimed at teaching the fundamentals of scattering theory and the broad scope of applications in solving real problems. It is intended that this book be augmented by additional notes dealing with experimental practice.

Schwartz and Cohen, 1987. See above.

Covers an extensive list of topics with many examples. It deals with crystallography and diffraction for both perfect and imperfect crystals and contains an excellent set of advanced problem-solving homework assignments. Not intended for beginners, but serves as an excellent reference for materials scientists who wish to seek solutions by means of diffraction techniques.

Warren, 1969. See above.

The emphasis of this book is a rigorous development of the basic diffraction theory. The treatment is carried far enough to relate to experimentally observable quantities. The main part of this book is devoted to the application of x-ray diffraction methods to both crystalline and amorphous materials, and to both perfect and imperfect crystals. This book is not intended for beginners.

APPENDIX: GLOSSARY OF TERMS AND SYMBOLS

s0: incident x-ray direction
s: scattered x-ray direction
rn: vector from origin to nth lattice point
un(t): time-dependent dynamic displacement vector
K: scattering vector, reciprocal lattice space location
f: scattering power, length, or amplitude (of an atom relative to that of a single electron)
I(K): scattering intensity
F(K): structure factor
G(K): interference function
Ni: number of unit cell i
δij: Kronecker delta function
δgj: arbitrary phase factor
δp, δq: vector displacements
H: reciprocal space vector, reciprocal lattice vector
Ie: Thomson scattering per electron
ρ(r): electron density function, locally averaged scattering length intensity
M: Debye-Waller temperature factor
g: lattice wave, propagation wave vector
agj: vibrational amplitude for the gj wave
egj: polarization vector for the gj wave
k: Boltzmann's constant
ωgj: eigenvalue of the phonon branches
egj: eigenvector of the phonon branches
αpq: Cowley-Warren parameter
Rlmn: interatomic vector
Q: first-order size effects scattering component
R, S: second-order atomic displacement terms
RG: radius of gyration
Cp: volume fraction of the dispersed particles
HAYDN CHEN University of Illinois at Urbana-Champaign Urbana, Illinois
DYNAMICAL DIFFRACTION

INTRODUCTION

Diffraction-related techniques using x rays, electrons, or neutrons are widely used in materials science to provide basic structural information on crystalline materials. To
describe a diffraction phenomenon, one has the choice of two theories: kinematic or dynamical. Kinematic theory, described in KINEMATIC DIFFRACTION OF X RAYS, assumes that each x-ray photon, electron, or neutron scatters only once before it is detected. This assumption is valid in most cases for x rays and neutrons, since their interactions with materials are relatively weak. This single-scattering mechanism is also called the first-order Born approximation or simply the Born approximation (Schiff, 1955; Jackson, 1975). The kinematic diffraction theory can be applied to the vast majority of materials studies and is the most commonly used theory to describe x-ray or neutron diffraction from crystals that are imperfect.

There are, however, practical situations where the higher-order scattering or multiple-scattering terms in the Born series become important and cannot be neglected. This is the case, for example, with electron diffraction from crystals, where an electron beam interacts strongly with electrons in a crystal. Multiple scattering can also be important in certain application areas of x-ray and neutron scattering, as described below. In all these cases, the simplified kinematic theory is not sufficient to evaluate the diffraction processes, and the more rigorous dynamical theory, in which multiple scattering is taken into account, is needed.

Application Areas

Dynamical diffraction is the predominant phenomenon in almost all electron diffraction applications, such as low-energy electron diffraction (LOW-ENERGY ELECTRON DIFFRACTION) and reflection high-energy electron diffraction. For x rays and neutrons, areas of materials research that involve dynamical diffraction may include the situations discussed in the next six sections.
Strong Bragg Reflections. For Bragg reflections with large structure factors, the kinematic theory often overestimates the integrated intensities. This occurs for many real crystals, such as minerals and even biological crystals such as proteins, since they are not ideally imperfect. The effect is usually called extinction (Warren, 1969), which refers to the extra attenuation of the incident beam in the crystal due to the loss of intensity to the diffracted beam. Its characteristic length scale, the extinction length, depends on the structure factor of the Bragg reflection being measured. One can further categorize extinction effects into two types: primary extinction, which occurs within individual mosaic blocks in a mosaic crystal, and secondary extinction, which occurs for all mosaic blocks along the incident beam path. Primary extinction exists when the extinction length is shorter than the average size of mosaic blocks, and secondary extinction occurs when the extinction length is less than the absorption length in the crystal.

Large Nearly Perfect Crystals and Multilayers. It is not uncommon in today's materials preparation and crystal growth laboratories that one has to deal with large nearly perfect crystals. Often, centimeter-sized perfect semiconductor crystals such as GaAs and Si are used as substrate materials, and multilayers and superlattices are deposited using molecular-beam or chemical vapor epitaxy. Bulk crystal growers are also producing larger high-quality crystals by advancing and perfecting various growth techniques. Characterization of these large nearly perfect crystals and multilayers by diffraction techniques often involves the use of dynamical theory simulations of the diffraction profiles and intensities. Crystal shape and its geometry with respect to the incident and the diffracted beams can also influence the diffraction pattern, which can only be accounted for by dynamical diffraction.

Topographic Studies of Defects. X-ray diffraction topography is a useful technique for studying crystalline defects such as dislocations in large-grain nearly perfect crystals (Chikawa and Kuriyama, 1991; Klapper, 1996; Tanner, 1996). With this technique, an extended, highly collimated x-ray beam is incident on a specimen, and an image of one or several strong Bragg reflections is recorded with high-resolution photographic films. Examination of the image can reveal micrometer (μm)-sized crystal defects such as dislocations, growth fronts, and fault lines. Because the strain field induced by a defect can extend far into the single-crystal grain, the diffraction process is rather complex, and a quantitative interpretation of a topographic image frequently requires the use of dynamical theory and its variation for distorted crystals developed by Takagi (1962, 1969) and Taupin (1964).

Internal Field-Dependent Diffraction Phenomena. Several diffraction techniques make use of the secondary excitations induced by the wave field inside a crystal under diffraction conditions. These secondary signals may be x-ray fluorescence (X-RAY MICROPROBE FOR FLUORESCENCE AND DIFFRACTION ANALYSIS) or secondary electrons such as Auger electrons (AUGER ELECTRON SPECTROSCOPY) or photoelectrons. The intensities of these signals are directly proportional to the electric field strength at the atom position where the secondary signal is generated. The wave field strength inside the crystal is a sensitive function of the crystal orientation near a specular or a Bragg reflection, and the dynamical theory is the only theory that provides the internal wave field amplitudes, including the interference between the incident and the diffracted waves, or the standing wave effect (Batterman, 1964). As a variation of the standing wave effect, the secondary signals can be diffracted by the crystal lattice and form standing-wave-like diffraction profiles. These include Kossel lines for x-ray fluorescence (Kossel et al., 1935) and Kikuchi (1928) lines for secondary electrons. These effects can be interpreted as the optical reciprocity phenomena of the standing wave effect.

Multiple Bragg Diffraction Studies. If a single crystal is oriented in such a way that more than one reciprocal node falls on the Ewald sphere of diffraction, a simultaneous multiple-beam diffraction will occur. These
simultaneous reflections were first discovered by Renninger (1937) and are often called Renninger reflections or detour reflections (Umweganregung, "detour" in German). Although the angular positions of the simultaneous reflections can be predicted from simple geometric considerations in reciprocal space (Cole et al., 1962), a theoretical formalism that goes beyond the kinematic theory or the first-order Born approximation is needed to describe the intensities of a multiple-beam diffraction (Colella, 1974). Because of interference among the simultaneously excited Bragg beams, multiple-beam diffraction promises to be a practical solution to the phase problem in diffraction-based structural determination of crystalline materials, and there has been a great renewed interest in this research area (Shen, 1998, 1999a,b; Chang et al., 1999).

Grazing-Incidence Diffraction. In grazing-incidence diffraction geometry, either the incident beam, the diffracted beam, or both has an incident or exit angle, with respect to a well-defined surface, that is close to the critical angle of the diffracting crystal. Full treatment of the diffraction effects in a grazing-angle geometry involves Fresnel specular reflection and requires the concept of an evanescent wave that travels parallel to the surface and decays exponentially as a function of depth into the crystal. The dynamical theory is needed to describe the specular reflectivity and the evanescent wave-related phenomena. Because of its surface sensitivity and adjustable probing depth, grazing-incidence diffraction of x rays and neutrons has evolved into an important technique for materials research and characterization.

Brief Literature Survey

Dynamical diffraction theory of a plane wave by a perfect crystal originated with Darwin (1914) and Ewald (1917), who used two very different approaches.
Since then, the early development of the dynamical theory has been focused primarily on situations involving only an incident beam and one Bragg-diffracted beam, the so-called two-beam case. Prins (1930) extended Darwin's theory to take absorption into account, and von Laue (1931) reformulated Ewald's approach and formed the backbone of modern-day dynamical theory. Reviews and extensions of the theory have been given by Zachariasen (1945), James (1950), Kato (1952), Warren (1969), and Authier (1970). A comprehensive review of the Ewald–von Laue theory has been provided by Batterman and Cole (1964) in their seminal article in Reviews of Modern Physics. More recent reviews can be found in Kato (1974), Cowley (1975), and Pinsker (1978). Updated and concise summaries of the two-beam dynamical theory have been given recently by Authier (1992, 1996). A historical survey of the early development of the dynamical theory was given in Pinsker (1978).

Contemporary topics in dynamical theory are mainly focused in the following four areas: multiple-beam diffraction, grazing-incidence diffraction, internal fields and standing waves, and special x-ray optics. These modern developments are largely driven by recent interest in rapidly emerging fields such as synchrotron radiation, x-ray crystallography, surface science, and semiconductor research.
Dynamical theory of x rays for multiple-beam diffraction, with two or more Bragg reflections excited simultaneously, was considered by Ewald and Heno (1968). However, very little progress was made until Colella (1974) developed a computational algorithm that made multiple-beam x-ray diffraction simulations more tractable. Recent interest in its applications to measure the phases of structure factors (Colella, 1974; Post, 1977; Chapman et al., 1981; Chang, 1982) has made multiple-beam diffraction an active area of research in dynamical theory and experiments. Approximate theories of multiple-beam diffraction have been developed by Juretschke (1982, 1984, 1986), Høier and Marthinsen (1983), Hümmer and Billy (1986), Shen (1986, 1999b,c), and Thorkildsen (1987). Reviews on multiple-beam diffraction have been given by Chang (1984, 1992, 1998), Colella (1995), and Weckert and Hümmer (1997).

Since the pioneering experiment by Marra et al. (1979), there has been an enormous increase in the development and use of grazing-incidence x-ray diffraction to study surfaces and interfaces of solids. Dynamical theory for the grazing-angle geometry was soon developed (Afanasev and Melkonyan, 1983; Aleksandrov et al., 1984), and its experimental verifications were given by Cowan et al. (1986), Durbin and Gog (1989), and Jach et al. (1989). Meanwhile, a semikinematic theory called the distorted-wave Born approximation was used by Vineyard (1982) and by Dietrich and Wagner (1983, 1984). This theory was further developed by Dosch et al. (1986) and Sinha et al. (1988), and has become widely utilized in grazing-incidence x-ray scattering studies of surfaces and near-surface structures. The theory has also been extended to explain standing-wave-enhanced and nonspecular scattering in multilayer structures (Kortright and Fischer-Colbrie, 1987), and to include phase-sensitive scattering in diffraction from bulk crystals (Shen, 1999b,c).
Direct experimental proof of the x-ray standing wave effect was first achieved by Batterman (1964) by observing x-ray fluorescence profiles while the diffracting crystal was rotated through a Bragg reflection. While earlier work was mainly on locating impurity atoms in bulk semiconductor materials (Batterman, 1969; Golovchenko et al., 1974; Anderson et al., 1976), more recent research activities focus on determinations of atom locations and distributions in overlayers above crystal surfaces (Golovchenko et al., 1982; Funke and Materlik, 1985; Durbin et al., 1986; Patel et al., 1987; Bedzyk et al., 1989), in synthetic multilayers (Barbee and Warburton, 1984; Kortright and Fischer-Colbrie, 1987), in long-period overlayers (Bedzyk et al., 1988; Wang et al., 1992), and in electrochemical solutions (Bedzyk et al., 1986). Recent reviews on x-ray standing waves are given by Patel (1996) and Lagomarsino (1996).

The rapid increase in synchrotron radiation-based materials research in recent years has spurred new developments in x-ray optics (Batterman and Bilderback, 1991; Hart, 1996). This is especially true in the areas of x-ray waveguides for producing submicron-sized beams (Bilderback et al., 1994; Feng et al., 1995), and x-ray phase plates and polarization analyzers used for studies on magnetic materials (Golovchenko et al., 1986; Mills, 1988; Belyakov
and Dmitrienko, 1989; Hirano et al., 1991; Batterman, 1992; Shen and Finkelstein, 1992; Giles et al., 1994; Yahnke et al., 1994; Shastri et al., 1995). Recent reviews on polarization x-ray optics have been given by Hirano et al. (1995), Shen (1996a), and Malgrange (1996). An excellent collection of articles on these and other current topics in dynamical diffraction can be found in X-ray and Neutron Dynamical Diffraction: Theory and Applications (Authier et al., 1996).

Scope of This Unit

Given the wide range of topics in dynamical diffraction, the main purpose of this unit is not to cover every detail but to provide readers with an overview of basic concepts, formalisms, and applications. Special attention is paid to the difference between the more familiar kinematic theory and the more complex dynamical approach. Although the basic dynamical theory is the same for x rays, electrons, and neutrons, we will focus mainly on x rays, since much of the original terminology was founded in x-ray dynamical diffraction. The formalism for x rays is also more complex, and thus more complete, because of the vector-field nature of electromagnetic waves. For reviews on dynamical diffraction of electrons and neutrons, we refer the reader to the excellent textbook by Cowley (1975), to Moodie et al. (1997), and to a recent article by Schlenker and Guigay (1996).

We will start in the Basic Principles section with the fundamental equations and concepts in dynamical diffraction theory, which are derived from classical electrodynamics. Then, in the Two-Beam Diffraction section, we move on to the widely used two-beam approximation, essentially following the description of Batterman and Cole (1964). The two-beam theory deals only with the incident beam and one strongly diffracted Bragg beam, and the multiple scattering between them; multiple scattering due to other Bragg reflections is ignored.
This theory provides many basic concepts in dynamical diffraction and is very useful in visualizing the unique physical phenomena in dynamical scattering. A full multiple-beam dynamical theory, developed by Colella (1974), takes into account all multiple-scattering effects and surface geometries, and it gives the most complete description of the diffraction processes of x rays, electrons, or neutrons in a perfect crystal. An outline of this theory is summarized in the Multiple-Beam Diffraction section. Also included in that section is an approximate formalism, given by Shen (1986), based on the second-order Born approximation. This theory takes into account only double scattering in a multiple-scattering regime, yet it provides a useful picture of the physics of multiple-beam interactions. Finally, an approximate yet more accurate multiple-beam theory (Shen, 1999b), based on an expanded distorted-wave approximation, is presented; it can provide accurate accounts of three-beam interference profiles in the so-called reference-beam diffraction geometry (Shen, 1998).

In the Grazing-Angle Diffraction section, the main results for grazing-incidence diffraction are described using the dynamical treatment. Of particular importance
is the concept of evanescent waves and its applications. Also described in this section is the so-called distorted-wave Born approximation, which uses dynamical theory to evaluate specular reflections but treats surface diffraction and scattering within the kinematic regime. This approximate theory is useful in structural studies of surfaces and interfaces, thin films, and multilayered heterostructures.

Finally, because of limited space, a few topics are not covered in this unit. One of these is the theory by Takagi and Taupin for distorted perfect crystals. We refer the readers to the original articles (Takagi, 1962, 1969; Taupin, 1964) and to recent publications by Bartels et al. (1986) and by Authier (1996).
BASIC PRINCIPLES

There are two approaches to the dynamical theory. One, based on work by Darwin (1914) and Prins (1930), first finds the Fresnel reflectance and transmittance for a single atomic plane and then evaluates the total wave fields for a set of parallel atomic planes. The diffracted waves are obtained by solving a set of difference equations similar to the ones used in classical optics for a series of parallel slabs or optical filters. Although it was not widely used for a long time, owing to its computational complexity, Darwin's approach has gained more attention in recent years as a means to evaluate reflectivities for multilayers and superlattices (Durbin and Follis, 1995), for crystal truncation effects (Caticha, 1994), and for quasicrystals (Chung and Durbin, 1995). The other approach, developed by Ewald (1917) and von Laue (1931), treats wave propagation in a periodic medium as an eigenvalue problem and uses boundary conditions to obtain Bragg-reflected intensities. We will follow the Ewald–von Laue approach, since many of the fundamental concepts in dynamical diffraction can be visualized more naturally with it and it can be easily extended to situations involving more than two beams.

In the early literature of the dynamical theory (for two beams), the mathematical forms for the diffracted intensities from general absorbing crystals appear rather complicated. The main reason for these complicated forms was the necessity of separating out the real and imaginary parts in dealing with complex wave vectors and wave field amplitudes before the era of computers and powerful calculators. Today these complicated equations are not necessary, and numerical calculations with complex variables can be easily performed on a modern computer. Therefore, in this unit, all final intensity equations are given in compact forms that involve complex numbers. In the author's view, these forms are best suited for today's computer calculations.
These simpler forms also allow readers to gain physical insight rather than being overwhelmed by tedious mathematical notation.

Fundamental Equations

The starting point in the Ewald–von Laue approach to dynamical theory is that the dielectric function ε(r) in a
crystalline material is a periodic function in space, and therefore can be expanded in a Fourier series:

  ε(r) = ε_0 + δε(r)  with  δε(r) = −Γ Σ_H F_H e^{iH·r}    (1)

where Γ = r_e λ²/(π V_c), r_e = 2.818 × 10⁻⁵ Å is the classical radius of an electron, λ is the x-ray wavelength, V_c is the unit cell volume, and ΓF_H is the coefficient of the H Fourier component, with F_H being the structure factor. All of the Fourier coefficients ΓF_H are on the order of 10⁻⁵ to 10⁻⁶ or smaller at x-ray wavelengths, δε(r) ≪ ε_0 = 1, and the dielectric function is only slightly less than unity.

We further assume that a monochromatic plane wave is incident on a crystal and that the dielectric response is of the same wave frequency (elastic response). Applying Maxwell's equations and neglecting the magnetic interactions, we obtain the following equation for the electric field E and the displacement vector D:

  (∇² + k_0²) D = −∇ × ∇ × (D − ε_0 E)    (2)

where k_0 is the wave vector of the monochromatic wave in vacuum, k_0 = |k_0| = 2π/λ. For a treatment involving magnetic interactions, we refer to Durbin (1987). If we assume an isotropic relation between D(r) and E(r), D(r) = ε(r)E(r), and δε(r) ≪ ε_0, we have
  (∇² + k_0²) D = −∇ × ∇ × (δε D)    (3)

Figure 1. (A) Ewald sphere construction in kinematic theory and polarization vectors of the incident and the diffracted beams. (B) Dispersion surface in dynamical theory for a one-beam case and boundary conditions for total external reflection.

We now use the periodic condition, Equation 1, and substitute for the wave field D in Equation 3 a series of Bloch waves with wave vectors K_H = K_0 + H,

  D(r) = Σ_H D_H e^{iK_H·r}    (4)

where H is a reciprocal space vector of the crystal. For every Fourier component (Bloch wave) H, we arrive at the following equation:

  [(1 − ΓF_0) k_0² − K_H²] D_H = −Γ Σ_{G≠H} F_{H−G} K_H × (K_H × D_G)    (5)

where H − G is the difference reciprocal space vector between H and G, the terms involving Γ² have been neglected, and K_H·D_H is set to zero because of the transverse wave nature of the electromagnetic radiation. Equation 5 forms a set of fundamental equations for the dynamical theory of x-ray diffraction. Similar equations for electrons and neutrons can be found in the literature (e.g., Cowley, 1975).

Dispersion Surface

A solution to the eigenvalue equation (Equation 5) gives rise to all the possible wave vectors K_H and wave field amplitude ratios inside a diffracting crystal. The loci of the possible wave vectors form a multiple-sheet three-dimensional (3D) surface in reciprocal space. This surface is called the dispersion surface, as given by Ewald (1917). The introduction of the dispersion surface is the most significant difference between the kinematic and the dynamical theories. Here, instead of a single Ewald sphere (Fig. 1A), we have a continuous distribution of "Ewald spheres" with their centers located on the dispersion surface, giving rise to all possible traveling wave vectors inside the crystal.

As an example, we assume that the crystal orientation is far from any Bragg reflections, and thus only one beam, the incident beam K_0, would exist in the crystal. For this "one-beam" case, Equation 5 becomes

  [(1 − ΓF_0) k_0² − K_0²] D_0 = 0    (6)

Thus, we have

  K_0 = k_0/(1 + ΓF_0)^{1/2} ≅ k_0 (1 − ΓF_0/2)    (7)

which shows that the wave vector K_0 inside the crystal is slightly shorter than that in vacuum as a result of the average index of refraction, n = 1 − ΓF_0′/2, where F_0′ is the real part of F_0 and is related to the average density ρ_0 by

  ρ_0 = π Γ F_0′/(r_e λ²)    (8)

In the case of absorbing crystals, K_0 and F_0 are complex variables, and the imaginary part F_0″ of F_0 is related to the average linear absorption coefficient μ_0 by

  μ_0 = k_0 ΓF_0″ = 2π ΓF_0″/λ    (9)
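Equations 7 to 9 lend themselves to a quick numerical check. The sketch below evaluates Γ, the refractive index decrement 1 − n, and μ_0 for an Si-like crystal with Cu Kα radiation; the inputs (lattice parameter 5.431 Å and F_0 ≈ 8 × (14 + 0.33i), i.e., eight atoms of Z = 14 per cell with an assumed f″ ≈ 0.33) are illustrative round numbers that neglect the dispersion correction to the real part of F_0.

```python
import math

# Order-of-magnitude check of Eqs. (7)-(9) for an Si-like crystal with
# Cu K-alpha x rays. Input values are illustrative and approximate.
r_e = 2.818e-5          # classical electron radius (Angstrom)
lam = 1.5406            # wavelength (Angstrom)
V_c = 5.431 ** 3        # unit cell volume (Angstrom^3)
F_0 = 8 * (14 + 0.33j)  # eight atoms of Z = 14 per cell; f'' ~ 0.33 assumed

Gamma = r_e * lam ** 2 / (math.pi * V_c)     # prefactor in Eq. (1), ~1e-7
n = 1 - Gamma * F_0.real / 2                 # index of refraction, Eq. (7)
mu_0 = 2 * math.pi * Gamma * F_0.imag / lam  # absorption coefficient, Eq. (9)

print(f"Gamma = {Gamma:.2e}")                # Fourier coefficients are tiny
print(f"1 - n = {1 - n:.2e}")                # n is just below unity
print(f"mu_0  = {mu_0 * 1e8:.0f} per cm")    # linear absorption coefficient
```

With these inputs the decrement 1 − n comes out near 10⁻⁵ and μ_0 near 10² cm⁻¹, consistent with the statement above that the coefficients ΓF_H are of order 10⁻⁵ to 10⁻⁶.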
Equation 7 shows that the dispersion surface in the one-beam case is a refraction-corrected sphere centered around the origin in reciprocal space, as shown in Figure 1B.

Boundary Conditions

Once Equation 5 is solved and all possible waves inside the crystal are obtained, the necessary connections between wave fields inside and outside the crystal are made through the boundary conditions. There are two types of boundary conditions in classical electrodynamics (Jackson, 1975). One states that the tangential components of the wave vectors have to be equal on both sides of an interface (Snell's law):

  k_t = K_t    (10)
Throughout this unit, we use the convention that outside vacuum wave vectors are denoted by k and internal wave vectors are denoted by K, and the subscript t stands for the tangential component of the vector. To illustrate this point, we again consider the simple one-beam case, as shown in Figure 1B. Suppose that an x-ray beam k0 with an angle y is incident on a surface with n being its surface normal. To locate the proper internal wave vector K0 , we follow along n to find its intersection with the dispersion surface, in this case, the sphere with its radius defined by Equation 7. However, we see immediately that this is possible only if y is greater than a certain incident angle yc , which is the critical angle of the material. From Figure 1B, we can easily obtain that cos yc ¼ K0 =k0 , or for small angles, yc ¼ ðF0 Þ1=2 . Below yc no traveling wave solutions are possible and thus total external reflection occurs. The second set of boundary conditions states that the tangential components of the electric and magnetic field ^ ! E (k ^ is a unit vector along the provectors, E and H ¼ k pagation direction), are continuous across the boundary. In dynamical theory literature, the eigenequations for dispersion surfaces are expressed in terms of either the electric field vector E or the electric displacement vector D. These two choices are equivalent, since in both cases a small longitudinal component on the order of F0 in the E-field vector is ignored, because its inclusion only contributes a term of 2 in the dispersion equation. Thus E and D are interchangeable under this assumption and the boundary conditions can be expressed as the following: out Din t ¼ Dt ^ ! Din Þ ¼ ðK ^ ! Dout Þ ðk t
ð11aÞ t
ð11bÞ
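To make Equations 7 and 10 concrete, the short sketch below applies Snell's law to the refraction-corrected sphere K_0 = k_0(1 − F_0/2). The value of F_0 used is an illustrative order-of-magnitude number for a dense material at hard x-ray energies, not a tabulated constant.

```python
import math

def internal_angle(theta, F0):
    """Internal propagation angle from Snell's law k_t = K_t.

    The internal wave vector magnitude is K0 = k0*(1 - F0/2) (Equation 7),
    so cos(theta') = cos(theta)/(1 - F0/2).  Returns None below the
    critical angle, where no traveling-wave solution exists.
    """
    c = math.cos(theta) / (1.0 - F0 / 2.0)
    if c > 1.0:          # total external reflection
        return None
    return math.acos(c)

F0 = 1.5e-5                      # illustrative value, dimensionless
theta_c = math.sqrt(F0)          # critical angle, small-angle limit

print("critical angle: %.2f mrad" % (1e3 * theta_c))
print(internal_angle(0.5 * theta_c, F0))   # below theta_c -> None
print(internal_angle(2.0 * theta_c, F0))   # above theta_c -> real angle
```

Below θ_c the internal wave becomes evanescent, which is the regime exploited in the total-external-reflection techniques mentioned later in this unit.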
In dynamical diffraction, the boundary condition of Equation 10 (Snell's law) selects which points on the dispersion surface are excited, that is, which waves actually exist inside the crystal for a given incident condition. The conditions on the field vectors, Equations 11a and 11b, are then used to evaluate the actual internal field amplitudes and the diffracted wave intensities outside the crystal. Dynamical theory covers a wide range of specific topics, which depend on the number of beams included in the dispersion equation, Equation 5, and on the diffraction geometry
of the crystal. In certain cases, the existence of some beams can be predetermined from the physical law of energy conservation, and in these cases only Equation 11a is needed for the field boundary condition. Such is the case for conventional two-beam diffraction, as discussed in the Internal Fields section. However, both sets of conditions in Equation 11 are needed for general multiple-beam cases and for grazing-angle geometries.

Internal Fields

One of the important applications of dynamical theory is to evaluate the wave fields inside the diffracting crystal, in addition to the external diffracted intensities. Depending on the diffraction geometry, an internal field can be a periodic standing wave, as in the case of Bragg diffraction; an exponentially decaying evanescent wave, as in the case of specular reflection; or a combination of the two. Although no detectors per se can be put inside a crystal, the internal field effects can be observed in one of the following two ways. The first is to detect secondary signals produced by an internal field, which include x-ray fluorescence (X-RAY MICROPROBE FOR FLUORESCENCE AND DIFFRACTION ANALYSIS), Auger electrons (AUGER ELECTRON SPECTROSCOPY), and photoelectrons. These inelastic secondary signals are directly proportional to the internal field intensity and are incoherent with respect to the internal field. Examples of this effect include the standard x-ray standing wave techniques and depth-sensitive x-ray fluorescence measurements under total external reflection. The other way is to measure the elastic scattering of an internal field. In most cases, including the standing wave case, an internal field is a traveling wave along a certain direction and therefore can be scattered by atoms inside the crystal. This is a coherent process, and the scattering contributions are added at the level of amplitudes instead of intensities.
An example of this effect is the diffuse scattering of an evanescent wave in studies of surface or near-surface structures.

TWO-BEAM DIFFRACTION

In the two-beam approximation, we assume that only one Bragg-diffracted wave K_H is important in the crystal, in addition to the incident wave K_0. Equation 5 then reduces to the following two coupled vector equations:

[(1 − F_0)k_0^2 − K_0^2] D_0 = F_H̄ K_0 × (K_0 × D_H)
[(1 − F_0)k_0^2 − K_H^2] D_H = F_H K_H × (K_H × D_0)   (12)
The wave vectors K_0 and K_H define a plane that is usually called the scattering plane. If we use the coordinate system shown in Figure 1A, we can decompose the wave field amplitudes into the σ and π polarization directions. The equations for the two polarization states then decouple and can be solved separately:

[(1 − F_0)k_0^2 − K_0^2] D_{0σ,π} + k_0^2 P F_H̄ D_{Hσ,π} = 0
k_0^2 P F_H D_{0σ,π} + [(1 − F_0)k_0^2 − K_H^2] D_{Hσ,π} = 0   (13)
COMPUTATION AND THEORETICAL METHODS
where P = σ̂_H·σ̂_0 = 1 for σ polarization and P = π̂_H·π̂_0 = cos 2θ_B for π polarization, with θ_B being the Bragg angle. To seek nontrivial solutions, we set the determinant of Equation 13 to zero and solve for K_0:

| (1 − F_0)k_0^2 − K_0^2      k_0^2 P F_H̄           |
|                                                     | = 0   (14)
| k_0^2 P F_H                 (1 − F_0)k_0^2 − K_H^2 |

where K_H is related to K_0 through Bragg's law, K_H^2 = |K_0 + H|^2. The solution of Equation 14 defines the possible wave vectors in the crystal and gives rise to the dispersion surface in the two-beam case.
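As a consistency check of this reduction, the short sketch below places a tie point on the dispersion surface, where the product of the deviation parameters ξ_0 ξ_H equals (1/4)k_0^2 P^2 F_H F_H̄ (Equation 16 below), and verifies that the determinant of Equation 14 then vanishes to within the linearization error. All susceptibility values are made-up orders of magnitude, not tabulated data.

```python
# Numerical check that the two-beam secular determinant (Equation 14)
# vanishes on the dispersion surface xi0*xiH = (1/4) k0^2 P^2 FH*FHb.
# All susceptibility values below are illustrative orders of magnitude.
k0 = 5.07            # vacuum wave number, 1/Angstrom (E ~ 10 keV)
F0 = 1.5e-5          # illustrative
FH = 1.0e-5          # illustrative, reflection H
FHb = 1.0e-5         # illustrative, reflection -H
P = 1.0              # sigma polarization

xi0 = 3.0e-5                                   # arbitrary point on the surface
xiH = k0**2 * P**2 * FH * FHb / (4.0 * xi0)    # hyperbola condition

K0 = k0 * (1.0 - F0 / 2.0) + xi0   # deviation parameters inverted (Equation 15)
KH = k0 * (1.0 - F0 / 2.0) + xiH

a11 = (1.0 - F0) * k0**2 - K0**2
a22 = (1.0 - F0) * k0**2 - KH**2
a12 = k0**2 * P * FHb
a21 = k0**2 * P * FH

det = a11 * a22 - a12 * a21
scale = abs(a12 * a21)
print(det / scale)    # ~0, up to the linearization error in Equation 15
```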
Properties of Dispersion Surface

To visualize what the dispersion surface looks like in the two-beam case, we define two parameters, ξ_0 and ξ_H, as described in James (1950) and Batterman and Cole (1964):

ξ_0 ≡ [K_0^2 − (1 − F_0)k_0^2]/(2k_0) ≈ K_0 − k_0(1 − F_0/2)
ξ_H ≡ [K_H^2 − (1 − F_0)k_0^2]/(2k_0) ≈ K_H − k_0(1 − F_0/2)   (15)

These parameters represent the deviations of the wave vectors inside the crystal from the average refraction-corrected value given by Equation 7. This also shows that in general the refraction corrections for the internal incident and diffracted waves are different. With these deviation parameters, the dispersion equation, Equation 14, becomes

ξ_0 ξ_H = (1/4) k_0^2 P^2 F_H F_H̄   (16)

Figure 2. Dispersion surface in the two-beam case. (A) Overview. (B) Close-up view around the intersection region.
Hyperboloid Sheets. Since the right-hand side of Equation 16 is a constant for a given Bragg reflection, the dispersion surface given by this equation consists of two sheets of hyperboloids in reciprocal space for each polarization state P, as shown in Figure 2A. The hyperboloids have their diameter point, Q, located around what would be the center of the Ewald sphere (determined by Bragg's law) and asymptotically approach the two spheres centered at the origin O and at the reciprocal node H, with the refraction-corrected radius k_0(1 − F_0/2). The two corresponding spheres in vacuum (outside the crystal) are also shown, and their intersection point is usually called the Laue point, L. The dispersion surface branches closer to the Laue point are called the α branches (ασ, απ), and those further from the Laue point are called the β branches (βσ, βπ). Since the square root of the right-hand-side constant in Equation 16 is much less than k_0, the gap at the diameter point is on the order of 10^-5 of the radius of the spheres. Therefore, the spheres can be viewed essentially as planes in the vicinity of the diameter point, as illustrated in Figure 2B. However, the curvatures have to be considered when the Bragg reflection is in the grazing-angle geometry (see the section Grazing-Angle Diffraction).
Wave Field Amplitude Ratios. In addition to the wave vectors, the eigenvalue equation, Equation 13, also provides the ratio of the wave field amplitudes inside the crystal for each polarization. In terms of ξ_0 and ξ_H, the amplitude ratio is given by

D_H/D_0 = 2ξ_0/(k_0 P F_H̄) = k_0 P F_H/(2ξ_H)   (17)

Again, the actual ratio in the crystal depends entirely on the tie points selected by the boundary conditions. Around the diameter point, ξ_0 and ξ_H have similar magnitudes, and thus the field amplitudes D_H and D_0 are comparable. Away from the exact Bragg condition, only one of ξ_0 and ξ_H remains appreciable in size; thus either D_0 or D_H dominates, according to which asymptotic sphere the tie point approaches.

Boundary Conditions and Snell's Law. To illustrate how tie points are selected by Snell's law in the two-beam case, we consider the situation in Figure 2B, where a crystal surface is indicated by a shaded line. We start with an incident condition corresponding to an incident vacuum
wave vector k_0 at point P. We then construct a surface normal passing through P and intersecting four tie points on the dispersion surface. Because of Snell's law, the wave fields associated with these four points are the only permitted waves inside the crystal. There are four waves for each reciprocal node, O or H; altogether a total of eight waves may exist inside the crystal in the two-beam case. To find the external diffracted beam, we follow the same surface normal to the intersection point P′, and the wave vector connecting P′ to the reciprocal node H is the diffracted beam that we can measure with a detector outside the crystal.

Depending on whether or not a surface normal intercepts both α and β branches at the same incident condition, a diffraction geometry is called either the Laue transmission or the Bragg reflection case. In terms of the direction cosines γ_0 and γ_H of the external incident and diffracted wave vectors, k_0 and k_H, with respect to the surface normal n, it is useful to define a parameter b:

b ≡ γ_0/γ_H = (k_0·n)/(k_H·n)   (18)
where b > 0 corresponds to the Laue case and b < 0 to the Bragg case. The cases with |b| = 1 are called the symmetric Laue or Bragg cases, and for that reason b is often called the asymmetry factor.

Poynting Vector and Energy Flow. The question of the energy flow directions in dynamical diffraction is of fundamental interest to scientists who use x-ray topography to study defects in perfect crystals. The energy flow of an electromagnetic wave is determined by its time-averaged Poynting vector, defined as

S = (c/8π)(E × H*) = (c/8π)|D|^2 K̂   (19)
where c is the speed of light, K̂ is a unit vector along the propagation direction, and terms on the order of F_0 or higher are ignored. The total Poynting vector S_T at each tie point on each branch of the dispersion surface is the vector sum of those for the O and H beams:

S_T = (c/8π)(D_0^2 K̂_0 + D_H^2 K̂_H)   (20)
To find the direction of S_T, we consider the normal v of the dispersion branch, which is along the direction of the gradient of the dispersion equation, Equation 16:

v = ∇(ξ_0 ξ_H) = ξ_0 ∇ξ_H + ξ_H ∇ξ_0 = ξ_0 K̂_H + ξ_H K̂_0 ∝ D_0^2 K̂_0 + D_H^2 K̂_H ∝ S_T   (21)
where we have used Equation 17 and assumed negligible absorption (|F_H| = |F_H̄|). Thus we conclude that S_T is parallel to v, the normal to the dispersion surface. In other words, the total energy flow at a given tie point is always normal to the local dispersion surface. This important theorem is generally valid and was first proved by Kato (1960). It follows that the energy flow inside the crystal
is parallel to the atomic planes at the full excitation condition, that is, at the diameter points of the hyperboloids.

Special Dynamical Effects

There are significant differences in the physical diffraction processes between kinematic and dynamical theory. The most striking observable results of dynamical theory are Pendellösung fringes, anomalous transmission, the finite reflection width for semi-infinite crystals, x-ray standing waves, and x-ray birefringence. With the aid of the dispersion surface shown in Figure 2, these effects can be explained without formally solving the mathematical equations.

Pendellösung. In a Laue case, the α and β tie points across the diameter gap of the hyperbolic dispersion surfaces are excited simultaneously at a given incident condition. The two sets of traveling waves associated with the two branches can interfere with each other and cause oscillations in the diffracted intensity as the thickness of the crystal changes, on the scale of 2π/ΔK, where ΔK is simply the gap at the diameter point. These intensity oscillations are termed Pendellösung fringes, and the quantity 2π/ΔK is called the Pendellösung period. From the geometry shown in Figure 2B, it is straightforward to show that the diameter gap is given by

ΔK = k_0 |P| (F_H F_H̄)^{1/2} / cos θ_B   (22)

where θ_B is the internal Bragg angle. As an example, for the Si(111) reflection at 10 keV, ΔK = 2.67 × 10^-5 Å^-1, and thus the Pendellösung period is equal to 23 μm.

Pendellösung interference is a unique diffraction phenomenon of the Laue geometry. Both the diffracted wave (H beam) and the forward-diffracted wave (O beam) are affected by this effect. The intensity oscillations for these two beams are 180° out of phase with each other, creating the effect of energy flow swapping back and forth between the two directions as a function of depth into the crystal. For more detailed discussions of Pendellösung fringes we refer to a review by Kato (1974).

We should point out that Pendellösung fringes are entirely different in origin from interference fringes due to crystal thickness. Thickness fringes are often observed in reflectivity measurements on thin-film materials and can be mostly accounted for by a finite-size effect in Fraunhofer diffraction. The period of thickness fringes depends only on the crystal thickness, not on the strength of the reflection, while the Pendellösung period depends only on the reflection strength, not on the crystal thickness.

Anomalous Transmission. The four waves selected by tie points in the Laue case have different effective absorption coefficients. This can be understood qualitatively from the locations of the four dispersion surface branches relative to the vacuum Laue point L and to the average refraction-corrected point Q. The β branches are further from L and are on the more refractive side of Q. Therefore the waves associated with the β branches have larger than average refraction and absorption. The α branches, on the other
hand, are located closer to L and are on the less refractive side of Q. Therefore the waves on the α branches have less than average refraction and absorption. For a relatively thick crystal in the Laue diffraction geometry, the α waves are effectively able to pass through the thickness of the crystal more easily than an average wave would. What this implies is that even if no intensity is observed in the transmitted beam at off-Bragg conditions, an anomalously "transmitted" intense beam can appear when the crystal is set to a strong Bragg condition. This phenomenon is called anomalous transmission; it was first observed by Borrmann (1950) and is also called the Borrmann effect. If the Laue crystal is sufficiently thick, then even the απ wave may be absorbed and only the ασ wave will remain. In this case, the Laue-diffracting crystal can be used as a linear polarizer, since only the σ-polarized x rays will be transmitted through the crystal.

Darwin Width. In the Bragg reflection geometry, all the excited tie points lie on the same branch of the dispersion surface at a given incident angle. Furthermore, no tie points can be excited at the center of a Bragg reflection, where a gap exists at the diameter point of the dispersion surfaces. The gap indicates that no internal traveling waves exist at the exact Bragg condition, and total external reflection is the only outlet for the incident energy if absorption is ignored. In fact, the size of the gap determines the range of incident angles over which total reflection occurs. This angular width is usually called the Darwin width of a Bragg reflection in perfect crystals. In the case of symmetric Bragg geometry, it is easy to see from Figure 2 that the full Darwin width is

w = ΔK/(k_0 sin θ_B) = 2|P| (F_H F_H̄)^{1/2} / sin 2θ_B   (23)

Typical values of w are on the order of a few arc-seconds.
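Plugging numbers into Equations 22 and 23 for the Si(111) example quoted above reproduces both the 23-μm Pendellösung period and a Darwin width of a few arc-seconds. The sketch below starts from the ΔK value given in the text; the wavelength and the Si lattice constant are standard values.

```python
import math

# Pendelloesung period and Darwin width for Si(111) at 10 keV, starting
# from the diameter gap quoted in the text, Delta_K = 2.67e-5 1/Angstrom.
dK = 2.67e-5                  # diameter gap, 1/Angstrom (from the text)
lam = 12.398 / 10.0           # x-ray wavelength at 10 keV, Angstrom
k0 = 2.0 * math.pi / lam      # vacuum wave number, 1/Angstrom
d111 = 5.4309 / math.sqrt(3)  # Si(111) d spacing, Angstrom
theta_B = math.asin(lam / (2.0 * d111))   # Bragg angle

period_um = 2.0 * math.pi / dK * 1e-4     # 1 Angstrom = 1e-4 micrometer
w = dK / (k0 * math.sin(theta_B))         # Equation 23, radians
w_arcsec = w * 180.0 / math.pi * 3600.0

print("Pendelloesung period: %.1f um" % period_um)   # ~23 um
print("Darwin width: %.1f arcsec" % w_arcsec)        # a few arc-seconds
```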
The existence of a finite reflection width w, even for a semi-infinite crystal, may seem to contradict the mathematical theory of Fourier transforms, which would give a vanishing reflection width if the crystal size were infinite. In fact, there is no contradiction. A more careful examination of the situation shows that, because of extinction, the incident beam is never able to see the whole "infinite" crystal. Thus the finite Darwin width is a direct result of the extinction effect in dynamical theory and is needed to conserve the total energy in the physical system.

X-ray Standing Waves. Another important effect in dynamical diffraction is x-ray standing waves (XSWs) (Batterman, 1964). Inside a diffracting crystal, the total wave field intensity is the coherent sum of the O and H beams and is given by (σ polarization)

|D|^2 = |D_0 e^{iK_0·r} + D_H e^{iK_H·r}|^2 = |D_0|^2 |1 + (D_H/D_0) e^{iH·r}|^2   (24)

Equation 24 represents a standing wave field with a spatial period of 2π/|H|, which is simply the d spacing of the Bragg reflection. The field amplitude ratio D_H/D_0
has well-defined phases at the α and β branches of the dispersion surface. According to Equation 17 and Figure 2, we see that the phase of D_H/D_0 is π + α_H at the α branch, since ξ_H is positive, and is α_H at the β branch, since ξ_H is negative, where α_H is the phase of the structure factor F_H and can be set to zero by a proper choice of the real-space origin. Thus the α-mode standing wave has its nodes on the atomic planes and the β-mode standing wave has its antinodes on the atomic planes.

In the Laue transmission geometry, both the α and the β modes are excited simultaneously in the crystal. However, the β-mode standing wave is attenuated more strongly because its peak field coincides with the atomic planes. This is the physical origin of the Borrmann anomalous absorption effect. Standing waves also exist in the Bragg geometry. Because of their more recent applications in materials studies, we will devote a later segment (Standing Waves) to discussing them in more detail.

X-ray Birefringence. The ability to produce and to analyze a generally polarized electromagnetic wave has long benefited scientists and researchers in the field of visible-light optics and in studies of the optical properties of materials. In the x-ray regime, however, such abilities have been very limited because of the weak interaction of x rays with matter, especially for the production and analysis of circularly polarized x-ray beams. The situation has changed significantly in recent years. The growing interest in studying magnetic and anisotropic electronic materials by x-ray scattering and spectroscopic techniques has initiated many new developments in both the production and the analysis of specially polarized x rays. The now routinely available high-brightness synchrotron radiation sources can provide naturally collimated x rays that can be easily manipulated by special x-ray optics to generate x-ray beams with polarization tunable from linear to circular.

Such optics are usually called x-ray phase plates or phase retarders. The principles of most x-ray phase plates are based on the linear birefringence effect near a Bragg reflection in perfect or nearly perfect crystals due to dynamical diffraction (Hart, 1978; Belyakov and Dmitrienko, 1989). As illustrated in Figure 2, close to a Bragg reflection H, the lengths of the wave vectors for the σ and the π polarizations are slightly different. The difference causes a phase shift between the σ and the π wave fields that accumulates through the crystal thickness t: Δ = (K_σ − K_π)t. When the phase shift reaches 90°, circularly polarized radiation is generated, and such a device is called a quarter-wave phase plate or retarder (Mills, 1988; Hirano et al., 1991; Giles et al., 1994). In addition to these transmission-type phase retarders, a reflection-type phase plate has also been proposed and studied (Brummer et al., 1984; Batterman, 1992; Shastri et al., 1995), which has the advantage of being thickness independent. However, it has been demonstrated that the Bragg transmission-type phase retarders are more robust to incident beam divergences and are thus very practical x-ray circular polarizers. They have been used for measurements of magnetic dichroism in hard permanent
magnets and other magnetic materials (Giles et al., 1994; Lang et al., 1995). Recent reviews on x-ray polarizers and phase plates can be found in articles by Hart (1991), Hirano et al. (1995), Shen (1996a), and Malgrange (1996).

Solution of the Dispersion Equation

So far we have confined our discussion to the physical effects that exist in dynamical diffraction from perfect crystals and have tried to avoid the mathematical details of the solutions to the dispersion equation, Equation 14 or 16. As we have shown, considerable physical insight into the diffraction processes can be gained without going into the mathematical details. To obtain the diffracted intensities in dynamical theory, however, the mathematical solutions are unavoidable. In the summary of these results that follows, we will keep the formulas in a general complex form so that absorption effects are automatically taken into account.

The key to solving the dispersion equations (Equation 14 or 16) is to realize that the internal incident beam K_0 can differ from the vacuum incident beam k_0 only by a small component K_0n along the normal direction of the incident surface, which in turn is linearly related to ξ_0 or ξ_H. The final expression reduces to a quadratic equation for ξ_0 or ξ_H, and solving for ξ_0 or ξ_H alone results in the following (Batterman and Cole, 1964):

ξ_0 = (1/2) k_0 |P| (|b| F_H F_H̄)^{1/2} [−η ± (η^2 + b/|b|)^{1/2}]   (25)

where η is the reduced deviation parameter, normalized to the Darwin width w:

η ≡ [2b/(w|b|^{1/2})] (Δθ − Δθ_0)   (26)

Δθ = θ − θ_B is the angular deviation from the vacuum Bragg angle θ_B, and Δθ_0 is the refraction correction

Δθ_0 ≡ F_0 (1 − 1/b) / (2 sin 2θ_B)   (27)

The dual signs in Equation 25 correspond to the α and β branches of the dispersion surface. In the Bragg case, b < 0, so the correction Δθ_0 is always positive; that is, the θ value at the center of a reflection is always slightly larger than the θ_B given by kinematic theory. In the Laue case, the sign of Δθ_0 depends on whether b > 1 or b < 1. In the case of absorbing crystals, both η and Δθ_0 can be complex; the directional properties are represented by the real parts of these complex variables, while their imaginary parts are related to the absorption given by F_0'' and w. Substituting Equation 25 into Equation 17 yields the wave field amplitude ratio inside the crystal as a function of η:

D_H/D_0 = (|P|/P) (|b| F_H/F_H̄)^{1/2} [−η ± (η^2 + b/|b|)^{1/2}]   (28)

Figure 3. Boundary conditions for the wave fields outside the crystal in (A) Laue case and (B) Bragg case.

Diffracted Intensities

We now employ the boundary conditions to evaluate the diffracted intensities.

Boundary Conditions. In the Laue transmission case (Fig. 3A), assuming a plane wave with an infinite cross-section, the field boundary conditions are given by the following equations:

Entrance surface:  D_0^i = D_0α + D_0β ;   0 = D_Hα + D_Hβ   (29)

Exit surface:  D_0^e = D_0α e^{iK_0α·r} + D_0β e^{iK_0β·r} ;   D_H^e = D_Hα e^{iK_Hα·r} + D_Hβ e^{iK_Hβ·r}   (30)

In the Bragg reflection case (Fig. 3B), the field boundary conditions are given by

Entrance surface:  D_0^i = D_0α + D_0β ;   D_H^e = D_Hα + D_Hβ   (31)

Back surface:  D_0^e = D_0α e^{iK_0α·r} + D_0β e^{iK_0β·r} ;   0 = D_Hα e^{iK_Hα·r} + D_Hβ e^{iK_Hβ·r}   (32)

In either case, there are six unknowns, D_0α, D_0β, D_Hα, D_Hβ, D_0^e, D_H^e, and three pairs of equations, Equations 28, 29, and 30, or Equations 28, 31, and 32, for each polarization state. Our goal is to express the diffracted wave D_H^e outside the crystal as a function of the incident wave D_0^i.

Intensities in the Laue Case. In the Laue transmission case, we obtain, apart from an insignificant phase factor,
D_H^e = D_0^i e^{−(μ_0 t/4)(1/γ_0 + 1/γ_H)} |b F_H/F_H̄|^{1/2} sin[A(η^2 + 1)^{1/2}] / (η^2 + 1)^{1/2}   (33)
where A is the effective (complex) thickness, which relates to the real thickness t by (Zachariasen, 1945)

A ≡ π |P| t (F_H F_H̄)^{1/2} / (λ |γ_0 γ_H|^{1/2})   (34)

The real part of A is essentially the ratio of the crystal thickness to the Pendellösung period. A quantity often measured in experiments is the total power P_H in the diffracted beam, which is equal to the diffracted intensity multiplied by the cross-sectional area of the beam. The power ratio P_H/P_0 of the diffracted beam to the incident beam is given by the intensity ratio |D_H^e/D_0^i|^2 multiplied by the area ratio, 1/|b|, of the beam cross-sections:

P_H/P_0 = (1/|b|) |D_H^e/D_0^i|^2 = e^{−(μ_0 t/2)(1/γ_0 + 1/γ_H)} |F_H/F_H̄| |sin[A(η^2 + 1)^{1/2}]|^2 / |η^2 + 1|   (35)

A plot of P_H/P_0 versus η is usually called the rocking curve. Keeping in mind that η can be a complex variable, due essentially to F_0'', Equation 35 is a general expression that is valid for both nonabsorbing and absorbing crystals. A few examples of rocking curves in the Laue case for nonabsorbing crystals are shown in Figure 4A.

For thick nonabsorbing crystals, A is large (A ≫ 1), so the sin^2 oscillations tend to average to a value of 1/2. Thus Equation 35 reduces to a simple Lorentzian shape:

P_H/P_0 = 1 / [2(η^2 + 1)]   (36)

For thin nonabsorbing crystals (A ≪ 1), we rewrite Equation 35 in the following form:

P_H/P_0 = {sin[A(η^2 + 1)^{1/2}] / (η^2 + 1)^{1/2}}^2 ≈ [sin(Aη)/η]^2   (37)

This approximation (Equation 37) can be justified by expanding the quantities in the brackets on both sides to third power and neglecting the A^3 terms, since A ≪ 1. We see that in this thin-crystal limit, dynamical theory gives the same result as kinematic theory. The condition A ≪ 1 can be restated as: the crystal thickness t is much less than the Pendellösung period.

Intensities in the Bragg Case. In the Bragg reflection case, the diffracted wave field is given by
D_H^e = D_0^i |b F_H/F_H̄|^{1/2} / {η + i(η^2 − 1)^{1/2} cot[A(η^2 − 1)^{1/2}]}   (38)
The power ratio P_H/P_0 of the diffracted beam to the incident beam, often called the Bragg reflectivity, is

P_H/P_0 = |F_H/F_H̄| / |η + i(η^2 − 1)^{1/2} cot[A(η^2 − 1)^{1/2}]|^2   (39)
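A minimal numerical sketch of Equation 39 follows, assuming a nonabsorbing crystal with |F_H/F_H̄| = 1; the small-argument guard handles the removable singularity at η = ±1, where (η^2 − 1)^{1/2} vanishes.

```python
import cmath, math

def bragg_reflectivity(eta, A, ratio=1.0):
    """Equation 39: P_H/P_0 = ratio / |eta + i*s*cot(A*s)|^2, s = sqrt(eta^2-1).

    `ratio` stands for |F_H/F_Hbar| (unity for a nonabsorbing,
    centrosymmetric case).  Works for real or complex eta.
    """
    s = cmath.sqrt(eta * eta - 1.0)
    if abs(A * s) < 1e-8:
        term = 1.0 / A            # limit of s*cot(A*s) as s -> 0
    else:
        term = s * cmath.cos(A * s) / cmath.sin(A * s)
    return ratio / abs(eta + 1j * term) ** 2

# Center of the reflection (eta = 0): reflectivity is tanh(A)^2, so a
# thick crystal totally reflects and a thin one scales as A^2.
print(bragg_reflectivity(0.0, A=5.0))    # ~1, total reflection
print(bragg_reflectivity(0.0, A=0.01))   # ~1e-4, kinematic-like
```

At η = 0 the expression reduces analytically to tanh^2(A), which the two printed values follow.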
In the case of thick crystals (A ≫ 1), Equation 39 reduces to

P_H/P_0 = |F_H/F_H̄| / |η ± (η^2 − 1)^{1/2}|^2   (40)
The choice of the signs is such that the smaller value of P_H/P_0 is retained. On the other hand, for semi-infinite crystals (A ≫ 1), we can go back to the boundary conditions, Equations 31 and 32, and ignore the back surface altogether. If we then apply the argument that only one of the two tie points on each branch of the dispersion surface is physically feasible in the Bragg case, because of energy flow conservation, we arrive at the following simple boundary conditions:

D_0^i = D_0 ;   D_H^e = D_H   (41)
By using Equations 41 and 28, the diffracted power can be expressed as

P_H/P_0 = |F_H/F_H̄| |η ∓ (η^2 − 1)^{1/2}|^2   (42)

Figure 4. Diffracted intensity P_H/P_0 in (A) the nonabsorbing Laue case and (B) the absorbing Bragg case, for several effective thicknesses. The Bragg reflection in (B) is GaAs(220) at a wavelength of 1.48 Å.
Again, the sign in front of the square root is chosen so that P_H/P_0 is less than unity. The result is obviously identical to Equation 40.

Far away from the Bragg condition, η ≫ 1, Equation 40 shows that the reflected power decreases as 1/η^2. This asymptotic form represents the "tails" of a Bragg reflection (Andrews and Cowley, 1985), which are also called the crystal truncation rod in kinematic theory (Robinson, 1986). In reciprocal space, the direction of the tails is along the surface normal, since the diffracted wave vector can only differ from the Bragg condition by a component normal to the surface or interface. More detailed discussions of crystal truncation rods in dynamical theory can be found in Colella (1991), Caticha (1993, 1994), and Durbin (1995).

Examples of the reflectivity curves, Equation 39, for a GaAs crystal with different thicknesses in the symmetric Bragg case are shown in Figure 4B. The oscillations in the tails are entirely due to the thickness of the crystal. These modulations are routinely observed in x-ray diffraction profiles from semiconductor thin films on substrates and can be used to determine the thin-film thickness very accurately (Fewster, 1996).

Integrated Intensities. The integrated intensity R_H^η in the reduced η units is given by integrating the diffracted power ratio P_H/P_0 over the entire η range. For nonabsorbing crystals in the Laue case, in the limiting cases of A ≪ 1 and A ≫ 1, R_H^η can be calculated analytically as (Zachariasen, 1945)

R_H^η = ∫_{−∞}^{∞} (P_H/P_0) dη = πA  (A ≪ 1),  = π/2  (A ≫ 1)   (43)
For intermediate values of A, or for absorbing crystals, the integral can only be calculated numerically. A general plot of R_H^η versus A in the nonabsorbing case is shown in Figure 5 as the dashed line. For nonabsorbing crystals in the Bragg case, Equation 39 can be integrated analytically (Darwin, 1922) to yield

R_H^η = ∫_{−∞}^{∞} (P_H/P_0) dη = π tanh(A) = πA  (A ≪ 1),  = π  (A ≫ 1)   (44)
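The two limits of Equation 43 can be spot-checked by brute-force integration of the nonabsorbing Laue rocking curve (Equation 35 with the absorption factor and |F_H/F_H̄| set to 1). The integration range and grid below are arbitrary numerical choices; the oscillatory tails converge slowly, so the finite range leaves a residual error at the percent level.

```python
import numpy as np

def laue_integrated(A, eta_max=500.0, n=200001):
    """Riemann-sum integral over eta of the nonabsorbing Laue rocking
    curve sin^2(A*sqrt(eta^2+1))/(eta^2+1) (Equation 35, mu_0 = 0)."""
    eta = np.linspace(-eta_max, eta_max, n)
    p = np.sin(A * np.sqrt(eta**2 + 1.0)) ** 2 / (eta**2 + 1.0)
    return float(np.sum(p) * (eta[1] - eta[0]))

A_thin, A_thick = 0.1, 10.0
print(laue_integrated(A_thin) / (np.pi * A_thin))    # ~1, i.e. R = pi*A
print(laue_integrated(A_thick) / (np.pi / 2.0))      # ~1, i.e. R = pi/2
```

For large A the numerical value oscillates around π/2 as A varies, which is the residue of the Pendellösung oscillations visible in Figure 5.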
A plot of the integrated power in the symmetric Bragg case is shown in Figure 5 as the solid curve.

Figure 5. Comparison of integrated intensities in the Laue case and the Bragg case with the kinematic theory.

Both curves in Figure 5 show a linear behavior for small A, which is consistent with kinematic theory. If we use the definitions of η and A, we find that the integrated power R_H^θ over the incident angle θ in the limit A ≪ 1 is given by
R_H^θ = ∫_{−∞}^{∞} (P_H/P_0) dθ = (w/2) R_H^η = (π/2) wA = r_e^2 λ^3 P^2 |F_H|^2 t / (V_c^2 sin 2θ_B)   (45)
which is identical to the integrated intensity in the kinematic theory for a small crystal (Warren, 1969). Thus, in some sense, kinematic theory is a limiting form of dynamical theory, and the departure of the integrated intensities at larger A values (Fig. 5) is simply the effect of primary extinction. In the thick-crystal limit A ≫ 1, the θ-integrated intensity R_H^θ in both the Laue and Bragg cases is linear in |F_H|. This linear, rather than quadratic, dependence on |F_H| is a distinct and characteristic result of dynamical diffraction.

Standing Waves

As we discussed earlier, near or at a Bragg reflection, the wave field amplitudes, Equation 24, represent standing waves inside the diffracting crystal. In the Bragg reflection geometry, as the incident angle increases through the full Bragg reflection, the selected tie points shift from the α branch to the β branch. Therefore the nodes of the standing wave shift from on the atomic planes (r = 0) to between the atomic planes (r = d/2), and the corresponding antinodes shift from between the planes to on the planes. For a semi-infinite crystal in the symmetric Bragg case and σ polarization, the standing wave intensity can be written, using Equations 24, 28, and 42, as

I = |1 + (P_H/P_0)^{1/2} e^{i(ν + α_H − H·r)}|^2   (46)

where ν is the phase of η ∓ (η^2 − 1)^{1/2} and α_H is the phase of the structure factor F_H, assuming absorption is negligible. If we define the diffraction plane by choosing an origin such that α_H is zero, then the standing wave intensity as a function of η is determined by the phase factor H·r with respect to the chosen origin and the d spacing of the Bragg reflection (Bedzyk and Materlik, 1985). Typical standing wave intensity profiles given by Equation 46 are shown in Figure 6. The phase variable ν and the corresponding reflectivity curve are also shown in Figure 6.

An XSW profile can be observed by measuring the x-ray fluorescence from atoms embedded in the crystal structure, since the fluorescence signal is directly proportional to the internal wave field intensity at the atom position (Batterman, 1964). By analyzing the shape of a fluorescence profile, the position of the fluorescing atom with respect to the diffraction plane can be determined. A detailed discussion of the nodal plane position shifts of the standing waves in general absorbing crystals has been given by Authier (1986).
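A sketch of the XSW profile of Equation 46 for a thick nonabsorbing crystal follows, taking α_H = 0 and an atom at H·r = 0. The reflectivity and the phase ν come from the thick-crystal amplitude η ∓ (η^2 − 1)^{1/2} (Equations 42 and 28), with the root of modulus ≤ 1 chosen as in the text.

```python
import cmath, math

def xsw_intensity(eta, h_dot_r=0.0):
    """Equation 46 with alpha_H = 0: I = |1 + sqrt(R) exp(i(nu - H.r))|^2.

    R and nu are the modulus squared and phase of the thick-crystal
    amplitude eta -/+ sqrt(eta^2 - 1), taking the root of modulus <= 1
    so that R <= 1 (Equation 42)."""
    s = cmath.sqrt(eta * eta - 1.0)
    amp = eta - s if abs(eta - s) <= abs(eta + s) else eta + s
    R = abs(amp) ** 2                    # reflectivity, Equation 42
    nu = cmath.phase(amp)                # standing-wave phase
    return abs(1.0 + math.sqrt(R) * cmath.exp(1j * (nu - h_dot_r))) ** 2

for eta in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(eta, round(xsw_intensity(eta), 3))
```

With this sign convention, the intensity at the atomic plane sweeps from 0 at one edge of the total-reflection range (node on the atom) to 4 at the other edge (antinode on the atom), which is the asymmetry exploited to locate atoms in XSW measurements.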
MULTIPLE-BEAM DIFFRACTION

So far, we have restricted our discussion to diffraction cases in which only the incident beam and one Bragg-diffracted beam are present. There are experimental situations, however, in which more than one diffracted beam may be significant, and the two-beam approximation is then no longer valid. Such situations involving multiple-beam diffraction are dealt with in this section.

Basic Concepts
Figure 6. XSW intensity and phase as a function of the reduced angular parameter η, along with the reflectivity curve, calculated for a semi-infinite GaAs(220) reflection at 1.48 Å.
The standing wave technique has been used to determine foreign atom positions in bulk materials (Batterman, 1969; Golovchenko et al., 1974; Lagomarsino et al., 1984; Kovalchuk and Kohn, 1986). Most recent applications of the XSW technique have been the determination of foreign atom positions, surface relaxations, and disorder at crystal surfaces and interfaces (Durbin et al., 1986; Zegenhagen et al., 1988; Bedzyk et al., 1989; Martines et al., 1992; Fontes et al., 1993; Franklin et al., 1995; Lyman and Bedzyk, 1997). By measuring standing wave patterns for two or more reflections (either separately or simultaneously) along different crystallographic axes, atomic positions can be triangulated in space (Greiser and Materlik, 1986; Berman et al., 1988). More details of the XSW technique can be found in recent reviews by Patel (1996) and Lagomarsino (1996). The formation of XSWs is not restricted to wide-angle Bragg reflections in perfect crystals. Bedzyk et al. (1988) extended the technique to the regime of specular reflections from mirror surfaces, in which case both the phase and the period of the standing waves vary with the incident angle. Standing waves have also been used to study the spatial distribution of atomic species in mosaic crystals (Durbin, 1998) and quasicrystals (Chung and Durbin, 1995; Jach et al., 1999). Due to a substantial (although imperfect) standing wave formation, anomalous transmission has been observed on the strongest diffraction peaks in nearly perfect quasicrystals (Kycia et al., 1993).
Multiple-beam diffraction occurs when several sets of atomic planes satisfy Bragg's law simultaneously. A convenient way to realize this is to excite one Bragg reflection and then rotate the crystal around the diffraction vector. While the H reflection remains excited during such a rotation, it is possible to bring another set of atomic planes, L, into its diffraction condition and thus to have multiple-beam diffraction. The rotation around the scattering vector H is defined by an azimuthal angle, ψ. For x rays, multiple-beam diffraction peaks excited in this geometry were first observed by Renninger (1937); hence these multiple diffraction peaks are often called "Renninger peaks." For electrons, multiple-beam diffraction situations exist in almost all cases because of the much stronger interactions between electrons and atoms. As shown in Figure 7, if atomic planes H and L are both excited at the same time, then there is always another set of planes, H−L, also in diffraction condition. The beam k_L diffracted by the L reflection can be scattered again by the H−L reflection, and this doubly diffracted beam travels in the same direction as the H-reflected beam k_H. In this sense, the photons (or particles) in the doubly diffracted beam have taken a "detour" route compared to the photons (or particles) singly diffracted by the H reflection. We usually call H the main reflection, L the detour reflection, and H−L the coupling reflection.
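The ψ-scan geometry described above can be sketched numerically: holding H on the Ewald sphere while rotating about it, the condition for a secondary reflection L to be excited reduces to an equation of the form A cos ψ + B sin ψ = C. The code below is a minimal sketch under assumed conventions (H along the rotation axis, an arbitrary azimuth origin); the cubic GaAs lattice constant and the 1.48 Å wavelength are illustrative choices taken from the surrounding figures.

```python
import numpy as np

# Sketch of locating Renninger peaks in an azimuthal psi-scan: the main
# reflection H is kept on the Ewald sphere while the crystal rotates about
# H; a secondary reflection L is excited wherever |k0(psi) + L| = |k0(psi)|.
# Conventions (H along the rotation axis, azimuth origin) and the use of
# cubic GaAs at 1.48 A are illustrative assumptions.

a = 5.6533                 # GaAs cubic lattice constant (angstrom)
wavelength = 1.48          # x-ray wavelength (angstrom)
k = 2 * np.pi / wavelength

def recip(hkl):
    """Reciprocal-lattice vector (1/angstrom) of a cubic crystal."""
    return (2 * np.pi / a) * np.asarray(hkl, dtype=float)

def frame(H):
    """Right-handed orthonormal frame with e3 along H."""
    e3 = H / np.linalg.norm(H)
    helper = np.array([1.0, 0.0, 0.0]) if abs(e3[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    e1 = np.cross(e3, helper)
    e1 /= np.linalg.norm(e1)
    return e1, np.cross(e3, e1), e3

def k0_of_psi(H_hkl, psi):
    """Incident wave vector at azimuth psi, with H kept in Bragg condition."""
    H = recip(H_hkl)
    Hmag = np.linalg.norm(H)
    e1, e2, e3 = frame(H)
    kt = np.sqrt(k**2 - (Hmag / 2)**2)      # transverse component of k0
    return kt * np.cos(psi) * e1 + kt * np.sin(psi) * e2 - (Hmag / 2) * e3

def renninger_azimuths(H_hkl, L_hkl):
    """Azimuths psi (radians) at which L is simultaneously excited."""
    H, L = recip(H_hkl), recip(L_hkl)
    Hmag = np.linalg.norm(H)
    e1, e2, e3 = frame(H)
    kt = np.sqrt(k**2 - (Hmag / 2)**2)
    # Laue condition k0 . L = -|L|^2 / 2 becomes A cos(psi) + B sin(psi) = C
    A, B = kt * np.dot(L, e1), kt * np.dot(L, e2)
    C = (Hmag / 2) * np.dot(L, e3) - np.dot(L, L) / 2
    R = np.hypot(A, B)
    if abs(C) > R:                          # L never touches the Ewald sphere
        return []
    phi0, dphi = np.arctan2(B, A), np.arccos(C / R)
    return [phi0 - dphi, phi0 + dphi]
```

For H = (220) and L = (111) this yields two azimuths per full turn; at each one, O, H, and L all lie on the Ewald sphere, the three-beam situation of Figure 7.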
Figure 7. Illustration of a three-beam diffraction case involving O, H, and L, in real space (upper) and reciprocal space (lower).
DYNAMICAL DIFFRACTION
Depending on the strengths of the structure factors involved, a multiple reflection can cause either an intensity enhancement (peak) or a reduction (dip) in the two-beam intensity of H. A multiple-reflection peak is commonly called an Umweganregung ("detour excitation" in German) and a dip an Aufhellung. The former occurs when H is relatively weak and both L and H−L are strong, while the latter occurs when both H and L are strong and H−L is weak. A semiquantitative intensity calculation can be obtained by balancing the total energy among the multiple beams, as worked out by Moon and Shull (1964) and Zachariasen (1965). In most experiments, multiple reflections are simply a nuisance that one tries to avoid, since they cause inaccurate intensity measurements. In the last two decades, however, there has been renewed and increasing interest in multiple-beam diffraction because of its promising potential as a physical solution to the well-known "phase problem" in diffraction and crystallography. The phase problem refers to the fact that the data collected in a conventional diffraction experiment are the intensities of the Bragg reflections from a crystal, which are related only to the magnitudes of the structure factors; the phase information is lost. This is a classic problem in diffraction physics, and its solution remains the most difficult part of any structure determination of materials, especially for biological macromolecular crystals. Due to an interference effect among the simultaneously excited Bragg beams, multiple-beam diffraction contains direct phase information on the structure factors involved, and it can therefore be used as a way to solve the phase problem. The basic idea of using multiple-beam diffraction to solve the phase problem was first proposed by Lipscomb (1949), and was first demonstrated by Colella (1974) in theory and by Post (1977) in an experiment on perfect crystals.
The method was then further developed by several groups (Chapman et al., 1981; Chang, 1982; Schmidt and Colella, 1985; Shen and Colella, 1987, 1988; Hümmer et al., 1990) to show that it can be applied not only to perfect crystals but also to real, mosaic crystals. Recently, there have been considerable efforts to apply multiple-beam diffraction to large-unit-cell inorganic and macromolecular crystals (Lee and Colella, 1993; Chang et al., 1991; Hümmer et al., 1991; Weckert et al., 1993). Progress in this area has been amply reviewed by Chang (1984, 1992), Colella (1995, 1996), and Weckert and Hümmer (1997). A recent experimental innovation in reference-beam diffraction (Shen, 1998) allows parallel data collection of three-beam interference profiles using an area detector in a modified oscillation-camera setup, and makes it possible to measure the phases of a large number of Bragg reflections in a relatively short time. Theoretical treatment of multiple-beam diffraction is considerably more complicated than the two-beam theory, as evidenced by some of the early works (Ewald and Heno, 1968). This is particularly so in the case of x rays, because of the mixing of the σ and π polarization states in a multiple-beam diffraction process. Colella (1974), based upon his earlier work on electron diffraction (Colella, 1972), developed a full dynamical theory procedure for multiple-beam diffraction of x rays and a corresponding
computer program called NBEAM. With Colella's theory, multiple-beam dynamical calculations have become more practical and more easily performed. With today's powerful computers and software, and for cases involving not too many beams, running the NBEAM program can be almost trivial, even on personal computers. We outline the principles of the NBEAM procedure in the following section.

NBEAM Theory

The fundamental equations for multiple-beam x-ray diffraction are the same as those in the two-beam theory, before the two-beam approximation is made. We can go back to Equation 5, expand the double cross-product, and rewrite it in the following form:
\[
\left[1-\frac{k_0^2(1+F_0)}{K_i^2}\right]\mathbf{D}_i
+\frac{k_0^2}{K_i^2}\sum_{j\neq i}F_{ij}\left[\hat{\mathbf{u}}_i(\hat{\mathbf{u}}_i\cdot\mathbf{D}_j)-\mathbf{D}_j\right]=0
\qquad(47)
\]
Eigenequation for D-field Components. In order to properly express the components of all wave field amplitudes, we define a polarization unit-vector coordinate system for each wave j:

\[
\hat{\mathbf{u}}_j=\frac{\mathbf{K}_j}{|\mathbf{K}_j|},\qquad
\hat{\boldsymbol{\sigma}}_j=\frac{\hat{\mathbf{u}}_j\times\hat{\mathbf{n}}}{|\hat{\mathbf{u}}_j\times\hat{\mathbf{n}}|},\qquad
\hat{\boldsymbol{\pi}}_j=\hat{\mathbf{u}}_j\times\hat{\boldsymbol{\sigma}}_j
\qquad(48)
\]
where n̂ is the surface normal. Projecting Equation 47 onto σ̂_i and π̂_i yields

\[
\left[1-\frac{k_0^2(1+F_0)}{K_i^2}\right]D_{i\sigma}
=\frac{k_0^2}{K_i^2}\sum_{j\neq i}F_{ij}\left[(\hat{\boldsymbol{\sigma}}_j\cdot\hat{\boldsymbol{\sigma}}_i)D_{j\sigma}+(\hat{\boldsymbol{\pi}}_j\cdot\hat{\boldsymbol{\sigma}}_i)D_{j\pi}\right]
\]
\[
\left[1-\frac{k_0^2(1+F_0)}{K_i^2}\right]D_{i\pi}
=\frac{k_0^2}{K_i^2}\sum_{j\neq i}F_{ij}\left[(\hat{\boldsymbol{\sigma}}_j\cdot\hat{\boldsymbol{\pi}}_i)D_{j\sigma}+(\hat{\boldsymbol{\pi}}_j\cdot\hat{\boldsymbol{\pi}}_i)D_{j\pi}\right]
\qquad(49)
\]
Matrix form of the Eigenequation. For an N-beam diffraction case, Equation 49 can be written in matrix form if we define a 2N×1 vector D = (D_{1σ}, …, D_{Nσ}, D_{1π}, …, D_{Nπ}), a 2N×2N diagonal matrix T with T_{ii} = k_0²/K_i² (i = j) and T_{ij} = 0 (i ≠ j), and a 2N×2N general matrix A that collects all the other coefficients in front of the wave field amplitudes. Matrix A is Hermitian if absorption is ignored, or symmetric if the crystal is centrosymmetric. Equation 49 then becomes

\[
(\mathbf{T}+\mathbf{A})\mathbf{D}=0
\qquad(50)
\]

Equation 50 is equivalent to

\[
(\mathbf{T}^{-1}+\mathbf{A}_1)\mathbf{D}=0
\qquad(51)
\]

where A₁ is the correspondingly transformed coefficient matrix, which does not depend on the internal wave vectors.
Strictly speaking, the eigenvectors in Equation 51 are actually the E fields, E = T·D. However, D and E are interchangeable, as discussed in the Basic Principles section.
COMPUTATION AND THEORETICAL METHODS
To find nontrivial solutions of Equation 51, we need to solve the secular eigenvalue equation

\[
\left|\mathbf{T}^{-1}+\mathbf{A}_1\right|=0
\qquad(52)
\]

with (T⁻¹)_{ii} = K_i²/k_0² (i = j) and (T⁻¹)_{ij} = 0 (i ≠ j). We can write K_j² in terms of its components normal (n) and tangential (t) to the entrance surface:

\[
K_j^2=(K_{0n}+H_{jn})^2+k_{jt}^2
\qquad(53)
\]
which is essentially Bragg's law together with the boundary condition that K_{jt} = k_{jt}.

Strategy for Numerical Solutions. If we treat m = K_{0n}/k_0 as the only unknown, Equation 52 takes the following matrix form:

\[
\left|m^2\mathbf{I}+m\mathbf{B}+\mathbf{C}\right|=0
\qquad(54)
\]

where B_{ij} = (2H_{jn}/k_0)δ_{ij} is a diagonal matrix and C_{ij} = (A₁)_{ij} + δ_{ij}(H_{jn}² + k_{jt}²)/k_0². Equation 54 is a quadratic eigenequation, for which no standard computer routines are readily available. Colella (1974) employed an ingenious method to show that Equation 54 is equivalent to the following linear eigenvalue problem:
\[
\begin{pmatrix}-\mathbf{B} & -\mathbf{C}\\ \mathbf{I} & 0\end{pmatrix}
\begin{pmatrix}\mathbf{D}'\\ \mathbf{D}\end{pmatrix}
=m\begin{pmatrix}\mathbf{D}'\\ \mathbf{D}\end{pmatrix}
\qquad(55)
\]

where I is a unit matrix and D′ = mD is a redundant 2N vector with no physical significance. Equation 55 can now be solved with standard software routines that deal with linear eigenvalue equations. It is a 4Nth-order equation for K_{0n} and thus has 4N solutions, denoted K_{0n}^l, l = 1, …, 4N. For each eigenvalue K_{0n}^l there is a corresponding 2N eigenvector, stored in D, which is now a 2N×4N matrix with elements D_{jσ}^l in its top N rows and D_{jπ}^l in its bottom N rows. These wave field amplitudes are evaluated at this point only on a relative scale, similar to the amplitude ratio in the two-beam case. For convenience, each 2N eigenvector can be normalized to unity:

\[
\sum_{j=1}^{N}\left(|D_{j\sigma}^l|^2+|D_{j\pi}^l|^2\right)=1
\qquad(56)
\]
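The linearization step above can be illustrated with small random stand-in matrices: the quadratic eigenproblem |m²I + mB + C| = 0 becomes an ordinary eigenproblem for the block companion matrix of Equation 55, and the resulting eigenvalues can be checked against the original determinant condition. This is only a numerical sketch; B and C here are random, not physical NBEAM matrices.

```python
import numpy as np

# Sketch of the linearization in Equation 55: the quadratic eigenproblem
# |m^2 I + m B + C| = 0 is recast as an ordinary eigenproblem for a block
# companion matrix.  B and C are small random stand-ins, not physical
# NBEAM matrices.

rng = np.random.default_rng(0)
n = 4                                   # stands in for 2N
B = rng.standard_normal((n, n))
C = rng.standard_normal((n, n))

# [[-B, -C], [I, 0]] (D', D)^T = m (D', D)^T, with the redundant D' = m D
M = np.block([[-B, -C],
              [np.eye(n), np.zeros((n, n))]])
ms = np.linalg.eigvals(M)               # the 2n roots (4N in the NBEAM case)

def residual(m):
    """Smallest singular value of m^2 I + m B + C; ~0 at a true root."""
    return np.linalg.svd(m * m * np.eye(n) + m * B + C, compute_uv=False)[-1]

worst = max(residual(m) for m in ms)
```

Substituting D′ = mD into the block equation recovers (m²I + mB + C)D = 0, which is why every eigenvalue of the companion matrix is a root of the secular determinant.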
In terms of the eigenvalues K_{0n}^l and the eigenvectors D_j^l = (D_{jσ}^l, D_{jπ}^l), a general expression for the wave field inside the crystal is given by

\[
\mathbf{D}(\mathbf{r})=\sum_l q_l \sum_j \mathbf{D}_j^l\, e^{i\mathbf{K}_j^l\cdot\mathbf{r}}
\qquad(57)
\]
where K_j^l = K_0^l + H_j, and the q_l (l = 1, …, 4N) are coefficients to be determined by the boundary conditions.

Boundary Conditions. In general, it is not suitable to distinguish the Bragg and the Laue geometries in multiple-beam diffraction situations, since it is possible to have an internal wave vector parallel to the surface, which would make the distinction meaningless. The best way to treat the situation, as pointed out by Colella (1974), is to include both the back-diffracted and the forward-diffracted beams in vacuum, associated with each internal beam j. Thus for each beam j we have two vacuum waves defined by k_j^± = k_{jt} ± n̂(k_0² − k_{jt}²)^{1/2}, where again the subscript t stands for the tangential component. Therefore, for an N-beam diffraction from a parallel crystal slab, we have altogether 8N unknowns: 4N coefficients q_l for the field inside the crystal, 2N wave field components of D_j^e above the entrance surface, and 2N components of the wave field D_j^e below the back surface. The 8N equations needed to solve the above problem are fully provided by the general boundary conditions, Equation 11. Inside the crystal we have

\[
\mathbf{E}_j=\sum_l q_l\,\mathbf{D}_j^l\, e^{i\mathbf{K}_j^l\cdot\mathbf{r}}
\qquad(58)
\]
and H_j = û_j × E_j, where the sum is over all eigenvalues l for each jth beam. (We note that in Colella's original formalism converting D_j to E_j is not necessary, since Equation 51 is already written for E_j. This is also consistent with the omission of all longitudinal components of the E fields, once the eigenvalue equation is obtained, in dynamical theory.) Outside the crystal, we have D_j^e at the back surface, and D_j^e plus the incident beam D_0^i at the entrance surface. These boundary conditions provide eight scalar equations for each beam j, and thus the 8N unknowns can be solved for as a function of D_0^i.

Intensity Computations. Both the reflected and the transmitted intensities I_j for each beam j can be calculated by taking I_j = |D_j^e|²/|D_0^i|². We should note that the whole computational procedure described above evaluates the diffracted intensity at only one crystal orientation setting with respect to the incident beam. To obtain meaningful information, the computation is usually repeated for a series of settings of the incident angle θ and the azimuthal angle ψ. An example of such two-dimensional (2D) calculations is shown in Figure 8A, which is for a three-beam case, GaAs(335)/(551). In many experimental situations, the intensities in the θ direction are integrated, either purposely or because of the divergence of the incident beam. In that case, the integrated intensities versus the azimuthal angle ψ are plotted, as shown in Figure 8B.

Second-Order Born Approximation

From the last section, we see that the integrated intensity as a function of azimuthal angle usually displays an asymmetric intensity profile, due to the multiple-beam interference. The asymmetric profile contains the phase information about the structure factors involved. Although the NBEAM program fully accounts for these multiple-beam interferences, it is rather difficult to gain physical insight from it into the process and into the structural parameters on which it depends.
Figure 8. (A) Calculated reflectivity using NBEAM for the three-beam case of GaAs(335)/(551), as a function of Bragg angle θ and azimuthal angle ψ. (B) Corresponding integrated intensities versus ψ (open circles). The solid-line-only curve corresponds to the profile with an artificial phase of π added in the calculation.

In the past decade or so, there have been several approximate approaches for multiple-beam diffraction intensity calculations, based on Bethe approximations (Bethe, 1928; Juretschke, 1982, 1984, 1986; Hoier and Marthinsen, 1983), the second-order Born approximation (Shen, 1986), Takagi-Taupin differential equations (Thorkildsen, 1987), and an expanded distorted-wave approximation (Shen, 1999b). In most of these approaches, a modified two-beam structure factor can be defined so that integrated intensities can be obtained through the two-beam equations. In the following section, we will discuss only the second-order Born approximation (for x rays), since it provides the most direct connection to the two-beam kinematic results. The expanded distorted-wave theory is outlined at the end of this unit, following the standard distorted-wave theory in surface scattering.

To obtain the Born approximation series, we transform the fundamental Equation 3 into an integral equation by using the Green's function and obtain the following:

\[
\mathbf{D}(\mathbf{r})=\mathbf{D}^{(0)}(\mathbf{r})
+\frac{1}{4\pi}\int \frac{e^{ik_0|\mathbf{r}-\mathbf{r}'|}}{|\mathbf{r}-\mathbf{r}'|}\,
\nabla'\times\nabla'\times\left[\delta\varepsilon(\mathbf{r}')\mathbf{D}(\mathbf{r}')\right]d\mathbf{r}'
\qquad(59)
\]

where D^{(0)}(r) = D_0 e^{ik_0·r} is the incident beam. Since δε is small, we can calculate the scattered wave field D(r) iteratively, using the perturbation theory of scattering (Jackson, 1975). In the first-order approximation, we substitute D(r′) in the integrand by the incident beam D^{(0)}(r′) and obtain a first-order solution D^{(1)}(r). This solution can then be substituted into the integrand again to provide a second-order approximation, D^{(2)}(r), and so on. The sum of all these approximate solutions gives the true solution of Equation 59:

\[
\mathbf{D}(\mathbf{r})=\mathbf{D}^{(0)}(\mathbf{r})+\mathbf{D}^{(1)}(\mathbf{r})+\mathbf{D}^{(2)}(\mathbf{r})+\cdots
\qquad(60)
\]

This is essentially the Born series in quantum mechanics. Assuming that the distance r from the observation point to the crystal is large compared to the size of the crystal (far-field approximation), it can be shown (Shen, 1986) that the wave field in the first-order approximation is given by

\[
\mathbf{D}^{(1)}(\mathbf{r})=Nr_e F_H\,\hat{\mathbf{u}}\times(\hat{\mathbf{u}}\times\mathbf{D}_0)\,\frac{e^{ik_0 r}}{r}
\qquad(61)
\]
where N is the number of unit cells in the crystal, and only one set of atomic planes H satisfies the Bragg condition, k_0û = k_0 + H, with û a unit vector. Equation 61 is identical to the scattered-wave-field expression in kinematic theory, which is what we expect from the first-order Born approximation. To evaluate the second-order expression, we cannot use Equation 61 as D^{(1)}, since it is valid only in the far field; the original form of D^{(1)} with the Green's function has to be used. For detailed derivations we refer to Shen (1986). The final second-order wave field D^{(2)} is expressed by

\[
\mathbf{D}^{(2)}=Nr_e\,\frac{e^{ik_0 r}}{r}\sum_L F_{H-L}F_L\,
\hat{\mathbf{u}}\times\left[\hat{\mathbf{u}}\times\frac{\mathbf{k}_L\times(\mathbf{k}_L\times\mathbf{D}_0)}{k_0^2-k_L^2}\right]
\qquad(62)
\]
It can be seen that D^{(2)} is the detoured wave field involving the L and H−L reflections, and the summation over L represents a coherent superposition of all possible three-beam interactions. The relative strength of a given detoured wave is determined by its structure factors and is inversely proportional to the distance, measured by k_0² − k_L², of the reciprocal-lattice node L from the Ewald sphere. The total diffracted intensity, up to second order, is given by the coherent sum of D^{(1)} and D^{(2)}:

\[
I=\left|\mathbf{D}^{(1)}+\mathbf{D}^{(2)}\right|^2
=\left|Nr_e\,\frac{e^{ik_0 r}}{r}\right|^2
\left|\hat{\mathbf{u}}\times\left\{\hat{\mathbf{u}}\times F_H\left[\mathbf{D}_0+\sum_L\frac{F_{H-L}F_L}{F_H}\,
\frac{\mathbf{k}_L\times(\mathbf{k}_L\times\mathbf{D}_0)}{k_0^2-k_L^2}\right]\right\}\right|^2
\qquad(63)
\]
Equation 63 provides an approximate analytical expression for multiple-beam diffracted intensities and represents a modified two-beam intensity influenced by multiple-beam interactions. The integrated intensity can be computed by replacing F_H in the kinematic intensity formula by a "modified structure factor" defined by

\[
F_H\mathbf{D}_0\;\rightarrow\;F_H\left[\mathbf{D}_0+\sum_L\frac{F_{H-L}F_L}{F_H}\,
\frac{\mathbf{k}_L\times(\mathbf{k}_L\times\mathbf{D}_0)}{k_0^2-k_L^2}\right]
\qquad(64)
\]
Often, in practice, multiple-beam diffraction intensities are normalized to the corresponding two-beam values. In this case, Equation 63 can be used directly, since the prefactors in front of the square brackets cancel out. It can be shown (Shen, 1986) that Equation 63 gives essentially the same result as NBEAM as long as the full three-beam excitation points are excluded, indicating that the second-order Born approximation is indeed a valid approach to multiple-beam diffraction simulations. Equation 63 becomes divergent at the exact three-beam excitation point k_0 = k_L. However, the singularity can be avoided numerically if we take absorption into account by introducing an imaginary part in the wave vectors.
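The interference term in Equations 63 and 64 is governed by the product F_{H−L}F_L/F_H, whose phase δ = α_{H−L} + α_L − α_H is independent of the choice of origin in the unit cell. That invariance can be checked numerically with toy kinematic structure factors; the two-atom structure and the scattering amplitudes below are arbitrary illustrative assumptions.

```python
import numpy as np

# Numerical check that the phase triplet delta = alpha_{H-L} + alpha_L -
# alpha_H is invariant under a shift of origin, while the individual
# structure-factor phases are not.  The two-atom structure and amplitudes
# are arbitrary illustrative assumptions.

atoms = [(1.0, np.array([0.10, 0.20, 0.30])),   # (f_j, fractional coords)
         (2.0, np.array([0.60, 0.40, 0.85]))]

def F(G, shift=(0.0, 0.0, 0.0)):
    """Kinematic structure factor, all atoms displaced by a common shift."""
    G = np.asarray(G, dtype=float)
    return sum(f * np.exp(2j * np.pi * np.dot(G, r + np.asarray(shift)))
               for f, r in atoms)

def wrap(phi):
    """Wrap an angle to (-pi, pi]."""
    return np.angle(np.exp(1j * phi))

def triplet(H, L, shift=(0.0, 0.0, 0.0)):
    """Phase triplet delta = alpha_{H-L} + alpha_L - alpha_H."""
    H, L = np.asarray(H), np.asarray(L)
    return wrap(np.angle(F(H - L, shift)) + np.angle(F(L, shift))
                - np.angle(F(H, shift)))

H, L = (2, 2, 0), (1, 1, 1)
shift = (0.13, 0.07, 0.21)              # arbitrary new origin
```

Each individual phase changes by −2πG·r₀ under an origin shift r₀, but the triplet picks up −2π[(H−L) + L − H]·r₀ = 0, which is why it is measurable.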
Special Multiple-Beam Effects

The second-order Born approximation not only provides an efficient computational technique, but also allows one to gain substantial insight into the physics involved in a multiple-beam diffraction process.

Three-Beam Interactions as the Leading Dynamical Effect. The successive terms in the Born series, Equation 60, represent different levels of multiple-beam interactions. For example, D^{(0)} is simply the incident beam (O), D^{(1)} consists of two-beam (O, H) diffraction, D^{(2)} involves three-beam (O, H, L) interactions, and so on. Equation 62 shows that even when more than three beams are involved, the individual three-beam interactions are the dominant effects compared to higher-order beam interactions. This conclusion is very important for computations of N-beam effects when N is large, and it can greatly simplify even the full dynamical calculations using NBEAM, as shown by Tischler and Batterman (1986). The multiple-beam interpretation of the Born series also implies that the three-beam effect is the leading term beyond the kinematic first-order Born approximation and is thus the dominant dynamical effect in diffraction. In a sense, the three-beam interactions (O → L → H) are even more important than the multiple scattering in the two-beam case, since the latter involves O → H → O → H (or higher-order) scattering, which is equivalent to a four-beam interaction.

Phase Information. Equation 63 shows explicitly the phase information involved in multiple-beam diffraction. The interference between the detoured wave D^{(2)} and the directly scattered wave D^{(1)} depends on the relative phase difference between the two waves. This phase difference is equal to the phase ν of the resonant denominator, plus the phase triplet δ of the structure-factor phases α_{H−L}, α_L, and α_H: δ = α_{H−L} + α_L − α_H. It can be shown that although the individual phases α_{H−L}, α_L, and α_H depend on the choice of origin in the unit cell, the phase triplet does not; it is therefore called the invariant phase triplet in crystallography. The resonant phase ν depends on whether the reciprocal node L is outside (k_0 < k_L) or inside (k_0 > k_L) the Ewald sphere. As the diffracting crystal is rotated through a three-beam excitation, ν changes by π, since L is swept through the Ewald sphere. This phase change of π, in addition to the constant phase triplet, is the cause of the asymmetric three-beam diffraction profiles, and it allows one to measure the structural phase δ in a diffraction experiment.

Polarization Mixing. For noncoplanar multiple-beam diffraction cases (i.e., L not in the plane defined by H and k_0), there is in general a mixing of the σ and π polarization states in the detoured wave (Shen, 1991, 1993). This means that if the incident beam is purely σ polarized, the diffracted beam may contain a π-polarized component in the case of multiple-beam diffraction, which does not happen in the case of two-beam diffraction. It can be shown that the polarization properties of the detour-diffracted beam in a three-beam case are governed by the following 2×2 matrix:

\[
A=\begin{pmatrix}
k_L^2-(\mathbf{L}\cdot\hat{\boldsymbol{\sigma}}_0)^2 &
-(\mathbf{L}\cdot\hat{\boldsymbol{\sigma}}_0)(\mathbf{L}\cdot\hat{\boldsymbol{\pi}}_0)\\
-(\mathbf{k}_L\cdot\hat{\boldsymbol{\pi}}_H)(\mathbf{L}\cdot\hat{\boldsymbol{\sigma}}_0) &
k_L^2(\hat{\boldsymbol{\pi}}_H\cdot\hat{\boldsymbol{\pi}}_0)-(\mathbf{k}_L\cdot\hat{\boldsymbol{\pi}}_H)(\mathbf{L}\cdot\hat{\boldsymbol{\pi}}_0)
\end{pmatrix}
\qquad(65)
\]
The off-diagonal elements in A indicate the mixing of the polarization states. This polarization mixing, together with the phase-sensitive multiple-beam interference, provides an unusual coupling to the incident-beam polarization state, especially when the incident polarization contains a circularly polarized component. The effect has been used to extract acentric phase information and to determine noncentrosymmetry in quasicrystals (Shen and Finkelstein, 1990; Zhang et al., 1999). If we use a known noncentrosymmetric crystal such as GaAs, the same effect provides a way to measure the degree of circular polarization and can be used to determine all Stokes polarization parameters for an x-ray beam (Shen and Finkelstein, 1992, 1993; Shen et al., 1995).

Multiple-Beam Standing Waves. The internal field in the case of multiple-beam diffraction is a 3D standing wave. This 3D standing wave can be detected, just as in the two-beam case, by observing x-ray fluorescence signals (Greiser and Materlik, 1986), and can be used to determine the 3D location of the fluorescing atom, similar to the method of triangulation using multiple separate two-beam cases. Multiple-beam standing waves are also responsible for the so-called super-Borrmann effect, because of the additional lowering of the wave field intensity around the atomic planes (Borrmann and Hartwig, 1965).
Polarization Density Matrix

If the incident beam is partially polarized (that is, it includes an unpolarized component), calculations in the case of multiple-beam diffraction can be rather complicated. One can simplify the algorithm a great deal by using a polarization density matrix, as in the case of magnetic x-ray scattering (Blume and Gibbs, 1988). The polarization density matrix is defined by

\[
\rho=\frac{1}{2}\begin{pmatrix}1+P_1 & P_2-iP_3\\ P_2+iP_3 & 1-P_1\end{pmatrix}
\qquad(66)
\]
where (P_1, P_2, P_3) are the normalized Stokes-Poincaré polarization parameters (Born and Wolf, 1983) that characterize the σ and π linear polarization, the 45°-tilted linear polarization, and the left- and right-handed circular polarization, respectively. A polarization-dependent scattering process, in which the incident beam (D_{0σ}, D_{0π}) is scattered into (D_{Hσ}, D_{Hπ}), can be described by a 2×2 matrix M whose elements M_{σσ}, M_{σπ}, M_{πσ}, and M_{ππ} represent the respective σ→σ, σ→π, π→σ, and π→π scattering amplitudes:

\[
\begin{pmatrix}D_{H\sigma}\\ D_{H\pi}\end{pmatrix}
=\begin{pmatrix}M_{\sigma\sigma} & M_{\pi\sigma}\\ M_{\sigma\pi} & M_{\pi\pi}\end{pmatrix}
\begin{pmatrix}D_{0\sigma}\\ D_{0\pi}\end{pmatrix}
\qquad(67)
\]
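As a sketch of this bookkeeping, the density matrix of Equation 66 can be combined with a 2×2 scattering matrix via the trace rule I_H = Tr(MρM†). The amplitudes in M below are arbitrary illustrative numbers, not the output of a real diffraction calculation.

```python
import numpy as np

# Sketch of the density-matrix bookkeeping: build rho from the
# Stokes-Poincare parameters (Eq. 66), apply a 2x2 scattering matrix, and
# take the trace to get the scattered intensity.  The amplitudes in M are
# arbitrary illustrative numbers, not a real diffraction calculation.

def density_matrix(P1, P2, P3):
    return 0.5 * np.array([[1 + P1,       P2 - 1j * P3],
                           [P2 + 1j * P3, 1 - P1      ]])

def scattered_intensity(M, rho):
    """I_H = Tr(M rho M^dagger)."""
    return np.trace(M @ rho @ M.conj().T).real

M = np.array([[0.90,         0.10 + 0.05j],
              [0.02 - 0.10j, 0.70        ]])    # hypothetical amplitudes

I_sigma = scattered_intensity(M, density_matrix(+1, 0, 0))   # pure sigma
I_unpol = scattered_intensity(M, density_matrix(0, 0, 0))    # unpolarized
```

An unpolarized beam gives exactly the average of the two pure linear-polarization intensities, which the trace formula guarantees since ρ(P₁=+1) + ρ(P₁=−1) is the identity.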
It can be shown that, with the density matrix ρ and the scattering matrix M, the density matrix of the scattered beam is ρ_H = MρM†, where M† is the Hermitian conjugate of M. The scattered intensity I_H is obtained by calculating the trace of the new density matrix:
\[
I_H=\mathrm{Tr}(\rho_H)
\qquad(68)
\]

This equation is valid for any incident-beam polarization, including the case of a partially polarized beam. We should note that the method is not restricted to dynamical theory; it is widely used in other fields of physics, such as quantum mechanics. In the case of multiple-beam diffraction, the matrix M can be evaluated using either the NBEAM program or one of the perturbation approaches.

GRAZING-ANGLE DIFFRACTION

Grazing-incidence diffraction (GID) of x rays or neutrons refers to situations in which either the incident or the diffracted beam forms a small angle, less than or in the vicinity of the critical angle, with a well-defined crystal surface. In these cases, both a Bragg-diffracted beam and a specular-reflected beam can occur simultaneously. Although there are only two beams, O and H, inside the crystal, the usual two-beam dynamical diffraction theory cannot be applied to this situation without some modifications (Afanasev and Melkonyan, 1983; Cowan et al., 1986; Hoche et al., 1986). These special considerations, however, can be automatically taken into account in the NBEAM theory discussed in the Multiple-Beam Diffraction section, as shown by Durbin and Gog (1989).

A GID geometry may include the following situations: (1) specular reflection, (2) coplanar GID involving highly asymmetric Bragg reflections, and (3) GID in an inclined geometry. Because of the substantial decrease in the penetration depth of the incident beam in these geometries, there have been widespread applications of GID using synchrotron radiation in recent years in materials studies of surface structures (Marra et al., 1979), depth-sensitive disorder and phase transitions (Dosch, 1992; Rhan et al., 1993; Krimmel et al., 1997; Rose et al., 1997), and long-period multilayers and superlattices (Barbee and Warburton, 1984; Salditt et al., 1994). We devote this section first to the basic concepts in GID geometries. A recent review on these topics has been given by Holy (1996); see also SURFACE X-RAY DIFFRACTION. In the Distorted-Wave Born Approximation section, we present the principle of this approximation (Vineyard, 1982; Dietrich and Wagner, 1983, 1984; Sinha et al., 1988), which provides a bridge between the dynamical Fresnel formula and the kinematic theory of surface scattering of x rays and neutrons.

Specular Reflectivity

It is straightforward to show that Fresnel's optical reflectivity, which is widely used in studies of mirrors (e.g., Bilderback, 1981), can be recovered in the dynamical theory of x-ray diffraction. We recall that in the case of one beam, the solution to the dispersion equation is given by Equation 7. Assuming a semi-infinite crystal and using the general boundary condition, Equation 11, we have the following equations across the interface (see Appendix for definition of terms):
\[
D_0^i+D_0^e=D_0
\]
\[
k_0\sin\theta\,(D_0^i-D_0^e)=K_0\sin\theta'\,D_0
\qquad(69)
\]
where θ and θ′ are the incidence angles of the external and internal incident beams (Fig. 1B). By using Equation 7 and the fact that K_0 and k_0 can differ only by a component normal to the surface, we arrive at the following wave-field ratios (for small angles):

\[
r_0\equiv\frac{D_0^e}{D_0^i}=\frac{\theta-\sqrt{\theta^2-\theta_c^2}}{\theta+\sqrt{\theta^2-\theta_c^2}}
\qquad(70a)
\]
\[
t_0\equiv\frac{D_0}{D_0^i}=\frac{2\theta}{\theta+\sqrt{\theta^2-\theta_c^2}}
\qquad(70b)
\]
with the critical angle defined as θ_c = (F_0)^{1/2}. For most materials, θ_c is on the order of a few milliradians. In general, θ_c can be taken as complex in order to account for absorption. Obviously, Equations 70a,b give the same reflection and transmission coefficients as the Fresnel theory in visible optics (see, e.g., Jackson, 1974). The specular reflectivity R is given by the squared magnitude of Equation 70a, R = |r_0|², while |t_0|² of Equation 70b is the internal wave-field intensity at the surface. Examples of |r_0|² and |t_0|² are shown in Figure 9A,B for a GaAs surface.
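Equations 70a,b are simple to evaluate directly. The sketch below takes θ_c real (absorption neglected; a complex θ_c would include it) and checks two limiting behaviors stated in the text: total external reflection below θ_c and the 1/q⁴ falloff well above it; the value of θ_c is an illustrative few milliradians.

```python
import numpy as np

# Direct evaluation of Equations 70a,b in the small-angle limit.  theta_c
# is taken real here (absorption neglected); a complex theta_c would model
# absorption.  The value of theta_c is an illustrative assumption.

def fresnel(theta, theta_c):
    """Amplitude reflection r0 and transmission t0 at grazing angle theta."""
    root = np.sqrt(complex(theta**2 - theta_c**2))
    r0 = (theta - root) / (theta + root)
    t0 = 2 * theta / (theta + root)
    return r0, t0

theta_c = 3.0e-3                       # radians, a few milliradians

r_below = fresnel(0.5 * theta_c, theta_c)[0]    # below the critical angle
t_at = fresnel(theta_c, theta_c)[1]             # at the critical angle
R10 = abs(fresnel(10 * theta_c, theta_c)[0])**2
R20 = abs(fresnel(20 * theta_c, theta_c)[0])**2
```

Below θ_c the root is imaginary, so |r₀| = 1 (total external reflection); at θ_c the internal field amplitude t₀ reaches 2, i.e., |t₀|² = 4, the enhancement visible in Figure 9B; and doubling the angle far above θ_c reduces R by roughly 2⁴ = 16.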
Figure 9. (A) Fresnel reflectivity curve for a GaAs surface at 1.48 Å. (B) Intensity of the internal field at the surface. (C) Penetration depth.

At θ ≫ θ_c, θ in Equations 70a,b should be replaced by the original sin θ, and the Fresnel reflectivity |r_0|² varies as 1/(2 sin θ)⁴, or as 1/q⁴, with q the momentum transfer normal to the surface. This inverse-fourth-power law is the same as that derived in kinematic theory (Sinha et al., 1988) and in the theory of small-angle scattering (Porod, 1952, 1982). At first glance, the 1/q⁴ asymptotic law is drastically different from the crystal-truncation-rod 1/q² behavior of the Bragg-reflection tails. A more careful inspection shows that the difference is due to the integral nature of the reflectivity over a more fundamental physical quantity, the differential cross-section dσ/dΩ, defined as the incident flux scattered into a detector area that forms a solid angle dΩ with respect to the scattering source. In both the Fresnel-reflectivity and the Bragg-reflection cases, dσ/dΩ ∝ 1/q² in reciprocal-space units. Reflectivity calculations in both cases involve integrating over the solid angle and converting the incident flux into an incident intensity; each gives rise to a factor of 1/sin θ (Sinha et al., 1988). The only difference is that in the case of Bragg reflections this factor is simply 1/sin θ_B, which is a constant for a given Bragg reflection, whereas for Fresnel reflectivity sin θ ∝ q results in an additional factor of 1/q².

Evanescent Wave

When θ < θ_c, the normal component K_{0n} of the internal wave vector K_0 is imaginary, so that the x-ray wave field inside the material diminishes exponentially as a function of depth, as given by

\[
D_0(\mathbf{r})=t_0\,e^{-\mathrm{Im}(K_{0n})\,r_n}\,e^{i\mathbf{k}_t\cdot\mathbf{r}_t}
\qquad(71)
\]

where t_0 is given by Equation 70b and r_n and r_t are the respective coordinates normal and parallel to the surface. The characteristic penetration depth τ (the 1/e value of the intensity) is given by τ = 1/[2 Im(K_{0n})], where Im(K_{0n}) is the imaginary part of K_{0n}. A plot of τ as a function of the incident angle θ is shown in Figure 9C. In general, a penetration depth (known as the skin depth) as short as 10 to 30 Å can be achieved with Fresnel specular reflection when θ < θ_c. The limit at θ = 0 is simply τ = λ/(4πθ_c), with λ the x-ray wavelength. This makes x-ray reflectivity-related measurements a very useful tool for studying the surfaces of various materials. At θ > θ_c, τ quickly becomes dominated by true photoelectric absorption, and its variation is simply geometrical. The large variation of τ around θ ≈ θ_c forms the basis for depth-controlled techniques such as x-ray fluorescence under total external reflection (de Boer, 1991; Hoogenhof and de Boer, 1994), grazing-incidence scattering and diffraction (Dosch, 1992; Lied et al., 1994; Dietrich and Hasse, 1995; Gunther et al., 1997), and grazing-incidence x-ray standing waves (Hashizume and Sakata, 1989; Jach et al., 1989; Jach and Bedzyk, 1993).

Multilayers and Superlattices
Synthetic multilayers and superlattices usually have long periods of 20 to 50 Å. Since the Bragg angles corresponding to these periods are necessarily small in an ordinary x-ray diffraction experiment, the superlattice diffraction peaks are usually observed in the vicinity of the specular reflection. Thus dynamical theory is often needed to describe the diffraction patterns from multilayers of amorphous materials and superlattices of nearly perfect crystals. A computational method for calculating the reflectivity from a multilayer system was first developed by Parratt (1954). In this method, a series of recursive equations for the wave field amplitudes is set up, based on the boundary conditions at each interface. Assuming that the last layer is a substrate that is sufficiently thick, one can find the solution for each layer backward and finally obtain the reflectivity from the top layer. For details of this method, we refer the reader to Parratt's original paper (1954) and to the more recent matrix formalism reviewed by Holy (1996). It should be pointed out that near the specular region the internal crystalline structure of the superlattice layers can be neglected, and only the average density of each layer contributes. Thus the reflectivity calculations for multilayers and for superlattices are identical near the specular reflections. The crystalline nature of a superlattice needs to be taken into account near or at Bragg reflections. With the help of the Takagi-Taupin equations, lattice mismatch and variations along the growth direction can also be taken into account, as shown by Bartels et al. (1986). By treating a semi-infinite single crystal as an extreme case of a superlattice or multilayer, one can calculate the reflectivity for the entire range from the specular to all of the Bragg reflections along a given crystallographic axis (Caticha, 1994). X-ray diffraction studies of laterally structured superlattices with periods of 0.1 to 1 μm, such as surface
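The Parratt recursion described above can be sketched as follows, in the same small-angle notation as Equations 70a,b. The layer parameters (critical angle, thickness) are illustrative assumptions; with no layers the recursion reduces to the single-surface Fresnel result.

```python
import numpy as np

# Sketch of Parratt's (1954) recursion for the specular reflectivity of a
# layer stack, in the same small-angle notation as Equations 70a,b.
# Layers are (critical angle, thickness in angstrom), top first; the
# substrate is semi-infinite.  All numbers are illustrative assumptions.

def kz(theta, theta_c, k0):
    """Normal wave-vector component in a medium with critical angle theta_c."""
    return k0 * np.sqrt(complex(theta**2 - theta_c**2))

def parratt(theta, layers, theta_c_sub, k0=2 * np.pi / 1.48):
    """Specular reflectivity |R|^2 of vacuum / layers / substrate."""
    kzs = [kz(theta, 0.0, k0)]                          # vacuum
    kzs += [kz(theta, tc, k0) for tc, _ in layers]
    kzs.append(kz(theta, theta_c_sub, k0))
    thick = [d for _, d in layers]
    R = 0.0
    # Recurse upward from the substrate interface to the top surface
    for j in range(len(kzs) - 2, -1, -1):
        r = (kzs[j] - kzs[j + 1]) / (kzs[j] + kzs[j + 1])
        if j == len(kzs) - 2:                           # bottom interface
            R = r
        else:
            phase = np.exp(2j * kzs[j + 1] * thick[j])  # thickness of layer j+1
            R = (r + R * phase) / (1 + r * R * phase)
    return abs(R)**2
```

With an empty layer list this reproduces |r₀|² of Equation 70a, and a zero-thickness layer leaves the result unchanged, two quick consistency checks on the recursion.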
Since most GID experiments are performed in the inclined geometry, we will focus only on this geometry and refer the highly asymmetric cases to the literature (Hoche et al., 1988; Kimura and Harada, 1994; Holy, 1996). In an inclined GID arrangement, both the incident beam and the diffracted beam form a small angle with respect to the surface, as shown in Figure 10A, with the scattering vector parallel to the surface. This geometry involves two internal waves, O and H, and three external waves, incident O, specular reflected O and diffracted H beams. With proper boundary conditions, the diffraction problem can be solved analytically as shown by several authors (Afanasev and Melkonyan, 1983; Cowan et al., 1986; Hoche et al., 1986; Hung and Chang, 1989; Jach et al., 1989). Durbin and Gog (1989) applied the NBEAM program to GID geometry. A characteristic dynamical effect in GID geometry is a double-critical-angle phenomenon due to the diameter gap of the dispersion surface for H reflection. This can be seen intuitively from simple geometric considerations. Inside
the crystal, only two beams, O and H, are excited, and thus the usual two-beam theory described in the Two-Beam Diffraction section applies. The dispersion surface inside the crystal is exactly the same as shown in Figure 2A. The only difference is the boundary condition. In the GID case, the surface normal is perpendicular to the page in Figure 2, and therefore the circular curvature out of the page needs to be taken into account. For simplicity, we consider only the diameter points on the dispersion surface for one polarization state. A cut through the diameter points L and Q in Figure 2 is shown schematically in Figure 10B; it consists of three concentric circles representing the α and β branches of the hyperboloids of revolution, and the vacuum sphere at point L. At very small incident angles, we see that no tie points can be excited and only total specular reflection can exist. As the incident angle increases so that φ > φ_c^α, the α tie points are excited but the β branch remains unexcited. Thus the specular reflectivity maintains a lower plateau until φ > φ_c^β, when both α and β modes can exist inside the crystal. Meanwhile, the Bragg reflected beam should be fully excited when φ_c^α < φ < φ_c^β, but because of the partial specular reflection its diffracted intensity is much reduced. These effects can be clearly seen in the example shown in Figure 11, which is for a Ge(220) reflection with a (1-11) surface orientation. If the Bragg condition is not satisfied exactly, then the circle labeled L in Figure 10B is split into two concentric ones representing the two spheres centered at O and H, respectively. We then see that the exit take-off angles can be different for the reflected O beam and the diffracted H beam.
With a position-sensitive linear detector and a range of incident angles, angular profiles (or rod profiles) of diffracted beams can be observed directly, which can provide depth-sensitive structural information near a crystal surface (Dosch et al., 1986; Bernhard et al., 1987).
Figure 10. (A) Schematic of the grazing-incidence diffraction geometry. (B) A cut through the diameter points of the dispersion surface.
Figure 11. Specular and Bragg reflectivity at the center of the rocking curve for the Ge(220) reflection with a (1-11) surface orientation.
gratings and quantum wire and dot arrays, have been of much interest in materials science in recent years (Bauer et al., 1996; Shen, 1996b). Most of these studies can be dealt with using kinematic diffraction theory (Aristov et al., 1988), and a rich amount of information can be obtained, such as feature profiles (Shen et al., 1993; Darhuber et al., 1994), roughness of sidewall surfaces (Darhuber et al., 1994), imperfections in grating arrays (Shen et al., 1996b), size-dependent strain fields (Shen et al., 1996a), and strain gradients near interfaces (Shen and Kycia, 1997). Only in the regimes of total external reflection and GID are dynamical treatments necessary, as demonstrated by Tolan et al. (1992, 1995) and by Darowski et al. (1997).
COMPUTATION AND THEORETICAL METHODS
Distorted-Wave Born Approximation

GID, which was discussed in the last section, can be viewed as the dynamical diffraction of the internal evanescent wave, Equation 71, generated by specular reflection under grazing-angle conditions. If the rescattering mechanism is relatively weak, as in the case of a surface layer, then dynamical diffraction theory may not be necessary and the Born approximation can be substituted to evaluate the scattering of the evanescent wave. This approach is called the distorted-wave Born approximation (DWBA) in quantum mechanics (see, e.g., Schiff, 1955), and was first applied to x-ray scattering from surfaces by Vineyard (1982) and by Dietrich and Wagner (1983, 1984). It was noted by Dosch et al. (1986) that Vineyard's original treatment did not handle the exit-angle dependence properly because of a missing factor in its reciprocity arrangement. The DWBA has been applied to several different scattering situations, including specular diffuse scattering from a rough surface, crystal truncation rod scattering near a surface, diffuse scattering in multilayers, and near-surface diffuse scattering in binary alloys (X-RAY AND NEUTRON DIFFUSE SCATTERING MEASUREMENTS). The underlying principle is the same for all these cases, and we will discuss only specular diffuse scattering to illustrate it. From a dynamical theory point of view, the DWBA is shown schematically in Figure 12A. An incident beam k0 creates an internal incident beam K0 and a specular reflected beam k0′. We then assume that the internal beam K0 is scattered by a weak "Bragg reflection" at a lateral momentum transfer qt. As in the two-beam case of dynamical theory, we draw two spheres centered at qt, shown as the dashed circles in Figure 12A. However, the internal diffracted wave vector is determined by kinematic scattering as Kq = K0 + q, where q includes both the lateral component qt and a component qn normal to the surface, defined by the usual 2θ angle.
Therefore only one of the tie points on the internal sphere is excited, giving rise to Kq. Outside the surface, we have two tie points that yield kq and kq′, respectively, as defined in dynamical theory. Altogether we have six beams: three associated with O and three associated with q. The connection between the O and the q beams is through the internal kinematic scattering

Dq = S(qt) D0   (72)
where S(qt) is the surface scattering form factor. As will be seen later, |S(qt)|² represents the scattering cross-section per unit surface area defined by Sinha et al. (1988) and equals the Fourier transform of the height-height correlation function C(rt) in the case of not-too-rough surfaces. To find the diffuse-scattered exit wave field Dq^e, we use the optical reciprocity theorem of Helmholtz (Born and Wolf, 1983) and reverse the directions of all three wave vectors of the q beams. We see immediately that the situation is identical to that discussed at the beginning of this section for Fresnel reflections. Thus, we should have

Dq^e = tq Dq,   tq = 2θq / (θq + √(θq² − θc²))   (73)
Figure 12. (A) Dynamical theory illustration of the distorted-wave Born approximation. (B) Typical diffuse scattering profile in specular reflectivity with Yoneda wings.
Using Equations 70b and 72, we obtain

Dq^e = t0 tq S(qt) D0^i   (74)
and the diffuse scattering intensity is simply given by

Idiff = |Dq^e / D0^i|² = |t0|² |tq|² |S(qt)|²   (75)
Apart from a proper normalization factor, Equation 75 is the same as that given by Sinha et al. (1988). Of course, here the scattering strength |S(qt)|² is only a symbolic quantity. For the physical meaning of the various surface roughness correlation functions and their scattering forms, we refer to the article by Sinha et al. (1988) for a more detailed discussion. In a specular reflectivity measurement, one usually uses so-called rocking scans to record a diffuse scattering profile. The amount of diffuse scattering is determined by
the overall surface roughness, and the shape of the profile is determined by the lateral roughness correlations. An example of a computer-simulated rocking scan is shown in Figure 12B for a GaAs surface at 1.48 Å with the detector at 2θ = 3°. The parameter |S(qt)|² is assumed to be a Lorentzian with a correlation length of 4000 Å. The two peaks at θ ≈ 0.3° and 2.7° correspond to the situation where the incident or the exit beam makes the critical angle with respect to the surface. These peaks are essentially due to the enhancement of the evanescent wave (standing wave) at the critical angle (Fig. 9B) and are often called the Yoneda wings, as they were first observed by Yoneda (1963). Diffuse scattering of x rays, neutrons, and electrons is widely used in materials science to characterize surface morphology and roughness. The measurements can be performed not only near the specular reflection but also around nonspecular crystal truncation rods in grazing-incidence inclined geometry (Shen et al., 1989; Stepanov et al., 1996). Spatially correlated roughness and morphologies in multilayer systems have also been studied using diffuse x-ray scattering (Headrick and Baribeau, 1993; Baumbach et al., 1994; Kaganer et al., 1996; Paniago et al., 1996; Darhuber et al., 1997). Some of these topics are discussed in detail in KINEMATIC DIFFRACTION OF X RAYS and in the units on x-ray surface scattering (see X-RAY TECHNIQUES).
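The rocking-scan profile described above follows directly from Equation 75, and is easy to reproduce numerically. The sketch below is illustrative only: the wavelength, detector angle, and correlation length follow the example in the text, while the critical angle of 0.3° and the Lorentzian form of |S(qt)|² are assumptions. It evaluates Idiff = |t0|²|tq|²|S(qt)|² across a rocking scan at fixed 2θ and shows the Yoneda enhancement wherever the incident or the exit angle crosses the critical angle.

```python
import numpy as np

def fresnel_t(theta, theta_c):
    """Fresnel transmission amplitude t = 2*theta/(theta + sqrt(theta^2 - theta_c^2)).

    Below theta_c the square root is imaginary and |t| rises to its maximum of 2
    at theta = theta_c; this enhancement is the origin of the Yoneda wings.
    """
    th = np.asarray(theta, dtype=complex)
    return 2.0 * th / (th + np.sqrt(th ** 2 - theta_c ** 2))

# Rocking scan at fixed detector angle 2theta = 3 deg: as the sample rocks, the
# incidence angle alpha_i varies and the exit angle is alpha_f = 2theta - alpha_i.
two_theta = np.radians(3.0)
theta_c = np.radians(0.3)                 # assumed critical angle
k0 = 2.0 * np.pi / 1.48                   # wavelength 1.48 A
alpha_i = np.radians(np.linspace(0.05, 2.95, 1000))
alpha_f = two_theta - alpha_i
# lateral momentum transfer along the scan, and a Lorentzian |S(q_t)|^2
qt = k0 * (np.cos(alpha_f) - np.cos(alpha_i))
xi = 4000.0                               # correlation length, angstroms
S2 = 1.0 / (1.0 + (qt * xi) ** 2)
# Equation 75: I_diff = |t0|^2 |tq|^2 |S(qt)|^2
I_diff = (np.abs(fresnel_t(alpha_i, theta_c)) ** 2
          * np.abs(fresnel_t(alpha_f, theta_c)) ** 2 * S2)
```

The resulting profile develops side maxima near alpha_i = 0.3° and 2.7°, as in the simulated rocking scan of Figure 12B.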
Expanded Distorted-Wave Approximation

The scheme of the distorted-wave approximation can be extended to calculate nonspecular scattering that includes multilayer diffraction peaks from a multilayer system, where a recursive Fresnel theory is usually used to evaluate the distorted wave (Kortright and Fischer-Colbrie, 1987; Holy and Baumbach, 1994). Recently, Shen (1999b,c) has further developed an expanded distorted-wave approximation (EDWA) to include multiple-beam diffraction from bulk crystals, where a two-beam dynamical theory is applied to obtain the distorted internal waves. In Shen's EDWA theory, a sinusoidal Fourier component G is added to the distorting susceptibility component, which represents a charge-density modulation of the G reflection. Instead of the Fresnel theory, a two-beam dynamical theory is employed to evaluate the distorted wave, while the subsequent scattering of the distorted wave is again handled by the first-order Born approximation. We now briefly outline this EDWA approach. Following the formal distorted-wave description given in Vineyard (1982), δε(r) in the fundamental equation 3 is separated into a distorting component δε1(r) and the remaining part δε2(r): δε(r) = δε1(r) + δε2(r), where δε1(r) contains the homogeneous average susceptibility, plus a single predominant Fourier component G:

δε1(r) = (F0 + FG e^{iG·r} + F−G e^{−iG·r})   (76)

and the remaining δε2(r) is

δε2(r) = Σ_{L≠0,G} FL e^{iL·r}   (77)

Since FG = |FG| exp(iαG) and F−G = FG* = |FG| exp(−iαG) if absorption is negligible, it can be seen that the additional component in Equation 76 represents a sinusoidal distortion, 2|FG| cos(G·r + αG). The distorted wave D1(r), due only to δε1(r), satisfies the following equation:

(∇² + k0²) D1 = −∇ × ∇ × (δε1 D1)   (78)

which is a standard two-beam case, since only the O and G Fourier components exist in δε1(r), and can therefore be solved by the two-beam dynamical theory (Batterman and Cole, 1964; Pinsker, 1978). It can be shown that the total distorted wave D1(r) can be expressed as follows:

D1(r) = D0 (r0 e^{iK0·r} + rG e^{iαG} e^{iKG·r})   (79)

where

r0 = 1,   rG = √|b| (ηG − √(ηG² − 1))   (80)

in the semi-infinite Bragg case and

r0 = cos(AηG) + i η sin(AηG)/ηG,   rG = i √|b| sin(AηG)/ηG   (81)
in the thin transparent Laue case. Here the standard notations of the Two-Beam Dynamical Theory section are used. It should be noted that the amplitudes of these distorted waves, given by Equations 80 and 81, are slowly varying functions of depth z through the parameter A, since A is smaller than K0·r or KG·r by a factor of |FG|, which ranges from 10⁻⁵ to 10⁻⁶ for inorganic crystals and 10⁻⁷ to 10⁻⁸ for protein crystals. We now consider the rescattering of the distorted wave D1(r), Equation 79, by the remaining part of the susceptibility δε2(r) defined in Equation 77. Using the first-order Born approximation, the scattered wave field D(r) is given by

D(r) = (e^{ik0 r}/4πr) ∫ dr′ e^{−ik0 u·r′} ∇′ × ∇′ × [δε2(r′) D1(r′)]   (82)
where u is a unit vector and r is the distance from the sample to the observation point, and the integral is evaluated over the sample volume. The amplitudes r0 and rG can be factored out of the integral because their spatial dependence is much weaker than that of K0·r and KG·r, as mentioned above. The primary extinction effects in Bragg cases and the Pendellösung effects in Laue cases are taken into account by first evaluating the intensity IH(z) scattered by a volume element at a certain depth z, and then taking an average over z to obtain the final diffracted intensity. It is worth noting that the distorted wave, Equation 79, can be viewed as the new incident wave for the Born approximation, Equation 59, and it consists of two beams, K0 and KG. These two incident beams can each produce its own diffraction pattern. If reflection H satisfies Bragg's
law, k0 u = K0 + H ≡ KH, and is excited by K0, then there always exists a reflection H − G, excited by KG, such that the doubly scattered wave travels along the same direction as KH, since KG + (H − G) = KH. With this in mind, and using the algebra given in the Second-Order Born Approximation section, it is easy to show that Equation 82 gives rise to the following scattered wave:

DH = N re u × (u × D0) (e^{ik0 r}/r) (FH r0 + FH−G rG e^{iαG})   (83)
Normalizing to the conventional first-order Born wave field DH^(1) defined by Equation 61, Equation 83 can be rewritten as

DH = DH^(1) (r0 + |FH−G/FH| rG e^{iδ})   (84)
where δ = αH−G + αG − αH is the invariant triplet phase widely used in crystallography. Finally, the scattered intensity in the kH = KH = k0 u direction is given by IH = (1/t) ∫0^t |DH|² dz, which is averaged over the thickness t of the crystal as discussed in the last paragraph. Numerical results show that the EDWA theory outlined here provides excellent agreement with full NBEAM dynamical calculations, even at the center of a multiple-reflection peak. For further information, refer to Shen (1999b,c).
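As a numerical illustration of Equations 81 and 84 (not Shen's actual EDWA code: the ratio |FH−G/FH| and the thickness parameter are hypothetical, and the crystal is taken at the exact Bragg condition of the G reflection, where ηG = 1 in the thin transparent Laue case), the sketch below evaluates the thickness-averaged intensity IH = (1/t)∫|DH|² dz and shows its sensitivity to the sign of the triplet phase δ, which is what makes direct phase measurement possible.

```python
import numpy as np

def edwa_intensity(delta, rho=0.1, b=1.0, a_total=0.5, nz=2000):
    """Thickness-averaged intensity ratio I_H / I_H^(1) in the EDWA.

    Evaluated at the exact Bragg condition of the G reflection (eta_G = 1) in
    the thin transparent Laue case, where Equation 81 reduces to
        r0 = cos(A),  rG = i*sqrt(|b|)*sin(A).
    delta   : triplet phase alpha_{H-G} + alpha_G - alpha_H
    rho     : |F_{H-G}/F_H| (hypothetical magnitude ratio)
    a_total : value of the dimensionless parameter A at the exit surface
    """
    A = np.linspace(0.0, a_total, nz)        # A grows linearly with depth z
    r0 = np.cos(A)
    rG = 1j * np.sqrt(abs(b)) * np.sin(A)
    D = r0 + rho * rG * np.exp(1j * delta)   # Equation 84, normalized to D_H^(1)
    return float(np.mean(np.abs(D) ** 2))    # I_H = (1/t) * integral of |D_H|^2 dz

# The averaged intensity differs for triplet phases of opposite sign, which is
# the basis of direct phase measurement with multiple-beam diffraction:
I_plus = edwa_intensity(np.pi / 2)
I_minus = edwa_intensity(-np.pi / 2)
```

For δ = ±π/2 the interference term changes sign, so I_plus and I_minus differ, whereas δ = 0 and δ = π give identical intensities in this transparent-crystal limit.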
SUMMARY

In this unit, we have reviewed the basic elements of dynamical diffraction theory for perfect or nearly perfect crystals. Although the eventual goal of obtaining structural information is the same, the dynamical approach is considerably different from that of kinematic theory. A key distinction is the inclusion of multiple scattering processes in the dynamical theory, whereas the kinematic theory is based on a single scattering event. We have mainly focused on the Ewald-von Laue approach of the dynamical theory. There are four essential ingredients in this approach: (1) dispersion surfaces that determine the possible wave fields inside the material; (2) boundary conditions that relate the internal fields to the outside incident and diffracted beams; (3) intensities of diffracted, reflected, and transmitted beams that can be directly measured; and (4) internal wave field intensities that can be measured indirectly from the signals of secondary excitations. Because of the interconnections of different beams due to multiple scattering, experimental techniques based on dynamical diffraction can often offer unique structural information. Such techniques include determination of impurity locations with x-ray standing waves, depth profiling with grazing-incidence diffraction and fluorescence, and direct measurement of the phases of structure factors with multiple-beam diffraction. These new and developing techniques have benefited substantially from the rapid growth of synchrotron radiation facilities around the world. With more and newer-generation facilities becoming available, we believe that dynamical diffraction studies of various materials will continue to expand in application and become more common and routine for materials scientists and engineers.
ACKNOWLEDGMENTS The author would like to thank Boris Batterman, Ernie Fontes, Ken Finkelstein, and Stefan Kycia for critical reading of this manuscript. This work is supported by the National Science Foundation through CHESS under grant number DMR-9311772.
LITERATURE CITED Afanasev, A. M. and Melkonyan, M. K. 1983. X-ray diffraction under specular reflection conditions. Ideal crystals. Acta Crystallogr. Sec. A 39:207–210. Aleksandrov, P. A., Afanasev, A. M., and Stepanov, S. A. 1984. Bragg-Laue diffraction in inclined geometry. Phys. Status Solidi A 86:143–154. Anderson, S. K., Golovchenko, J. A., and Mair, G. 1976. New application of x-ray standing wave fields to solid state physics. Phys. Rev. Lett. 37:1141–1144. Andrews, S. R. and Cowley, R. A. 1985. Scattering of X-rays from crystal surfaces. J. Phys. C 18:6427–6439. Aristov, V. V., Winter, U., Nikilin, A. Y., Redkin, S. V., Snigirev, A.A., Zaumseil, P., and Yunkin, V.A. 1988. Interference thickness oscillations of an x-ray wave on periodically profiled silicon. Phys. Status Solidi A 108:651–655. Authier, A. 1970. Ewald waves in theory and experiment. In Advances in Structure Research by Diffraction Methods Vol. 3 (R. Brill and R. Mason, eds.) pp. 1–51. Pergamon Press, Oxford. Authier, A. 1986. Angular dependence of the absorption-induced nodal plane shifts of x-ray stationary waves. Acta Crystallogr. Sec. A 42:414–425. Authier, A. 1992. Dynamical theory of x-ray diffraction In International Tables for Crystallography, Vol. B (U. Shmueli, ed.) pp. 464–480. Academic, Dordrecht. Authier, A. 1996. Dynamical theory of x-ray diffraction—I. Perfect crystals; II. Deformed crystals. In X-ray and Neutron Dynamical Diffraction: Theory and Applications (A. Authier, S. Lagomarsino, and B. K. Tanner, eds.). Plenum, New York. Barbee, T. W. and Warburton, W. K. 1984. X-ray evanescent and standing-wave fluorescence studies using a layered synthetic microstructure. Mater. Lett. 3:17–23. Bartels, W. J., Hornstra, J., and Lobeek, D. J. W. 1986. X-ray diffraction of multilayers and superlattices. Acta Crystallogr. Sec. A 42:539–545. Batterman, B. W. 1964. Effect of dynamical diffraction in x-ray fluorescence scattering. Phys. Rev. 133:759–764. Batterman, B. W. 1969. 
Detection of foreign atom sites by their x-ray fluorescence scattering. Phys. Rev. Lett. 22:703–705. Batterman, B. W. 1992. X-ray phase plate. Phys. Rev. B 45:12677– 12681. Batterman, B. W. and Bilderback, D. H. 1991. X-ray monochromators and mirrors. In Handbook on Synchrotron Radiation, Vol. 3 (G. S. Brown and D. E. Moncton, eds.) pp. 105–153. NorthHolland, New York.
Batterman, B. W. and Cole, H. 1964. Dynamical diffraction of x-rays by perfect crystals. Rev. Mod. Phys. 36:681–717. Bauer, G., Darhuber, A. A., and Holy, V. 1996. Structural characterization of reactive ion etched semiconductor nanostructures using x-ray reciprocal space mapping. Mater. Res. Soc. Symp. Proc. 405:359–370.
Quantitative phase determination for macromolecular crystals using stereoscopic multibeam imaging. Acta Crystallogr. A 55:933–938. Chang, S. L., King, H. E., Jr., Huang, M.-T., and Gao, Y. 1991. Direct phase determination of large macromolecular crystals using three-beam x-ray interference. Phys. Rev. Lett. 67:3113–3116.
Baumbach, G. T., Holy, V., Pietsch, U., and Gailhanou, M. 1994. The influence of specular interface reflection on grazing incidence X-ray diffraction and diffuse scattering from superlattices. Physica B 198:249–252.
Chapman, L. D., Yoder, D. R., and Colella, R. 1981. Virtual Bragg scattering: A practical solution to the phase problem. Phys. Rev. Lett. 46:1578–1581.
Bedzyk, M. J., Bilderback, D. H., Bommarito, G. M., Caffrey, M., and Schildkraut, J. S. 1988. Long-period standing waves as molecular yardstick. Science 241:1788–1791.
Chikawa, J.-I. and Kuriyama, M. 1991. Topography. In Handbook on Synchrotron Radiation, Vol. 3 (G. S. Brown and D. E. Moncton, eds.) pp. 337–378. North-Holland, New York.
Bedzyk, M. J., Bilderback, D., White, J., Abruna, H. D., and Bommarito, M. G. 1986. Probing electrochemical interfaces with x-ray standing waves. J. Phys. Chem. 90:4926–4928.
Chung, J.-S. and Durbin, S. M. 1995. Dynamical diffraction in quasicrystals. Phys. Rev. B 51:14976–14979.
Bedzyk, M. J. and Materlik, G. 1985. Two-beam dynamical solution of the phase problem: A determination with x-ray standing-wave fields. Phys. Rev. B 32:6456–6463. Bedzyk, M. J., Shen, Q., Keeffe, M., Navrotski, G., and Berman, L. E. 1989. X-ray standing wave surface structural determination for iodine on Ge (111). Surf. Sci. 220:419–427. Belyakov, V. and Dmitrienko, V. 1989. Polarization phenomena in x-ray optics. Sov. Phys. Usp. 32:697–719. Berman, L. E., Batterman, B. W., and Blakely, J. M. 1988. Structure of submonolayer gold on silicon (111) from x-ray standingwave triangulation. Phys. Rev. B 38:5397–5405. Bernhard, N., Burkel, E., Gompper, G., Metzger, H., Peisl, J., Wagner, H., and Wallner, G. 1987. Grazing incidence diffraction of X-rays at a Si single crystal surface: Comparison of theory and experiment. Z. Physik B 69:303–311.
Cole, H., Chambers, F. W., and Dunn, H. M. 1962. Simultaneous diffraction: Indexing Umweganregung peaks in simple cases. Acta Crystallogr. 15:138–144. Colella, R. 1972. N-beam dynamical diffraction of high-energy electrons at glancing incidence. General theory and computational methods. Acta Crystallogr. Sec. A 28:11–15. Colella, R. 1974. Multiple diffraction of x-rays and the phase problem. computational procedures and comparison with experiment. Acta Crystallogr. Sec. A 30:413–423. Colella, R. 1991. Truncation rod scattering: Analysis by dynamical theory of x-ray diffraction. Phys. Rev. B 43:13827–13832. Colella, R. 1995. Multiple Bragg scattering and the phase problem in x-ray diffraction. I. Perfect crystals. Comments Cond. Mater. Phys. 17:175–215.
Bethe, H. A. 1928. Ann. Phys. (Leipzig) 87:55. Bilderback, D. H. 1981. Reflectance of x-ray mirrors from 3.8 to 50 keV (3.3 to 0.25 Å). SPIE Proc. 315:90–102.
Colella, R. 1996. Multiple Bragg scattering and the phase problem in x-ray diffraction. II. Perfect crystals; Mosaic crystals. In X-ray and Neutron Dynamical Diffraction: Theory and Applications (A. Authier, S. Lagomarsino, and B. K. Tanner, eds.). Plenum, New York.
Bilderback, D. H., Hoffman, S. A., and Thiel, D. J. 1994. Nanometer spatial resolution achieved in hard x-ray imaging and Laue diffraction experiments. Science 263:201–203.
Cowan, P. L., Brennan, S., Jach, T., Bedzyk, M. J., and Materlik, G. 1986. Observations of the diffraction of evanescent x rays at a crystal surface. Phys. Rev. Lett. 57:2399–2402.
Blume, M. and Gibbs, D. 1988. Polarization dependence of magnetic x-ray scattering. Phys. Rev. B 37:1779–1789.
Cowley, J. M. 1975. Diffraction Physics. North-Holland Publishing, New York.
Born, M. and Wolf, E. 1983. Principles of Optics, 6th ed. Pergamon, New York.
Darhuber, A. A., Koppensteiner, E., Straub, H., Brunthaler, G., Faschinger, W., and Bauer, G. 1994. Triple axis x-ray investigations of semiconductor surface corrugations. J. Appl. Phys. 76:7816–7823.
Borrmann, G. 1950. Die Absorption von Rontgenstrahlen in Fall der Interferenz. Z. Phys. 127:297–323. Borrmann, G. and Hartwig, Z. 1965. Z. Kristallogr. Kristallgeom. Krystallphys. Kristallchem. 121:401. Brummer, O., Eisenschmidt, C., and Hoche, H. 1984. Polarization phenomena of x-rays in the Bragg case. Acta Crystallogr. Sec. A 40:394–398. Caticha, A. 1993. Diffraction of x-rays at the far tails of the Bragg peaks. Phys. Rev. B 47:76–83. Caticha, A. 1994. Diffraction of x-rays at the far tails of the Bragg peaks. II. Darwin dynamical theory. Phys. Rev. B 49:33–38. Chang, S. L. 1982. Direct determination of x-ray reflection phases. Phys. Rev. Lett. 48:163–166. Chang, S. L. 1984. Multiple Diffraction of X-Rays in Crystals. Springer-Verlag, Heidelberg. Chang, S.-L. 1998. Determination of X-ray Reflection Phases Using N-Beam Diffraction. Acta Crystallogr. A 54:886–894. Chang, S. L. 1992. X-ray phase problem and multi-beam interference. Int. J. Mod. Phys. 6:2987–3020. Chang, S.-L., Chao, C.-H., Huang, Y.-S., Jean, Y.-C., Sheu, H.-S., Liang, F.-J., Chien, H.-C., Chen, C.-K., and Yuan, H. S. 1999.
Darhuber, A. A., Schittenhelm, P., Holy, V., Stangl, J., Bauer, G., and Abstreiter, G. 1997. High-resolution x-ray diffraction from multilayered self-assembled Ge dots. Phys. Rev. B 55:15652– 15663. Darowski, N., Paschke, K., Pietsch, U., Wang, K. H., Forchel, A., Baumbach, T., and Zeimer, U. 1997. Identification of a buried single quantum well within surface structured semiconductors using depth resolved x-ray grazing incidence diffraction. J. Phys. D 30:L55–L59. Darwin, C. G. 1914. The theory of x-ray reflexion. Philos. Mag. 27:315–333; 27:675–690. Darwin, C. G. 1922. The reflection of x-rays from imperfect crystals. Philos. Mag. 43:800–829. de Boer, D. K. G. 1991. Glancing-incidence X-ray fluorescence of layered materials. Phys. Rev. B 44:498–511. Dietrich, S. and Haase, A. 1995. Scattering of X-rays and neutrons at interfaces. Phys. Rep. 260: 1–138. Dietrich, S. and Wagner, H. 1983. Critical surface scattering of x-rays and neutrons at grazing angles. Phys. Rev. Lett. 51: 1469–1472.
Dietrich, S. and Wagner, H. 1984. Critical surface scattering of x-rays at grazing angles. Z. Phys. B 56:207–215. Dosch, H. 1992. Evanescent X-rays probing surface-dominated phase transitions. Int. J. Mod. Phys. B 6:2773–2808. Dosch, H., Batterman, B. W., and Wack, D. C. 1986. Depth-controlled grazing-incidence diffraction of synchrotron x-radiation. Phys. Rev. Lett. 56:1144–1147. Durbin, S. M. 1987. Dynamical diffraction of x-rays by perfect magnetic crystals. Phys. Rev. B 36:639–643. Durbin, S. M. 1988. X-ray standing wave determination of Mn sublattice occupancy in a Cd1−xMnxTe mosaic crystal. J. Appl. Phys. 64:2312–2315. Durbin, S. M. 1995. Darwin spherical-wave theory of kinematic surface diffraction. Acta Crystallogr. Sec. A 51:258–268. Durbin, S. M., Berman, L. E., Batterman, B. W., and Blakely, J. M. 1986. Measurement of the silicon (111) surface contraction. Phys. Rev. Lett. 56:236–239. Durbin, S. M. and Follis, G. C. 1995. Darwin theory of heterostructure diffraction. Phys. Rev. B 51:10127–10133. Durbin, S. M. and Gog, T. 1989. Bragg-Laue diffraction at glancing incidence. Acta Crystallogr. Sec. A 45:132–141. Ewald, P. P. 1917. Zur Begrundung der Kristalloptik. III. Die Kristalloptik der Rontgenstrahlen. Ann. Physik (Leipzig) 54:519–597. Ewald, P. P. and Heno, Y. 1968. X-ray diffraction in the case of three strong rays. I. Crystal composed of non-absorbing point atoms. Acta Crystallogr. Sec. A 24:5–15. Feng, Y. P., Sinha, S. K., Fullerton, E. E., Grubel, G., Abernathy, D., Siddons, D. P., and Hastings, J. B. 1995. X-ray Fraunhofer diffraction patterns from a thin-film waveguide. Appl. Phys. Lett. 67:3647–3649. Fewster, P. F. 1996. Superlattices. In X-ray and Neutron Dynamical Diffraction: Theory and Applications (A. Authier, S. Lagomarsino, and B. K. Tanner, eds.). Plenum, New York. Fontes, E., Patel, J. R., and Comin, F. 1993. Direct measurement of the asymmetric dimer buckling of Ge on Si(001). Phys. Rev. Lett. 70:2790–2793.
Gunther, R., Odenbach, S., Scharpf, O., and Dosch, H. 1997. Reflectivity and evanescent diffraction of polarized neutrons from Ni(110). Physica B 234-236:508–509. Hart, M. 1978. X-ray polarization phenomena. Philos. Mag. B 38:41–56. Hart, M. 1991. Polarizing x-ray optics for synchrotron radiation. SPIE Proc. 1548:46–55. Hart, M. 1996. X-ray optical beamline design principles. In X-ray and Neutron Dynamical Diffraction: Theory and Applications (A. Authier, S. Lagomarsino, and B. K. Tanner, eds.). Plenum, New York. Hashizume, H. and Sakata, O. 1989. Dynamical diffraction of X-rays from crystals under grazing-incidence conditions. J. Crystallogr. Soc. Jpn. 31:249–255; Coll. Phys. C 7:225–229. Headrick, R. L. and Baribeau, J. M. 1993. Correlated roughness in Ge/Si superlattices on Si(100). Phys. Rev. B 48:9174–9177. Hirano, K., Ishikawa, T., and Kikuta, S. 1995. Development and application of x-ray phase retarders. Rev. Sci. Instrum. 66:1604–1609. Hirano, K., Izumi, K., Ishikawa, T., Annaka, S., and Kikuta, S. 1991. An x-ray phase plate using Bragg case diffraction. Jpn. J. Appl. Phys. 30:L407–L410. Hoche, H. R., Brummer, O., and Nieber, J. 1986. Extremely skew x-ray diffraction. Acta Crystallogr. Sec. A 42:585–586. Hoche, H.R., Nieber, J., Clausnitzer, M., and Materlik, G. 1988. Modification of specularly reflected x-ray intensity by grazing incidence coplanar Bragg-case diffraction. Phys. Status Solidi A 105:53–60. Hoier, R. and Marthinsen, K. 1983. Effective structure factors in many-beam x-ray diffraction—use of the second Bethe approximation. Acta Crystallogr. Sec. A 39:854–860. Holy, V. 1996. Dynamical theory of highly asymmetric x-ray diffraction. In X-ray and Neutron Dynamical Diffraction: Theory and Applications (A. Authier, S. Lagomarsino, and B.K. Tanner, eds.). Plenum, New York. Holy, V. and Baumbach, T. 1994. Nonspecular x-ray reflection from rough multilayers. Phys. Rev. B 49:10668–10676.
Giles, C., Malgrange, C., Goulon, J., de Bergevin, F., Vettier, C., Dartyge, E., Fontaine, A., Giorgetti, C., and Pizzini, S. 1994. Energy-dispersive phase plate for magnetic circular dichroism experiments in the x-ray range. J. Appl. Crystallogr. 27:232–240.
Hoogenhof, W. W. V. D. and de Boer, D. K. G. 1994. GIXA (glancing incidence X-ray analysis), a novel technique in near-surface analysis. Mater. Sci. Forum (Switzerland) 143–147:1331–1335. Hümmer, K. and Billy, H. 1986. Experimental determination of triplet phases and enantiomorphs of non-centrosymmetric structures. I. Theoretical considerations. Acta Crystallogr. Sec. A 42:127–133. Hümmer, K., Schwegle, W., and Weckert, E. 1991. A feasibility study of experimental triplet-phase determination in small proteins. Acta Crystallogr. Sec. A 47:60–62.
Golovchenko, J. A., Batterman, B. W., and Brown, W. L. 1974. Observation of internal x-ray wave field during Bragg diffraction with an application to impurity lattice location. Phys. Rev. B 10:4239–4243.
Hümmer, K., Weckert, E., and Bondza, H. 1990. Direct measurements of triplet phases and enantiomorphs of non-centrosymmetric structures. Experimental results. Acta Crystallogr. Sec. A 45:182–187.
Golovchenko, J. A., Kincaid, B. M., Levesque, R. A., Meixner, A. E., and Kaplan, D. R. 1986. Polarization Pendellosung and the generation of circularly polarized x-rays with a quarter-wave plate. Phys. Rev. Lett. 57:202–205.
Hung, H. H. and Chang, S. L. 1989. Theoretical considerations on two-beam and multi-beam grazing-incidence x-ray diffraction: Nonabsorbing cases. Acta Crystallogr. Sec. A 45:823–833.
Franklin, G. E., Bedzyk, M. J., Woicik, J. C., Chien L., Patel, J. R., and Golovchenko, J.A. 1995. Order-to-disorder phase-transition study of Pb on Ge(111). Phys. Rev. B 51:2440–2445. Funke, P. and Materlik, G. 1985. X-ray standing wave fluorescence measurements in ultra-high vacuum adsorption of Br on Si(111)-(1X1). Solid State Commun. 54:921.
Golovchenko, J. A., Patel, J. R., Kaplan, D. R., Cowan, P. L., and Bedzyk, M. J. 1982. Solution to the surface registration problem using x-ray standing waves. Phys. Rev. Lett. 49:560. Greiser, N. and Materlik, G. 1986. Three-beam x-ray standing wave analysis: A two-dimensional determination of atomic positions. Z. Phys. B 66:83–89.
Jach, T. and Bedzyk, M.J. 1993. X-ray standing waves at grazing angles. Acta Crystallogr. Sec. A 49:346–350. Jach, T., Cowan, P. L., Shen, Q., and Bedzyk, M. J. 1989. Dynamical diffraction of x-rays at grazing angle. Phys. Rev. B 39:5739– 5747. Jach, T., Zhang, Y., Colella, R., de Boissieu, M., Boudard, M., Goldman, A. I., Lograsso, T. A., Delaney, D. W., and Kycia, S. 1999. Dynamical diffraction and x-ray standing waves from
2-fold reflections of the quasicrystal AlPdMn. Phys. Rev. Lett. 82:2904–2907. Jackson, J. D. 1975. Classical Electrodynamics, 2nd ed. John Wiley & Sons, New York. James, R. W. 1950. The Optical Principles of the Diffraction of X-rays. G. Bell and Sons, London. Juretschke, J. J. 1982. Invariant-phase information of x-ray structure factors in the two-beam Bragg intensity near a three-beam point. Phys. Rev. Lett. 48:1487–1489. Juretschke, J. J. 1984. Modified two-beam description of x-ray fields and intensities near a three-beam diffraction point. General formulation and first-order solution. Acta Crystallogr. Sec. A 40:379–389. Juretschke, J. J. 1986. Modified two-beam description of x-ray fields and intensities near a three-beam diffraction point. Second-order solution. Acta Crystallogr. Sec. A 42:449–456. Kaganer, V. M., Stepanov, S. A., and Koehler, R. 1996. Effect of roughness correlations in multilayers on Bragg peaks in X-ray diffuse scattering. Physica B 221:34–43. Kato, N. 1952. Dynamical theory of electron diffraction for a finite polyhedral crystal. J. Phys. Soc. Jpn. 7:397–414. Kato, N. 1960. The energy flow of x-rays in an ideally perfect crystal: Comparison between theory and experiments. Acta Crystallogr. 13:349–356. Kato, N. 1974. X-ray diffraction. In X-ray Diffraction (L. V. Azaroff, R. Kaplow, N. Kato, R. J. Weiss, A. J. C. Wilson, and R. A. Young, eds.). pp. 176–438. McGraw-Hill, New York. Kikuchi, S. 1928. Proc. Jpn. Acad. Sci. 4:271. Kimura, S. and Harada, J. 1994. Comparison between experimental and theoretical rocking curves in extremely asymmetric Bragg cases of x-ray diffraction. Acta Crystallogr. Sec. A 50:337. Klapper, H. 1996. X-ray diffraction topography: Application to crystal growth and plastic deformation. In X-ray and Neutron Dynamical Diffraction: Theory and Applications (A. Authier, S. Lagomarsino, and B. K. Tanner, eds.). Plenum, New York. Kortright, J. B. and Fischer-Colbrie, A. 1987.
Standing wave enhanced scattering in multilayer structures. J. Appl. Phys. 61:1130–1133. Kossel, W., Loeck, V., and Voges, H. 1935. Z. Phys. 94:139. Kovalchuk, M. V. and Kohn, V. G. 1986. X-ray standing wave—a new method of studying the structure of crystals. Sov. Phys. Usp. 29:426–446. Krimmel, S., Donner, W., Nickel, B., Dosch, H., Sutter, C., and Grubel, G. 1997. Surface segregation-induced critical phenomena at FeCo(001) surfaces. Phys. Rev. Lett. 78:3880–3883. Kycia, S. W., Goldman, A. I., Lograsso, T. A., Delaney, D. W., Black, D., Sutton, M., Dufresne, E., Bruning, R., and Rodricks, B. 1993. Dynamical x-ray diffraction from an icosahedral quasicrystal. Phys. Rev. B 48:3544–3547. Lagomarsino, S. 1996. X-ray standing wave studies of bulk crystals, thin films and interfaces. In X-ray and Neutron Dynamical Diffraction: Theory and Applications (A. Authier, S. Lagomarsino, and B. K. Tanner, eds.). Plenum, New York. Lagomarsino, S., Scarinci, F., and Tucciarone, A. 1984. X-ray stading waves in garnet crystals. Phys. Rev. B 29:4859–4863. Lang, J. C., Srajer, G., Detlefs, C., Goldman, A. I., Konig, H., Wang, X., Harmon, B. N., and McCallum, R. W. 1995. Confirmation of quadrupolar transitions in circular magnetic X-ray dichroism at the dysprosium LIII edge. Phys. Rev. Lett. 74:4935–4938. Lee, H., Colella, R., and Chapman, L. D. 1993. Phase determination of x-ray reflections in a quasicrystal. Acta Crystallogr. Sec. A 49:600–605.
Lied, A., Dosch, H., and Bilgram, J. H. 1994. Glancing angle x-ray scattering from single crystal ice surfaces. Physica B 198:92–96.
Lipscomb, W. N. 1949. Relative phases of diffraction maxima by multiple reflection. Acta Crystallogr. 2:193–194.
Lyman, P. F. and Bedzyk, M. J. 1997. Local structure of Sn/Si(001) surface phases. Surf. Sci. 371:307–315.
Malgrange, C. 1996. X-ray polarization and applications. In X-ray and Neutron Dynamical Diffraction: Theory and Applications (A. Authier, S. Lagomarsino, and B. K. Tanner, eds.). Plenum, New York.
Marra, W. L., Eisenberger, P., and Cho, A. Y. 1979. X-ray total-external-reflection Bragg diffraction: A structural study of the GaAs-Al interface. J. Appl. Phys. 50:6927–6933.
Martinez, R. E., Fontes, E., Golovchenko, J. A., and Patel, J. R. 1992. Giant vibrations of impurity atoms on a crystal surface. Phys. Rev. Lett. 69:1061–1064.
Mills, D. M. 1988. Phase-plate performance for the production of circularly polarized x-rays. Nucl. Instrum. Methods A 266:531–537.
Moodie, A. F., Cowley, J. M., and Goodman, P. 1997. Dynamical theory of electron diffraction. In International Tables for Crystallography, Vol. B (U. Shmueli, ed.). p. 481. Kluwer Academic, Dordrecht, The Netherlands.
Moon, R. M. and Shull, C. G. 1964. The effects of simultaneous reflection on single-crystal neutron diffraction intensities. Acta Crystallogr. Sec. A 17:805–812.
Paniago, R., Homma, H., Chow, P. C., Reichert, H., Moss, S. C., Barnea, Z., Parkin, S. S. P., and Cookson, D. 1996. Interfacial roughness of partially correlated metallic multilayers studied by nonspecular x-ray reflectivity. Physica B 221:10–12.
Parratt, L. G. 1954. Surface studies of solids by total reflection of x-rays. Phys. Rev. 95:359–369.
Patel, J. R. 1996. X-ray standing waves. In X-ray and Neutron Dynamical Diffraction: Theory and Applications (A. Authier, S. Lagomarsino, and B. K. Tanner, eds.). Plenum, New York.
Patel, J. R., Golovchenko, J. A., Freeland, P. E., and Gossmann, H.-J. 1987. Arsenic atom location on passivated silicon (111) surfaces. Phys. Rev. B 36:7715–7717.
Pinsker, Z. G. 1978. Dynamical Scattering of X-rays in Crystals. Springer Series in Solid-State Sciences, Springer-Verlag, Heidelberg.
Porod, G. 1952. Die Röntgenkleinwinkelstreuung von dichtgepackten kolloiden Systemen. Kolloid. Z. 125:51–57; 108–122.
Porod, G. 1982. In Small Angle X-ray Scattering (O. Glatter and O. Kratky, eds.). Academic Press, San Diego.
Post, B. 1977. Solution of the x-ray phase problem. Phys. Rev. Lett. 39:760–763.
Prins, J. A. 1930. Die Reflexion von Röntgenstrahlen an absorbierenden idealen Kristallen. Z. Phys. 63:477–493.
Renninger, M. 1937. Umweganregung, eine bisher unbeachtete Wechselwirkungserscheinung bei Raumgitter-interferenzen. Z. Phys. 106:141–176.
Rhan, H., Pietsch, U., Rugel, S., Metzger, H., and Peisl, J. 1993. Investigations of semiconductor superlattices by depth-sensitive x-ray methods. J. Appl. Phys. 74:146–152.
Robinson, I. K. 1986. Crystal truncation rods and surface roughness. Phys. Rev. B 33:3830–3836.
Rose, D., Pietsch, U., and Zeimer, U. 1997. Characterization of InxGa1−xAs single quantum wells, buried in GaAs[001], by grazing incidence diffraction. J. Appl. Phys. 81:2601–2606.
Salditt, T., Metzger, T. H., and Peisl, J. 1994. Kinetic roughness of amorphous multilayers studied by diffuse x-ray scattering. Phys. Rev. Lett. 73:2228–2231.
Schiff, L. I. 1955. Quantum Mechanics, 2nd ed. McGraw-Hill, New York.
Schlenker, M. and Guigay, J.-P. 1996. Dynamical theory of neutron scattering. In X-ray and Neutron Dynamical Diffraction: Theory and Applications (A. Authier, S. Lagomarsino, and B. K. Tanner, eds.). Plenum, New York.
Schmidt, M. C. and Colella, R. 1985. Phase determination of forbidden x-ray reflections in V3Si by virtual Bragg scattering. Phys. Rev. Lett. 55:715–718.
Shastri, S. D., Finkelstein, K. D., Shen, Q., Batterman, B. W., and Walko, D. A. 1995. Undulator test of a Bragg-reflection elliptical polarizer at 7.1 keV. Rev. Sci. Instrum. 66:1581.
Shen, Q. 1986. A new approach to multi-beam x-ray diffraction using perturbation theory of scattering. Acta Crystallogr. Sec. A 42:525–533.
Shen, Q. 1991. Polarization state mixing in multiple beam diffraction and its application to solving the phase problem. SPIE Proc. 1550:27–33.
Shen, Q. 1993. Effects of a general x-ray polarization in multiple-beam Bragg diffraction. Acta Crystallogr. Sec. A 49:605–613.
Shen, Q. 1996a. Polarization optics for high-brightness synchrotron x-rays. SPIE Proc. 2856:82.
Shen, Q. 1996b. Study of periodic surface nanostructures using coherent grating x-ray diffraction (CGXD). Mater. Res. Soc. Symp. Proc. 405:371–379.
Shen, Q. 1998. Solving the phase problem using reference-beam x-ray diffraction. Phys. Rev. Lett. 80:3268–3271.
Shen, Q. 1999a. Direct measurements of Bragg-reflection phases in x-ray crystallography. Phys. Rev. B 59:11109–11112.
Shen, Q. 1999b. Expanded distorted-wave theory for phase-sensitive x-ray diffraction in single crystals. Phys. Rev. Lett. 83:4764–4787.
Shen, Q. 1999c. A distorted-wave approach to reference-beam x-ray diffraction in transmission cases. Phys. Rev. B 61:8593–8597.
Shen, Q., Blakely, J. M., Bedzyk, M. J., and Finkelstein, K. D. 1989. Surface roughness and correlation length determined from x-ray-diffraction line-shape analysis on Ge(111). Phys. Rev. B 40:3480–3482.
Shen, Q. and Colella, R. 1987. Solution of phase problem for crystallography at a wavelength of 3.5 Å. Nature (London) 329:232–233.
Shen, Q. and Colella, R. 1988. Phase observation in organic crystal benzil using 3.5 Å x-rays. Acta Crystallogr. Sec. A 44:17–21.
Shen, Q. and Finkelstein, K. D. 1990. Solving the phase problem with multiple-beam diffraction and elliptically polarized x rays. Phys. Rev. Lett. 65:3337–3340.
Shen, Q. and Finkelstein, K. D. 1992. Complete determination of x-ray polarization using multiple-beam Bragg diffraction. Phys. Rev. B 45:5075–5078.
Shen, Q. and Finkelstein, K. D. 1993. A complete characterization of x-ray polarization state by combination of single and multiple Bragg reflections. Rev. Sci. Instrum. 64:3451–3456.
Shen, Q. and Kycia, S. 1997. Determination of interfacial strain distribution in quantum-wire structures by synchrotron x-ray scattering. Phys. Rev. B 55:15791–15797.
Shen, Q., Kycia, S. W., Schaff, W. J., Tentarelli, E. S., and Eastman, L. F. 1996a. X-ray diffraction study of size-dependent strain in quantum wire structures. Phys. Rev. B 54:16381–16384.
Shen, Q., Shastri, S., and Finkelstein, K. D. 1995. Stokes polarimetry for x-rays using multiple-beam diffraction. Rev. Sci. Instrum. 66:1610–1613.
Shen, Q., Umbach, C. C., Weselak, B., and Blakely, J. M. 1993. X-ray diffraction from a coherently illuminated Si(001) grating surface. Phys. Rev. B 48:17967–17971.
Shen, Q., Umbach, C. C., Weselak, B., and Blakely, J. M. 1996b. Lateral correlation in mesoscopic on silicon (001) surface determined by grating x-ray diffuse scattering. Phys. Rev. B 53:R4237–4240.
Sinha, S. K., Sirota, E. B., Garoff, S., and Stanley, H. B. 1988. X-ray and neutron scattering from rough surfaces. Phys. Rev. B 38:2297–2311.
Stepanov, S. A., Kondrashkina, E. A., Schmidbauer, M., Kohler, R., Pfeiffer, J.-U., Jach, T., and Souvorov, A. Y. 1996. Diffuse scattering from interface roughness in grazing-incidence x-ray diffraction. Phys. Rev. B 54:8150–8162.
Takagi, S. 1962. Dynamical theory of diffraction applicable to crystals with any kind of small distortion. Acta Crystallogr. 15:1311–1312.
Takagi, S. 1969. A dynamical theory of diffraction for a distorted crystal. J. Phys. Soc. Jpn. 26:1239–1253.
Tanner, B. K. 1996. Contrast of defects in x-ray diffraction topographs. In X-ray and Neutron Dynamical Diffraction: Theory and Applications (A. Authier, S. Lagomarsino, and B. K. Tanner, eds.). Plenum, New York.
Taupin, D. 1964. Théorie dynamique de la diffraction des rayons x par les cristaux déformés. Bull. Soc. Fr. Miner. Crist. 87:69.
Thorkildsen, G. 1987. Three-beam diffraction in a finite perfect crystal. Acta Crystallogr. Sec. A 43:361–369.
Tischler, J. Z. and Batterman, B. W. 1986. Determination of phase using multiple-beam effects. Acta Crystallogr. Sec. A 42:510–514.
Tolan, M., Konig, G., Brugemann, L., Press, W., Brinkop, F., and Kotthaus, J. P. 1992. X-ray diffraction from laterally structured surfaces: Total external reflection and grating truncation rods. Europhys. Lett. 20:223–228.
Tolan, M., Press, W., Brinkop, F., and Kotthaus, J. P. 1995. X-ray diffraction from laterally structured surfaces: Total external reflection. Phys. Rev. B 51:2239–2251.
Vineyard, G. H. 1982. Grazing-incidence diffraction and the distorted-wave approximation for the study of surfaces. Phys. Rev. B 26:4146–4159.
von Laue, M. 1931. Die dynamische Theorie der Röntgenstrahlinterferenzen in neuer Form. Ergeb. Exakt. Naturwiss. 10:133–158.
Wang, J., Bedzyk, M. J., and Caffrey, M. 1992. Resonance-enhanced x-rays in thin films: A structure probe for membranes and surface layers. Science 258:775–778.
Warren, B. E. 1969. X-Ray Diffraction. Addison-Wesley, Reading, Mass.
Weckert, E. and Hümmer, K. 1997. Multiple-beam x-ray diffraction for physical determination of reflection phases and its applications. Acta Crystallogr. Sec. A 53:108–143.
Weckert, E., Schwegle, W., and Hümmer, K. 1993. Direct phasing of macromolecular structures by three-beam diffraction. Proc. R. Soc. London A 442:33–46.
Yahnke, C. J., Srajer, G., Haeffner, D. R., Mills, D. M., and Assoufid, L. 1994. Germanium x-ray phase plates for the production of circularly polarized x-rays. Nucl. Instrum. Methods A 347:128–133.
Yoneda, Y. 1963. Anomalous surface reflection of x-rays. Phys. Rev. 131:2010–2013.
Zachariasen, W. H. 1945. Theory of X-ray Diffraction in Crystals. John Wiley & Sons, New York.
Zachariasen, W. H. 1965. Multiple diffraction in imperfect crystals. Acta Crystallogr. Sec. A 18:705–710.
Zegenhagen, J., Hybertsen, M. S., Freeland, P. E., and Patel, J. R. 1988. Monolayer growth and structure of Ga on Si(111). Phys. Rev. B 38:7885–7892.
Zhang, Y., Colella, R., Shen, Q., and Kycia, S. W. 1999. Dynamical three-beam diffraction in a quasicrystal. Acta Crystallogr. Sec. A 54:411–415.
KEY REFERENCES
Authier et al., 1996. See above. Contains an excellent selection of review papers on modern dynamical theory topics.
Batterman and Cole, 1964. See above. One of the most cited articles on x-ray dynamical theory.
Colella, 1974. See above. Provides a fundamental formulation for NBEAM dynamical theory.
Zachariasen, 1945. See above. A classic textbook on x-ray diffraction theories, both kinematic and dynamical.
APPENDIX: GLOSSARY OF TERMS AND SYMBOLS

A: Effective crystal thickness parameter
b: Ratio of direction cosines of incident and diffracted waves
D: Electric displacement vector
D0: Incident electric displacement vector
D′0: Specular reflected wave field
DH: Fourier component H of electric displacement vector
Di0, DiH: Internal wave fields
De0, DeH: External wave fields
E: Electric field vector
F′0: Real part of F0
F″0: Imaginary part of F0
FH: Structure factor of reflection H
G: Reciprocal lattice vector for reference reflection
H: Reciprocal lattice vector
−H: Negative of H
H−G: Difference between two reciprocal lattice vectors
IH: Intensity of reflection H
K0n: Component of internal incident wave vector normal to surface
K0: Incident wave vector inside crystal
k0: Incident wave vector outside crystal
KH: Wave vector inside crystal
kH: Wave vector outside crystal
L: Reciprocal lattice vector for a detour reflection
M, A: Matrices used with polarization density matrix
n: Index of refraction
N: Number of unit cells participating in diffraction
n (unit vector): Unit vector along surface normal
O: Reciprocal lattice origin
P: Polarization factor
P1, P2, P3: Stokes-Poincaré polarization parameters
PH/P0: Total diffracted power normalized to the incident power
r: Real-space position vector
R: Reflectivity
r0, rG: Distorted-wave amplitudes in expanded distorted-wave theory
r′0, t0: Fresnel reflection and transmission coefficients at an interface
re: Classical radius of an electron, 2.818 × 10⁻⁵ Å
S, ST: Poynting vector
T, A, B, C: Matrices used in NBEAM theory
u: Unit vector along wave propagation direction
w: Intrinsic diffraction width (Darwin width), proportional to reλ²/(πVc)
αH: Phase of FH, the structure factor of reflection H
α, β: Branches of dispersion surface
δε(r): Susceptibility function of a crystal
ε(r): Dielectric function of a crystal
ε0: Dielectric constant in vacuum
γ0: Direction cosine of incident wave vector
γH: Direction cosine of diffracted wave vector
η, ηG: Angular deviation from θB normalized to the Darwin width
λ: Wavelength
μ0: Linear absorption coefficient
ν: Intrinsic dynamical phase shift
π: Polarization unit vector within the scattering plane
θ: Incident angle
θB: Bragg angle
θc: Critical angle
ρ: Polarization density matrix
ρ0: Average charge density
ρ(r): Charge density
σ: Polarization unit vector perpendicular to the scattering plane
τ: Penetration depth of an evanescent wave
ξ0: Correction to dispersion surface at O due to two-beam diffraction
ξH: Correction to dispersion surface at H due to two-beam diffraction
ψ: Azimuthal angle around the scattering vector
QUN SHEN Cornell University Ithaca, New York
COMPUTATION OF DIFFUSE INTENSITIES IN ALLOYS

INTRODUCTION

Diffuse intensities in alloys are measured by a variety of techniques, such as x-ray, electron, and neutron scattering. Above a structural phase-transformation boundary, typically in the solid-solution phase where most materials processing takes place, the diffuse intensities yield valuable information regarding an alloy's tendency to order. This has been a mainstay characterization technique for binary alloys for over half a century. Although multicomponent metallic alloys are the most technologically important, they also pose a great experimental and theoretical challenge. For this reason, the vast majority of experimental and theoretical effort has been devoted to binary systems, and most investigated "ternary" systems are either limited to a small percentage of ternary solute (say, to investigate electron-per-atom effects) or are pseudo-binary systems. Thus, for multicomponent alloys the questions are: how can one interpret diffuse-scattering experiments on such systems, and how does one theoretically predict the ordering behavior? This unit discusses an electronic-structure-based theoretical method for calculating the structural ordering in multicomponent alloys and understanding the electronic origin of this chemical-ordering behavior. The theory combines the ideas of concentration waves with a modern electronic-structure method. Thus, we give examples (see Data Analysis and Initial Interpretation) that show how we determined the electronic origin behind the unusual ordering behavior in a few binary and ternary alloy systems that were not understood prior to our work. From the start, the theoretical approach is compared and contrasted with other complementary techniques for completeness. In addition, some details are given about the theory and its underpinnings. Please do not let this deter you from jumping ahead and reading Data Analysis and Initial Interpretation and Principles of the Method.
For those not familiar with electronic properties and how they manifest themselves in the ordering properties, the discussion following Equation 27 may prove useful for understanding Data Analysis and Initial Interpretation. Importantly, for the more general multicomponent case, we describe in the context of concentration waves how to extract more information from diffuse-scattering experimental data (see Concentration Waves in Multicomponent Alloys). Although developed to understand the calculated diffuse-scattering intensities, this analysis technique allows one to determine completely the type of ordering contained in the numerous chemical pair correlations that must be measured. In fact, what is required (in addition to the ordering wavevector) is an ordering "polarization" of the concentration wave that is contained in the diffuse intensities. The example case of face-centered cubic (fcc) Cu2NiZn is given. For definitions of the symbols used throughout the unit, see Table 1. For binary or multicomponent alloys, the atomic short-range order (ASRO) in the disordered solid-solution phase
is related to the thermally induced concentration fluctuations in the alloy. Such fluctuations in the chemical site occupations are the (infinitesimal) deviations from a homogeneously random state, and they are directly related to the chemical pair correlations in the alloy (Krivoglaz, 1969). Thus, the ASRO provides valuable information on the atomic structure to which the disordered alloy is tending—i.e., it reveals the chemical ordering tendencies in the high-temperature phase (as shown by Krivoglaz, 1969; Clapp and Moss, 1966; de Fontaine, 1979; Khachaturyan, 1983; Ducastelle, 1991). Importantly, the ASRO can be determined experimentally from the diffuse scattering intensities measured in reciprocal space either by x rays (X-RAY AND NEUTRON DIFFUSE SCATTERING MEASUREMENTS), neutrons (NEUTRON TECHNIQUES), or electrons (LOW-ENERGY ELECTRON DIFFRACTION; Sato and Toth, 1962; Moss, 1969; Reinhard et al., 1990). However, the underlying microscopic or electronic origin of the ASRO cannot be determined from such experiments, only its observed, indirect effect on the order. Therefore, the calculation of diffuse intensities in high-temperature, disordered alloys based on electronic density-functional theory (DFT; SUMMARY OF ELECTRONIC STRUCTURE METHODS), and the subsequent connection of those intensities to their microscopic origin(s), provides a fundamental understanding of the experimental data and phase instabilities. These are the principal themes that we will emphasize in this unit. The chemical pair correlations determined from the diffuse intensities are usually written as normalized probabilities, which are then the familiar Warren-Cowley parameters (defined later). In reciprocal space, where scattering data are collected, the Warren-Cowley parameters are denoted by αμν(k), where μ and ν label the species (1 to N in an N-component alloy) and k is the scattering wavevector.
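To make these definitions concrete, the sketch below computes Warren-Cowley parameters for a toy binary chain directly from site occupations. The model chain, the normalization used, and all parameters are illustrative assumptions, not part of the DFT machinery this unit describes:

```python
import numpy as np

# Toy illustration: Warren-Cowley short-range-order parameters for a
# binary A(1-c)B(c) chain. With occupation variables x_i (1 if site i
# holds a B atom, 0 otherwise), one common form is
#   alpha_j = (<x_i x_{i+j}> - c^2) / (c (1 - c)),
# so alpha_0 = 1 by definition, alpha_j < 0 signals ordering (unlike
# neighbors preferred), and alpha_j > 0 signals clustering.

rng = np.random.default_rng(0)
n_sites, c = 4096, 0.5

x_ordered = np.arange(n_sites) % 2                 # perfect ...ABAB... chain
x_random = (rng.random(n_sites) < c).astype(float)  # homogeneously random

def warren_cowley(x, c, jmax=4):
    """Real-space alpha_j for shifts j = 0..jmax on a periodic chain."""
    x = np.asarray(x, dtype=float)
    return np.array([(np.mean(x * np.roll(x, j)) - c**2) / (c * (1 - c))
                     for j in range(jmax + 1)])

alpha_ord = warren_cowley(x_ordered, c)
alpha_rnd = warren_cowley(x_random, c)
print(alpha_ord)   # alternates 1, -1, 1, -1, 1 for the perfectly ordered chain
print(alpha_rnd)   # near 1 at j = 0, near 0 elsewhere

# Fourier transforming alpha_j gives alpha(k), which for the ABAB chain
# peaks at the ordering wavevector k = pi/a: a diffuse-scattering maximum.
```

The ordered chain yields alternating ±1 values (strongest possible correlations), while the random chain's parameters vanish, to within statistical noise, for all nonzero shifts.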
In the solid-solution phase, the sharp Bragg diffraction peaks (in contrast to the diffuse peaks) identify the underlying Bravais lattice symmetry, such as fcc or body-centered cubic (bcc), and determine the possible set of available wavevectors. (We will assume hereafter that there is no change in the Bravais lattice.) The maximal diffuse peaks in αμν(k) at wavevector k0 indicate that the disordered phase has low-energy ordering fluctuations with that periodicity; k0 is not a Bragg-reflection position. These fluctuations are not stable but may be long-lived, and they indicate the nascent ordering tendencies of the disordered alloy. At the so-called spinodal temperature, Tsp, elements of αμν(k = k0) diverge, indicating the absolute instability of the alloy to the formation of a long-range ordered state with wavevector k0. Hence, it is clear that the fluctuations are related to the disordered alloy's stability matrix. Of course, more than one (symmetry-unrelated) wavevector may be prominent, giving a more complex ordering tendency. Because the concentrations of the alloy's constituents are then modulated with a wave-like periodicity, such orderings are often referred to as "concentration waves" (Khachaturyan, 1972, 1983; de Fontaine, 1979). Thus, any ordered state can be thought of as a modulation of the disordered state by a thermodynamically stable concentration wave. Keep in mind that any arrangement of atoms on a Bravais lattice (sites labeled by i) may be
Table 1. Table of Symbols

AuFe: Standard alloy nomenclature such that the underlined element is the majority species; here, Au-rich
L10, L12, L11, etc.: Throughout we use the Strukturbericht notation (http://dave.nrl.navy.mil), in which the A lattices are monatomic (e.g., A1 = fcc, A2 = bcc), B2 is, e.g., CsCl with (111)-wavevector ordering, L10 is, e.g., CuAu with ⟨100⟩-wavevector ordering, and so on
⟨...⟩: Thermal/configurational average
Bold symbols: Vectors
k and q: Wavevectors in reciprocal space
k0: Specific set of symmetry-related ordering wavevectors
Star of k: The set of symmetry-equivalent k values; e.g., in fcc, the star of k = (100) is {(100), (010), (001)}
N: Number of elements in a multicomponent alloy, giving N − 1 independent degrees of freedom because composition is conserved
(h, k, l): General k-space (reciprocal lattice) point in the first Brillouin zone
i, j, k, etc.: Enumeration of real-space lattice sites
μ, ν, etc.: Greek subscripts refer to elements in the alloy, i.e., species labels
Ri: Real-space lattice position of the ith site
ξμ,i: Site occupation variable (1 if a μ-type atom occupies the ith site, 0 otherwise)
σ: Branch index for possible multicomponent ordering polarizations, i.e., sublattice occupations relative to a "host" element (see text)
cμ,i: Concentration of μ-type atoms at the ith site, which is the thermal average of ξμ,i. As this lies between 0 and 1, it can also be thought of as a site-occupancy probability
δμ,ν: Kronecker delta (1 if the subscripts are the same, 0 otherwise). Einstein summation is not used in this text
qμν,ij: Real-space atomic pair-correlation function (not normalized). Generally, it has two species labels and two site indices
αμν,ij: Normalized real-space atomic pair-correlation function, traditionally referred to as the Warren-Cowley short-range-order parameter. Generally, it has two species labels and two site indices; αii = 1 by definition (see text)
qμν(k): Fourier transform of the atomic pair-correlation function
αμν(k): Experimentally measured Fourier transform of the normalized pair-correlation function, traditionally referred to as the Warren-Cowley short-range-order parameter. Generally, it has two species labels. For a binary alloy, no labels are required, which is more familiar to most people
eμσ(k): For N-component alloys, element of the eigenvector (or eigenmode) of the concentration wave, composed of N − 1 branches σ and N − 1 independent species μ. This is 1 for a binary alloy, but between 0 and 1 for an N-component alloy. As we report, it can be measured experimentally to determine the sublattice ordering in a multicomponent alloy, as done recently by ALCHEMI measurements
ησ(T): Temperature-dependent long-range-order parameter for branch index σ, which is between 0 (disordered phase) and 1 (fully ordered phase)
T: Temperature (units are given in text)
F: Free energy
Ω: Grand potential of the alloy. With subscript "e," it is the electronic grand potential of the alloy, in which the electronic degrees of freedom have not been integrated out
N(E): The electronic integrated density of states at energy E
n(E): The electronic density of states at energy E
tα,i: Single-site scattering matrix, which determines how an electron will scatter off a single atom
τα,ii: Electronic scattering-path operator, which completely details how an electron scatters through an array of atoms
Fourier-wave decomposed, i.e., considered as a "concentration wave." For a binary alloy A1−cBc, the concentration wave (or site occupancy) is simply ci = c + Σk [Q(k) exp(ik·Ri) + c.c.], where the wavevectors are limited to the Brillouin zone associated with the underlying Bravais lattice of the disordered alloy, where the amplitudes Q(k) dictate the strength of the ordering, and where c.c. stands for complex conjugate. For example, a peak in αμν(k0 = {001}) within a 50-50 binary fcc solid solution indicates an instability toward alternating layers along the z direction in real space, as in the Cu-Au structure [designated L10 in Strukturbericht notation (see Table 1) and having alternating Cu/Au layers along (001)]. Of course, at high temperatures, all wavevectors related by the symmetry operations of the disordered lattice (referred to as a star) are degenerate, such as the ⟨100⟩ star comprised of (100), (010), and (001). In contrast,
a k0 = (000) peak indicates clustering, because the associated wavelength of the concentration modulation is very long. Interpretation of the results of our first-principles calculations is greatly facilitated by the concentration-wave concept, especially for multicomponent alloys, and we will explain results in that context. In the high-temperature disordered phase, where most materials processing takes place, this local atomic ordering governs many materials properties. In addition, these incipient ordering tendencies are often indicative of the long-range order (LRO) found at lower temperatures, even if the transition is first order; that is, the ASRO is a precursor of the LRO phase. For these two additional reasons, it is important to predict and to understand fundamentally this ubiquitous alloying behavior. To be precise for experts and nonexperts alike, strictly speaking,
the fluctuations in the disordered state reveal the low-temperature, long-range ordering behavior for a second-order transition, with critical temperature Tc = Tsp. On the other hand, for first-order transitions (with Tc > Tsp), symmetry arguments indicate that this can be, but does not have to be, the case (Landau, 1937a,b; Lifshitz, 1941, 1942; Landau and Lifshitz, 1980; Khachaturyan, 1972, 1983). It is then possible that the system undergoes a first-order transition to an ordering that preempts those indicated by the ASRO and leads to LRO of a periodicity unrelated to k0. Keep in mind that, while not every alloy has an experimentally realizable solid-solution phase, the ASRO of the hypothetical solid-solution phase is still interesting because it is indicative of the ordering interactions in the alloy, and it is typically indicative of the long-range ordered phases. Most metals of technological importance are alloys of more than two constituents. For example, the easy-forming metallic glasses are composed of four and five elements (Inoue et al., 1990; Peker and Johnson, 1993), and traditional steels have more than five active elements (Lankford et al., 1985). The enormous number of possible combinations of elements makes the search for improved or novel metallic properties a daunting proposition for both theory and experiment. Except for understanding the "electron-per-atom" (e/a) effects due to small ternary additions, measurement of ASRO and interpretation of diffuse-scattering experiments in multicomponent alloys is, in fact, a largely uncharted area. In a binary alloy, the theory of concentration waves permits one to determine the structure indicated by the ASRO given only the ordering wavevector (Khachaturyan, 1972, 1983; de Fontaine, 1975, 1979).
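The binary construction can be carried out explicitly. The sketch below (toy amplitudes and lattice constant, not output of the unit's theory) modulates a 50-50 fcc solid solution with a single ⟨100⟩ concentration wave and recovers the L10 arrangement of alternating (001) layers:

```python
import numpy as np

# Concentration-wave sketch for a binary fcc alloy at c = 1/2:
#   c_i = c + [Q(k0) exp(i k0 . R_i) + c.c.],  k0 = (2*pi/a)(0, 0, 1).
# With the (assumed) amplitude Q = 1/4, c_i = 1 on one set of (001)
# layers and 0 on the other, i.e., the CuAu L1_0 structure.

a = 1.0                                   # lattice constant (arbitrary units)
k0 = (2 * np.pi / a) * np.array([0.0, 0.0, 1.0])
Q = 0.25

# fcc basis of the conventional cube, repeated over a 2x2x2 block of cells
basis = np.array([[0, 0, 0], [0.5, 0.5, 0], [0.5, 0, 0.5], [0, 0.5, 0.5]]) * a
cells = [(i, j, k) for i in range(2) for j in range(2) for k in range(2)]
sites = np.array([b + a * np.array(cell) for cell in cells for b in basis])

# Q e^{ik.R} + c.c. = 2 Q cos(k . R); round off tiny floating-point residue
occ = np.round(0.5 + 2 * Q * np.cos(sites @ k0), 12)

for z in np.unique(np.round(sites[:, 2], 6)):
    layer = occ[np.isclose(sites[:, 2], z)]
    print(f"z = {z:4.2f}: c_i = {set(layer)}")
# Layers at z = 0, 0.5, 1.0, 1.5 alternate between c_i = 1 and c_i = 0,
# i.e., pure planes of each species stacked along (001).
```

Summing over the full ⟨100⟩ star instead of a single member would produce the symmetry-degenerate variants with layering along x or y.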
In multicomponent alloys, however, the concentration waves have additional degrees of freedom corresponding to polarizations in "composition space," similar to "branches" in the case of phonons in alloys (Badalayan et al., 1969; de Fontaine, 1973; Althoff et al., 1996); thus, more information is required. These polarizations are determined by the electronic interactions, and they determine the sublattice occupations in partially ordered states (Althoff et al., 1996). From the point of view of alloy design, and at the root of alloy theory, identifying and understanding the electronic origins of the ordering tendencies at high temperatures, and the reason why an alloy adopts a specific low-temperature state, gives valuable guidance in the search for new and improved alloys via "tuning" of an alloy's properties at the most fundamental level. In metallic alloys, for example, the electrons cannot be allocated to specific atomic sites, nor can their effects be interpreted in terms of pairwise interactions. To address ASRO in specific alloys, it is generally necessary to solve the many-electron problem as realistically and as accurately as possible, and then to connect this solution to the appropriate compositional, magnetic, or displacive correlation functions measured experimentally. To date, most studies from first-principles approaches have focused on binary alloy phase diagrams, because even for these systems the thermodynamic problem is extremely nontrivial, and there is a wealth of experimental data for comparison. This unit will concentrate on the
techniques employed for calculating the ASRO in binary and multicomponent alloys using DFT methods. We will not include, for example, simple parametric phase-stability methods, such as CALPHAD (Butler et al., 1997; Saunders, 1996; Oates et al., 1996), because they fail to give any fundamental insight and cannot be used to predict ASRO. In what follows, we give details of the chemical pair correlations, including connecting what is measured experimentally to what is developed mathematically. Because we use an electronic-DFT-based, mean-field approach, some care will be taken throughout the text to indicate innate problems, their solutions, quantitative and qualitative errors, and what can be resolved within mean-field means (results that would agree in great detail with more accurate, if often intractable, approaches). We will also discuss at some length the interesting means of interpreting the type of ASRO in multicomponent alloys from the diffuse intensities, important for both experiment and theory. Little of this has been detailed elsewhere, and, with our applications occurring only recently, this important information is not widely known. Before presenting the electronic basis of the method, it is helpful to develop a rather distinctive approach based on classical density-functional theory that not only recovers the well-known mean-field equations for the chemical potential and pair correlations but equally allows a DFT-based method to be developed for such quantities. Because the electronic-DFT underpinnings of the ASRO calculations rest on a rather mathematical derivation, we try to convey the important physical content of the DFT-based equations through truncated versions of them, which give the essence of the approach.
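The most familiar such truncated, mean-field statement is the Krivoglaz-Clapp-Moss form of the pair correlation (Krivoglaz, 1969; Clapp and Moss, 1966). The sketch below uses it with an assumed nearest-neighbor pair interaction on fcc and one common sign convention; the interaction strength and every numerical value are illustrative, not DFT results:

```python
import numpy as np

# Krivoglaz-Clapp-Moss mean-field form for a binary alloy:
#   alpha(k) = 1 / (1 + c (1 - c) V(k) / (kB T)),
# where V(k) is the lattice Fourier transform of an effective pair
# interaction. The spinodal temperature follows from the divergence of
# alpha(k) at the minimizing wavevector k0:  kB Tsp = -c (1 - c) V(k0).

kB = 8.617333262e-5          # Boltzmann constant, eV/K
c, V1 = 0.5, 0.05            # concentration; assumed NN interaction (eV)

def V_fcc(h, k, l, V1=V1):
    """V(k) for a nearest-neighbor interaction on fcc; (h,k,l) in 2*pi/a."""
    ph, pk, pl = np.pi * h, np.pi * k, np.pi * l
    return 4 * V1 * (np.cos(ph) * np.cos(pk)
                     + np.cos(pk) * np.cos(pl)
                     + np.cos(pl) * np.cos(ph))

def alpha(h, k, l, T):
    return 1.0 / (1.0 + c * (1 - c) * V_fcc(h, k, l) / (kB * T))

# For V1 > 0, V(k) is minimized (V = -4 V1) at the <100> special points,
# so alpha(k) peaks there: the ASRO of an ordering alloy. For this
# NN-only toy model, V(k) is in fact constant along the whole (1, q, 0)
# line, the well-known fcc degeneracy relevant to (1 1/2 0)-type ordering.
T = 1000.0
print(alpha(1, 0, 0, T), alpha(0, 0, 0, T))   # peaked at (100), suppressed at (000)

Tsp = -c * (1 - c) * V_fcc(1, 0, 0) / kB       # spinodal estimate, K
print(Tsp)
```

The full theory of this unit replaces the model V(k) with a quantity derived from the electronic grand potential, but the mechanics of locating k0 and Tsp are the same.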
In the Data Analysis and Initial Interpretation section, we discuss the role of several electronic mechanisms that produce the strong CuAu-type order [L10, with (001) wavevector] in NiPt, the Ni4Mo-type [or (1 ½ 0) wavevector] ordering in AuFe alloys, both commensurate and incommensurate order in fcc Cu-Ni-Zn alloys, and the novel CuPt-type [or L11, with (½ ½ ½) wavevector] order in fcc CuPt. Prior to these results, and very relevant for NiPt, we discuss how charge within a homogeneously random alloy is actually correlated through the local chemical environment, even though there are no chemical correlations. At minimum, a DFT-based theory of ASRO, whose specific advantage is the ability to connect features in the ASRO of multicomponent alloys with features of the electronic structure of the disordered alloy, would be very advantageous for establishing trends, much the way Hume-Rothery established empirical relationships to trends in alloy phase formation. Some care will be taken to note briefly where such calculations are relevant or deficient, as well as where they complement other techniques. It is clear, then, that this is not an exhaustive review of the field, but an introduction to a specific approach.

Competitive and Related Techniques

Traditionally, DFT-based band-structure calculations focus on the possible ground-state structures. While it is clearly valuable (and by no means trivial) to predict the
COMPUTATION OF DIFFUSE INTENSITIES IN ALLOYS
ground-state crystal structure from first principles, it is equally important to expand this to partially ordered and disordered phases at high temperatures. One reason for this is that ASRO measurements and materials processing take place at relatively high temperatures, typically in a disordered phase. Basically, today, this calculation can be done in two distinct (and usually complementary) ways. First, methods based on effective chemical interactions obtained from DFT methods have had successes in determining phase diagrams and ASRO (Asta and Johnson, 1997; Wolverton and Zunger, 1995a; Rubin and Finel, 1995). This is, e.g., the idea behind the cluster-expansion method proposed by Connolly and Williams (1983), also referred to as the structural inversion method (SIM). Briefly, in the cluster-expansion method a fit is made to the formation energies of a few (up to several tens of) ordered lattice configurations using a generalized Ising model (which includes two-body, three-body, up to N-body clusters, whatever is required, in principle) to produce effective chemical interactions (ECIs). These ECIs approximate the formation energetics of all other phases, including the homogeneously random one, and are used as input to some classical statistical mechanics approach, like Monte Carlo or the cluster-variation method (CVM), to produce ASRO or phase-boundary information. While this is an extremely important first-principles method, and is highlighted elsewhere in this chapter (PREDICTION OF PHASE DIAGRAMS), it is difficult from this approach to discern any electronic origin because all the underlying electronic information has been integrated out, obscuring the quantum mechanical origins of the ordering tendencies. Furthermore, careful (reiterative) checks have to be made to validate the convergence of the fit with the number of structures, their stoichiometries, and the range and multiplet structure of the interactions.
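The fit-then-invert logic of the SIM can be sketched in a few lines. The following toy example is our own construction, not taken from the references above: a 1D Ising chain with invented pair ECIs stands in for a real DFT database of formation energies, and the interactions are recovered by least squares.

```python
import numpy as np

# Toy sketch of the structural inversion method (SIM): synthetic "formation
# energies" of a few ordered 1D configurations are fit to pair ECIs.
# The chain model, shell count, and J_true values are all illustrative.

def pair_correlations(config, shells):
    """Per-shell correlation <s_i s_{i+r}> on a periodic chain, s_i = +/-1."""
    s = np.asarray(config, dtype=float)
    return np.array([np.mean(s * np.roll(s, r)) for r in shells])

shells = [1, 2, 3]
J_true = np.array([0.10, -0.04, 0.01])    # hypothetical "exact" pair ECIs

structures = [                            # one period of each input structure
    [1, 1, 1, 1, 1, 1],                   # pure A
    [1, -1, 1, -1, 1, -1],                # alternating AB
    [1, 1, -1, 1, 1, -1],                 # A2B
    [1, 1, 1, -1, -1, -1],                # layered A3B3
]
Pi = np.array([pair_correlations(s, shells) for s in structures])
E = Pi @ J_true                           # stand-in for DFT formation energies

# Structural inversion: least-squares fit of the ECIs to the energies
J_fit, *_ = np.linalg.lstsq(Pi, E, rcond=None)
print("fitted ECIs:", np.round(J_fit, 6))

# The fitted ECIs then predict the energy of a structure outside the fit set
E_pred = pair_correlations([1, 1, -1, 1, -1, -1], shells) @ J_fit
print("predicted energy of a new structure:", E_pred)
```

In a real application the "energies" come from DFT total-energy calculations and the fit must be checked for convergence in the number of structures and interaction range, exactly the caveat raised above.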
The inclusion of magnetic effects or multicomponent additions begins to add such complexity that the cluster expansion becomes more and more difficult (and delicate), and the size of the electronic-structure unit cells begins to grow very large (depending on the DFT method, growing as N to N^3, where N is the number of atoms in the unit cell). The use of the CVM, e.g., quickly becomes uninviting for multicomponent alloys, and it then becomes necessary to rely on Monte Carlo methods for thermodynamics, where interpretation sometimes can be problematic. Nevertheless, this approach can provide ASRO and LRO information, including phase-boundary (global stability) information. If, however, you are interested in calculating the ASRO for just one multicomponent alloy composition, it is more reliable and efficient to perform a fixed-composition SIM using DFT methods to get the ECIs, because fewer structures are required and the subtleties of composition do not have to be reproduced (McCormack et al., 1997). In this mode, the fitted interactions are more stable and multiplets are suppressed; however, global stability information is lost. A second approach (the concentration-wave approach), which we shall present below, involves use of the (possible) high-temperature disordered phase (at fixed composition) as a reference and looks for the types of local concentration fluctuations and ordering instabilities that are energetically allowed as the temperature is lowered. Such an approach can be viewed as a linear-response method for thermodynamic degrees of freedom, in much the same way that a phonon dynamical matrix may be calculated within DFT by expanding the vibrational (infinitesimal) displacements about the ideal Bravais lattice (i.e., the high-symmetry reference state; Gonze, 1997; Quong and Liu, 1997; Pavone et al., 1996; Yu and Krakauer, 1994). Such methods have been used for three decades in classical DFT descriptions of liquids (Evans, 1979), and, in fact, there is a 1:1 mapping from the classical to electronic DFT (Györffy and Stocks, 1983). These methods may therefore be somewhat familiar in mathematical foundation. Generally speaking, a theory that is based on the high-temperature, disordered state is not biased by any a priori choice of chemical structures, which may be a problem with more traditional total-energy or cluster-expansion methods. The major disadvantage of this approach is that no global stability information is obtained, because only the local stability at one concentration is addressed. Therefore, the fact that the ASRO for a specific concentration can be directly addressed is both a strength and a shortcoming, depending upon one's needs. For example, if the composition dependence of the ASRO at five specific compositions is required, only five calculations are necessary, whereas in the first method described above, depending on the complexity, a great many alloy compositions and structural arrangements at those compositions are still required for the fitting (until the essential physics is somehow, maybe not transparently, included). Again, as emphasized in the introduction, a great strength of the first-principles concentration-wave method is that the electronic mechanisms responsible for the ordering instabilities may be obtained. Thus, in a great many senses, the two methods above are very complementary, rather than competing.
Recently, the two methods have been used simultaneously on binary (Asta and Johnson, 1997) and ternary alloys (Wolverton and de Fontaine, 1994; McCormack et al., 1997). Certain results from both methods agree very well, but each method provides additional (complementary) information and viewpoints, which is very helpful from a computer alloy design perspective.

Effective Interactions from High-Temperature Experiments

While not really a first-principles method, it is worth mentioning a third method with a long-standing history in the study of alloys and diffuse-scattering data—using inverse Monte Carlo techniques based upon a generalized Ising model to extract ECIs from experimental diffuse-scattering data (Masanskii et al., 1991; Finel et al., 1994; Barrachin et al., 1994; Pierron-Bohnes et al., 1995; Le Bolloc'h et al., 1997). Typically, such techniques have been used to extract the Warren-Cowley parameters in real space from the k-space data because it is traditional to interpret the experiment in this fashion. Such ECIs have been used to perform Monte Carlo calculations of phase boundaries, and so on. While it may be useful to extract the Warren-Cowley parameters via this route, it is important to understand some fundamental points
COMPUTATION AND THEORETICAL METHODS
that have not been appreciated until recently: the ECIs so obtained (1) are not related to any fundamental alloy Hamiltonian; (2) are parameters that achieve a best fit to the measured ASRO; and (3) should not be trusted, in general, for calculating phase boundaries. The origin and consequences of these three remarks are as follows. It should be fairly obvious that, given enough ECIs (i.e., fitting degrees of freedom), a fit of the ASRO is possible. For example, one may use many pairs of ECIs, or fewer pairs if some multiplet interactions are included, and so on (Finel et al., 1994; Barrachin et al., 1994). Therefore, it is clear that the fit is not unique and does not represent anything fundamental; hence, points 1 and 2 above. The only important matter for the fitting of the ASRO is the k-space location of the maximal intensities and their heights, which reveal both the type and strength of the ASRO, at least for binaries where such a method has been used countless times. Recently, a very thorough study was performed on a simple model alloy Hamiltonian to exemplify some of these points (Wolverton et al., 1997). In fact, while different sets of ECIs may satisfy the fitting procedure and lead to a good reproduction of the experimental ASRO, there is no a priori guarantee that all sets of ECIs will lead to equivalent predictions of other physical properties, such as grain-boundary energies (Finel et al., 1994; Barrachin et al., 1994). Point 3 is a little less obvious. If both the type and strength of the ASRO are reproduced, then the ECIs are accurately reproducing the energetics associated with the infinitesimal-amplitude concentration fluctuations in the high-temperature disordered state. They may not, however, reflect the strength of the finite-amplitude concentration variations that are associated with a (possibly strong) first-order transition from the disordered to a long-range ordered state. 
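The non-uniqueness of such fits is easy to demonstrate numerically. In this toy 1D example, which is entirely synthetic, a smooth noisy curve stands in for a measured inverse intensity 1/α(k), modeled as linear in hypothetical pair parameters; a short-ranged and a longer-ranged parameter set both fit the data and both place the diffuse peak at the zone boundary, while the fitted "interactions" themselves are different sets.

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic "measured" 1/alpha(k) on a 1D Brillouin zone: a zone-boundary
# ordering peak plus a little measurement noise. All numbers are illustrative.
k = np.linspace(-np.pi, np.pi, 201)
inv_alpha = 1.6 + 1.2 * np.cos(k) + rng.normal(0.0, 0.02, k.size)

def fit_ecis(shells):
    """Least-squares fit of 1/alpha(k) = v0 + sum_r v_r cos(k r)."""
    basis = np.column_stack([np.cos(k * r) for r in [0] + list(shells)])
    v, *_ = np.linalg.lstsq(basis, inv_alpha, rcond=None)
    recon = basis @ v
    return v, np.max(np.abs(recon - inv_alpha)), k[np.argmin(recon)]

v_a, err_a, kmin_a = fit_ecis([1, 2])           # short-ranged parameter set
v_b, err_b, kmin_b = fit_ecis([1, 2, 3, 4, 5])  # longer-ranged parameter set
print("set A:", np.round(v_a, 3), " max misfit:", round(err_a, 3))
print("set B:", np.round(v_b, 3), " max misfit:", round(err_b, 3))
print("alpha(k) peak at |k| =", round(abs(kmin_a), 3), round(abs(kmin_b), 3))
# Both sets fit the data to within its noise and put the diffuse maximum at
# the zone boundary; the parameters are therefore not uniquely determined.
```

This is the essence of points 1 and 2 above: the peak position and height constrain the fit, not the individual parameters.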
In general, the energy gained by a first-order transformation is larger than suggested by the ASRO, which is why T_c > T_sp. In the extreme case, it is quite possible that the ASRO produces a set of ECIs that produce ordering-type phase boundaries (with a negative formation energy), whereas the low-temperature state is phase-separating (with a positive formation energy). An example of this can be found in the Ni-Au system (Wolverton and Zunger, 1997). Keep in mind, however, that this is a generic comment and much understanding can certainly be obtained from such studies. Nevertheless, this should emphasize the need (1) to determine the underlying origins of the fundamental thermodynamic behavior, (2) to connect high- and low-temperature properties and calculations, and (3) to have complementary techniques for a more thorough understanding.
PRINCIPLES OF THE METHOD

After establishing general definitions and the connection of the ASRO to the alloy's free energy, we show, for a simple standard Ising model, the well-known Krivoglaz-Clapp-Moss form (Krivoglaz, 1969; Clapp and Moss, 1966), connecting the so-called "effective chemical interactions" and the ASRO. We then generalize these concepts to the
more accurate formulation involving the electronic grand potential of the disordered alloy, which we base on a DFT Hamiltonian. Since we wish to derive the pair correlations from the electronic interactions inherent in the high-temperature state, it is most straightforward to employ a simple two-state, Ising-like variable for each alloy component and to enforce a single-occupancy constraint on each site in the alloy. This approach generates a model that straightforwardly deals with an arbitrary number of species, in contrast to an approach based on an N-state spin model (Ceder et al., 1994), which produces a mapping between the spin and concentration variables that is nonlinear. With this Ising-like representation, any atomic configuration of an alloy (whether ordered, partially ordered, or disordered) is described by a set of occupation variables, {x_{μ,i}}, where μ is the species label and i labels the lattice site. The variable x_{μ,i} is equal to 1 if an atom of species μ occupies the site i; otherwise it is 0. Because there can be only one atom per lattice site (i.e., a single-occupancy constraint: Σ_μ x_{μ,i} = 1), there are (N - 1) independent occupation variables at each site for an N-component alloy. This single-occupancy constraint is implemented by designating one species as the "host" species (say, the Nth one) and treating the host variables as dependent. The site probability (or sublattice concentration) is just the thermodynamic average (denoted by ⟨...⟩) of the site occupations, i.e., c_{μ,i} = ⟨x_{μ,i}⟩, which is between 0 and 1. For the disordered state, with no long-range order, c_{μ,i} = c_μ for all sites i. (Obviously, the presence of LRO is reflected by a nonzero value of c_{μ,i} - c_μ, which is one possible definition of a LRO parameter.) In all that follows, because the meaning without a site index is obvious, we will forego the overbar on the average concentration.
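These definitions are easy to make concrete. A minimal numerical sketch follows, with a synthetic random ternary configuration and arbitrary compositions; it verifies the single-occupancy constraint and the reduction of the site probabilities to the average concentrations in the disordered state.

```python
import numpy as np

rng = np.random.default_rng(0)

# Occupation variables x[m, i] for a random ternary (N = 3) alloy on 10^5
# sites: x[m, i] = 1 if species m sits on site i, else 0. Species are drawn
# independently with the nominal concentrations (homogeneously random state).
c = np.array([0.5, 0.3, 0.2])          # c_A, c_B, c_C; the last is the "host"
nsites = 100_000
species = rng.choice(3, size=nsites, p=c)
x = np.equal.outer(np.arange(3), species).astype(float)   # shape (3, nsites)

# Single-occupancy constraint: sum_m x[m, i] = 1 on every site,
# so only N - 1 = 2 occupation variables per site are independent.
assert np.all(x.sum(axis=0) == 1.0)

# Site probabilities are thermodynamic (here ensemble) averages <x_{m,i}>;
# with no long-range order they reduce to the average concentrations.
print("sampled concentrations:", np.round(x.mean(axis=1), 3))
```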
General Background on Pair Correlations

The atomic pair-correlation functions, that is, the correlated fluctuations about the average probabilities, are then properly defined as

q_{μν;ij} = ⟨(x_{μ,i} - c_{μ,i})(x_{ν,j} - c_{ν,j})⟩ = ⟨x_{μ,i} x_{ν,j}⟩ - ⟨x_{μ,i}⟩⟨x_{ν,j}⟩   (1)
which reflects the presence of ASRO. Note that the pair correlation is of rank (N - 1) for an N-component alloy because of our choice of independent variables (the "host" is dependent). Once the portion of rank (N - 1) has been determined, the "dependent" part of the full N-dimensional correlation function may be found by the single-occupancy constraint. Because of the dependencies introduced by this constraint, the N-dimensional pair-correlation function is a singular matrix, whereas the "independent" portion of rank (N - 1) is nonsingular (it has an inverse) everywhere above the spinodal temperature. It is important to notice that, by definition, the site-diagonal part of the pair correlations obeys a sum rule: because (x_{μ,i})² = x_{μ,i},

q_{μν;ii} = c_μ(δ_{μν} - c_ν)   (2)
where δ_{μν} is a Kronecker delta (and there is no summation over repeated indices). For a binary alloy, with c_A + c_B = 1, there is only one independent composition, say c_A (c_B is the "host"), so there is only one pair correlation, and q_{AA;ii} = c_A(1 - c_A). It is best to define the pair correlations in terms of the so-called Warren-Cowley parameters as

α_{μν;ij} = q_{μν;ij} / [c_μ(δ_{μν} - c_ν)]   (3)
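Equations 1 to 3 can be checked directly on a synthetic configuration. In this sketch (a random binary chain, so all off-site correlations should vanish), the site-diagonal Warren-Cowley parameter equals 1 by construction:

```python
import numpy as np

rng = np.random.default_rng(1)

# Warren-Cowley parameters for a binary A-B chain (a toy 1D sketch).
# x_A[i] = 1 for an A atom; q_ij = <x_i x_j> - <x_i><x_j>, and
# alpha_ij = q_ij / [c_A (1 - c_A)] (Equation 3 with the AA labels dropped).
cA, nsites = 0.3, 1_000_000
xA = (rng.random(nsites) < cA).astype(float)

def alpha(r):
    """Warren-Cowley parameter for neighbor shell r on a periodic chain."""
    m = np.mean(xA)
    q = np.mean(xA * np.roll(xA, r)) - m ** 2
    return q / (m * (1.0 - m))

print("alpha(0):", alpha(0))   # sum rule: 1 by construction (Equation 2)
print("alpha(1):", alpha(1))   # ~0 for an uncorrelated (random) configuration
print("alpha(2):", alpha(2))
```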
Note that for a binary alloy the single pair correlation is α_{AA;ij} = q_{AA;ij}/[c_A(1 - c_A)], and the AA subscripts are not needed. Clearly, the Warren-Cowley parameters are normalized to range between ±1, and, hence, they are related to the joint probabilities of finding two particular types of atoms at two particular sites. The pair correlations defined in Equation 3 are, of course, the same pair correlations that are measured in diffuse-scattering experiments. This is seen by calculating the scattering intensity by averaging thermodynamically the square of the scattering amplitude, A(k). For example, the A(k) for a binary alloy with N_s atoms is given by (1/N_s) Σ_i [f_A x_{A,i} + f_B(1 - x_{A,i})] e^{ik·R_i}, for on site i you are either scattering off an "A" atom or a "not an A" atom. Here f_μ is the scattering factor for x rays; use b_μ for neutrons. The scattering intensity, I(k), is then

I(k) = ⟨|A(k)|²⟩ = δ_{k,0} [c_A f_A + (1 - c_A) f_B]²   (Bragg term)
       + (f_A - f_B)² (1/N_s) Σ_{ij} q_{ij} e^{ik·(R_i - R_j)}   (diffuse term)   (4)
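The diffuse term can be evaluated numerically. Here is a toy 1D illustration of our own (a synthetic chain with unit scattering contrast, f_A - f_B = 1; a simple Markov rule creates an unlike-neighbor preference), showing the ASRO appearing as a zone-boundary diffuse peak:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy 1D binary chain with an ordering (unlike-neighbor) tendency, generated
# by a simple Markov rule; the diffuse intensity is computed with unit
# scattering contrast (f_A - f_B = 1), i.e., the Fourier transform of q_ij.
nsites, p_unlike = 8192, 0.7
xA = np.empty(nsites)
xA[0] = 1.0
for i in range(1, nsites):
    xA[i] = 1.0 - xA[i - 1] if rng.random() < p_unlike else xA[i - 1]

dx = xA - xA.mean()
I_diff = np.abs(np.fft.fft(dx)) ** 2 / nsites   # diffuse term of Equation 4
k = 2.0 * np.pi * np.fft.fftfreq(nsites)

# The unlike-neighbor ASRO piles diffuse intensity near the zone boundary k = pi
zb = I_diff[np.abs(k) > 3 * np.pi / 4].mean()   # near the zone boundary
zc = I_diff[np.abs(k) < np.pi / 4].mean()       # near the zone center
print("boundary/center diffuse intensity ratio:", round(zb / zc, 2))
```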
The first term in the scattering intensity is the Bragg scattering found from the average lattice, with an intensity given by the compositionally averaged scattering factor. The second term is the so-called diffuse-scattering term, and it is the Fourier transform of Equation 1. Generally, the diffuse-scattering intensity for an N-component alloy (relevant to experiment) is then

I_diff(k) = Σ_{μ≠ν=1}^{N} (f_μ - f_ν)² q_{μν}(k)   (5)
or the sum may also go from 1 to (N - 1) if (f_μ - f_ν)² is replaced by f_μ f_ν. The various ways to write this arise due to the single-occupancy constraint. For direct comparison to scattering experiments, theory needs to calculate q_{μν}(k). Similarly, the experiment can only measure the μ ≠ ν portion of the pair correlations because it is the only part that has scattering contrast (i.e., f_μ - f_ν ≠ 0) between the various species. The remaining portion is obtained by the constraints. In terms of experimental Laue units [i.e., I_Laue = (f_μ - f_ν)² c_μ(δ_{μν} - c_ν)], I_diff(k) may also be easily given in terms of the Warren-Cowley parameters. When the free-energy curvature, i.e., q⁻¹(k), goes through zero, the alloy is unstable to chemical ordering, and q(k) and α(k) diverge at T_sp. So measurements or calculations of q(k) and α(k) are a direct means to probe the
free energy associated with concentration fluctuations. Thus, it is clear that the chemical fluctuations leading to the observed ASRO arise from the curvature of the alloy free energy, just as phonons or positional fluctuations arise from the curvature of a free energy (the dynamical matrix). It should be realized that the above comments could just as well have been made for magnetization, e.g., using the mapping (2x - 1) → σ for the spin variables. Instead of chemical fields, there are magnetic fields, so that q(k) becomes the magnetic susceptibility, χ(k). For a disordered alloy with magnetic fluctuations present, one will also have a cross-term that represents the magnetochemical correlations, which determine how the magnetization on an atomic site varies with local chemical fluctuations, or vice versa (Staunton et al., 1990; Ling et al., 1995a). This is relevant to magnetism in alloys covered elsewhere in this unit (see Coupling of Magnetic Effects and Chemical Order).

Sum Rules and Mean-Field Errors

By Equations 2 and 3, α_{μν;ii} should always be 1; that is, due to the (discrete) translational invariance of the disordered state, the Fourier transform is well defined and

α_{μν;ii} = α_{μν}(R = 0) = ∫dk α_{μν}(k) = 1   (6)
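In a simple mean-field model, Equation 6 generally fails unless a constant correction for self-correlations is subtracted from the interaction. A toy numerical sketch (the Krivoglaz-Clapp-Moss-like form, the model V(k), and all numbers here are illustrative inventions):

```python
import numpy as np

# Enforcing the intensity sum rule in a toy 1D mean-field model:
# alpha(k) = 1 / [1 + beta*c*(1-c)*(V(k) - lam)] violates alpha(R=0) = 1
# unless the constant "reaction field" lam is tuned to restore it.
k = 2 * np.pi * np.fft.fftfreq(512)
V = -0.25 * np.cos(k)            # toy ordering interaction, minimum at k = pi
beta, c = 8.0, 0.5               # inverse temperature and concentration
chi = beta * c * (1 - c)

def alpha_mean(lam):
    """Brillouin-zone average of alpha(k), i.e., alpha(R = 0)."""
    return np.mean(1.0 / (1.0 + chi * (V - lam)))

# alpha_mean is monotone increasing in lam on this bracket, so bisect until
# the sum rule alpha(R = 0) = 1 is satisfied.
lo, hi = -10.0, 0.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if alpha_mean(mid) < 1.0 else (lo, mid)
lam = 0.5 * (lo + hi)

print("uncorrected alpha(R=0):", alpha_mean(0.0))
print("corrected   alpha(R=0):", alpha_mean(lam), " lam =", lam)
```

Subtracting this constant shifts the estimated instability temperature downward, which is exactly the correction of the mean-field transition-temperature error discussed next.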
This intensity sum rule is used to check the experimental errors associated with the measured intensities (see SYMMETRY IN CRYSTALLOGRAPHY, KINEMATIC DIFFRACTION OF X RAYS and DYNAMICAL DIFFRACTION). Within most mean-field theories using model Hamiltonians, unless care is taken, Equations 2 and 6 are violated. It is in fact this violation that accounts for the major errors found in mean-field estimates of transition temperatures, because the diagonal (or intrasite) elements of the pair correlations are the largest. Lars Onsager first recognized this in the 1930s for interacting electric dipoles (Onsager, 1936), where he found that a mean-field solution produced the wrong physical sign for the electrostatic energy. Onsager found that by enforcing the equivalents of Equations 2 or 6 (by subtracting an approximate field arising from self-correlations), a more correct physical behavior is found. Hence, we shall refer to the mathematical entities that enforce these sum rules as Onsager corrections (Staunton et al., 1994). In the 1960s, mean-field, magnetic-susceptibility models that implemented this correction were referred to as mean-spherical models (Berlin and Kac, 1952), and the Onsager corrections themselves were referred to as reaction or cavity fields (Brout and Thomas, 1967). Even today this correction is periodically rediscovered and implemented in a variety of problems. As this has profound effects on results, we shall return to how to implement the sum rules within mean-field approaches later, in particular, within our first-principles technique, which incorporates the corrections self-consistently.

Concentration Waves in Multicomponent Alloys

While the concept of concentration waves in binary alloys has a long history, only recently have efforts returned to
the multicomponent alloy case. We briefly introduce the simple ideas of ordering waves, but take this as an opportunity to explain how to interpret ASRO in a multicomponent alloy system where the wavevector alone is not enough to specify the ordering tendency (de Fontaine, 1973; Althoff et al., 1996). As indicated in the introduction, any arrangement of atoms on a Bravais lattice may be thought of as a modulation of the disordered state by a thermodynamically stable concentration wave. That is, one may Fourier decompose the ordering wave for each site and species on the lattice:

c_{α,i} = c⁰_α + Σ_j [Q_α(k_j) e^{ik_j·R_i} + c.c.]   (7)
A binary A_c B_{1-c} alloy has a special symmetry: on each site, if the atom is not an A-type atom, then it is definitely a B-type atom. One consequence of this A-B symmetry is that there is only one independent local composition, {c_i} (for all sites i), and this greatly simplifies the calculation and the interpretation of the theoretical and experimental results. Because of this, the structure (or concentration wave) indicated by the ASRO is determined only by the ordering wavevector (Khachaturyan, 1972, 1983; de Fontaine, 1975, 1979); in this sense, the binary alloys are a special case. For example, for CuAu, the low-temperature state is a layered L10 state with alternating layers of Cu and Au. Clearly, with c_Cu = 1/2 and c_Au = 1 - c_Cu, the "concentration wave" is fully described by

c_Cu(R_i) = 1/2 + (1/2) η(T) e^{i(2π/a)(001)·R_i}   (8)
where a single wavevector, k = (001), in units of 2π/a, where a is the lattice parameter, indicates the type of modulation. Here, η(T) is the long-range order parameter. So, knowing the composition of the alloy and the energetically favorable ordering wavevector, you fully define the type of ordering. Both bits of information are known from the experiment: the ASRO of CuAu indicates (Moss, 1969) that the star of k = (001) is the most energetically favorable fluctuation. The amplitude of the concentration wave is related to the energy gain due to ordering, as can be seen from a simple chemical, pairwise-interaction model with interactions V(R_i - R_j). The energy difference between the disordered and the short-range ordered state is (1/2) Σ_k |Q(k)|² V(k) for infinitesimal ordering fluctuations. Multicomponent alloys (like an A-B-C alloy) do not possess the binary A-B symmetry, and the ordering analysis is therefore more complicated. Because the concentration waves have additional degrees of freedom, more information is needed from experiment or theory. For a bcc ABC2 alloy, for example, the particular ordering also requires the relative polarizations in the Gibbs "composition space," which are the concentrations of "A relative to C" and "B relative to C" on each sublattice being formed. The polarizations are similar to "branches" for the case of phonons in alloys (Badalayan et al., 1969; de Fontaine, 1973; Althoff et al., 1996). The polarizations of the ordering wave thus determine the sublattice occupations in partially ordered states (Althoff et al., 1996).
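The binary concentration wave of Equation 8 can be evaluated directly. A minimal sketch (fcc positions built on a small grid of our choosing; η is set to 1 to show the fully ordered limit):

```python
import numpy as np

# Evaluating the binary concentration wave of Equation 8 for CuAu (L1_0):
# a single star member k = (2*pi/a)(001) modulates the Cu site probability
# on an fcc lattice, giving alternating (001) layers of Cu and Au when the
# long-range-order parameter eta -> 1. (The grid size is illustrative.)
a, eta = 1.0, 1.0
offsets = np.array([[0, 0, 0], [0, .5, .5], [.5, 0, .5], [.5, .5, 0]])
cells = np.indices((2, 2, 2)).reshape(3, -1).T
R = (cells[:, None, :] + offsets[None, :, :]).reshape(-1, 3) * a  # fcc sites

k = (2 * np.pi / a) * np.array([0.0, 0.0, 1.0])
c_Cu = 0.5 + 0.5 * eta * np.cos(R @ k)       # Equation 8, real form

for z in np.unique(R[:, 2]):
    layer = c_Cu[np.isclose(R[:, 2], z)]
    print(f"z = {z:3.1f}: c_Cu = {layer.mean():.1f}")  # alternating Cu/Au layers
```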
Figure 1. (A) The Gibbs triangle with an example of two possible polarization paths that ultimately lead to a Heusler or L21-type ordering at fixed composition in a bcc ABC2 alloy. Note that unit vectors that describe the change in the A (black) and B (dark gray) atomic concentrations are marked. First, a B2-type order is formed from a k₀ = (111) ordering wave; see (B); given polarization 1 (upper dashed line), A and B atoms randomly populate the cube corners, with C (light gray) atoms solely on the body centers. Next, the polarization 2 (lower dashed line) must occur, which separates the A and B onto separate cube corners, creating the Heusler structure (via a k₁ = k₀/2 symmetry-allowed periodicity); see (C). Of course, other polarizations for the B2 state are possible, as determined from the ASRO.
An example of two possible polarizations is given in Figure 1, part A, for the case of B2-type ordering in an ABC2 bcc alloy. At high temperatures, k = (111) is the unstable wavevector and produces a B2 partially ordered state. However, the amount of A and B on the two sublattices is dictated by the polarization: with polarization 1, for example, Figure 1, part B, is appropriate. At a lower temperature, k = (½ ½ ½) ordering is symmetry allowed, and the alloy then forms, in this example, a Heusler-type L21 alloy because only polarization 2 is possible (see Figure 1, part C; in a binary alloy, the Heusler would be the DO3 or Fe3Al prototype because there are two distinct "Fe" sites). However, for B2 ordering, keep in mind that there are an infinite number of polarizations (types of partial order) that can occur, which must be determined from the electronic interactions on a system-by-system basis. In general, then, the concentration wave relevant for specifying the type of ordering tendencies in a ternary alloy, as given by the wavevectors in the ASRO, can be written as (Althoff et al., 1996)
(c_A(R_i), c_B(R_i))ᵀ = (c_A, c_B)ᵀ + Σ_{s,σ} η_{sσ}(T) Σ_{j_s} (e_{σA}(k_s), e_{σB}(k_s))ᵀ γ_σ(k_{j_s}; {e_{σα}}) e^{ik_{j_s}·R_i}   (9)
The generalization to N-component alloys follows easily in this vector notation. The same results can be obtained by mapping the problem of ‘‘molecules on a lattice’’ investigated by Badalayan et al. (1969). Here, the amplitude of the ordering wave has been broken up into a product of a temperature-dependent factor and two others:
Q_α(k_{j_s}) = η_σ(T) e_α(k_s) γ(k_{j_s}), in a spirit similar to that done for the binary alloys (Khachaturyan, 1972, 1983). Here, the η are the temperature-dependent, long-range-order parameters that are normalized to 1 at zero temperature in the fully ordered state (if it exists); the e_α are the eigenvectors specifying the relative polarizations of the species in the proper thermodynamic Gibbs space (see below; also see de Fontaine, 1973; Althoff et al., 1996); the values of γ are geometric coefficients that are linear combinations of the eigenvectors at finite temperature, hence the k dependence, but must be simple ratios of numbers at zero temperature in a fully ordered state (like the ½ in the former case of CuAu). In regard to the summation labels: s refers to the contributing stars [e.g., (100) or (1 ½ 0)]; σ refers to branches, or the number of chemical degrees of freedom (2 for a ternary); j_s refers to the number of wavevectors contained in the star [for fcc, (100) has 3 in the star] (Khachaturyan, 1972, 1983; Althoff et al., 1996). Notice that only the η_{sσ}(T) are quantities not determined by the ASRO, for they depend on thermodynamic averages in a partially or fully ordered phase with those specific probability distributions. For B2-type (two sublattices, I and II) ordering in an ABC2 bcc alloy, there are two order parameters, which in the partially ordered state can be, e.g., η₁ = c_A(I) - c_A(II) and η₂ = c_B(I) - c_B(II). Scattering measurements in the partially ordered state can determine these by relative weights under the superlattice spots that form, or they can be obtained by performing thermodynamic calculations with Monte Carlo or the CVM. At a stoichiometric composition, the values of γ are simple geometric numbers, although, from the notation, it is clear they can be different for each member of a star; hence the different ordering of Cu-Au at L12 and L10 stoichiometries (Khachaturyan, 1983).
Thus, the eigenvectors e_α(k_s) at the unstable wavevector(s) give the ordering of A (or B) relative to C. These eigenvectors are completely determined by the electronic interactions. What are these eigenvectors and how does one get them from any calculation or measurement? This is a bit tricky. First, let us note what the e_α(k_s) are not, so as to avoid confusion between high-T and low-T approaches that use concentration-wave ideas. In the high-temperature state, each component of the eigenvector is degenerate among a given star. By the symmetry of the disordered state, this must be the case, and it may be removed from the j_s sum (as done in Equation 9). However, below a first-order transition, it is possible that the e_α(k_s) is temperature and star dependent, for instance, but this cannot be ascertained from the ASRO. Thus, from the point of view of determining the ordering tendency from the ASRO, the e_α(k_s) do not vary among the members of the star, and their temperature dependence is held fixed after it is determined just above the transition. This does not mean, as assumed in pair-potential models, that the interactions (and, therefore, polarizations) are given a priori and do not change as a function of temperature; it only means that averages in the disordered state cannot necessarily give you averages in the partially ordered state. Thus, in general, the e_α(k) may also have a dependence on members of the star, because e_α(k_{j_s})γ(k_{j_s}) has to reflect the symmetry operations of the ordered distribution when writing a
concentration wave. We do not address this possibility here. Now, what is e_α(k) and how do you get it? In Figure 1, part A, the unit vectors for the fluctuations of A and B compositions are shown within the ternary Gibbs triangle: only within this triangle are the values of c_A, c_B, and c_C allowed (because c_A + c_B + c_C = 1). Notice that the unit vectors for δc_A and δc_B fluctuations are at an oblique angle, because the Gibbs triangle is an oblique coordinate system. The free energy associated with concentration fluctuations is F = δcᵀ q⁻¹ δc, using matrix notation with species labels suppressed (note that superscript T is a transpose operation). The matrix q⁻¹(k) is symmetric and square in (N - 1) species (let us take species C as the "host"). As such, it seems "obvious" that the eigenvectors of q⁻¹ are required because they reflect the "principal directions" in free-energy space that reveal the true order. However, its eigenvectors, e_C, produce a host-dependent, unphysical ordering! That is, Equation 9 would produce negative concentrations in some cases. Immediately, you see the problem. The Gibbs triangle is an oblique coordinate system, and, therefore, the eigenvectors must be obtained in a properly orthogonal Cartesian coordinate system (de Fontaine, 1973). By an oblique coordinate transform, defined by δc = Tx, F_x = xᵀ(Tᵀ q⁻¹ T)x, but still F_x = F. From Tᵀ q⁻¹ T, we find a set of host-independent eigenvectors, e_X; in other words, regardless of which species you take as the host, you always get the same eigenvectors! Finally, the physical eigenvectors we seek in the Gibbs space are then e_G = T e_X (since δc = Tx). It is important to note that e_C is not the same as e_G because Tᵀ ≠ T⁻¹ in an oblique coordinate system like the Gibbs triangle, and, therefore, T Tᵀ is not 1.
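The transformation just described can be written out in a few lines. In this sketch the 2×2 curvature matrix q⁻¹ is an arbitrary symmetric example, and T is one concrete choice consistent with δc = Tx, built from Gibbs-triangle axes at 60 degrees; the naive eigenvectors of q⁻¹ then differ from the physical e_G = T e_X.

```python
import numpy as np

# Sketch of the oblique-coordinate eigenvector analysis for a ternary alloy
# (host = C). The dc_A and dc_B axes of the Gibbs triangle meet at 60 deg,
# so dc = T x maps properly orthogonal Cartesian coordinates x to oblique
# Gibbs coordinates. q_inv is an arbitrary model curvature matrix.
theta = np.pi / 3
M = np.array([[1.0, np.cos(theta)],           # columns: unit vectors of the
              [0.0, np.sin(theta)]])          # dc_A and dc_B axes (Cartesian)
T = np.linalg.inv(M)                          # so that dc = T x

q_inv = np.array([[2.0, 0.7],                 # model free-energy curvature
                  [0.7, 1.2]])                # at the unstable wavevector k0

# "Naive" eigenvectors of q_inv alone: host dependent and unphysical
_, e_C = np.linalg.eigh(q_inv)

# Correct procedure: diagonalize T^T q^-1 T in Cartesian coordinates ...
vals, e_X = np.linalg.eigh(T.T @ q_inv @ T)
# ... then map the eigenvectors back into the Gibbs space
e_G = T @ e_X

print("eigenvalues:", np.round(vals, 4))
print("naive  e_C:\n", np.round(e_C, 4))
print("proper e_G:\n", np.round(e_G, 4))
# e_G differs from e_C because T^T != T^-1 in an oblique coordinate system.
```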
It is the e_G that reveal the true principal directions in free-energy space, and these parameters are related to linear combinations of elements of q⁻¹(k = k₀) at the pertinent unstable wavevector(s). If nothing else, the reader should take away that these quantities can be determined theoretically or experimentally via the diffuse intensities. Of course, any error in the theory or experiment, such as not maintaining the sum rules on q or α, will create a subsequent error in the eigenvectors and hence the polarization. Nevertheless, it is possible to obtain from the ASRO both wavevector and "wave polarization" information, which together determine the ordering tendencies (also see the appendix in Althoff et al., 1996). To make this a little more concrete, let us reexamine the previous bcc ABC2 alloy. In the bcc alloys, the first transformation from disordered A2 to the partially ordered B2 phase is second order, with k = (111) and no other wavevectors in the star. The modulation (111) indicates that the bcc lattice is being separated into two distinct sublattices. If polarization 1 in Figure 1, part A, was found, it indicates that species C is going to be separated onto its own sublattice; whereas, if polarization 2 was found initially, species C would be equally placed on the two sublattices. Thus, the polarization already gives a great deal of information about the ordering in the B2 partially ordered phase and, in fact, is just the slope of the line in the Gibbs triangle. This is the basis for the recent graphical representation of ALCHEMI (atom location by channeling
electron microscopy) results in B2-ordering ternary intermetallic compounds (Hou et al., 1997). There are, in principle, two order parameters because of the two branches in the ternary alloy case. The order parameter η₂, say, can be set to zero to obtain the B2-type ordering, and, because the eigenvalue λ₂ of eigenmode e₂ is higher in energy than that of e₁, i.e., λ₁ < λ₂, only e₁ is the initially unstable mode. See Johnson et al. (1999) for calculations in Ti-Al-Nb bcc-based alloys, which are directly compared to experiment (Hou, 1997). We close this description of interpreting ASRO in ternary alloys by mentioning that the above analysis generalizes completely for quaternaries and more complex alloys. The important chemical space progresses: binary is a line (no angles needed), ternary is a triangle (one angle), quaternaries are pyramids (two angles, as with Euler rotations), and so on. So the oblique transforms become increasingly complex for multidimensional spaces, but the additional information, along with the unstable wavevector, is contained within the ASRO.

Concentration Waves from a Density-Functional Approach

The present first-principles theory leads naturally to a description of ordering instabilities in the homogeneously random state in terms of static concentration waves. As discussed by Khachaturyan (1983), the concentration-wave approach has several advantages, which are even more relevant when used in conjunction with an electronic-structure approach (Staunton et al., 1994; Althoff et al., 1995, 1996).
Namely, the method (1) allows for interatomic interaction at arbitrary distances, (2) accounts for correlation effects in a long-range interaction model, (3) establishes a connection with the Landau-Lifshitz thermodynamic theory of second-order phase transformations, and (4) does not require a priori assumptions about the atomic superstructure of the ordered phases involved in the order-disorder transformations, allowing the possible ordered-phase structures to be predicted from the underlying correlations. As a consequence of the electronic-structure basis to be discussed later, realistic contributions to the effective chemical interactions in metals arise, e.g., from electrostatic interactions, Fermi-surface effects, and strain fields, all of which are inherently long range. Analysis within the method is performed entirely in reciprocal space, allowing for a description of atomic clustering and ordering, or of strain-induced ordering, none of which can be included within many conventional ordering theories. In the present work, we neglect all elastic effects, which are the subject of ongoing work. As with the experiment, the electronic theory leads naturally to a description of the ASRO in terms of the temperature-dependent, two-body compositional-correlation function in reciprocal space. As the temperature is lowered, (usually) one wavevector becomes prominent in the ASRO, and the correlation function ultimately diverges there. It is probably best to derive some standard relations that are applicable to both simple models and DFT-based approaches. The idea is simply to show that certain simplifications lead to well-known and venerable results, such as
the Krivoglaz-Clapp-Moss formula (Krivoglaz, 1969; Clapp and Moss, 1966), whereas, by making fewer simplifications, an electronic-DFT-based theory can be formulated that is nevertheless a mean-field theory of the configurational degrees of freedom. While it is certainly much easier to derive approximations for pair correlations using very standard mean-field treatments based on effective interactions, as has been done traditionally, an electronic-DFT-based approach requires much more development along those lines. Consequently, we shall proceed in a much less common way, which can deal with all possibilities. In particular, we shall give a derivation for binary alloys and state the result for the N-component generalization, with some clarifying remarks added. As shown for an A-B binary system (Gyorffy and Stocks, 1983), it is straightforward to adapt the density-functional ideas of classical liquids (Evans, 1979) to a "lattice-gas" model of alloy configurations (de Fontaine, 1979). The fundamental DFT theorem states that, in the presence of an external field, H_ext = Σ_n ξ_n x_n, with external chemical potentials ξ_n, there is a grand potential (not yet the thermodynamic one),

Ω[T, V, N, ν; {c_n}] = F[T, V, N; {c_n}] − Σ_n (ξ_n − ν) c_n   (10)

such that the internal Helmholtz free energy F[{c_n}] is a unique functional of the local concentrations c_n, that is, ⟨x_n⟩, meaning F is independent of ξ_n. Here, T, V, N, and ν are, respectively, the temperature, volume, number of unit cells, and chemical potential difference (ν_A − ν_B). The equilibrium configuration is specified by the stationarity condition

(∂Ω/∂c_n)|_{c_n^0} = 0   (11)

which determines the Euler-Lagrange equations for the alloy problem. Most importantly, from arguments given by Evans (1979), it can be proven that Ω is a minimum at {c_n^0} and equal to the proper thermodynamic grand potential Ω[T, V, N, ν] (Gyorffy et al., 1989). In terms of the effective chemical potential differences ν̃_i = (ξ_i − ν), Ω is a generating function for a hierarchy of correlation functions (Evans, 1979). The first two are

∂Ω/∂ν̃_i = −c_i   and   ∂²Ω/∂ν̃_i ∂ν̃_j = −β q_ij   (12)
This second generator is the correlation function that we require for stability analysis. Some standard DFT tricks are useful at this point, and these happen to be the same tricks originally used to derive the electronic-DFT Kohn-Sham equations (also known as the single-particle Schrödinger equations; Kohn and Sham, 1965). Although F is not yet known, we can break this complex functional up into a known noninteracting part (given by the point entropy for the alloy problem):

F_0 = β^(-1) Σ_n [c_n ln c_n + (1 − c_n) ln(1 − c_n)]   (13)
COMPUTATION OF DIFFUSE INTENSITIES IN ALLOYS
and an interacting part Ω̃, defined by Ω̃ = F − F_0. Here, β^(-1) is the temperature k_B T, where k_B is the Boltzmann constant. In the DFT for electrons, the noninteracting part was taken as the single-particle kinetic energy (Kohn and Sham, 1965), which is again known exactly. It then follows from Equation 11 that the Euler-Lagrange equations that the {c_n^0} satisfy are

β^(-1) ln [c_n^0 / (1 − c_n^0)] − S_n^(1) − ν̃_n = 0   (14)
which determines the contribution to the local chemical potential differences in the alloy due to all the interactions, if S^(1) can be calculated (in the physics literature, S^(1) would be considered a self-energy functional). Here it has been helpful to define a new set of correlation functions generated from the functional derivatives of Ω̃ with respect to the concentration variables, the first two correlation functions being

S_i^(1) ≡ −∂Ω̃/∂c_i   and   S_ij^(2) ≡ −∂²Ω̃/∂c_i ∂c_j   (15)
In the classical theory of liquids, S^(2) is an Ornstein-Zernike (Ornstein, 1912; Ornstein and Zernike, 1914, 1918; Zernike, 1940) direct-correlation function (with density instead of concentration fluctuations; Stell, 1969). Note that there are as many coupled equations in Equation 14 as there are atomic sites. If we are interested in, for example, the concentration profile around an antiphase boundary, Equation 14 would in principle provide that information, depending upon the complexity of Ω̃ and whether we can calculate its functional derivatives, which we shall address momentarily. Also, recognize that S^(2) is, by its very nature, the stability matrix (with respect to concentration fluctuations) of the interacting part of the free energy. Keep in mind that Ω̃, in principle, must contain all many-body-type interactions, including all entropy contributions beyond the point entropy that was used as the noninteracting portion. If it is based on a fully electronic description, it must also contain the "particle-hole" entropy associated with the electronic density of states at finite temperature (Staunton et al., 1994). The significance of S^(2) can immediately be found by performing a stability analysis of the Euler-Lagrange equations; that is, take the derivatives of Equation 14 with respect to c_i or, equivalently, expand the equation about c_i^0 (i.e., c_i = c_i^0 + δc_i) to find out how fluctuations affect the local chemical potential difference. The result is the stability equations for a general inhomogeneous alloy system:

δ_ij / [βc_i(1 − c_i)] − S_ij^(2) − ∂ν̃_i/∂c_j = 0   (16)
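To see how the stability matrix in Equation 16 encodes ordering versus clustering, one can diagonalize a small model of it numerically. The sketch below is purely illustrative (a one-dimensional ring of sites and all numerical values are assumptions, not first-principles input): it builds a nearest-neighbor S^(2)_ij, forms the homogeneous-state stability matrix δ_ij/[βc(1 − c)] − S^(2)_ij, and extracts the softest eigenmode, which by translational symmetry is a concentration wave.

```python
import numpy as np

beta, c = 2.0, 0.5      # illustrative inverse temperature and concentration
N = 12                  # sites on a 1D ring (toy lattice)

# Model curvature S^(2)_ij of the interacting free energy: nearest-neighbor
# coupling chosen negative (favoring unlike neighbors), an assumption.
S2 = np.zeros((N, N))
for i in range(N):
    S2[i, (i + 1) % N] = S2[i, (i - 1) % N] = -0.3

# Stability matrix of Eq. 16: delta_ij / [beta c (1 - c)] - S^(2)_ij
M = np.eye(N) / (beta * c * (1.0 - c)) - S2

evals, evecs = np.linalg.eigh(M)   # eigenvalues in ascending order
soft = evecs[:, 0]                 # softest eigenmode

# The softest mode here alternates sign site by site, i.e., a k = pi
# ordering wave; its eigenvalue is the least stable direction.
print("softest eigenvalue:", evals[0])
print("mode pattern:", np.sign(soft / soft[0]))
```

Flipping the nearest-neighbor coupling to +0.3 makes the softest mode the uniform k = 0 wave instead, i.e., a clustering instability.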
Through the DFT theorem and generating functionals, the response function ∂ν̃_i/∂c_j, which tells how the concentrations vary with changes in the applied field, has a simple relationship to the true pair-correlation function:

∂ν̃_i/∂c_j = (∂c_j/∂ν̃_i)^(-1) = −(∂²Ω/∂ν̃_i ∂ν̃_j)^(-1) = β^(-1) (q^(-1))_ij = [βc(1 − c)]^(-1) (α^(-1))_ij   (17)

where the last equality arises through the definition of the Warren-Cowley parameters. If Equation 16 is now evaluated in the random state, where (discrete) translational invariance holds, and the connection between the two types of correlation functions is used (i.e., Equation 17), we find

α(k) = 1 / [1 − βc(1 − c) S^(2)(k)]   (18)
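Equation 18 is simple enough to explore numerically. In the sketch below, the functional form of S^(2)(k), the concentration, and the units (k_B = 1) are all assumptions made for illustration: it locates the dominant wavevector k_0 of a model S^(2)(k), finds the spinodal temperature where βc(1 − c)S^(2)(k_0) = 1, and shows α(k_0) growing as that temperature is approached from above.

```python
import numpy as np

kB = 1.0   # units with k_B = 1 (assumption for this sketch)
c = 0.25   # overall concentration (illustrative)

# Model S^(2)(k) on a 1D Brillouin zone, peaked at k0 = pi (ordering).
# Purely illustrative functional form, not a first-principles result.
def S2(k):
    return 0.1 - 0.3 * np.cos(k)

k = 2.0 * np.pi * np.arange(256) / 256
k0 = k[np.argmax(S2(k))]

# Spinodal: beta c (1 - c) S2(k0) = 1  =>  T_sp = c (1 - c) S2(k0) / kB
T_sp = c * (1.0 - c) * S2(k0) / kB

# Warren-Cowley ASRO of Eq. 18; diverges at k0 as T -> T_sp from above.
def alpha(k, T):
    return 1.0 / (1.0 - c * (1.0 - c) * S2(k) / (kB * T))

print(f"k0 = {k0:.4f}, T_sp = {T_sp:.4f}")
print("alpha(k0) at T = 1.1*T_sp:", alpha(k0, 1.1 * T_sp))
```

A peak of S^(2)(k) at k = 0 instead would signal clustering, with the same divergence criterion at long wavelength.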
Note that here we have evaluated the exact functional in the homogeneously random state with c_i^0 = c ∀ i, which is an approximation, because in reality there are some changes to the functional induced by the developed ASRO. In principle, we should incorporate this ASRO in the evaluation to describe the situation more properly. For peaks at finite wavevector k0, it is easy to see that absolute instability of the binary alloy to ordering occurs when βc(1 − c)S^(2)(k = k0) = 1 and the correlations diverge. The alloy would be unstable to ordering with that particular wavevector. The temperature T_sp where this may occur is the so-called "spinodal temperature." For peaks at k0 = 0, i.e., long-wavelength fluctuations, the alloy would instead be unstable to clustering. For the N-component case, a similar derivation is applicable (Althoff et al., 1996; Johnson, 2001), with multiple chemical fields ξ_n^α, chemical potential differences ν^α (relative to ν^N), and effective chemical potential differences ν̃_n^α = (ξ_n^α − ν^α). One cannot use the simple c and (1 − c) relationship in general and must keep all the labels relative to the Nth component. Most importantly, when taking compositional derivatives, the single-occupancy constraint must be handled properly, i.e., ∂c_i^α/∂c_j^β = δ_ij (δ_αβ − δ_αN)(1 − δ_βN). The generalized equations for the pair correlations, when evaluated in the homogeneously random state, are

(q^(-1)(k))_αβ = −β S_αβ^(2)(k) + [δ_αβ/c_α + 1/c_N]   (19)
where neither α nor β can be the Nth component. This may again be normalized to produce the Warren-Cowley pairs. With the constraint implemented by designating the Nth species as the "host," the (nonsingular) portion of the correlation-function matrices is of rank (N − 1). For an A-B binary, c_α = c_A = c and c_N = c_B = 1 − c, because only the α = β = A term is valid (N = 2 and the matrices are rank 1), and we recover the familiar result, Equation 18. Equation 19 is, in fact, a most remarkable result. It is completely general and exact! However, it is based on a still unknown functional S^(2)(k), which is not a pairwise interaction but a pair-correlation function arising
from the interacting part of the free energy. Also, Equations 18 and 19 properly conserve the spectral intensity, α_αβ(R = 0) = 1, as required in Equation 6. Notice that S^(2)(k) has been defined without referring to pair potentials or any larger sets of ECIs. In fact, we shall discuss how to take advantage of this to make a connection to first-principles electronic-DFT calculations of S^(2)(k). First, however, let us discuss some familiar mean-field results in the theory of pair correlations in binary alloys by picking approximate S^(2)(k) functionals. In such a context, Equation 18 may be cautiously thought of as a generalization of the Krivoglaz-Clapp-Moss formula, where S^(2) plays the role of a concentration- and (weakly) temperature-dependent effective pairwise interaction.

Connection to Well-Known Mean-Field Results

In the concentration-functional approach, one mean-field theory is to take the interaction part of the free energy as the configurational average of the alloy Hamiltonian, i.e., Ω̃_MF = ⟨H[{x_n}]⟩, where the averaging is performed with an inhomogeneous product probability distribution function, P[{x_n}] = Π_n P_n(x_n), with P_n(1) = c_n and P_n(0) = 1 − c_n. Such a product distribution yields the mean-field result ⟨x_i x_j⟩ = c_i c_j, usually called the random-phase approximation in the physics community. For an effective chemical interaction model based on pair potentials and using ⟨x_i x_j⟩ = c_i c_j, then

Ω̃_MF = (1/2) Σ_ij c_i V_ij c_j   (20)

and therefore S^(2)(k) = −V(k), which no longer has a direct electronic connection for the pairwise correlations. As a result, we recover the Krivoglaz-Clapp-Moss result (Krivoglaz, 1969; Clapp and Moss, 1966), namely

α(k) = 1 / [1 + βc(1 − c)V(k)]   (21)
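The Krivoglaz-Clapp-Moss form is trivial to evaluate numerically, and doing so makes its sum-rule problem concrete. In this sketch (a nearest-neighbor V(k) on a one-dimensional ring, with all parameter values chosen purely for illustration), the Brillouin-zone average of α(k), which should equal α(R = 0) = 1, comes out larger than 1.

```python
import numpy as np

beta, c, V1 = 1.0, 0.5, -0.5   # illustrative NN pair potential on a 1D ring

N = 256
k = 2.0 * np.pi * np.arange(N) / N
Vk = 2.0 * V1 * np.cos(k)      # lattice Fourier transform of the NN potential

# Krivoglaz-Clapp-Moss form (Eq. 21); here it peaks at k = pi (ordering).
alpha_k = 1.0 / (1.0 + beta * c * (1.0 - c) * Vk)

# Spectral sum rule: alpha(R=0) = (1/N) sum_k alpha(k) should equal 1,
# but the mean-field KCM expression violates it.
alpha_R0 = alpha_k.mean()
print("alpha(R=0) =", alpha_R0)
```

For these parameters the average evaluates to 1/sqrt(1 − [βc(1 − c)]² (2V1)²) > 1, which is exactly the kind of intensity-normalization error the Onsager correction discussed below is designed to remove.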
and the Gorsky-Bragg-Williams equation of state would be reproduced by Equation 14. To go beyond such a mean-field result, fluctuation corrections would have to be added to Ω̃_MF; that is, the probability distribution would have to be more than a separable product. One consequence of the uncorrelated configurational averaging (i.e., ⟨x_i x_j⟩ = c_i c_j) is a substantial violation of the spectral-intensity sum rule α(R = 0) = 1. This was recognized early on, and various scenarios for normalizing the spectral intensity have been used (Clapp and Moss, 1966; Vaks et al., 1966; Reinhard and Moss, 1993). A related effect of such mean-field averaging is that the system is "overcorrelated" through the mean fields. This occurs because the effective chemical fields are produced by averaging over all sites: the local composition on the ith site interacts with all the remaining sites through that average field, which already contains effects from the ith site, so the ith site acquires a large self-correlation. In other words, because the "mean" field contains field information from all sites, we manage to obtain atomic short-range order (a pair correlation) even though the assumption ⟨x_i x_j⟩ = c_i c_j says that no pairs are correlated. So the mean-field result properly contains a correlation, although it carries too large a self-correlation, and there is a slight lack of consistency due to the use of "mean" fields. In previous comparisons of Ising models to various mean-field results, this excessive self-correlation gave rise to the often-quoted 20% error in transition temperatures (Brout and Thomas, 1965).

Improvements to Mean-Field Theories

While this could be a chapter unto itself, we will just mention a few key points. First, the mere use of a mean-field theory does not necessarily make the results bad; there are many different breeds of mean-field approximation. For example, the CVM is a mean-field approximation for cluster entropy, and it is much better than the Gorsky-Bragg-Williams approximation, which uses only point entropy. In fact, the CVM is remarkably robust, giving in many cases results similar to "exact" Monte Carlo simulations (Sanchez and de Fontaine, 1978, 1980; Ducastelle, 1991). However, it too has limitations, such as a practical restriction to small interaction ranges or multiplet sizes. Second, when addressing an alloy problem, the complexity of Ω̃ (the underlying Hamiltonian) matters, not only how it is averaged. The overcorrelation in the mean-field approximation, for example, while often giving a large error in transition temperatures in simple alloy models, is not a general principle. If the correct physics giving rise to the ordering phenomena in a particular alloy is well described by the Hamiltonian, very good temperatures can result. If entropy were the entire driving force for the ordering and Ω̃ did not have any entropy included, we would get quite poor results.
On the other hand, if electronic band filling were the overwhelming contribution to the structural transformation, then an Ω̃ that included that information in a reasonable way, but that threw out higher-order entropy, would give very good results; much better, in fact, than the often quoted "20% too high in transition temperature." We shall indeed encounter this in the results below. Third, even simple improvements to mean-field methods can be very useful, as we have already intimated when discussing the Onsager cavity-field corrections. Let us see what the effect is of just ensuring the sum rule required in Equation 6. The Onsager corrections (Brout and Thomas, 1967) for the above mean-field average amount to the following coupled equations in the alloy problem (Staunton et al., 1994; Tokar, 1997; Boriçi-Kuqo et al., 1997), depending on the mean field used:

α(k; T) = 1 / {1 − βc(1 − c)[S_MF^(2)(k; T) − Λ(T)]}   (22)

and, using Equation 6,

Λ(T) = Ω_BZ^(-1) ∫ dk S_MF^(2)(k) α(k; T; Λ)   (23)
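Equations 22 and 23 form a small self-consistency problem, and a plain fixed-point iteration started from the high-T seed already works well for benign parameters. The sketch below (a one-dimensional model S_MF^(2)(k) and all parameter values are illustrative assumptions) solves for Λ(T) and then verifies that the corrected α(k) satisfies the sum rule α(R = 0) = 1.

```python
import numpy as np

beta, c = 4.0, 0.5                 # illustrative temperature and concentration
N = 512
k = 2.0 * np.pi * np.arange(N) / N
S2k = 0.6 * np.cos(k)              # model mean-field S^(2)(k), an assumption

def alpha(lam):
    # Eq. 22 with Onsager correction Lambda
    return 1.0 / (1.0 - beta * c * (1.0 - c) * (S2k - lam))

# Eq. 23 as a fixed-point problem; seed with the BZ average of S^(2)(k),
# which is the high-T limit of Lambda.
lam = S2k.mean()
for _ in range(200):
    lam_new = np.mean(S2k * alpha(lam))
    if abs(lam_new - lam) < 1e-12:
        break
    lam = lam_new

# At the fixed point the corrected alpha(k) obeys alpha(R=0) = 1 exactly.
print("Lambda =", lam, " sum rule:", alpha(lam).mean())
```

For a production solver one would use Newton-Raphson, as the text notes; the fixed-point form is only a compact way to exhibit the structure of the coupled equations.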
where Λ(T) is the temperature-dependent Onsager correction and Ω_BZ is the Brillouin-zone volume of the random-alloy Bravais lattice. This coupled set of equations may be solved by standard Newton-Raphson techniques. For the N-component alloy case, these become a coupled set of matrix equations in which all the matrices (including Λ) carry two subscripts identifying the pairs, as in Equation 19, and the appropriate internal sums over species are made (Althoff et al., 1995, 1996). An even more improved and general approach has been proposed by Chepulskii and Bugaev (1998a,b). The effect of the Onsager correction is to renormalize the mean-field S^(2)(k), producing an effective S_eff^(2)(k) that properly conserves the intensity. For an exact S^(2)(k), Λ is zero by definition. So, the more closely an approximate S^(2)(k) satisfies the sum rule, the less important are the Onsager corrections. At high temperatures, where α(k) ≈ 1, it is clear from Equation 23 that Λ(T) becomes the average of S^(2)(k) over the Brillouin zone, which turns out to be a good "seed" value for a Newton-Raphson solution. It is important to emphasize that Equations 22 and 23 may be derived in numerous ways. For the current discussion, however, we note that Staunton et al. (1994) derived these relations from Equation 14 by adding the Onsager cavity-field corrections while doing a linear-response analysis, which can add additional complexity (i.e., more q dependence than is evidenced by Equations 22 and 23). Such an approach can also yield the equivalent of a high-T expansion to second order in β, as used to explain the temperature-dependent shifts in ASRO (Le Bolloc'h et al., 1998). Now, for an Onsager-corrected mean-field theory, as one gets closer to the spinodal temperature, Λ(T ≈ T_sp) becomes larger and larger, because α(k) is diverging and more error has to be corrected.
Improved-entropy mean-field approaches, such as the CVM, still suffer from errors associated with the intensity sum rule, which are mostly manifest around the transition temperatures (Mohri et al., 1985). For a pairwise Hamiltonian, it is assumed that V_ii = 0; otherwise it would just be an arbitrary shift of the energy zero, which does not matter. However, an interesting effect in the high-T limit for the (mean-field) pair-potential model is that Λ equals the Brillouin-zone average of S_MF^(2)(k), which is set by V_ii = Ω_BZ^(-1) ∫_BZ dk V(k) and is not generally zero, because V_ii is not an interaction but a self-energy correction (i.e., a Σ_ii), which must be finite in mean-field theory just to have a properly normalized correlation function. As evidenced from Equation 16, the correlation function can be written in the form q^(-1) = β(Σ − S^(2)) in terms of a self-energy Σ, as can be shown more properly from field theory (Tokar, 1985, 1997). However, in that case Σ is the exact self-energy, rather than just the [βc(1 − c)]^(-1) of the Krivoglaz-Clapp-Moss mean-field case. Moreover, the zeroth-order result for the self-energy yields the Onsager correction (Masanskii et al., 1991; Tokar, 1985, 1997), i.e., Σ = [βc(1 − c)]^(-1) + Λ(T). Therefore, V_ii, or more properly Λ(T), is manifestly not arbitrary. These techniques have little to say regarding complicated many-body Hamiltonians, however. It would be remiss not to note that for short-range order in strongly correlated situations the mean-field results, even using Onsager corrections, can be topologically
incorrect. An example is the interesting toy model of a disordered (electrostatically screened) binary Madelung lattice (Wolverton and Zunger, 1995b; Boriçi-Kuqo et al., 1997), in which there are two types of point charges screened by rules depending on the nearest-neighbor occupations. In such a pairwise case, including intrasite self-correlations, the intensity sums are properly maintained. However, self-correlations beyond the intrasite one (at least out to second neighbors) are needed in order to correct the correlation function and its topology (Wolverton and Zunger, 1995a,b; Boriçi-Kuqo et al., 1997). In less problematic cases, such as backing out ECIs from experimental data on real alloys, it is found that the zeroth-order (Onsager) correction plus additional first-order corrections agrees very well with the ECIs obtained using inverse Monte Carlo methods (Reinhard and Moss, 1993; Masanskii et al., 1991). When a second-order correction was included, no difference was found between the ECIs from mean-field theory and inverse Monte Carlo, suggesting that the lengthy simulations involved in the inverse Monte Carlo techniques may be avoided (Le Bolloc'h et al., 1997). However, as warned before, the inverse mapping is not unique, so care must be taken when using such information. Notice that even in problem cases, improvements made to mean-field theories properly reflect most of the important physics, and can usually be handled more easily than more exacting approaches. What is important is that a mean-field treatment is not in itself patently inappropriate or wrong. It is, however, important to have included the correct physics for a given system, which is a system-specific requirement that usually cannot be known a priori. Hence, our choice is to try to handle chemical and electronic effects all on an equal footing, represented from a highly accurate density-functional basis.
Concentration Waves from First-Principles, Electronic-Structure Calculations

What remains to be done is to connect the formal derivation given above to the system-dependent electronic structure of the random substitutional alloy. In other words, we must choose an Ω̃, which we shall do in a mean-field approach based on the local density approximation (LDA) to electronic DFT (Kohn and Sham, 1965). In the adiabatic approximation, Ω̃_MF = ⟨Ω_e⟩, where Ω_e is the electronic grand potential of the electrons for a specific configuration (and where we have also lumped in the ion-ion contribution). To complete the formulation, a mean-field configurational average of Ω_e is required in analytic form, and it must depend on all sites in order to evaluate the functional derivatives analytically. Note that using a local density approximation to electronic DFT is also, in effect, a mean-field theory of the electronic degrees of freedom. So, even though they will be integrated out, the electronic degrees of freedom are all handled on a par with the configurational degrees of freedom contained in the noninteracting contribution to the chemical free energy. For binaries, Gyorffy and Stocks (1983) originally discussed the full adaptation of the above ideas and its
implementation, including only electronic band-energy contributions, based on Korringa-Kohn-Rostoker (KKR) coherent-potential-approximation (CPA) electronic-structure calculations. The KKR electronic-structure method (Korringa, 1947; Kohn and Rostoker, 1954) in conjunction with the CPA (Soven, 1967; Taylor, 1968) is now a well-proven mean-field theory for calculating electronic states and energetics in random alloys (e.g., Johnson et al., 1986, 1990). In particular, the ideas of Ducastelle and Gautier (1976) in the context of tight-binding theory were used to obtain ⟨Ω_e⟩ within an inhomogeneous version of the KKR-CPA, where all sites are distinct so that variational derivatives could be made. As shown by Johnson et al. (1986, 1990), the electronic grand potential for any alloy configuration may be written as

Ω_e = −∫_{−∞}^{∞} de N(e; m) f(e − m) + ∫_{−∞}^{m} dm′ ∫_{−∞}^{∞} de [dN(e; m′)/dm′] f(e − m′)   (24)
where the first term is the single-particle, or band-energy, contribution, which involves the local (per site) electronic densities of states n_i(e; m), and the second term properly gives the "double-counting" corrections. Here f(e − m) is the Fermi occupation factor arising from finite-temperature effects on the electronic chemical potential m (the Fermi energy at T = 0 K). Hence, the band-energy term contains all electron-hole effects due to electronic entropy, which may be very important in some high-T alloys. Here N(e; m) = Σ_i ∫_{−∞}^{e} de′ n_i(e′; m) is the integrated density of states as typically discussed in band-structure methods. We may obtain an analytic expression for Ω_e as long as an analytic expression for N(e; m) exists (Johnson et al., 1986, 1990). Within the CPA, an analytic expression for N(e; m) in either a homogeneously or an inhomogeneously disordered state is given by the generalized Lloyd formula (Faulkner and Stocks, 1980). Hence, we can determine Ω_e^CPA for an inhomogeneously random state. As with any DFT, besides the extrinsic variables T, V, and m (temperature, volume, and chemical potential, respectively), Ω_e^CPA is only a functional of the CPA charge densities {ρ_α,i} for all species and sites. In terms of KKR multiple-scattering theory, the inhomogeneous CPA is pictorially understood by "replacing" the individual atomic scatterers at the ith site (i.e., t_α,i) by a CPA effective scatterer per site (i.e., t_c,i) (Gyorffy and Stocks, 1983; Staunton et al., 1994). It is more appropriate to average scattering properties, rather than potentials, to determine a random system's properties (Soven, 1967; Taylor, 1968). For an array of CPA scatterers, τ_c,ii is the (site-diagonal) KKR scattering-path operator that describes the scattering of an electron from all sites, given that it starts and ends at the ith site.
The t_c values are determined from the requirement that replacing the effective scatterer by any of the constituent atomic scatterers (i.e., t_α,i) does not, on average, change the scattering properties of the entire system as given by τ_c,ii (Fig. 2). This
Figure 2. Schematic of the required average scattering condition, which determines the inhomogeneous CPA self-consistent equations. t_c and t_α are the site-dependent, single-site CPA and atomic scattering matrices, respectively, and τ_c is the KKR scattering-path operator describing the entire electronic scattering in the system.
requirement is expressed by a set of CPA conditions, Σ_α c_α,i τ_α,ii = τ_c,ii, one for each lattice site. Here, τ_α,ii is the site-diagonal scattering-path operator for an array of CPA scatterers with a single impurity of type α at the ith site (see Fig. 2), and it yields the required set of {ρ_α,i}. Notice that each of the CPA single-site scatterers can in principle be different (Gyorffy et al., 1989). Hence, the random state is inhomogeneous, and the scattering properties vary from site to site. As a consequence, we may relate any type of ordering (relative to the homogeneously disordered state) directly to electronic interactions or properties that lower the energy of a particular ordering wave. While these inhomogeneous equations are generally intractable, the inhomogeneous CPA must be considered in order to calculate analytically the response to variations of the local concentrations that determines S_ij^(2). This allows all possible configurations (or different possible site occupations) to be described. By using the homogeneous CPA as the reference, all possible orderings (wavevectors) may be compared simultaneously, just as with phonons in elemental systems (Pavone et al., 1996; Quong and Lui, 1997). The conventional single-site homogeneous CPA (used for total-energy calculations in random alloys; Johnson et al., 1990) provides a soluble, highest-symmetry reference state from which to perform a linear-response description of the inhomogeneous CPA theory. Those ideas have recently been extended and implemented to address multicomponent alloys (Althoff et al., 1995, 1996), although the initial calculations still include only terms involving the band energy (band-energy only, BEO). For binaries, Staunton et al. (1994) have worked out the details and implemented calculations of atomic short-range order that include all electronic contributions, e.g., electrostatic and exchange-correlation.
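The flavor of the single-site self-consistency condition can be conveyed with the standard scalar analogue of the CPA (a textbook toy, not the KKR matrix form): a binary alloy of site levels ±δ on a lattice with a semicircular model density of states. The coherent self-energy Σ(z) is iterated from the exact rearrangement of the average-t-matrix condition, and the result is checked against that condition; all numerical values here are illustrative assumptions.

```python
import numpy as np

c, delta = 0.5, 0.3          # concentration and half level-splitting (toy values)
eA, eB = +delta, -delta      # site energies of the two species
z = 0.2 + 0.05j              # complex energy just above the real axis

def G0(zeta):
    # Hilbert transform of a semicircular DOS of half-width 1:
    # G0(zeta) = 2 (zeta - sqrt(zeta^2 - 1)), branch chosen so G0 ~ 1/zeta.
    w = np.sqrt(zeta * zeta - 1.0)
    if w.imag * zeta.imag < 0:
        w = -w
    return 2.0 * (zeta - w)

# Scalar-CPA fixed point (exact rearrangement of <t> = 0 for a binary):
# Sigma = c eA + (1-c) eB - (eA - Sigma)(eB - Sigma) G0(z - Sigma)
sigma = c * eA + (1.0 - c) * eB
for _ in range(500):
    sigma_new = c * eA + (1.0 - c) * eB \
        - (eA - sigma) * (eB - sigma) * G0(z - sigma)
    if abs(sigma_new - sigma) < 1e-13:
        break
    sigma = sigma_new

# Check the CPA condition: the concentration-averaged single-site
# t matrix vanishes in the coherent medium.
Gc = G0(z - sigma)
tA = (eA - sigma) / (1.0 - (eA - sigma) * Gc)
tB = (eB - sigma) / (1.0 - (eB - sigma) * Gc)
print("residual |c tA + (1-c) tB| =", abs(c * tA + (1.0 - c) * tB))
```

The KKR-CPA condition quoted above, Σ_α c_α,i τ_α,ii = τ_c,ii, has the same logical structure, but with site- and angular-momentum-resolved scattering matrices in place of these scalars.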
The coupling of magnetic and chemical degrees of freedom has been addressed within this framework by Ling et al. (1995a,b) and references cited therein. The full DFT theory has thus far been applied mostly to binary alloys, with several successes (Staunton et al., 1990; Pinski et al., 1991, 1998; Johnson et al., 1994; Ling et al., 1995; Clark et al., 1995). The extension of the theory to incorporate atomic displacements from the average lattice is also ongoing, as is the inclusion of all terms beyond the band energy for multicomponent systems.
Due to unique properties of the CPA, the variational nature of DFT within the KKR-CPA is preserved, i.e., δΩ_e^CPA/δρ_α,i = 0 (Johnson et al., 1986, 1990). As a result, only the explicit concentration variations are required to obtain equations for S^(1), the change in the (local) chemical potentials, i.e.,

∂Ω_e^CPA/∂c_α,i = −(1 − δ_αN) Im ∫_{−∞}^{∞} de f(e − m_e) [N_α,i(e) − N_N,i(e)] + ⋯   (25)

Here, f(e − m) is the Fermi filling factor and m_e is the electronic chemical potential (Fermi energy at T = 0 K); N_α(e) is the CPA site-integrated density of states for the α species, and the Nth species has been designated the "host." The ellipsis refers to the remaining direct concentration variations of the Coulomb and exchange-correlation contributions to Ω_e^CPA; the term shown is the band-energy-only contribution (Staunton et al., 1994). This band-energy term is completely determined for each site by the change in band energy found on replacing the "host" species by an α species. Clearly, S^(1) is zero if the α species is the "host," because this cannot change the chemical potential. The second variation, for S^(2), is not so nice, because there are implicit changes to the charge densities (related to τ_c,ii and τ_α,ii) and to the electronic chemical potential m. Furthermore, these variations must be limited by global charge-neutrality requirements. These restricted variations, as well as other considerations, lead to dielectric screening effects and charge "rearrangement"-type terms, as well as standard Madelung-type energy changes (Staunton et al., 1994). At this point, we have (in principle) kept all terms contributing to the electronic grand potential, except for static displacements. However, in many instances, only the terms directly involving the underlying electronic structure predominantly determine the ordering tendencies (Ducastelle, 1991), as it is argued that screening and near-local charge neutrality make the double-counting terms negligible. This is in general not so, however, as discussed recently (Staunton et al., 1994; Johnson et al., 1994). For simplicity's sake, we consider here only the important details within the band-energy-only (BEO) approach, and only state the important differences for the more general case when necessary. Nonetheless, it is remarkable that the BEO contributions actually address a great many alloying effects that determine phase stability in many alloy systems, such as band filling (or electron-per-atom ratio, e/a; Hume-Rothery, 1963), hybridization (arising from diagonal and off-diagonal disorder), and so-called electronic topological transitions (Lifshitz, 1960), which encompass Fermi-surface nesting (Moss, 1969) and van Hove singularity (van Hove, 1953) effects. We shall give some real examples of such effects (how they physically come about) and of how they determine the ASRO, including in disordered fcc Cu-Ni-Zn (see Data Analysis and Initial Interpretation). Within the BEO approach, the expression for the band-energy part of S_αβ^(2)(q) is

S_αβ^(2)(q) = (1 − δ_αN)(1 − δ_βN) (1/π) Im ∫_{−∞}^{∞} de f(e − m) Σ_{L1 L2 L3 L4} M_α;L1L2(e) X_{L2L1;L3L4}(q; e) M_β;L3L4(e)   (26)
where L refers to the angular-momentum indices of the spherical-harmonic basis set (i.e., contributions from s, p, d, etc., type electrons in the alloy), and the matrix elements may be found in the referenced papers (multicomponent: Johnson, 2001; binaries: Gyorffy and Stocks, 1983; Staunton et al., 1994). This chemical "ordering energy," arising only from changes of the electronic structure away from the homogeneously random state, is associated with perturbing the concentrations on two sites. There are N(N − 1)/2 independent terms, as we expected. There is some additional q dependence resulting from the response of the CPA medium, which has been ignored here for simplicity in presenting this expression; ignoring it is the same approximation as is made in the generalized perturbation methods (Ducastelle and Gautier, 1976; Ducastelle, 1991). As the key result, the q dependence of the ordering typically arises mainly from the convolution of the electronic structure given by

X_{L2L1;L3L4}(q; e) = Ω_BZ^(-1) ∫_BZ dk τ_c;L2L3(k + q; e) τ_c;L4L1(k; e) − τ_c,ii;L2L3(e) τ_c,ii;L4L1(e)   (27)
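To build intuition for the convolution in Equation 27, one can replace the CPA path operator by a simple disorder-broadened Green's function for a model band and scan q. Everything in this toy sketch is an assumption for illustration, not the KKR-CPA object itself: a one-dimensional cosine band, a Lorentzian broadening Γ, and evaluation only at e = e_F. It shows the convolution developing its extremum at the nesting vector 2k_F, which is q = π at half filling.

```python
import numpy as np

N = 400
k = 2.0 * np.pi * np.arange(N) / N
eps = -np.cos(k)        # 1D tight-binding band; half filling puts e_F = 0
Gamma = 0.05            # disorder broadening (assumption)
eF = 0.0

g = 1.0 / (eF - eps + 1j * Gamma)   # toy stand-in for tau_c(k, e_F)
g_site = g.mean()                   # site-diagonal (BZ-averaged) element

def X(iq):
    # (1/N) sum_k g(k+q) g(k) - g_ii g_ii, cf. Eq. 27
    return np.mean(np.roll(g, -iq) * g) - g_site * g_site

Xq = np.array([X(i) for i in range(N)])

# Nesting: at half filling k_F = pi/2, so 2k_F = pi, and the convolution
# is extremal (most negative real part) exactly there.
i_peak = int(np.argmax(-Xq.real))
print("extremum at q =", 2.0 * np.pi * i_peak / N)
```

The extremum arises because for q = 2k_F both Green's-function factors are simultaneously resonant over a whole region of k, the discrete analogue of flat, parallel Fermi-surface sheets.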
which involves only changes to the CPA medium due to off-diagonal scattering terms. This is the difficult term to calculate. It is determined by the underlying electronic structure of the random alloy and must be calculated using electronic density functional theory. How various types of chemical ordering are revealed by S^(2)_αβ(q; ε) is discussed later (see Data Analysis and Initial Interpretation). However, it is sometimes helpful to relate the ordering directly to the electronic dispersion through the Bloch spectral functions, A_B(k; ε) ∝ Im t_c(k; ε) (Györffy and Stocks, 1983), where t_c and the configurationally averaged Green's functions and charge densities are also related (Faulkner and Stocks, 1980). The Bloch spectral function defines the average dispersion in the alloy system. For ordered alloys, A_B(k; ε) consists of delta functions in k-space wherever the dispersion relationship is satisfied, i.e., δ(ε − ε_k); these are the electronic ‘‘bands.’’ In a disordered alloy, these ‘‘bands’’ broaden and shift (in energy) due to disorder and alloying effects. The loci of peak positions at ε_F, if the widths of the peaks are small on the scale of the Brillouin zone dimension, define a ‘‘Fermi surface’’ in a disordered alloy. The widths reflect, for example, the inverse lifetimes of the electrons, determining such quantities as resistivity (Nicholson and Brown, 1993). Thus, if only electronic states near the Fermi surface play the dominant role in determining the ordering tendency from the convolution integral, the reader can already imagine how
COMPUTATION AND THEORETICAL METHODS
Fermi-surface nesting gives a large convolution from flat and parallel portions of electronic states, as detailed later. Notably, the species- and energy-dependent matrix elements in Equation 26 can be very important, as discussed later for the case of NiPt. To appreciate how band-filling effects (as opposed to Fermi-surface-related effects) are typically expected to affect the ordering in an alloy, it is useful to summarize as follows. In general, the band-energy-only part of S^(2)_αβ(q; ε) is derived from the filling of the electronic states and harbors, for example, the Hume-Rothery electron-per-atom rules (Hume-Rothery, 1963). From an analysis using tight-binding theory, Ducastelle and others (e.g., Ducastelle, 1992) have shown what ordering is to be expected in various limiting cases where the transition-metal alloys can be characterized by diagonal disorder (i.e., the difference between on-site energies is large) and off-diagonal disorder (i.e., the constituent metals have different d-band widths). The standard lore in alloy theory is then as follows: if the d band is either nearly full or nearly empty, then S^(2)_Band(q) peaks at |q| = 0 and the system clusters. On the other hand, if the bands are roughly half-filled, then S^(2)_Band(q) peaks at finite |q| values and the system orders. For systems with the d band nearly filled, the system is filling antibonding-type states unfavorable to order, whereas the half-filled band would have the bonding-type states filled and the antibonding-type states empty, favoring order (this is very similar to the ideas learned from molecular bonding, applied to a continuum of states). Many alloys can have their ordering explained on this basis. However, this simple lore is inapplicable for alloys with substantial off-diagonal disorder, as recently discussed by Pinski et al. (1991, 1998), and as explained below (see Data Analysis and Initial Interpretation).
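A deliberately minimal illustration of this lore (a sketch under assumed parameters, not a KKR-CPA or generalized-perturbation-method calculation) is a 1D tight-binding chain with diagonal disorder only: compare the band energy of an ordered ...ABAB... chain against phase-separated A and B chains as a function of the electron count.

```python
import numpy as np

def chain_H(onsite, t=1.0):
    # periodic 1D tight-binding Hamiltonian with hopping -t and given on-site levels
    n = len(onsite)
    H = np.diag(np.asarray(onsite, dtype=float))
    for i in range(n):
        H[i, (i + 1) % n] = H[(i + 1) % n, i] = -t
    return H

N, delta = 64, 0.8   # sites per chain and half the on-site splitting (assumed values)

# phase-separated reference: spectra of a pure A chain and a pure B chain, pooled
eSeg = np.sort(np.concatenate([np.linalg.eigvalsh(chain_H([+delta] * N)),
                               np.linalg.eigvalsh(chain_H([-delta] * N))]))
# ordered ...ABAB... chain with the same total number of sites (2N)
eOrd = np.linalg.eigvalsh(chain_H([+delta, -delta] * N))

def dE(n_electrons):
    # band-energy difference, ordered minus phase separated, at fixed electron count
    return eOrd[:n_electrons].sum() - eSeg[:n_electrons].sum()

print("near-empty band, dE =", round(dE(len(eOrd) // 8), 3))   # > 0: clustering wins
print("half-filled band, dE =", round(dE(len(eOrd) // 2), 3))  # < 0: ordering wins
```

With the band nearly empty, the lowest states of the segregated B chain lie below the band bottom of the ordered chain, so clustering wins; at half filling, the gap opened by the alternating on-site levels lowers the occupied states of the ordered chain, so ordering wins, in line with the lore quoted above.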
While the ‘‘charge effects’’ are important to include as well (Mott, 1937), let us mention the overall gist of what is found (Staunton et al., 1994). There is a ‘‘charge-rearrangement’’ term that follows from implicit variations of the charge on site i with the concentration on site j, which represents a dielectric response of the CPA medium. In addition, charge density-charge density variations lead to Madelung-type energies. Thus, S^(2)_total(q) = S^(2)_{c,c}(q) + S^(2)_{c,r}(q) + S^(2)_{r,r}(q). The additional terms also affect the Onsager corrections discussed above (Staunton et al., 1994). Importantly, the density of states at the Fermi energy reflects the number of electrons available in the metal to screen excess charges coming from the solute atoms, as well as local fluctuations in the atomic densities due to the local environments (see Data Analysis and Initial Interpretation). In a binary alloy, e.g., where there is a large density of states at the Fermi energy (ε_F), S^(2) reduces mainly to a screened Coulomb term (Staunton et al., 1994), which determines the Madelung-like effects. In addition, the major q dependence arises from the excess charge at the ion positions via the Fourier transform (FT) of the Coulomb potential, C(q) = FT |R_i − R_j|^(−1):

S^(2)(q) ≈ S^(2)_{c,c}(q) − e² Q² [C(q) − R_nn^(−1)] / (1 + λ²_scr [C(q) − R_nn^(−1)])    (28)
where Q = q_A − q_B is the difference in average excess charge (in units of e, the electron charge) on a site in the homogeneous alloy, as determined by the self-consistent KKR-CPA. The average excess charge is q_{α,i} = Z_α − ∫_cell dr ρ_{α,i}(r) (with Z_α the atomic number on the site). Here, λ_scr is the system-dependent, metallic screening length. The nearest-neighbor distance, R_nn, arises from charge correlations with the local environment (Pinski et al., 1998), a possibly important intersite electrostatic energy within metallic systems previously absent in CPA-based calculations; it essentially reflects the fact that the disordered alloy already contains a large amount of electrostatic (Madelung) energy (Cole, 1997). The proper (or approximate) physical description and the importance of ‘‘charge correlations’’ for the formation energetics of random alloys have been investigated by numerous approaches, including simple models (Magri et al., 1990; Wolverton and Zunger, 1995b), CPA-based electronic-structure calculations (Abrikosov et al., 1992; Johnson and Pinski, 1993; Korzhavyi et al., 1995), and large supercell calculations (Faulkner et al., 1997), to name but a few. The sum of the above investigations reveals that for disordered and partially ordered metallic alloys, these atomic (local) charge correlations may be reasonably represented by a single-site theory, such as the coherent potential approximation. Including only the average effect of the charges on the nearest-neighbor shell (as in Equation 28) has been shown to be sufficient to determine the energy of formation in metallic systems (Johnson and Pinski, 1993; Korzhavyi et al., 1995; Ruban et al., 1995), with only minor differences between the various approaches that are not of concern here.
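The shape of the screened term in Equation 28 is easy to explore numerically. The following is a schematic sketch only: it uses the continuum form C(q) = 4π/q², units of e = 1, and assumed values for Q, R_nn, and λ_scr that are not taken from the references.

```python
import numpy as np

# Illustrative, assumed parameters (units e = 1); C(q) = 4*pi/q**2 stands in
# for the lattice Fourier transform FT|Ri - Rj|^-1 of the Coulomb interaction.
Q, Rnn, lam_scr = 0.1, 2.7, 1.5

def S2_coul(q):
    x = 4.0 * np.pi / q**2 - 1.0 / Rnn          # the kernel [C(q) - 1/Rnn]
    return Q**2 * x / (1.0 + lam_scr**2 * x)    # screened form of Eq. 28

for qi in (1e-4, 0.5, 1.0, 2.0):
    print(f"q = {qi}: {S2_coul(qi):.6f}")
```

The unscreened Madelung kernel diverges as 4πQ²/q² for q → 0, while the screened form saturates at Q²/λ²_scr; this is the sense in which the screening length caps the electrostatic contribution.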
Below (see Data Analysis and Initial Interpretation) we discuss the effect of incorporating such charge correlations into the concentration-wave approach for calculating the ASRO in random substitutional alloys (specifically fcc NiPt; Pinski et al., 1998).

DATA ANALYSIS AND INITIAL INTERPRETATION

Hybridization and Charge Correlation Effects in NiPt

The alloy NiPt, with its d band almost filled, is an interesting case because it stands as a glaring exception to the traditional band-filling argument from tight-binding theory (Treglia and Ducastelle, 1987): a transition-metal alloy should cluster, i.e., phase separate, if the Fermi energy lies near either d-band edge. In fact, NiPt strongly orders in the CuAu (or ⟨100⟩-based) structure, with its phase diagram more like an fcc prototype (Massalski et al., 1990). Because Ni and Pt are in the same column of the periodic table, it is reasonable to assume that upon alloying there should be little effect from electrostatics, and only the change in the band energy should really govern the ordering. Under such an assumption, a tight-binding calculation based on average off-diagonal matrix elements reveals that no ordering is possible (Treglia and Ducastelle, 1987). Such a band-energy-only calculation of the ASRO in NiPt was, in fact, one of the first applications of our thermodynamic linear-response approach based on the CPA (Pinski et al., 1991, 1992), and it gave almost quantitative
COMPUTATION OF DIFFUSE INTENSITIES IN ALLOYS
Table 2. The Calculated k0 (in Units of 2π/a, Where a is the Lattice Constant) and Tsp (in K) for fcc Disordered NiPt (Including Scalar-Relativistic Effects) at Various Levels of Approximation Using the Standard KKR-CPA (Johnson et al., 1986) and a Mean-Field, Charge-Correlated KKR-CPA (Johnson and Pinski, 1993), Labeled scr-KKR-CPA (Pinski et al., 1998)

              BEO^a            BEO + Onsager^b   BEO + Coulomb^c   BEO + Coulomb + Onsager^d
Method        k0      Tsp      k0      Tsp       k0      Tsp       k0         Tsp
KKR-CPA       (100)   1080     (100)   780       (100)   6780      (½ ½ ½)    1045
scr-KKR-CPA   (100)   1110     (100)   810       (100)   3980      (100)       905

a Band-energy-only (BEO) results.
b BEO plus Onsager corrections.
c Results including the charge-rearrangement effects associated with short-range ordering.
d Results of the full theory. Experimentally, NiPt has a Tc of 918 K (Massalski et al., 1990).
agreement with experiment. However, our more complete theory of ASRO (Staunton et al., 1994), which includes Madelung-type electrostatic effects, dielectric effects due to rearrangement of charge, and Onsager corrections, yielded results for the transition temperature and unstable wavevector in NiPt that were simply wrong, whereas for many other systems we obtained very good results (Johnson et al., 1994). By incorporating the previously described screening contributions into the calculation of the ASRO in NiPt (Pinski et al., 1998), the wave vector and transition temperature were found to be in exceptional agreement with experiment, as evidenced in Table 2. In the experimental diffuse scattering on NiPt, a (100) ordering wave vector was found, which is indicative of CuAu (L10)-type short-range order (Dahmani et al., 1985), with a first-order transition temperature of 918 K. From Table 2, we see that the improved screened (scr-)KKR-CPA yields a (100) ordering wave vector with a spinodal temperature of 905 K. If only the band-energy contributions are considered, for either the KKR-CPA or its screened version, the wave vector is the same and the spinodal temperature is about 1100 K (without the Onsager corrections). Essentially, the BEO approximation captures most of the physics, as was anticipated from Ni and Pt being in the same column of the periodic table. What is also clear is that the KKR-CPA, which contains a much larger Coulomb contribution, necessarily has a much larger spinodal temperature (Tsp) before the Onsager correction is included. While the Onsager correction therefore must be very large to conserve spectral intensity, it is, in fact, the dielectric effects incorporated into the Onsager corrections that are trying to reduce such a large electrostatic contribution, and they change the wave vector into disagreement with experiment, i.e., q = (½ ½ ½), even though Tsp remains fairly good at 1045 K.
The effect of the screening contributions to the electrostatic energy (as found in Equation 28) is to reduce significantly the effect of the Madelung energy (Tsp is reduced by 40% before Onsager corrections); therefore, the dielectric effects are not as significant and do not change the wave vector. Ultimately, the origin of the ASRO reduces back to what happens in the band-energy-only situation, for it is predominantly describing the ordering and
temperature dependence, and most of the electrostatic effects cancel one another. The large electronic density of states at the Fermi level (Fig. 3) is also important, for it is those electrons that contribute to the screening and the dielectric response. What remains is to explain why fcc NiPt prefers to order with q = (100) periodicity. Lu et al. (1993) stated that relativistic effects induce the chemical ordering in NiPt. Their work showed that relativistic effects (specifically, the Darwin and mass-velocity terms) lead to a contraction of the s states, which stabilized both the disordered and ordered phases relative to phase separation, but it did not explain the origin of the chemical ordering. As marked in the electronic density of states (DOS) for disordered NiPt in Figure 3 (heavy, hatched lines), there is a large number of low-energy states below the Ni-based d band that arise from hybridization with the Pt sites. These d states are of t2g symmetry, whose lobes point toward the nearest-neighbor sites of an fcc lattice. Therefore, the system can lower its energy by modulating itself with a (100) periodicity to create many such favorable (low-energy, d-type) bonds between nearest-neighbor Ni and Pt sites. This basic explanation was originally given by Pinski et al. (1991, 1992).
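The bond-counting part of this argument can be verified directly. A short sketch (lattice constant set to 1; the ±1 species labels are illustrative, not a DFT result) modulates an equiatomic fcc lattice with a single (100) concentration wave, which at full order is the L10 arrangement, and counts unlike nearest-neighbor bonds:

```python
import numpy as np
from itertools import product

# the 12 fcc nearest-neighbor vectors, of the form (+-1/2, +-1/2, 0) and permutations
nn = [np.array(p) for p in product((-0.5, 0.0, 0.5), repeat=3)
      if sorted(abs(x) for x in p) == [0.0, 0.5, 0.5]]

def species(R):
    # occupation wave n(R) = 1/2 + (eta/2) cos(2*pi*x); at eta = 1 the sign of
    # cos(2*pi*x) labels the two sublattices (say +1 = Pt-rich, -1 = Ni-rich)
    return 1 if np.cos(2.0 * np.pi * R[0]) > 0 else -1

origin = np.zeros(3)
unlike = sum(species(origin + d) != species(origin) for d in nn)
print("unlike nearest-neighbor bonds out of 12:", unlike)
```

Eight of the twelve nearest neighbors become unlike pairs, versus an average of six in the random equiatomic alloy; that excess of favorable Ni-Pt bonds is the energy-lowering effect described above.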
Figure 3. The calculated scr-KKR-CPA electronic density of states (states/Ry-atom) versus energy (Ry) for scalar-relativistic, disordered Ni50Pt50. The hybridized d states of t2g symmetry, created by an electronic size effect related to the difference in electronic bandwidths between Ni and Pt, are marked by thick, hatched lines. The apparent pinning of the density of states at the Fermi level for Ni and Pt reflects the fact that the two elements fall in the same column of the periodic table, so there is effectively no ‘‘charge transfer’’ from electronegativity effects.
Pinski et al. (1991, 1992) pointed out that this hybridization effect arises from what amounts to an electronic ‘‘size effect’’ related to the difference in bandwidths between Ni (small atom, narrow band) and Pt (large atom, wide band), which is related to off-diagonal disorder in tight-binding theory. The lattice constant of the alloy plays a role in that it is smaller (or larger) than that of Pt (or Ni), which further increases (decreases) the bandwidths, thereby further improving the hybridization. Because Ni and Pt are in the same column of the periodic table, the Fermi level of the Ni and Pt d bands is effectively pinned, which greatly affects this hybridization phenomenon. See Pinski et al. (1991, 1992) for a more complete treatment. It is noteworthy that in metallic NiPt the ordering originates from effects that lie well below the Fermi level. Therefore, the usual ideas invoked to explain ordering in substitutional metallic alloys (e/a effects, Fermi-surface nesting, or the filling of (anti-)bonding states, that is, effects due entirely to the electrons around the Fermi level) should not be considered ‘‘cast in stone.’’ The real world is much more interesting! In hindsight, this also explains the failure of tight binding for NiPt: because off-diagonal disorder is important for Ni-Pt, it must be well described; that is, those matrix elements must not be approximated by the usual procedures. In effect, some system-dependent information about the alloying and hybridization must be included when establishing the tight-binding parameters.

Coupling of Magnetic Effects and Chemical Order

This hybridization (electronic ‘‘size’’) effect that gives rise to (100) ordering in NiPt is actually more ubiquitous than one may at first imagine. For example, the observed q = (1 ½ 0), or Ni4Mo-type, short-range order in paramagnetic, disordered AuFe alloys that have been fast-quenched from high temperature results partially from such an effect (Ling et al., 1995b).
In paramagnetic, disordered AuFe, two types of disorder (chemical and magnetic) must be described simultaneously [this interplay is predicted to allow changes to the ASRO through magnetic annealing (Ling et al., 1995b)]. For paramagnetic, disordered AuFe alloys, the important point in the present context is that a competition arises between an electronic band-filling (or e/a) effect, which gives a clustering, or q = (000)-type, ASRO, and the stronger hybridization effect, which gives a q = (100) ASRO. The competition between clustering and ordering arises from the effects of the magnetism (Ling et al., 1995b). Essentially, the large exchange splitting between the Fe majority and minority d-band densities of states results in the majority states being fully populated (i.e., they lie below the Fermi level), whereas the Fermi level ends up in a peak of the minority d-band DOS (Fig. 4). Recall from the usual band-filling-type arguments that filling bonding-type states favors chemical ordering, while filling antibonding-type states opposes chemical ordering (i.e., favors clustering). Hence, the hybridization ‘‘bonding states’’ created below the Fe d band by interaction with the wider-band Au (just as in NiPt) promote ordering (Fig. 4), whereas the band filling of the minority
Figure 4. A schematic drawing of the electronic density of states (states/Ry-atom) versus energy (Ry) for scalar-relativistic, chemically disordered, and magnetically disordered (i.e., paramagnetic) Au75Fe25, using the CPA to configurationally average over both chemical and magnetic degrees of freedom. This represents the ‘‘local’’ density of states (DOS) for a site with its magnetization along the local z axis (indicated by the heavy vertical arrow). Due to magnetic disorder, there are equivalent DOS contributions from the −z direction, obtained by reflecting the DOS about the horizontal axis, as well as from the remaining orientations over 4π. As with NiPt, the hybridized d states of t2g symmetry are marked by hatched lines for both majority (↑) and minority (↓) electron states.
d band (which behaves as ‘‘antibonding’’ states because of the exchange splitting) promotes clustering, with a compromise to (1 ½ 0) ordering. In the calculation, this interpretation is easily verified by altering the band filling, or e/a, in a rigid-band sense. As the Fermi level is lowered below the exchange-split minority Fe peak in Figure 4, the calculated ASRO rapidly becomes (100)-type, simply because the unfavorable antibonding states are being depopulated. Charge-correlation effects that were important for Ni-Pt are irrelevant for AuFe. By ‘‘magnetically annealing’’ the high-temperature AuFe in a magnetic field, one can utilize this electronic interplay to alter the ASRO to ⟨100⟩.

Multicomponent Alloys: Fermi-Surface Nesting, van Hove Singularities, and e/a in fcc Cu-Ni-Zn

Broadly speaking, the ordering in the related fcc binaries of Cu-Ni-Zn might be classified according to their phase diagrams (Massalski et al., 1990) as strongly ordering in NiZn, weakly ordering in CuZn, and clustering in CuNi. Perhaps, then, it is no surprise that the phase diagram of Cu-Ni-Zn alloys (Thomas, 1972) reflects this, with clustering in Zn-poor regions, K-state effects (e.g., reduced
resistance with cold working), ⟨100⟩ short-range (Hashimoto et al., 1985) and long-range order (van der Wegen et al., 1981), as well as (1 ¼ 0) (or DO23-type) ASRO (Reinhard et al., 1990), and incommensurate-type ordering in the Ni-poor region. Hashimoto et al. (1985) have shown that the three Warren-Cowley pair parameters for Cu2NiZn reflect the above ordering tendencies of the binaries, with strong ⟨100⟩-type ASRO in the Ni-Zn channel and no fourfold diffuse scattering patterns, as is common in noble-metal-based alloys. Along with the transmission electron microscopy results of van der Wegen et al. (1981), which also suggest ⟨100⟩-type long-range order, it was assumed that Fermi-surface nesting, which is directly related to the geometry of the Fermi surface and has long been known to produce fourfold diffuse patterns in the ASRO, is not operative in this system. However, the absence of fourfold diffuse patterns in the ASRO, while necessary, is not sufficient to establish the nonexistence of Fermi-surface nesting (Althoff et al., 1995, 1996). Briefly stated, and most remarkably, Fermi-surface effects (due to nesting and van Hove states) are found to be responsible for all the commensurate and incommensurate ASRO found in the Cu-rich, fcc ternary phase field. However, a simple interpretation based solely on the e/a ratio (Hume-Rothery, 1963) is not possible, because of the added complexity of disorder broadening of the electronic states and because both composition and e/a may be varied independently in a ternary system, unlike in binary systems. Even though Fermi-surface nesting is operative, which is traditionally said to produce a fourfold incommensurate peak in the ASRO, a [100]-type ASRO is found over an extensive composition range for the ternary, which indicates an important dependence of the nesting wavevector on e/a and disorder.
In the random state, the broadening of the alloy's Fermi surface by the disorder results in certain types of ASRO being stronger or persisting over wider ranges of e/a than one would determine from sharp Fermi surfaces. For fcc Cu-Ni-Zn, the electronic states near the Fermi energy, ε_F, play the dominant role in determining the ordering tendency found from S^(2)(q) (Althoff et al., 1995, 1996). In such a case, it is instructive to interpret (not calculate) S^(2)(q) in terms of the convolution of Bloch spectral functions A_B(k; ε) (Györffy and Stocks, 1983). The Bloch spectral function defines the average dispersion in the system, and A_B(k; ε) ∝ Im t_c(k; ε). As mentioned earlier, for ordered alloys A_B(k; ε) consists of delta functions in k-space wherever the dispersion relationship is satisfied, i.e., δ(ε − ε_k); these are the electronic ‘‘bands.’’ In a disordered alloy, these ‘‘bands’’ broaden and shift (in energy) due to disorder and alloying effects. The loci of peak positions at ε_F, if the widths of the peaks are small on the scale of the Brillouin zone dimension, define a ‘‘Fermi surface’’ in a disordered alloy. Provided that k-, energy-, and species-dependent matrix elements can be roughly neglected in S^(2)(q), and that only the energies near ε_F are pertinent because of the Fermi factor (NiPt was a counterexample to all this), then the q-dependent portion of S^(2)(q) is proportional to a convolution of the spectral density of states at
Figure 5. The Cu-Ni-Zn Gibbs triangle in atomic percent. The dotted line is the Cu isoelectronic line. The ASRO is designated as follows: squares, ⟨100⟩ ASRO; circles, incommensurate ASRO; hexagons, clustering, or (000), ASRO. The additional line marked ⟨100⟩-vH establishes roughly where the fcc Fermi surface of the alloys has spectral weight (due to van Hove singularities) at the ⟨100⟩ zone boundaries, suggesting that bcc is approaching fcc in energy. For fcc CuZn, this occurs at 40% Zn, close to the maximum solubility limit of 38% Zn before the transformation to bcc CuZn. Beyond this line, a more careful determination of the electronic free energy is required to determine fcc or bcc stability.
ε_F (the Fermi surface; Györffy and Stocks, 1983; Györffy et al., 1989), i.e.:

S^(2)_αβ(q) ∝ ∫ dk A_B(k; ε_F) A_B(k + q; ε_F)    (29)
With the Fermi-surface topology playing the dominant role, ordering peaks in S^(2)(q) can arise from states around ε_F in two ways: (1) from a spanning vector that connects parallel, flat sheets of the Fermi surface to give a large convolution (so-called Fermi-surface nesting; Györffy and Stocks, 1983), or (2) from a spanning vector that promotes a large joint density of states by convolving points where van Hove singularities (van Hove, 1953) occur in the band structure at or near ε_F (Clark et al., 1995). For fcc Cu-Ni-Zn, both of these Fermi-surface-related phenomena are operative, and they are an ordering analog of a Peierls transition. A synopsis of the calculated ASRO is given in Figure 5 for the Gibbs triangle of fcc Cu-Ni-Zn in atomic percent. All the trends observed experimentally are completely reproduced: Zn-poor Cu-Ni-Zn alloys and Cu-Ni binary alloys show clustering-type ASRO; along the line Cu_{0.50+x}Ni_{0.25−x}Zn_{0.25} (the dashed line in the figure), Cu75Zn25 shows (1 ¼ 0)-type ASRO, which changes to commensurate (100)-type at Cu2NiZn, and then to fully incommensurate around CuNi2Zn, where the K-state effects are observed. K-state effects have been tied to the short-range order (Nicholson and Brown, 1993). Most interestingly, a large
Figure 6. The Fermi surface, or A_B(k; ε_F), in the {100} plane of the first Brillouin zone for fcc alloys with a lattice constant of 6.80 a.u.: (A) Cu75Zn25, (B) Cu25Ni25Zn50, (C) Cu50Ni25Zn25, and (D) Ni50Zn50. Note that (A) and (B) have e/a = 1.25 and (C) and (D) have e/a = 1.00. As such, the caliper dimensions of the Fermi surface, as measured from peak to peak (and typically referred to as ‘‘2kF’’), are identical for the two pairs. The widths change due to increased disorder: NiZn has the greatest difference between scattering properties and therefore the largest widths. In the lower left quadrant of (A) are the fourfold diffuse spots that occur due to nesting. The fourfold diffuse spots may be obtained graphically by drawing circles (actually spheres) of radius ‘‘2kF’’ from all equivalent Γ points and finding the common intersection of such circles along the X-W-X high-symmetry lines.
region of (100)-type ordering is calculated around the Cu isoelectronic line (the dotted line in the figure), as is observed (Thomas, 1972). The Fermi surface of Cu75Zn25 in the {100} plane is shown in Figure 6, part A, and is reminiscent of the Cu-like ‘‘belly’’ in this plane. The caliper dimension, or so-called ‘‘2kF,’’ of the Fermi surface in the [110] direction is marked; it is measured peak to peak and determines the nesting wavevector. It should be noted that perpendicular to this plane (the [001] direction) this rather flat portion of the Fermi surface continues to be rather planar, which contributes additionally to the convolution in Equation 29 (Althoff et al., 1996). In the lower left quadrant of Figure 6, part A, are the fourfold diffuse spots that occur due to the nesting. As shown in Figure 6, parts C and D, the caliper dimensions of the Fermi surface in the {100} plane are the same along the Cu isoelectronic line (i.e., constant e/a = 1.00). For NiZn and Cu2NiZn, this ‘‘2kF’’ gives a (100)-type ASRO because its magnitude matches the k = |(000) − (110)|, or X, distance perfectly. The spectral widths change due to increased disorder. NiZn
has the greatest difference between scattering properties and therefore the largest widths (see Fig. 6). The increasing disorder with decreasing Cu actually helps improve the convolution of the spectral density of states, Equation 29, and strengthens the ordering, as evidenced experimentally by the phase-transformation temperatures (Massalski et al., 1990). As one moves off this isoelectronic line, the caliper dimensions change and an incommensurate ASRO is found, as with Cu75Zn25 and CuNi2Zn (see Fig. 6, parts A and B). As Zn is added, van Hove states (van Hove, 1953) eventually appear at the (100), or X, points (see Fig. 6, part D) because of symmetry requirements on the electronic states at the Brillouin zone boundaries. These van Hove states create a larger convolution integral favoring (100) order over incommensurate order. For Cu50Zn50, one of the weaker ordering cases, a competition with temperature is found between spanning vectors arising from Fermi-surface nesting and from van Hove states (Althoff et al., 1996). For compositions such as CuNiZn2, the larger disorder broadening and the increase in van Hove states make the (100) ASRO dominant. It is interesting to note that the appearance of van Hove states at (100) points, such as for Cu60Zn40, where Zn has a maximum solubility of 38.5% experimentally (Thomas, 1972; Massalski et al., 1990), occurs as a precursor to the observed fcc-to-bcc transformations (see the rough sketch in the Gibbs triangle; Fig. 5). A detailed discussion clarifying this correlation, in terms of the effect of Brillouin zone boundaries on the energy difference between fcc and bcc Cu-Zn, has been given recently (Paxton et al., 1997). Thus, all the incommensurate and commensurate ordering can be explained in terms of Fermi-surface mechanisms that had been dismissed experimentally as a possibility due to the absence of fourfold diffuse scattering spots.
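The convolution of Equation 29 can be mimicked with a toy spectral function (a 1D Lorentzian-broadened band with assumed parameters, not a calculated Bloch spectral function). Away from the trivial autocorrelation maximum at q = 0, the largest overlap occurs at the spanning vector 2kF connecting the two Fermi points:

```python
import numpy as np

t, Gamma, eF = 1.0, 0.05, -1.0              # hopping, disorder width, Fermi level (assumed)
k = np.linspace(-np.pi, np.pi, 400, endpoint=False)

def A(kv, eps):
    # Lorentzian-broadened stand-in for the Bloch spectral function A_B(k; eps)
    ek = -2.0 * t * np.cos(kv)
    return (Gamma / np.pi) / ((eps - ek) ** 2 + Gamma ** 2)

# S(q) ~ (1/N_k) sum_k A(k; eF) A(k + q; eF), the discretized Eq. 29
q = np.linspace(0.0, np.pi, 201)
S = np.array([np.mean(A(k, eF) * A(k + qi, eF)) for qi in q])

# exclude the q ~ 0 autocorrelation maximum and locate the nesting peak
mask = q > 0.3
print("secondary peak at q/pi =", q[mask][np.argmax(S[mask])] / np.pi)
```

Here cos kF = 1/2, so the Fermi points sit at ±π/3 and the secondary peak appears near the spanning vector 2kF = 2π/3.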
Also, disorder broadening in the random alloy plays a role, in that it actually helps the ordering tendency by improving the (100) nesting features. The calculated Tsp and other details may be found in Althoff et al. (1996). This highlights one of the important roles of theory: to determine the underlying electronic mechanism(s) responsible for order and to make predictions that can be verified by experiment.

Polarization of the Ordering Wave in Cu2NiZn

As we have already discussed, a ternary alloy like fcc ZnNiCu2 does not possess the A-B symmetry of a binary; the analysis is therefore more complicated, because the concentration waves have ‘‘polarization’’ degrees of freedom, requiring more information from experiment or theory. In this case, the extra degree of freedom introduced by the third component also leads to additional ordering transitions at lower temperatures. These polarizations (as well as the unstable wavevector) are determined by the electronic interactions; they also determine the sublattice occupations that are (potentially) made inequivalent in the partially ordered state (Althoff et al., 1996). The relevant star of the k0 = ⟨100⟩ ASRO, comprising the (100), (010), and (001) vectors, found for ZnNiCu2 is a precursor to the partially ordered state, which may be determined
approximately from the eigenvectors of q⁻¹(k0), as discussed previously (see Principles of the Method). We have written the alloy in this way because Cu has been arbitrarily taken as the ‘‘host.’’ If the eigenvectors are normalized, then there is but one parameter that describes the eigenvectors of F in the Cartesian or Gibbsian coordinates, which can be written:
e₁(k0) = (e₁^Zn(k0), e₁^Ni(k0)) = (sin θ_k0, cos θ_k0);   e₂(k0) = (e₂^Zn(k0), e₂^Ni(k0)) = (cos θ_k0, −sin θ_k0)    (30)
If θ_k is taken as the parameter in the Cartesian space, then in the Gibbs space the eigenvectors are appropriate linear combinations involving θ_k. The ‘‘angle’’ θ_k is fully determined by the electronic interactions and plays the role of a ‘‘polarization angle’’ for the concentration wave with wavevector k0. Details are fully presented in the appendix of Althoff et al. (1996). However, the lowest-energy concentration mode in Gibbs space at T = 1000 K for k0 = ⟨100⟩ is given by e^Zn = 1.0517 and e^Ni = 0.9387, where one calculates Tsp = 985 K, including Onsager corrections (experimental Tc = 774 K; Massalski et al., 1990). For a ternary alloy, the matrices are of rank 2 owing to the two independent degrees of freedom. Therefore, there are two possible order parameters and, hence, two possible transitions as the temperature is lowered (based on our knowledge from high-T information). For the partially ordered state, the long-range order parameter associated with the higher-energy mode can be set to zero. Using this information in Equation 9, as discussed by Althoff et al. (1996), produces the atomic distribution in real space for the partially ordered state given in Table 3. Clearly, there is already a trend toward a tetragonal, L10-like state with Zn enhanced on the cube corners, as observed (van der Wegen et al., 1981) in the low-temperature, fully ordered state (where Zn is on the fcc cube corners, Cu occupies the faces in the central plane, and Ni occupies the faces in the Zn planes). However, there is still disorder on all the sublattices. The ⟨100⟩ wave has broken the disordered fcc cube into a four-sublattice structure, with two sublattices degenerate by symmetry. Unfortunately, the partially ordered state assessed from TEM measurements (van der Wegen et al., 1981) suggests that it is L12-like, with Cu/Ni disordered on all the cube faces and predominantly Zn on the fcc corners.
Interestingly, if domains of the calculated L10 state occur with an equal distribution of tetragonal axes, then a state with L12 symmetry is produced, similar to that supposed by TEM. Also, because the discussion is based on the stability of the high-temperature disordered state, the temperature of the second transition cannot be gleaned from the eigenvalues directly. However, a simple estimate can be made. Before any sublattice has a negative occupation value, which occurs for η = 0.49 (see Ni in Table 3), the second long-range-order parameter must become finite and the second mode becomes accessible. As the transition temperature is roughly proportional to E_order, or η², then T^II = (1 − η²)T^I (assuming that the Landau coefficients are the same). Therefore, T^II/T^I = 0.76, which is close to the experimental value of 0.80 (i.e., 623 K/774 K)
Table 3. Site-Occupation Probabilities for the Partially Ordered State^a

Sublattice         Zn                 Ni                 Cu
1: Zn rich         0.25 + 0.570η(T)   0.25 - 0.510η(T)   0.50 - 0.060η(T)
2: Ni rich         0.25 - 0.480η(T)   0.25 + 0.430η(T)   0.50 + 0.050η(T)
3 and 4: Random    0.25 - 0.045η(T)   0.25 + 0.040η(T)   0.50 + 0.005η(T)

^a η is the long-range-order parameter, where 0 ≤ η ≤ 1. Values were obtained from Althoff et al. (1996).
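The occupations in Table 3 are linear in η; a minimal Python sketch (the dictionary layout and names are illustrative, the coefficients are those of the table) that checks the probabilities stay normalized and locates the η at which an occupation first vanishes:

```python
# Sublattice occupations for partially ordered Cu2NiZn as linear functions of
# the long-range-order parameter eta; coefficients from Table 3
# (Althoff et al., 1996). The dictionary layout and names are illustrative.

TABLE3 = {  # sublattice -> {species: (constant, slope)}
    "1: Zn rich":      {"Zn": (0.25,  0.570), "Ni": (0.25, -0.510), "Cu": (0.50, -0.060)},
    "2: Ni rich":      {"Zn": (0.25, -0.480), "Ni": (0.25,  0.430), "Cu": (0.50,  0.050)},
    "3 and 4: Random": {"Zn": (0.25, -0.045), "Ni": (0.25,  0.040), "Cu": (0.50,  0.005)},
}

def occupation(sublattice, species, eta):
    c, m = TABLE3[sublattice][species]
    return c + m * eta

# Each sublattice's probabilities must sum to 1 for any eta ...
for eta in (0.0, 0.25, 0.49):
    for sub in TABLE3:
        assert abs(sum(occupation(sub, sp, eta) for sp in TABLE3[sub]) - 1.0) < 1e-9

# ... but they remain non-negative only up to eta = 0.25/0.510, where Ni on
# the Zn-rich sublattice vanishes and the second mode must become active.
eta_max = 0.25 / 0.510
print(f"Ni on sublattice 1 vanishes at eta = {eta_max:.2f}")   # -> 0.49
```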
Further discussion and comparison with experiment may be found elsewhere (Althoff et al., 1996), along with the orderings allowed by symmetry restrictions.

Electronic Topological Transitions: van Hove Singularities in CuPt

The calculated ASRO for Cu50Pt50 (Clark et al., 1995) indicates an instability to concentration fluctuations with q = (½ ½ ½), consistent with the observed L11, or CuPt, ordering (Massalski et al., 1990). The L11 structure consists of alternating fcc (111) layers of Cu and Pt, in contrast with the more common L10 structure, which has alternating (100) planes of atoms. Because CuPt is the only substitutional metallic alloy that forms in the L11 structure (Massalski et al., 1990), it is appropriate to ask: what is so novel about the CuPt system, and what is the electronic origin of the structural ordering? The answers follow directly from the electronic properties of disordered CuPt near its Fermi surface, and arise from what Lifshitz (1960) termed an "electronic topological transition." That is, owing to the topology of the electronic structure, electronic states, possibly unfavorable ones, may be filled (or emptied) by small changes in lattice or chemical structure, as in Peierls instabilities. Such electronic topological transitions may affect a plethora of observables, causing discontinuities in, e.g., lattice constants and specific heats (Bruno et al., 1995). States associated with van Hove singularities, as discussed for fcc Cu-Ni-Zn, are one manifestation of such topological effects, and such states are found in CuPt. In Figure 7, the Fermi surface of disordered CuPt around the L point has a distinctive "neck" feature similar to that of elemental Cu. Furthermore, because ε_F cuts the density of states near the top of a feature that is mainly Pt-d in character (see Fig. 8, part A), pockets of d holes exist at the X points (Fig. 7).
As a result, the ASRO has peaks at (½ ½ ½) due to the spanning vector X - L = (0, 0, 1) - (½ ½ ½) (giving a large joint electron density of states in Equation 29), which is a member of the star of L. Thus, the L11 structure is stabilized by a Peierls-like mechanism arising from the
COMPUTATION AND THEORETICAL METHODS
Figure 7. A_B(k; ε_F) for disordered fcc CuPt, i.e., the Fermi surface, for portions of the ⟨110⟩ (Γ-X-U-L-K) and ⟨100⟩ (Γ-X-W-K-W-X-L) planes. Spectral weight is given by relative gray scale, with black the largest and white the background. Note the neck at L and the smeared pockets at X. The widths of the peaks are due to the chemical disorder experienced by the electrons as they scatter through the random alloy. The spanning vector k_vH, associated with states near the van Hove singularities, as well as a typical "2k_F" Fermi-surface nesting vector, are labeled. The more Cu in the alloy, the fewer the d holes, which makes the "2k_F" mechanism more energetically favorable (if the dielectric effects are accounted for fully; Clark et al., 1995).
hybridization between van Hove singularities at the high-symmetry points. This hybridization is the only means the system has to fill the few remaining (antibonding) Pt d states, which is why L11 ordering is essentially unique to CuPt. That is, by ordering along the (111) direction, all the states at the X points, (100), (010), and (001), may be equally populated, whereas only the states around (100) and (010) are fully populated by an (001) ordering wave consistent with L10-type order. See Clark et al. (1995) for more details. This can be easily confirmed as follows. On increasing the number of d holes at the X points, L11 ordering should not be favored, because it becomes increasingly difficult for a (½ ½ ½) concentration wave to occupy all the d holes at X. Indeed, calculations repeated with the Fermi level lowered by 30 mRy (in a rigid-band way) into the Pt d-electron peak near ε_F yield a large clustering tendency (Clark et al., 1995). On filling the Pt d holes of the disordered alloy (raising ε_F by 30 mRy; see Fig. 8), thereby removing the van Hove singularities at ε_F, there is no great advantage to ordering into L11, and S⁽²⁾(q) now peaks at all X points, indicating L10-type ordering (Clark et al., 1995). This can be confirmed from ordered band-structure calculations using the linear muffin-tin orbital (LMTO) method within the atomic sphere approximation (ASA). In Figure 8, we show the calculated LMTO electronic densities of states for the L10 and L11 configurations for comparison with the density of states of the CPA disordered state, as given by Clark et al. (1995).

Figure 8. Scalar-relativistic total densities of states for (A) disordered CuPt, using the KKR-CPA method; ordered CuPt in the (B) L11 and (C) L10 structures, using the LMTO method. The dashed line indicates the Fermi energy. Note the change of scale in the partial Pt state densities. The bonding (antibonding) states created by the L11 concentration wave just below (above) the Fermi energy are shaded in black.

In the disordered case (Figure 8, part A), ε_F cuts the top of the Pt d band, which is consistent with the X pockets in the Fermi surface. In the L11 structure, the density of states at ε_F is reduced, since the modulation in concentration introduces couplings between states at ε_F. The L10 density of states in Figure 8, part C demonstrates that not all ordered structures will produce this effect. Notice the small Peierls-type set of bonding and antibonding peaks in the L11 Pt d-state density in Figure 8, part B (darkened area). Furthermore, the L10 - L11 energy difference is 2.3 mRy per atom with the LMTO method (2.1 mRy with a full-potential method; Lu et al., 1991) in favor of the L11 structure, which confirms the lowering of energy associated with L11-type ordering. We also note that without a complete description of bonding (particularly the s contributions) in the alloy, the system would not be globally stable, as discussed by Lu et al. (1991). The ordering mechanism described here is similar to the conventional Fermi-surface nesting mechanism. However, conventional Fermi-surface nesting takes place over extended regions of k space, with spanning vectors between almost parallel sheets. The resulting structures tend to be long-period superstructures (LPS), which are observed in Cu-, Ag-, and Au-rich alloys (Massalski et al., 1990). In contrast, in the mechanism proposed for CuPt, the spanning vector couples only the regions around the X and L points in the fcc Brillouin zone, and the large joint density of states results from van Hove singularities that exist near ε_F. The van Hove mechanism will naturally lead to high-symmetry structures with short periodicities, since the spanning vectors tend to connect high-symmetry points (Clark et al., 1995). What is particularly interesting in Cu_cPt_{1-c} is that the L11 ordering (at c ≈ 0.5) and the one-dimensional LPS associated with Fermi-surface nesting (at c ≈ 0.73) are both found experimentally (Massalski et al., 1990). Indeed, there are nested regions of Fermi surface in the (100) plane (see Fig. 7) associated with the s-p electrons, as found in Cu-rich Cu-Pd alloys (Györffy and Stocks, 1983). The Fermi-surface nesting dimension is concentration dependent, and α(q) peaks at q = (1, 0.2, 0) at 73% Cu, provided both band-energy and double-counting terms are included
COMPUTATION OF DIFFUSE INTENSITIES IN ALLOYS
(Clark et al., 1995). Thus, a cross-over is found from a conventional Fermi-surface ordering mechanism around 75% Cu to ordering dominated by the novel van Hove-singularity mechanism at 50% Cu. At higher Pt concentrations, c ≈ 0.25, the ASRO peaks at L with subsidiary peaks at X, which is consistent with the ordered tetragonal fcc-based superstructure of CuPt3 (Khachaturyan, 1983). Thus, just as in Cu-Ni-Zn alloys, nesting from s-p states and van Hove singularities (in this case arising from d states) both play a role; only here the effects of the van Hove singularities cause a novel, and observed, ordering in CuPt.

On the Origin of Temperature-Dependent Shifts of ASRO Peaks

The ASRO peaks in Cu3Au and Pt8V at particular (h, k, l) positions in reciprocal space have been observed to shift with temperature. In Cu3Au, the fourfold-split diffuse peaks at (1, k, 0)-type positions about the (100) points in reciprocal space coalesce to one peak at Tc, i.e., k → 0, whereas the splitting in k increases with increasing temperature (Reichert et al., 1996). In Pt8V, however, there are twofold-split diffuse peaks at (1 ± h, 0, 0), and the splitting h decreases with increasing temperature, distinctly opposite to Cu3Au (Le Bolloc'h et al., 1998). Following the Cu3Au observations, several explanations have been offered for the increased splitting in Cu3Au, all of which cite entropy as responsible for increasing the fourfold splitting (Reichert et al., 1996, 1997; Wolverton and Zunger, 1997). It was emphasized that the behavior of the diffuse scattering peaks shows that their features are not easily related to the energetics of the alloy, i.e., to the usual Fermi-surface nesting explanation of fourfold diffuse spots (Reichert et al., 1997). However, entropy is not an entirely satisfactory explanation, for two reasons. First, it does not explain the opposite behavior found for Pt8V. Second, entropy is by its very nature dimensionless, having no q dependence that could shift peak positions.
A relatively simple, though not quantitative, explanation has recently been offered by Le Bolloc'h et al. (1998). They detail how the temperature dependence of the peak splitting of the ASRO differs depending on whether the splitting occurs along (1, k, 0), as in Cu3Au and Cu3Pd, or along (h, 0, 0), as in Pt8V. However, the origin of the splitting is always related to the underlying chemical interactions and energetics of the alloy. While the electronic origin of the splitting would be obtained directly from our DFT approach, this subtle temperature and entropy effect would not be properly described by the method in its current implementation.
CONCLUSION

For multicomponent alloys, we have described how the "polarization of the ordering waves" may be obtained from the ASRO. Besides the unstable wavevector(s), the polarizations are the additional information required to
define the ordering tendency of the alloy. This can also be obtained from the measured diffuse scattering intensities, a fact that heretofore has not been appreciated. Furthermore, it has been the main purpose of this unit to give an overview of an electronic-structure-based method for calculating atomic short-range order in alloys from first principles. The method uses a linear-response approach to obtain the thermodynamically induced ordering fluctuations about the random solid solution, as described via the coherent-potential approximation. Importantly, this density-functional-based concentration-wave theory is generalized in a straightforward manner to multicomponent alloys, which is extremely difficult for most other techniques. While the approach is clearly mean-field (as thoroughly outlined), it incorporates on an equal footing many of the electronic and entropic mechanisms that may compete on a system-by-system basis. This is especially notable because in metals the important energy differences are of the order of a few mRy, where 1 mRy is 158 K; thus, even a 10% error in the calculated temperature scales is remarkably good, although for some systems we have done much better. When the ASRO indicates the low-temperature ordered state (as is usually the case), it is possible to determine the underlying electronic mechanism responsible for the phase transformation. In any case, it does determine the origin of the atomic short-range order. Nevertheless, the first-principles concentration-wave theory does not include possibly important effects, such as statistical fluctuations beyond mean field or cluster entropy, which may give rise to first-order transformations entirely distinct from the premonitory fluctuations or temperature effects in the ASRO. In such cases, more accurate statistical methods, such as Monte Carlo, may be employed if the relevant energetics (as determined by some means) can also be obtained with sufficient relative accuracy.
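The temperature scale quoted above follows directly from the physical constants; a minimal sketch (the Rydberg and Boltzmann values are standard CODATA figures, not from the text):

```python
# Temperature equivalents of the mRy energy scales quoted in the text,
# using standard values for the Rydberg and Boltzmann constants.

RY_EV = 13.605693         # 1 Ry in eV
KB_EV_PER_K = 8.617333e-5 # Boltzmann constant in eV/K

mry_in_kelvin = 1e-3 * RY_EV / KB_EV_PER_K
print(f"1 mRy = {mry_in_kelvin:.0f} K")          # -> 1 mRy = 158 K, as quoted
print(f"2.3 mRy = {2.3 * mry_in_kelvin:.0f} K")  # the CuPt L10-L11 difference
```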
A few electronic mechanisms that give rise to ordering in various binary and ternary alloys were showcased. Fermi-surface effects explain all the commensurate and incommensurate ordering tendencies in fcc Cu-Ni-Zn alloys, in contrast to interpretations made from experimental data. A hybridization effect that occurs well below the Fermi level produces the strong L10-type order in NiPt. Hybridization and the well-known band-filling (or e/a) effects explain the ASRO in Au-Fe alloys, if magnetic exchange splitting is incorporated. A novel van Hove-singularity mechanism, which arises from the topology of the electronic structure of the disordered alloy, explains the unique L11-type order found in CuPt. Without the ability to connect the ASRO to electronic effects, many of these effects would have been impossible to identify via traditional band-structure applications, even for the cases of long-range order, which speaks to the usefulness of the technique. The first-principles concentration-wave technique may be even more useful for multicomponent alloys. Currently, several ternary alloy systems are under investigation to determine the site-occupancy preferences in partially ordered B2 Nb-Al-Ti alloys, as recently measured by Hou et al. (1997) and Johnson et al. (1999). Nevertheless, contributions from size and electrostatic effects
must still be included in the multicomponent case. At this point, only more applications and comparisons with experiment on a system-by-system basis will reveal important new insights into the origins of alloy phase stability.

ACKNOWLEDGMENTS

This work was supported by the Division of Materials Science, Office of Basic Energy Sciences, U.S. Department of Energy, when Dr. Johnson was at Sandia National Laboratories (contract DE-AC04-94AL85000) and, more recently, at the Frederick Seitz Materials Research Laboratory (contract DEFG02-96ER45439). Dr. Johnson would like to thank Alphonse Finel for informative discussions regarding his recent work, and the NATO AGARD program for making the discussion possible with a visit to O.N.E.R.A., France. It would be remiss not to acknowledge also the many collaborators throughout these developments and applications: Jeff Althoff, Mark Asta, John Clark, Beniamino Ginatempo, Balazs Györffy, Jeff Hoyt, Michael Ling, Bill Shelton, Phil Sterne, and G. Malcolm Stocks.

LITERATURE CITED

Althoff, J. D., Johnson, D. D., and Pinski, F. J. 1995. Commensurate and incommensurate ordering tendencies in the ternary fcc Cu-Ni-Zn system. Phys. Rev. Lett. 74:138. Althoff, J. D., Johnson, D. D., Pinski, F. J., and Staunton, J. B. 1996. Electronic origins of ordering in multicomponent metallic alloys: Application to the Cu-Ni-Zn system. Phys. Rev. B 53:10610. Asta, M. and Johnson, D. D. 1997. Thermodynamic properties of fcc-based Al-Ag alloys. Comp. Mater. Sci. 8:64. Badalayan, D. A., Khachaturyan, A. G., and Kitaigorodskii, A. I. 1969. Theory of order-disorder phase transformations in molecular crystals. II. Kristallografiya 14:404. Barrachin, M., Finel, A., Caudron, R., Pasturel, A., and Francois, A. 1994. Order and disorder in Ni3V: Effective pair interactions and the role of electronic excitations. Phys. Rev. B 50:12980. Berlin, T. H. and Kac, M. 1952. Spherical model of a ferromagnet. Phys. Rev. 86:821. Boriçi-Kuqo, M. and Monnier, R. 1997.
Short-range order in the random binary Madelung lattice. Comp. Mater. Sci. 8:16. Brout, R. and Thomas, H. 1965. Phase Transitions. W. A. Benjamin, Inc., Menlo Park, Calif. Brout, R. and Thomas, H. 1967. Molecular field theory, the Onsager reaction field and the spherical model. Physics 3:317. Bruno, E., Ginatempo, B., and Giuliano, E. S. 1995. Fermi-surface and electronic topological transition in metallic random alloys (I): Influence on equilibrium properties. Phys. Rev. B 52:14544. Butler, C. J., McCartney, D. G., Small, C. J., Horrocks, F. J., and Saunders, N. 1997. Solidification microstructure and calculated phase equilibria in the Ti-Al-Mn system. Acta Mater. (USA) 45:2931. Ceder, G., Garbulsky, G. D., Avis, D., and Fukuda, K. 1994. Ground states of a ternary fcc lattice model with nearest-neighbor and next-nearest-neighbor interactions. Phys. Rev. B 49:1. Chepulskii, R. V. and Bugaev, V. N. 1998. Analytical methods for calculations of the short-range order in alloys: (a) I. General theory, J. Phys.: Condens. Matter 10:7309; (b) II. Numerical accuracy studies, J. Phys.: Condens. Matter 10:7327.
Clapp, P. C. and Moss, S. C. 1966. Correlation functions of disordered binary alloys I. Phys. Rev. 142:418. Clark, J., Pinski, F. J., Johnson, D. D., Sterne, P. A., Staunton, J. B., and Ginatempo, B. 1995. van Hove singularity induced L11 ordering in CuPt. Phys. Rev. Lett. 74:3225. Cole, R. J., Brooks, N. J., and Weightman, P. 1997. Madelung potentials and disorder broadening of core photoemission spectra in random alloys. Phys. Rev. Lett. 78:3777. Connolly, J. W. D. and Williams, A. R. 1983. Density-functional theory applied to phase transformations in transition-metal alloys. Phys. Rev. B 27:5169. Dahmani, C. E., Cadeville, M. C., Sanchez, J. M., and Moran-Lopez, J. L. 1985. Ni-Pt phase diagram: Experiment and theory. Phys. Rev. Lett. 55:1208. Ducastelle, F. 1991. Order and Phase Stability in Alloys (F. de Boer and D. Pettifor, eds.). pp. 303-313. North-Holland, Amsterdam. Ducastelle, F. and Gautier, F. 1976. Generalized perturbation theory in disordered transitional alloys: Application to the calculation of ordering energies. J. Phys. F 6:2039. de Fontaine, D. 1973. An analysis of clustering and ordering in multicomponent solid solutions. I. Fluctuations and kinetics. J. Phys. Chem. Solids 34:1285. de Fontaine, D. 1975. k-space symmetry rules for order-disorder reactions. Acta Metall. 23:553. de Fontaine, D. 1979. Configurational thermodynamics of solid solutions. Solid State Phys. 34:73. Evans, R. 1979. Density-functional theory for liquids. Adv. Phys. 28:143. Faulkner, J. S. and Stocks, G. M. 1980. Calculating properties with the coherent-potential approximation. Phys. Rev. B 21:3222. Faulkner, J. S., Wang, Yang, and Stocks, G. M. 1997. Coulomb energies in alloys. Phys. Rev. B 55:7492. Finel, A., Barrachin, M., Caudron, R., and Francois, A. 1994. Effective pairwise interactions in Ni3V. In Metallic Alloys: Experimental and Theoretical Perspectives (J. S. Faulkner and R. Jordon, eds.). NATO-ASI Series Vol. 256, p. 215. Kluwer Academic Publishers, Boston.
Gonze, X. 1997. First-principles responses of solids to atomic displacements and homogeneous electric fields: Implementation of a conjugate-gradient algorithm. Phys. Rev. B 55:10337. Györffy, B. L., Johnson, D. D., Pinski, F. J., Nicholson, D. M., and Stocks, G. M. 1989. The electronic structure and state of compositional order in metallic alloys. In Alloy Phase Stability (G. M. Stocks and A. Gonis, eds.). NATO-ASI Series Vol. 163. Kluwer Academic Publishers, Boston. Györffy, B. L. and Stocks, G. M. 1983. Concentration waves and Fermi surfaces in random metallic alloys. Phys. Rev. Lett. 50:374. Hashimoto, S., Iwasaki, H., Ohshima, K., Harada, J., Sakata, M., and Terauchi, H. 1985. Study of local atomic order in a ternary Cu47Ni29Zn24 alloy using anomalous scattering of synchrotron radiation. J. Phys. Soc. Jpn. 54:3796. Hou, D. H., Jones, I. P., and Fraser, H. L. 1997. The ordering tie-line method for sublattice occupancy in intermetallic compounds. Philos. Mag. A 74:741. Hume-Rothery, W. 1963. Electrons, Atoms, Metals, and Alloys, 3rd ed. Dover, New York. Inoue, A., Zhang, T., and Masumoto, T. 1990. Zr-Al-Ni amorphous alloys with high glass transition temperatures and significant supercooled liquid region. Mater. Trans. JIM 31:177.
Johnson, D. D. 1999. Atomic short-range order precursors to Heusler phase in disordered BCC ternary alloys. To be submitted for publication.
Lifshitz, I. M. 1960. High-pressure anomalies of electron properties of a metal. Zh. Eksp. Teor. Fiz. 38:1569.
Johnson, D. D. 2001. An electronic-structure-based theory of atomic short-range order and phase stability in multicomponent alloys: With application to CuAuZn2. Phys. Rev. B. Submitted.
Ling, M. F., Staunton, J. B., and Johnson, D. D. 1995a. All-electron, linear-response theory of local environment effects in magnetic, metallic alloys and multilayers. J. Phys.: Condens. Matter 7:1863.
Johnson, D. D. and Asta, M. 1997. Energetics of homogeneously random fcc Al-Ag alloys: A detailed comparison of computational methods. Comput. Mater. Sci. 8:54. Johnson, D. D., Asta, M. D., and Althoff, J. D. 1999. Temperature-dependent chemical ordering in bcc-based ternary alloys: A theoretical study in Ti-Al-Nb. Philos. Mag. Lett. 79:551.
Ling, M. F., Staunton, J. B., Johnson, D. D., and Pinski, F. J. 1995b. Origin of the (1 ½ 0) atomic short-range order in Au-rich Au-Fe alloys. Phys. Rev. B 52:R3816. Lu, Z. W., Wei, S.-H., and Zunger, A. 1991. Long-range order in binary late-transition-metal alloys. Phys. Rev. Lett. 66:1753.
Johnson, D. D., Nicholson, D. M., Pinski, F. J., Stocks, G. M., and Györffy, B. L. 1986. Density-functional theory for random alloys: Total energy within the coherent-potential approximation. Phys. Rev. Lett. 56:2088.
Lu, Z. W., Wei, S.-H., and Zunger, A. 1993. Relativistically induced ordering and phase separation in intermetallic compounds. Europhys. Lett. 21:221.
Johnson, D. D., Nicholson, D. M., Pinski, F. J., Stocks, G. M., and Györffy, B. L. 1990. Total energy and pressure calculations for random substitutional alloys. Phys. Rev. B 41:9701. Johnson, D. D. and Pinski, F. J. 1993. Inclusion of charge correlations in the calculation of the energetics and electronic structure for random substitutional alloys. Phys. Rev. B 48:11553. Johnson, D. D., Staunton, J. B., and Pinski, F. J. 1994. First-principles all-electron theory of atomic short-range ordering in metallic alloys: DO22- versus L12-like correlations. Phys. Rev. B 50:1473. Khachaturyan, A. G. 1972. Atomic structure of ordered phases: Stability with respect to formation of antiphase domains. Zh. Eksp. Teor. Fiz. 63:1421. Khachaturyan, A. G. 1983. Theory of Structural Transformations in Solids. John Wiley & Sons, New York. Kohn, W. and Rostoker, N. 1954. Phys. Rev. 94:1111. Kohn, W. and Sham, L. J. 1965. Self-consistent equations including exchange and correlation effects. Phys. Rev. 140:A1133. Korringa, J. 1947. On the calculation of the energy of a Bloch wave in a metal. Physica 13:392.
Magri, R., Wei, S.-H., and Zunger, A. 1990. Ground-state structures and the random-state energy of the Madelung lattice. Phys. Rev. B 42:11388. Masanskii, I. V., Tokar, V. I., and Grishchenko, T. A. 1991. Pair interactions in alloys evaluated from diffuse-scattering data. Phys. Rev. B 44:4647. Massalski, T. B., Okamoto, H., Subramanian, P. R., and Kacprzak, L. 1990. Binary Alloy Phase Diagrams, 2nd ed. ASM International, Materials Park, Ohio. McCormack, R., Asta, M., Hoyt, J. J., Chakoumakos, B. C., Misture, S. T., Althoff, J. D., and Johnson, D. D. 1997. Experimental and theoretical investigation of order-disorder in Cu2AlMn. Comp. Mater. Sci. 8:39. Mohri, T., Sanchez, J. M., and de Fontaine, D. 1985. Short-range order diffuse intensity calculations in the cluster variation method. Acta Metall. 33:1463. Moss, S. C. 1969. Imaging the Fermi surface through diffuse scattering from a concentrated disordered alloy. Phys. Rev. Lett. 22:1108. Mott, N. 1937. The energy of the superlattice in β-brass. Proc. Phys. Soc. Lond. 49:258.
Korzhavyi, P. A., Ruban, A. V., Abrikosov, I. A., and Skriver, H. L. 1995. Madelung energy for random metallic alloys in the coherent potential approximation. Phys. Rev. B 51:5773.
Nicholson, D. M. and Brown, R. H. 1993. Electrical resistivity of Ni0.8Mo0.2: Explanation of anomalous behavior in short-range ordered alloys. Phys. Rev. Lett. 21:3311.
Krivoglaz, M., 1969. Theory of x-ray and thermal-neutron scattering by real crystals. Plenum, New York.
Oates, W. A., Wenzl, H., and Mohri, T. 1996. On putting more physics into CALPHAD solution models. CALPHAD, Comput. Coupling Phase Diagr. Thermochem. (UK). 20:37.
Landau, L. D. 1937a. Theory of phase transformations I. Phys. Zeits. d. Sowjetunion 11:26. Landau, L. D. 1937b. Theory of phase transformations II. Phys. Zeits. d. Sowjetunion 11:545. Landau, L. D. and Lifshitz, E. M. 1980. Statistical Physics, 3rd ed. Pergamon Press, New York. Lankford, W. T. Jr., Samways, N., Craven, R., and McGannon, H. (eds.). 1985. The Making, Shaping and Treating of Steel. AISE, Herbick and Hild, Pittsburgh. Le Bolloc'h, D., Caudron, R., and Finel, A. 1998. Experimental and theoretical study of the temperature and concentration dependence of the short-range order in Pt-V alloys. Phys. Rev. B 57:2801. Le Bolloc'h, D., Cren, T., Caudron, R., and Finel, A. 1997. Concentration variation of the effective pair interactions measured on the Pt-V system: Evaluation of the gamma-expansion method. Comp. Mater. Sci. 8:24. Lifshitz, E. M. 1941. A theory of phase transitions of the 2nd kind. I. Measurement of the elementary crystal unit cells during the phase transition of the 2nd kind. Zh. Eksp. Teor. Fiz. 11:255. Lifshitz, E. M. 1942. (Title unavailable). Akad. Nauk SSSR Izvestiia Seriia Fiz. 7:251.
Onsager, L. 1936. Electric moments of molecules in liquids. J. Am. Chem. Soc. 58:1486. Ornstein, L. S. 1912. Accidental deviations of density in mixtures. K. Akad. Amsterdam 15:54. Ornstein, L. S. and Zernike, F. 1914. Accidental deviation of density and opalescence at the critical point. K. Akad. Amsterdam 17:793. Ornstein, L. S. and Zernike, F. 1918. The linear dimensions of density variations. Phys. Z. 19:134. Pavone, P., Bauer, R., Karch, K., Schuett, O., Vent, S., Windl, W., Strauch, D., Baroni, S., and De Gironcoli, S. 1996. Ab initio phonon calculations in solids. Physica B 219–220:439. Paxton, A. T., Methfessel, M., and Pettifor, D. G. 1997. A bandstructure view of the Hume-Rothery electron phases. Proc. R. Soc. Lond. A 453:1493. Peker, A. and Johnson, W. L. 1993. A highly processable metallic glass: Zr41.2Ti13.8Cu12.5 Ni10.0Be22.5. Appl. Phys. Lett. 63:2342. Pierron-Bohnes, V., Kentzinger, E., Cadeville, M. C., Sanchez, J. M., Caudron, R., Solal, F., and Kozubski, R. 1995. Experimental determination of pair interactions in a Fe0.804V0.196 single crystal. Phys. Rev. B 51:5760.
Pinski, F. J., Ginatempo, B., Johnson, D. D., Staunton, J. B., Stocks, G. M., and Györffy, B. L. 1991. Origins of compositional order in NiPt. Phys. Rev. Lett. 66:766.
Vaks, V. G., Larkin, A. I., and Pikin, S. A. 1966. Self-consistent method for the description of phase transitions. Zh. Eksp. Teor. Fiz. 51:361.
Pinski, F. J., Ginatempo, B., Johnson, D. D., Staunton, J. B., Stocks, G. M. 1992. Reply to comment. Phys. Rev. Lett. 68: 1962.
van der Wegen, G. J. L., De Rooy, A., Bronsveld, P. M., and De Hosson, J. Th. M. 1981. The order/disorder transition in the quasibinary cross section Cu50Ni50-xZnx. Scr. Met. 15:1359.
Pinski, F. J., Staunton, J. B., and Johnson, D. D. 1998. Charge-correlation effects in calculations of atomic short-range order in metallic alloys. Phys. Rev. B 57:15177. Quong, A. A. and Liu, A. Y. 1997. First-principles calculations of the thermal expansion of metals. Phys. Rev. B 56:7767. Reichert, H., Moss, S. C., and Liang, K. S. 1996. Anomalous temperature dependence of x-ray diffuse scattering in Cu3Au. Phys. Rev. Lett. 77:4382. Reichert, H., Tsatskis, I., and Moss, S. C. 1997. Temperature dependent microstructure of Cu3Au in the disordered phase. Comp. Mater. Sci. 8:46. Reinhard, L. and Moss, S. C. 1993. Recent studies of short-range order in alloys: The Cowley theory revisited. Ultramicroscopy 52:223. Reinhard, L., Schönfeld, B., Kostorz, G., and Bührer, W. 1990. Short-range order in α-brass. Phys. Rev. B 41:1727. Ruban, A. V., Abrikosov, I. A., and Skriver, H. L. 1995. Ground-state properties of ordered, partially ordered, and random Cu-Au and Ni-Pt alloys. Phys. Rev. B 51:12958. Rubin, G. and Finel, A. 1995. Application of first-principles methods to binary and ternary alloy phase diagram predictions. J. Phys.: Condens. Matter 7:3139.
van Hove, L. 1953. The occurrence of singularities in the elastic frequency distribution of a crystal. Phys. Rev. 89:1189. Wolverton, C. and de Fontaine, D. 1994. Cluster expansions of alloy energetics in ternary intermetallics. Phys. Rev. B 49:8627. Wolverton, C. and Zunger, A. 1995a. Ising-like description of structurally relaxed ordered and disordered alloys. Phys. Rev. Lett. 75:3162. Wolverton, C. and Zunger, A. 1995b. Short-range order in a binary Madelung lattice. Phys. Rev. B 51:6876. Wolverton, C. and Zunger, A. 1997. Ni-Au: A testing ground for theories of phase stability. Comp. Mater. Sci. 8:107. Wolverton, C., Zunger, A., and Schönfeld, B. 1997. Invertible and non-invertible alloy Ising problems. Solid State Commun. 101:519. Yu, R. and Krakauer, H. 1994. Linear-response calculations within the linearized augmented plane-wave method. Phys. Rev. B 49:4467. Zernike, F. 1940. The propagation of order in co-operative phenomena. Part I: The AB case. Physica 7:565.
Sanchez, J. M. and de Fontaine, D. 1978. The fcc Ising model in the cluster variation method. Phys. Rev. B 17:2926. Sanchez, J. M. and de Fontaine, D. 1980. Ordering in fcc lattices with first- and second-neighbor interactions. Phys. Rev. B 21:216.
KEY REFERENCES
Sato, H. and Toth, R. S. 1962. Long period superlattices in alloys. II. Phys. Rev. 127:469.
Althoff et al., 1996. See above.
Althoff et al., 1995. See above.
Saunders, N. 1996. When is a compound energy not a compound energy? A critique of the 2-sublattice order/disorder model. CALPHAD, Comput. Coupling Phase Diagr. Thermochem. 20:491.
Provides a detailed explanation of the electronic origins of ASRO in fcc Cu-Ni-Zn. The 1996 paper details all the contributing binaries and several ternaries, and includes an appendix describing how to determine the polarization of the ternary concentration waves from the diffuse scattering intensities, as applied to Cu2NiZn.
Soven, P. 1967. Coherent-potential model of substitutional disordered alloys. Phys. Rev. 156:809.
Asta and Johnson, 1997. See above.
Staunton, J. B., Johnson, D. D., and Pinski, F. J. 1990. Theory of compositional and magnetic correlations in alloys: Interpretation of a diffuse neutron-scattering experiment on an iron-vanadium single crystal. Phys. Rev. Lett. 65:1259. Staunton, J. B., Johnson, D. D., and Pinski, F. J. 1994. Compositional short-range ordering in metallic alloys: Band-filling, charge-transfer, and size effects from a first-principles, all-electron, Landau-type theory. Phys. Rev. B 50:1450. Stell, G. 1969. Some critical properties of the Ornstein-Zernike system. Phys. Rev. 184:135. Taylor, D. W. 1968. Vibrational properties of imperfect crystals with large defect concentrations. Phys. Rev. 156:1017. Thomas, H. 1972. Über den elektrischen Widerstand von Kupfer-Nickel-Zink-Legierungen und den Einfluß einer Tieftemperatur-Verformung. Z. Metallk. 63:106. Tokar, V. I. 1985. A new series expansion for lattice statistics. Phys. Lett. 110A:453. Tokar, V. I. 1997. A new cluster method in lattice statistics. Comp. Mat. Sci. 8:8. Treglia, G. and Ducastelle, F. 1987. Is ordering in PtNi alloys induced by spin-orbit interactions? J. Phys. F 17:1935.
Johnson and Asta, 1997. See above. These references provide a proper comparison of complimentary methods briefly discussed in the text, and show that when done carefully the methods agree. Moreover, the first paper shows how this information may be used to calculate the equilibrium (or metastable) phase diagram of an alloy with very good agreement to the assessed phase diagram, as well as how calculations help interpretation when there is contradictory experimental data. Brout and Thomas, 1967. See above. Some original details on connection of Onsager corrections and meanspherical models in simple model Hamiltonians. Clark et al., 1995. See above. Application of the present approach, which was the first theory to explain the L11 ordering in CuPt. Furthermore, it detailed how electronic van Hove singularities play the key role in producing such a unique ordering in CuPt. Evans, 1979. See above. Provides a very complete reference for classical density-functional theory as applied to liquids, but which is the basis for the connection between electronic and classical DFT as performed here.
COMPUTATION OF DIFFUSE INTENSITIES IN ALLOYS Gyo¨ rffy and Stocks, 1983. See above. The first paper to detail the conceptual framework for the first-principles, electronic DFT for calculating ASRO within a classical DFT, and also to show how the concentration effects on the random alloy Fermi surface explains the concentration-dependent shifts in diffuse scattering peaks, with application to Cu-Pd alloys. Ling et al., 1995b. See above. Details how magnetism and chemical correlations are intimately connected, and how the quenching rate, which determined the magnetic state, affects the ASRO observed, which may be altered via magnetic annealing. Staunton et al., 1990. See above. First application of the first-principles concentration wave technique applied to a magnetic binary system and details how useful such calculations may be in explaining the origins of the ASRO features in the experimental data. Also details how quenching a sample can play
277
an important role in establishing what type of exchange-split electronic structure (e.g., ferromagnetic and disordered paramagetic) gives rise to the chemical fluctuations. Staunton et al., 1994. See above. Johnson et al., 1994. See above. First complete details, and applications in several alloy systems, of the fully self-consistent, all-electron, mean-field, density-functional-based theory for calculating ASRO for binary metallic alloys, which includes band, charge, and dielectric effects, along with Onsager corrections.
DUANE D. JOHNSON FRANK J. PINSKI JULIE B. STAUNTON University of Illionis Urbana, Illinois
MECHANICAL TESTING

INTRODUCTION

The mechanical behavior of materials is concerned primarily with the response of materials to forces or loads. This behavior ultimately governs the usefulness of materials in a variety of applications, from automotive and jet engines to skyscrapers, as well as in more common products such as scissors, electric shavers, coffee mugs, and cooking utensils. The forces or loads that these materials experience in their respective applications make it necessary to identify the limiting values that can be withstood without failure or permanent deformation. Indeed, in many cases it is necessary not only to know the response of the materials to an applied load, but also to be able to predict their behavior under repeated loading and unloading. In other applications, it is also necessary to determine the time-dependent behavior and routine wear of materials under the applied loads and under operating conditions.

Knowledge of the mechanical behavior of materials is also necessary during manufacturing processes. For example, it is often necessary to know the values of temperature and loading rate that minimize the forces needed during mechanical forming and shaping of components. Determination of the resulting microstructures during shaping and forming is an integral part of these operations; this is an area that clearly combines the determination of mechanical properties with that of the microstructure. Of course, the atomistic concept of flow and materials failure is an integral part of the determination of mechanical properties.

Basic mechanical property and materials strength measurements are obtained by standardized mechanical tests, many of which are described in this chapter. Each unit in this chapter not only covers details, principles, and practical aspects of the various tests, but also provides comprehensive references to the standard procedures and sample geometries dictated by the ASTM or other standards agencies. This chapter also covers additional mechanical testing techniques such as high-strain-rate testing, measurements under pressure, and tribological and wear testing. As such, the chapter covers a broad spectrum of techniques used to assess materials behavior in a variety of engineering applications.

Whereas mechanical testing has a long history and is largely a mature field, methods continue to take advantage of advances in instrumentation and in fundamental understanding of mechanical behavior, particularly in more complex systems. Indeed, the first category of materials one thinks of that benefits from knowledge and control of mechanical properties is that of structural materials, i.e., substantial components of structures and machines. However, these properties are also crucial to the most technologically sophisticated applications in, for example, the highest-density integrated electronics and atom-by-atom deposited multilayers. In this respect, mechanical testing remains an active field at the forefront of modern technology.
REZA ABBASCHIAN
TENSION TESTING

INTRODUCTION

Of all mechanical properties, perhaps the most fundamental are related to what happens when a material is subjected to simple uniaxial tension. In its essence, a tensile test is carried out by attaching a specimen to a load-measuring device, applying a load (or imposing a given deformation), and measuring the load and corresponding deformation. A schematic of a specimen and test machine is provided in Figure 1 (Schaffer et al., 1999). The result obtained from a tensile test is a so-called stress/strain curve, a plot of stress (force/unit area) versus strain (change in length/original length), illustrated schematically in Figures 1 and 2 (Dieter, 1985). The results of such a test (along with the results of other tests, to be sure) are basic to determination of the suitability of a given material for a particular load-bearing application. In this regard the results obtained from such a test are of great engineering significance.

Tensile-test results also provide a great deal of information relative to the fundamental mechanisms of deformation that occur in the specimen. Coupled with microscopic examination, tensile-test results are used to develop theories of hardening and to develop new alloys with improved properties. Tensile tests can be used to obtain information on the following types of property:

Elastic Properties. These are essentially the constants that relate stresses to strains in the (usually) small-strain regime where deformation is reversible. This is the linear region in Figures 1B and 2. Deformation is said to be reversible when a specimen or component that has been subjected to tension returns to its original dimensions after the load is removed. Young's modulus (the slope of the linear, reversible portion of Fig. 1B) and Poisson's ratio (the ratio of the strain in the transverse direction to the strain in the loading direction) are typical examples of elastic properties.

Plastic Properties. Plastic properties are those that describe the relationships between stresses and strains when the deformation is large enough to be irreversible. Typical plastic properties are the yield stress (the stress at which deformation becomes permanent, denoted by the symbol σys), the extent of hardening with deformation (referred to as strain hardening), the maximum in the
Figure 1. Schematic of specimen attached to testing machine (Schaffer et al., 1999).
stress/strain plot (the ultimate tensile strength, denoted σuts), the total elongation, and the percent reduction in area. These various quantities are illustrated in Figure 2 and discussed in more detail below.

Indication of the Material's Toughness. In a simplified view, toughness is the ability of a material to absorb energy in being brought to fracture. Intuitively, toughness is manifested by the absorption of mechanical work and is related to the area under the curve of stress versus strain, as shown in Figure 3 (Schaffer et al., 1999).

Tensile testing is carried out at different temperatures, loading rates, and environments; the results of these tests are widely used in both engineering applications and scientific studies. In the following sections, basic principles are developed, fundamentals of tensile testing are pointed out, and references are provided to the detailed techniques employed in tensile testing.
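As a concrete illustration of the area interpretation of toughness, it can be estimated numerically from a discrete stress/strain record; the sketch below uses the trapezoidal rule with invented data sets (not values taken from Figure 3).

```python
# Illustrative only: estimate toughness (energy absorbed per unit volume)
# as the area under a stress/strain curve via the trapezoidal rule.
# The two data sets below are invented, not taken from any figure.

def toughness(strains, stresses):
    """Trapezoidal-rule area under a discrete stress/strain record."""
    area = 0.0
    for i in range(1, len(strains)):
        area += 0.5 * (stresses[i] + stresses[i - 1]) * (strains[i] - strains[i - 1])
    return area

# A brittle curve: high stress but very little strain to fracture (MPa units).
brittle = toughness([0.0, 0.005, 0.01], [0.0, 150.0, 300.0])
# A ductile curve: comparable stresses sustained to much larger strains.
ductile = toughness([0.0, 0.05, 0.20, 0.40], [0.0, 250.0, 350.0, 300.0])
print(brittle, ductile)   # the ductile material absorbs far more energy
```

Even though the ductile curve's peak stress is similar, its much larger strain to failure makes the enclosed area, and hence the toughness, nearly two orders of magnitude greater in this sketch.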
Figure 2. Schematic stress/strain curve for metallic material (Dieter, 1985).
Competitive and Related Techniques

In addition to the mechanical tests described in the following sections, information can be obtained about elastic properties through vibrational analysis, and information about plastic properties (e.g., tensile strength) may be obtained from microhardness testing (HARDNESS TESTING). While such information is limited, it can be obtained quickly and inexpensively.
PRINCIPLES OF THE METHOD

Analysis of Stress/Strain Curves

The stress/strain curve will typically contain the following distinct regions.

1. An initial linear portion in which the deformation is uniform and reversible, meaning that the specimen
Figure 3. Area under stress/strain curve for (left) brittle and (right) ductile materials (Schaffer et al., 1999).
comes back to its original shape when the load is released. Such behavior is referred to as elastic and is seen in the initial straight-line regions of Figures 1 and 2.

2. A region of rising load in which the deformation is uniform and permanent (except, of course, for the elastic component), as illustrated in Figure 1.

3. A region in which the deformation is permanent and concentrated in a small localized region, or "neck." The region of nonuniform deformation is indicated in Figure 1, and necking is illustrated schematically in Figure 4 (Dieter, 1985).

These regions are discussed in the sections that follow.

Elastic Deformation

Under the application of a specified external load, atoms (or molecules) are displaced by a small amount from their equilibrium positions. This displacement results in an increase in the internal energy and a corresponding force, which usually varies linearly for small displacements from equilibrium. This initial linear variation (which is reversible upon release of the load) obeys what is known as Hooke's law when expressed in terms of force and displacement:

F = kx    (1)
where k is the Hooke's law constant, x is the displacement, and F is the applied force. In materials testing, Hooke's law is more frequently expressed as a relationship between stress and strain:
σ = Eε    (2)
where σ is the stress (in units of force/area normal to the force), ε is the strain (displacement/initial length), and E is Young's modulus.

Definitions of Stress and Strain

While stress always has units of force/area, there are two ways in which stress may be calculated. The true stress, usually represented by the symbol σ, is the force divided by the instantaneous area. The engineering stress, usually represented by the symbol S, is the force divided by the original area. Since the cross-sectional area decreases as the specimen elongates, the true stress is always larger than the engineering stress. The relationship between the true stress and the engineering stress is easily shown to be:

σ = S(1 + e)    (3)
where e is the engineering strain. The engineering strain is defined as the displacement (l − l0) divided by the original length (l0) and is denoted by e. That is:

e = (l − l0)/l0    (4)

The true strain is given by:

ε = ∫(l0 to l) dl/l = ln(l/l0)    (5)
where l is the instantaneous length. This term is simply the sum of all of the instantaneous strains. The true strain and engineering strain are related by the equation:

ε = ln(1 + e)    (6)

Figure 4. Illustration of necking in a metallic specimen and the local variation in the strain (Dieter, 1985).
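Equations 3 and 6 are simple enough to apply directly to test data; the following sketch (with illustrative numbers, not values from the text) converts an engineering stress/strain point to true stress and true strain.

```python
import math

# Sketch of Equations 3 and 6: convert an engineering stress/strain point
# to true stress and true strain. The numerical values are illustrative.

def true_stress(S, e):
    """sigma = S * (1 + e), valid up to the onset of necking."""
    return S * (1.0 + e)

def true_strain(e):
    """epsilon = ln(1 + e)."""
    return math.log(1.0 + e)

# Near 10% engineering strain the two strain measures diverge noticeably:
e = 0.10
print(true_strain(e))          # ~0.0953, about 5% below the engineering value
print(true_stress(400.0, e))   # ~440 MPa for S = 400 MPa
```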
There is not a significant difference between the true strain and the engineering strain until the engineering strain reaches 10%. The difference between the conventional engineering stress/strain curve and the true stress/strain curve is illustrated in Figure 5 (Schaffer et al., 1999).

Figure 5. Illustration of the difference between the engineering and true stress/strain curves (Schaffer et al., 1999).

Young's modulus is an indicator of the strength of the interatomic (or intermolecular) bonding and is related to the curvature and depth of the energy-versus-position curve. Young's modulus may be obtained from the initial portion of the stress/strain curves shown in Figures 1 and 2.

Uniform Plastic Deformation

After the initial linear portion of the stress/strain curve (obtained from the load-deflection curve), metallic and polymeric materials will begin to exhibit permanent deformation (i.e., when the load is released, the object being loaded does not return to its initial no-load position). For most metals, the load required to continue deformation will rise continually up to some strain level. Throughout this regime, all deformation is uniform, as illustrated in Figures 1 and 2.

In some materials, such as low-carbon steels that have been annealed, there is an initial post-yield increment of nonuniform deformation, which is termed Lüders strain. Lüders strain is caused by bands of plastic activity passing along the gage length of the specimen and is associated with dynamic interactions between dislocations (the defects responsible for plastic deformation) and carbon atoms. This is shown in Figure 6 (Dieter, 1985). In Figure 6, the stress at which there is a drop-off is referred to as the upper yield point, and the plateau stress immediately following this drop is called the lower yield point. Such behavior is predominantly limited to annealed steels with low carbon content.

Stress/strain curves for many metals in the region of uniform strain are well described by the following equation:

σ = Kε^n    (7)

where K is the strength coefficient and n is the strain-hardening exponent. The strength coefficient and strain-hardening exponent can be obtained by representing the stress/strain curve on a log/log basis:

log σ = n log ε + log K    (8)
The slope of Equation 8 is the strain-hardening exponent, and the value of K is simply obtained by noting that it is the value of the stress at a strain of 1.

Nonuniform Plastic Deformation

At some point, the increased strength of the material due to strain hardening is no longer able to balance the decreased cross-sectional area due to deformation. Thus a maximum in the load/displacement or stress/strain curve is reached. At this point dP = 0 (i.e., zero slope, which implies that elongation occurs with no increment in the load P), and if Equation 7 is obeyed, the strain value is given by:

ε_n = n    (9)
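As a numerical sketch of Equations 7 through 9 (with assumed values of K and n, not taken from the text), the strength coefficient and strain-hardening exponent can be recovered from uniform-plastic-region data by a least-squares line fit to the log/log form, and the fitted n is itself the predicted necking strain.

```python
import math

# Hedged sketch: recover K and n of Equation 7 (sigma = K * eps**n) from
# synthetic, noise-free data via the log/log form of Equation 8, using an
# ordinary least-squares line fit. K_true and n_true are assumed values.

K_true, n_true = 500.0, 0.25                      # MPa, dimensionless
eps = [0.01, 0.02, 0.05, 0.10, 0.15, 0.20]
sig = [K_true * x ** n_true for x in eps]

xs = [math.log(x) for x in eps]                   # log strain
ys = [math.log(s) for s in sig]                   # log stress
N = len(xs)
xbar, ybar = sum(xs) / N, sum(ys) / N
n_fit = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / \
        sum((x - xbar) ** 2 for x in xs)          # slope = n (Equation 8)
K_fit = math.exp(ybar - n_fit * xbar)             # intercept gives log K

print(n_fit, K_fit)            # recovers n ~ 0.25 and K ~ 500 MPa
# Equation 9: necking begins at a true strain equal to n itself.
print("predicted necking strain:", n_fit)
```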
In Equation 9 the subscript n refers to "necking," by which it is meant that at this point all subsequent deformation is concentrated in a local region called a "neck" (Fig. 4) and is nonuniform. This nonuniformity is usually referred to as "plastic instability." Independent of the form of the stress/strain curve, the onset of plastic instability (assuming that deformation takes place at constant volume) occurs when the following condition is achieved:

dσ/de = σ/(1 + e)    (10)
Equation 10 is the basis of the so-called Considère construction, which may be used to find the onset of plastic instability. This is done by projecting a tangent to the stress/strain curve from the point (−1, 0) on the engineering-strain axis. If this is done, Equation 10 is satisfied at the point of tangency, which therefore represents the necking strain. It is important to realize that in this case σ is the true stress and e is the engineering strain. The Considère construction is illustrated in Figure 7 (Dieter, 1985).

Factors Affecting the Form of the Stress/Strain Curve

The stress/strain curve that is obtained depends upon the following variables:
Figure 6. Illustration of the upper and lower yield points and Lüders strain in a mild steel tensile curve (Dieter, 1985).
The material and its microstructure;
The test temperature;
The testing rate;
Figure 7. Considère's construction for determining the onset of necking (Dieter, 1985). Here, "u" refers to the point of necking.
The geometry of the test specimen;
The characteristics of the testing machine (as discussed above);
The mode in which the test is carried out.

By the "mode" is meant whether the test is carried out by controlling the rate of load application (load control), the rate of machine displacement (displacement control), or the rate of specimen strain (strain control). Of these six factors, the first three may be considered intrinsic. The effects of the last three (discussed in Practical Aspects and Method Automation) tend to be less appreciated, but they are no less important. Indeed, some results that are considered to be fundamental to the material are largely influenced by the latter three variables, which may be considered extrinsic to the actual material. The first three factors are discussed below.

Material and Microstructure

The material and its microstructure play a critical role in the form of the stress/strain curve. This is illustrated in Figure 8 (Schaffer et al., 1999) for three different materials. The curve labeled Material I would be typical of a very brittle material such as a ceramic, a white cast iron, or a high-carbon martensitic steel. The curve labeled Material II is
representative of structural steels, low-strength aluminum alloys, and copper alloys, for example. The curve labeled Material III is representative of polymeric materials.

The details of the microstructure are also very important. For example, if a given type of metal contains precipitates, it is likely to be harder and to show less extension than if it does not contain precipitates. Precipitates are typically formed as a result of a specific heat treatment, which in effect is the imposition of a time/temperature cycle on a metal. In the heat treatment of steel, the formation of martensite is frequently a necessary intermediate phase. Martensite is very strong but is intrinsically brittle, meaning it has little or no ductility. It will have a stress/strain curve similar to that of Material I. If the martensitic material is heated to a temperature in the range of 200°C, precipitates will form and the crystal structure will change to one that is intrinsically more ductile; a stress/strain curve similar to Material II will then be obtained.

Grain size also has a significant effect. Fine-grained materials are both stronger and more ductile than coarse-grained materials of the same chemistry. The degree of prior cold working is also important. Materials that are cold worked will show higher yield strengths and less ductility, since cold working produces defects in the crystalline structure that impede the basic mechanisms of plastic deformation.

Effects of Temperature and Strain Rate

Since deformation is usually assisted by thermal energy, the stress/strain curve at high temperature will usually lie below that at low temperature. Related to the effects of thermal energy, stress/strain curves at high rate will usually lie above those obtained at low rates, since at high rate more of the energy must be supplied athermally. The stress as a function of strain rate for a given strain and temperature is usually expressed through the simplified equation:

σ_{ε,T} = C ε̇^m    (11)

where C and m are material constants and ε̇ is the strain rate.

Figure 8. Schematic illustration of stress/strain curves for a brittle material (I), a ductile metal (II), and a polymer (III) (Schaffer et al., 1999).

Yield-Point Phenomena

In some materials, such as mild steels, a load drop is observed at the onset of plastic deformation, as shown in Figures 6 and 9B. This is easily understood in terms of some elementary concepts in the physics of deformation. There is a fundamental relationship between the strain rate, the dislocation density, and the velocity of dislocation movement. (Dislocations are defects in the material that are responsible for the observed deformation behavior.) The strain rate is given by:

ε̇ = bρv    (12)

where b is the Burgers vector (presumed to be constant), v is the velocity of dislocation movement, and ρ is the density of dislocations. At the yield point, there is a rapid increase in the density of mobile dislocations. Since b is constant, this means
Figure 9. Illustration of stress/strain curves (A) for typical metals and (B) for mild steel, in which there are an upper and a lower yield point. Determination of the 0.2% offset yield is also shown in (A).

that v must decrease if a constant strain rate is being maintained. However, there is a relationship between dislocation velocity and stress that is expressed as:

v/v0 = (σ/σ0)^n    (13)

where v0 and σ0 are constants and n is an exponent on the order of 30 or 40, depending on the material. Now if v decreases, then Equation 13 shows that the stress σ will decrease, resulting in a load drop. However, if the control mode is a given rate of load application (i.e., load control), then no load drop will be observed: when the density of mobile dislocations increases, the machine will simply move faster in order to maintain the constant rate of loading.

PRACTICAL ASPECTS OF THE METHOD

The Basic Tensile Test

The basic tensile test consists of placing a specimen in a test frame, loading the specimen under specified conditions, and measuring the loads and corresponding displacements. The fundamental result of the tensile test is a curve relating the simultaneous loads and displacements, called the load-displacement record. The load-displacement record is converted to a stress/strain curve by dividing the load by the cross-sectional area and the elongation by the gage length of the specimen, as discussed in the following sections.

To carry out a tensile test, four essential components are required:

A specimen of appropriate geometry;
A machine to apply and measure the load;
A device to measure the extension of the specimen;
An instrument to record the simultaneous values of load and displacement.

Testing may be done at high or low temperatures and in a variety of environments other than air. Accessories are
required to carry out such tests, which would include a high-temperature furnace, an extensometer adapted for use with such a furnace, and possibly a chamber for testing in inert or aggressive environments or under vacuum.

Typically, the following information can be obtained from the load-displacement record of a tensile test or from the specimen:

Young's modulus, E;
The yield strength, σys, or, more conventionally, the 0.2% offset yield as defined in Figure 9A (Schaffer et al., 1999);
The strength coefficient and strain-hardening exponent (Eq. 7);
The ultimate tensile strength (σuts = Pmax/A0, where Pmax is the maximum load) as defined in Figures 1B and 2;
The total strain to failure, εf = (lf − l0)/l0;
The percent reduction in area, %RA = (A0 − Af)/A0 × 100 (where A0 and Af are the initial and final cross-sectional areas, respectively).

The ASTM has developed a standard test procedure for tensile testing (ASTM, 1987, E 8 and E 8M) that provides details on all aspects of this test procedure. The reader is strongly encouraged to consult the following documents related to tensile testing, calibration, and analysis of test results (all of which can be found in ASTM, 1987):

Designation E 8: Standard Methods of Tension Testing of Metallic Materials (pp. 176–198).
Designation E 8M: Standard Methods of Tension Testing of Metallic Materials [Metric] (pp. 199–222).
Designation E 4: Standard Practices for Load Verification of Testing Machines (pp. 119–126).
Designation E 83: Standard Practice for Verification and Classification of Extensometers (pp. 368–375).
Designation E 1012: Standard Practice for Verification of Specimen Alignment Under Tensile Loading (pp. 1070–1078).
Designation E 21: Standard Recommended Practice for Elevated Temperature Tests of Metallic Materials (pp. 272–281).
Designation E 74: Standard Practices for Calibration of Force-Measuring Instruments for Verifying the Load Indication of Testing Machines (pp. 332–340).
Designation E 111: Standard Test Method for Young's Modulus, Tangent Modulus, and Chord Modulus (pp. 344–402).

The specimens and apparatus used in testing are considered in the sections that follow.
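A minimal post-processing sketch, using entirely hypothetical specimen dimensions and loads (none taken from the text or the standards), shows how several of the quantities listed above follow directly from the raw measurements.

```python
# Hypothetical post-processing of a tensile test: all dimensions and loads
# below are invented for illustration.

A0 = 50.0e-6        # initial cross-sectional area, m^2
Af = 35.0e-6        # final (fracture) cross-sectional area, m^2
l0 = 0.050          # initial gage length, m
lf = 0.062          # final gage length, m
P_max = 25000.0     # maximum load, N

S_uts = P_max / A0                    # ultimate tensile strength, Pa
e_f = (lf - l0) / l0                  # total engineering strain to failure
pct_RA = 100.0 * (A0 - Af) / A0       # percent reduction in area

print(S_uts, e_f, pct_RA)   # ~5.0e8 Pa, ~0.24, ~30.0
```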
Specimen Geometry The effect of geometry is also significant. As discussed above, the total elongation is made up of a uniform and a nonuniform component. Since the nonuniform component is localized, its effect on the overall elongation will be less for specimens having a long gage section. This means that the total strain to failure for a given material will be less for a long specimen of given diameter than it will be for a short specimen of the same diameter. Thus care must be exercised when comparing test results to ensure that geometrically similar specimens were used. It is standard in the US to use specimens in which the ratio of the length to diameter is 4:1. Since the total strain depends upon this ratio, comparisons can be made between different specimen sizes provided that this ratio is maintained. A more fundamental measure of the ductility is the percent reduction in area (%RA), which is independent of the specimen diameter. Typical specimen geometries are illustrated in ASTM (1987), E 83.
Test Machines, Extensometers, and Test Machine Characteristics

Test Machine. Testing is carried out using machines of varying degrees of sophistication. In its most basic form, a machine consists of:

a. A load cell attached to an upper support and a linkage to connect the load cell to the specimen;
b. A lower crosshead (or piston) that connects to the specimen via another linkage; and
c. A means to put the lower crosshead or piston into motion and thereby apply a force to the specimen.

The applied forces are transmitted through a load frame, which may consist of two or more columns, and the lower crosshead or piston may be actuated either mechanically or hydraulically. The load cell consists of a heavy metal block onto which strain gages are attached, usually in the configuration of a Wheatstone bridge. As force is applied, the load cell suffers displacement, and this displacement is calibrated to the applied load (ASTM, 1987, E 4). Most modern test machines have sophisticated electronic controls to aid in applying precise load-time or displacement-time profiles to the specimen.
Extensometers. The extension of the specimen must be measured with an extensometer in order to obtain the displacement corresponding to a given load. There are various ways to measure small displacements with great accuracy. One technique that has become very popular is the so-called clip-on gage. This gage consists of two spring arms attached to a small block. Strain gages are attached to both spring arms and are connected to form a Wheatstone bridge, similar to the load cell. However, the calibration is done in terms of displacement (ASTM, 1987, E 83), and it is possible to measure very small displacements with great accuracy using this instrument.

While some indication of the specimen deformation can be obtained by monitoring the displacement of the crosshead (or hydraulic ram, depending on the nature of the test machine), it is, of course, preferable to attach an extensometer directly to the specimen. In this way the extension of the specimen is measured unambiguously. Two problems arise if the extension of the specimen is equated to the displacement of the lower crosshead or piston: (1) the gage length must be assumed to be the region between the shoulders of the specimen and, more importantly, (2) deflection in the load train occurs to a rather significant degree. Thus, unless the machine stiffness is known and taken into account, the extension of the specimen will be overestimated. In addition, even if the machine stiffness is accounted for, the rate of straining of the specimen will be variable throughout the plastic region, since the proportion of machine deflection to specimen deflection changes in a nonlinear way. Given these sources of error, it is highly desirable to measure the specimen deflection directly using an extensometer. Of course this is not always possible, especially when testing in severe environments and/or at high temperatures. In such cases other techniques must be used to obtain reasonable estimates of the strain in the specimen.
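The compliance correction described above can be sketched as follows; k_machine is an assumed, separately calibrated load-train stiffness, and all numbers are hypothetical.

```python
# Sketch of the compliance problem discussed above: if crosshead travel is
# used in place of an extensometer, the load train's own deflection
# (load / machine stiffness) must be subtracted. Units here: mm for
# displacement, kN for load, kN/mm for stiffness; values are hypothetical.

def specimen_extension(crosshead_disp, load, k_machine):
    """Estimate specimen extension by subtracting load-train deflection."""
    return crosshead_disp - load / k_machine

# 1.00 mm of crosshead travel at 20 kN with a 40 kN/mm load train:
dx = specimen_extension(1.00, 20.0, 40.0)
print(dx)   # 0.5 -- half the apparent extension was machine deflection
```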
Typical extensometers are shown in Martin (1985).

Testing Machine Characteristics and Testing Mode. The characteristics of the testing machine are also very important. If a machine is "soft" (i.e., there is considerable deflection in the load train during testing), then events that would otherwise give rise to load drops (such as initial yielding in low-carbon steels, as discussed previously) can be masked. In essence, the machine springs back during the event that would otherwise cause a load drop and tends to maintain, at least to some degree, a constant load. Similarly, testing under conditions of load control would fully mask load drops, while testing under strain control would, by the same token, maximize the observability of load-drop phenomena. The preceding brief discussion illustrates that due care must be exercised when carrying out a test in order to obtain the maximum amount of information.

Testing at Extreme Temperatures and in Controlled Environments

While it is clearly not possible to consider all possible combinations of test temperature and environment, a few general comments on the subject of testing at low and high
temperatures and in environments other than ambient are in order. These are important in various advanced technology applications. For example, jet and rocket engines operate at temperatures that approach 1500 K, power generation systems operate at temperatures only slightly lower, and cryogenic applications such as refrigeration systems operate well below room temperature. Furthermore, a wide range of temperatures is encountered in the operation of ships and planes. High-Temperature Testing. Care must be exercised to assure a uniform temperature over the gage length of the specimen. For practical purposes, the temperature should not vary by more than a few degrees over the entire gage length. If the specimen is heated using a resistance furnace, a so-called ‘‘chimney effect’’ can occur if the top of the furnace is open. This occurs because hot air rises, and as it rises in the tube of the furnace, cold air is drawn in, which tends to cool the bottom of the specimen and create an excessive temperature gradient. This effect can be reduced by simply providing shutters at the top of the furnace that just allow the force rods to enter into the heating chamber but that block the remaining open area. Testing in a resistance furnace is also complicated by the way in which the extensometer is attached to the specimen and by the need to avoid temperature gradients. A small access hole is provided in the furnace and probes of alumina or quartz are attached to the specimen to define the gage length. These arms are then attached to the clip-on gage mentioned previously to measure specimen displacement. In some cases, ‘‘divots’’ are put in the specimen to assure positive positioning of the probe arms. However, this practice is not recommended since the divots themselves may affect the test results, especially the measured ductility. 
It is possible to adjust the radius of curvature of the probe arms and the tension holding the extensometer to the specimen so that slippage is avoided. A very popular way of heating the specimen is to use an induction generator and coil. With careful design of the coil, a very constant temperature profile can be established, and the extensometer can easily be secured to the specimen by putting the probe arms through the coil.

Low-Temperature Testing. Problems similar to those discussed above apply to low-temperature testing. Refrigeration units with circulating gases or fluids can be used, but constant mixing is required to avoid large temperature gradients. Uniform temperatures can be achieved with various mixtures of liquids and dry ice. For example, dry ice and acetone will produce a constant-temperature bath, but the temperature is limited. Extensometry becomes even more difficult than at high temperatures if a fluid bath is used. In such instances, an arrangement in which a cylindrical linear variable differential transformer (LVDT)-based extensometer is attached to the specimen may be useful. The body of the LVDT is attached to the bottom of the gage section and the core to the top. As the specimen elongates, a signal is generated that is proportional to the relative displacement of the core and the LVDT body.
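As a small illustration of the LVDT arrangement just described, the output signal is proportional to the core/body displacement, so a calibrated sensitivity converts voltage to displacement and then to strain. This is a hypothetical sketch; the function name, sensitivity, and gage length are illustrative assumptions, not values from the text.

```python
# Hypothetical helper for an LVDT-based extensometer: the signal is
# proportional to the relative displacement of core and body, so a
# calibrated sensitivity (V/mm, assumed) yields displacement, and
# dividing by the gage length yields strain.

def lvdt_strain(voltage, sensitivity_v_per_mm, gage_length_mm):
    displacement_mm = voltage / sensitivity_v_per_mm
    return displacement_mm / gage_length_mm

# Example: 0.5 V from a 2.0 V/mm transducer over a 25-mm gage length
strain = lvdt_strain(0.5, 2.0, 25.0)
```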
Environmental Testing. Tensile testing is also carried out in various environments or in vacuum. Since environmental attack is accelerated at high temperatures, such testing is often done at high temperature. In such instances, an environmental chamber is used that can be evacuated and then, if appropriate, back-filled with the desired environment (e.g., oxygen in a neutral carrier gas at a prescribed partial pressure). All of the comments made relative to extensometry and temperature control apply here as well, with the added complication of the apparatus used to provide environmental control. In addition to measuring the temperature, it is highly desirable to measure the gaseous species present. These may arise from so-called "internal" leaks (e.g., welds that entrap gases but have a very small hole allowing the gases to escape) or from impure carrier gases. Gaseous species can be measured by incorporating a mass spectrometer into the environmental chamber.

METHOD AUTOMATION

As discussed above, the basic needs are a machine to apply a force, an instrument to measure the extension of the specimen, and a readout device to record the experimental results. However, it is frequently desirable to apply the load to the specimen in a well-defined manner. For example, it is well known that materials are sensitive to the rate at which strain is applied, and it is thus important to be able to load the specimen in such a way as to maintain a constant strain rate. In essence, this requirement imposes two conditions on the test:

a. The control mode must be of the displacement type; and
b. The displacement that is measured and controlled must be specimen displacement, as opposed to displacement of the crosshead or hydraulic ram.

In this situation the following are required:

a. An extensometer attached directly to the specimen, and
b. A machine with the ability to compare the desired strain/time profile to the actual strain/time profile and make instantaneous adjustments in the strain rate so as to minimize, as far as possible, the differences between the command signal (i.e., the desired strain rate) and the resultant signal (i.e., the actual strain rate).

Clearly these conditions cannot be met if the test machine can only move the lower crosshead at a predetermined rate, since this does not take deflection in the load train into account, as discussed above. The control mode just described is referred to as "closed-loop control," and it is the way in which most modern testing is carried out. Modern machines are instrumented in such a way as to be able to operate in strain control, load control, and displacement control modes. In addition, manufacturers now supply controllers in which a combined signal may
be used as the control mode. This is called combinatorial control; an example would be loading a specimen such that the rate of stress times strain (i.e., power) is constant. A schematic of the closed-loop control concept is provided in Martin (1985).

It is important to realize that the results obtained can depend quite sensitively on the control mode. Essentially, the results of a tensile test reflect not only the material's intrinsic properties but also, more fundamentally, the interaction between the material being tested and the test machine. This is perhaps best illustrated for mild steel. If tested under strain control (or under displacement control, if the test machine is very stiff relative to the specimen), this material will exhibit an upper and a lower yield point. On the other hand, the upper and lower yield points are completely eliminated if the specimen is tested under load control and the machine is very compliant relative to the specimen. The yield-point phenomenon is illustrated in Figure 5; reasons for this behavior are discussed above (see Principles of the Method).

… the fibers. In this case, vacuum equipment may be used, and care must be taken to ensure proper venting of hazardous fumes. An additional requirement for obtaining reliable results is that the load cell of the test machine and the specimen be coaxially aligned. This eliminates bending moments and produces a result in which only tensile forces are measured. Techniques for alignment are discussed in detail in ASTM (1987), E 1012.
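The closed-loop idea described above can be illustrated with a toy simulation: a proportional controller adjusts the velocity command so that the measured specimen strain rate tracks the commanded constant rate, compensating for load-train compliance. The machine model, gain, and all numbers below are illustrative assumptions, not values from the text.

```python
# Toy sketch of closed-loop ("strain control") operation.  The compliance
# model, controller gain, and time step are illustrative assumptions.

class ToyMachine:
    """Soft load train: the specimen sees only part of the ram motion."""
    def __init__(self, compliance_factor=0.8):
        self.strain = 0.0
        self.factor = compliance_factor

    def step(self, ram_velocity, dt):
        self.strain += self.factor * ram_velocity * dt
        return self.strain

def run_constant_rate_test(machine, target_rate=1e-3, duration=1.0,
                           dt=1e-3, gain=5.0):
    """Proportionally adjust the command so the measured specimen strain
    rate tracks the commanded (constant) rate."""
    command = target_rate          # naive initial command, ignores compliance
    prev = machine.strain
    for _ in range(round(duration / dt)):
        strain = machine.step(command, dt)
        actual_rate = (strain - prev) / dt
        # closed loop: shrink the difference between the command signal
        # (desired rate) and the resultant signal (actual rate)
        command += gain * (target_rate - actual_rate) * dt
        prev = strain
    return machine.strain

final_strain = run_constant_rate_test(ToyMachine())
```

With a compliant machine the controller gradually raises the command until the specimen, rather than the crosshead, moves at the desired rate; an open-loop (fixed-command) test would undershoot the target strain.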
DATA ANALYSIS AND INITIAL INTERPRETATION

As previously mentioned, the fundamental result of a tensile test is a load-displacement curve. Since the load required to bring a specimen to a given state (e.g., to fracture) depends upon the cross-sectional area, load is not a fundamental measure of material behavior; it must be converted to stress by dividing by the cross-sectional area. Similarly, the extension depends upon the actual length of the specimen and is also not a fundamental quantity. Elongation is put on a more fundamental basis by dividing by the length of the specimen, and the resulting quantity is called strain. The result of a tensile test is then a stress/strain curve, as shown in Figures 1B and 2. There are typically three distinct regions to a stress/strain curve, which have been discussed above (see Principles of the Method).

When comparing test results, it is generally found that the yield and tensile strengths are independent of specimen geometry. This is not the case, however, for the percent elongation, which is often used as a measure of ductility. For specimens of circular cross-section, the percent elongation is higher the smaller the ratio of gage length to diameter. This can be understood in part by noting that the elongation associated with necking is constant for a given diameter; thus, the contribution of the nonuniform deformation to the total percent elongation increases as the gage length decreases. Since specimens of different length/diameter ratio are used in different countries, care must be exercised in making comparisons. The percent reduction in area (%RA), however, is independent of diameter. Since this quantity is a good measure of ductility and is independent of diameter, it is recommended for making ductility comparisons.
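The conversions just described can be sketched as follows; the function names and example numbers are illustrative assumptions, not from the text.

```python
# Minimal sketch of tensile-test data reduction: engineering stress and
# strain from a load-displacement record, plus the two ductility
# measures discussed above.  All numbers are illustrative.

def engineering_stress_strain(loads, displacements, area, gage_length):
    """Stress = load / original area; strain = elongation / gage length."""
    stresses = [p / area for p in loads]
    strains = [d / gage_length for d in displacements]
    return stresses, strains

def percent_elongation(final_gage_length, gage_length):
    """Geometry-dependent ductility measure."""
    return 100.0 * (final_gage_length - gage_length) / gage_length

def percent_reduction_in_area(initial_area, final_area):
    """%RA: the geometry-independent ductility measure recommended above."""
    return 100.0 * (initial_area - final_area) / initial_area

# Example: 10 kN on a 1 cm^2 specimen with a 50-mm gage length
stresses, strains = engineering_stress_strain([10e3], [0.5e-3], 1.0e-4, 0.050)
```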
The scatter associated with such tests is very small.

HIGH-STRAIN-RATE TESTING OF MATERIALS

For strain rates > 200 s⁻¹, alternate techniques using projectile-driven impacts to induce stress-wave propagation have been developed. Chief among these techniques is the split-Hopkinson pressure bar, which is capable of achieving the highest uniform uniaxial stress loading of a specimen in compression at nominally constant strain rates of the order of 10³ s⁻¹. Stress is measured by using an elastic element in series with the specimen of interest. Stress waves are generated via an impact event, and the elastic elements utilized are long bars such that the duration of the loading pulse is less than the wave transit time in the bar. Utilizing this technique, the dynamic stress-strain response of materials at strain rates up to 2 × 10⁴ s⁻¹ and true strains of 0.3 can be readily achieved in a single test.

Historical Background

The Hopkinson bar technique is named after Bertram Hopkinson (Hopkinson, 1914), who in 1914 used induced wave propagation in a long elastic bar to measure the pressures produced during dynamic events. Through the use of momentum traps of different lengths, Hopkinson studied the shape and evolution of stress pulses as they propagated down long rods as a function of time. Based on this pioneering work, the experimental apparatus utilizing elastic stress-wave propagation in long rods to study dynamic processes was named the Hopkinson pressure bar. Later work by Davies (1948a,b) and Kolsky (1949) utilized two Hopkinson pressure bars in series, with the sample sandwiched in between, to measure the dynamic stress-strain response of materials. (A note on nomenclature: the terms input/incident bar and output/transmitted bar will be used interchangeably.) This technique has thereafter been referred to as the split-Hopkinson pressure bar (Lindholm and Yeakley, 1968; Follansbee, 1985), the Davies bar (Kolsky, 1964), or the Kolsky bar (Kolsky, 1949; Follansbee, 1985).
This unit describes the techniques involved in measuring the high-strain-rate stress-strain response of materials in compression utilizing a split-Hopkinson pressure bar, hereafter abbreviated as SHPB. Emphasis will be given to the method for collecting and analyzing compressive high-rate mechanical property data and to discussion of the critical experimental variables that must be controlled to yield valid and reproducible high-strain-rate stress-strain data.

Competitive and Related Techniques

In addition to the original split-Hopkinson pressure bar developed to measure the compressive response of a material, the Hopkinson technique has been modified for loading samples in uniaxial tension (Harding et al., 1960;
Lindholm and Yeakley, 1968), torsion (Duffy et al., 1971), and simultaneous torsion-compression (Lewis and Goldsmith, 1973). The basic theory of bar data reduction based upon one-dimensional stress-wave analysis, as presented below (see Principles of the Method), is common to all three loading states. Of the different Hopkinson bar techniques (compression, tension, and torsion), the compression bar remains the most readily analyzed and least complex method of achieving a uniform high-rate stress state. The additional complications encountered in the tensile and torsional Hopkinson techniques are related to (1) the modification of the pressure bar ends to accommodate gripping of complex samples, which can alter wave propagation in the sample and bars; (2) the potential need for additional diagnostics to calculate true stress; (3) an increased need to accurately incorporate inertial effects into data reduction to extract quantitative material constitutive behavior; and (4) the more complicated stress-pulse generation systems required for tensile and torsion bars.

Alteration of the bar ends to accommodate threaded or clamped samples leads to complex boundary conditions at the bar/specimen interface and therefore introduces uncertainties in the wave-mechanics description of the test (Lindholm and Yeakley, 1968). When complex sample geometries are used, signals measured in the pressure bars record the structural response of the entire sample, not just the gauge section, where plastic deformation is assumed to be occurring. When plastic strain occurs in the sections adjacent to the sample's uniform gauge area, accurate determination of the stress-strain response of the material is more complicated. In these cases, additional diagnostics, such as high-speed photography, are mandatory to quantify the loaded section of the deforming sample.
In the tensile bar case, an additional requirement is exact quantification of the sample cross-sectional area as a function of strain, which is necessary to obtain true-stress data. A further complexity inherent to both the tension and torsion Hopkinson loading configurations is the increased sample dimensions required. Valid dynamic characterization of many material product forms, such as thin sheet materials and small-section bar stock, may be significantly complicated or completely impractical using either tensile or torsion Hopkinson bars due to an inability to fabricate test samples.

High-rate tensile loading of a material may also be conducted utilizing an expanding ring test (Hoggatt and Recht, 1969; Gourdin et al., 1989). This technique, which requires very specialized equipment, employs the sudden radial acceleration of a ring due to detonation of an explosive charge or electromagnetic loading. Once loaded via the initial impulse, the ring expands radially and thereafter decelerates due to its own internal circumferential stresses. While this technique has been utilized to determine the high-rate stress-strain behavior of a material, it is complicated by the fact that the strain rate changes throughout the test. This variable rate is determined first by the initial loading history, related to the initial shock and/or magnetic pulse, and then by the rapid strain-rate deceleration during the test following the initial
impulse. The varied difficulties of the expanding ring technique, along with the expense and the constraints on sample size and shape, have limited its use as a standard technique for quantitative measurement of dynamic tensile constitutive behavior.

The remaining alternate method of probing the mechanical behavior of materials at high strain rates, of the order of 10³ s⁻¹, is the Taylor rod impact test. This technique, named after its developer G. I. Taylor (1948), entails firing a solid cylinder of the material of interest against a massive, rigid target, as shown schematically in Figure 1. The deformation induced in the rod by the impact shortens the rod as radial flow occurs at the impact surface. The fractional change in rod length can then, assuming a one-dimensional rigid-plastic analysis, be related to the dynamic yield strength. By measuring the overall length of the impacted cylinder and the length of the undeformed (rear) section of the projectile, the dynamic yield stress of the material can be calculated according to the formula (Taylor, 1948):
σ = ρV²(L − X) / [2(L − L1) ln(L/X)]    (1)
where σ is the dynamic yield stress of the material, ρ is the material's density, V is the impact velocity, and L, X, and L1 are the dimensional quantities of the bar length and deformed lengths as defined in Figure 1.

The Taylor test offers an apparently simple method of ascertaining some information concerning the dynamic strength properties of a material. However, it represents an integrated test, rather than a unique experiment with a uniform stress state or strain rate like the split-Hopkinson pressure bar test. Accordingly, the Taylor test has been most widely used as a validation experiment in concert with two-dimensional finite-element calculations. In this approach, the final length and cylinder profile of the Taylor sample are compared with code simulations to validate the material constitutive model implemented in the finite-element code (Johnson and Holmquist, 1988; Maudlin et al., 1995, 1997). Comparisons with the recovered
Taylor sample provide a check on how accurately the code can calculate the gradient in deformation stresses and strain rates leading to the final strains imparted to the cylinder during the impact event.

New Developments

The split-Hopkinson pressure bar technique, as a tool for quantitative measurement of the high-rate stress-strain behavior of materials, is far from static; many improvements are still evolving. One- and two-dimensional finite-element models of the split-Hopkinson pressure bar have proven their ability to simulate test parameters and allow pretest setup validation checks as an aid to planning. Novel methods of characterizing sample diametrical strains are being developed using optical diagnostic techniques (Valle et al., 1994; Ramesh and Narasimhan, 1996). Careful attention to controlling wave reflections in the SHPB has also opened new opportunities to study defect/damage evolution in brittle materials during high-rate loading histories (Nemat-Nasser et al., 1991). Finally, researchers are exploring new methods for in situ dispersion measurements on pressure bars (Wu and Gorham, 1997), which offer opportunities for increased signal resolution in the future.
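The Taylor formula (Equation 1 above) is straightforward to evaluate numerically. The sketch below assumes SI units, and the example inputs (a copper-like rod) are illustrative, not measured values from the text.

```python
import math

# Numerical form of Equation 1 (Taylor, 1948).  Inputs: rho = density
# (kg/m^3), v = impact velocity (m/s), length = initial rod length,
# x = undeformed (rear) section length, l1 = final overall length
# (all in m, as defined in Figure 1).  Example values are illustrative.

def taylor_dynamic_yield_stress(rho, v, length, x, l1):
    return (rho * v**2 * (length - x)
            / (2.0 * (length - l1) * math.log(length / x)))

# 100-mm rod at 200 m/s, recovered at 85 mm overall with a 70-mm
# undeformed rear section (copper-like density)
sigma = taylor_dynamic_yield_stress(8930.0, 200.0, 0.100, 0.070, 0.085)
```

The result is on the order of 1 GPa for these inputs; the strong V² dependence makes accurate velocity measurement essential.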
Figure 1. Schematic of Taylor impact test showing the initial and final states of the cylindrical sample (Taylor, 1948).

PRINCIPLES OF THE METHOD

The determination of the stress-strain behavior of a material being tested in a Hopkinson bar, whether it is loaded in compression as in the present instance or in a tensile bar configuration, is based on the same principles of one-dimensional elastic-wave propagation within the pressure loading bars (Lindholm, 1971; Follansbee, 1985). As identified originally by Hopkinson (1914) and later refined by Kolsky (1949), the use of a long elastic bar to study high-rate phenomena in materials is feasible using remote elastic-bar measures of sample response, because the wave-propagation behavior in such a geometry is well understood and mathematically predictable. Accordingly, the displacements or stresses generated at any point can be deduced by measuring the elastic wave at any point, x, as it propagates along the bar.

In what follows, the subscripts 1 and 2 will be used to denote the incident and transmitted sides of the specimen, respectively. The strains in the bars will be designated εi, εr, and εt, and the displacements of the ends of the specimen u1 and u2 at the input bar/specimen and specimen/output bar interfaces, as given schematically in the magnified view of the specimen in Figure 2. From elementary wave theory, it is known that the solution to the wave equation

∂²u/∂x² = (1/cb²) ∂²u/∂t²    (2)

can be written

u = f(x − cb t) + g(x + cb t) = ui + ur    (3)
Figure 2. Expanded view of input bar/specimen/output bar region.

for the input bar, where f and g are functions describing the incident and reflected wave shapes and cb is the wave speed in the rod. By definition, the one-dimensional strain is given by

ε = ∂u/∂x    (4)

so differentiating Equation 3 with respect to x, the strain in the incident rod is given by

ε = f′ + g′ = εi + εr    (5)

Differentiating Equation 3 with respect to time and using Equation 5 gives

u̇ = cb(−f′ + g′) = cb(−εi + εr)    (6)

for the input bar. Since the output bar has only the transmitted wave, u = h(x − cb t), propagating in it,

u̇ = −cb εt    (7)

in the output bar. Equations 6 and 7 are true everywhere, including at the ends of the pressure bars. The strain rate in the specimen is, by definition, given by

ε̇ = (u̇1 − u̇2)/ls    (8)

where ls is the instantaneous length of the specimen and u̇1 and u̇2 are the velocities at the incident bar/specimen and specimen/output bar interfaces, respectively. Substituting Equations 6 and 7 into Equation 8 gives

ε̇ = cb(−εi + εr + εt)/ls    (9)

By definition, the forces in the two bars are

F1 = AE(εi + εr)    (10)

F2 = AEεt    (11)

where A is the cross-sectional area of the pressure bar and E is the Young's modulus of the bars (normally equal, given that identical material is used for the input and output bars).

Assuming that after an initial ringing-up period (the period during which the forces on the ends of the specimen become equal, whose duration depends on the sample sound speed and sample geometry) the specimen is in force equilibrium, and assuming that the specimen is deforming uniformly, a simplification can be made equating the forces on each side of the specimen, i.e., F1 = F2. Comparing Equations 10 and 11 therefore means that

εt = εi + εr    (12)

Substituting this criterion into Equation 9 yields

ε̇ = 2cb εr/ls    (13)

The stress is calculated from the strain-gauge measure of the transmitted force divided by the instantaneous cross-sectional area of the specimen, As:

σ(t) = AEεt/As    (14)

Utilizing Equations 13 and 14 to determine the dynamic stress-strain curve of a material is termed a "one-wave" analysis, because it uses only the reflected wave for strain in the sample and only the transmitted wave for stress in the sample; hence it assumes that stress equilibrium is assured in the sample. Conversely, the stress in the sample at the incident bar/sample interface can be calculated using a momentum balance of the incident and reflected wave pulses, termed a "two-wave" stress analysis since it is a summation of the two waves at this interface. However, such a condition cannot be correct at the early stages of the test because of the transient that occurs when loading starts at the input bar/specimen interface while the other face remains at rest; time is required for stress-state equilibrium to be achieved. Numerous researchers have adopted a "three-wave" stress analysis that averages the forces on both ends of the specimen to track the ring-up of the specimen to a state of stable stress. The term "three-wave" indicates that all three waves are used to calculate an average stress in the sample: the transmitted wave to calculate the stress at the specimen/transmitted-bar interface (back stress), and the combined incident and reflected pulses to calculate the stress at the incident bar/specimen interface (front stress). In the three-wave case, the specimen stress is then simply the average of the two forces divided by the combined interface areas:

σ(t) = [F1(t) + F2(t)]/(2As)    (15)

Substituting Equations 10 and 11 into Equation 15 then gives

σ(t) = AE(εi + εr + εt)/(2As)    (16)
From these equations, the stress-strain curve of the specimen can be computed from the measured reflected and transmitted strain pulses as long as the volume of the specimen remains constant, i.e., A0 l0 = As ls (where l0 is the original length of the specimen and A0 its original cross-sectional area), and the sample is free of barreling (i.e., friction effects are minimized). (Note: the stipulation of constant volume precludes, by definition, the testing of foams or porous materials.)
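The one- and three-wave reductions (Equations 13, 14, and 16) can be sketched as below. The function name and the synthetic, time-aligned gauge signals are illustrative assumptions; real records must first be shifted in time to the specimen interfaces.

```python
# Sketch of SHPB data reduction (Eqs. 13, 14, 16).  Constant specimen
# volume is assumed (As*ls = A0*l0), as required by the analysis above.
# Signs are taken such that compressive strain in the specimen is
# positive; all example values are illustrative.

def shpb_reduce(eps_i, eps_r, eps_t, dt, cb, a_bar, e_bar, l0, a0):
    """Return strain-rate, strain, one-wave, and three-wave stress
    histories from time-aligned incident/reflected/transmitted signals."""
    rates, strains, s1, s3 = [], [], [], []
    eps = 0.0
    for ei, er, et in zip(eps_i, eps_r, eps_t):
        ls = l0 * (1.0 - eps)            # instantaneous specimen length
        a_s = a0 * l0 / ls               # instantaneous area (constant volume)
        rate = 2.0 * cb * er / ls        # Eq. 13: rate from reflected wave
        eps += rate * dt                 # integrate reflected wave for strain
        rates.append(rate)
        strains.append(eps)
        s1.append(a_bar * e_bar * et / a_s)                      # Eq. 14
        s3.append(a_bar * e_bar * (ei + er + et) / (2.0 * a_s))  # Eq. 16
    return rates, strains, s1, s3
```

When the specimen is in equilibrium (εt = εi + εr, Equation 12), the one- and three-wave stresses coincide; their divergence early in a real record is a direct picture of the ring-up transient.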
PRACTICAL ASPECTS OF THE METHOD

While there is no worldwide standard design for a split-Hopkinson pressure bar test apparatus, the various designs share common features. A compression bar test apparatus consists of (1) two long, symmetric, high-strength bars; (2) bearing and alignment fixtures to allow the bars and striking projectile to move freely but in precise axial alignment; (3) a gas gun or alternate device for accelerating a projectile to produce a controlled compressive wave in the input bar; (4) strain gauges mounted on both bars to measure the stress-wave propagation; and (5) the associated instrumentation and data-acquisition system to control, record, and analyze the wave data. A short sample is sandwiched between the input and output bars, as shown schematically in Figure 3. The use of a bar on each side of the material sample allows measurement of the displacement, velocity, and/or stress conditions, and therefore provides an indication of the conditions on each end of the sample.

The bars used in a Hopkinson bar setup are most commonly constructed from a high-strength material: AISI/SAE 4340 steel, maraging steel, or a nickel alloy such as Inconel. This is because the yield strength of the pressure-bar material determines the maximum stress attainable within the deforming specimen. Bars made of Inconel, whose elastic properties are essentially invariant up to 600°C, are often utilized for elevated-temperature Hopkinson bar testing. Because a lower-modulus material increases the signal-to-noise level, the selection of a material with lower strength and lower elastic modulus for the bars is sometimes desirable to facilitate high-resolution dynamic testing of low-strength materials such as polymers or foams. Researchers have utilized bar materials
Figure 3. Schematic of a split-Hopkinson bar.
ranging from maraging steel (210 GPa) to titanium (110 GPa), aluminum (90 GPa), magnesium (45 GPa), and finally polymer bars (<20 GPa) (Gary et al., 1995, 1996; Gray et al., 1997).

The length, l, and diameter, d, of the pressure bars are chosen to meet a number of criteria for test validity as well as the maximum strain rate and strain level desired. First, the length of the pressure bars must assure one-dimensional wave propagation for a given pulse length; for experimental measurements on most engineering materials this requires 10 bar diameters. To allow wave reflection, each bar should exceed an l/d ratio of 20. Second, the maximum strain rate desired must be considered in selecting the bar diameter: the highest-strain-rate tests require the smallest-diameter bars. The third consideration affecting the selection of the bar length is the total strain to be imparted to the specimen; the absolute magnitude of this strain is related to the length of the incident wave. The pressure bar must be at least twice as long as the incident wave if the incident and reflected waves are to be recorded without interference. In addition, since the bars remain elastic during the test, the displacement and velocity of the bar interface between the sample and the bar can be accurately determined. Depending on the sample size, for strains >30% it may be necessary for the split-Hopkinson bars to have an l/d ratio of 100 or more.

For proper operation, split-Hopkinson bars must be physically straight, free to move without binding, and carefully mounted to assure optimum axial alignment. Precision bar alignment is required for both uniform and one-dimensional wave propagation within the pressure bars as well as for uniaxial compression within the specimen during loading.
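The geometric criteria above lend themselves to a simple design check. The l/d ≥ 20 and "bar at least twice the incident wave length" criteria come from the text; the relation that the incident pulse is twice the striker length is a standard elastic-impact result assumed here, and the function name and example numbers are illustrative.

```python
# Sketch of the pressure-bar design checks described above.
# Assumption: incident pulse length = 2 x striker length (standard
# elastic-impact result, not stated explicitly in the text).

def check_bar_design(bar_length, bar_diameter, striker_length):
    incident_wave_length = 2.0 * striker_length
    return {
        "l_over_d_ok": bar_length / bar_diameter >= 20.0,       # wave reflection
        "no_overlap_ok": bar_length >= 2.0 * incident_wave_length,  # no pulse overlap
    }

# Example: a 2-m x 19-mm pressure bar with a 200-mm striker
result = check_bar_design(2.0, 0.019, 0.200)
```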
Bar alignment cannot be forced by overconstraining or forcefully clamping curved pressure bars in an attempt to "straighten" them, as such clamping violates the boundary conditions for one-dimensional wave propagation in an infinite cylindrical solid. Bar motion must not be impeded by the mounting bushings utilized; each bar must remain free to move readily along its axis. Accordingly, it is essential to apply precise dimensional specifications during construction and assembly. In typical installations, as illustrated schematically in Figure 3, the pressure bars are mounted to a common rigid base to provide a rigid, straight mounting platform. Individual mounting brackets with slip bearings through which the bars pass are typically spaced every 100 to 200 mm,
depending on the bar diameter and stiffness. Mounting brackets are generally designed so that they can be individually translated to facilitate bar alignment.

The most common method of generating an incident wave in the input bar is to propel a striker bar into the incident bar. The striker bar is normally fabricated from the same material, and is of the same diameter, as the pressure bars. The length and velocity of the striker bar are chosen to produce the desired total strain and strain rate within the specimen. While elastic waves can also be generated in an incident bar through the adjacent detonation of explosives at the free end of the incident bar, as Hopkinson did, it is more difficult to ensure a one-dimensional excitation within the incident bar by this means.

The impact of a striker bar with the free end of the incident bar develops a longitudinal compressive incident wave in this bar, designated εi, as denoted in Figure 3. Once this wave reaches the bar/specimen interface, a part of the pulse, designated εr, is reflected, while the remainder of the stress pulse passes through the specimen and, upon entering the output bar, is termed the transmitted wave, εt. The time of passage and the magnitude of these three elastic pulses through the incident and transmitted bars are recorded by strain gauges normally cemented at the midpoint positions along the length of the two bars. Figure 4 is an illustration of the data measured as a function of time for the three wave signals. The incident and transmitted wave signals represent compressive loading pulses, while the reflected wave is a tensile wave. Using the wave signals from the gauges on the incident and transmitted bars as a function of time, the forces and velocities at the two interfaces of the specimen can be determined. When the specimen is deforming uniformly, the strain rate within the specimen is directly proportional to the amplitude of the reflected wave.
Similarly, the stress within the sample is directly proportional to the amplitude
of the transmitted wave. The reflected wave is also integrated to obtain strain, which is plotted against stress to give the dynamic stress-strain curve for the specimen.

To analyze the data from a Hopkinson bar test, the system must be calibrated prior to testing. Calibration of the entire Hopkinson bar setup is obtained in situ by comparing the constant amplitude of a wave pulse with the impact velocity of the striker bar for each bar separately. Operationally, this is accomplished by applying a known-velocity pulse to the input bar, and then to the transmitted bar, with no sample present. Thereafter, the impact of the striker with the input bar in direct contact with the transmitted bar, with no specimen, gives the coefficient of transmission. Accurate measurement of the velocity, V, of the striker bar impact into a pressure bar can be obtained using the linear relationship

εj = V/(2cb)    (17)
where εj is the strain in the incident or transmitted bar, depending on which is being calibrated, and cb is the longitudinal wave speed in the bar. Equation 17 applies if the impacting striker bar and the pressure bar are of the same material and have the same cross-sectional area. Careful measurement of the striker velocity (using a laser interruption scheme, for example), in comparison to the elastic strain signal in a pressure bar, can then be used to calculate a calibration factor for the bar being calibrated.

Figure 4. Strain gauge data, after signal conditioning and amplification, from a Hopkinson bar test of a nickel alloy sample showing the three waves measured as a function of time. (Note that the transmitted wave position in time is arbitrarily superimposed on the other waveforms.)

Optimum data resolution requires careful design of the sample size for a given material, followed by selection of an appropriate striker bar length and velocity to achieve the test goals. Determination of the optimal sample length first requires consideration of the sample rise time, t, required for a uniform uniaxial stress state to equilibrate within the sample. It has been estimated (Davies and Hunter, 1963) that this requires three (actually π) reverberations of the stress pulse within the specimen. For a plastically deforming solid obeying the Taylor-von Karman theory, the rise time follows the relationship

t² > π² ρs ls²/(∂σ/∂ε)    (18)

where ρs is the density of the specimen, ls is the specimen length, and ∂σ/∂ε is the stage II work-hardening rate of the true-stress/true-strain curve for the material to be tested. For rise times less than that given by Equation 18, the sample should not be assumed to be deforming uniformly, and stress-strain data will accordingly be in error. One approach to achieving a uniform stress state during split-Hopkinson pressure-bar testing is to decrease the sample length, so that the rise time, t, from Equation 18 is as small as possible. Because other considerations of scale (see Sample Preparation) limit the range of l/d ratios appropriate for a specimen of a given material, the specimen length may not be decreased without a concomitant decrease in the specimen and bar diameters. The use of small-diameter bars … >5%, even though a stable strain rate is indicated throughout the entire test. The data in Figure 6 therefore substantiate the need to examine the technique of using thinner sample aspect ratios when studying the high-strain-rate constitutive response of low-sound-speed, dispersive materials (Gray et al., 1997; Wu and Gorham, 1997).

Because of the transient effects that are dominant during the ring-up until stress equilibrium is achieved (well
Figure 6. Comparison of room temperature stress-strain response of the polymer Adiprene-L100 (Gray et al., 1997) for a 6.35-mm long sample showing the one- and three-wave stress curves in addition to strain rate.
over 1% plastic strain in the sample), it is impossible to accurately measure the compressive Young's modulus of materials at high strain rates using the SHPB. The compressive Young's modulus of a material is best measured using ultrasonic techniques. Increased resolution of the ring-up during SHPB testing of materials with high sound speeds and/or low fracture toughness values can be achieved by directly measuring the strain in the sample via strain gauges bonded directly on the sample (Blumenthal, 1992). Testing of ceramics, cermets, thermoset epoxies, and geological materials requires accurate measurement of the local strains. The difficulty with this technique is that (1) reproducible gauge application on small samples is challenging and labor intensive, and (2) the specimens often deform to strains greater than the gauges can survive (nominally 5% strain), so each gauge can be used only once.

Testing as a Function of Temperature

The derivation of a robust model description of mechanical behavior often requires quantitative knowledge of the coincident influence of temperature variations. Accurate measurement of the high-rate response utilizing a SHPB at temperatures other than ambient, however, presents several technical and scientific challenges. Because the stress and strain as a function of time within the deforming specimen are determined by strain gauge measurements made on the elastic pressure bars, the longitudinal sound speed and elastic modulus of the pressure bars, both of which vary with temperature, are important parameters. The pronounced effect of temperature on the elastic properties of viscoelastic materials, which have been proposed as alternate bar materials to achieve increased stress resolution, has been a chief barrier to their adoption as a means to measure the high-rate response of polymers over a range of temperatures (Gray et al., 1997).
In this case, both a rate and temperature-dependent constitutive model for the elastomer pressure bar material itself would be required to reduce SHPB data for the sample of interest.
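The one-wave reduction referred to in connection with Figure 6, together with the Equation 18 rise-time estimate, can be sketched numerically. These are the standard Kolsky relations for SHPB analysis; the function names, argument lists, and the material values used below are illustrative assumptions, not taken from this article:

```python
import numpy as np

def shpb_one_wave(eps_t, eps_r, dt, E_b, c_b, A_b, A_s, l_s):
    """One-wave SHPB reduction: sample stress from the transmitted-bar
    strain record, strain rate from the reflected-bar record, and
    strain by time integration of the rate."""
    stress = E_b * (A_b / A_s) * np.asarray(eps_t)       # sample stress, Pa
    strain_rate = -2.0 * c_b * np.asarray(eps_r) / l_s   # sample strain rate, 1/s
    strain = np.cumsum(strain_rate) * dt                 # integrated sample strain
    return stress, strain_rate, strain

def min_rise_time(rho_s, l_s, hardening_rate):
    """Minimum rise time for stress equilibrium, from Equation 18:
    t^2 >= pi^2 * rho_s * l_s^2 / (dsigma/deps)."""
    return np.pi * l_s * np.sqrt(rho_s / hardening_rate)
```

For example, a 5-mm-long sample with a density of 7850 kg/m³ and a 2-GPa work-hardening rate gives a minimum rise time of roughly 31 μs; data earlier than this should not be interpreted as uniform deformation.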
MECHANICAL TESTING
The combined length of both pressure bars (which can easily exceed 2 m) makes it operationally impossible to heat or cool the entire bar assembly with any uniformity of temperature. Even if this were feasible, it would require a bar material capable of withstanding the high or low temperatures desired as well as the development of new temperature-calibrated strain gauges and robust epoxies to attach them to the bars. Operationally, therefore, the most common techniques for elevated-temperature testing include heating only the sample and perhaps a short section of the pressure bars and then correcting for the temperature-gradient effects on the properties of the pressure bars if they are significant. Researchers have shown that, based on an assessment of the temperature gradient, either estimated or measured with a thermocouple, the strain gauge signals can be corrected for the temperature-dependent sound velocity and elastic modulus in the bars (Lindholm and Yeakley, 1968). This procedure has been utilized at temperatures up to 613°C (Lindholm and Yeakley, 1968). The selection of a bar material, such as Inconel, that exhibits only a small variability in its elastic properties up to 600°C is an alternative that avoids the need for corrections. An alternate method of performing elevated-temperature SHPB tests that eliminates the need for corrections to the strain gauge data is to heat only the specimen and no part of the bars. This is accomplished by starting with the bars separated from the sample. The sample is heated independently in a specially designed furnace, and just prior to firing the striker bar the pressure bars are automatically brought into contact with the heated sample (Frantz et al., 1984), so that the contact time between the bars and the hot sample, and thus the thermal gradient induced in the bars, is minimized.
Conventionally, thermal analysis experiments are carried out at a constant heating rate, and a property change is measured as a function of time. An alternative approach is to keep the change in property constant by varying the heating rate (Paulik and Paulik, 1971; Reading, 1992; Rouquerol, 1989). For TG, the rate of mass loss is kept constant by variation in heating rate. [This technique is given various names, although controlled-rate thermal analysis (CRTA) appears to have become the accepted terminology.] To achieve this, the mass change is monitored and the heating rate decreased as the mass loss increases, and vice versa. At the maximum rate of mass loss, the heating rate is a minimum. This gives mass losses over very narrow temperature ranges and sometimes enables two close reactions to be resolved (see Fig. 2). This method has the advantage of using fast heating rates when no thermal event is taking place and then slowing down the heating rate when a mass change is in progress.
Figure 2. TG curve obtained from a controlled-rate experiment, in which the rate of mass loss is kept nearly constant, and the variable is the heating rate.
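The feedback idea behind CRTA (heat quickly while nothing is happening, slow down while mass is being lost) can be sketched as a simple control law. The function name, the inverse-proportional form, and all rate values here are illustrative assumptions; commercial instruments close this loop with their own (typically PID) controllers:

```python
def crta_heating_rate(mass_loss_rate, setpoint, beta_fast=50.0, beta_slow=0.5):
    """Pick the furnace heating rate (deg C/min) so that rapid mass loss
    slows the heating and quiet stretches are traversed quickly.
    mass_loss_rate and setpoint share the same units (e.g., mg/min)."""
    if mass_loss_rate <= 0.0:
        return beta_fast  # no mass change: heat at the fast rate
    beta = beta_fast * setpoint / (setpoint + mass_loss_rate)
    return max(beta_slow, min(beta_fast, beta))
```

With this law the heating rate halves when the mass-loss rate reaches the setpoint and pins at the slow limit during the steepest part of the mass loss, which is the behavior sketched in Figure 2.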
Thermogravimetry does not give information about reactions that do not involve mass change, such as polymorphic transformations and double-decomposition reactions. Also, it is not useful for identification of a substance or mixture of substances unless the temperature range of the reaction has already been established and there are no interfering reactions. However, when a positive identification has been made, TG by its very nature is a quantitative technique and can frequently be used to estimate the amount of a particular substance present in a mixture or the purity of a single substance. The types of processes that can be investigated by TG are given in Table 1. There is significant literature on TG, and for more extensive information the reader is referred to Keattch and Dollimore (1975), Earnest (1988), Wendlandt (1986), Dollimore (1992), Dunn and Sharp (1993), Haines (1995), and Turi (1997). Other methods are available for the measurement of mass change as a function of temperature, although they are often limited to changes at set temperatures. Two modes of measurement can be distinguished:
Table 1. Processes That Can Be Studied by TG

Process                        Mass Gain    Mass Loss
Adsorption and absorption          x
Desorption                                      x
Dehydration or desolvation                      x
Vaporization                                    x
Sublimation                                     x
Decomposition                                   x
Oxidation                          x
Reduction                                       x
Solid-gas reactions                x            x
Solid-solid reactions                           x
THERMAL ANALYSIS
1. The sample is heated to a specific temperature and the mass change followed at that temperature. After a time lapse, the sample may be heated to a higher constant temperature (or cooled to a lower constant temperature) and again the mass change observed. Quartz spring balances, which are cheap to construct but cumbersome to use, are often used in this application. These systems are used to study, e.g., the adsorption/desorption of a gas on a solid at different temperatures.

2. The sample is weighed at room temperature, heated to a constant temperature for a certain time, then removed, cooled back to room temperature, and reweighed. This is the process used in gravimetric analysis and loss-on-ignition measurements. This method is inexpensive. However, unless the hot sample is put into an inert environment, the sample can readsorb gases from the atmosphere during cooling and hence give anomalous results.

The major disadvantage of both of the above methods is that the mass change is not followed continuously as a function of temperature. The complementary techniques of differential thermal analysis (DTA) and differential scanning calorimetry (DSC) are dealt with in DIFFERENTIAL THERMAL ANALYSIS AND DIFFERENTIAL SCANNING CALORIMETRY. Thermogravimetry can be combined with either DTA or DSC in a single system to give simultaneous TG-DTA or TG-DSC; these techniques will be discussed in SIMULTANEOUS TECHNIQUES INCLUDING ANALYSIS OF GASEOUS PRODUCTS. To characterize the sample as well as identify reaction products at various temperatures, a range of other techniques is used, including wet chemical analysis, x-ray diffraction (XRD) (X-RAY TECHNIQUES), spectroscopic techniques such as Fourier transform infrared (FTIR) spectroscopy, optical techniques such as optical microscopy (OM) (OPTICAL MICROSCOPY, REFLECTED-LIGHT OPTICAL MICROSCOPY), and scanning electron microscopy (SEM) (SCANNING ELECTRON MICROSCOPY).
The information obtained from these techniques enables the reaction intermediates to be identified so that the reaction scheme can be written with some confidence. Other techniques placed by the ICTAC Nomenclature Committee (Mackenzie, 1979, 1983) within the family of thermal analysis methods based on changes of mass are evolved gas detection, evolved gas analysis, and emanation thermal analysis. Emanation thermal analysis [developed by Balek (1991)] involves the release of radioactive emanation from a substance, which gives a change in mass if, for example, α-particles are emitted.

PRINCIPLES OF THE METHOD

The form of the TG/DTG curve obtained experimentally is dependent on the interplay of two major factors: the properties of the sample and the actual experimental conditions used, also called procedural variables. Both factors can affect the kinetics of any reaction that takes place, so that a change in either will have a subsequent effect on the form of the TG curve. It is also important to note that
unless the sample is held at constant temperature, applying a heating or cooling rate produces nonequilibrium conditions. It is possible to calculate a theoretical TG curve if the kinetic mechanism and parameters are known, on the assumption that heat transfer is instantaneous and no temperature gradient exists within the sample. Thus the kinetics of most reactions under isothermal conditions can be summarized by the general equation

dα/dt = k f(α)    (1)
Here α is the fraction reacted in time t and is equal to (wi − wt)/(wi − wf), where wi is the initial weight of the sample, wt is the weight at time t, and wf is the final weight of the sample; k is the rate constant; and f(α) is some function of α. The various forms adopted by the function f(α) and the integrated forms g(α) have been discussed elsewhere (Keattch and Dollimore, 1975; Satava and Skvara, 1969; Sharp, 1972; Sestak, 1972) and are given in the next section. The temperature dependence of the rate constant follows the Arrhenius equation:

k = A e^(−E/RT)    (2)
where T is the absolute temperature, A is the preexponential factor, E is the activation energy, and R is the gas constant. For a linear heating rate,

T = T0 + βt    (3)
where T0 is the initial temperature and β is the heating rate. Combination of Equations 1 and 2 gives

dα/dt = A f(α) e^(−E/RT)    (4)
Substitution for dt using Equation 3 gives

dα/dT = (A/β) f(α) e^(−E/RT)    (5)
which, if rearranged, provides

dα/f(α) = (A/β) e^(−E/RT) dT    (6)
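Equations 3 to 6 can be integrated numerically to generate a theoretical TG curve once a mechanism and parameters are chosen. A minimal sketch for first-order kinetics, f(α) = 1 − α, follows; the function name, the Euler scheme, and the default temperature range are illustrative assumptions:

```python
import math

def tg_curve(E, A, beta, T0=300.0, T_end=1300.0, dT=0.1):
    """Euler integration of Equation 5, dalpha/dT = (A/beta) f(alpha) exp(-E/RT),
    with first-order kinetics f(alpha) = 1 - alpha.
    E in J/mol, A in 1/s, beta in K/s. Returns (temperatures in K, alphas)."""
    R = 8.314  # gas constant, J mol^-1 K^-1
    temps, alphas = [], []
    alpha, T = 0.0, T0
    while T < T_end and alpha < 0.9999:
        temps.append(T)
        alphas.append(alpha)
        # advance alpha by one temperature step, clamped to alpha <= 1
        alpha = min(1.0, alpha + (A / beta) * (1.0 - alpha) * math.exp(-E / (R * T)) * dT)
        T += dT
    return temps, alphas
```

With, say, E = 180 kJ/mol and A = 10¹³ s⁻¹, raising β from 1 to 7 °C/min shifts the computed curve to higher temperature, mirroring the heating-rate effect discussed later under Experimental Variables.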
Equation 6 is the basic equation of the DTG curve, which when integrated is the equation of the TG curve. If the form of the function f(α) is known, integration of the left-hand side of the equation is straightforward and gives the associated function g(α). The integration limits are between the initial and final temperatures of the reaction, or between α = 0 and α = 1. The values of E and A have a marked influence on the temperature range over which the TG curve is observed, but they do not influence
THERMOGRAVIMETRIC ANALYSIS

the shape of the curve too greatly (Satava and Skvara, 1969). The kinetic mechanism, i.e., the form of f(α) or g(α), however, determines the shape of the curve. The TG curve can be defined in several ways. The temperature at which a change in mass is first detected, called the initial temperature Ti (see Fig. 3), onset temperature, or procedural decomposition temperature, is not sufficiently well defined to use as a satisfactory reference point, since its detection is dependent on factors such as the sensitivity of the TG apparatus and the rate at which the initial reaction occurs. If the initial rate of mass loss is very slow, then the determination of Ti can be uncertain. A more satisfactory approach is to use the extrapolated onset temperature, Te, which provides consistent values (see Fig. 3). If, however, the decomposition extends over a wide temperature range and only becomes rapid in its final stages, the extrapolated onset temperature will differ considerably from the onset temperature. For this kind of reaction, it is more satisfactory to measure the temperature at which a fractional weight loss, α, has occurred (Tα) (see Fig. 4). Clearly the temperature T0.05 is close to that at the start of the reaction and T0.90 is close to that at the end of the reaction. To define the complete range of reaction, two further temperatures, T0 and Tf, may be identified, as shown in Figure 3. Similar comments apply to these as to the corresponding temperatures already discussed.

Figure 3. Various temperatures used to define the TG curve.

Figure 4. Measurement of the fraction reacted, α.

PRACTICAL ASPECTS OF THE METHOD

Apparatus
A thermobalance consists of the essential components of a balance and balance controller, a glass vessel to enclose the sample and balance to allow experiments to be carried out in a controlled atmosphere, a furnace and furnace controller, and a recorder system. Today it is common to purchase a thermobalance from a commercial source. Only in cases where the requirements are not met by the commercial sources should the construction of a self-designed system be contemplated, although most manufacturers will discuss possible modifications to existing systems. Balances that monitor mass changes are of two types: deflection and null deflection. The deflection balance monitors the movement of the weight sensor, and the null-deflection balance monitors the position of the beam. In the latter, a servo system holds the beam in a quasi-equilibrium position, the power supply to the servo system being the measure of the mass change. An advantage of the null-deflection balance is that the sample stays in a constant position in the furnace, which assists in various ways that will be explained later. Most modern thermobalances are based on null-type electronic microbalances, which means that sample masses of typically 10 to 20 mg are sufficient to be able to detect mass changes accurately and reliably. Thermobalances capable of taking larger samples, in the gram range, are also available, although they tend to be used for specialized applications. The automatic recording beam microbalance and ultramicrobalances for use in vacuum and controlled environments have been reviewed by Gast (1974) and Czanderna and Wolsky (1980). The load-to-precision ratio (LPR) is often used as a means of comparing the performance of microbalances. A 1-g capacity beam balance with a precision of 2 μg will have an LPR of 5 × 10⁵. This compares with LPR values for high-performance microbalances of 10⁸ or better (Czanderna and Wolsky, 1980, p. 7).
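The LPR comparison is simple arithmetic: maximum load divided by the smallest resolvable mass change. A sketch (the function name is illustrative):

```python
def load_to_precision_ratio(capacity_g, precision_g):
    """Load-to-precision ratio: maximum load divided by the smallest
    mass change the balance can resolve (both in grams)."""
    return capacity_g / precision_g
```

For example, a 1-g capacity balance resolving 2 μg gives an LPR of 5 × 10⁵.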
In the balance sensitivity category, for example, the highest quoted sensitivity for all the reviewed thermobalances is 1 μg, and the lowest is 50 μg. However, the lower sensitivity balances may also be capable of taking larger sample masses, and so the LPR value may not vary significantly across the range. This needs to be checked for each specific balance. Details of the sensitivity or detection limit, precision, and accuracy of the balance are not always evident in the literature, and the terminology used is not always descriptive or accurate. Some degree of standardization would be helpful to the prospective buyer. Thermobalances are enclosed in a glass envelope, or chamber, which can be sealed, partly to protect the balance against corrosion or damage but also to enable experiments to be carried out in a controlled atmosphere. Various ways are available for introducing the chosen gas. The most common way to introduce an inert gas is to
flow it first over the balance and then over the sample before exiting the system. This has the advantage of protecting the balance against any corrosive products generated by the decomposing sample as well as sweeping away any products that may condense on the hangdown arm of the balance and cause weighing errors. When a corrosive reaction gas is employed, the balance needs to be protected. This can be achieved by flowing an inert gas over the balance and introducing the corrosive gas at some other point so it is swept away from the balance mechanism. One way of realizing this is shown in Figure 5. A flow of the blanket gas is established, followed by the reactive (corrosive) gas. Some commercial thermobalances are equipped with flowmeters for gas control, but if they are not, then external rotameters need to be fitted. When corrosive gases are used, the flow of corrosive gas and inert gas needs to be carefully balanced in order to provide the proper protection.

Figure 5. Protection of the balance mechanism from corrosive gases by a secondary gas flow.

Examples of thermobalances that have been modified to work in different atmospheres are many, e.g., in a sulfur atmosphere (Rilling and Balesdent, 1975) or in sodium vapor (Metrot, 1975). The problem of containing corrosive gases is considerably simplified by the magnetic balance designed by Gast (1975), where the balance and sample chambers are completely separated. This is a single-arm beam balance in which the pan is attached to a permanent magnet that is kept in suspension below an electromagnet attached to the hangdown wire. The distance between the two magnets is controlled electromagnetically. The sample chamber and the atmosphere containing the sample are thus isolated from the balance, and no attack on the mechanism can occur. This concept has been extended to microbalances (Pahlke and Gast, 1994). The ability to have controlled flowing atmospheres is usually available, with gas flows of up to 150 mL min⁻¹ being typical. Some instruments offer threaded gas valve connections, as well as a means of monitoring the gas flow. Work under vacuum down to 1 × 10⁻³ bar is common, and many thermobalances can operate down to 10⁻⁶ bar, although it is not usual to find instruments fitted with a means of monitoring the vacuum level. A commercial balance is reported to be able to operate up to 190 bars (Escoubes et al., 1984).

Table 2. Common Furnace Windings and Their Approximate Maximum Working Temperature

Furnace Winding           Approximate Maximum Working Temperature (°C)
Nichrome                  1000
Kanthal                   1350
Platinum                  1400
Platinum/10% rhodium      1500
Kanthal Super (MoSi2)     1600
Molybdenum                1800
Tungsten                  2800

Many examples of balances that
work in high vacuum are reported at the vacuum microbalance conferences. For high-pressure work, there is a TG apparatus capable of working up to 300 bars and 350°C (Brown et al., 1972) and a versatile TG apparatus for operation between 10⁻⁸ and 300 bars in the temperature range 200° to 500°C (Gachet and Trambouze, 1975). Furnaces can be mounted horizontally or vertically around the detector system. The most widely used furnaces are those wound with nichrome (nickel-chromium alloy) or platinum/rhodium heating wires. The most commonly used windings and their maximum temperatures of operation are given in Table 2. Molybdenum and tungsten windings need to be kept under a mildly reducing atmosphere to prevent oxidation. Furnaces are frequently wound in a noninductive manner, i.e., in one direction and then the other, so that magnetic fields generated when a current is passed through the furnace windings are canceled, thus eliminating any interaction with a magnetic sample. A requirement of a furnace is that there should be a region within which the temperature is constant over a finite distance. This distance is referred to as the hot zone. The hot zone diminishes as the temperature increases, as indicated in Figure 6. The sample should always be in the hot zone, so that all of the sample is at the same temperature. A thermocouple can be used to ensure that proper placement of the sample in relation to the hot zone is established. This is one reason to use a null-deflection balance, as the sample is always in the same position
Figure 6. Schematic of the hot zone in a furnace.
in the furnace. Furnaces capable of very high heating rates have been used for TG experiments on textiles at heating rates approaching 3000°C/min (Bingham and Hill, 1975). These fast-response furnaces can achieve isothermal temperatures rapidly. Alternative methods of heating include infrared and induction heating, which heat the sample directly without heating the sample chamber. Very fast heating rates on the order of 3000° to 6000°C min⁻¹ can be achieved. Induction heating requires that the sample be conducting or placed in a conducting sample holder. The temperature ranges for commercially available instruments given in the manufacturer's literature tend to be for the basic model, which operates typically in the temperature range ambient to 1000°C, but other models may operate from −196° to 500°C or ambient to 2400°C, with steps in between. There are three main arrangements of the furnace relative to the weighing arm of the balance. Lateral loading is a horizontal arrangement in which the balance arm extends horizontally into a furnace aligned horizontally. Two vertical arrangements are possible: one with the balance above the furnace with the sample suspended from the balance arm (bottom loading) and the other with the balance below the furnace and the sample resting on a flat platform supported by a solid ceramic rod rising from the balance arm (top loading). The following advantages are claimed for the horizontal mode:

1. Balance sensitivity is increased by virtue of the long balance arm.
2. It permits rapid gas purge rates, up to 1 L/min, since horizontal balance arms are perturbed less by a rapid gas flow than a vertical arrangement.
3. The influence of the Knudsen effect and convection errors can be ignored, i.e., chimney effects from the furnace are eliminated.
4. Evolved gas analysis is simplified.

The main advantage claimed for the vertical mode is that much higher temperatures can be achieved, since no change occurs in the length of the balance arm.
Escoubes et al. (1984) reported that the bottom-loading principle was adopted by about twice as many systems as the top-loading principle, with a small number of lateral-loading models apparent. Temperature measurements are most commonly made with thermocouple systems, the most common being chromel-alumel, with an operating temperature range of ambient to 800°C; Pt-Pt/10% Rh at ambient to 1500°C; and Pt/6% Rh-Pt/30% Rh at ambient to 1750°C. The ideal location of the sample-measuring thermocouple is directly in contact with the sample platform. This requires that very fine wires made of the same material as the thermocouple are attached to the bottom of the platform. Such arrangements are found in simultaneous TG-DTA systems. The other alternative is to place the thermocouple at some constant distance from the sample and let the calibration technique compensate for the gap between the thermocouple and the sample. This method can only be
Figure 7. Typical configurations of the sample temperature-measuring thermocouple for a null-deflection balance. Left: lateral balance arm arrangement; right: bottom-loading balance.
adopted for a null-deflection balance, as this is the only system in which the distance between the sample platform and the thermocouple is constant. Two typical configurations are shown in Figure 7: one with the wires welded onto the bottom of the flat-plate thermocouple, which is kept at a constant distance from the sample pan, and the other a sideways configuration found in the horizontal balance arm arrangements. The response of the thermocouple varies with age, and so calibration with an external calibrant is required from time to time. Thermocouples, like other devices, do not respond well to abuse. Hence, operation of the furnace above the recommended temperature or in a corrosive environment will result in relatively rapid changes to signal output and will eventually render the thermocouple inactive. The chief functions of a temperature programmer are to provide a linear and reproducible set of heating rates and to hold a fixed temperature to within 1°C. The programmer is usually controlled by a thermocouple located close to the furnace windings but occasionally situated close to the sample. Modern programmers can be set to carry out several different heating/cooling cycles, although this function is being increasingly taken over by microprocessors. The thermocouple output is usually fed to a microprocessor containing tabulated data on thermocouple readings, so that the conversion to temperature is made by direct comparison. Although heating rates often increase in steps, say, 1, 2, 5, 10, 30, 50, 100°C/min, microprocessor-controlled systems allow increments of 0.1°C/min, which can be useful in kinetic studies based on multiple-heating-rate methods. The recording device for most modern equipment is the computer. Thermobalances provide a millivolt output that is directly proportional to mass, and so it is possible to take this signal and feed it directly into a microprocessor via a suitable interface.
This is by far the most common and best method for data acquisition and manipulation. Reading the direct millivolt output is the most accurate way of obtaining the measurement. Once stored, the data can be manipulated in various ways, and, e.g., the production of the derivative TG curve becomes a trivial task. Data collection relies on a completely linear temperature rise uninterrupted by deviations induced by
Figure 8. Distorted TG signal resulting from deviation from the linear programmed heating rate.
instrumental or experimental factors. For example, the oxidation of sulfide minerals is so exothermic that for a short time the temperature-recording thermocouple may be heated above the temperature expected for the linear heating program. If the TG plot is being recorded as a temperature-mass plot, then the TG trace will be distorted (see Fig. 8A). This can be overcome by converting the temperature axis to a time-based plot (Fig. 8B). Experimental Variables The shapes of TG and DTG curves are markedly dependent on the properties of the sample and the experimental variables that can be set on the thermobalance. These procedural variables include sample preparation (dealt with later), sample mass, sample containers, heating rate, and the atmosphere surrounding the sample. The mass of the sample affects the mass and heat transfer, i.e., the diffusion of reaction products away from the sample or the interaction of an introduced reactive gas with the sample, and temperature gradients that exist within the sample. For a reaction in which a gas is produced, e.g., dehydration, dehydroxylation, or decomposition, the shape of the curve will depend on the thickness of the sample. For thick samples, gaseous products that evolved from the bottom layers of sample will take considerable time to diffuse away from the sample, and hence delay the completion of reaction. Therefore thick layers
of sample will give mass losses over greater ranges of temperature than thin layers of sample. Temperature gradients are inevitable in TG (or any thermal analysis method) because experiments are usually carried out at some constant heating or cooling rate. There is a temperature gradient between the furnace and the sample, as it takes a finite time for heat transfer to take place across the air gap between the furnace and the sample. Only under isothermal conditions will the sample and furnace temperatures be the same. At a constant heating rate this temperature gradient, commonly called the thermal lag, is approximately constant. The thermal lag increases as the heating or cooling rate increases or as the mass of the sample increases. A second and more important temperature gradient is that within the sample. Heat diffusion will be dependent on the thermal conductivity of the sample, so that heat gradients in metals will tend to be lower than those in minerals or polymers. If a large sample of a poorly conducting material is heated, then large temperature gradients can be expected, and the rate of reaction will be faster at the exterior of the sample because it is at a higher temperature than the interior. This will cause the reaction to occur over a greater temperature range relative to a smaller sample mass or one with a better thermal conductivity. These effects can be illustrated with reference to calcium carbonate (see Table 3), which decomposes above 600°C with the evolution of carbon dioxide (Wilburn et al., 1991). Both mass and heat transfer effects are operative, as the carbon dioxide has to diffuse out of the sample, which takes longer as the mass increases; also, significant thermal gradients between the furnace and sample and within the sample increase as the mass increases.
The increase in T0.1 with increasing mass, 59°C, is related only to the thermal lag of the sample with respect to the furnace, but the increase in T0.9 with increasing mass is the sum of the thermal lag, temperature gradients in the sample, and the mass transfer of the carbon dioxide. The increase in T0.9 is accordingly larger, at 97°C, than the increase in T0.1. In general, therefore, the interval between T0.1 and T0.9 increases as the sample mass increases. For large samples, the heat generated during an exothermic reaction may be sufficient to ignite the sample. In this case, the mass loss will occur over a very narrow temperature range that will be different from that obtained under nonignition conditions. In general, these problems were more significant with early thermobalances that required large sample masses to accurately record mass changes. Modern thermobalances,
Table 3. Effect of Sample Mass on Decomposition of Calcium Carbonate Heated in Nitrogen at 7°C/min

Sample Mass (mg)    T0.1 (°C)    T0.9 (°C)    T0.9–T0.1 (°C)
50                  716          818          102
100                 742          855          113
200                 768          890          122
300                 775          902          127
400                 775          915          140
based on null-deflection microbalances, are much less troubled by these problems as the sample masses tend to be in the milligram rather than the hundreds of milligram mass ranges. Even so, it is sometimes necessary to use large samples, and so these constraints need to be considered. Generally small samples spread as a fine layer are preferred. This ensures that (1) gaseous components can disperse quickly, (2) thermal gradients are small, and (3) reaction between solid and introduced reactive gas is rapid. Both the geometry of the sample container and the material from which it is made can influence the shape of a TG curve. Deep, narrow-necked crucibles can inhibit gaseous diffusion processes and hence alter the shape and temperature of curves relative to a flat, open pan (Hagvoort, 1994). Sometimes these effects are quite marked, and it has been demonstrated that the rate of oxidation of pyrite decreases by about one-half as the wall height of the sample pan changes from 2 to 4 mm (Jorgensen and Moyle, 1986). Most crucibles can have a lid fitted with either a partial or complete seal, and this is useful for the prevention of sample loss as well as for holding the sample firmly in place. It is also of value to encapsulate the sample if it is being used as a calibrant, so that if the reaction is completely reversible, it can be used several times. Paulik et al. (1982) have designed what they call ‘‘labyrinth’’ crucibles, which have special lids that inhibit gaseous diffusion out of the cell. The partial pressure of the gaseous product reaches atmospheric pressure and remains constant until the end of the process, so leading to the technique being named quasi-isobaric. Crucibles are usually fabricated from metal (often platinum) or ceramic (porcelain, alumina, or quartz), and their thermal conductivity varies accordingly. Care must be taken to avoid the possibility of reaction between the sample and its container. 
This can occur when nitrates or sulfates are heated, or in the case of polymers that contain halogens (such as fluorine), phosphorus, or sulfur. Platinum pans are readily attacked by such materials. Only a very small number of experiments with sulfides heated to 700° to 800°C in an inert atmosphere are required to produce holes in a platinum crucible. Less obviously, platinum crucibles have been observed to act as a catalyst, causing a change in the composition of the atmosphere; e.g., Pt crucibles promote conversion of SO2 to SO3. Hence, if sulfides are heated in air or oxygen, the weight gain due to sulfate formation is always greater if Pt crucibles are used, especially relative to ceramic crucibles. The reaction sequence is

MS(s) + 1.5O2(g) → MO(s) + SO2(g)
MO(s) + SO3(g) → MSO4(s)
Table 4. Effect of Heating Rate on Temperature of Decomposition of Calcium Carbonate

Heating Rate (°C/min)    T0.1 (°C)    T0.9 (°C)    T0.9 − T0.1 (°C)
1                        634          712           78
7                        716          818          102
The temperature range over which a TG curve is observed depends on the heating rate. As the heating rate increases, there is an increase in the procedural decomposition temperature, the DTG peak temperature, the final temperature, and the temperature range over which the mass loss occurs. The magnitude of the effect is indicated by the data in Table 4 for the decomposition of 50 mg of calcium carbonate heated in nitrogen at two different heating rates (Wilburn et al., 1991). The faster heating rate causes a shift to higher temperatures of both T0.1 and T0.9, although the latter effect is greater than the former. Hence the mass loss occurs over a greater temperature range as the heating rate increases. Faster heating rates may also cause a loss of resolution when two reactions occur in similar temperature ranges and the two mass losses merge. Fast heating rates combined with a large sample sometimes lead to a change in mechanism, which produces a significantly different TG curve.

The atmosphere above the sample is a major experimental variable in thermal analytical work. Two effects are important. First, the presence in the atmosphere of an appreciable partial pressure of a volatile product will suppress a reversible reaction and shift the decomposition temperature to a higher value. This effect can be achieved by having a large sample with a lid or by introducing the volatile component into the inlet of the gas stream. When the atmosphere within the crucible is the result of the decomposition of the reactant, the atmosphere is described as "self-generated" (Newkirk, 1971). An example of a solid whose decomposition is affected by the partial pressure of product gas is calcium carbonate, which decomposes according to the reaction

CaCO3(s) → CaO(s) + CO2(g)    (7)

In a pure CO2 atmosphere, the decomposition temperature is greatly increased. This effect can be predicted from thermodynamics. If the values of the Gibbs free-energy change (ΔG°), the enthalpy change (ΔH°), and the entropy change (ΔS°) are fitted into the following equation, then the thermodynamic decomposition temperature T can be calculated for any partial pressure P of CO2:

ΔG° = ΔH° − TΔS° = −RT ln P    (8)

which, with ΔH° and ΔS° in calorie-based units, becomes

40,250 − 34.4T = −4.6T log PCO2    (9)

At 1 atm CO2:    T = 1170 K (897°C)
At 0.1 atm CO2:  T = 1032 K (759°C)

Hence, under equilibrium conditions, the decomposition temperature is lowered by 138°C. The temperatures found by thermal methods are always higher because these are dynamic techniques, although for small sample sizes and slow heating rates the correspondence is close. For calcium carbonate the difference between the onset temperature of decomposition and the calculated thermodynamic value was in the range 10° to 20°C (Wilburn et al., 1991).
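As a numerical check on the temperatures quoted above, the fitted relation of Equation 9 can be solved for T at any CO2 partial pressure. The short Python sketch below uses the calorie-based constants from the text (the text rounds R·ln 10 = 4.576 cal/(mol·K) to 4.6):

```python
import math

# Fitted relation from the text (calorie units), for CaCO3(s) -> CaO(s) + CO2(g):
#   40,250 - 34.4*T = -4.576*T*log10(P_CO2)
DELTA_H = 40250.0   # cal/mol
DELTA_S = 34.4      # cal/(mol K)
R_LOG10 = 4.576     # R * ln(10) in cal/(mol K)

def decomposition_temperature(p_co2_atm: float) -> float:
    """Equilibrium decomposition temperature (K) at a given CO2 pressure (atm)."""
    # Rearranged: DELTA_H = T * (DELTA_S - R_LOG10 * log10(P))
    return DELTA_H / (DELTA_S - R_LOG10 * math.log10(p_co2_atm))

for p in (1.0, 0.1):
    t_k = decomposition_temperature(p)
    print(f"P(CO2) = {p:4.1f} atm -> T = {t_k:6.1f} K ({t_k - 273.15:5.1f} C)")
```

The result reproduces T = 1170 K at 1 atm and about 1032 K at 0.1 atm, as in the text.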
THERMAL ANALYSIS
The second major effect is the interaction of a reactive gas with the sample, which will change the course of the reaction. Under an inert atmosphere, organic compounds degrade by pyrolytic decomposition, whereas in an oxidizing atmosphere oxidative decomposition obviously takes place. Hydrogen can be used to study reductive processes, although great care has to be taken to remove all oxygen from the system before carrying out the reaction. Other gases frequently studied are SO2 and the halogens. It should be evident from the foregoing comments and examples that the conditions required to minimize the effects of heat and mass transfer, as well as to decrease the possibility of a change in the reaction sequence, are to have (1) a small sample (10 to 20 mg) and (2) finely ground material.

[Table 2 (crucible materials for DSC/DTA): the body of this table was garbled in extraction. The surviving fragments list, for each crucible material, a maximum-use temperature (e.g., 1273 K, or 100 K below the respective melting temperature), a thermal conductivity rating (excellent, good, or poor), and the compatible purge gases (inert, reducing, oxidizing). Caution! Reactions with sample and/or sample holder might occur at a lower temperature due to the existence of lower-melting eutectics.]
DSC/DTA + thermogravimetric analysis (TGA; see THERMOGRAVIMETRIC ANALYSIS), DSC/DTA + mass spectrometry (MS; see SIMULTANEOUS TECHNIQUES INCLUDING ANALYSIS OF GASEOUS PRODUCTS), and DSC/DTA + Fourier transform infrared spectroscopy (FTIR; see SIMULTANEOUS TECHNIQUES INCLUDING ANALYSIS OF GASEOUS PRODUCTS). With the advent of synchrotron light sources, combinations such as DSC/DTA + x-ray absorption fine structure (XAFS; Tröger et al., 1997) and DSC/DTA + x-ray diffraction (XRD; Lexa, 1999) became possible.
PROBLEMS

Heat-flux DSC and DTA instruments use thermocouples to detect temperature. Because of interdiffusion at the junction, it is possible that thermocouple calibrations will change. This is particularly troublesome for cases of extended times of operation near the upper limit of the temperature range of the thermocouple. Periodic temperature calibration of the instrument is recommended.

Reactions with sample pans are a chronic problem that must be considered, particularly for high-temperature work. A variety of DTA/DSC sample pans are commercially available (see Table 2). It is usually possible to find suitable materials, but it is important to verify that no significant reaction has taken place. Serious errors and damage to equipment can result from ignoring this possibility.

Resolution of close peaks can present difficulties. Indeed, the experimenter may not even be aware of the existence of hidden peaks. It is important when working with unfamiliar systems to conduct scans at several heating/cooling rates. Lower rates allow resolution of closely lying peaks, at the expense, however, of signal strength. Examination of both heating and cooling traces can also be useful.

It should be obvious that caution should be observed to avoid the presence of an atmosphere in the DSC/DTA system that could react with either the sample or the crucible. Less obvious, perhaps, is the need to be aware of vapors that may be evolved from the sample that can damage components of the experimental system. Evolution of chloride vapors, e.g., can be detrimental to platinum components. Vaporization from the sample can also significantly alter the composition and the quantity of sample present. Sealable pans are commercially available that can minimize this problem.

Because DSC/DTA typically involves scanning during a programmed heating or cooling cycle, slow processes can be troublesome. In measuring melting points, e.g., severe undercooling is commonly observed during a cooling cycle.
An instantaneous vertical line characterizes the trace when freezing begins. In studying phase diagrams, peritectic transformations are particularly sluggish and troublesome to define.

LITERATURE CITED

Boerio-Goates, J. and Callanan, J. E. 1992. Differential thermal methods. In Physical Methods of Chemistry, 2nd ed., Vol. 6, Determination of Thermodynamic Properties (B. W. Rossiter and R. C. Baetzold, eds.). pp. 621–717. John Wiley & Sons, New York.

Boersma, S. L. 1955. A theory of differential thermal analysis and new methods of measurement and interpretation. J. Am. Ceram. Soc. 38:281–284.

Brennan, W. P., Miller, B., and Whitwell, J. 1969. An improved method of analyzing curves in differential scanning calorimetry. I&EC Fund. 8:314–318.

Flynn, J. H., Brown, M., and Sestak, J. 1987. Report on the workshop: Current problems of kinetic data reliability evaluated by thermal analysis. Thermochim. Acta 110:101–112.

Gray, A. P. 1968. A simple generalized theory for the analysis of dynamic thermal measurements. In Analytical Calorimetry (R. S. Porter and J. F. Johnson, eds.). pp. 209–218. Plenum Press, New York.

Höhne, G., Hemminger, W., and Flammersheim, H. J. 1996. Differential Scanning Calorimetry. An Introduction for Practitioners. Springer-Verlag, Berlin.

Kubaschewski, O. and Alcock, C. B. 1979. Metallurgical Thermochemistry. Pergamon Press, New York.

Lewis, G. N. and Randall, M. 1923. Thermodynamics. McGraw-Hill, New York.

Lexa, D. 1999. Hermetic sample enclosure for simultaneous differential scanning calorimetry/synchrotron powder X-ray diffraction. Rev. Sci. Instrum. 70:2242–2245.

Mackenzie, R. C. 1985. Nomenclature for thermal analysis—IV. Pure Appl. Chem. 57:1737–1740.

Mraw, S. C. 1988. Differential scanning calorimetry. In CINDAS Data Series on Material Properties, Vol. I-2, Specific Heat of Solids (C. Y. Ho, ed., A. Cezairliyan, senior author and volume coordinator). pp. 395–435. Hemisphere Publishing, New York.

Richardson, M. J. 1984. Application of differential scanning calorimetry to the measurement of specific heat. In Compendium of Thermophysical Property Measurement Methods, Vol. 1, Survey of Measurement Techniques (K. D. Maglic, A. Cezairliyan, and V. E. Peletsky, eds.). pp. 669–685. Plenum Press, New York.

Richardson, M. J. 1992. The application of differential scanning calorimetry to the measurement of specific heat. In Compendium of Thermophysical Property Measurement Methods, Vol. 2, Recommended Measurement Techniques and Practices (K. D. Maglic, A. Cezairliyan, and V. E. Peletsky, eds.). pp. 519–545. Plenum Press, New York.

Tröger, L., Hilbrandt, N., and Epple, M. 1997. Thorough insight into reacting systems by combined in-situ XAFS and differential scanning calorimetry. Synchrotron Rad. News 10:11–17.

Watson, E. S., O'Neill, M. J., Justin, J., and Brenner, N. 1964. A differential scanning calorimeter for quantitative differential thermal analysis. Anal. Chem. 36:1233–1237.
KEY REFERENCES

Boerio-Goates and Callanan, 1992. See above.
A comprehensive look at the development and current status of thermal analysis, with emphasis on DSC and DTA.

Höhne et al., 1996. See above.
A detailed and up-to-date review of DSC. Sound presentation of the theoretical basis of DSC. Emphasis on instrumentation, calibration, factors influencing the measurement process, and interpretation of results.

Richardson, 1984, 1992. See above.
Two excellent reviews of heat capacity determination by DSC.
APPENDIX: ACQUIRING A DSC/DTA INSTRUMENT

Acquisition of a DSC/DTA instrument should be preceded by a definition of its intended use, e.g., routine quality control analyses vs. research and development. While in the former setting an easy-to-use model with an available autosampler might be called for, the latter setting will likely require a highly flexible model with a number of user-selectable controls. A technical specification checklist (Höhne et al., 1996, used with permission) should then be compiled for different instruments from values obtained from manufacturers:

Manufacturer: ...
Type of measuring system: heat-flux disk type / heat-flux cylinder type / power compensation
Special feature: ...
Sample volume (standard crucible): ... mm³
Atmosphere (vacuum? which gases? pressure?): ...
Temperature range: from ... to ... K
Scanning rates: from ... to ... K/min
Zero-line repeatability: from ... μW (at ... K) to ... μW (at ... K)
Peak-area repeatability: ... % (at ... K)
Total uncertainty for heat: ... % (at ... K)
Extrapolated peak-onset temperature, repeatability: ... K (at ... K)
Total uncertainty for temperature: ... K (at ... K)
Scanning noise (pp) at ... K/min: from ... μW (at ... K) to ... μW (at ... K)
Isothermal noise (pp): from ... μW (at ... K) to ... μW (at ... K)
Time constant with sample: ... s
Additional facilities: ...

The lists should then be compared with each other and with a list of minimum requirements for the intended use. (Fiscal considerations will, undoubtedly, also play a role.) Manufacturer data should be critically evaluated as to the conditions under which they have been determined. For instance, the majority of manufacturers quote the same value, 0.2 μW, for the isothermal noise of their instruments. This value has apparently been obtained under extremely well controlled conditions and will not be reproduced in everyday use, where isothermal noise levels of 1 μW are more realistic.

DUSAN LEXA
LEONARD LEIBOWITZ
Argonne National Laboratory
Argonne, Illinois

COMBUSTION CALORIMETRY

INTRODUCTION
In the absence of direct experimental information, one often must rely on a knowledge of thermodynamic properties to predict the chemical behavior of a material under operating conditions. For such applications, the standard molar Gibbs energy of formation, ΔfG°m, is particularly powerful; it is frequently derived by combining the standard molar entropy of formation, ΔfS°m, and the standard molar enthalpy of formation, ΔfH°m (ΔfG°m = ΔfH°m − TΔfS°m), often as functions of temperature and pressure. It follows, therefore, that the standard molar enthalpy of formation, ΔfH°m, is among the most valuable and fundamental thermodynamic properties of a material. This quantity is defined as the enthalpy change that occurs upon the formation of the compound in its standard state from the component elements in their standard states, at a designated reference temperature, usually (but not necessarily) 298.15 K, and at a standard pressure, currently taken to be 100 kPa or 101.325 kPa. Many methods have been devised to obtain ΔfH°m experimentally. Those include the so-called second- and third-law treatments of Knudsen effusion and mass-spectrometric results from high-temperature vaporization observations, as well as EMF results from high-temperature galvanic cell studies (Kubaschewski et al., 1993). Each of those techniques yields ΔG°m for the studied process at a given temperature. To derive values of ΔfH°m, the high-temperature results for ΔG°m must be combined with auxiliary thermodynamic information (heat capacities and enthalpy increments at elevated temperatures). Frequently, it turns out that the latter properties have not been measured, so they must be estimated (Kubaschewski et al., 1993), with a consequent degradation in the accuracy of the derived ΔfH°m. Furthermore, high-temperature thermodynamic studies of the kind outlined here often suffer from uncertainties concerning the identities of species in equilibrium.
Another potential source of error arises from chemical interactions of the substances or their vapors with the materials of construction of Knudsen or galvanic cells. All in all, these approaches do not, as a rule, yield precise values of ΔfH°m. Sometimes, however, they are the only practical methods available to the investigator. Alternative procedures involve measurements of the enthalpies (or enthalpy changes) of chemical reactions, ΔrH°m, of a substance and the elements of which it is composed. Experimentally, the strategy adopted in such determinations is dictated largely by the chemical properties of the substance. The most obvious approach involves direct combination of the elements, an example of which is the combustion of gaseous H2 in O2 to form H2O. In the laboratory, however, it is often impossible under experimental conditions to combine the elements to form a particular compound in significant yield. An alternative, albeit less direct, route has been used extensively for the past century or more, and involves measurements of the enthalpy change associated with a suitable chemical reaction of the material of interest. Here, the compound is "destroyed" rather than formed. Such chemical reactions have
involved, inter alia, dissolution in a mineral acid or molten salt (Marsh and O'Hare, 1994), thermal decomposition to discrete products (Gunn, 1972), or combustion in a gas such as oxygen or a halogen. This last method is the subject matter of this unit. As an example of the latter technique, we mention here the determination of ΔfH°m of ZrO2 (Kornilov et al., 1967) on the basis of the combustion of zirconium metal in high-pressure O2(g)
Zr(cr) + O2(g) = ZrO2(cr)    (1)

where ΔfH°m(ZrO2) = ΔcH°m(Zr), the standard molar enthalpy of combustion of Zr in O2(g), i.e., the standard molar enthalpy change associated with the reaction described in Equation 1. It is essential when writing thermochemical reactions that the physical states of the participants be stated explicitly; thus, (cr) for crystal, (s) for solid, (g) for gas, and (l) for liquid. Enthalpies of formation of numerous oxides of metals (Holley and Huber, 1979) are based on similar measurements. These are special cases in that they involve oxidation of single rather than multiple elements, as in the combustion of aluminum carbide (King and Armstrong, 1964)

Al4C3(cr) + 6O2(g) = 2Al2O3(cr) + 3CO2(g)    (2)

Of all the materials studied thermodynamically by combustion, organic substances have been predominant, for the simple reason (apart from their numerousness) that most of them react cleanly in oxygen to give elementary and well-defined products. Thus, the combustion in high-pressure oxygen of benzoic acid, a standard reference material in calorimetry, proceeds as follows

C6H5COOH(cr) + (15/2)O2(g) = 7CO2(g) + 3H2O(l)    (3)

The enthalpy of formation of benzoic acid is the enthalpy change of the reaction

7C(cr) + 3H2(g) + O2(g) = C7H6O2(cr)    (4)

where all substances on both sides of Equation 4 are in their standard states. It should be noted that the reference state of C is not merely the crystalline form but, more exactly, "Acheson" graphite (CODATA, 1989). Obviously, a direct determination of the enthalpy change of the reaction in Equation 4 is not possible, and it is in addressing such situations that combustion calorimetry has been particularly powerful. Similarly, with reference to Equation 3, separate equations may be written for the formation of CO2(g) and H2O(l)

C(cr) + O2(g) = CO2(g)    (5)

H2(g) + (1/2)O2(g) = H2O(l)    (6)

where ΔrH°m for Equation 5 = ΔfH°m(CO2, g), and ΔrH°m for Equation 6 = ΔfH°m(H2O, l). By appropriate combination of Equations 3, 5, and 6, one obtains Equation 4. In other words

ΔfH°m(C7H6O2) = −ΔrH°m(Equation 3) + 7ΔrH°m(Equation 5) + 3ΔrH°m(Equation 6)
             = −ΔrH°m(Equation 3) + 7ΔfH°m(CO2, g) + 3ΔfH°m(H2O, l)    (7)

Thus, the standard molar enthalpy of formation of benzoic acid is obtained from the standard molar enthalpies of formation of the products, {CO2(g) and H2O(l)}, multiplied by the appropriate stoichiometric numbers, 7 and 3, respectively, minus its standard molar enthalpy of combustion in oxygen. This is a general rule for calculations of ΔfH°m from reaction calorimetric data. Not surprisingly, oxygen bomb calorimetry has also been productive in thermochemical studies of organometallic materials. In that connection, the work of Carson and Wilmshurst (1971) on diphenyl mercury is taken as an example. These authors reported the massic energy of the combustion reaction

Hg(C6H5)2(cr) + (29/2)O2(g) = Hg(l) + 12CO2(g) + 5H2O(l)    (8)

and derived ΔfH°m{Hg(C6H5)2}. Corrections were applied to the experimental results to allow for the formation of small amounts of byproduct HgO and HgNO3, both of which were determined analytically. Notice that values of ΔfH°m are required for the products shown in Equation 3 and Equation 8. Because the value of ΔfH°m varies as a function of temperature and pressure, it must always, for compounds interconnected by a set of reactions, refer to the same standard pressure and reference temperature.

PRINCIPLES OF THE METHOD

Thermodynamic Basis of Combustion Calorimetric Measurements

Investigations of combustion calorimetric processes are based on the first law of thermodynamics

Uf − Ui = ΔU = Q − W    (9)

where Ui and Uf are the internal energies of a system in the initial and final states, respectively; Q is the quantity of heat absorbed between the initial and final states; and W is the work performed by the system. Frequently, the only work performed by a system is against the external pressure; thus

Uf − Ui = Q − ∫ p dV (from Vi to Vf)    (10)

where Vi and Vf are, respectively, the initial and final volumes of the system. In the case of a process at constant volume, W = 0, and Q = QV, from which

QV = Uf − Ui = ΔU    (11)
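The Hess's-law combination that leads to Equation 7 for benzoic acid is easy to verify numerically. The Python sketch below uses approximate literature values in kJ/mol; these numbers are not given in this unit, so treat them as illustrative assumptions rather than recommended data:

```python
# Hess's-law evaluation of Equation 7:
#   DfH(C7H6O2) = -DcH(Eq. 3) + 7*DfH(CO2, g) + 3*DfH(H2O, l)
# All values below are approximate literature values (kJ/mol),
# used here purely for illustration.
DC_H_BENZOIC = -3226.9   # combustion of C6H5COOH(cr) in O2 (Equation 3)
DF_H_CO2 = -393.51       # formation of CO2(g)
DF_H_H2O = -285.83       # formation of H2O(l)

df_h_benzoic = -DC_H_BENZOIC + 7 * DF_H_CO2 + 3 * DF_H_H2O
print(f"DfH(benzoic acid) ~ {df_h_benzoic:.1f} kJ/mol")
```

The sum comes out near −385 kJ/mol for crystalline benzoic acid, illustrating how the sign conventions of Equation 7 work in practice.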
Therefore, the quantity of heat absorbed by the system or released to the surroundings at constant volume is equal to the change in the internal energy of the system. Measurements of energies of combustion in a sealed calorimetric bomb are of the constant-volume type. If a process occurs at constant pressure, Equation 10 can be rewritten as

Uf − Ui = Qp − p(Vf − Vi)    (12)

whence

Qp = (Uf + pVf) − (Ui + pVi)    (13)

where p denotes the external pressure. By substituting H = U + pV, one obtains

Qp = Hf − Hi = ΔH    (14)

where H denotes enthalpy. Reactions at constant pressure are most frequently studied in apparatus that is open to the atmosphere. The relation between Qp and QV is as follows

Qp = QV + p(Vf − Vi) = QV + Δng RT    (15)
where Δng is the change in stoichiometric numbers of gaseous substances involved in a reaction, R is the gas constant, and T is the thermodynamic temperature. Equation 15 permits the experimental energy change determined in combustion calorimetric experiments (constant volume) for a given reaction to be adjusted to the enthalpy change (constant pressure). For example, Δng = 1 − 2 − 1 = −2 for the combustion of gaseous methane

CH4(g) + 2O2(g) = CO2(g) + 2H2O(l)    (16)

PRACTICAL ASPECTS OF THE METHOD

The Combustion Calorimeter

The combustion calorimeter is the instrument employed to measure the energy of combustion of substances in a gas (e.g., Sunner, 1979). A common theme running through the literature of combustion calorimetry is the use of apparatus designed and constructed by individual investigators for problem-specific applications. To the best of our knowledge, only two calorimetric systems are commercially available at this time, and we shall give details of those later. In general, combustion calorimetry is carried out in a closed container (reaction vessel, usually called the bomb) charged with oxidizing gas to a pressure great enough (3 MPa of O2, for example) to propel a reaction to completion. The bomb rests in a vessel filled with stirred water which, in turn, is surrounded by a 1-cm air gap (to minimize convection) and a constant-temperature environment, usually a stirred-water thermostat. Occasionally, a massive copper block in thermostatically controlled air (aneroid calorimeter) is used for this purpose (Carson, 1979). Most thermostats used in combustion calorimetry are maintained at, or close to, a temperature of 298.15 K. The combination of bomb and water-filled vessel is usually referred to as the calorimetric system, but this really includes auxiliary equipment such as a thermometer, stirrer, and heater. Apparatuses that have constant-temperature surroundings are most common, and are called isoperibol calorimeters. Rarer is the adiabatic combustion calorimeter (Kirklin and Domalski, 1983); its jacket temperature is designed to track closely that of the calorimeter so that the heat exchanged with the surroundings is negligible. For the purposes of the present discussion, we shall deal only with the isoperibol calorimeter, although what we say here, with the exception of the corrected temperature change, applies to both types. A typical combustion bomb is illustrated in Figure 1. The bomb body (A) is usually constructed of stainless steel with a thickness sufficient to withstand not only the initial pressure of the combustion gas, but also the instantaneous surge in pressure that follows the ignition of a sample.
Figure 1. Cross-section of typical calorimetric bomb for combustions in oxygen. A, bomb body; B, bomb head; C, sealing O-ring; D, sealing cap; E, insulated electrode for ignition; F, grounded electrode; G, outlet valve; H, needle valve; I, packing gland; J, valve seat; K, ignition wire; L, crucible; M, support ring for crucible.
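As a numerical aside to Equation 15: for the methane combustion of Equation 16, Δng = −2, so the constant-volume to constant-pressure correction at 298.15 K amounts to only about 5 kJ/mol. A minimal sketch:

```python
# Constant-volume -> constant-pressure correction (Equation 15):
#   Qp = QV + Dng * R * T
# For CH4(g) + 2 O2(g) = CO2(g) + 2 H2O(l) (Equation 16):
#   Dng = 1 - (1 + 2) = -2   (H2O is liquid, so it carries no gas-phase count)
R = 8.314        # gas constant, J/(mol K)
T = 298.15       # reference temperature, K
DELTA_NG = 1 - (1 + 2)

correction_j = DELTA_NG * R * T   # Qp - QV in J/mol
print(f"Dng = {DELTA_NG}, Qp - QV = {correction_j:.0f} J/mol")
```

The enthalpy of combustion is thus roughly 5 kJ/mol more negative than the measured constant-volume energy, a small but non-negligible adjustment.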
Copper has also been used as a bomb material; it has a thermal conductivity superior to that of stainless steel, but is mechanically weaker and chemically more vulnerable. In Figure 1, B is called the bomb head or lid. It is furnished with a valve (G) for admitting oxygen gas before an experiment and for discharging excess O2 and gaseous products of combustion through the needle valve (H) after an experiment. This valve is surrounded with a packing gland (I) and is also fitted with a valve seat (J) against which it seals. The bomb head (B) also acts as a platform to support components of an electrical ignition circuit, namely, an electrically insulated electrode (E) and a grounded electrode (F). These electrodes are connected by means of a thin wire (K) that is usually made of platinum, although other wires may be used, depending on the specific application. A ring (M) attached to the grounded electrode supports a platinum crucible (L) in which the combustion sample rests (in one example mentioned above, this would be a compressed pellet of benzoic acid). The entire lid assembly is inserted in the bomb body and tightened in position by means of the sealing cap (D), which brings force to bear against the rubber O-ring (C). Thus, a properly functioning bomb can be charged with O2 to the desired pressure while the contents are hermetically sealed. Figure 2 gives a simplified view of the most common combustion calorimeter, that with a constant temperature environment (isoperibol). A combustion bomb (A) is supported in the water-filled calorimeter vessel (E) which, in turn, is surrounded by the air gap and thermostat (C), whose inner surface (D) can also be discerned. Note that the lid of the thermostat (F) is connected to the main thermostat; thus, water circulating at a uniform temperature surrounds the calorimetric system at all times. 
The thermometer used to measure the temperature of the calorimeter is inserted at G, and a stirrer is connected at J (to ensure that the energy released during the combustion is expeditiously transferred to the calorimeter so that the combustion bomb, the water, and the vessel are quickly brought to the same temperature).

Figure 2. Cross-section of assembled isoperibol calorimeter. A, combustion bomb; B, calorimeter heater for adjusting starting temperature; C, outer surface of thermostat; D, inner surface of thermostat; E, calorimeter can; F, lid of thermostat; G, receptacle for thermometer; H, motor for rotation of bomb; J, calorimeter stirrer motor; K, thermostat heater control; L, thermostat stirrer.

A heater (B) is used to adjust the temperature of the calorimeter to the desired starting point of the experiment. A synchronous motor (H) can be used to rotate the bomb for special applications that are outside the scope of the present discussion. A pump (L) circulates water (thermostatted by unit K) through the jacket and lid. While the combustion experiment is in progress, the temperature of the calorimeter water is recorded as a function of time, and that of the thermostat is monitored. The thermometer is an essential part of all calorimeters (see THERMOMETRY). In the early days of combustion calorimetry, mercury-in-glass thermometers of state-of-the-art accuracy were used. These were superseded by platinum resistance thermometers which, essentially, formed one arm of a Wheatstone (Mueller) bridge, and the change in resistance of the thermometer was determined with the help of the bridge and a galvanometer assembly. Electrical resistances of such thermometers were certified as an accurate function of temperature by national standards laboratories. Thus, a particular value of the resistance of the thermometer corresponded to a certified temperature. Thermistors have also been used as temperature sensors and, more recently, quartz-crystal thermometers. The latter instruments read individual temperatures with an accuracy of at least 1 × 10⁻⁴ K, and give temperature differences (the quantities sought in combustion calorimetric studies) accurate to 2 to 3 × 10⁻⁵ K. They display an instantaneous temperature reading that is based on the vibrational frequency of the quartz crystal. Nowadays, quartz-crystal thermometers are interfaced with computers that make it possible to record and process temperatures automatically and are, therefore, much less labor-intensive to use than their Wheatstone bridge and galvanometer predecessors.
Most ignition systems for combustion calorimetry are designed to release a pulse of electrical energy through a wire fuse from a capacitor of known capacitance charged to a preset voltage. Thus, the ignition energy introduced into the calorimeter, which usually amounts to 1 J, can be accurately calculated or preprogrammed. Overall energies of reaction must be corrected for the introduction of this ‘‘extra’’ energy. As we have mentioned previously, a stirrer helps to disperse the energy released during the combustion throughout the calorimetric system. Its design is usually a compromise between efficiency and size: if the stirrer is too small, it does not dissipate the energy of reaction efficiently; if it is too bulky, an undesirable quantity of mechanical heat is introduced into the experiment. Usually, stirrer blades have a propeller design with a major axis of 1 to 2 cm in length. They are attached via a shaft to a synchronous motor to ensure that the stirring energy is constant, and also by means of a nylon (or other thermal insulator) connector to minimize the transfer of heat by this route from the calorimeter to the surroundings. Most conventional combustion calorimeters are equipped with a heater, the purpose of which is to save time by quickly raising the temperature of the calorimeter to a value that is close to the planned starting point of the
experiment. Such heaters typically have resistances of 150 Ω.

Outline of a Typical Combustion Calorimetric Experiment

In brief, a combustion is begun by discharging an accurately known quantity of electrical energy through a platinum wire bent in such a way (see Fig. 1) that it is close to the combustion crucible when the bomb is assembled. Usually, a short cotton thread is knotted to the wire and the free end is run beneath the material to be burned. This sparks ignition of the sample in the high-pressure O2 and initiates a combustion reaction such as that depicted in Equation 3. Complete combustion of the sample often requires some preliminary experimentation and ingenuity. Although many organic compounds oxidize completely at p(O2) = 3 MPa, some require higher, others occasionally lower, pressures to do so. Cleaner combustions can sometimes be realized by changing, on a trial-and-error basis, the mass of the crucible. Some investigators have promoted complete combustion by adding an auxiliary substance to the crucible. For example, a pellet of benzoic acid placed beneath a pellet of the substance under study has often promoted complete combustion of a material that was otherwise difficult to burn. Paraffin oil has also been used for this purpose. Needless to say, allowance must be made for the contribution of the chosen auxiliary material to the overall energy of reaction.

Role of Analytical Chemistry in Combustion Calorimetry

There are usually two main underlying reasons for performing combustion calorimetry. One may wish to determine the energy of combustion of a not necessarily pure substance as an end in itself. An example is the measurement of the "calorific value" of a food or coal, where one simply requires a value (J/g) for the energy released when 1 g of a particular sample is burned in O2 at a specified pressure. Here, the sample is not a pure compound and may not be homogeneous.
It is usually necessary to dry a coal sample, but few, if any, analytical procedures are prerequisite. On the other hand, the ultimate research objective may be the establishment of the enthalpy of formation of a compound, a fundamental property of the pure substance, as we have pointed out. In that case, a number of crucial operations must be an integral part of an accurate determination of the energy of combustion, and it is recommended that most of these procedures, for reasons that will become apparent, be performed even before calorimetric experiments are begun. The underlying philosophy here is governed by the first law of thermodynamics. The first law, in brief, demands exact characterization of the initial states of the reactants and final states of the products, in that they must be defined as precisely as possible in terms of identity and composition. An unavoidable consequence of these criteria is the need for extensive analytical characterization of reagents and products, which, in the experience of this writer, may require more time and effort than the calorimetric measurements themselves. Consequently, combustion processes must be as ‘‘clean’’ as possible, and incomplete reactions should be avoided to eliminate the time and expense required to define
them. For example, in combustion calorimetric studies of a compound of iron, it is clearly preferable to produce Fe(III) only, rather than a mixture of Fe(III) and Fe(II). In dealing with organic compounds, the formation of CO2 only, to the exclusion of CO, is a constant objective of combustion calorimetrists. Such desiderata may require preliminary research on suitable arrangements of the sample, experiments on the effects of varying the pressure of O2, or the addition of an auxiliary kindler, as mentioned previously. Although it may seem trivial, it is essential that the experimental substance be identified. For example, the x-ray diffraction pattern of an inorganic compound should be determined. For the same reason, melting temperatures and refractive indices of organic compounds, together with the C, H, and O contents, are used as identifiers. Impurities above minor concentrations (mass fractions of the order of 10⁻⁵) can exercise a consequential effect on the energy of combustion of a material. This is a particularly serious matter when the difference between the massic energies of combustion (J/g) of the main and impurity phases is large. In the case of organic compounds, water is a common impurity; it acts as an inert contaminant during combustions in O2. Thus, if H2O is present at a mass fraction level of 0.05, then, in the absence of other impurities, the experimental energy of combustion will be lower (less exothermic) than the correct value by 5 percent. For most applications, this would introduce a disastrous error in the derived enthalpy of formation. Isomers can be another form of impurity in organic materials. Fortunately, the energy difference between the substance under investigation and its isomer may be small. In that case, a relatively large concentration of the latter may introduce little difference between the experimental and correct energies of combustion.
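The water example above reduces to a one-line correction: for an inert contaminant at mass fraction w, the measured massic energy of combustion is (1 − w) times that of the pure compound. A minimal sketch (the function name and numerical values are ours, for illustration only):

```python
# Correct a measured massic energy of combustion for an inert impurity
# (e.g., water) present at mass fraction w. The contaminant releases no
# combustion energy, so u_measured = (1 - w) * u_pure.

def correct_for_inert_impurity(u_measured: float, w: float) -> float:
    """Massic energy of combustion of the pure compound, same units as input."""
    return u_measured / (1.0 - w)

# Hypothetical measurement on a sample containing 5 mass percent water:
# the raw value is 5 percent low, as noted in the text.
u_meas = 25000.0   # J/g, illustrative number only
u_pure = correct_for_inert_impurity(u_meas, 0.05)
print(f"corrected value for the pure compound = {u_pure:.0f} J/g")
```

Impurities with massic energies of combustion very different from that of the main phase require the more general weighted correction discussed in the text.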
When isomeric contamination is possible, it is essential that organic materials be examined, by methods such as gas chromatography or mass spectrometry, prior to the commencement of calorimetric measurements. In summary, then, it is vital that substances to be studied by combustion calorimetry in oxygen be analyzed as thoroughly as possible. Inorganics should be assayed for principal elements (especially when the phase diagram raises the possibility of nonstoichiometry), and for traces of metals and of C, H, O, and N. In the case of organics, impurities such as H2O and structural isomers should be sought, and elemental analyses (for C, H, O, N, etc.) should be performed as a matter of course. Fractional freezing measurements may also be used to determine contaminant levels. A further check on the purity, though not on the presence of isomers, is obtained by carefully weighing the gaseous products of reaction, usually CO2 and H2O. This is not a trivial procedure, but has been described in detail (Rossini, 1956). Because oxygen takes part in the combustion, its purity, too, is of importance. Gas with mole fraction x(O2) = 0.9999 is acceptable and can be purchased commercially. Exclusion of traces of organic gases and CO is crucial here; their participation in the combustion process introduces a source of extraneous energy that usually cannot be quantified. Efforts are also made routinely to eliminate or minimize N2(g) because it forms HNO3 and HNO2 in
THERMAL ANALYSIS
the combustion vessel. These acids can be assayed accurately, and are routinely sought and corrected for in carefully conducted combustion calorimetric experiments. We have dwelt at some length on the efforts that must be made analytically to characterize calorimetric samples. From this facet of a study, one or more of the following conclusions may generally be drawn: (1) the sample can be regarded with impunity as pure; (2) trace impurities are present at such a level that they play a significant thermal role in the combustion and must be corrected for; and/or (3) major element analyses show the sample to be nonstoichiometric, or stoichiometric within the uncertainties of the analyses. By fractional freezing, recrystallization, distillation, or other methods of purification, organic substances can be freed of contaminants to the extent that none are detected by techniques such as infrared spectroscopy and gas chromatography. In such cases, the measured energy of combustion is attributed to the pure compound. It is uncommon, apart from the pure elements and specially synthesized substances such as semiconductors or glasses, to encounter inorganic materials devoid of significant levels of contamination. In applying thermochemical corrections here, a certain element of guesswork is often unavoidable. In titanium carbide (TiC), for instance, trace nitrogen is likely to be combined as TiN. But how would trace silicon be combined? As SiC or TiSix? As in this case, there are frequently no guidelines to help one make such a decision. Small amounts of an extraneous phase can be difficult to identify unequivocally. The predicament here is compounded if the massic energies of combustion of the putative phases differ significantly. In such cases, one may have no alternative but to calculate the correction for each assumed form of the contaminant, take a mean value, and adopt an uncertainty that blankets all the possibilities. 
Recall that an impurity correction is based not only on the amount of contaminant but also on the difference between its massic energy of combustion and that of the major phase. When major element analyses suggest that a substance is nonstoichiometric, one must be sure that this finding is consistent with phase diagram information from the literature. A stoichiometric excess of one element over the other in a binary substance could mean, for example, that a separate phase of one element is present in conjunction with the stoichiometric compound. At levels of about 0.1 mass percent, x-ray analyses (see X-RAY TECHNIQUES) may not reveal that separate phase. Microscopic analyses (OPTICAL MICROSCOPY and REFLECTED-LIGHT OPTICAL MICROSCOPY) can be helpful in such cases. Products of combustion must be examined just as thoroughly as the substance to be oxidized. When they are gaseous, analyses by Fourier transform infrared (FTIR) spectroscopy may be adequate. If a unique solid is formed, it must be identified, by x-ray diffraction (X-RAY POWDER DIFFRACTION) and, preferably, by elemental analyses. A complete elemental analysis, preceded by x-ray diffraction examination, must be performed where two or more distinct solids are produced; thus, their identities and relative concentrations are determined. It is wise to carry out and interpret the analytical characterization, as much as possible, before accurate determinations of the energy of combustion are begun. It would clearly be unfortunate to discover, after several weeks of calorimetric effort, that the substance being studied did not correspond to the label on the container, or that sufficient contamination was present in the sample to make the measurements worthless.

Details of Calorimetric Measurements

In practice, once the analytical characterization has been completed, a satisfactory sample arrangement devised, and the bomb assembled and charged with O2, the remaining work essentially involves measurements, as a function of time, of the temperature change of the calorimetric system caused by the combustion. As we have already mentioned, the temperature-measuring devices of modern calorimeters are interfaced with computers, by means of which the temperature is recorded and stored automatically at preset time intervals. When a quartz-crystal thermometer is used, that time interval is, typically, 10 sec or 100 sec. The change in energy to be measured in an experiment, and thus the temperature change, depends on the mass and massic energy of combustion of the sample. Thus, the combustion in O2 of 1 g of benzoic acid, whose massic energy of combustion is 26.434 kJ/g, will result in an energy release of 26.434 kJ. What will be the corresponding approximate temperature rise of the calorimeter? That depends on a quantity known as the energy equivalent of the calorimetric system, e(calor), and we shall shortly describe its determination. Simply put, this is the quantity of energy required to increase the temperature of the calorimeter by 1 K. Thus, the combustion of 1 g of benzoic acid will increase the temperature of a conventional macro calorimeter with e(calor) = 13200 J/K by about 2 K.
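The arithmetic linking sample mass, massic energy of combustion, energy equivalent, and temperature rise can be sketched as follows (the function name is ours; the benzoic acid numbers are those quoted in the text):

```python
# Approximate temperature rise of a calorimeter of energy equivalent
# e(calor) when a sample of mass m and massic energy of combustion u
# is burned: delta_T = m * u / e(calor).

def temperature_rise(mass_g: float, u_j_per_g: float,
                     e_calor_j_per_k: float) -> float:
    """Temperature rise (K) = released energy (J) / energy equivalent (J/K)."""
    return mass_g * u_j_per_g / e_calor_j_per_k

# 1 g of benzoic acid (26434 J/g) in a macro calorimeter with
# e(calor) = 13200 J/K, as in the text:
dT = temperature_rise(1.0, 26434.0, 13200.0)
print(f"expected temperature rise = {dT:.2f} K")  # about 2 K
```

The same relation, inverted, shows why milligram-scale samples demand the miniature calorimeters with e(calor) of about 1 kJ/K mentioned below.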
It is worth noting that, because many of the most interesting organic materials are available in only milligram quantities, some investigators (Månsson, 1979; Mackle and O'Hare, 1963; Mueller and Schuller, 1971; Parker et al., 1975; Nagano, 2000) constructed miniature combustion calorimeters with e(calor) ≈ 1 kJ/K. In principle, it is desirable to aim for an experimental temperature change of at least 1 K. However, that may be impossible in a macro system because of, e.g., the paucity of sample, and one may have to settle for a temperature change of 0.5 K or even 0.1 K. Needless to say, the smaller this quantity, the greater the scatter of the results. In short, then, the energy change of the experiment will be given by the product of the energy equivalent of the calorimetric system and the temperature change of the calorimeter. At this point, we shall discuss the determination of the latter quantity. Figure 3 shows a typical plot of temperature against time for a combustion experiment, and three distinct regions of temperature are apparent. Those are usually called the fore period, the main period, and the after period. Observations begin at time ti; here the calorimeter temperature (say, at T = 297 K) drifts slowly and uniformly upward because the temperature of the surroundings is higher (T = 298.15 K). When a steady drift rate has been established (fore period) and recorded, the sample is
Figure 3. Typical temperature versus time curve for a calorimetric experiment. ti, start of the fore period; tb, end of the fore period (ignition); te, start of the after period; tf, end of the after period.
ignited at time tb, whereupon the temperature rises rapidly (main period) until, at time te (for a representative macro combustion calorimeter, te − tb ≈ 10 min), the drift rate is once again uniform. The temperature of the calorimeter is then recorded over an after period of about 10 min, and, at tf, the measurements are terminated. At the end of an experiment (typical duration about 0.5 hr), one has recorded the temperature at intervals of, say, 10 sec, from ti to tf, from which the temperature rise of the calorimeter is calculated. To be more exact, the objective is to calculate the correction that must be applied to the observed temperature rise to account for the extraneous energy supplied by stirring and the heat exchanged between the calorimeter and its environment (recall that a stirrer is used to bring the temperature of the calorimeter to a uniform state, and that there are connections between the calorimeter and the surroundings, by way of the stirrer, thermometer, ignition, and heater leads, all of which afford paths for heat leaks to the environment). It is not within the scope of this unit to give particulars of the calculation of the correction to the temperature change. That procedure is described in detail by several authors, including Hubbard et al. (1956) and Sunner (1979). Modern calorimeters, as we have mentioned, are programmed to carry out this calculation once the time versus temperature data have been acquired. Suffice it to say that the correction, which is based on Newton's law of cooling, can be as much as 1 percent of the observed temperature rise in isoperibol calorimeters, and therefore cannot be ignored. By contrast, adiabatic calorimeters are designed in such a way that the correction to the temperature rise is negligible (Kirklin and Domalski, 1983). Why, then, are not all combustion calorimeters of the adiabatic kind? The simple answer is that the isoperibol model is easier than the adiabatic to construct and operate.
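The heat-exchange correction can be illustrated numerically. The sketch below follows the Regnault–Pfaundler idea: drift rates measured in the fore and after periods fix the constants of a Newton's-law drift equation, dT/dt = a − kT, which is then integrated over the recorded main-period temperatures. It is a simplified illustration with synthetic data, not the full procedure of Hubbard et al. (1956); all names and numbers are ours.

```python
import numpy as np

def corrected_rise(t, T, g_fore, T_fore_mean, g_after, T_after_mean):
    """Corrected temperature rise by a simplified Regnault-Pfaundler method.

    Assumes the drift obeys dT/dt = a - k*T (Newton's law of cooling plus a
    constant stirring term). g_fore, g_after are drift rates (K/s) and
    T_fore_mean, T_after_mean the mean temperatures (K) of the fore and
    after periods; t, T are main-period times (s) and temperatures (K).
    """
    t, T = np.asarray(t, float), np.asarray(T, float)
    k = (g_fore - g_after) / (T_after_mean - T_fore_mean)  # cooling constant
    a = g_fore + k * T_fore_mean                           # drift intercept
    f = a - k * T                    # instantaneous drift during main period
    leak = float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(t)))  # trapezoid rule
    return float(T[-1] - T[0]) - leak  # observed rise minus drift contribution

# Synthetic main period: exponential approach from 297.2 K toward 299.2 K,
# sampled every 10 s for 10 min, with hypothetical fore/after drift rates.
t = np.arange(0.0, 601.0, 10.0)
T = 299.2 - 2.0 * np.exp(-t / 60.0)
theta_c = corrected_rise(t, T, g_fore=2.0e-4, T_fore_mean=297.2,
                         g_after=-1.0e-4, T_after_mean=299.2)
print(f"corrected temperature rise = {theta_c:.3f} K")
```

Note that the correction here is of the order of a few percent of the rise; in a well-designed isoperibol instrument it is smaller, but, as the text notes, never negligible.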
Calibration of the Calorimetric System

Earlier in the discussion, we quickly passed over details of the determination of e(calor), the energy equivalent of the calorimeter, defined as the amount of energy required to increase the temperature of the calorimeter by 1 K. It
can also be thought of as the quantity by which the corrected temperature rise is multiplied to obtain the total energy measured during the combustion. In practice, to obtain e(calor), one must supply a precisely known quantity of energy to the calorimeter, and, as part of the same experiment, determine the accompanying corrected temperature rise as outlined above. The most precise determination of e(calor) is based on the transfer to the calorimeter of an accurately measured quantity of electrical energy through a heater placed, preferably, at the same location as the combustion crucible. If the potential drop across the heater is denoted by E, the current by i, and the length of time during which the current flows by t, then the total electrical energy is given by Eit, and e(calor) = Eit/θc, where θc denotes the corrected temperature rise of the calorimeter caused by the electrical heating. Because most laboratories that operate isoperibol calorimeters are not equipped for electrical calibration, a protocol has been adopted which, in essence, transfers this procedure from standardizing to user laboratories. Thus, standard reference material benzoic acid, whose certified energy of combustion in O2 has been measured in an electrically calibrated calorimeter, can be purchased, e.g., from the National Institute of Standards and Technology (NIST) in the U.S.A. The certified value of the massic energy of combustion in pure O2 of the current reference material benzoic acid (NIST Standard Reference Material SRM 39j) is 26434 J/g, when the combustion is performed within certain prescribed parameters (of pressure of O2, for example). An accompanying certificate lists minor corrections that must be applied to the certified value when the experimental conditions deviate from those under which the energy of combustion was determined at the standardizing laboratory. It is desirable that the combustion gases be free from CO, and that soot and other byproducts not be formed.
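Both calibration routes, electrical and benzoic acid, reduce to simple arithmetic. The sketch below uses illustrative numbers (not those of any particular instrument), together with the certified 26434 J/g for SRM 39j quoted above:

```python
# Energy equivalent of a calorimeter, e(calor), from two calibration
# routes. All numerical inputs are illustrative only.

def e_calor_electrical(E_volt: float, i_amp: float, t_s: float,
                       theta_c: float) -> float:
    """Electrical calibration: e(calor) = E*i*t / theta_c, in J/K."""
    return E_volt * i_amp * t_s / theta_c

def e_calor_benzoic(mass_g: float, theta_c: float,
                    u_ba: float = 26434.0) -> float:
    """Benzoic acid calibration: e(calor) = m * u(BA) / theta_c, in J/K,
    using the certified massic energy of combustion of SRM 39j (J/g)."""
    return mass_g * u_ba / theta_c

# Hypothetical heater run: 20 V, 1.1 A, 1200 s, corrected rise 2.0000 K.
print(e_calor_electrical(20.0, 1.1, 1200.0, 2.0))   # 13200.0 J/K
# Hypothetical check: 1 g of benzoic acid, corrected rise 2.0026 K.
print(e_calor_benzoic(1.0, 2.0026))                 # about 13200 J/K
```

In practice, as the text goes on to explain, a series of such calibrations is performed and the mean value of e(calor) and its standard deviation are reported.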
While it is possible to correct for the formation of soot and CO, for example, it is always preferable that combustions yield only those chemical entities, CO2 and H2O, obtained during the certification process. Thus, by simultaneous measurement of θc, one determines e(calor). Usually, a series of seven or eight calibration experiments is performed, and the mean value of the energy equivalent of the calorimetric system, ⟨e(calor)⟩, and its standard deviation are calculated. In carefully performed calibrations with calorimeters of the highest accuracy, e(calor) can be determined with a precision of about 0.006 percent. It is recommended practice to examine, from time to time, the performance of combustion calorimeters. Check materials are used for that purpose; they are not standard reference materials but, rather, substances for which consistent values of the massic energies of combustion have been reported from a number of reputable laboratories. Examples include succinic acid, nicotinic acid, naphthalene, and anthracene.

Standard State Corrections in Combustion Calorimetry

So far, we have discussed the analytical characterization of the reactants and products of the combustion and the
determination of the corrected temperature rise. At this point, when, so to speak, the chemical and calorimetric parts of the experiment have been completed, we recall that we are in pursuit of the change of energy associated with the combustion of a substance in O2, where all reactants and products are in their standard reference states at a particular temperature, usually 298.15 K. Therefore, it is essential to compute the energy difference between the state in the bomb of each species involved in the reaction and its standard reference state. If, for example, one studies the combustion of a hydrocarbon at a pressure of 3 MPa of O2 at T = 297.0 K, a correction must be made that takes into account the energy change for the (hypothetical) compression of the solid or liquid from the initial p = 3 MPa to p = 0.1 MPa, the reference pressure, and to T = 298.15 K. Similarly, the energy change must be calculated for the expansion of product CO2(g) from a final pressure of 3 MPa and T = 298.10 K, to p = 0 and T = 298.15 K. Among other things, allowance must be made for the separation of CO2 from the solution it forms by reaction with product H2O and adjustment of its pressure to p = 0.1 MPa and temperature to T = 298.15 K. There are numerous such corrections. Their sum is called the standard state correction. It is beyond the scope of the present treatment to itemize them and detail how they are calculated. This topic has been dealt with extensively by Hubbard et al. (1956) and by Månsson and Hubbard (1979). Normally, six to eight measurements are performed. The mean value and standard deviation of the massic energy of combustion are calculated. In thermochemical work, uncertainties are conventionally expressed as twice the standard deviation of the mean (see MASS AND DENSITY MEASUREMENTS).

Combustion Calorimetry in Gases Other Than Oxygen

Several times in this unit we referred to the desirability of obtaining so-called "clean" combustions.
The impression may have been given that it is always possible, by hook or by crook, to design combustion experiments in oxygen that leave no residue and form reaction products that are well defined. Unfortunately, that is not so. For example, combustions of sulfur-containing inorganic compounds such as metal sulfides generally yield mixtures of oxides of the metal and of sulfur. A good example is the reaction of MoS2, which forms a mixture of MoO2(cr), MoO3(cr), SO2(g), SO3(g), and, possibly, an ill-defined ternary (Mo, S, O) phase. Recall that reliable calorimetric measurements require not only that the products of reaction be identified, but that they be quantified as well. Thus, a determination of the standard energy of combustion of MoS2 in oxygen would require extensive, and expensive, analytical work. One could give additional similar examples of "refractory" behavior, such as the combustion of UN to a mixture of uranium oxides, nitrogen oxides, and N2(g). In such instances, O2(g) is clearly not a powerful enough oxidant to drive the reaction to completion: MoS2 to MoO3 and 2SO3, and UN to (1/3)U3O8 and (1/2)N2(g). Because of the expanding interest in the thermochemistry of inorganic materials, scientists, of necessity, began to explore more potent gaseous oxidants, among them the
halogens: F2, Cl2, Br2, and interhalogens such as BrF3 and ClF3. Of the oxidants just listed, F2 has been used most frequently, and it will be discussed briefly here. Its clear advantage over O2 is illustrated by the reaction with MoS2:

MoS2(cr) + 9F2(g) → MoF6(g) + 2SF6(g)    (17)
Here, simple, well-characterized products are formed, unlike the complicated yield when O2 is used. Comparatively little analytical characterization is required, apart from identification of the hexafluorides by infrared spectroscopy. Unfortunately, the very reactivity that makes F2 so effective as a calorimetric reagent imposes special requirements (apart from those that arise from its toxicity), particularly with regard to materials of construction, when it is used in thermochemical studies. Combustion vessels must be constructed of nickel or Monel, and seals that are normally made of rubber or neoprene, suitable for work with oxygen, must be fashioned from Teflon, gold, lead, or other fluorine-resistant substances. Other precautions, too numerous to detail here, are listed in a recent book that deals with fluorine bomb calorimetry (Leonidov and O’Hare, 2000).
DATA ANALYSIS AND INITIAL INTERPRETATION

Computation of Enthalpies of Formation From Energies of Combustion

As we pointed out in the introduction to this article, massic energies of combustion are generally determined in order to deduce the standard molar enthalpy of formation. In this section, we will show how the calculations that lead to ΔfH°m are performed for an organic and an inorganic substance. Ribeiro da Silva et al. (2000) reported the standard massic energy of combustion in oxygen, Δcu°, of D-valine (C5H11NO2) according to the following reaction:

C5H11NO2(cr) + (27/4)O2(g) = 5CO2(g) + (11/2)H2O(l) + (1/2)N2(g)    (18)
Δcu° = −(24956.9 ± 10.2) J g⁻¹. The molar mass of valine, based on the atomic weights from IUPAC (1996), is 117.15 g mol⁻¹; therefore, the standard molar energy of combustion ΔcU°m is given by: (0.11715) × (−24956.9 ± 10.2) kJ mol⁻¹ = −(2923.7 ± 1.2) kJ mol⁻¹. For the combustion reaction of valine, ΔngRT = −3.1 kJ mol⁻¹ (on the basis of Equation 18), where R = 8.31451 J K⁻¹ mol⁻¹ and T = 298.15 K. Thus, ΔcH°m = {−(2923.7 ± 1.2) − 3.1} kJ mol⁻¹ = −(2926.8 ± 1.2) kJ mol⁻¹, which, combined with the standard molar enthalpies of formation of (11/2)H2O(l) and 5CO2(g) (CODATA, 1989), yields

ΔfH°m(C5H11NO2, cr, 298.15 K)/(kJ mol⁻¹) = (11/2) × (−285.83 ± 0.04) + 5 × (−393.51 ± 0.13) + (2926.8 ± 1.2)
= −(612.8 ± 1.4)    (19)
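The arithmetic of Equations 18 and 19 is easily checked in a few lines; the values are those quoted in the text (Ribeiro da Silva et al., 2000; CODATA, 1989), with uncertainties combined in quadrature:

```python
import math

# Standard molar enthalpy of formation of D-valine, C5H11NO2(cr), from the
# massic energy of combustion reported by Ribeiro da Silva et al. (2000).
R, T = 8.31451, 298.15        # gas constant, J/(K mol); temperature, K
M = 117.15                    # molar mass of valine, g/mol (IUPAC, 1996)
dc_u = -24956.9               # standard massic energy of combustion, J/g

dc_U = dc_u * M / 1000.0                 # kJ/mol
dn_g = (5 + 1 / 2) - 27 / 4              # change in moles of gas = -5/4
dc_H = dc_U + dn_g * R * T / 1000.0      # kJ/mol

# Delta_f H(valine) = sum of Delta_f H(products) - Delta_c H (CODATA, 1989):
df_H = (11 / 2) * (-285.83) + 5 * (-393.51) - dc_H
u = math.sqrt((11 / 2 * 0.04) ** 2 + (5 * 0.13) ** 2 + 1.2 ** 2)

print(f"Delta_f H(C5H11NO2, cr, 298.15 K) = {df_H:.1f} +/- {u:.1f} kJ/mol")
# -612.8 +/- 1.4, in agreement with Equation 19
```

The same template, with the appropriate reaction stoichiometry and auxiliary enthalpies of formation, applies to the plutonium carbide example that follows.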
Johnson et al. (1970) determined calorimetrically the massic energy of combustion in oxygen of a specimen of PuC0.878, according to the following equation:

PuC0.878(cr) + 1.878O2(g) = PuO2(cr) + 0.878CO2(g)    (20)

and reported Δcu° = −(5412.0 ± 10.5) J g⁻¹. On the basis of the molar mass of the carbide, M = 249.66 g mol⁻¹, the molar energy of combustion ΔcU°m is calculated to be −(1351.2 ± 2.6) kJ mol⁻¹. From Equation 20, ΔngRT = −2.5 kJ mol⁻¹ and, thence, ΔcH°m = −(1353.7 ± 2.6) kJ mol⁻¹. We take ΔfH°m(PuO2) = −(1055.8 ± 0.7) kJ mol⁻¹ and ΔfH°m(CO2, g) = −(393.51 ± 0.13) kJ mol⁻¹ (CODATA, 1989), and calculate ΔfH°m(PuC0.878, cr) = −(47.6 ± 2.7) kJ mol⁻¹. In these examples, the original studies gave uncertainties of the Δcu° results. These were, in turn, combined in quadrature by the authors with the uncertainties associated with the auxiliary values of ΔfH°m.
SAMPLE PREPARATION

The preparation of a sample for combustion calorimetry is governed by the desideratum that all the sample, or as much as possible of it, be reacted to completion in a well-defined combustion process. The choice of physical form of the specimen is fairly limited: a massive piece of material (for example, a compressed pellet of an organic substance, or part of a button of an inorganic compound prepared by arc melting) should be used in initial experiments, and, if it fails to react completely, combustions of the powdered form at progressively finer consistencies should be explored. The experience of numerous combustion calorimetrists suggests, however, that organics almost always react more completely in pellet form. It is believed that most such reactions occur in the gas phase, after some material is initially vaporized from the hot zone. Some inorganic substances behave similarly; others, however, melt and react in the liquid phase, in which case chemical interaction with the sample support is a distinct possibility. Even after prolonged exploration of the effect of subdivision of the material, acceptable combustions sometimes cannot be designed, at which point, as we have mentioned earlier, other variables such as sample support, auxiliary combustion aid, and gas pressure are investigated. What the experimentalist is attempting here is to concentrate, as sharply as possible, the hot zone at the core of the combustion. That the local temperatures in a combustion zone can be substantial is demonstrated by the discovery of small spheres of tungsten, formed from the molten metal, that remained in calorimetric bombs after combustions of tungsten in fluorine to form tungsten hexafluoride (Leonidov and O'Hare, 2000). In any case, it is important to avoid contamination during the comminution of pure materials to be used in calorimetry. Operations such as grinding should be performed with clean tools in an inert atmosphere.
Needless to say,
if exploratory experiments indicate the use of powder, it is that material, and not the bulk sample, that should be subjected to the exhaustive analytical characterization detailed elsewhere in this article.
PROBLEMS

The most common problems in combustion calorimetry have to do with: (1) characterization of the sample and reaction products; (2) attainment of complete combustion of the sample; (3) accurate measurement of the temperature change in an experiment; and (4) the proper functioning of the calorimeter. It is clear from our earlier discussions that characterization of the sample and combustion products, and, therefore, the problems related to it, lie entirely in the realm of analytical chemistry. Each calorimetric investigation has its own particular analytical demands, but those lie outside the scope of the present unit and will not be considered here. The major practical problem with any combustion calorimetric study is the attainment of a complete, well-defined reaction. When burnt in oxygen, organic substances containing C, H, and O ideally form H2O and CO2 only. However, side products such as carbon and CO may also be present, and their appearance indicates that the combustion did not reach an adequately high local temperature. One solution to this problem is to raise the combustion pressure, say from the typical 3 MPa to 3.5 MPa or even 4 MPa. Other solutions include lessening the mass of the combustion crucible, reducing the mass of sample, or adding an auxiliary combustion aid. The latter is usually a readily combustible material with a large massic energy of combustion (numerous authors have used benzoic acid or plastic substances such as polyethylene capsules). Problems encountered with inorganic substances can also be addressed by raising or, in some cases, lowering the combustion gas pressure. A lower combustion pressure can moderate the reaction and prevent the melting and consequent splattering of the specimen. Complete combustions of refractory inorganic substances have been achieved by placing them on secondary supports which are themselves consumed in the reaction.
In fluorine-combustion calorimetry, for example, tungsten metal serves this purpose very well, as it boosts the combustion of the sample while itself being converted to gaseous WF6. It is clear that, even with well-characterized samples and products and "clean" combustions, a calorimetric experiment will not be reliable if the temperature-measuring device is not performing accurately. Thus, thermometers to be used for combustion calorimetry should be calibrated by standardizing laboratories or, alternatively, against a thermometer that itself has been calibrated in this way, as have, for example, many quartz-crystal thermometers used in present-day studies (see THERMOMETRY). As we pointed out in the earlier part of this unit, temperature differences, not absolute temperature values, are the quantities obtained in calorimetric work. Finally, we address problems related to the proper functioning of the calorimeter itself. Examples of such
problems might include the following: (1) the thermostatted bath that surrounds the calorimeter is not maintaining a constant temperature over the course of an experiment; (2) the stirrer of the calorimetric fluid is functioning erratically, so that the stirring energy is not constant (as is assumed in the calculations of the corrected temperature change of the calorimeter); or (3) the oxidizing gas is leaking through the reaction vessel gasket. All such problems can be pinpointed (for example, by monitoring the temperature of the calorimetric jacket with a second thermometer). However, other systematic errors may not be so easily diagnosed. Investigators traditionally check the proper functioning of a combustion calorimeter by using it to measure the energy of combustion of a "secondary reference" material. For combustions in oxygen, succinic acid is recommended. Its energy of combustion has been accurately determined with excellent agreement in numerous studies since the advent of modern calorimetry, and is thus well known. If, in check experiments, disagreement with this value (beyond the uncertainty) arises, it should be taken as an indication that one or more systematic errors are present.
Mackle, H. and O’Hare, P. A. G. 1963. High-precision, aneroid, semimicro combustion calorimeter. Trans. Faraday Soc. 59:2693–2701. Ma˚ nsson, M. 1979. A 4.5 cm3 bomb combustion calorimeter and an ampoule technique for 5 to 10 mg samples with vapour pressures below approximately 3 kPa (20 torr). J. Chem. Thermodyn. 5:721–732. Ma˚ nsson, M. 1979. Trends in combustion calorimetry. In Experimental Chemical Thermodynamics. Vol. 1. (S. Sunner and M. Ma˚ nsson, eds.), Chap. 17:2. Pergamon Press, New York. Ma˚ nsson, M. and Hubbard, W. N. 1979. Strategies in the calculation of standard-state energies of combustion from the experimentally determined quantities. In Experimental Chemical Thermodynamics. Vol. 1. (S. Sunner, and M. Ma˚ nsson, eds.), Chap. 5. Pergamon Press, New York. Marsh, K. N. and O’Hare, P. A. G. 1994. Solution Calorimetry. Blackwell Scientific, Oxford. Mueller, W. and Schuller, D. 1971. Differential calorimeter for the determination of combustion enthalpies of substances in the microgram region. Ber. Bunsenges. Phys. Chem. 75:79–81. Nagano, Y. 2000. Micro-combustion calorimetry of coronene. J. Chem. Thermodyn. 32:973–977.
LITERATURE CITED
Parker, W., Steele, W. V., Stirling, W., and Watt, I. 1975. A highprecision aneroid static-bomb combustion calorimeter for samples of about 20 mg: The standard enthalpy of formation of bicyclo[3.3.3]undecane. J. Chem. Thermodyn. 7:795–802.
Carson, A. S. 1979. Aneroid bomb combustion calorimetry. In Experimental Chemical Thermodynamics. Vol. 1. (S. Sunner and M. Ma˚ nsson, eds.), Chap. 17:1. Pergamon Press, New York.
Ribeiro da Silva, M. A. V., Ribeiro da Silva, M. D. M. C., and Santos, L. M. N. B. F. 2000. Standard molar enthalpies of formation of crystalline L-, D- and DL-valine. J. Chem. Thermodyn. 32: 1037–1043.
Carson, A. S. and Wilmshurst, B. R. 1971. The enthalpy of formation of mercury diphenyl and some associated bond energies. J. Chem. Thermodyn. 3:251–258. CODATA Key Values for Thermodynamics. 1989. (J. D. Cox, D. D. Wagman, and V. A. Medvedev, eds.). Hemisphere Publishing Corp., New York. Gunn, S. R. 1972. Enthalpies of formation of arsine and biarsine. Inorg. Chem. 11:796–799. Holley, C. E., Jr. and Huber, E. J., Jr. 1979. Combustion calorimetry of metals and simple metallic compounds. In Experimental Chemical Thermodynamics. Vol. 1. (S. Sunner and M. Ma˚nsson, eds.), Chap. 10. Pergamon Press, New York. Hubbard, W. N., Scott, D. W. and Waddington, G. 1956. Standard states and corrections for combustions in a bomb at constant volume. In Experimental Thermochemistry (F. D. Rossini, ed.), Chap. 10. Interscience, New York. IUPAC. 1996. Atomic weights of the elements 1995. Pure Appl. Chem. 68:2339–2359. Johnson, G. K., van Deventer, E. H., Kruger, O. L., and Hubbard, W. N. 1970. The enthalpy of formation of plutonium monocarbide. J. Chem. Thermodyn. 2:617–622. King, R. C. and Armstrong, G. 1964. Heat of combustion and heat of formation of aluminum carbide. J. Res. Natl. Bur. Stand. (U.S.) 68A:661–668. Kirklin, D. R. and Domalski, E. S. 1983. Enthalpy of combustion of adenine. J. Chem. Thermodyn. 15:941–947. Kornilov, A. N., Ushakova, I. M., and Skuratov, S. M. 1967. Standard heat of formation of zirconium dioxide. Zh. Fiz. Khim. 41:200–204. Kubaschewski, O., Alcock, C. B., and Spencer, P. J. 1993. Materials Thermochemistry. 6th ed. Pergamon Press, New York. Leonidov, V. Ya. and O’Hare, P. A. G. 2000. Fluorine Calorimetry. Begell House, New York.
Rossini, F. D. 1956. Calibrations of calorimeters for reactions in a flame at constant pressure. In Experimental Thermochemistry (F. D. Rossini, ed.), Chap. 4. Interscience, New York. Sunner, S. 1979. Basic principles of combustion calorimetry. In Experimental Chemical Thermodynamics. Vol. 1. (S. Sunner, S. and M. Ma˚ nsson, eds.), Chap. 2. Pergamon Press, New York.
KEY REFERENCES Hubbard, W. N., Scott, D. W., and Waddington, G. 1959. Experimental Thermochemistry. (F. D. Rossini, ed.), Chap. 5. Interscience, New York. This book chapter gives detailed instructions for the calculation of standard-state corrections for combustion in oxygen of organic compounds of sulfur, nitrogen, chlorine, bromine, and iodine. An excellent outline of the corrected temperature rise in a calorimetric experiment is also included. Kubaschewski et al., 1993. See above. Deals with the theory and practice behind the determination of thermodynamic properties of inorganic materials, and includes an appendix of numerical thermodynamic values. Leonidov and O’Hare, 2000. See above. Gives a comprehensive survey of the technique of fluorine combustion calorimetery, along with a completely updated (to 2000) critical evaluation of the thermodynamic properties of (mainly inorganic) substances determined by this method. Sunner, S. and Ma˚ nnson, M. (eds.). 1979. Experimental Chemical Thermodynamics, vol. 1. Pergamon Press, New York. Contains authoritative articles on most aspects of combustion calorimetry, including the theory and practice, application to organic and inorganic substances, history, treatment of errors and uncertainties, and technological uses.
APPENDIX: COMMERCIALLY AVAILABLE COMBUSTION CALORIMETERS

We know of just two commercial concerns that market calorimeters to measure energies of combustion in oxygen. We refer to them briefly here, solely to allow interested consumers to contact the vendors. Parr Instrument Company (Moline, Illinois, U.S.A.; http://www.parrinst.com) lists several isoperibol combustion calorimeters, some of which are of the semimicro kind. IKA-Analysentechnik (Wilmington, North Carolina, U.S.A.; http://www.ika.net) catalogs combustion calorimeters of both isoperibol and aneroid design.

P. A. G. O’HARE
Darien, Illinois
THERMAL DIFFUSIVITY BY THE LASER FLASH TECHNIQUE

INTRODUCTION

Thermal diffusivity is a material transport property characterizing thermal phenomena significant in many engineering applications and fundamental materials studies. It is also directly related to thermal conductivity, another very important thermophysical property. The relationship between the thermal diffusivity a and the thermal conductivity λ is given by a = λ/(ρCp), where ρ and Cp are, respectively, the density and the specific heat at constant pressure of the material. Whereas thermal conductivity measurements involve heat fluxes that are difficult to control and measure accurately, particularly at elevated temperatures, measuring thermal diffusivity basically involves the accurate recording of the temperature response caused by a transient or periodic thermal disturbance at the sample boundary. It is thus frequently easier to measure thermal diffusivity than thermal conductivity. The two other properties in the relationship, density and specific heat, are thermodynamic properties and are either known or can be measured relatively easily. Thermal diffusivity experiments are usually short and relatively simple. In most cases they require only small samples: disks a few millimeters in diameter and less than 4 mm thick. Although the derivation of thermal diffusivity from the recorded experimental data may involve complex mathematics, the availability of large-capacity personal computers and the ease with which thermal diffusivity experiments can be interfaced and automated compensate for this complication. Another important feature of thermal diffusivity methods is that the temperature variation in the sample during measurement can be quite small, so the measured property is related to an accurately known temperature.
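The relation a = λ/(ρCp) above is easily expressed in code. The sketch below is illustrative only; the property values (roughly those of copper near room temperature) are assumptions, not data from this unit.

```python
# Conversion between thermal diffusivity a [m^2/s] and thermal
# conductivity lambda [W/(m K)] via a = lambda / (rho * Cp).
# The numbers used below are illustrative, copper-like values.

def diffusivity_from_conductivity(lam, rho, cp):
    """a = lambda / (rho * Cp): rho in kg/m^3, Cp in J/(kg K)."""
    return lam / (rho * cp)

def conductivity_from_diffusivity(a, rho, cp):
    """lambda = a * rho * Cp: the inverse relation."""
    return a * rho * cp

# hypothetical sample: lam = 401 W/(m K), rho = 8960 kg/m^3, Cp = 385 J/(kg K)
a = diffusivity_from_conductivity(401.0, 8960.0, 385.0)
print(a)   # about 1.16e-4 m^2/s
```

Measuring a and Cp separately, with a known density, therefore yields λ without any heat-flux measurement, which is the practical appeal noted above.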
This advantage enables studies of phase transitions via thermal diffusivity throughout transition ranges, which is often not feasible with thermal conductivity measurements whose methods involve appreciable temperature gradients.
Thermal diffusivity techniques have been used in the temperature range from 4.2 to 3300 K. In general, diffusivity/specific heat techniques are not well suited for cryogenic temperatures: diffusivity values decrease rapidly and specific heat values increase rapidly with decreasing temperature, and their product cannot be determined with great accuracy. The most popular temperature range for thermal diffusivity measurements is therefore from near room temperature to 2000 K. According to the shape of the temperature disturbance, diffusivity techniques may be categorized into two basic groups: transient heat flow and periodic heat flow techniques. Transient techniques are divided into two subgroups, depending upon whether the temperature disturbance is relatively short (pulse techniques) or long (monotonic heating). These methods are discussed in detail in several chapters of Maglic et al. (1984). Periodic heat flow variants are based on measuring the attenuation or the phase shift of temperature waves propagating through the material. Periodic heat flow methods are divided into two groups. The first comprises the temperature wave techniques, which are predominantly devoted to lower and medium temperatures and are frequently called multiproperty, because they can provide data on a number of thermophysical properties within a single experiment. The second comprises the high-temperature variants, in which the energy input is effected by modulated electron or photon beams bombarding the sample. The laser flash technique is the most popular method; according to some statistics, 75% of the thermal diffusivity data published in the 1980s were measured with this technique. In summarizing the advantages and disadvantages of the techniques, the laser flash and the wave techniques may be considered within the same category. Both generally require small samples and vacuum conditions, and measurements are made within very narrow temperature intervals.
The latter characteristic makes them convenient for studying structural phenomena in materials up to very high temperatures. Unlike the wave methods, which need two types of apparatus to span the whole temperature range, the laser flash method can cover the entire range with minor modifications when traversing from subzero to elevated temperatures. Laser flash measurements last less than 1 s and do not require very stable temperatures, while wave techniques require quasi-stationary conditions. Both methods are well studied, and in both the consequences of the main sources of error can be adequately compensated for; there is a large volume of competent literature on this subject. Basic components of laser flash equipment, as well as complete units, are commercially available from Anter Laboratories, Pittsburgh, PA, and Theta Industries, Port Washington, NY. In addition, there are well-established research and testing laboratories available for thermophysical property testing, such as TPRL, Inc., West Lafayette, IN. Temperature wave variants cover a very wide range of materials and are suitable for operation under high pressures and for multiproperty measurement. A particular advantage is the possibility of cross-checking results by comparing data
derived from the amplitude decrement with data derived from the phase lag. The wave techniques have proved particularly convenient for measuring the thermal conductivity and thermal diffusivity of very thin films and deposits on substrates. This feature is very important, as the properties of very thin films typically differ from those of the bulk material; the laser flash method is still establishing its place in this area. A less attractive feature of the wave techniques is that the measurements generally have to be carried out in vacuum. This limits these methods to materials in which the ambient gas does not contribute to the energy transport. The corrections for both the laser flash and the wave techniques involve sophisticated data reduction procedures, and the potential user of this equipment should be proficient in this area. Because of the small samples, these techniques are not well suited for coarse-matrix materials, in which local inhomogeneities may be comparable to the sample thickness, or for optically translucent or thermally transparent materials.
Figure 1. Comparison of normalized rear-face temperature rise with the theoretical model.
PRINCIPLES OF THE METHOD

The flash method of measuring thermal diffusivity was first described by Parker et al. (1961). Its concept is based on deriving the thermal diffusivity from the thermal response of the rear side of an adiabatically insulated infinite plate whose front side has been exposed to a short pulse of radiant energy. The resulting temperature rise of the rear surface of the sample is measured, and thermal diffusivity values are computed from the temperature-rise-versus-time data. The physical model assumes the following ideal boundary and initial conditions: (a) an infinitely short pulse, (b) one-dimensional heat flow normal to the plate face, (c) adiabatic conditions on both plate faces, (d) uniform energy distribution over the pulse, (e) pulse absorption in a very thin layer of the investigated material, (f) homogeneous material, and (g) relevant material properties constant within the range of the disturbance. As shown in the Appendix, the last two assumptions reduce the general heat diffusion equation to the form

∂T/∂t = a ∂²T/∂x²    (1)
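The flash experiment implied by these assumptions can be reproduced with a minimal finite-difference sketch: an adiabatic slab absorbs an instantaneous pulse on its front face, and the diffusivity is recovered from the rear-face half-rise time via a = 0.1388 L²/t0.5. All numerical values below (diffusivity, thickness, grid size) are illustrative assumptions, not data from the text.

```python
# Explicit finite-volume simulation of the ideal flash experiment,
# followed by recovery of the diffusivity from the half-rise time.
a_true = 1.0e-5   # thermal diffusivity, m^2/s (hypothetical sample)
L = 2.0e-3        # sample thickness, m
N = 100           # number of control volumes across the thickness
dx = L / N
dt = 0.4 * dx * dx / a_true        # within the explicit stability limit
r = a_true * dt / (dx * dx)

T = [0.0] * N
T[0] = 1.0            # pulse absorbed in a thin front layer (assumption e)
T_eq = sum(T) / N     # adiabatic faces conserve energy: uniform final temp

t, t_half, prev = 0.0, None, 0.0
while t_half is None:
    Tn = T[:]
    Tn[0] = T[0] + r * (T[1] - T[0])          # insulated front face
    Tn[-1] = T[-1] + r * (T[-2] - T[-1])      # insulated rear face
    for i in range(1, N - 1):
        Tn[i] = T[i] + r * (T[i + 1] - 2.0 * T[i] + T[i - 1])
    T, t = Tn, t + dt
    rear = T[-1]
    if rear >= 0.5 * T_eq:
        # linear interpolation for the time of the 50% rise
        t_half = t - dt * (rear - 0.5 * T_eq) / (rear - prev)
    prev = rear

a_recovered = 0.1388 * L * L / t_half
print(a_recovered)   # close to a_true (within a few percent)
```

The recovered value agrees with the input diffusivity to within the discretization error, which is the essence of the Parker et al. (1961) analysis.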
where the parameter a represents the thermal diffusivity of the plate material. The solution of this equation relates the thermal diffusivity of the sample to any percent temperature rise and the square of the sample thickness (L):

a = Kx L²/tx    (2)

where Kx is a constant corresponding to an x percent rise and tx is the elapsed time to an x percent rise. Table 2 in the Appendix lists values of K calculated for specific x percent rises. Parker et al. (1961) selected the point corresponding to 50% of the maximum temperature rise, t0.5, relating the thermal diffusivity of the material to the square of the plate thickness and the time needed for the rear-face temperature to reach 50% of its maximum:

a = 0.1388 L²/t0.5    (3)

Clark and Taylor (1975) showed that the rear-face temperature rise curve could be normalized and all experimental data could be compared immediately (on-line) to the theoretical (Carslaw and Jaeger, 1959) solution. Figure 1 shows this comparison, and Table 1 gives the corresponding diffusivity (a) values for various percent rises along with the elapsed times to reach those percentages. The calculated values are all 0.837 ± 0.010 cm²/s, even though the calculations involved times that varied by almost a factor of 3. Thermal diffusivity experiments using the energy flash technique were among the first to use rapid data acquisition to yield thermal transport data on small, simply shaped specimens. The original technique used a flash lamp, but the invention of the laser greatly improved the method by increasing the distance between the heat source and the sample, thus permitting measurements at high temperature and under vacuum conditions.

Table 1. Computer Output for a Diffusivity Experiment^a

Rise (%)    a (cm²/s)    Value (V)    Time (s)
20          0.8292       2.57695      0.023793
25          0.8315       2.75369      0.026113
30          0.8313       2.93043      0.028589
33.3        0.8374       3.04826      0.029275
40          0.8291       3.28391      0.033008
50          0.8347       3.63739      0.038935
60          0.8416       3.99086      0.041497
66.7        0.8466       4.22652      0.049997
70          0.8304       4.34434      0.054108
75          0.8451       4.52108      0.058326
80          0.8389       4.69782      0.065112

^a Maximum, 5.40477 V; half maximum, 3.63739 V; baseline, 1.870 V.
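The constants Kx of Eq. (2) need not be taken from a table: each Kx is simply the dimensionless time ω = at/L² at which the ideal rear-face rise reaches x percent of its maximum, so they can be generated numerically from the Carslaw and Jaeger solution. The sketch below does this by bisection; it is an illustration of where such constants come from, not a reproduction of the Appendix's Table 2.

```python
import math

def ideal_rise(w, terms=60):
    """Ideal normalized rear-face rise at dimensionless time w = a*t/L**2
    (adiabatic slab, instantaneous front-face pulse)."""
    return 1.0 + 2.0 * sum((-1.0) ** n * math.exp(-(n * math.pi) ** 2 * w)
                           for n in range(1, terms + 1))

def K(percent):
    """K_x of Eq. (2): the dimensionless time at which the ideal curve
    reaches x percent of its maximum rise, found by bisection."""
    lo, hi = 0.01, 2.0   # ideal_rise(0.01) is ~0, ideal_rise(2.0) is ~1
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if ideal_rise(mid) < percent / 100.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(round(K(50), 4))   # 0.1388, the half-rise constant of Eq. (3)
```

Applying Eq. (2) with these Kx values to the times of Table 1 at each percent rise is exactly the multi-point check of Clark and Taylor (1975) illustrated in Figure 1.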
The flash technique did not remain limited to the conditions prescribed by the ideal model. The theoretical and experimental work of many researchers, continuing from the 1960s, supplemented the original concept with corrections that account for heat exchange between sample and ambient, finite pulse shape and duration, nonuniform heating, and in-depth absorption of the laser pulse. The result was extension of the method to, e.g., high temperatures, thin samples, layered structures, dispersed composites, and semitransparent or semiopaque materials. Reviews of these developments are presented elsewhere (Taylor and Maglic, 1984; Maglic and Taylor, 1992) and are supplemented in the sections below with information on the literature published during the 1990s. The capabilities of modern data acquisition and data reduction systems offer advantages over the direct approach, i.e., procedures limited to the analysis of the half-rise time or a discrete number of points. The inverse method instead relies on the complete transient response for the thermal diffusivity measurement; this approach results in excellent agreement between theoretical and experimental curves. The parameter estimation procedure is convenient for the flash method, as it enables simultaneous determination of more than one parameter from the same temperature response. The capabilities and advantages of such determinations are presented under Data Analysis and Initial Interpretation.

PRACTICAL ASPECTS OF THE METHOD

Implementation of the laser flash method for thermal diffusivity measurement requires the following (see Fig. 2): a sample holder; a furnace or a cooler capable of maintaining and recording the sample reference temperature; a vacuum-tight enclosure equipped with two windows, for the laser and the detector; a pulse laser with characteristics adequate for the range of materials to be studied, including the available sample diameters and thicknesses; a system for measuring and recording the rear-face temperature transient; a power supply for the furnace or the cooling unit; an adequate vacuum system; and a computer for controlling the experiment, data acquisition, and subsequent data processing.

Figure 2. Schematic view of the laser flash apparatus.

Measurement times of less than 1 s are often involved. The ambient temperature is controlled with a small furnace tube. The flash method is shown schematically in Figure 2, which includes the temperature response of the rear face of the sample. This rear-face temperature rise is typically 1 to 2 K. The apparatus consists of a laser; a high-vacuum system, including a bell jar with windows for viewing the sample; a heater surrounding a sample-holding assembly; an infrared (IR) detector; appropriate biasing circuits, amplifiers, analog-to-digital (A/D) converters, and crystal clocks; and a computer-based digital data acquisition system capable of accurately taking data in the 100-ms time domain. The computer controls the experiments, collects the data, calculates the results, and compares the raw data with the theoretical model. The method is based on the Carslaw and Jaeger (1959) solution of the heat conduction equation for this case. The furnace should be of low thermal inertia and equipped with a programmable high-stability power supply to enable quick changes in reference temperature. Typical examples are electroresistive furnaces heated by direct passage of electrical current through a thin metallic foil or a graphite tube. Vacuum- and/or gas-tight enclosures are mandatory for preventing heat exchange between sample and ambient by convection and gas conduction and for protecting the furnace and sample from chemical damage. They must have two windows along the optical axis of the sample: the front window allows entrance of the laser pulse, and the rear window provides optical access to the sample rear side. The windows should be protected with rotating shields against evaporated deposits, as vapor from both the heater and the sample material can condense on the cold window surfaces, reducing or completely blocking optical access.
The transmittance of the rear window should be high within the bandwidth of the infrared optical detector. The pulse laser [now most commonly neodymium-doped yttrium aluminum garnet (Nd:YAG)] should be capable of supplying pulses preferably lasting less than 1 ms, with 30 to 40 J of energy, a beam diameter of 16 mm, and an energy distribution as homogeneous as possible. Usual sample diameters for homogeneous materials range from 6 to 12 mm, but some applications require diameters as large as 16 mm. Studies of thin films, or of samples whose half times are of the order of 1 ms, require much shorter pulses and Q-switched laser operation. Systems for measuring the reference temperature may be contact or contactless. For most metallic samples, miniature thermocouples spot welded to the specimen rear face are most convenient. For samples where this is not feasible, it is common to use a thermocouple accommodated in the sample holder or to provide a blackbody-like hole in the holder for measurement with an optical pyrometer. In both of the latter cases, the sample temperature must be calibrated against the measured sample holder temperature. The rear-face temperature transients should be detected with optical detectors adequate for the corresponding temperature ranges, meaning that they
should be much faster than the change they are recording and sufficiently linear within small temperature excursions. Optical detectors should be interfaced to the computer via a suitable A/D converter. The power supply should be matched to the furnace characteristics so that the whole temperature range of measurement can be covered easily and rapidly. It should be capable of maintaining the prescribed temperature during the few seconds of measurement needed for the baseline recording and for up to ten half times after the laser discharge. The measurement should be made at constant temperature or in the presence of a small and constant temperature gradient that can be compensated for in the data processing procedure. For realizing temperatures below ambient, a miniature chamber cooled by circulating fluid from an outside cooling unit is adequate. Thermal diffusivity values are calculated from the analysis of the temperature-time dependence of the rear face of a thin sample whose front face has been exposed to a pulse of radiant energy. The duration of the pulse originally had to be approximately one-hundredth of the time needed for the temperature of the rear face to reach 50% of its maximum value; however, present methods for correcting deviations from the initial and boundary conditions, along with modern data processing capabilities, have relaxed this requirement considerably. When the samples are small and homogeneous, lasers are the best means of providing energy pulses. When high accuracy is not the primary requirement and the samples have to be larger, flash lamps may sometimes be adequate, particularly at low temperatures. Once it is established that the heat flow from the front to the rear sample face is sufficiently unidirectional, the central task becomes the accurate measurement of the rear-face temperature change.
Originally, thermocouples were the most common devices used for this purpose, but IR and other contactless detectors now provide far better service because of increased accuracy and reliability; distortion of signals caused by contact temperature detectors may lead to significant errors in the measured thermal diffusivity. The laser flash technique has been used from 100 to 3300 K, but its most popular application has been from near room temperature to 2000 K. The technique has been successfully applied to metals in the solid and liquid states, ceramics, graphites, biological products, and many other materials with diffusivities in the range 10⁻⁷ to 10⁻³ m²/s. The method is primarily applicable to homogeneous materials but has been applied successfully to certain heterogeneous materials. Even with manually driven hardware, the whole range from room temperature to the maximum temperature can be traversed within one day; with more sophisticated equipment, laser flash thermal diffusivity measurements can be quite fast. Methods based on monotonic heating have been used from the lowest temperature limit of 4.2 K to as high as 3000 K. They have proved convenient for a range of materials (ceramics, plastics, composites, and thermal insulators) with thermal diffusivity values falling in the range from 5 × 10⁻⁸ to 10⁻⁵ m²/s. They are irreplaceable for coarse-matrix and large-grain materials and for porous materials in which part of the thermal energy is transported via gas-filled interstices. They are also useful for property measurements
over large temperature intervals carried out in relatively short times, e.g., on materials undergoing chemical or structural changes during heating. The cost is reduced accuracy, but this is usually adequate for industrial purposes. The monotonic heating techniques comprise two subgroups, the so-called regular-regime and quasi-stationary-regime methods, for measurements in narrow and in very wide temperature intervals, respectively. The temperature wave variants may be applied from 60 to 1300 K, but they are generally used between 300 and 1300 K. They are convenient for metals and nonmetals, and also for fluids and liquid metals, with thermal diffusivity in the range 10⁻⁷ to 10⁻⁴ m²/s. This technique has a large variety of variants and modifications, depending on the sample geometry and the direction of propagation of the temperature waves. The measurements include information on the mean temperature field and on the amplitude and phase of the temperature waves. This opens the possibility of multiproperty measurements, i.e., the simultaneous measurement of thermal conductivity, thermal diffusivity, and specific heat on the same sample. The high-temperature variants (modulated electron beam or modulated light beam) can be used from 330 to 3200 K, but their usual range is between 1100 and 2200 K. These variants have been applied to refractory metals in the solid and molten states. They have also been used for high-melting metal oxides, whose thermal diffusivity is in the range 5 × 10⁻⁷ to 5 × 10⁻⁶ m²/s. Improvements in techniques, measurement equipment, and computational analysis continue to be made. The measurement consists of heating or cooling the sample to the desired temperature, firing the laser, digitizing and recording the data, and possibly repeating the measurements for statistical purposes before changing to the next temperature level.
The main difficulties that can be encountered include (1) deviations from the assumed ideal conditions, due to the time of passage of the heat pulse through the sample being comparable with the pulse duration, heat exchange between sample and environment, nonuniform heating of the sample surface, subsurface absorption of the laser pulse, and combined mechanisms of heat transmission through the sample; (2) interference in the transient response signal, due to passage of the laser light between sample and sample holder directly to the IR detector and to thermal radiation from the sample holder; (3) changes of sample chemical composition, due to evaporation of alloying constituents at high temperatures and chemical interaction with the sample holder or optical barrier; and (4) errors in determining the sample reference temperature. Deviations from the assumed ideal conditions manifest themselves in the shape of the transient response, each deforming it in a specific manner. Procedures for correcting all of these deviations have been developed and are part of standard data processing; they are described under Data Analysis and Initial Interpretation. The transient response is also affected by laser light passing between the sample and the sample holder or by reflected laser light reaching the optical sensor. If either of these occurs, the sensor may saturate, deforming the initial portion of the signal rise and thus making it difficult
to define the baseline. Precautions should therefore be taken to eliminate detector saturation. If the thermal diffusivity of the sample holder is higher than that of the sample, or if the holder is partially translucent to the laser light and stray laser pulse light hits it, then the optical detector may measure contributions from both the sample and its holder, which will deform the transient curve. To avoid this, proper choice of the sample holder material and careful design of its dimensions are necessary. Operation at high temperatures may change the chemical composition of the sample through preferential evaporation of some alloying constituents. A convenient way to reduce this effect is to shorten the time the sample is exposed to high temperatures, either by devoting one sample solely to high-temperature measurements and performing these as quickly as possible or by reducing the number of experiments at temperatures beyond the thermal degradation point. In addition, the sample holder material should be selected so that chemical interaction between it and the sample is minimized. Errors in determining the sample reference temperature can be significant, depending upon many factors. These include the location of the temperature sensor with respect to the sample or its holder, the quality of the thermal contact between the sample and the holder, and the measurement temperature and sample emissivity. It may be possible to avoid some of these errors by proper selection of the method and by establishing the relationship between the true sample temperature and that measured in the holder.
METHOD AUTOMATION

Powerful personal computers are presently available at reasonable cost. This offers ample opportunity for automating all measurement procedures, contributing to productivity and to the ease with which measurements are performed. Excessive automation, however, might relegate to the computer decisions that are too subtle to be automated. The following paragraphs therefore review the tasks that can be entrusted to modern computers and those that should be reserved for operators who are knowledgeable about the limitations of the experimental technique and the physics of the properties being measured. Desirable automation involves continuous checking of the experimental conditions. This includes achieving and maintaining the required levels of vacuum and temperature, as well as controlling the heating rates between specified temperature levels. Computers are capable of executing all of these steps, as well as positioning samples when a multisample accessory is involved. Computers should also be involved in performing tasks such as firing the laser and collecting and processing the data. However, all significant phases of the experiment and of the data processing should remain under the operator's control. He or she should first ensure that the prescribed temperature stability has been achieved. After the laser has fired and the initial data set is collected, the quality of the data should be inspected to make sure that the moment of laser discharge is recognizable,
387
the shape of the transient agrees with expectation, and any noise on the signal will not affect the data processing. As the processing progresses, visual inspection of the transformation of the initial experimental data is necessary before proceeding to each successive step of the procedure. Full automation of the experiment, in which the choice of accepting or rejecting the initial data set and the selection of the correction procedures to be invoked are left to the computer, makes the experiment susceptible to various sources of systematic error; if nothing else, suspicion of their existence ought always to be present. For reliable results in flash diffusivity measurements, the operator should be highly experienced.
DATA ANALYSIS AND INITIAL INTERPRETATION

After the sample has been brought to the desired temperature and its stability established, the experiment is started by initiating the program that records the baseline, fires the laser, and acquires and records the rear-face temperature transient. Inspection of the shape of the resulting data curve indicates whether it is acceptable for extracting results or whether the experiment should be repeated. To increase the statistical weight of the data points, several measurements should be taken at the same temperature. The experimental data necessary for the calculation of thermal diffusivity include the baseline, which represents the equilibrium temperature of the specimen prior to laser discharge; the time mark of the laser discharge; and the transient temperature curve, extending over at least ten times the characteristic half time, t0.5. It is also useful to have a working knowledge of the shape of the laser pulse (especially its duration and the intensity distribution across the beam) and of the response characteristics of the temperature detector (response time, linearity). The sample thickness should be measured as precisely as possible prior to the diffusivity measurement. It is important that the transient response curve be thoroughly analyzed to verify the presence or absence of a laser pulse duration comparable with the time of passage of the temperature disturbance through the material (the finite-pulse-time effect), heat exchange between sample and environment (heat losses or gains), nonuniform heating, subsurface heating by the laser beam, or other effects that can cause errors in the thermal diffusivity calculations. This means comparing the normalized rise curve with the ideal curve for the theoretical model. It should be noted that these four effects can be regarded merely as deviations from an ideal situation in which such effects are assumed to be negligible.
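The basic reduction steps just described (baseline from the pre-pulse record, maximum rise, half-rise time by interpolation) can be sketched as follows. The detector readings used here are synthetic, illustrative numbers, not measured data from this unit.

```python
# Sketch: extract the half-rise time from a flash transient, using the
# mean of the pre-pulse points as the baseline and linear interpolation
# between the two samples that bracket the half-rise level.

def half_rise_time(times, signal, n_baseline):
    """Return the time at which the signal reaches half of its maximum
    rise above the baseline (mean of the first n_baseline points)."""
    base = sum(signal[:n_baseline]) / n_baseline
    target = base + 0.5 * (max(signal) - base)
    for i in range(1, len(signal)):
        if signal[i - 1] < target <= signal[i]:
            f = (target - signal[i - 1]) / (signal[i] - signal[i - 1])
            return times[i - 1] + f * (times[i] - times[i - 1])
    raise ValueError("signal never reached half of its maximum rise")

# synthetic record: flat baseline before the pulse, then a linear ramp
times = [-0.002, -0.001, 0.0, 0.02, 0.04, 0.06, 0.08]          # seconds
signal = [1.87, 1.87, 1.87, 2.7525, 3.635, 4.5175, 5.40]       # volts
t_half = half_rise_time(times, signal, 3)
print(t_half)   # about 0.04 s for this synthetic ramp
```

In practice the record is noisy, so smoothing (e.g., cubic splines) and baseline extrapolation precede this step, and the same interpolation yields the times for any other percent rise needed in Eq. (2).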
It is entirely feasible to develop models that incorporate and account for these effects, and this has been done for all four of them. Once the mathematical (analytical or numerical) expression for the temperature rise is known, parameter estimation techniques can be used to calculate the thermal diffusivity of the sample. However, other unknown model parameters, such as heat loss coefficients, the duration and shape of the heat pulse, the spatial distribution characteristics of the pulse, and the effective penetration depth of the pulse, must be estimated simultaneously. Even
today, when advanced parameter estimation techniques are available and high-speed, high-capacity computers can be used, estimating all parameters simultaneously is sometimes quite difficult. Careful and nontrivial mathematical analysis of so-called sensitivity coefficients [see Beck and Arnold (1977) for details] has to be conducted to determine whether all of these unknown parameters can be calculated from the given model for the temperature response. We will briefly describe how to analyze the response curve with regard to all of the above-mentioned effects. To normalize the experimental curve, and as an inherent part of determining the half time (or the time to reach any other percentage temperature rise), it is necessary to determine accurately the temperature baseline and the maximum temperature rise. This is not a trivial task, especially when the signal is noisy and the temperature of the sample was not perfectly stable. Standard smoothing procedures, e.g., cubic-spline smoothing, can be used. The baseline temperature can be obtained conveniently by measuring the detector response for a known period prior to charging the laser flash tube capacitor bank and extrapolating the results to the time interval in which the rear-face temperature is rising. The normalized experimental temperature rise (in which the baseline temperature is set to zero, the maximum temperature is equal to 1, and the time scale unit is set to the experimental half time) is then compared with the ideal dimensionless temperature rise, given by

V(t') = 1 + 2 \sum_{n=1}^{\infty} (-1)^n \exp(-0.1388\, n^2 \pi^2 t')    (4)
where t' = t/t_{0.5} is dimensionless time. When there are no evident deviations of the experimental normalized temperature rise curve from the ideal one (analysis of residuals is a useful tool for this purpose), any of the data reduction methods based on the simple ideal model can be used for the thermal diffusivity calculation. The result should then be checked by the other correction procedures, which should give the same answer. The presence of the finite-pulse-time effect (without other effects being present) can be readily determined from a comparison of the normalized experimental curve with the theoretical model. The distinguishing features are that (1) the experimental curve lags the ideal theoretical curve from a 5% to a 50% rise, (2) the experimental curve leads the ideal theoretical curve from a 59% to a 98% rise, and (3) a long, flat maximum is observed. Of these major effects, the finite-pulse-time effect is the easiest to handle. Cape and Lehman (1963) developed general mathematical expressions for including pulse-time effects. Taylor and Clark (1974) tested the Cape and Lehman expressions experimentally. Larson and Koyama (1968) presented experimental results for a particular pulse characteristic of their flash tube. Heckman (1976) generated tabular values for triangular-shaped pulses. Taylor and Clark (1974) showed how the calculated diffusivity varied with percent rise for a triangular-shaped heat pulse. They also showed how to correct diffusivities
at percent rises other than the half-rise value, so that diffusivity values could be calculated over the entire experimental curve rather than at one point. A procedure based on the Laplace transform was used to correct the finite-pulse-time effect in the data reduction method proposed by Gembarovic and Taylor (1994a). Experimental data are first transformed with the Laplace transformation and then fitted with the transform of the theoretical time-domain relation. More realistic heat pulse shapes can be handled with this method. A data reduction method based on the discrete Fourier transform can eliminate errors from a noisy signal and unstable baseline conditions (Gembarovic and Taylor, 1994b). Azumi and Takahashi (1981) proposed one of the simplest and most universal methods for correcting the finite-pulse-time effect. The correction consists of taking as the time origin the effective irradiation time (the center of gravity of the pulse). Degiovanni (1987) used this technique to correct simultaneously both the finite-pulse-time and heat loss effects. The presence of heat losses is shown by the following features: (1) the experimental curve slightly lags the theoretical curve from a 5% to a 50% rise, (2) the experimental curve leads the theoretical curve from a 50% to a 100% rise, and (3) a relatively short maximum is observed, followed by a pronounced smooth decline. The calculated value of thermal diffusivity increases with increasing percent rise at an increasing rate. Radiation heat losses may be corrected for by the method of Cowan (1963) or Heckman (1976). Cowan's method involves determining the values of the normalized temperature rise at 5t_{0.5} or 10t_{0.5}. From these values one can estimate the radiation loss parameter and correct α. Heat losses can also be corrected by the so-called ratio method (Clark and Taylor, 1975).
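As an illustration of the basic half-time reduction described above, the following sketch normalizes a rear-face trace, locates the half-rise time by interpolation, and applies the Parker relation α = 0.1388 L²/t_{0.5}. This is a minimal illustration, not code from the text; the function names, the synthetic trace, and the tolerances are assumptions.

```python
import math

def ideal_rise(t_norm, terms=30):
    """Ideal dimensionless rear-face temperature rise (Equation 4).
    Valid for t_norm > 0; the alternating series converges slowly near zero."""
    return 1.0 + 2.0 * sum(
        (-1) ** n * math.exp(-0.1388 * n * n * math.pi ** 2 * t_norm)
        for n in range(1, terms + 1)
    )

def half_time_diffusivity(times, temps, thickness):
    """Locate the half-rise time of a baseline-subtracted detector trace by
    linear interpolation and apply alpha = 0.1388 * L**2 / t_half."""
    t_max = max(temps)
    for i in range(1, len(temps)):
        if temps[i] >= 0.5 * t_max:
            frac = (0.5 * t_max - temps[i - 1]) / (temps[i] - temps[i - 1])
            t_half = times[i - 1] + frac * (times[i] - times[i - 1])
            return 0.1388 * thickness ** 2 / t_half
    raise ValueError("trace never reaches half of its maximum rise")
```

For example, a 2-mm-thick sample with α = 1 × 10⁻⁵ m²/s gives t_{0.5} ≈ 55.5 ms, inside the optimum half-rise window noted under Problems below.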
In this method, the experimental data are compared at several particular points with the theoretical rear-face temperature rise with heat losses. The Clark and Taylor method has the advantage of using the data collected during the initial rise rather than the cooling data, which are subject to greater uncertainties caused by conduction to the sample holder. Clark and Taylor tested their procedure under severe conditions and showed that corrections of 50% could be made satisfactorily. The data reduction method proposed by Degiovanni and Laurent (1986) uses the temporal moments of orders zero and one, taken over a defined temperature interval of the rising part of the experimental curve, to correct for the heat loss effect. Koski (1982) proposed an integrated data reduction method in which both the finite-pulse-time effect and heat losses are corrected using the form
V(L,t) = 2 \sum_{m=1}^{M} F(t_m)\, \Delta t_m \sum_{n=1}^{\infty} \frac{\beta_n (\beta_n \cos\beta_n + L_g \sin\beta_n)}{\beta_n^2 + L_g^2 + 2L_g} \exp\left[-\frac{\beta_n^2 \alpha (t - t_m)}{L^2}\right]    (5)
where the pulse time t is divided into M subintervals t_1, t_2, . . ., t_M; the function F(t_m) represents the heat pulse shape at time t_m; \Delta t_m = t_m - t_{m-1}; L_g is the heat loss parameter; and the \beta_n are the roots of the transcendental equation

(\beta^2 - L_g^2) \tan\beta = 2 L_g \beta    (6)
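The roots of this transcendental equation are straightforward to compute numerically. The sketch below is illustrative (the function name and bracketing strategy are assumptions, not from the text): it rewrites the equation in a pole-free form and bisects each bracket.

```python
import math

def beta_roots(lg, count=5, tol=1e-12):
    """First `count` positive roots of (beta^2 - Lg^2) tan(beta) = 2 Lg beta,
    rewritten as g(beta) = (beta^2 - Lg^2) sin(beta) - 2 Lg beta cos(beta) = 0
    to avoid the poles of tan.  For small heat loss (Lg -> 0) the roots
    approach n*pi, and g changes sign on (n*pi, n*pi + pi/2), so plain
    bisection suffices."""
    def g(b):
        return (b * b - lg * lg) * math.sin(b) - 2.0 * lg * b * math.cos(b)
    roots = []
    for n in range(1, count + 1):
        lo, hi = n * math.pi, n * math.pi + 0.5 * math.pi
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if g(lo) * g(mid) <= 0.0:
                hi = mid
            else:
                lo = mid
        roots.append(0.5 * (lo + hi))
    return roots
```

For a small heat loss parameter such as L_g = 0.1, the first root lies just above π, as expected from the L_g → 0 limit.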
Other original methods (Balageas, 1982; Vozar et al., 1991a,b) are based on the assumption that the experimental temperature rise is less perturbed by heat losses at times closer to the origin (i.e., time of the flash). The thermal diffusivity is obtained by extrapolating the time evolution of calculated values of an apparent diffusivity to zero time. Nonlinear least-squares fitting procedures have been developed (Takahashi et al., 1988; Gembarovic et al., 1990) in which all experimental points of the experimental temperature rise can be used for the determination of the thermal diffusivity. The methods are particularly useful in the case of noisy data, which are otherwise close to an ideal model. A data reduction method for heat loss correction is described by Beck and Dinwiddie (1997). A parameter estimation technique was used to calculate thermal diffusivity in the case when heat loss parameters from the front and rear faces of the sample are different. Finite-pulse-time effects and radiative heat losses only occur in selective cases; i.e., finite-pulse-time effects occur with thin samples of high-diffusivity materials and radiative heat losses occur at high temperatures with thicker samples. In contrast, nonuniform heating can occur during any flash diffusivity experiment. However, very little has been published on the effects of nonuniform heating. Beedham and Dalrymple (1970), Mackay and Schriempf (1976), and Taylor (1975) have described the results for certain nonuniformities. These results show that when the heating is uniform over the central portion of the sample, reasonably good results can be obtained. However, a continuous nonuniformity over the central portion can lead to errors of at least ten times the usual error. Nonuniform heating can be corrected by properly designing the experiment. 
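The center-of-gravity correction of Azumi and Takahashi (1981), mentioned above, is simple enough to sketch. The helper names and the discretized pulse data below are illustrative assumptions, not from the text:

```python
def pulse_centroid(times, power):
    """Effective irradiation time: the center of gravity of the measured
    laser pulse shape (the Azumi-Takahashi time origin)."""
    return sum(t * p for t, p in zip(times, power)) / sum(power)

def shift_time_origin(times, t_g):
    """Re-zero the temperature-rise time base at the pulse centroid before
    applying any ideal-model data reduction."""
    return [t - t_g for t in times]
```

For a symmetric triangular pulse the centroid is simply its midpoint; for the skewed pulses typical of real flash tubes it shifts toward the tail, which is exactly the finite-pulse-time correction this method supplies.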
If the laser beam is nonhomogeneous in cross-section, then an optical system that homogenizes the beam can be used, the sample surface can be covered with an absorptive layer that will homogenize the beam, or a thicker sample that is less prone to this effect can be used. If the sample is not completely opaque to the laser beam, then subsurface heating can distort the experimental temperature rise. The presence of a spike, and a shift of the baseline temperature to a new, higher level after the laser flash, indicates that the beam is completely penetrating the sample. It is more difficult to detect the case where the laser beam is absorbed in a surface layer of finite thickness. Eliminating this effect by mathematical means is not recommended; covering the sample surface with one or more protective layers can eliminate it. Data reduction procedures for layered structures are based on two- or three-layer models (see, e.g., Taylor et al., 1978). Thicknesses, densities, and specific heats of all layers have to be known, along with the thermal
diffusivities of all but the measured layer. Parameter estimation techniques or other nonlinear fitting procedures are used to calculate the desired value of the thermal diffusivity of the measured layer. If the known layers are relatively thin and highly conductive and the contact thermal resistance is low, then the results of the two- or three-layer thermal diffusivity calculation are the same as for a homogeneous (one-layer) sample whose thickness is the sum of all the layers. Application of the laser flash method to very thin, highly conductive multilayer structures remains an unsolved problem and a significant challenge to experimenters.
SAMPLE PREPARATION
Preparing opaque specimens for flash diffusivity measurement is generally simple. The diameter of the sample has to conform to the size of the most homogeneous part of the laser energy pulse, and its thickness to the permissible ratio between the laser pulse duration and the characteristic time of heat passage through the specimen. Problems might arise in satisfying the requirement of plane-parallelism of the specimen's flat sides. If the sample material is magnetic, or if its length is sufficient for it to be held during machining and the material is readily machinable, no serious problem exists. However, if the sample has to be made thin or very thin, or the material is hard and brittle or difficult to fix to a substrate, considerable ingenuity on the part of the sample preparer will be necessary. Coping with transparency to thermal radiation or to the laser pulse may also be difficult. Often this can be solved by coating the front face or both faces of the sample with a thin metallic or graphite layer. This overlayer has to absorb the laser pulse energy within its finite thickness and convert it into heat. Layers of refractory metals are thinner and longer lasting, but their high reflectivity allows only a small portion of the pulse to be absorbed. Graphite is much better in this respect, but laser pulses, particularly those with higher energies, tend to evaporate the layer. Attaching the coating to the sample may be a real challenge, particularly if the sample surface is smooth and slippery. The lifetime of a layer, in terms of the number of pulses it survives, is shorter as temperature increases. As machining of metallic samples always involves mechanical deformation of the sample material, it is advisable to relieve strains (which may affect the measured diffusivity values) by methods well known to metallurgists.
SPECIMEN MODIFICATION
Experimenting with the flash diffusivity technique may affect the specimen in a few ways. The most obvious include modifications due to exposure to elevated temperature in vacuum, contamination from the coating layer material, and damage caused by the laser pulse due to structural changes from fast heating or cooling. Often, repeated cycling of experiments will not affect the outcome of measurements, but this will depend on the material and the maximum temperature reached. If
the specimen structure is in a state that is susceptible to thermal treatment, temperature cycling will definitely involve specimen modification. If the specimen material is an alloy whose alloying components preferentially evaporate at temperatures much below its melting point, experiments above this temperature will lead to undesirable modifications of specimen composition. A vacuum environment will definitely stimulate this process. The extent of the damage, and how much it affects the flash diffusivity results, will depend on the specimen diameter-to-thickness ratio, the maximum operating temperature, and the length of time that the specimen is exposed to undesirable temperatures, as well as the relative vapor pressures of the constituents. For flash diffusivity measurements, specimens of materials that are partly or totally transparent or translucent must be coated. If the sample is porous, colloidal graphite from the coating may penetrate the specimen structure, evidenced by a change of its color, and will most likely influence its overall thermal properties. Such a change of the sample's thermal and optical properties, however, does not necessarily preclude the measurement of thermal diffusivity by this technique. Within the small specimen temperature excursion caused by the laser pulse, a small amount of graphite within grain interstices will not affect the basic mechanisms governing energy transport. Although the powerful laser energy pulse might cause damage to the specimen surface, in testing common materials like metals, graphites, and ceramics with energy densities of 10 J/cm², adverse effects of specimen modification have not been observed. More danger lies in defining the depth of the energy pulse absorption in the case of rough surfaces typical of, e.g., composite materials, as this affects the basic geometry parameter L, which enters as a squared term in Equation 3.
PROBLEMS
Some of the most common problems are discussed below:

1. Optimum half-rise time is 40 to 100 ms. Half times are controlled by sample thickness and diffusivity value. Longer times have larger heat loss corrections. Shorter times have finite-pulse-time effects and greater uncertainty in baseline values.

2. Very thin samples have large uncertainty in sample thickness (which enters as a squared term), larger surface damage effects (see Sample Preparation), and possibly too large a temperature rise (nonlinear IR detector response). Also, thin samples may not be sufficiently homogeneous.

3. The rear-face temperature rise may be too large (resulting in nonlinear IR response and laser damage) or too small (resulting in a noisy signal). Sample emissivity and laser power should be controlled to change the energy absorbed and the linearity of the IR detector response. Heat-resistant paints can be used effectively to increase sample emissivity in the case of a small signal.
4. A nonuniform laser beam can be a major problem. The uniformity should be checked using laser footprint paper with a partially absorbing solution to reduce the laser power reaching the paper. Copper sulfate solution is very good for this purpose (for the Nd:YAG primary frequency). If the beam is nonuniform, adjusting the dielectric mirrors or using optics may improve homogeneity.

5. Scattered radiation can cause spurious signals and even temporarily saturate the IR detector, making baseline determinations difficult.

6. Applying a coating to translucent samples may introduce a thermal contact resistance, which can lower the measured thermal diffusivity of the sample material. Coatings applied to high-diffusivity samples are especially prone to this problem.

7. Diffusivity values for porous materials can be strongly affected by the surrounding gas and its moisture content. The diffusivity values for gases are very large, even though their conductivity values are quite small.

The laser flash technique is an ASTM (1993) standard (E1461-92), and step-by-step procedures are given there. The standard is readily obtained, so those procedures are not duplicated here. Also included in ASTM E1461-92 is a discussion of the measurement errors and the "nonmeasurement" errors. The latter arise from failure to satisfy the initial and boundary conditions assumed in the data analysis. In general, the nonmeasurement errors cause greater uncertainty in the results than the measurement errors, which simply involve length and time measurements; both of these can generally be made to a high degree of accuracy. Heat losses and finite-pulse-time effects have been studied intensively, and procedures for correcting them are generally adequate. Nonuniform heating is more insidious, since there is an infinite variety of possible nonuniformities, and the nonuniformity can change with time as room temperature varies and the operating characteristics of the laser change.
Testing with an acceptable standard such as the fine-grain, isotropic graphite AXM-5Q (Hust, 1984; ASTM, 1993) is useful, but the errors in actual tests may differ significantly due to, e.g., sample translucency, different emissivity, and different rise times.
LITERATURE CITED

American Society for Testing and Materials (ASTM). 1993. Standard Test Method for Thermal Diffusivity of Solids by the Flash Method, E1461-92. ASTM, Philadelphia, PA.
Azumi, T. and Takahashi, Y. 1981. Rev. Sci. Instrum. 52:1411-1413.
Balageas, D. L. 1982. Rev. Phys. Appl. 17:227-237.
Beck, J. V. and Arnold, K. J. 1977. Parameter Estimation in Engineering and Science. Wiley, New York.
Beck, J. V. and Dinwiddie, R. B. 1997. Parameter estimation method for flash thermal diffusivity with two different heat transfer coefficients. In Thermal Conductivity 23 (K. E. Wilkes, R. B. Dinwiddie, and R. S. Graves, eds.). pp. 107-118. Technomic, Lancaster.
Beedham, K. and Dalrymple, I. P. 1970. The measurement of thermal diffusivity by the flash method. An investigation into errors arising from the boundary conditions. Rev. Int. Hautes Temp. Refract. 7:278-283.
Cape, J. A. and Lehman, G. W. 1963. Temperature and finite pulse-time effects in the flash method for measuring thermal diffusivity. J. Appl. Phys. 34:1909.
Carslaw, H. S. and Jaeger, J. C. 1959. Conduction of Heat in Solids, 2nd ed. Oxford University Press, Oxford.
Clark III, L. M. and Taylor, R. E. 1975. Radiation loss in the flash method for thermal diffusivity. J. Appl. Phys. 46:714.
Cowan, R. D. 1963. Pulse method of measuring thermal diffusivity at high temperatures. J. Appl. Phys. 34:926.
Degiovanni, A. 1987. Int. J. Heat Mass Transfer 30:2199-2200.
Degiovanni, A. and Laurent, M. 1986. Rev. Phys. Appl. 21:229-237.
Gembarovic, J. and Taylor, R. E. 1994a. A new technique for data reduction in the laser flash method for the measurement of thermal diffusivity. High Temp. High Pressures 26:59-65.
Gembarovic, J. and Taylor, R. E. 1994b. A new data reduction in the laser flash method for the measurement of thermal diffusivity. Rev. Sci. Instrum. 65:3535-3539.
Gembarovic, J., Vozar, L., and Majernik, V. 1990. Using the least square method for data reduction in the flash method. Int. J. Heat Mass Transfer 33:1563-1565.
Heckman, R. C. 1976. Error analysis of the flash thermal diffusivity technique. In Proceedings of the Fourteenth International Thermal Conductivity Conference, Vol. 14 (P. G. Klemens and T. K. Chu, eds.). Plenum Press, New York.
Hust, J. G. 1984. Standard reference materials: A fine-grained, isotropic graphite for use as NBS thermophysical property RM's from 5 to 2500 K. NBS Special Publication 260-89.
Koski, J. A. 1982. Improved data reduction methods for laser pulse diffusivity determination with the use of microcomputers. In Proceedings of the Eighth Symposium on Thermophysical Properties, Vol. II (A. Cezairliyan, ed.). pp. 94-103. American Society of Mechanical Engineers, New York.
Larson, K. B. and Koyama, K. 1968. Correction for finite pulse-time effects in very thin samples using the flash method of measuring thermal diffusivity. J. Appl. Phys. 38:465.
Mackay, J. A. and Schriempf, J. T. 1976. Corrections for nonuniform surface heating errors in flash-method thermal diffusivity measurements. J. Appl. Phys. 47:1668-1671.
Maglic, K. D. and Taylor, R. E. 1984. The apparatus for thermal diffusivity measurement by the laser pulse method. In Compendium of Thermophysical Property Measurement Methods, Vol. 2: Recommended Measurement Techniques and Practices (K. Maglic, A. Cezairliyan, and V. E. Peletsky, eds.). pp. 281-314. Plenum Press, New York.
Maglic, K. D., Cezairliyan, A., and Peletsky, V. E. (eds.). 1984. Compendium of Thermophysical Property Measurement Methods, Vol. 1: Survey of Measurement Principles. Plenum Press, New York.
Parker, W. J., Jenkins, R. J., Buttler, C. P., and Abbott, G. L. 1961. Flash method of determining thermal diffusivity, heat capacity and thermal conductivity. J. Appl. Phys. 32:1679.
Takahashi, Y., Yamamoto, K., Ohsato, T., and Terai, T. 1988. Usefulness of logarithmic method in laser-flash technique for thermal diffusivity measurement. In Proceedings of the Ninth Japanese Symposium on Thermophysical Properties (N. Araki, ed.). pp. 175-178. Japanese Thermophysical Society, Sapporo.
Taylor, R. E. 1975. Critical evaluation of flash method for measuring thermal diffusivity. Rev. Int. Hautes Temp. Refract. 12:141-145.
Taylor, R. E. and Clark, L. M., III. 1974. Finite pulse time effects in flash diffusivity method. High Temp. High Pressures 6:65.
Taylor, R. E. and Maglic, K. D. 1984. Pulse method for thermal diffusivity measurement. In Compendium of Thermophysical Property Measurement Methods, Vol. 1: Survey of Measurement Techniques (K. Maglic, A. Cezairliyan, and V. E. Peletsky, eds.). pp. 305-334. Plenum Press, New York.
Taylor, R. E., Lee, T. Y. R., and Donaldson, A. B. 1978. Thermal diffusivity of layered composites. In Thermal Conductivity 15 (V. V. Mirkovich, ed.). pp. 135-148. Plenum Press, New York.
Vozar, L., Gembarovic, J., and Majernik, V. 1991a. New method for data reduction in flash method. Int. J. Heat Mass Transfer 34:1316-1318.
Vozar, L., Gembarovic, J., and Majernik, V. 1991b. An application of data reduction procedures in the flash method. High Temp. High Pressures 23:397-402.
Watt, D. A. 1966. Theory of thermal diffusivity by pulse technique. Br. J. Appl. Phys. 17:231-240.

KEY REFERENCES

Taylor and Maglic, 1984. See above. Survey of thermal diffusivity measurement techniques.
Maglic and Taylor, 1992. See above. Specific description of the laser flash technique.

INTERNET RESOURCES

http://www.netlib.org
Collection of mathematical software, papers, and databases.
http://www.netlib.org/odrpack/
ODRPACK 2.01: Software package for weighted orthogonal distance regression (a nonlinear fitting procedure used to calculate optimal values of the unknown parameters).

APPENDIX

The heat balance equation for transient conditions may be written as

\nabla \cdot (\lambda \nabla T) + (\text{internal sources and sinks}) = C_p \rho \frac{\partial T}{\partial t}    (7)

where \lambda is the thermal conductivity, C_p is the specific heat at constant pressure, and \rho is the density. If there are no internal sources and sinks,

\nabla \cdot (\lambda \nabla T) = C_p \rho \frac{\partial T}{\partial t}    (8)

For homogeneous materials whose thermal conductivity is nearly independent of temperature, we may treat \lambda as a constant. Then \nabla \cdot (\lambda \nabla T) becomes \lambda \nabla^2 T, and Equation 8 can be written as

\lambda \nabla^2 T = C_p \rho \frac{\partial T}{\partial t}    (9)
Table 2. Values of K_x in Equation 19
or
\nabla^2 T = \frac{C_p \rho}{\lambda} \frac{\partial T}{\partial t} = \frac{1}{\alpha} \frac{\partial T}{\partial t}    (10)
where \alpha = \lambda / C_p \rho is the thermal diffusivity. For one-dimensional heat flow,

\alpha \frac{\partial^2 T}{\partial x^2} = \frac{\partial T}{\partial t}    (11)
The assumed adiabatic conditions at the faces of a plate of thickness L result in the boundary conditions

\frac{\partial T(0,t)}{\partial x} = \frac{\partial T(L,t)}{\partial x} = 0, \qquad t > 0    (12)
The solution of Equation 11 giving the temperature at time t and position x within the plate is then

T(x,t) = \frac{1}{L} \int_0^L f(x')\, dx' + \frac{2}{L} \sum_{n=1}^{\infty} \exp\left(-\frac{n^2 \pi^2 \alpha t}{L^2}\right) \cos\frac{n \pi x}{L} \int_0^L f(x') \cos\frac{n \pi x'}{L}\, dx'    (13)

where f(x) is the initial temperature distribution; for a flash pulse of energy Q per unit area absorbed uniformly within a thin surface layer of depth g, f(x) = Q/(\rho C_p g) for 0 \le x \le g and f(x) = 0 for g < x \le L.
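For the flash boundary-value problem above, the rear-face temperature follows by substituting Parker's initial condition, f(x) = Q/(ρC_p g) for 0 ≤ x ≤ g and zero elsewhere, into the series solution. This is a standard result (Parker et al., 1961, cited above), reproduced here for reference rather than transcribed from the original appendix:

```latex
T(L,t) = \frac{Q}{\rho C_p L}\left[\,1 + 2\sum_{n=1}^{\infty}(-1)^n\,
\frac{\sin(n\pi g/L)}{n\pi g/L}\,
\exp\!\left(-\frac{n^2\pi^2\alpha t}{L^2}\right)\right]
```

In the thin-absorbing-layer limit g → 0, the factor sin(nπg/L)/(nπg/L) → 1, and with t' = t/t_{0.5} and α = 0.1388 L²/t_{0.5} the bracketed series reduces to the dimensionless rise V(t') of Equation 4.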
Locating the probes closer than four probe spacings from the wafer edge can also result in measurement error. Correction factors to account for edge proximity can be found in Schroder (1990). The Van der Pauw method can determine the resistivity of small, arbitrarily shaped layers and generally requires less surface area than the four-point probe method (Van der Pauw, 1958). It is often used in integrated circuit processing. The method considers four small contacts placed around the periphery of a homogeneous sample of uniform thickness t, as indicated in Figure 6. In this figure, a resistance R_{ab,cd} is determined by driving a current from point a to point b and measuring the voltage from point c to point d, or
R_{ab,cd} = \frac{|V_c - V_d|}{|I_{ab}|}    (9)

Figure 5. In-line four-point probe measurement of a conductive film of thickness t uses a known current source, a high-impedance voltmeter, and spring-loaded sharp probes.
Figure 6. Van der Pauw measurement of an arbitrarily shaped sample uses a known current and a high-impedance voltmeter.
Figure 8. Common Van der Pauw structures: (A) square (contacts placed on edges, or placed on each corner), (B) ‘‘Greek Cross,’’ and (C) clover leaf.
Figure 7. Van der Pauw function, F, plotted versus resistance ratio.
Using a conformal mapping approach, Van der Pauw shows that

\exp\left(-\frac{\pi R_{ab,cd}\, t}{\rho}\right) + \exp\left(-\frac{\pi R_{bc,da}\, t}{\rho}\right) = 1    (10)
Solving for resistivity gives

\rho = \frac{\pi t}{\ln 2}\, \frac{R_{ab,cd} + R_{bc,da}}{2}\, F    (11)
where the value of F is given by

\frac{R_{ab,cd} - R_{bc,da}}{R_{ab,cd} + R_{bc,da}} = \frac{F}{\ln 2}\, \mathrm{arccosh}\left[\frac{\exp(\ln 2 / F)}{2}\right]    (12)
This equation is used to plot F as a function of the resistance ratio R_{ab,cd}/R_{bc,da} in Figure 7. Equation 11 is simplified for fourfold-symmetric shapes. For instance, rotation of the structures in Figure 8 by 90° should result in the same measured resistance for a uniform-thickness, homogeneous film with identical contacts. In such a case, F = 1 and

\rho = \frac{\pi t}{\ln 2}\, R_{ab,cd} = 4.532\, t\, R_{ab,cd}    (13)
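The full Van der Pauw reduction (Equations 11 and 12) can be sketched as follows. The implicit equation for F is solved by bisection; the function names and tolerances are illustrative assumptions, not from the text:

```python
import math

LN2 = math.log(2.0)

def vdp_correction_factor(r_ab_cd, r_bc_da, tol=1e-10):
    """Solve Equation 12 for F (0 < F <= 1) by bisection."""
    ra, rb = max(r_ab_cd, r_bc_da), min(r_ab_cd, r_bc_da)
    ratio = (ra - rb) / (ra + rb)
    if ratio < 1e-12:
        return 1.0  # symmetric case: F = 1 exactly
    def h(f):
        # left side minus right side of Equation 12, as a function of F
        return (f / LN2) * math.acosh(math.exp(LN2 / f) / 2.0) - ratio
    lo, hi = 0.05, 1.0  # h(lo) > 0 and h(hi) <= 0 for 0 <= ratio < 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def vdp_resistivity(r_ab_cd, r_bc_da, t):
    """Equation 11: resistivity of a uniform layer of thickness t."""
    f = vdp_correction_factor(r_ab_cd, r_bc_da)
    return (math.pi * t / LN2) * 0.5 * (r_ab_cd + r_bc_da) * f
```

For equal resistances this reduces to Equation 13 (ρ = 4.532 t R); for a 2:1 resistance ratio, F comes out near 0.96, consistent with the curve of Figure 7.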
Practical Aspects of the Method
Like the four-point method used for bulk resistivity measurements, separating the current source from the high-impedance voltage meter avoids errors associated with contact resistance. Also, when considering semiconductor measurements, sufficient separation between the current and voltage probes is required so that minority carriers injected near the current probes recombine before their presence can be felt at the voltage probes. The probes themselves have a tightly controlled spacing, typically 1 mm ± 0.01 mm, and the probe tips are individually spring-loaded with a force typically between 1 and 2 N to minimize damage to the film surface during probing. Hard tungsten probe tips are common, but osmium tips are also available, which can make good contact when heated on a semiconductor. A typical measurement range for such a system is 0.001 Ω-cm to 1000 Ω-cm. In Van der Pauw measurements, it is common to calculate resistivity from two sets of measurements (R_{ab,cd} and R_{bc,da}). For uniform samples with good contacts, the same results should be measured. The square pattern shown in Figure 8A, or a circular pattern with four points equidistant about the periphery, can be made very small for integrated-circuit film measurements and is often used in practice. However, because of probe alignment difficulties and problems making ideal probe-to-sample contacts, the "Greek Cross" of Figure 8B or the clover leaf structure of Figure 8C is often used. These shapes isolate the contacts and reduce the error encountered for finite-sized probe tips that are not located all the way out to the periphery of the sample.

Method Automation
As with bulk measurements, four-point probes are routinely connected to current sources and voltage meters for automated measurements. Four-point probing systems typically contain an adjustable constant current source used in conjunction with a high-impedance voltmeter. These systems automatically test and average both forward and reverse signals to reduce errors from thermal effects and rectifying contacts. More elaborate Van der Pauw measurement apparatus will make multiple measurements (R_{ab,cd} and R_{bc,da}) and average the results.

Problems
The measured surface resistance depends on the doping profile, which affects both carrier concentration and carrier mobility. The conductivity of such a doped semiconductor is therefore a function of depth into the semiconductor. Measuring the surface resistance of a doped semiconductor assumes that the substrate below the junction depth is much more resistive than the layer to be measured. If the substrate is appreciably conductive, then a surface resistance measurement can still be made by forming a reverse-biased diode with the substrate.
NON-CONTACT METHODS
Principles of the Method
Contact-free test methods are useful when electrical contacts are difficult to make or when the sample must not be damaged. Also, a contactless measurement can be much quicker, since the time it takes to make contacts is eliminated. Eddy current approaches and relaxation techniques are often used for non-contact measurement. While such methods are not as accurate as the contact approaches discussed above, errors below 5% are achievable. The most widely used approach for contactless measurement of conductivity is the eddy current method (ASTM E1004, 1991). In its most straightforward realization, an alternating-current-carrying coil is placed close to a conductive sample. The magnetic fields generated by the coil induce circulatory currents, called eddy currents, in the sample, which act to oppose the applied magnetic field. Small impedance changes are then measured in the coil as it is loaded by the conductive sample. The measured impedance is a function of sample conductivity and also of the sample's mechanical structure near the surface. In fact, eddy current testing is routinely used to inspect the mechanical condition of conductive surfaces (Cartz, 1995). Figure 9 illustrates the general measurement approach. The coil is driven by a sinusoidal current source, and measurement with a high-impedance voltmeter determines the coil impedance, Z = V/I. The current through the coil induces eddy currents in the conductive sample, which act to oppose the change in magnetic flux. The induced eddy currents are in a plane parallel to the plane containing the coil loops and are a function of sample conductivity, magnetic permeability, and thickness, along with driving coil properties such as the number of turns and distance from the sample. The coil itself may also be wrapped around or contained within a ferrite block to enhance the magnetic field.
An equivalent circuit for the simple test setup is shown in Figure 10, where the mutual inductance term M accounts for field coupling into the sample, which appears as a series R'-L' circuit. It can be shown (Cartz, 1995) that the impedance looking into the circuit is

Z = R_T + jX_T = R + \frac{\omega^2 M^2 R'}{R'^2 + \omega^2 L'^2} + j\left(\omega L - \frac{\omega^3 M^2 L'}{R'^2 + \omega^2 L'^2}\right)    (14)
Figure 10. Equivalent circuit for the single coil eddy-current measurement approach.
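The reflected-impedance expression (Equation 14) is easy to evaluate numerically. This sketch is illustrative, with assumed component values rather than values from the text:

```python
def coil_impedance(omega, R, L, Rp, Lp, M):
    """Equation 14: driving-coil impedance loaded by eddy currents, with the
    sample modeled as a series R'-L' loop coupled through mutual inductance M.
    omega is the angular drive frequency; Rp and Lp are the sample-loop
    (primed) resistance and inductance."""
    d = Rp * Rp + (omega * Lp) ** 2
    rt = R + (omega * M) ** 2 * Rp / d          # real part R_T
    xt = omega * L - omega * (omega * M) ** 2 * Lp / d  # imaginary part X_T
    return complex(rt, xt)
```

With M = 0 the expression collapses to the bare coil, Z = R + jωL; any coupling to a conductive sample raises R_T and lowers X_T, which is exactly the trajectory traced on the impedance-plane chart of Figure 11.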
where R_T and X_T represent the total real and imaginary parts of the impedance, respectively, and j is the imaginary unit (the square root of -1). A plot of X_T versus R_T for a series of samples traces a conductivity curve as shown in Figure 11. This is a comparison method: samples of known conductivity are plotted, and the test sample's conductivity is then inferred from its location on the chart. Another non-contact method is the relaxation approach, in which a voltage step applied to a resistive-reactive circuit results in an exponential response. For instance, when the switch in Figure 12 is opened, the capacitor discharges through the resistor such that

V_R(t) = V_s \exp(-t / RC)    (15)
When time t equals one "RC time constant," the voltage has dropped to e^{-1} of its initial value. Most often, this approach is used with a known value of resistance in order to measure the reactive component, but knowing the reactive component instead allows computation of the resistance. This is illustrated in Figure 13, where the voltage V_R from Figure 12 is initially charged to 1 V; at time t = 0 the switch is thrown, and this voltage discharges through a 1-μF capacitor. The relaxation for three different resistance values is shown. The relaxation approach has been used in conjunction with eddy currents, where the sample is excited and the fields are measured as they decay (Kos and Fickett, 1994).

Practical Aspects of the Method
The eddy current approach requires two or more calibration standards covering the range of expected sample conductivity. Test conditions, such as driving current and frequency, must be the same for both calibration and test samples. The sample thickness and the distance between the coil and the sample surface (termed the "lift-off") must be carefully controlled. Finally, the coil should not be placed within two coil diameters of a sample discontinuity (such as a hole or edge). In the simple method presented, the driving coil is also used for measurement. Improved eddy-current probes separate the driving and measuring coils, using them to sandwich the test sample. Since very little current is in the measurement coil, eddy currents induced in the sample are essentially all from the driving coil. The measurement coil sees fields from both the driving coil and from the eddy currents in the sample. For best sensitivity, the driving coil fields can be removed from the measurement using a lock-in amplifier circuit (Crowley and Rabson, 1976). Further improvement in the measurement circuit uses an additional set of coils for compensation. Such an approach has been used with miniature radio frequency (RF) coils to provide resistivity mapping with a 4-mm spatial resolution (Chen, 1989). An interesting variation on this method uses a driving coil and a pair of resonant coils to monitor in situ aluminum thin films deposited by chemical vapor deposition (Ermakov and Hinch, 1997). The magnitude of the eddy currents attenuates by e^{-y/\delta} as they penetrate a depth y into the sample, reaching a value of e^{-1} of the surface value at one skin depth, \delta, given by

\delta = \frac{1}{\sqrt{\pi f \mu \sigma}}    (16)

Figure 9. Single coil eddy-current measurement circuit. The impedance, Z, looking into the coil is changed by eddy currents induced in the sample.

Figure 11. Impedance plane chart showing a typical conductivity curve.

Figure 12. RC network used in relaxation method.
where f is the frequency and μ is the sample magnetic permeability. For a single-coil system, the effect of thickness is negligible for samples that are at least 2.6 skin depths thick. Note that for measuring surface defects, or the resistivity near the surface, higher frequencies may be employed. For a system where the measuring coil is on the other side of the sample from the driving coil, samples thinner than a skin depth are required. The eddy current method is also used to measure the conductivity of cylindrical samples. This is done by comparing the impedance of the coil with and without a cylindrical sample inserted (Zimmerman, 1961). Other approaches have relied on the mutual inductance measured between a driving coil and a measurement coil wrapped around the cylindrical sample (Rosenthal and Maxfield, 1975).

In the relaxation approach, the step function excites all possible resonances, with the one showing the longest decay time being proportional to the sample resistivity. It may be necessary to determine this time constant with a curve-fitting routine after sufficient time has passed for the other resonances to subside. Also, the time constant for the measurement coils must be much shorter than the sample time constant. This approach is most often used for low-resistivity samples (metals with ρ < 10⁻⁴ ohm-micrometer), since otherwise the time constant is excessive.

A relaxation approach can also find semiconductor resistivity, provided the electrical permittivity is known. Here, a semiconductor wafer is inserted between conductive plates to form a parallel resistive-capacitive circuit. A step voltage is applied to the capacitors, and the current is measured versus time to arrive at an RC time constant equal to the product of the sample resistivity ρ and electrical permittivity ε, called the "space charge relaxation time." Details of such a procedure used to map the resistivity profile across a gallium arsenide wafer are provided in Stibal et al. (1991).
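The two working formulas of this section, the skin depth of Equation 16 and the space charge relaxation time τ = ρε, are straightforward to evaluate numerically. The following is a minimal sketch; the function names and the copper example are illustrative, not from the original:

```python
import math

MU_0 = 4.0 * math.pi * 1e-7      # vacuum permeability, H/m
EPS_0 = 8.854187817e-12          # vacuum permittivity, F/m

def skin_depth(f_hz, sigma_s_per_m, mu_r=1.0):
    """Equation 16: delta = 1/sqrt(pi * f * mu * sigma), in meters."""
    return 1.0 / math.sqrt(math.pi * f_hz * mu_r * MU_0 * sigma_s_per_m)

def space_charge_relaxation_time(rho_ohm_m, eps_r):
    """RC time constant of a wafer between plates: tau = rho * epsilon."""
    return rho_ohm_m * eps_r * EPS_0

# Copper (sigma ~ 5.8e7 S/m) at 100 kHz: delta is about 0.21 mm, so a
# single-coil measurement would want a sample at least 2.6 * 0.21 mm thick.
delta = skin_depth(1e5, 5.8e7)
```

The same sketch gives the space charge relaxation time of a semi-insulating wafer directly from ρ and ε, which sets the time scale of the measurement described by Stibal et al. (1991).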
MICROWAVE TECHNIQUES

Principles of the Method
Figure 13. Response of the RC network with a 1-μF capacitor that has been initially charged to 1 V. The relaxation is shown for three values of resistance. The line at 0.368 indicates when the voltage has decayed to e⁻¹ of its initial value.
A number of approaches exist to determine dielectric sample conductivity at high frequencies. The primary approaches use open-ended coaxial probes, coaxial or waveguide transmission lines, free-space radiation, and cavity resonators. Scattering parameter measurements are made using vector network analyzers, which are available at considerable expense, to perform measurements up to 10¹¹ Hz. These scattering parameters can be related to the sample's electrical permittivity and loss tangent, from which the conductivity can be extracted. From Maxwell's equations for time-harmonic fields,

∇ × H = (σ + jωε)E   (17)
where E and H are the electric and magnetic fields, respectively, and ω is the angular frequency. Here we consider the conductivity σ and the permittivity ε within the dielectric. It is convenient to express Equation 17 as

∇ × H = jωε_c E   (18)

where ε_c is a complex permittivity:

ε_c = ε − j(σ/ω)   (19)

Plotting the imaginary part of Equation 19 versus the real part reveals where the term loss tangent, tan δ, arises:

tan δ = σ/(ωε)   (20)
Here δ is the angle made with the real axis. Note that this δ has nothing to do with skin depth, which inconveniently shares the same symbol. The term loss tangent is typically applied when discussing dielectric materials, for which a small value is desirable. Extracting the complex permittivity from scattering parameter data is far from trivial and is almost always performed using software within the network analyzer or in post-processing. In the cavity resonator, waveguide transmission line, and free-space approaches, it is also possible to extract the material's complex permeability from the measured scattering parameters (Application Note 1217; see Literature Cited). This can be useful for the study of magnetic materials.

Practical Aspects of the Method

The four general approaches are briefly described below, along with suitable references. Of particular interest are the publications of Bussey (1967) and Baker-Jarvis et al. (1992).

Coaxial probes. In this approach, the open end of a coaxial probe is placed against the flat face of a solid sample as shown in Figure 14A. The probe may also be immersed in a liquid sample. The fringing fields between the center and the outer conductors are affected by the sample material. Complex permittivity is extracted from measurements of the reflection scattering parameter S11. This approach is typically used to measure complex permittivity over a frequency range from 2 × 10⁸ Hz up to 2 × 10¹⁰ Hz, and has the advantage that no special sample preparation is required. In addition, stainless steel coaxial probes have performed at temperatures up to 1000°C (Gershon et al., 1999).

Transmission lines. In the various transmission line methods, a material sample is snugly inserted into a section of coaxial transmission line, rectangular waveguide, or circular waveguide (Srivastava and Jain, 1971; Ligthart, 1983; Wolfson and Wentworth, 2000).
More than one measurement is required for extraction of complex permittivity data from one-port measurements. This can be accomplished by using more than one length of sample or
Figure 14. Microwave measurement approaches: (A) open-ended coaxial probe in contact with sample, (B) sample inserted into coaxial transmission line, (C) free-space approach where sample is placed between horn antennas, and (D) sample placed within a cavity resonator.
by terminating the guide containing the sample in more than one load. The measured complex reflection coefficients can be manipulated to extract complex permittivity. Two-port measurements can be performed on a single sample where, in addition to reflection, the transmission properties are measured. Precise machining of the samples to fit inside the transmission line is critical, which can be somewhat of a disadvantage for the coaxial approach (shown in Figure 14B), where an annular sample is required. An advantage of the coaxial approach is that measurements can be made over a much broader frequency range than is possible with the waveguide procedure. Common problems for the transmission line techniques include air gaps, resonances at sample lengths corresponding to half wavelengths, and excitation of higher-order propagating modes.

Free-space radiation. Microwaves are passed through material samples using transmitting and receiving antennas as shown in Figure 14C. In addition to measurements through the sample, the transmitting and receiving antennas may be placed at angles on the same side of the sample and a reflection measurement taken. The samples used in free-space techniques must be fairly large, with flat surfaces. Measurement approaches and data processing proceed as with the transmission line approaches. The free-space approach is a noncontacting method that can be especially useful for making measurements at high temperature or under harsh environmental conditions. It has been employed in measurements ranging from 2 × 10⁹ Hz up to 1.1 × 10¹¹ Hz (Otsuka et al., 1999).

Cavity resonators. Here, a sample of precisely known geometry is placed within a microwave cavity, and changes in the resonant frequency and the resonator Q are measured and processed to yield the complex permittivity (Kraszewski and Nelson, 1992). As shown in Figure 14D,
a coaxial probe inserted into the center of one wall of the cavity is used to supply energy for resonance. If the sample dimensions are precisely known, and if calibration standards of the same dimensions are available, then this approach can yield very accurate results for both complex permittivity and complex permeability, although it tends to have limited accuracy for low-loss materials. It has been employed in measurements ranging from 5 × 10⁸ Hz up to 1.1 × 10¹¹ Hz.
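The conversions between conductivity, complex permittivity (Equation 19), and loss tangent (Equation 20) that underlie all four approaches can be sketched numerically as follows. The function names are illustrative, and the engineering e^(+jωt) sign convention of the text is assumed:

```python
def complex_permittivity(eps_real, sigma, omega):
    """Equation 19: eps_c = eps - j*sigma/omega."""
    return complex(eps_real, -sigma / omega)

def loss_tangent(eps_real, sigma, omega):
    """Equation 20: tan(delta) = sigma/(omega*eps)."""
    return sigma / (omega * eps_real)

def conductivity_from_loss_tangent(tan_d, eps_real, omega):
    """Invert Equation 20 to recover sigma from a measured loss tangent."""
    return tan_d * omega * eps_real
```

In practice the network analyzer software reports ε_c and tan δ; the last function shows how a conductivity estimate follows from them at a known angular frequency ω = 2πf.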
ACKNOWLEDGMENT

The author would like to thank Professor Peter A. Barnes for his helpful suggestions on this work.
LITERATURE CITED

Albers, J. and Berkowitz, H. L. 1985. An alternative approach to the calculation of four-probe resistances on nonuniform structures. J. Electrochem. Soc. 132:2453–2456.

Albert, M. P. and Combs, J. F. 1964. Correction factors for radial resistivity gradient evaluation of semiconductor slices. IEEE Trans. Electron. Dev. 11:148–151.

Application Note 1217-1. Basics of measuring the dielectric properties of materials. Hewlett-Packard literature number 5091-3300. Hewlett-Packard, Palo Alto, Calif.

ASTM B193-1987 (revised annually). Standard test method for resistivity of electrical conductor materials. In Annual Book of Nondestructive Testing. American Society for Testing and Materials, Philadelphia.

ASTM E1004-1991 (revised annually). Standard test method for electromagnetic (eddy-current) measurement of electrical conductivity. In Annual Book of Nondestructive Testing. American Society for Testing and Materials, Philadelphia.

Baker-Jarvis, J., Janezic, M. D., Grosvenor, J. H. Jr., and Geyer, R. G. 1992. Transmission/reflection and short-circuit line methods for measuring permittivity and permeability. NIST Technical Note 1355.

Bussey, H. E. 1967. Measurement of RF properties of materials, a survey. Proc. IEEE 56:1046–1053.

Cartz, L. 1995. Nondestructive Testing, pp. 173–188. American Society for Metals, Materials Park, Ohio.

Chen, M. C. 1989. Sensitive contactless eddy-current conductivity measurements on Si and HgCdTe. Rev. Sci. Instrum. 60:1116–1122.

Combs, J. F. and Albert, M. P. 1963. Diameter correction factors for the resistivity measurement of semiconductor slices. Semic. Prod./Solid State Technol. 6:26–27.

Coombs, C. F. 1995. Electronic Instrument Handbook, 2nd ed. McGraw-Hill, New York.

Crowley, J. D. and Rabson, T. A. 1976. Contactless method of measuring resistivity. Rev. Sci. Instrum. 47:712–715.

Ermakov, A. V. and Hinch, B. J. 1997. Application of a novel contactless conductivity sensor in chemical vapor deposition of aluminum films. Rev. Sci. Instrum. 68:1571–1574.

Gershon, D. L., Calame, J. P., Carmel, Y., Antonsen, T. M., and Hutcheon, R. M. 1999. Open-ended coaxial probe for high temperature and broad-band dielectric measurements. IEEE Trans. Microwave Theory Tech. 47:1640–1648.
KEY REFERENCES

Baker-Jarvis et al., 1992. See above.

An extremely thorough reference for coaxial and rectangular waveguide transmission line techniques for measuring electrical permittivity and magnetic permeability.

Cartz, 1995. See above.

Presents eddy-current testing from a material inspection point of view.

Coombs, 1995. See above.

This reference goes into considerable detail on bulk measurements and gives much practical information on measurement instrumentation.

Heaney, M. B. 1999. Electrical conductivity and resistivity. In The Measurement, Instrumentation, and Sensors Handbook, Vol. 43 (J. G. Webster, ed.) pp. 1–14. CRC Press.

Gives practical advice for making two-point, four-point, four-point probe, and van der Pauw measurements. It cites equipment for conducting measurements and discusses common experimental errors.
Schroder, 1990. See above.
This gives a thorough discussion of conductivity measurements as they are applied to semiconductors (four-point probing, Van der Pauw methods).
STUART M. WENTWORTH Auburn University Auburn, Alabama
HALL EFFECT IN SEMICONDUCTORS

INTRODUCTION

The Hall effect, which was discovered in 1879 (Hall, 1879), determines the concentration and type (negative or positive) of charge carriers in metals, semiconductors, or insulators. In general, the method is used in conjunction with a conductivity measurement to also determine the mobility (ease of movement) of the charge carriers. At low temperatures and high magnetic fields, quantum effects are sometimes evident in lower-dimensional structures; however, such effects will not be considered here. Also, this unit will concentrate on semiconductors, rather than metals or insulators, although the same theory generally applies (Gantmakker and Levinson, 1987).

Three strong advantages of Hall effect measurements are ease of instrumentation, ease of interpretation, and wide dynamic range. With respect to implementation, the only elements necessary, at the lowest level, are a current source, a voltmeter, and a modest-sized magnet. The carrier concentration can then be calculated within a typical accuracy of 20% without any other information about the material. In our laboratory, we have measured concentrations ranging from 10⁴ to 10²⁰ cm⁻³. Also, the type (n or p) can be unambiguously determined from the sign of the Hall voltage.

Competing techniques include capacitance-voltage (C-V) measurements to determine carrier concentration (see CAPACITANCE–VOLTAGE (C-V) CHARACTERIZATION OF SEMICONDUCTORS); thermoelectric probe (TEP) measurements to determine carrier type; and magnetoresistance (MR) measurements to determine mobility (Look, 1989). The C-V measurements have an advantage of depth profile information but a disadvantage of requiring a Schottky barrier. The TEP measurements require a temperature gradient to be imposed and are often ambiguous in the final conclusions.
The MR measurements require a high mobility μ or a high magnetic field strength B to implement, because the signal varies as μ²B² rather than μB, as in the Hall effect. Further comparisons of these techniques, as well as others, can be found in monographs by Runyan (1975), Look (1989), and Schroder (1990).
m*(dv/dt) = −e(E + v × B) − m*(v − v_eq)/τ   (1)
where m* is the effective mass, v_eq is the velocity at equilibrium (steady state), and τ is the velocity (or momentum) relaxation time, i.e., the time in which oscillatory phase information is lost through collisions. Consider a rectangular sample, as shown in Figure 1, with an external electric field E_ex = E_x x̂ and magnetic field B = B_z ẑ. Then, if no current is allowed to flow in the y direction (i.e., v_y = 0), the steady-state condition dv/dt = 0 requires that E_y = v_x B_z, and E_y is known as the Hall field. For electron concentration n, the current density j_x = −nev_x, and thus E_y = −j_x B_z/en ≡ j_x B_z R_H, where R_H = −1/en is the Hall coefficient. Thus, simple measurements of the quantities E_y, j_x, and B_z yield a very important quantity, n, although a more detailed analysis given below slightly modifies this relationship.

The above analysis assumes that all electrons are moving with the same velocity v (constant τ), which is not true in a semiconductor. To relieve this constraint, we note that Equation 1 consists of three coupled differential equations (in v_x, v_y, v_z) that can be solved by standard techniques. After averaging over energy, the steady-state currents can then be shown to be

j_x = σ_xx E_x + σ_xy E_y   (2)

j_y = σ_yx E_x + σ_yy E_y   (3)
where the σ_ij are elements of the conductivity tensor, defined by

σ_xx = σ_yy = (ne²/m*)⟨τ/(1 + ω_c²τ²)⟩   (4)

σ_xy = −σ_yx = (ne²/m*)⟨ω_cτ²/(1 + ω_c²τ²)⟩   (5)

Here ω_c = eB/m* is the cyclotron frequency, where B is the magnitude of B, and the brackets denote an average over energy ε taken as follows:

⟨F(ε)⟩ = [∫₀^∞ F(ε) ε^(3/2) (∂f₀/∂ε) dε] / [∫₀^∞ ε^(3/2) (∂f₀/∂ε) dε] → [∫₀^∞ F(ε) ε^(3/2) e^(−ε/kT) dε] / [∫₀^∞ ε^(3/2) e^(−ε/kT) dε]   (6)
PRINCIPLES OF THE METHOD

A phenomenological equation of motion for electrons of charge −e moving with velocity v in the presence of electric field E and magnetic field B is
Figure 1. Hall bar configuration for resistivity and Hall effect measurements.
where F(ε) is any function of ε and f₀ is the Fermi-Dirac distribution function. The second equality in Equation 6 holds for nondegenerate electrons, i.e., those describable by a Boltzmann distribution function. For small magnetic fields, i.e., ω_cτ ≪ 1, and under the usual constraint j_y = 0, it is easy to show that

j_x = (ne²⟨τ⟩/m*) E_x ≡ neμ_c E_x   (7)

R_H = E_y/(j_x B) = −(1/en)(⟨τ²⟩/⟨τ⟩²) ≡ −r/en   (8)
where r ≡ ⟨τ²⟩/⟨τ⟩² is the "Hall factor" and μ_c = e⟨τ⟩/m* is known as the "conductivity" mobility, since the quantity neμ_c is just the conductivity σ. We define the "Hall" mobility as μ_H = |R_H|σ = rμ_c and the "Hall" concentration as n_H = n/r = 1/(e|R_H|). Thus, a combined Hall effect and conductivity measurement gives n_H and μ_H, not n and μ_c; fortunately, however, r is usually within 20% of unity and almost never as large as 2. In any case, r can often be calculated or measured [r = R_H(μB ≫ 1)/R_H(μB ≪ 1)], so that an accurate value of n can usually be determined.

The relaxation time τ(ε) depends on how the electrons interact with the lattice vibrations as well as with extrinsic elements, such as charged impurities and defects. For example, acoustical-mode lattice vibrations scatter electrons through the deformation potential (leading to a relaxation time τ_ac) and the piezoelectric potential (τ_pe); optical-mode vibrations through the polar potential (τ_po); ionized impurities and defects through the screened coulomb potential (τ_ii); and charged dislocations, also through the coulomb potential (τ_dis). The strengths of these various scattering mechanisms depend upon certain lattice parameters, such as dielectric constants and deformation potentials, and extrinsic factors, such as the donor, acceptor, and dislocation concentrations N_D, N_A, and N_dis, respectively (Rode, 1975; Wiley, 1975; Nag, 1980; Look, 1989). The total momentum scattering rate, or inverse relaxation time, is

τ⁻¹(ε) = τ_ac⁻¹(ε) + τ_pe⁻¹(ε) + τ_po⁻¹(ε) + τ_ii⁻¹(ε) + τ_dis⁻¹(ε)   (9)
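Equations 6 and 9 translate directly into a small numerical routine: sum the scattering rates, then form the Boltzmann-weighted averages ⟨τ⟩ and ⟨τ²⟩. The sketch below assumes nondegenerate statistics and parabolic bands; the function names are illustrative, and the individual τ_i(ε) would come from models such as Equations 28 to 31:

```python
import math

def energy_average(F, kT, n=20000, emax_factor=40.0):
    """<F(e)> per Equation 6 (nondegenerate limit): Boltzmann-weighted
    average with the e^(3/2) density-of-states factor, by simple quadrature."""
    de = emax_factor * kT / n
    num = den = 0.0
    for i in range(1, n + 1):
        e = i * de
        w = e ** 1.5 * math.exp(-e / kT)
        num += F(e) * w
        den += w
    return num / den

def hall_mobility(tau_funcs, m_eff, kT):
    """mu_H = e<tau^2>/(m* <tau>), with 1/tau = sum of rates (Equation 9)."""
    q = 1.602176634e-19
    def tau(e):
        return 1.0 / sum(1.0 / t(e) for t in tau_funcs)
    t1 = energy_average(tau, kT)
    t2 = energy_average(lambda e: tau(e) ** 2, kT)
    return q * t2 / (m_eff * t1)
```

For a constant τ the routine reduces to μ_H = eτ/m*, and for a pure acoustic-mode power law τ ∝ ε^(−1/2) it reproduces the familiar Hall factor r = 3π/8 ≈ 1.18, which is a convenient check.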
Equation 9 is then used to determine ⟨τ(ε)⟩ and ⟨τ²(ε)⟩ via Equation 6, and thence μ_H = e⟨τ²⟩/m*⟨τ⟩. Since μ_H is a function of temperature T, a fit of μ_H vs. T can often be used to determine N_D, N_A, and N_dis. The μ_H-vs.-T curve should be solved simultaneously with the n-vs.-T curve, given by the charge-balance equation (CBE)

n + N_A = N_D/(1 + n/φ_D)   (10a)

where φ_D = (g₀/g₁)N_C′ exp(α_D/k) T^(3/2) exp(−E_D0/kT). Here, g₀/g₁ is a degeneracy factor (= 1/2 for an s state), N_C′ = 2(2πm_n*k)^(3/2)/h³, where h is Planck's constant, E_D is the donor energy, k is Boltzmann's constant, and E_D0 and α_D are defined by E_D = E_D0 − α_D T. If more than one donor exists within a few kT of the Fermi energy, then equivalent terms are added on the right-hand side of Equation 10a. For a p-type sample, we use the nearly equivalent equation

p + N_D = N_A/(1 + p/φ_A)   (10b)

where p is the hole concentration and φ_A = (g₁/g₀)N_V′ exp(α_A/k) T^(3/2) exp(−E_A0/kT), and where N_V′ = 2(2πm_p*k)^(3/2)/h³ and E_A = E_A0 − α_A T.

The mobility analysis described above, known as the relaxation-time approximation (RTA), is limited to elastic (energy-conserving) scattering processes, because for these a relaxation time τ(ε) can be well defined. Unfortunately, this is usually not the case for polar optical-mode (po) scattering, since the energy exchanged in a po scattering event is typically of order kT. However, we can often approximate τ_po by an analytic formula, and, in any case, at low temperatures po scattering is not very important. Nevertheless, for the most accurate calculations, the Boltzmann transport equation (BTE) should be solved directly, as discussed in several references (Rode, 1975; Nag, 1980; Look, 1989).

Hall samples do not have to be rectangular, such as that shown in Figure 1; in fact, we will discuss arbitrarily shaped specimens below (see Practical Aspects of the Method). However, the above analysis does assume that n and μ are homogeneous throughout the sample. If n and μ vary with depth z only, then the measured quantities are

σ_sq = ∫₀^d σ(z) dz = e ∫₀^d n(z)μ(z) dz   (11)

R_Hsq σ_sq² = e ∫₀^d n(z)μ²(z) dz   (12)

where d is the sample thickness and the subscript "sq" denotes a sheet (areal) quantity (in reciprocal centimeters squared) rather than a volume quantity (in reciprocal cubic centimeters). If some of the carriers are holes rather than electrons, then the sign of e for those carriers must be reversed. The general convention is that R_H is negative for electrons and positive for holes. In some cases, the hole and electron contributions to R_Hsq σ_sq² exactly balance at a given temperature, and this quantity vanishes.

PRACTICAL ASPECTS OF THE METHOD

Useful Equations

The Hall bar structure of Figure 1 is analyzed as follows: E_x = V_c/l, E_y = V_H/w, and j_x = I/wd, so that

σ = ρ⁻¹ = j_x/E_x = Il/(V_c wd)   (13)

R_H = E_y/(j_x B) = V_H d/(IB)   (14)

μ_H = R_H σ = V_H l/(V_c wB)   (15)

n_H = (eR_H)⁻¹   (16)
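Equations 13 to 16 can be collected into one short routine. This is a sketch in mks units; the function name and the sample numbers in the test are illustrative:

```python
def hall_bar(Vc, VH, I, B, l, w, d):
    """Hall bar analysis of Figure 1, Equations 13-16 (mks units)."""
    e = 1.602176634e-19
    sigma = I * l / (Vc * w * d)    # Eq. 13: conductivity, S/m
    RH = VH * d / (I * B)           # Eq. 14: Hall coefficient, m^3/C
    mu_H = abs(RH) * sigma          # Eq. 15: Hall mobility (magnitude)
    n_H = 1.0 / (e * abs(RH))       # Eq. 16: Hall concentration, m^-3
    return sigma, RH, mu_H, n_H
```

Note that μ_H computed this way equals V_H l/(V_c wB), so the mobility follows from the two voltages, the aspect ratio, and B alone, with no need to know d.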
In the meter-kilogram-second (mks) system, current I is in amperes, voltage V in volts, magnetic field strength B in tesla, and length l, width w, and thickness d in meters. By realizing that 1 T = 1 V·s/m², 1 A = 1 C/s, and 1 Ω = 1 V/A, we find that σ is in units of Ω⁻¹m⁻¹, R_H in m³/C, μ_H in m²/V·s, and n_H in m⁻³. However, it is more common to denote σ in Ω⁻¹cm⁻¹, R_H in cm³/C, μ_H in cm²/V·s, and n_H in cm⁻³, with obvious conversion factors (1 m = 10² cm). Since B is often quoted in gauss, it is useful to note that 1 T = 10⁴ G.

Although the Hall bar configuration discussed above is the simplest and most straightforward geometry to analyze, it is not the most popular shape in use today. The reason stems from a very convenient formulation by van der Pauw (1958), in which he solved the potential problem for a thin layer of arbitrary shape. A convenient feature of the van der Pauw technique is that no dimension need be measured for the calculation of sheet resistance or sheet carrier concentration, although a thickness must of course be known for volume resistivity and concentration. Basically, the validity of the van der Pauw method requires that the sample be flat, homogeneous, isotropic, and a singly connected domain (no holes), and have line electrodes on the periphery, projecting to point contacts on the surface, or have true point contacts on the surface. The last requirement is the most difficult to satisfy, so that much work has gone into determining the effects of finite contact size.

Consider the sample shown in Figure 2A. Here, a current I flows between contacts 1 and 2, and a voltage V_c is measured between contacts 3 and 4. Let resistance R_ij,kl ≡ V_kl/I_ij, where the current enters contact i and leaves contact j, and V_kl = V_k − V_l. [These definitions, as well as the contact numbering, correspond to ASTM (1988) Standard F76.] The resistivity ρ, with B = 0, is then calculated as

ρ = (πd/ln 2)[(R_21,34 + R_32,41)/2] f   (17)

where f is determined from a transcendental equation:

(Q − 1)/(Q + 1) = (f/ln 2) arccosh[(1/2) exp(ln 2/f)]   (18)

Here, Q = R_21,34/R_32,41 if this ratio is greater than unity; otherwise, Q = R_32,41/R_21,34. A curve of f vs. Q accurate to 2% is presented in Figure 3 (van der Pauw, 1958).

Figure 3. Resistivity ratio function used to correct the van der Pauw results for asymmetric sample shape.

Also useful is a somewhat simpler analytical procedure for determining f due to Wasscher and reprinted in Weider (1979). First calculate a from

Q = ln(1/2 − a)/ln(1/2 + a)   (19)

and then calculate f from

f = ln(1/4)/[ln(1/2 + a) + ln(1/2 − a)]   (20)

Here, it is of course required that −1/2 < a < 1/2, but this range of a covers Q = 0 to Q = ∞. For example, a ratio Q = 4.8 gives a value a ≈ 0.25, and then f ≈ 0.83. Thus, the ratio must be fairly large before ρ is appreciably reduced.

It is useful to further average ρ by including the remaining two contact permutations, and also reversing current for all four permutations. Then

ρ = (πd/ln 2)[(R_21,34 − R_12,34 + R_32,41 − R_23,41)f_A/8 + (R_43,12 − R_34,12 + R_14,23 − R_41,23)f_B/8]   (21)

where f_A and f_B are determined from Q_A and Q_B, respectively, by applying either Equation 18 or Equations 19 and 20. Here

Q_A = (R_21,34 − R_12,34)/(R_32,41 − R_23,41)   (22)

Q_B = (R_43,12 − R_34,12)/(R_14,23 − R_41,23)   (23)
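Wasscher's procedure is convenient to code: solve Equation 19 for a by bisection (the ratio is monotonic in a), then evaluate Equation 20. A sketch follows; the Q = 4.8 check value is the one quoted in the text:

```python
import math

def wasscher_f(Q, tol=1e-12):
    """Van der Pauw correction factor f from the resistance ratio Q,
    via Wasscher's procedure (Equations 19 and 20)."""
    if Q < 1.0:
        Q = 1.0 / Q  # convention: use the ratio greater than unity
    def ratio(a):
        # Equation 19, as a function of a on (0, 1/2)
        return math.log(0.5 - a) / math.log(0.5 + a)
    lo, hi = 0.0, 0.5 - 1e-15
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ratio(mid) < Q:
            lo = mid
        else:
            hi = mid
    a = 0.5 * (lo + hi)
    # Equation 20
    return math.log(0.25) / (math.log(0.5 + a) + math.log(0.5 - a))
```

A symmetric sample (Q = 1) gives f = 1, and f falls only slowly as Q grows, consistent with the remark that the ratio must be fairly large before ρ is appreciably reduced.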
The Hall mobility is determined by using the configuration of Figure 2B, in which the current and voltage contacts are crossed. The Hall coefficient becomes

R_H = (d/B)(R_31,42 + R_42,13)/2   (24)
In general, to minimize magnetoresistive and other effects, it is useful to average over current and magnetic field polarities. Then
Figure 2. Arbitrary shape for van der Pauw measurements: (A) resistivity; (B) Hall effect.
R_H = (d/B)[R_31,42(+B) − R_13,42(+B) + R_42,13(+B) − R_24,13(+B) + R_13,42(−B) − R_31,42(−B) + R_24,13(−B) − R_42,13(−B)]/8   (25)
Sensitivity

The most common magnetic field strength used by researchers for Hall effect measurements is 5 kG = 0.5 T. Also, typical semiconductor materials (e.g., GaAs) vary in resistivity ρ from 10⁻⁴ to 10⁹ Ω·cm, depending on intrinsic factors as well as impurity and defect concentrations. Finally, typical sample thicknesses d may range from 10⁻⁵ to 10⁻¹ cm. Consider the worst case on the high-resistivity end of the spectrum. Let ρ = 10⁹ Ω·cm, d = 10⁻⁵ cm, and l/w ≈ 1. Then, if a voltage of 100 V can be applied, the current will be 1 pA, which can be measured with an electrometer. If the mobility μ_H ≈ 10³ cm²/V·s, then the Hall coefficient R_H will be 10¹² cm³/C, and the Hall voltage V_H will be 5 V. Conversely, let ρ = 10⁻⁴ Ω·cm and d = 10⁻¹ cm. Then, the resistance is about 10⁻³ Ω, and we will not be able to apply more than about 10⁻³ V in order to avoid sample heating. Such a voltage will produce 1 A of current and a resulting Hall voltage of 50 μV, again easily measurable. Of course, if μ_H ≈ 1 cm²/V·s, then V_H ≈ 50 nV, which can still probably be measured if noise levels are low. We may note that the lowest mobility that we have measured in our laboratory, by using our standard dc techniques, is 0.1 cm²/V·s.
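The worst-case estimates in this paragraph can be reproduced with a short routine. This is a sketch; units are converted internally from the cgs-style quantities quoted in the text, and the function name is illustrative:

```python
def hall_signal(rho_ohm_cm, mu_cm2_per_vs, d_cm, v_applied, b_tesla, l_over_w=1.0):
    """Estimate sample resistance, drive current, and Hall voltage for a
    bar of resistivity rho, mobility mu, thickness d, and aspect ratio l/w.
    Uses R = rho*(l/w)/d, I = V/R, R_H = mu*rho, V_H = R_H*I*B/d."""
    r_ohm = rho_ohm_cm * l_over_w / d_cm            # sample resistance, ohms
    i_amp = v_applied / r_ohm                       # drive current, A
    r_hall_m3 = mu_cm2_per_vs * rho_ohm_cm * 1e-6   # R_H: cm^3/C -> m^3/C
    v_hall = r_hall_m3 * i_amp * b_tesla / (d_cm * 1e-2)
    return r_ohm, i_amp, v_hall

# High-resistivity case from the text: rho = 1e9 ohm-cm, d = 1e-5 cm,
# 100 V applied, mu = 1e3 cm^2/V-s, B = 0.5 T -> I = 1 pA, V_H = 5 V.
```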
METHOD AUTOMATION

A basic design for a high-impedance, automated van der Pauw–Hall apparatus, which can accommodate samples of up to 10¹² Ω, is given in Figure 4. All components are commercially available, and the electrometers can, of course, be replaced by high-impedance, unity-gain buffer amplifiers, although possibly at some sacrifice in input impedance. The inputs and outputs of the high-impedance scanner and the inputs to the ammeter and electrometers are of triaxial design, which has the advantage that the inner shields can be driven by the unity-gain output of the electrometers, which effectively reduces cable-charging effects. The outer shields are all grounded at a common point, although in practice the grounding may not be critical. The current source should have an effective output impedance of about 10¹² Ω and be able to regulate currents of 10⁻¹⁰ A. At higher resistance levels, the current regulation will cross over to voltage regulation, and the currents will diminish; however, the data will still be accurate as long as the actual current and voltage are both measured. The low-impedance scanner should be mainly designed for low thermal offsets (say, a few microvolts), since current leakage is not a problem on this side of the electrometers. Note that although the ammeter needs to be of electrometer design, to be able to measure the very low currents, the voltmeter does not, since the electrometer output impedances are only a few kilohms. The wiring to the high-impedance scanner in this design allows the current source and current sink to be individually applied to any desired sample contact. Also, the high input and low input on the voltmeter can be connected to any electrometer unity-gain output through the low-impedance scanner. Figure 4 does not represent the most efficient design in terms of the minimum number of scanner switches, but it does illustrate the main ideas quite well. Although not shown here, for very low resistance samples it is desirable, and sometimes necessary, to be able to bypass the electrometers, since their typical noise levels of 10-μV peak to peak may be unacceptably high.
However, we would recommend that the system be designed around the standard IEEE-488 interface bus, which allows new instruments to be added at will and effectively eliminates serious hardware problems. Complete Hall systems, including temperature and magnetic field control, are available from several companies, including Bio-Rad Semiconductor, Keithley Instruments, Lake Shore Cryotronics, and MMR Technologies.
DATA ANALYSIS AND INITIAL INTERPRETATION
Figure 4. Schematic of an automated, high-impedance Hall effect apparatus.
The primary quantities determined from Hall effect and conductivity measurements are the Hall carrier concentration n_H or p_H and mobility μ_H. As already discussed, n_H = 1/(e|R_H|), where R_H is given by Equation 24 (for a van der Pauw configuration), and μ_H = |R_H|σ = |R_H|/ρ, where ρ is given by Equation 21. Although simple 300-K values of ρ, n_H, and μ_H are quite important and widely used, it is in temperature-dependent Hall (TDH) measurements that the real power of the Hall technique is demonstrated, because then the donor and acceptor concentrations and energies can be determined. We will illustrate the methodology with a two-layer GaN problem. The GaN sample discussed here was a square (6 × 6-mm) layer grown on sapphire to a thickness d = 20 μm.
Figure 5. Uncorrected Hall concentration data (squares) and fit (solid line) and corrected data (triangles) and fit (dashed line) vs. inverse temperature.
Small indium dots were soldered on the corners to provide ohmic contacts, and the Hall measurements were carried out in an apparatus similar to that illustrated in Figure 4. Temperature control was achieved by using a He exchange-gas dewar. The temperature dependences of n_H and μ_H are shown by the squares in Figures 5 and 6, respectively. At temperatures below 30 K (i.e., 10³/T > 33), the carriers (electrons) in the main part of the layer "freeze out" on their parent donors and thus are no longer available for conduction. However, this sample had a very thin, strongly n-type layer between the sapphire and GaN, and the carriers in such a layer do not freeze out and, in fact, have a temperature-independent concentration and mobility, as seen at low T (high 10³/T) in Figures 5 and 6, respectively. Thus, we need to use the depth-profile analysis given by Equations 11 and 12. For two layers, Equations 11 and 12 give

μ_H = R_sq σ_sq = (μ_H1² n_H1 + μ_H2² n_Hsq2/d)/(μ_H1 n_H1 + μ_H2 n_Hsq2/d)   (26)

n_H = n_Hsq/d = 1/(eR_sq d) = (μ_H1 n_H1 + μ_H2 n_Hsq2/d)²/(μ_H1² n_H1 + μ_H2² n_Hsq2/d)   (27)
where layer 1 is the main, 20-μm-thick GaN growth and layer 2 is the very thin interface region. Since we do not know the thickness of layer 2, we simply normalize to the thickness d = 20 μm for plotting purposes. From the figures, we get μ_H2 = 55 cm^2/V·s and n_Hsq2/d = 3.9 × 10^17 cm^-3. Because these values are constant, we can invert Equations 26 and 27 and solve for μ_H1 and n_H1 at each temperature. (The resulting equations are given later.) To fit the uncorrected data (squares), we parametrize n vs. T by Equation 10a and μ_H vs. T by μ_H = e⟨τ^2⟩/(m*⟨τ⟩), where τ is given by Equation 9. Because n = n_H r = n_H ⟨τ^2⟩/⟨τ⟩^2, the fits of n_H vs. T and μ_H vs. T must be carried out simultaneously. In this case r varies only from about 1.2 to 1.4 as a function of T. Formulas for τ_ac, τ_pe, τ_po, and τ_ii are given below. For ionized impurity (or defect) scattering in a nondegenerate, n-type material,
τ_ii(ε) = 2^(9/2) π ε0^2 (m*)^(1/2) ε^(3/2) / {e^4 (2N_A + n) [ln(1 + y) − y/(1 + y)]}   (28)
where y = 8ε0 m* kT ε / (ℏ^2 e^2 n). Here ε0 is the low-frequency (static) dielectric constant. [If the sample is p type, let (2N_A + n) → (2N_D + p).] For acoustic-mode deformation-potential scattering,
τ_ac(ε) = π ℏ^4 ρ_d s^2 ε^(−1/2) / [2^(1/2) E1^2 (m*)^(3/2) kT]   (29)
where ρ_d is the density, s is the speed of sound, and E1 is the deformation potential. For acoustic-mode piezoelectric-potential scattering,

τ_pe(ε) = 2^(3/2) π ℏ^2 ε0 ε^(1/2) / [e^2 P^2 (m*)^(1/2) kT]   (30)
where P is the piezoelectric coupling coefficient [P = (h_pz^2 / ρ_d s^2 ε0)^(1/2)]. Finally, for polar optical-mode scattering, only a rough approximation can be given:

τ_po(ε) = C_po 2^(3/2) π ℏ^2 (e^(T_po/T) − 1) [0.762 ε^(1/2) + 0.824(kT_po)^(1/2) − 0.235(kT_po)^(−1/2) ε] / [e^2 kT_po (m*)^(1/2) (ε∞^(−1) − ε0^(−1))]   (31)

Figure 6. Uncorrected Hall mobility data (squares) and fit (solid line) and corrected data (triangles) and fit (dashed line) vs. temperature.

ELECTRICAL AND ELECTRONIC MEASUREMENTS

where T_po is the Debye temperature, ε∞ is the high-frequency dielectric constant, and C_po is a fitting parameter, of order unity, that corrects for the inexact nature of τ_po(ε). That is, if we had only po scattering, then the exact (i.e., BTE) calculation of μ_H vs. T would be almost identical with the RTA calculation (i.e., μ_H = e⟨τ^2⟩/(m*⟨τ⟩)) with C_po = 1. However, if other scattering mechanisms are also important, then the correction factor C_po will depend on the relative strengths of these other mechanisms. As an example, to get a good fit to the high-temperature (>300 K) μ_H-vs.-T data in GaN (Fig. 6), we use C_po ≈ 0.6. Fortunately, below 150 K the po mechanism is no longer important in GaN, and the RTA approach is quite accurate. The RTA analysis discussed above and the CBE analysis discussed earlier (Equation 10a or 10b) constitute a powerful method for the determination of N_D, N_A, and E_D in semiconductor material. It can easily be set up on a personal computer, using, e.g., MATHCAD software. In many cases, it is sufficient to simply assume n = n_H (i.e., r = 1) in Equations 10a and 28, but a more accurate answer can be obtained by using the following steps: (1) let n = n_H = 1/eR_H at each T; (2) fit μ_H vs. T using μ_H = e⟨τ^2⟩/(m*⟨τ⟩), and get a value for N_A; (3) calculate r = ⟨τ^2⟩/⟨τ⟩^2 at each T; (4) calculate a new n = rn_H at each T; and (5) fit n vs. T to Equation 10a and get values of N_D and E_D. Further iterations can be carried out, if desired, but usually add little accuracy. For the benefit of the reader who wishes to set up such an analysis, we give the scattering strength parameters and fitting parameters used for the GaN data fits in Figures 5 and 6. Further discussion can be found in the work of Look and Molnar (1997). From the GaN literature we find: E1 = 9.2 eV = 1.47 × 10^-18 J; P = 0.104; ε0 = 10.4 × (8.8542 × 10^-12 F/m); ε∞ = 5.47 × (8.8542 × 10^-12 F/m); T_po = 1044 K; m* = 0.22 × (9.1095 × 10^-31 kg); ρ_d = 6.10 × 10^3 kg/m^3; s = 6.59 × 10^3 m/s; g0 = 1; g1 = 2; α_D = 0; and N_C0 = 4.98 × 10^20 m^-3. With all the parameters in mks units, μ_H is in units of m^2/V·s; a useful conversion is μ_H (cm^2/V·s) = 10^4 μ_H (m^2/V·s). To fit the high-T μ_H data, we also have to set C_po = 0.56, as mentioned earlier. However, before carrying out the RTA and CBE analyses, it was necessary in this case to correct for a degenerate interface layer having μ_H2 = 55 cm^2/V·s and n_Hsq2/d = 3.9 × 10^17 cm^-3. The corrected data, also shown in Figures 5 and 6, are calculated by inverting Equations 26 and 27:
μ_H1 = (μ_H^2 n_H − μ_H2^2 n_Hsq2/d) / (μ_H n_H − μ_H2 n_Hsq2/d)   (32)

n_H1 = (μ_H n_H − μ_H2 n_Hsq2/d)^2 / (μ_H^2 n_H − μ_H2^2 n_Hsq2/d)   (33)
The fitted parameters are N_D = 2.1 × 10^17 cm^-3, N_A = 5 × 10^16 cm^-3, and E_D = 16 meV.
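The RTA averages ⟨τ⟩ and ⟨τ^2⟩ that enter μ_H = e⟨τ^2⟩/(m*⟨τ⟩) and r = ⟨τ^2⟩/⟨τ⟩^2 are straightforward to evaluate numerically. The sketch below assumes nondegenerate (Maxwell-Boltzmann) statistics and takes τ as a function of the reduced energy x = ε/kT, with all mechanism parameters folded into τ itself; the function names are illustrative.

```python
import numpy as np

def _trap(y, x):
    # simple trapezoidal rule, to avoid relying on a particular NumPy API
    return float(np.sum((y[:-1] + y[1:]) * np.diff(x)) / 2.0)

def rta_averages(tau_of_x, x_max=60.0, n_pts=100000):
    """Return (<tau>, <tau^2>) averaged over x^(3/2) exp(-x)."""
    x = np.linspace(1e-6, x_max, n_pts)
    w = x**1.5 * np.exp(-x)          # Maxwell-Boltzmann weight for <...>
    tau = tau_of_x(x)
    norm = _trap(w, x)
    return _trap(tau * w, x) / norm, _trap(tau**2 * w, x) / norm

# Known check: for acoustic-mode scattering, tau ~ x^(-1/2), so the
# Hall factor r = <tau^2>/<tau>^2 should equal 3*pi/8 (about 1.18).
t1, t2 = rta_averages(lambda x: x**-0.5)
r = t2 / t1**2
```

With several mechanisms, Matthiessen's rule 1/τ(x) = Σ 1/τ_i(x) can be applied inside `tau_of_x` before averaging, which is how the combined fit to Figures 5 and 6 would be organized.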
SAMPLE PREPARATION In deciding what kind of geometrical structure to use for a particular application, several factors should be considered, including (1) available size, (2) available fabrication techniques, (3) limitations of measurement time, (4) necessary accuracy, and (5) need for magnetoresistance data. In considering these factors, we will refer to Figure 7, which depicts six of the most popular structures. The available size can be a severe constraint. In our laboratory, we have often measured bulk samples of dimension 2 mm or less, which virtually rules out any complicated shapes, such as (B), (D), or (F) in the figure; in fact, it is sometimes not possible to modify an existing sample shape at all. Thus, the best procedure for small samples
Figure 7. Various specimen/contact patterns commonly used for resistivity and Hall effect measurements: (A) Hall bar; (B) Hall bar with contact arms; (C) square; (D) Greek cross; (E) circle; (F) cloverleaf.
is simply to put four very small contacts around the periphery and apply the standard van der Pauw analysis. We have found that indium, applied with a soldering iron, works well for almost any semiconductor material. Contact size errors can be estimated if the shape is somewhat symmetrical. The simple Hall bar structure (A) is not recommended for a very small bulk sample unless the sample is already in that form, because it is then necessary to measure l and w, which can introduce large errors. For larger bulk samples, there is a greater choice among the various structures. The availability of an ultrasonic cutting tool opens up the possibility of using structures (B), (D), (E), or (F), if desired. Here, it might be noted that (B) is rather fragile compared to the others. If the samples must be cleaved from a wafer, then the shapes are basically limited to (A) and (C). In our laboratory, for example, it is common to use a square (C) of about 6 × 6 mm and then put contacts of dimension 1 mm or less on the corners. If photolithographic capabilities are available, then one of the more complex test structures (B), (D), or (F) should be used, because of the advantages they offer. Most manufacturers of discrete semiconductor devices or circuits include Hall bar or van der Pauw structures at several points on the wafer, sometimes one in every reticle (repeated unit). Another possible constraint is measurement time. By comparing Equation 13 with Equation 17 and Equation 14 with Equation 24, it appears that a typical van der Pauw experiment should take about twice as long as a Hall bar experiment, and indeed, this is experimentally the case. Thus, in an experiment that involves many measurements on the same sample, such as a temperature dependence study, it may well be a distinct advantage to use a Hall bar instead of a van der Pauw pattern. Also, if a contact-switching network (Fig.
4) is not part of the available apparatus, then a van der Pauw structure cannot be conveniently used. If high accuracy is necessary for the Hall effect measurements, then structures (B), (D), and (F) are the best, because contact size effects are much stronger for Hall effect data than for resistivity data. The same structures,
along with (A), should be used if μB must be made large, since structures (C) and (E) do not have as good a V_H-vs.-B linearity. From a heat dissipation point of view, structures (A) and (B) are the worst, since they are required to be long and narrow, and thus the Hall voltage is relatively small for a given current density, since V_H ∝ w. A high heat dissipation can lead to large temperature gradients, and thus stronger thermomagnetic effects, or may simply raise the sample temperature an unacceptable amount. Finally, it is important to ask whether or not the same sample is going to be used for magnetoresistance measurements, because, if so, structures (A) and (B) are undoubtedly the best. The reason is that the analysis of magnetoresistance is more complicated in van der Pauw structures than in Hall bar structures, and, in general, a simple formula cannot be found. It may be noted that in Shubnikov-de Haas and quantum Hall measurements, in which magnetic field dependence is critical, the Hall bar is nearly always used.

PROBLEMS

Contact Size and Placement Effects

Much has been written about this subject over the past few decades. Indeed, it is possible to calculate errors due to contact size and placement for any of the structures shown in Figure 7. For (A), (C), and (E), great care is necessary, while for (B), (D), and (F), large or misplaced contacts are not nearly as much of a problem. In general, a good rule of thumb is to keep contact size and distance from the periphery each below 10% of the smallest sample edge dimension. For Hall bar structures (A) and (B), in which the contacts cover the ends, the ratio l/w > 3 should be maintained.

Thermomagnetic Errors

Temperature gradients can set up spurious electromotive forces that can modify the measured Hall voltage.
Most of these effects, as well as misalignment of the Hall contacts in structure (B), can be averaged out by taking measurements at positive and negative values of both current and magnetic field and then applying Equations 21 and 25.
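The sign-reversal averaging just described can be sketched as a one-line combination. This is an illustration of the standard combination only; the text's Equations 21 and 25 are not reproduced here, and the function name is invented.

```python
# Combine Hall-voltage readings taken at both current and field polarities.
# The Hall signal is odd in both I and B and survives the average, while
# thermomagnetic and misalignment offsets even in either variable cancel.
def hall_voltage_averaged(v_pp, v_mp, v_pm, v_mm):
    """v_ab: reading with current polarity a and field polarity b (+/-)."""
    return (v_pp - v_mp - v_pm + v_mm) / 4.0
```

For example, a reading of the form v = V_H s_I s_B plus offsets proportional to s_I alone, s_B alone, or neither averages back to V_H exactly.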
Conductive Substrates

If a thin film is grown on a conductive substrate, the substrate conductance may overwhelm the film conductance. If so, and if μ_sub and n_sub are known, then Equations 32 and 33 can be applied, where layer 2 is the substrate. If the substrate and film are of different types (say, a p-type film on an n-type substrate), then a current barrier (p-n junction) will be set up, and the measurement can possibly be made with no correction. However, in this case, the contacts must not overlap both layers.

Depletion Effects in Thin Films

Surface states as well as film-substrate interface states can deplete a thin film of a significant fraction of its charge carriers. Suppose these states lead to surface and interface potentials φ_s and φ_i, respectively. Then regions of width w_s and w_i will be depleted of their free carriers, where

w_s(i) = [2ε0 φ_s(i) / e(N_D − N_A)]^(1/2)   (34)

Here it is assumed that φ_s(i) ≫ kT/e and eφ_s(i) ≫ ε_C − ε_F. The electrical thickness of the film will then be given by d_elec = d − w_s − w_i. Typical values of φ_s and φ_i are ~1 V, so that if N_D − N_A = 10^17 cm^-3, then w_s + w_i ≈ 2000 Å = 0.2 μm in GaN. Thus, if d ≈ 0.5 μm, 40% of the electrons will be lost to surface and interface states, and d_elec ≈ 0.3 μm.

Inhomogeneity

A sample that is inhomogeneous in depth must be analyzed according to Equations 11 and 12. In simple cases (e.g., for two layers) such an analysis is sometimes possible, as was illustrated in Figures 5 and 6. However, if a sample is laterally inhomogeneous, it is nearly always impossible to carry out an accurate analysis. One indication of such inhomogeneity is a resistivity ratio Q ≠ 1 (Fig. 3) in a symmetric sample, which would be expected to have Q = 1. The reader should be warned never to attempt an f correction (Fig. 3) in such a case, because the f correction is valid only for sample shape asymmetry, not inhomogeneity.

Nonohmic Contacts

In general, high contact resistances are not a severe problem as long as enough current can be passed to get measurable values of V_c and V_H. The reason is that the voltage measurement contacts carry very little current. However, in some cases, the contacts may set up a p-n junction and significantly distort the current flow. This situation falls under the "inhomogeneity" category, discussed above. Usually, contacts this bad show variations with current magnitude and polarity; thus, for the most reliable Hall measurements, it is a good idea to make sure the values are invariant with respect to the magnitudes and polarities of both current and magnetic field.

ACKNOWLEDGMENTS

The author would like to thank his many colleagues at Wright State University and Wright-Patterson Air Force Base who have contributed to his understanding of the Hall effect and semiconductor physics over the years. He would also like to thank Nalda Blair for help with the manuscript preparation. Finally, he gratefully acknowledges the support received from the U.S. Air Force under Contract F33615-95-C-1619.

LITERATURE CITED

American Society for Testing and Materials (ASTM). 1988. Standard F76. Standard Method for Measuring Hall Mobility and Hall Coefficient in Extrinsic Semiconductor Single Crystals. ASTM, West Conshohocken, Pa.
Gantmakher, V. F. and Levinson, Y. B. 1987. Carrier Scattering in Metals and Semiconductors. In Modern Problems in Condensed Matter Physics, Vol. 19 (V. M. Agranovich and A. A. Maradudin, eds.). North-Holland, Amsterdam, The Netherlands.

Hall, E. H. 1879. On a new action of the magnet on electric circuits. Am. J. Math. 2:287–292.

Look, D. C. 1989. Electrical Characterization of GaAs Materials and Devices. John Wiley & Sons, New York.

Look, D. C. and Molnar, R. J. 1997. Degenerate layer at GaN/sapphire interface: Influence on Hall-effect measurements. Appl. Phys. Lett. 70:3377–3379.

Nag, B. R. 1980. Electron Transport in Compound Semiconductors. Springer-Verlag, Berlin.

Rode, D. L. 1975. Low-field electron transport. In Semiconductors and Semimetals, Vol. 10 (R. K. Willardson and A. C. Beer, eds.). pp. 1–89. Academic Press, New York.

Runyan, W. R. 1975. Semiconductor Measurements and Instrumentation. McGraw-Hill, New York.

Schroder, D. K. 1990. Semiconductor Material and Device Characterization. John Wiley & Sons, New York.

van der Pauw, L. J. 1958. A method of measuring specific resistivity and Hall effect of discs of arbitrary shape. Philips Res. Repts. 13:1–9.

Wieder, H. H. 1979. Laboratory Notes on Electrical and Galvanomagnetic Measurements. Elsevier/North-Holland, Amsterdam.

Wiley, J. D. 1975. Mobility of holes in III-V compounds. In Semiconductors and Semimetals, Vol. 10 (R. K. Willardson and A. C. Beer, eds.). pp. 91–174. Academic Press, New York.
KEY REFERENCES

Look, 1989. See above. A detailed description of both theory and methodology related to Hall effect, magnetoresistance, and capacitance-voltage measurements and analysis.

Nag, 1980. See above. A comprehensive treatise on electron scattering theory in semiconductor materials.

Schroder, 1990. See above. A brief, practical description of the Hall effect and many other techniques related to the measurement of semiconductor properties.

Wiley, 1975. See above. A good reference for hole scattering in p-type semiconductors.
DAVID LOOK Wright State University Dayton, Ohio
DEEP-LEVEL TRANSIENT SPECTROSCOPY
INTRODUCTION

Defects are responsible for many different characteristic properties of a semiconductor. They play a critical role in determining the viability of a given material for device applications. The identification and control of defects have always been among the most important and crucial tasks in materials and electronic device development. The performance and reliability of devices can be significantly affected by only minute concentrations of undesirable defects. Since the determination of the type and quantity of defects in a material depends on the sensitivity of a characterization technique, the challenge in materials characterization has been to develop detection methods with improved sensitivity. Whereas electrical characterization methods are more sensitive than physical characterization techniques, they may arguably be less sensitive than some optical techniques. However, since device operation depends largely on the electrical properties of its components, it is conceivable that characterization from an electrical point of view is more relevant. Also, the activation of defects by electrical processes requires scrutiny, as it has a direct impact on the performance and reliability of a device.

Deep-level transient spectroscopy (DLTS) probes the temperature dependence of the charge carriers escaping from trapping centers formed by point defects in the material. This technique is able to characterize each type of trapping center by providing the activation energy of the defect level relative to one of the energy band edges and the capture cross-section of the traps. It can also be used to compute the concentration and depth profile of the trapping centers. Although several electrical characterization techniques exist, such as Hall effect (see HALL EFFECT IN SEMICONDUCTORS), current-voltage, capacitance-voltage (see CAPACITANCE-VOLTAGE (C-V) CHARACTERIZATION OF SEMICONDUCTORS), and carrier lifetime measurements (see CARRIER LIFETIME: FREE CARRIER ABSORPTION, PHOTOCONDUCTIVITY, AND PHOTOLUMINESCENCE), very few of them exploit spectroscopy. The spectroscopic nature of DLTS is a key feature that provides both convenience and sensitivity.

Deep-level transient spectroscopy has been widely used for many different semiconductors. This technique has distinguished itself in contributing to the resolution of many defect-related problems in several technologically important semiconductors such as silicon, the Group III-V and II-VI compounds, and alloys. Many variations of basic DLTS have also been developed for improved sensitivity and for more specialized applications in device structures different from the normal p-n or Schottky barrier diodes. Deep-level transient spectroscopy is not able to determine the chemistry or the origin of a defect. Deep-level transient spectroscopy data should therefore be used in conjunction with other techniques. A successful study of defects requires the concerted efforts of many researchers using various characterization techniques in order to derive a more accurate and consistent picture of the defect structure of a given material.

Defects in Semiconductors

In a real crystal, the periodic symmetry of the lattice can be broken by defects. Lattice defects produce localized energy states that may have energy levels occurring within the
band gap. A charge carrier (electron or hole) bound to such a defect in a lattice has a localized wave function, as opposed to a carrier in the allowed energy bands (conduction or valence bands), which is free to move. Crystal imperfections known as point defects can be vacancies or impurities introduced either deliberately or unintentionally during the growth process. Processing of materials during device fabrication can also introduce point defects. Some defects are unavoidable, and they play a key role in determining the properties of a semiconductor. Chemical impurities that form point defects may exist interstitially or substitutionally in the lattice. An interstitial atom may be of the same species as the atoms in the lattice (intrinsic defects) or of a different species (extrinsic defects). Defects can also consist of vacant lattice sites. There are also defect complexes that are conglomerations of different point defects. In addition to point defects, there are also one-dimensional defects such as dislocations, two-dimensional defects such as surfaces and grain boundaries, and three-dimensional defects such as micropipes and cavities (voids). Defects obey the laws of thermodynamics and the law of mass action. Hence, the removal or suppression of one type of defect will enhance the effects of another type. For instance, the removal of defects such as grain boundaries and dislocations increases the significance of point defects. The presence of defects in semiconductors can be either beneficial or detrimental, depending on the nature of the defects and the actual application of the material in devices. Gold impurities in silicon junctions are used to provide fast electron-hole recombination, resulting in faster switching time. Impurities such as gold, zinc, and mercury in silicon and germanium produce high-quantum-efficiency photodetectors. The emission wavelength of light-emitting diodes (LEDs) is determined by the presence of deep defect levels.
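The DLTS signature of such a deep level is its thermal emission rate. The sketch below illustrates the standard Arrhenius relation, e_n(T) = σ_n v_th(T) N_c(T) exp(−E_a/kT); since v_th ~ T^(1/2) and N_c ~ T^(3/2), a plot of ln(e_n/T^2) vs. 1/T is linear with slope −E_a/k. The prefactor value and function names here are illustrative assumptions, not values from the text.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant, eV/K

def emission_rate(T, E_a_eV, prefactor=1.0e9):
    """e_n in s^-1; 'prefactor' absorbs sigma_n and the T^2 coefficient
    (illustrative value only)."""
    return prefactor * T**2 * math.exp(-E_a_eV / (K_B * T))

def activation_energy(T1, T2, e1, e2):
    """Recover E_a (eV) from two points on the ln(e_n/T^2) vs 1/T plot."""
    slope = (math.log(e1 / T1**2) - math.log(e2 / T2**2)) / (1.0 / T1 - 1.0 / T2)
    return -slope * K_B
```

In practice many (T, e_n) pairs from the DLTS rate-window scan are fitted, not just two, but the slope extraction is the same.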
In undoped semi-insulating GaAs, a family of deep donor levels, commonly known as EL2, compensates the acceptor levels due to carbon impurities to impart the high resistivity or semi-insulating properties to these materials. Chromium is also used to dope GaAs to produce semi-insulating GaAs-Cr, although this is no longer in widespread use. Device performance and reliability are greatly affected by the presence of defects. The success of fiber-optics-based telecommunication systems employing laser diodes depends critically on the lifetime of the laser diodes and LEDs. Degradation of laser diodes and LEDs has been widely attributed to formation of local regions where nonradiative recombination occurs. There is general agreement that these regions result from the motion of dislocations that interact with defect centers to promote nonradiative recombination. A dislocation array can also propagate from a substrate into active layers resulting in device failure. This is a reason why it is crucial that electronic materials have a low dislocation density. Device fabrication processes can involve ion implantation, annealing, contact formation, mechanical scribing, and cleaving, all of which introduce dislocations and point defects. Dislocations in devices may be of little significance
by themselves. The problems arise only when these dislocations are "decorated" by point defects. Impurities tend to gather around dislocations, and the diffusion of impurities is also enhanced at dislocations. It is difficult to differentiate between the effects of point defects and dislocations because the production of dislocations also generates point defects such as interstitial atoms and vacancies. The topic of reliability and degradation of devices is wide ranging, and the above examples serve only to show the importance of the study of defects in devices.

The location of localized states within the band gap can range from a few milli-electron-volts to a few tenths of an electron volt from either the bottom of the conduction band or the top of the valence band. The determination of whether a level can be considered deep or shallow is rather arbitrary. A depth of ≥0.1 eV is usually termed a "deep level." It must also be noted that the accuracy in the determination of any level

30 years (Standard Method for Measuring the Minority-Carrier Lifetime in Bulk Germanium and Silicon, F28 in ASTM, 1987). Numerous noncontact PC methods, such as microwave reflectance and rf bridges, have recently become quite popular because of their nondestructive quality and their suitability for use in a clean-room environment. Most of the considered techniques do not require particularly expensive or bulky instruments. This has motivated the wide diffusion of noncontact PC decay techniques into diverse areas of carrier lifetime measurement, including on-line processing of silicon wafers.

Principle of PC Decay Methods

The principle of the experiment involves shining a pulse of light onto the semiconductor sample to generate excess carriers and monitoring the time decay of the corresponding excess conductivity. Since electrons and holes have opposite charges, the total conductivity is the sum of the partial conductivities, which is always positive.
Consequently, the excess conductivity can be written in the form

Δσ(t) = q[Δn(t)μ_n + Δp(t)μ_p] ≅ qΔn(t)(μ_n + μ_p)   (34)
where q is the electron charge and μ_n, μ_p are the corresponding drift mobilities of electrons and holes, which characterize PC at different injection levels. Before proceeding to details, it is worthwhile to reiterate two general points. First, the approximate equality in the last term of Equation 34 is valid in the absence of asymmetric capture within an impurity recombination model (Blakemore, 1962) or in the absence of carrier trapping (Ryvkin, 1964). In general, these conditions are valid at high injection for carrier generation by electromagnetic
radiation in the fundamental band with a photon energy hν ≥ E_g, where E_g is the forbidden gap energy. Then both Δn and Δp are proportional to the optical energy absorbed in that time and in the unit volume. Numerous experiments have shown that recombination misbalance effects can be neglected at any injection for measured lifetimes in the range 1 to 500 μs in crystalline Si and Ge (Graff and Fisher, 1979). However, we note that the approximate equality in Equation 34 does not hold under low injection conditions in some highly irradiated, highly damaged crystalline semiconductors and in polycrystalline or amorphous films (an example is provided by Brüggemann, 1997). The same indeterminate equality may appear in compensated compound semiconductor materials, since some of them may contain a substantial concentration of nonstoichiometric deep centers acting as carrier traps. These cases might require quite specific methods for carrier lifetime extraction from PC measurements. In the following, we shall assume that the approximate equality in Equation 34 is an inherent property of the semiconductor material. This assumption leaves complicated interpretation of PC transients out of the scope of this unit.

Second, it has usually been assumed that the PC decay process senses the carrier concentration decay while the mobilities remain constant parameters. Such an assumption may be made in the great majority of actual cases. However, it cannot be true at high injection, where electron-hole scattering reduces the mobility substantially, for example, in high-resistivity Si (Grivickas et al., 1984). A few papers have also reported that carrier mobility changed during PC transients in high-resistivity polycrystalline and compound semiconductors because of the effective recharging of scattering clusters. We do not attempt to provide an exhaustive accounting of these effects.
As explained earlier (see Theory), in an extrinsic semiconductor under low injection conditions, we shall assume that the decay of excess carrier concentration is always controlled by minority carriers through SRH recombination and either a direct band-to-band or an Auger-type recombination mechanism.
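In that linear-recombination limit the transient is a single exponential, Δσ(t) = Δσ(0) exp(−t/τ), so the lifetime can be read off as the slope of ln(Δσ) vs. t. A minimal sketch, assuming noise-free single-exponential data (the function name is invented):

```python
import math

def lifetime_from_decay(times_s, dsigma):
    """Least-squares slope of ln(dsigma) vs. t; returns tau in seconds."""
    n = len(times_s)
    y = [math.log(s) for s in dsigma]
    t_mean = sum(times_s) / n
    y_mean = sum(y) / n
    num = sum((t - t_mean) * (yy - y_mean) for t, yy in zip(times_s, y))
    den = sum((t - t_mean) ** 2 for t in times_s)
    return -den / num   # slope = -1/tau, so tau = -1/slope
```

Real transients may contain trapping tails or a nonexponential high-injection portion, in which case only the asymptotic linear region of the semilog plot should be fitted.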
Practical Aspects of the Standard PC Decay Method

By the standard technique, PC decay is monitored through the change of sample conductance in a constant electric field. Current is passed through a monocrystalline semiconductor specimen by means of ohmic contacts. The experimental arrangement in its simplest form is shown in Figure 16. The sample is bar-shaped, with dimensions typically l ≫ d, w, and has ohmic contacts on its end faces. The light beam is oriented normal to the applied electric field (a case of transverse PC). The intensity decreases with depth in the sample according to Beer's law. Neglecting multiple reflection, the photoconductance of the whole sample can be obtained (Ryvkin, 1964) as

ΔG = (w/l) q (μ_n + μ_p) ∫_0^d Δn(x) dx ∝ (w/l) q (μ_n + μ_p) I_0 [1 − exp(−kd)]   (35)

Figure 16. Basic experimental setup for photoconductive decay measurements in a constant electric field. Excess carriers are monitored as an increase in sample conductance.

where I_0 is the intensity penetrating the front surface of the sample and k is the absorption coefficient. For sufficiently thick samples (kd ≫ 1), all the light energy is absorbed in the sample and the photoconductance is independent of the absorption coefficient. In this case, however, the carrier distribution in the x direction is highly inhomogeneous. In addition to the transverse PC (Fig. 16), the longitudinal PC is sometimes investigated. In this type of experiment, a semitransparent electrode is illuminated with the light beam parallel to the field. In general, the longitudinal case has a more complex relationship with fundamental semiconductor parameters. We confine our discussion in this unit to transverse photoconductivity.

As shown in Figure 16, the sample is connected in series with a battery supplying a voltage V and a load resistance R_L. Consequently, on interrupting the light, the current in the circuit has a constant component and an alternating one. The voltage drop may be monitored with an oscilloscope or a signal averager between the ohmic contacts (or between a second pair of potential contacts in a four-terminal measurement). If, in general, the voltage drop is monitored on the load resistance R_L, a rather complex relationship between the alternating voltage received at the amplifier input and the variation of ΔG can be expected under the action of light. In some special cases, this relationship is simpler (Ryvkin, 1964). For example, if a small load resistance is used, R_L ≪ (G_0 + ΔG)^(-1), where G_0 is the equilibrium conductance, then the relationship is linear because the illumination does not greatly alter the electric field distribution in the sample and the load resistance. This case is sometimes called the constant-field regime. The resistor must be nonreactive, with a resistance at least 20 times less than that of the sample in the excited state, to provide a condition of essentially constant field.

CARRIER LIFETIME: FREE CARRIER ABSORPTION, PHOTOCONDUCTIVITY, AND PHOTOLUMINESCENCE

As described by Ryvkin (1964), two other regimes, the constant-current regime, R_L ≫ (G_0 + ΔG)^(-1), and the maximum-sensitivity regime, R_L = (G_0)^(-1)[1 + ΔG/G_0]^(-1/2), can also frequently be used. While the last two regimes do not, in general, imply proportionality between the signal and ΔG, approximate linearity is obtained when the relative change of the photoconductance is small (ΔG/G_0 ≪ 1). Thus, a constant-current regime is recommended for minority carrier lifetime measurements in bulk Si and Ge by the ASTM standard F28 (ASTM, 1987). The threshold of ΔG measurement sensitivity depends on the various types of noise superimposed on the measured signal and on the type of recording instrument. The minimum detectable photoconductance is also governed by the value of the equilibrium conductance, G_0, other conditions being equal (Ryvkin, 1964).

PC Method Automation for the High-Frequency Range
Ordinary PC transient measurements are quite restricted on a nanosecond time scale because the capacitance and/or inductance of conventional electrical circuit elements produces spurious effects. More recently, PC lifetime measurements in the 1- to 15-GHz frequency range have been developed through strip-line techniques driving so-called Auston switches (Lee, 1984). Coplanar metal waveguides are monolithically integrated on the semiconductor sample to provide a characteristic impedance of 50 Ω, with a corresponding spacing (the photoconductive gap) a few micrometers wide. Free carriers are generated in the photoconductive gap by short femtosecond laser pulses and accelerated in an applied dc bias field, producing an electrical transient. The output of the circuit can be measured with a fast sampling oscilloscope via high-frequency connectors attached to the striplines. In this case, lifetime resolution down to 30 ps can be achieved. Sampling experiments can also be performed without the need for an expensive fast oscilloscope. In this case, two photoconductive switches, one serving as a gate and having a very fast response function, are connected in parallel and excited with delayed optical pulses. The temporal profile of an incident signal voltage pulse, vsign(t), can be measured by scanning the time delay, τ, between the signal pulse and the gate pulse. The sampling yields a dc current given by the signal correlation

I(τ) = ∫ vsign(t) fsamp(t − τ) dt    (36)
where fsamp is the known sampling function corresponding to the optical gate pulse. Relatively inexpensive lock-in amplifiers can measure the dc currents, provided that the colliding optical pulses are chopped at low frequency. The lifetime can be estimated from mathematical simulations of the correlation function I(τ). The time resolution is limited by the duration of the sampling function of the gate switch. The ideal sampling function is a delta-function pulse, and this has motivated the fabrication of gate switches made from semiconductor materials with carrier lifetimes reduced into the subpicosecond range (Smith et al., 1981). Therefore, the gate sample in this case should be regarded as a distributed circuit element, and the propagation of the wave through the photoconductive gap should be properly treated (Lee, 1984). Four factors have been shown to be of special importance. These are the free-carrier lifetime, local dynamic screening, velocity overshoot, and the
ELECTRICAL AND ELECTRONIC MEASUREMENTS
carrier transit time from the photoconductive gap (Jacobsen et al., 1996). High-frequency measurements usually involve high excitation levels and high electric fields. In these cases, carrier recombination is affected by the fundamental processes of carrier momentum randomization, carrier thermalization, and energy relaxation in the bands (Othonos, 1998). At very high injection levels, the PC of semiconductors usually depends nonlinearly on the light intensity. The most probable cause of any nonlinearity in a photoconductor is related to the carrier lifetime. Pasiskevicius et al. (1993) proposed the use of PC correlation effects for measuring nonlinearities in ultrafast photoconductors. In this approach, correlation signals are recorded on a single switch using a double-pulse excitation, with the two pulses scanned relative to one another. A peak will be observed on the autocorrelation traces when the photocurrent is a superlinear function of optical power, and a dip when this function is sublinear. The widths of these extrema are determined by the duration for which the switch remains in the nonlinear state. The typical experimental setup used by Jacobsen et al. (1996) is sketched in Figure 17. The pulses (100 fs) of a mode-locked Ti:sapphire laser are divided into two beams, with one beam passing through a delay line. Both beams are mechanically chopped and focused onto a biased photoconductive switch, and the relative arrival time is varied by mechanically moving the delay line. The current flowing in the photoconductive gap is measured using a current-sensitive preamplifier, and the signal is sent to a lock-in amplifier. The preamplifier operation is slow compared to the time scale of the carrier dynamics, so a time average of the current is measured. In order to suppress noise when performing a photocurrent correlation measurement, the two beams are chopped at different frequencies f1 and f2, and the lock-in amplifier is referenced to the difference frequency |f1 − f2|.
In this way, the correlation measurement integrates only the product of the carrier
density and the local electric field that corresponds to both pulses. Therefore, a rapid recharging of the photoconductive switch, i.e., a short recombination lifetime, is required. Lifetimes as short as 200 fs have been measured using this correlation scheme. Additional information on PC processes can be obtained from a set of compatible electro-optic techniques based on the Pockels, Kerr, and Franz-Keldysh effects, as reviewed by Cutolo (1998). Detailed protocols for minority-carrier lifetime determination in bulk Si and Ge by the classical PC technique are provided in F28 of the Annual Book of ASTM Standards (ASTM, 1987). Other applicable ASTM documents are E177, Recommended Practice for Use of the Terms Precision and Accuracy as Applied to Measurement of a Property of a Material (ASTM, 1987, Vol. 14.02), and F43, Test Methods for Resistivity of Semiconductor Materials (ASTM, 1987, Vol. 10.05). The specimen resistivity should be uniform. The lowest resistivity value should not be

In the Landau description, the paramagnetic minimum at M = 0 is stable for H = 0 when A(T) > 0. On the other hand, a stable spontaneous magnetization is predicted for H = 0 when A(T) < 0. This spontaneous magnetization is

M² = −A(T)/(2B(T))    (10)
In the presence of a field, and for the case A(T) < 0, this gives rise to a stable spontaneous magnetization, which is described by the equation of state (see Equation 7). This suggests that plotting different isotherms of M² versus H/M allows for the determination of A(T) and B(T). Moreover, since A(TC) = 0, i.e., A vanishes at the Curie temperature, TC, the Curie temperature can be determined as the isotherm with 0 as its intercept. Plots of M² versus H/M isotherms are called Arrott plots and are discussed below as a method for determining the magnetic ordering temperature from magnetization measurements.
μ0H = 2A(T)M + 4B(T)M³    (6)

that results in the expression

M² = −A(T)/(2B(T)) + μ0H/(4B(T)M)    (7)
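The Arrott-plot procedure can be sketched with synthetic mean-field data; the coefficients a and b, the temperature grid, and TC = 630 K below are assumed purely for illustration (with μ0 set to 1):

```python
# Illustrative Arrott-plot construction with synthetic mean-field data.
# The equation of state H = 2*A(T)*M + 4*B*M^3 gives straight isotherms in
# M^2 versus H/M; the isotherm whose intercept vanishes marks Tc.
import numpy as np

Tc, a, b = 630.0, 1.0, 2.0            # hypothetical parameters
def A(T): return a * (T - Tc)         # Landau coefficient, vanishes at Tc
B = b

intercepts = {}
for T in (610.0, 620.0, 630.0, 640.0, 650.0):
    M = np.linspace(2.5, 4.0, 50)                # magnetization values
    H = 2 * A(T) * M + 4 * B * M ** 3            # equation of state
    x, y = H / M, M ** 2                         # Arrott-plot coordinates
    slope, intercept = np.polyfit(x, y, 1)       # isotherms are straight
    intercepts[T] = intercept                    # equals -A(T)/(2B)

# The isotherm whose linear extrapolation passes through the origin is Tc:
Tc_est = min(intercepts, key=lambda T: abs(intercepts[T]))
```

Each fitted intercept equals −A(T)/(2B), positive below TC and negative above it, so scanning for the zero-intercept isotherm recovers the ordering temperature.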
By invoking a microscopic model of the magnetization (of which there are many), one can determine the coefficients A(T) and B(T) specifically. For example, in the Stoner theory of itinerant ferromagnets, the magnetic equation of
Figure 2. Magnetization curve for a metamagnetic response. M1 is the magnetization in the paramagnetic state; M2 is the saturation magnetization in the metamagnetic state.
MAGNETISM AND MAGNETIC MEASUREMENTS
Table 1. Ordering Temperatures for Some Selected Magnetic Materials

Ferromagnets     Curie Temperature (TC), K
Co               1388
Fe               1043
Ni                627
Gd                289

Ferrimagnets     Curie Temperature (TC), K
Fe3O4             858
MnFe2O4           573
Y3Fe5O12          560
γ-Fe2O3           948
Another free energy minimum exists, which gives rise to a so-called metamagnetic state. If A(T) > 0 but small, a Pauli paramagnetic state is predicted in zero field. However, if B(T) < 0, it is possible to have a second minimum in the Helmholtz free energy at M ≠ 0. This minimum may in fact be deeper than that at M = 0. In such a case, application of a field causes the system to choose the second minimum (at M2), giving rise to an M(H) curve as depicted in Figure 2 for this so-called metamagnetic response.
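A minimal numerical sketch of this metamagnetic jump follows; the Landau coefficients, including a stabilizing CM⁶ term that bounds the free energy when B < 0, are assumed for illustration:

```python
# Toy metamagnet: with A > 0 (small), B < 0, and a stabilizing C*M^6 term,
# the global free-energy minimum jumps from small M to large M at a
# critical field, as sketched in Figure 2. Coefficients are hypothetical.
import numpy as np

A, B, C = 1.2, -2.0, 1.0                  # assumed Landau coefficients; B < 0
M = np.linspace(0.0, 2.0, 20001)

def M_eq(H):
    """Magnetization at the global minimum of F = A M^2 + B M^4 + C M^6 - H M."""
    F = A * M**2 + B * M**4 + C * M**6 - H * M
    return M[np.argmin(F)]

# Below the critical field the paramagnetic (small-M) minimum wins; above it
# the system jumps to the deep minimum at large M.
M_para, M_meta = M_eq(0.05), M_eq(0.5)
```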
CRITICAL EXPONENTS IN MAGNETIC PHASE TRANSITIONS

One of the goals of thermodynamic treatments of magnetic phase transitions is to determine the critical exponents associated with the phase transition. This involves describing power-law exponents for the temperature dependence of thermodynamic quantities as the ordering transition is approached. Determination of critical exponents and scaling laws allows closed-form representations of thermodynamic quantities of significant usefulness in the prediction and/or extrapolation of thermodynamic quantities (Callen, 1985). To describe such scaling-law behavior, the reduced temperature, ε, is defined as

ε = (T − TC)/TC    (11)

which approaches 0 as T → TC from below or above. For T > TC and H = 0, the specific heat (C, which can be CM or CH) and the isothermal susceptibility obey the scaling laws

C ~ ε^(−α)  and  χT ~ ε^(−γ)    (12)

For T < TC, the specific heat, the magnetization, and the isothermal susceptibility obey the scaling laws

C ~ (−ε)^(−α′),  χT ~ (−ε)^(−γ′),  and  M ~ (−ε)^β    (13)

At TC the critical isotherm is described by

H ~ |M|^δ    (14)
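A sketch of how such a scaling exponent is extracted in practice, using synthetic susceptibility data (the TC value and amplitude below are assumed):

```python
# Extracting the susceptibility exponent gamma from chi_T ~ eps^(-gamma)
# data by a log-log fit, for T -> Tc from above. Synthetic, assumed values.
import numpy as np

Tc, gamma_true = 630.0, 1.0                      # hypothetical
T = np.linspace(631.0, 660.0, 30)
eps = (T - Tc) / Tc                              # reduced temperature
chi = 5.0 * eps ** (-gamma_true)                 # synthetic chi_T data

slope, logC = np.polyfit(np.log(eps), np.log(chi), 1)
gamma_est = -slope                               # recovered exponent
```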
Thermodynamic arguments, such as those of Rushbrooke (see Callen, 1985), allow us to place restrictions on the critical exponents. For example, defining

αH = (∂M/∂T)H    (15)
Table 1 (continued)

Antiferromagnets   Néel Temperature (TN), K
NiO                600
Cr                 311
Mn                  95
FeO                198

Helimagnets        Ordering Temperature (TH), K
MnO2                84
MnAu2              363
Er                  20
and using thermodynamic Maxwell relations, it is possible to show that

χT(CH − CM) = TαH²    (16)

which implies

CH ≥ TαH²/χT    (17)
This then requires that, as T → TC from below,

α′ + 2β + γ′ ≥ 2    (18)
Furthermore, if CM/CH approaches a constant (≠ 1), then two more inequalities may be expressed:

α′ + β(1 + δ) ≥ 2  and  γ′ ≥ β(δ − 1)    (19)
In the Landau theory, it can be determined that

α′ = α = 0,  β = 1/2,  γ′ = γ = 1,  and  δ = 3    (20)
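The Landau values can be checked directly against the restrictions quoted above; they saturate the Rushbrooke and Griffiths inequalities as equalities:

```python
# Check that the Landau exponents (Equation 20) satisfy the exponent
# relations of Equations 18 and 19 as equalities.
alpha_p, beta, gamma_p, delta = 0.0, 0.5, 1.0, 3.0

assert alpha_p + 2 * beta + gamma_p == 2.0       # Rushbrooke: >= 2
assert alpha_p + beta * (1 + delta) == 2.0       # Griffiths:  >= 2
assert gamma_p == beta * (delta - 1)             # Widom-type relation
```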
If we consider a vector magnetization and allow for a spatially varying local magnetization, then the Landau theory must be extended to a more complicated form. Further, it is often necessary to add terms in (∇M)² to the energy functional. In such cases, TC = TC(k) and χ = χ(k) can be taken as reciprocal-space expansions, and the spatial dependence of the susceptibility, χ(r), can be determined as the Fourier transform of χ(k). In these cases a correlation length, ξ, for the magnetic order parameter is defined, which diverges at TC. Further discussion of the spatial dependence of the order parameter is beyond the scope of this unit. Table 1 summarizes ordering temperatures for selected materials with a variety of types of magnetic order. These various local magnetic orders and the exchange interactions that give rise to them are discussed in TECHNIQUES TO MEASURE MAGNETIC DOMAIN STRUCTURES. For most of that discussion, we will consider ferromagnetic ordering only.
LITERATURE CITED

Ausloos, M. and Elliott, R. J. 1983. Magnetic Phase Transitions. In Springer Series in Solid-State Sciences, Vol. 48 (M. Cardona, P. Fulde, K. von Klitzing, and H.-J. Queisser, eds.). Springer-Verlag, New York.
Callen, H. B. 1985. Thermodynamics and an Introduction to Thermostatistics. John Wiley & Sons, New York.

Gignoux, D. 1995. Magnetic Properties of Metallic Systems. In Electronic and Magnetic Properties of Metals and Ceramics, Materials Science and Technology: A Comprehensive Treatment, Vol. III (K. H. J. Buschow, ed.). VCH, Weinheim.
KEY REFERENCES

Ausloos and Elliott, 1983. See above.
Collection of articles dealing with statistical mechanics and the theory of magnetic phase transitions.

Callen, 1985. See above.
Advanced undergraduate or graduate text on thermodynamics. This is notable for its discussion of magnetic work terms, phase transitions, critical phenomena, etc., as well as its excellent exposition of classical thermodynamics and statistical mechanics.

Mattis, D. C. 1981. The Theory of Magnetism. In Springer Series in Solid-State Sciences, Vol. 17 (M. Cardona, P. Fulde, K. von Klitzing, and H.-J. Queisser, eds.). Springer-Verlag, New York.
Advanced text on the theory of magnetism with a strong basis in the quantum mechanical framework. Much interesting historical information.
MICHAEL E. MCHENRY DAVID E. LAUGHLIN Carnegie Mellon University Pittsburgh, Pennsylvania
samples), and phase transitions. For paramagnetic systems, the total moment at high fields or the temperature dependence of the magnetic susceptibility can yield a measure of the moment per chemical constituent or the magnetic impurity concentration, whether introduced purposely or present as impurities. The temperature dependence of the moment yields information on interactions between the paramagnetic ions or with the lattice. Certain features of the electronic structure of metallic materials, such as the density of states at the Fermi surface, can be determined from the magnetic susceptibility. For superconductors, high-field magnetization can provide information about the critical field and the critical current density. This unit is restricted to the centimeter-gram-second (cgs) system of units for the magnetic moment, which is given in electromagnetic units (emu), where 1 emu = 10⁻³ A·m² in the International System (SI) of units. To determine the magnetization M (emu/cm³), where B = H + 4πM, multiply the moment per gram (emu/g) by the density (g/cm³). The magnetic moment, magnetic field (H), magnetic induction (B), and associated conversion factors may be found in MAGNETIC MOMENT AND MAGNETIZATION of this part. Recent reviews of magnetometry are given by Foner (1994) and by Flanders and Graham (1993), and recent developments are often published in the Review of Scientific Instruments. Proceedings of the annual Intermag Conferences appear in a special fall issue of the IEEE Transactions on Magnetics, and those of the annual Magnetism and Magnetic Materials Conference generally appear in a special spring issue of the Journal of Applied Physics.
MAGNETOMETRY

INTRODUCTION

All materials possess a magnetic moment; techniques for measuring that bulk macroscopic property are defined here as magnetometry. This unit reviews the most common techniques for measuring the total magnetic moments of small samples (volume ≤ 1 cm³ and/or mass ≤ 1 g). Several factors contribute to the bulk magnetic moment of a sample. Essentially all materials show weak diamagnetism from the filled electronic core states of the atoms. Metallic materials show an additional contribution to their diamagnetism from the orbital motion of the otherwise degenerate spin-up and spin-down conduction electrons, as well as Pauli paramagnetism from splitting of the conduction bands. The largest contribution to the magnetic moment of materials comes from unpaired localized spins of the elemental constituents: unpaired 3d (and sometimes 4d and 5d) electrons in the case of the transition metal series, 4f electrons for the rare earths, and 5f electrons for the actinide constituents. It is important for many reasons to know a material's magnetic moment, a thermodynamic quantity. For strongly magnetic materials, it is probably the most useful and important physical property, determining their utility in applications. For magnetically ordered systems, magnetic moment measurements provide information about spin structure, anisotropy (in the case of nonpolycrystalline
PRINCIPLES OF THE METHOD

There are two direct techniques for measuring the magnetic moment of materials: one utilizes the detection of a change in the magnetic flux produced by a sample, and the other utilizes the detection of a change in the force experienced by a sample. An analog device, a transducer, couples the change in flux or force to electronic circuitry for signal processing and readout, usually in the form of a direct-current (dc) voltage proportional to the sample moment. The transducer is usually a configuration of pickup coils for flux-based measurements; for force measurements, the transducer is most likely an electromechanical balance or a piezoelectric device. Often a magnetometer system operates in a null-detection mode using feedback, which is proportional to the magnetic moment of the sample being measured. Flux measurements are usually performed with a configuration of pairs of series-opposing pickup coils. Most of these magnetometers utilize the strategy of moving the sample with respect to the pickup coils in order to measure the moment of the sample while minimizing contributions from the background magnetic field. The electromotive force (emf) induced in a pickup coil is generated by a temporal change in magnetic flux, obeying Faraday's law, E (V) = −N × 10⁻⁸ (dΦ/dt), where N is the number of turns in the coil and the magnetic flux Φ (G·cm²) = Ba = (H + 4πM)a, a being the effective area of the coil
All force magnetometers depend on the difference in energy experienced by the sample in a magnetic field gradient. Figure 2 is a schematic representation of four force-related magnetometers: the traditional Faraday, the cantilever, the piezoresistive, and the alternating-gradient magnetometers. In general, assuming that the (spherical) sample is homogeneous, isotropic, and in vacuum, the force for a gradient along the z axis is given by

F (dyn) = nχH(dH/dz)    (1)

if χ, the magnetic susceptibility, is field independent, i.e., M = χH. Here M is the magnetization per cubic centimeter and n is the volume (in cubic centimeters). If the sample is ferromagnetic and above saturation, M = Ms, the force is given by

F = nMs(dH/dz)    (2)
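A numerical sketch of these force relations (the field, gradient, and susceptibility values are illustrative assumptions; cgs units throughout):

```python
# Sketch of the force-magnetometer relations; all numerical values below
# are assumed for illustration, in cgs units.

def force_linear(n, chi, H, dHdz):
    """F (dyn) = n * chi * H * dH/dz for a field-independent susceptibility."""
    return n * chi * H * dHdz

def force_saturated(moment, dHdz):
    """Force on a saturated sample, written via its total moment m = n*Ms."""
    return moment * dHdz

# A saturated 10-mg Ni sphere (total moment 0.55 emu at room temperature)
# in an assumed 100 G/cm gradient:
F = force_saturated(0.55, 100.0)    # dynes
```

The resulting 55 dyn corresponds to an apparent mass change of roughly 56 mg on a balance, a comfortably measurable force.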
If the sample is inherently anisotropic or has shape anisotropy, effects of the x and y gradients can contribute to
Figure 1. Schematic representation of three flux-based magnetometers and the time dependence of the sample motion for each. The planes of the coil windings are perpendicular to the field for all these examples (axial configuration). The figures show the cross-section of the coils (not to scale) and the dots indicate coil polarity. The pickup coils of the extraction and vibrating sample magnetometers are multiturn, but those for the SQUID magnetometer are constructed with very few turns to minimize inductance and maximize coupling to the low-impedance SQUID transducer.
normal to B. A change in the magnetic flux can also be induced in a sample by changing an externally controlled variable such as the magnetic field strength or the temperature. Figure 1 is a schematic representation of three flux-based magnetometers: the extraction, vibrating-sample, and superconducting quantum interference device (SQUID) magnetometers. In each case the sample is moved relative to series-opposing pickup coils, which detect the flux changes. The oscillatory sample motion for the vibrating-sample magnetometer is very small and rapid, the motion for the extraction magnetometer is stepwise, slower, and confined to the coil dimensions, and in general the SQUID motion extends beyond the coil dimensions to measure the peak-to-peak flux change. The pickup coils of the first two magnetometers are multiturn in order to maximize the induced voltage, but for the SQUID the pickup coils have only a few turns in order to minimize the inductance and maximize the coupling to the extremely low impedance SQUID transducer.
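The Faraday-law expression above can be exercised numerically; the turn count, coupled-flux amplitude, and vibration frequency below are assumed, illustrative values:

```python
# Faraday's law for an N-turn pickup coil, E (volts) = -N * 1e-8 * dPhi/dt,
# evaluated for a sinusoidally modulated flux Phi(t) in G-cm^2.
# All numbers are assumed for illustration.
import numpy as np

N = 1000            # turns (assumed)
Phi0 = 10.0         # flux amplitude coupled from the sample, G-cm^2 (assumed)
f = 80.0            # modulation frequency, Hz (typical VSM range)

t = np.linspace(0.0, 0.05, 20001)
Phi = Phi0 * np.sin(2 * np.pi * f * t)
emf = -N * 1e-8 * np.gradient(Phi, t)    # induced voltage, volts

# Peak emf = N * 1e-8 * 2*pi*f * Phi0, about 0.05 V for these numbers
emf_peak = np.abs(emf).max()
```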
Figure 2. Schematic representation of four force-based magnetometers and the time and spatial field variation for each. In all cases, a feedback mechanism is often used to allow null detection of the sample moment. The Faraday magnetometer generally accommodates somewhat larger samples than the other three systems. The piezoresistive element is indicated by the resistor and the piezoelectric sensors for the AGM are shown as thick lines.
the measured force. Those components of the gradient are always present (since ∇·B = 0). For force measurements, one needs to know the field and the gradient and also to be able to reproduce the position of the sample accurately. The sensitivity is a strong function of field and gradient. Samples with high anisotropy or a highly nonspherical shape can be problematic because of the transverse field gradients. Wolf (1957) has shown that, under certain conditions, even the sign of the measured force can be incorrect. External magnetic fields are usually required for magnetometers (see GENERATION AND MEASUREMENT OF MAGNETIC FIELDS). The two common sources of external fields for magnetometry are iron-core electromagnets, which provide fields up to 20,000 G (2 T), and superconducting solenoids, which provide much higher fields, up to 150,000 G (15 T) or more. For many purposes the field from an electromagnet is ample. Electromagnets can be turned on in a few minutes and furnish easy access to the field volume for manipulation and modifications. Superconducting solenoids generally require liquid helium for operation, take several hours to cool to operating temperature, and offer more restricted access to the experimental volume. Bitter solenoids (resistive solenoids that can provide fields to 30 T) and hybrid magnets (Bitter solenoids surrounded by superconducting solenoids, providing even larger fields) are accessible at national laboratory facilities. Such user-oriented facilities include those in the United States, France, Japan, The Netherlands, and Poland. It is important for the experimenter to estimate the sample moment prior to measurement because this may dictate which technique is suitable. A 10-mg polycrystalline Ni spherical calibration sample will display a moment of 0.55 emu at saturation (above 6000 G) at 300 K and 0.586 emu at T = 4.2 K. A 10-mg pure Pd sample will have a moment of 0.007 emu at the same field and temperature.
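Estimating an expected moment before a measurement is simple arithmetic; a sketch using the specific moments implied by the 10-mg figures above:

```python
# Pre-measurement moment estimate: total moment = mass x specific moment.
# Specific moments are inferred from the 10-mg examples quoted in the text.

def moment_emu(mass_g, sigma_emu_per_g):
    """Total moment (emu) = mass (g) * specific moment (emu/g)."""
    return mass_g * sigma_emu_per_g

m_Ni_300K = moment_emu(0.010, 55.0)    # 10-mg Ni at 300 K  -> ~0.55 emu
m_Ni_4K = moment_emu(0.010, 58.6)      # 10-mg Ni at 4.2 K  -> ~0.586 emu
m_Pd = moment_emu(0.010, 0.7)          # 10-mg Pd           -> ~0.007 emu
```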
The moment of a superconducting sample or a ferromagnetic material depends not only on the applied field and temperature but also on the field history of the sample. A 10-mg bulk Pb superconducting sample cooled below its transition temperature in zero field will show a moment of 0.007 emu (diamagnetic) in 100 G, but it will have a much smaller moment above the critical field. A given magnetometer can cover the range from a few emu down to 10⁻⁶ emu or less. It should be stressed that ultimate sensitivity may not always be the most important criterion, given other constraints. Clearly, a sample moment of 1 emu is fairly large and easily measured.

PRACTICAL ASPECTS OF THE METHOD

Flux Magnetometers

Flux-Integrating Magnetometer. The basic instrument for detecting a change in sample moment, or magnetic flux change φ, in a pickup coil is the fluxmeter, essentially an operational amplifier with capacitive feedback C (Flanders and Graham, 1993). A pickup coil having N turns and a resistance Rc, in which the sample is located, is connected in series with the fluxmeter input resistance Ri. The fluxmeter integrates the voltage change given
by Faraday's law, ∫V (volts) dt = −N × 10⁻⁸ Δφ (G·cm²). The sensitivity of a direct-reading fluxmeter depends on N and on the time constant (which involves C, Ri, and Rc), but it is also limited by drift at the input, often generated by thermal emfs. The sensitivity and utility depend heavily on coil design and on the values chosen for Ri and C. Commercially available high-speed digital voltmeters effectively replace the operational-amplifier-based fluxmeter and deal more effectively with the drift problem. Variation of the integration time generally involves a trade-off between lower noise and higher drift (longer integration times) or lower drift and a higher noise level (shorter integration times). Krause (1992) described a magnetometer that used a stepping motor to move the sample in a few seconds between opposing pickup coils in series [the same coils used for alternating-current (ac) susceptometry], so that the instrument carried out both dc and ac measurements. An extraction magnetometer works on the principle of relatively rapid sample removal from, or insertion into, the interior of a pickup coil.

Vibrating-Sample Magnetometer. Commercially available since the mid-1960s, the vibrating-sample magnetometer (VSM) was first described by Foner (1956). Most commonly, the sample is mechanically vibrated in a zero or uniform applied field. Measurements at elevated temperatures (>300 K) require a thermally shielded oven in the field volume. The furnace should be noninductively wound to minimize background effects. In addition, sample mounts and support elements must be chosen to withstand these temperatures and to add minimal magnetic contributions.
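The flux integration performed by a fluxmeter can be sketched numerically; the turn count and the synthetic 200 G·cm² flux step below are assumed:

```python
# A fluxmeter integrates the coil voltage: integral(V dt) = -N * 1e-8 * dPhi,
# with Phi in G-cm^2. Here a synthetic voltage pulse is integrated to
# recover the (assumed) flux change of 200 G-cm^2.
import numpy as np

N = 500                                  # pickup-coil turns (assumed)
t = np.linspace(0.0, 1.0, 10001)         # seconds
dt = t[1] - t[0]

dPhi_true = 200.0                        # assumed flux change, G-cm^2
# Hypothetical coil voltage while the sample is extracted (Gaussian pulse
# whose time integral equals -N * 1e-8 * dPhi_true):
V = -N * 1e-8 * dPhi_true * np.exp(-((t - 0.5) / 0.05) ** 2) / (0.05 * np.sqrt(np.pi))

dPhi = -1e8 / N * np.sum(V) * dt         # numerically integrate V dt
```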
METHOD AUTOMATION

Several turnkey automated dc magnetometer systems are commercially available, and more are under development. These systems use desktop computers for convenient data acquisition, control of independent variables such as temperature and magnetic field, data analysis, and real-time display of data for on-line assessment. Most commercial SQUID systems are fully automated with respect to magnetic field and temperature programming (using gas-flow control), and typical experimental runs may extend for days without intervention. More recently, some vendors have provided similar automation for VSM systems, which generally offer somewhat higher external magnetic fields. The inherent instability of the sample position in force-detection magnetometers implies that temperature control may be more difficult; thus gas-flow temperature control is generally not employed in these systems.
DATA ANALYSIS AND INITIAL INTERPRETATION

The following simulation, using isothermal magnetization data for a reference (calibration) sample and a test material, serves to illustrate the method for analyzing magnetic moment data. In this case, a small, pure, polycrystalline Ni sphere is used as the reference sample, with data taken at 4.2 K; refer to Figure 3. At T = 4.2 K, the saturation moment of Ni is 58.6 emu/g. (It is 5% smaller at room temperature.) Therefore, the measured moment at a field sufficient to saturate the
Figure 3. Simulation showing how the saturation moment of a polycrystalline Ni sphere may be used to calibrate the magnetization of an unknown sample, X.
magnetization, SNi, will be SNi = K mNi σNi, where K is a constant of proportionality dependent on the pickup-coil geometry, mNi the mass, and σNi (= 58.6 emu/g) the saturation moment. This provides the calibration of the signal, S, with which all other samples can be measured. From Figure 3, the test point (Sx = 0.4 units) yields a total moment of 0.088 emu [= (0.4/2.75) × 0.604], or σx = 3.75 emu/g using mx = 2.34 × 10⁻² g. Sample X, the test sample, represents a material that appears to have a metamagnetic transition near room temperature at about H = 1 T. Both the reference sample and the test sample are presumed to be small relative to the pickup coils in the case of a flux-detection magnetometer (sample dimensions less than about half the inner dimension of the pickup coil). For force-detection magnetometers, (1) the sample dimensions should be small relative to the change in the field and field gradient, and (2) this procedure assumes a constant field gradient; otherwise a field-dependent correction for the gradient is required. Although this illustration involves isothermal magnetization, isofield magnetization as a function of temperature
(which in the limit of zero field is the magnetic susceptibility) can also be used to calibrate an unknown sample moment, provided both the temperature and the moment are known. Corrections for larger samples (causing K to be slightly different in the two cases) may be made; the reader is referred to Zieba and Foner (1982).

PROBLEMS

System Automation

Automation and computer control of turnkey instruments tend to produce the trusting "black box" syndrome. The appropriate response is to trust, and then to verify by regularly making measurements with the standard calibration sample, especially before and after a given set of measurements. A record of the readouts for the standard is useful for observing whether the system is degrading with time. Temperature sensors and amplifiers can drift with time; the calibrations of temperature and field should also be tested periodically. Several issues that arise from system automation deserve attention, including the possibility of overshooting the target temperature and/or field. The system control may cause the sample temperature to overshoot (or undershoot, if decreasing) relative to the target temperature. In systems that are magnetically reversible (nonhysteretic), paramagnetic, or weakly diamagnetic, this is not a problem, but for ferromagnetic or superconducting systems with magnetic hysteresis it can yield spurious results. Overshooting or undershooting the external magnetic field relative to the target can also be problematic.

Contamination of Sample Chamber

It is important to ensure that the sample chamber in the magnetometer is contaminant free, and to this end the volume should be cleaned at convenient intervals. A few of the most common contaminants are discussed below. Generally, the most ubiquitous source of contamination is Fe in the form of small particles or oxides (dirt) that may stick to the sample or container. An Fe film [1 nm (10 Å) over 1 cm²] left by a machine tool on a sample mount will show a moment of 2 × 10⁻⁴ emu at saturation. Iron contamination is easily observed as a nonlinearity in the magnetization as the sample is magnetized. Most surface Fe can be removed easily by etching in a dilute HCl solution. A common source of contamination at low temperatures and fields (T < 6 K and H < 1000 G) is superconducting solder. Superconducting contaminants such as solder can produce spurious diamagnetic or paramagnetic signals, depending on their location relative to the transducer and on their history of exposure to external fields. The use of "nonmagnetic" stainless steels in sample mounts or cryogenic vessels is common. However, most stainless steels are strongly paramagnetic, especially at low temperatures. If mechanically worked or brazed, some stainless steels can become ferromagnetic. Finally, for low-temperature measurements, it is important to ensure that no air (oxygen) is trapped in the
sample tube. Below 80 K, oxygen (O2) may condense on the sample or the sample mount. Oxygen is paramagnetic (spin S = 1) and becomes solid below 50 K, depending on the partial pressure. Oxygen can accumulate slowly with time through leaking sliding seals and during sample changes. The only solution to O2 contamination is to warm up the contaminated region, pump out the gas, and backfill with high-purity He gas. Nonsaturation of a Ni standard magnetization and/or erratic values of the Ni saturation moment may indicate solid O2 contamination. Unfortunately, the magnetism literature is replete with evidence of solid O2 contamination of data. For example, the amount of O2 gas in 1 mm³ of air at STP will yield a moment of 2 × 10⁻⁴ emu at T = 4.2 K and H = 10,000 G (1 T).

Sample Support Structure

All support structures for sample mounts will show magnetic moments. Their contributions can be minimized in several ways for flux techniques, including minimizing the mass of the sample container and mount, using a support rod of uniform cross-section, and extending this support rod symmetrically well beyond the pickup coils so that no flux change is detected as the rod itself is translated. As stated above, it is good practice to measure the moment of the empty sample mount from time to time. If it is generally

In the ultrathin regime the ellipticity rises linearly with the increasing amount of Co present within the penetration depth of the light. In the thick regime (>400 Å of Co) the signal approaches a constant value, since the absorption of light limits the depth sensitivity. In the intermediate regime the maximum in the ellipticity at 120 Å of Co is due to the reflectivity changing from being dominated by Cu to being dominated by Co. Similar behavior is observed in the Fe/Au system (Moog et al., 1990). It is interesting to note that the ellipticity is independent of crystal orientation in the thickness range studied. In this example we can see the difference between MOKE and SMOKE.
Traditionally, MOKE is defined as being independent of thickness, as opposed to the Faraday effect, which is proportional to thickness. For film thicknesses >400 Å we are in the traditional MOKE regime. But the initial linear rise of the magneto-optic signal is a characteristic of the SMOKE regime, which is encountered in the ultrathin limit. For a quantitative analysis, we applied the formalism derived in the Appendix to simulate the results. The refractive indices we used are obtained from tabulations in the literature (Weaver, 1999): nCu = 0.249 + 3.41i and nCo = 2.25 + 4.07i. We left the values of Q1 and Q2, where Q = Q1 + iQ2, for Co as free parameters to best fit the experimental curves and obtained Q1 = 0.043 and Q2 = 0.007. The calculated curves, depicted as solid lines in Figure 5, are in good overall agreement with the experimental data. In particular, the peaked behavior of the overlayers is faithfully reproduced. The ellipticities of three epitaxial Co/Cu superlattices also appear in Figure 5. The superlattices are [Co(16 Å)/Cu(28 Å)]n, grown on Cu(100), and [Co(11 Å)/Cu(31 Å)]n and [Co(18 Å)/Cu(35 Å)]n, both grown on Cu(111). Their ellipticities in Figure 5 appear as a function of the total superlattice thickness. The superlattice ellipticities initially increase linearly in the ultrathin region and saturate in the thick regime, although there is no maximum at intermediate thicknesses. The lack of a maximum is readily understood because the reflectivity is not evolving from that of Cu to that of Co, as in the overlayer cases above. Instead, the reflectivity remains at an average value between the two limits, since both Co and Cu remain within the penetration depth of the light, no matter how thick the superlattice becomes. Using the Q value obtained from the Co overlayers, the Kerr ellipticities for the superlattices were calculated and plotted in Figure 5. The agreement with experimental data is obvious and is discussed further below. To test the additivity law applicable to the ultrathin regime, the experimental data in Figure 5 were replotted in Figure 6 as a function of the magnetic Co layer thickness only (the Cu layers are ignored). In the ultrathin regime all the data lie on a single straight line. This result provides a demonstration and confirmation of the additivity law, which states that the total Kerr signal in the ultrathin regime is a summation of the Kerr signals from the individual magnetic layers and is independent of the nonmagnetic spacer layers. Despite the good overall semiquantitative agreement, the calculated ellipticity can be seen, upon close inspection, to exceed the experimental values in the ultrathin regime. For example, the calculated linear slope is 6.6 μrad/Å, while the experimental result yields only 4.3 μrad/Å. This systematic deviation can be due, for instance, either to the breakdown of the macroscopic description in the ultrathin region or to optical parameters (n and Q) that deviate from their bulk values.
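The additivity law can be stated compactly: in the ultrathin limit the signal depends only on the summed magnetic thickness. A toy sketch using the experimental slope quoted above (4.3 μrad/Å); the layer stacks themselves are illustrative assumptions:

```python
# Toy illustration of the ultrathin-limit additivity law: the Kerr signal
# depends only on the total magnetic (Co) thickness, not on the nonmagnetic
# (Cu) spacers. The slope is the experimental value quoted in the text.
slope_urad_per_A = 4.3       # urad per Angstrom of Co (from the text)

def kerr_ellipticity_urad(layers):
    """layers: list of (material, thickness_A); only magnetic layers count."""
    t_Co = sum(t for mat, t in layers if mat == "Co")
    return slope_urad_per_A * t_Co

overlayer = [("Co", 48.0)]                          # single Co film
superlattice = [("Co", 16.0), ("Cu", 28.0)] * 3     # [Co(16 A)/Cu(28 A)]x3

# Same total Co thickness -> same ultrathin-limit Kerr signal:
e1 = kerr_ellipticity_urad(overlayer)
e2 = kerr_ellipticity_urad(superlattice)
```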
Figure 6. The additivity law shows that the Kerr signal in the ultrathin regime depends on the magnetic layer only.
SURFACE MAGNETO-OPTIC KERR EFFECT
SAMPLE PREPARATION

Sample preparation is governed by the same considerations as in other areas of surface science and thin-film growth: vacuum conditions are maintained in the UHV realm, Auger spectroscopy (AUGER ELECTRON SPECTROSCOPY) is usually used to monitor surface cleanliness, and electron diffraction is used to monitor structural integrity. The evaporators might range from electron-beam to thermal designs and are limited only by the creativity of the fabricator and the ability to control and monitor the flux rates of the different chemical species.

PROBLEMS

Several factors need to be considered in the evaluation of technical problems associated with the SMOKE technique. Experimentally, instability of the laser intensity is probably the main contributor to noise in the SMOKE signal. It is highly recommended to use intensity-stabilized lasers and/or lock-in amplifiers for SMOKE measurements. A certain degree of vibration isolation is also important, as in any optical measurement. As for the interpretation of SMOKE signals, one has to keep in mind that although the SMOKE rotation or ellipticity is proportional to the magnetization, the proportionality coefficient depends on the optical properties of the materials and is usually an unknown quantity for ultrathin films. Therefore, SMOKE is not a form of magnetometry that can determine the absolute value of the magnetic moment. This is an important fact, especially in studies of spin dynamics, because the SMOKE dynamics will not necessarily reflect the spin dynamics.

ACKNOWLEDGMENT

Work supported by the U.S. Department of Energy, Basic Energy Sciences-Materials Sciences, under contracts DE-AC03-76SF00098 (at Berkeley) and W-31-109-ENG-38 (at Argonne).

LITERATURE CITED

Argyres, P. N. 1955. Theory of the Faraday and Kerr effects in ferromagnetics. Phys. Rev. 97:334–345.

Bader, S. D. 1991. SMOKE. J. Magn. Magn. Mater. 100:440–454.

Bennett, H. S. and Stern, E. A. 1965. Faraday effect in solids. Phys. Rev. 137:A448–A461.

Dillon, J. F., Jr. 1971. Magneto-optical properties of magnetic crystals. In Magnetic Properties of Materials (J. Smith, ed.). pp. 149–204. McGraw-Hill, New York.

Erskine, J. E. and Stern, E. A. 1975. Calculation of the M23 magneto-optical absorption spectrum of ferromagnetic nickel. Phys. Rev. B 12:5016–5024.

Falicov, L. M., Pierce, D. T., Bader, S. D., Gronsky, R., Hathaway, K. B., Hopster, H. J., Lambeth, D. N., Parkin, S. S. P., Prinz, G., Salamon, M., Schuller, I. K., and Victora, R. H. 1990. Surface, interface, and thin-film magnetism. J. Mater. Res. 5:1299–1340.

Hulme, H. R. 1932. The Faraday effect in ferromagnetics. Proc. R. Soc. A135:237–257.

Kiefl, R. F., Brewer, J. H., Affleck, I., Carolan, J. F., Dosanjh, P., Hardy, W. N., Hsu, T., Kadono, R., Kempton, J. R., Kreitzman, S. R., Li, Q., O'Reilly, A. H., Riseman, T. M., Schleger, P., Stamp, P. C. E., Zhou, H., Le, L. P., Luke, G. M., Sternleib, B., Uemura, Y. J., Hart, H. R., and Lay, K. W. 1990. Search for anomalous internal magnetic fields in high-Tc superconductors as evidence for broken time-reversal symmetry. Phys. Rev. Lett. 64:2082–2085.

Kittel, C. 1951. Optical rotation by ferromagnet. Phys. Rev. 83:208.

Kliger, D. S., Lewis, J. W., and Randall, C. E. 1990. Polarized Light in Optics and Spectroscopy. Academic Press, San Diego.

Landau, L. D. and Lifshitz, E. M. 1960. Electrodynamics of Continuous Media. Pergamon Press, Elmsford, N.Y.

Maxwell, J. C. 1892. A Treatise on Electricity and Magnetism, Vol. II, 3rd ed., Chapter XXI, Articles 811–812, pp. 454–455. Oxford University Press, Oxford.

Moog, E. R. and Bader, S. D. 1985. SMOKE signals from ferromagnetic monolayers: p(1×1) Fe/Au(100). Superlattices Microstructures 1:543–552.

Moog, E. R., Bader, S. D., and Zak, J. 1990. Role of the substrate in enhancing the magneto-optic response of ultrathin films: Fe on Au. Appl. Phys. Lett. 56:2687–2689.

Qiu, Z. Q., Pearson, J., and Bader, S. D. 1992. Magneto-optic Kerr ellipticity of epitaxially grown Co/Cu overlayers and superlattices. Phys. Rev. B 46:8195–8200.

Shen, Y. R. 1964. Faraday rotation of rare-earth ions. I. Theory. Phys. Rev. 133:A511–A515.

Spielman, S., Fesler, K., Eom, C. B., Geballe, T. H., Fejer, J. J., and Kapitulnik, A. 1990. Test for nonreciprocal circular birefringence in YBa2Cu3O7 thin films as evidence for broken time-reversal symmetry. Phys. Rev. Lett. 65:123–126.

Voigt, W. 1908. Magneto- und Elektrooptik. B. G. Teubner, Leipzig.

Weaver, J. H. and Frederikse, M. P. R. 1999. Optical properties of metals and semiconductors. In CRC Handbook of Chemistry and Physics, 80th ed. (D. L. Lide, ed.). Sect. 12, p. 129. CRC Press, Boca Raton, Fla.

Wilczek, F. 1991. Anyons. Sci. Am. May:58–65.

Zak, J., Moog, E. R., Liu, C., and Bader, S. D. 1990. Universal approach to magneto-optics. J. Magn. Magn. Mater. 89:107–123.

Zak, J., Moog, E. R., Liu, C., and Bader, S. D. 1991. Magneto-optics of multilayers with arbitrary magnetization directions. Phys. Rev. B 43:6423–6429.
KEY REFERENCES

Bennemann, K. H. (ed.). 1998. Nonlinear Optics in Metals. Clarendon Press, Oxford.

Covers the physics of linear as well as nonlinear magneto-optics.

Dillon, 1971. See above.

Provides an excellent background to the magneto-optic Kerr effect.

Freeman, A. J. and Bader, S. D. (eds.). 1999. Magnetism Beyond 2000. North Holland, Amsterdam.

Has 46 review articles that cover cutting-edge issues and topics in magnetism, many of which are being addressed using magneto-optic techniques.

Kliger, D. S., Lewis, J. W., and Randall, C. E. 1990. Polarized Light in Optics and Spectroscopy. Academic Press, San Diego.

Covers instrumentation.
MAGNETISM AND MAGNETIC MEASUREMENTS
APPENDIX A: THE MEDIUM-BOUNDARY AND MEDIUM-PROPAGATION MATRICES

The first matrix, denoted A, is the 4×4 medium-boundary matrix. It relates the x and y components of the electric and magnetic fields to the s and p components of the electric field, which are, respectively, perpendicular and parallel to the plane of incidence. We define the incident (i) and reflected (r) waves at each boundary between two media as in Figure 7. Then we obtain the relation between the x, y components of E and H and the s, p components of the electric field as

\begin{equation}
\begin{pmatrix} E_x \\ E_y \\ H_x \\ H_y \end{pmatrix}
= A \begin{pmatrix} E_s^i \\ E_p^i \\ E_s^r \\ E_p^r \end{pmatrix}
\tag{10}
\end{equation}

where

\begin{equation}
A =
\begin{pmatrix}
1 & 0 & 1 & 0 \\[4pt]
\dfrac{i}{2}\bigl[Q_y\tan\theta\,(1+\cos^2\theta)+Q_z\sin^2\theta\bigr] &
\cos\theta + iQ_x\sin\theta &
-\dfrac{i}{2}\bigl[Q_y\tan\theta\,(1+\cos^2\theta)-Q_z\sin^2\theta\bigr] &
\cos\theta + iQ_x\sin\theta \\[4pt]
\dfrac{in}{2}\,(Q_y\sin\theta+Q_z\cos\theta) & -n &
\dfrac{in}{2}\,(Q_y\sin\theta-Q_z\cos\theta) & n \\[4pt]
n\cos\theta & -\dfrac{in}{2}\,(Q_y\tan\theta+Q_z) &
-n\cos\theta & \dfrac{in}{2}\,(Q_y\tan\theta-Q_z)
\end{pmatrix}
\tag{11}
\end{equation}

Figure 7. Definitions of the s and p directions for the incident and reflected waves at the boundary between two media.
The second matrix, denoted D, is the 4×4 medium-propagation matrix. It relates the s and p components of the electric field at the two surfaces (1 and 2) of a film of thickness d. This relation can be expressed by the matrix product

\begin{equation}
\begin{pmatrix} E_s^i \\ E_p^i \\ E_s^r \\ E_p^r \end{pmatrix}_1
= D \begin{pmatrix} E_s^i \\ E_p^i \\ E_s^r \\ E_p^r \end{pmatrix}_2
\tag{12}
\end{equation}

where

\begin{equation}
D =
\begin{pmatrix}
U\cos\delta_i & U\sin\delta_i & 0 & 0 \\
-U\sin\delta_i & U\cos\delta_i & 0 & 0 \\
0 & 0 & U^{-1}\cos\delta_r & U^{-1}\sin\delta_r \\
0 & 0 & -U^{-1}\sin\delta_r & U^{-1}\cos\delta_r
\end{pmatrix}
\tag{13}
\end{equation}

with

\begin{equation}
U = \exp(iknd\cos\theta), \qquad
\delta_i = \frac{knd}{2}\,(Q_y\tan\theta + Q_z), \qquad
\delta_r = \frac{knd}{2}\,(Q_y\tan\theta - Q_z)
\tag{14}
\end{equation}

Here k = 2π/λ, with λ being the wavelength of the light.

With the matrices A and D, we can calculate the magneto-optic effect under any conditions. Consider a multilayer structure that consists of N individual layers and a beam of light impinging on the top of the structure from the initial medium (i). After multiple reflections, there will be a reflected beam backscattered into the initial medium and a transmitted beam that emerges from the bottom layer into the final medium (f). The electric fields in the initial and final media can then be expressed as

\begin{equation}
P_i = \begin{pmatrix} E_s^i \\ E_p^i \\ E_s^r \\ E_p^r \end{pmatrix}_i
    = \begin{pmatrix} E_s^i \\ E_p^i \\ r_{ss}E_s^i + r_{sp}E_p^i \\ r_{ps}E_s^i + r_{pp}E_p^i \end{pmatrix}
\qquad\text{and}\qquad
P_f = \begin{pmatrix} E_s^i \\ E_p^i \\ 0 \\ 0 \end{pmatrix}_f
    = \begin{pmatrix} t_{ss}E_s^i + t_{sp}E_p^i \\ t_{ps}E_s^i + t_{pp}E_p^i \\ 0 \\ 0 \end{pmatrix}
\tag{15}
\end{equation}

where r and t are the reflection and transmission coefficients of the corresponding components. If P_m is the field component at the bottom surface of the mth layer, then the field component at the top surface of the mth layer will be D_m P_m. Because E_x, E_y, H_x, and H_y are related to P by the matrix A, the boundary condition—that E_x, E_y, H_x, and H_y are continuous—at the interface between the mth layer and the (m+1)th layer is

\begin{equation}
A_m P_m = A_{m+1} D_{m+1} P_{m+1}
\tag{16}
\end{equation}

Then the relation between P_i and P_f can be derived as

\begin{equation}
A_i P_i = A_1 D_1 P_1 = A_1 D_1 A_1^{-1} A_1 P_1
        = A_1 D_1 A_1^{-1} A_2 D_2 P_2 = \cdots
        = \left[\prod_{m=1}^{N} \bigl(A_m D_m A_m^{-1}\bigr)\right] A_f P_f
\tag{17}
\end{equation}
If this expression is put in the form P_i = T P_f, where

\begin{equation}
T = A_i^{-1}\left[\prod_{m=1}^{N}\bigl(A_m D_m A_m^{-1}\bigr)\right]A_f
  = \begin{pmatrix} G & H \\ I & J \end{pmatrix}
\tag{18}
\end{equation}

then the 2×2 matrices G and I can be used to obtain the Fresnel reflection and transmission coefficients:

\begin{equation}
G^{-1} = \begin{pmatrix} t_{ss} & t_{sp} \\ t_{ps} & t_{pp} \end{pmatrix}
\qquad\text{and}\qquad
IG^{-1} = \begin{pmatrix} r_{ss} & r_{sp} \\ r_{ps} & r_{pp} \end{pmatrix}
\tag{19}
\end{equation}

The Kerr rotation φ′ and ellipticity φ″ for s- and p-polarized light are then given by

\begin{equation}
\phi_s = \phi_s' + i\phi_s'' = \frac{r_{ps}}{r_{ss}}
\qquad\text{and}\qquad
\phi_p = \phi_p' + i\phi_p'' = \frac{r_{sp}}{r_{pp}}
\tag{20}
\end{equation}

APPENDIX B: THE ULTRATHIN LIMIT

In the ultrathin limit the magneto-optic expressions simplify further. For ultrathin films the total optical thickness of the film is much less than the wavelength of the light, \sum_i n_i d_i \ll \lambda. In this limit the ADA^{-1} matrix can be simplified to

\begin{equation}
ADA^{-1} =
\begin{pmatrix}
1 & 0 & 0 & \dfrac{i2\pi d}{\lambda} \\[4pt]
-\dfrac{2\pi d}{\lambda}\,nQ_y\sin\theta & 1+\dfrac{2\pi d}{\lambda}\,nQ_x\sin\theta & -\dfrac{i2\pi d}{\lambda}\cos^2\theta & 0 \\[4pt]
\dfrac{2\pi d}{\lambda}\,n^2 Q_z & -\dfrac{i2\pi d}{\lambda}\,n^2 & 1-\dfrac{2\pi d}{\lambda}\,nQ_x\sin\theta & 0 \\[4pt]
\dfrac{i2\pi d}{\lambda}\,n^2\cos^2\theta & \dfrac{2\pi d}{\lambda}\,n^2 Q_z & -\dfrac{2\pi d}{\lambda}\,nQ_y\sin\theta & 1
\end{pmatrix}
\tag{21}
\end{equation}

If the initial and final media are nonmagnetic, then the 2×2 matrices G and I in Equation 19 yield the reflection coefficients

\begin{equation}
\begin{aligned}
r_{ss} &= \frac{n_i\cos\theta_i - n_f\cos\theta_f}{n_i\cos\theta_i + n_f\cos\theta_f} \\[4pt]
r_{pp} &= \frac{n_f\cos\theta_i - n_i\cos\theta_f}{n_f\cos\theta_i + n_i\cos\theta_f} \\[4pt]
r_{ps} &= \frac{4\pi\, n_i\cos\theta_i}{\lambda\,(n_i\cos\theta_i + n_f\cos\theta_f)(n_f\cos\theta_i + n_i\cos\theta_f)}
\left(\cos\theta_f\sum_m d_m n_m^2 Q_z^{(m)} + n_f n_i\sin\theta_i\sum_m d_m Q_y^{(m)}\right) \\[4pt]
r_{sp} &= \frac{4\pi\, n_i\cos\theta_i}{\lambda\,(n_i\cos\theta_i + n_f\cos\theta_f)(n_f\cos\theta_i + n_i\cos\theta_f)}
\left(\cos\theta_f\sum_m d_m n_m^2 Q_z^{(m)} - n_f n_i\sin\theta_i\sum_m d_m Q_y^{(m)}\right)
\end{aligned}
\tag{22}
\end{equation}

Here n_i, θ_i and n_f, θ_f are the refractive indices and the propagation angles in the initial and final media, respectively. Equation 22 provides a basis for the additivity law for multilayers in the ultrathin limit, which states that the total Kerr signal is a simple summation of the Kerr signals from each magnetic layer and is independent of the nonmagnetic spacer layers in the multilayer structure. This additivity law is true only in the limit where the total optical thickness of the layered structure is much less than the wavelength of the incident beam. For thick films, it is obvious that the additivity law must break down, because the light attenuates and will not penetrate to the deeper layers of the structure.

Z. Q. QIU
University of California at Berkeley
Berkeley, California

S. D. BADER
Argonne National Laboratory
Argonne, Illinois
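The additivity law lends itself to a short numerical sketch. The code below is not from this unit: it assumes the ultrathin-limit reflection coefficients of the appendix evaluated at normal incidence (so only the polar Q_z term survives), an assumed He-Ne wavelength, vacuum as the initial medium, and the n and Q values quoted in the text for Cu and Co; the helper name kerr_complex is ours. Treat it as an order-of-magnitude illustration only.

```python
import cmath

# Assumed parameters (n and Q values are the ones quoted in the text).
LAMBDA = 6328.0          # He-Ne wavelength in angstroms (assumed)
N_I = 1.0                # initial medium: vacuum
N_CU = 0.249 + 3.41j     # Cu refractive index
N_CO = 2.25 + 4.07j      # Co refractive index
Q_CO = 0.043 + 0.007j    # Co magneto-optic constant (fitted in the text)

def kerr_complex(layers, n_f=N_CU):
    """Complex Kerr angle phi_s = r_ps / r_ss in the ultrathin limit at
    normal incidence, for layers = list of (thickness_A, n, Q) entries;
    a sketch of the Equation 22 expressions with all cosines set to 1."""
    r_ss = (N_I - n_f) / (N_I + n_f)
    prefactor = 4 * cmath.pi * N_I / (LAMBDA * (N_I + n_f) ** 2)
    r_ps = prefactor * sum(d * n ** 2 * q for d, n, q in layers)
    return r_ps / r_ss

# Additivity: Cu spacers (Q = 0) drop out, so only the summed Co
# thickness matters, not how it is distributed through the stack.
single = kerr_complex([(10.0, N_CO, Q_CO)])
stacked = kerr_complex([(5.0, N_CO, Q_CO), (20.0, N_CU, 0.0),
                        (5.0, N_CO, Q_CO)])
print(abs(stacked - single))  # ~0: the spacer layer does not contribute
```

Because the Kerr signal in this limit is a linear sum over d_m, splitting the magnetic material across spacer layers leaves the total signal unchanged, which is exactly the behavior seen in Figure 6.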
ELECTROCHEMICAL TECHNIQUES

INTRODUCTION

Electrochemistry, in contrast to most materials characterization disciplines, contains a bifurcated methodology, having applications both in materials synthesis and as a physical characterization technique. On the one hand, a variety of functional surfaces are prepared electrochemically, for example, the anodization of aluminum, the electroetching of semiconductors, and the electroplating of metallic films. Similarly, the electrosynthesis of bulk materials such as conducting organic polymers has taken on importance in recent years. On the other hand, electrochemistry has been demonstrated to be of use as a physical characterization technique for the quantification of conducting and semiconducting surfaces. The corrosion properties of metallic surfaces and semiconductor electronic properties are two key areas where electrochemical characterization has had a high impact. In keeping with the central analytical theme of this work, this chapter limits its focus to the class of electrochemical techniques specifically aimed at characterizing materials and their interfaces (as opposed to synthetic electrochemistry). In some cases, the line between materials synthesis and measurement can become hazy. The field of sensors based on chemically modified electrodes (CMEs) is a good example of this confusion. Often sensors of this type are electrosynthesized, and the synthesis parameters represent an important portion of the system's characterization. Cases such as this are included in the topical coverage of this chapter.

Electrochemistry differs from other analytical techniques in a second important way. Unlike most instrumental characterization techniques, the sample under study is made into part of the measuring circuitry; thus, inappropriate sample preparation can effectively lead to a malfunction of the instrument. More importantly, this synergism demands that the user have some knowledge of the detailed circuitry associated with the instrument being employed.

Most instruments can be treated as "black boxes"; the user need only understand the rudiments of the instrument's functions. However, if this approach is employed in the execution of an electrochemical experiment, the result is often an artifactual instrument response that is easily misinterpreted as the chemical response of the sample. The electrochemical literature repeatedly testifies to the preeminence of this problem.

Modern electroanalytical techniques have become viable characterization tools because of our abilities both to precisely measure small currents and to mathematically model complex heterogeneous charge-transfer processes. Modeling of electrode-based charge-transfer processes is only possible using digital simulation methods. The mathematics and computer science related to this underpinning of electrochemistry are sophisticated and beyond the scope of this work. In most cases, an incomplete understanding of this aspect of electroanalysis will not affect the quality of the experimental outcome. In cases where the experimenter wishes to obtain a more detailed understanding of the physicochemical basis of electrochemistry, a variety of texts are available. Two volumes, the first by Bard and Faulkner (1980) and the second by Gileadi (1993), are particularly well suited to a more detailed understanding of electrochemistry and electroanalysis. The latter volume is specifically aimed at the materials scientist, while the former work is one of the central texts in physical electrochemistry. A third text, edited by Kissinger and Heineman (1996), is an excellent source of experimental details and analytical technologies. All of these texts are most valuable as supplements to this work, providing more chemical detail but less materials-characterization emphasis than found here.

Electrochemical measurements can be divided into two categories based on the required instrumentation. Potentiometric measurements utilize a sensitive voltmeter or electrometer to measure the potential of a sample against a standard of known potential. Voltammetric measurements utilize a potentiostat to apply a specified potential waveform to a sample and monitor the induced current response. Amperometric measurements, in which the potential is held constant and the current monitored as a function of time, are included in this latter category. Potentiometric measurements form the historical basis of electrochemistry and are of present utility as a monitoring technique for solution-based processes (i.e., pH monitoring or selective-ion monitoring). However, potentiometry is limited with regard to materials characterization. Thus, this chapter focuses on voltammetric measurements and the use of the potentiostat.

The potentiostat is fundamentally a feedback circuit that monitors the potential of the test electrode (referred to as the working electrode) versus a reference half-cell (typically called a reference electrode). If the potential of the working electrode drifts from a prescribed offset potential versus the reference electrode, a correcting potential is applied. A third electrode, the counterelectrode (or auxiliary electrode), is present in the associated electrochemical cell to complete the current pathway. The potentiostat typically contains a current-following circuit associated with the auxiliary electrode, which allows a precise determination of the current, which is reported to a recording device as a proportional potential. The rudimentary operation of the potentiostat is covered in this text. However, for many applications a more detailed analysis of potentiostat electronics is desirable. To this end, the reader is directed to Gileadi's excellent monograph (Gileadi et al., 1975). Potentiostats are available from a series of vendors and range in price from less than $1000 to approximately $30,000. Variations in price have to do with whether or not digital circuits are employed, the size of the power supply utilized, and the dynamic range and stability of the circuitry. A typical research-grade potentiostat presently costs about $20,000 and thus is accessible as a standard piece of laboratory instrumentation. For certain specific applications, less expensive equipment is available (in the approximately $1,000 to $2,000 range), with top-end machines costing around $50,000.

Electrochemical investigations involve making the sample of interest into one electrode of an electrochemical cell. As such, it is immediately obvious that this technique is limited to materials having excellent to good conductivity. Metallic samples and conducting organic polymers, therefore, immediately jump to mind as appropriate samples. What may be less obvious is the application of electrochemical techniques to the characterization of semiconductors; however, this application is historically central to the development of electronic materials and continues to play a key role in semiconductor development to date. The listing above points to obvious overlaps with other materials-characterization techniques and suggests complementary strategies that may be of utility. In particular, the characterization of electronic systems using electrical circuit responses (see ELECTRICAL AND ELECTRONIC MEASUREMENT) or electron spectroscopy (see ELECTRON TECHNIQUES) is often combined with electrochemical characterization to provide a complete picture of semiconducting systems. Thus, for example, one might employ solid-state electronics to determine the doping level or type (n-type versus p-type character) of a semiconducting sample prior to evaluation of the sample in an electrochemical cell. Likewise, electron spectroscopy might be used to evaluate band-edge energetics or the nature of surface states in collaboration with electrochemical studies that determine interfacial energetics and kinetics.

In passing, it is interesting to note that the first "transistor systems" reported by Bell Laboratories were silicon- and germanium-based three-electrode electrochemical cells (Brattain and Garrett, 1955). While these cells never met commercial requirements, forcing the development of the solid-state transistor, these studies were critical in the development of our present understanding of semiconductor interfaces. This chapter considers the most commonly utilized techniques in modern electrochemistry: polarography (i.e., currents induced by a slowly varying linear potential sweep), cyclic voltammetry (i.e., currents induced by a triangular potential waveform), and AC impedance spectroscopy (i.e., the real and imaginary current components generated in response to an AC potential of variable frequency). The characterization of semiconducting materials, the corrosion of metallic interfaces, and the general behavior of redox systems are considered in light of these techniques. In addition, the relatively new use of electrochemistry as a scanning probe microscopy technique for the visualization of chemical processes at conducting and semiconducting surfaces is also discussed.

LITERATURE CITED

Bard, A. J. and Faulkner, L. R. 1980. Electrochemical Methods: Fundamentals and Applications. John Wiley & Sons, New York.

Brattain, W. H. and Garrett, C. G. B. 1955. Bell System Tech. J. 34:129.

Gileadi, E. 1993. Electrode Kinetics for Chemists, Chemical Engineers, and Materials Scientists. VCH Publishers, New York.

Gileadi, E., Kirowa-Eisner, E., et al. 1975. Interfacial Electrochemistry: An Experimental Approach. Addison-Wesley, London.

Kissinger, P. T. and Heineman, W. R. (eds.). 1996. Laboratory Techniques in Electroanalytical Chemistry. Marcel Dekker, New York.

INTERNET RESOURCES

http://electrochem.cwru.edu/estir/
Web site of the Electrochemical Science and Technology Information Resource (ESTIR), established by Zoltan Nagy at Case Western Reserve. It serves as the unofficial site of the electrochemistry community.

http://www.electrochem.org/
Web site of The Electrochemical Society.

http://www.eggpar.com/index.html
Web site of Princeton Applied Research (a major supplier of electrochemical instrumentation). This site provides a series of "Application Notes."

http://seac.tufts.edu
Web site of the Society of Electroanalytical Chemistry, providing a full complement of links to electrochemical sites.

ANDREW B. BOCARSLY
CYCLIC VOLTAMMETRY

INTRODUCTION

A system is completely characterized from an electrochemical point of view if its behavior in the three-dimensional (3D) space composed of current, potential, and time is fully specified. In theory, from such phenomenological data one can determine all of the system's controlling thermodynamic and kinetic parameters, including the mechanism of charge transfer, rate constants, diffusion coefficients, standard redox potentials, electron stoichiometry, and reactant concentrations. A hypothetical i(E,t) (current as a function of potential and time) data set is shown in Figure 1. This particular mathematical surface has been synthesized using the assumption that the process of interest involves a reversible one-electron charge transfer. The term "reversible" is used here in its electrochemical sense, to indicate that charge transfer between the electrode and the redox-active species both is thermodynamically reversible—i.e., the molecule(s) of interest can be both oxidized and reduced at potentials near the standard redox potential—and occurs at a rate of reaction sufficiently rapid that, for all potentials where the reaction occurs, the process is never limited by charge-transfer kinetics. In such a case, the process is completely described by the standard redox potential of the electrochemical couple under investigation, the concentration(s) of the couple components, and the diffusion coefficients of the redox-active species.

Figure 1. Theoretically constructed current-potential-time [i(E,t)] relation for an ideal, reversible, one-electron charge-transfer reaction taking place at an electrode surface. The half-wave potential, E1/2, is the redox potential of the system under conditions where reactant and product diffusion coefficients are similar and activities can be ignored.

On the other hand, one can imagine a redox reaction that is totally controlled by the kinetics of charge transfer between the electrode and the redox couple. In this case, a different 3D current-potential-time surface is obtained that depends on the charge-transfer rate constant, the symmetry of the activation barrier, and the reactant concentrations. Of course, a variety of mechanisms in between these two extremes can also be obtained, each leading to a somewhat different 3D representation. In addition, purely chemical processes can be coupled to charge-transfer events, providing more complex reaction dynamics that will be reflected in the current-potential-time surface. The enormous information content and complexity of electrochemical dynamics in the current-potential-time domain introduce two major experimental complications associated with electrochemical data. First is the pragmatic consideration of how much time is necessary to obtain a sufficiently complete data set to allow for a useful analysis. The second, far more serious complication is that different charge-transfer mechanisms often translate into subtle changes in the i(E,t) response. Thus, even after one has access to a complete data set, visual inspection is typically an insufficient means to determine the mechanism of charge transfer. Furthermore, knowledge of this mechanism is critical to determining the correct approach to data analysis. Thus, given the requisite data set, one must rely on a series of large calculations to move from the data to chemically meaningful parameters.
The approach described so far is of limited value as a survey tool, or as a global mechanistic probe of electrochemical phenomena. It is, however, a powerful technique for
obtaining precise thermodynamic and kinetic parameters once the reaction mechanism is known. The mathematical resolution to the problem posed here is to consider the projection of the i(E,t) surface onto a plane that is not parallel to the i-E-t axis system. This is the basis of cyclic voltammetry. Not surprisingly, a single projection is not sufficient to completely characterize a system; however, if judiciously chosen, a limited set of projections (on the order of a half dozen) will provide sufficient information to determine the reaction mechanism and obtain the key reaction parameters to relatively high precision. Perhaps more importantly, reduction of the data to two dimensions allows one to determine the mechanism based on pattern recognition, and thus no calculational effort is required. Practically, the projection of interest is obtained by imposing a time-dependent triangular waveform on the electrode as given by Equation 1:

E(t) = Ei + νt                       for 0 ≤ t ≤ tl
E(t) = Ei + νtl − ν(t − tl)          for tl ≤ t          (1)

where Ei is the initial potential of the scan, ν is the scan rate (in mV/s), and tl is the switching time of the triangular waveform as shown in Figure 2A. The induced current is then monitored as a function of the electrode potential as shown schematically in Figure 2B. The wave shape of the i-E plot is diagnostic of the mechanism. Different projection planes are simply sampled by varying the value of
Figure 2. (A) Triangular E(t) signal applied to the working electrode during a cyclic voltammetric scan. El is the switching potential where the scan direction is reversed. (B) Cyclic voltammetric current versus potential response for an ideal, reversible one-electron redox couple under the E(t) waveform described in (A).
ν for a series of cyclic voltammetric scans. Although the 2D traces obtained offer a major simplification over the 3D plots discussed above, their chief power, as identified initially by Nicholson and Shain (1964), is that they are quantified from a pattern-recognition point of view by two easily obtained scan-rate-dependent parameters. Nicholson and Shain showed that these diagnostic parameters, peak current(s) and peak-to-peak potential separation, produce typically unique scan-rate dependencies as long as they are viewed over a scan-rate range that traverses at least 3 orders of magnitude. The details of this analysis are provided later. First, one needs to consider the basic experiment and the impact of laboratory variables on the results obtained. The qualitative aspect of cyclic voltammetry has made it a general tool for the evaluation of electrochemical systems. With regard to materials specifically, the technique has found use as a method for investigating the electrosynthesis of materials (including conducting polymers and redox-produced crystalline inorganics), the mechanism of corrosion processes, the electrocatalytic nature of various metallic systems, and the photoelectrochemical properties of semiconducting junctions. Here a general description of the cyclic voltammetric technique is provided along with some examples focusing on materials science applications.
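The triangular potential program of Equation 1 is straightforward to generate in software. The sketch below is illustrative only: the function name and the numerical values are our own, and the scan rate is taken in V/s for convenience rather than the mV/s quoted above.

```python
def triangular_potential(t, e_i, nu, t_l):
    """Potential program of Equation 1: a linear sweep at scan rate nu
    starting from e_i, with the sweep direction reversed at time t_l."""
    if t <= t_l:
        return e_i + nu * t
    return e_i + nu * t_l - nu * (t - t_l)

# One full cycle with assumed values: sweep -0.2 V -> +0.3 V -> -0.2 V.
e_i, nu, t_l = -0.2, 0.05, 10.0          # V, V/s, s
wave = [triangular_potential(0.5 * k, e_i, nu, t_l) for k in range(41)]
print(round(max(wave), 6))   # 0.3: the vertex potential e_i + nu*t_l
print(round(wave[-1], 6))    # -0.2: back at the initial potential
```

Plotting the measured current against this potential program, rather than against time, is what produces the diagnostic i-E trace of Figure 2B.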
PRINCIPLES OF THE METHOD

Reaction Reversibility

The Totally Reversible Reaction. The simplest charge-transfer mechanism to consider is given by Equation 2, a reversible redox process:

Ox + ne− ⇌ Red          (2)

For this process, the concentrations of the reactant and products at the electrode surface are given by the Nernst equation (Equation 3), independent of reaction time or scan rate:

E = ER − 2.303 (RT/nF) log([Red]/[Ox])          (3)
where E is the electrode potential, ER is the standard redox potential of the redox couple, [Red] and [Ox] are the concentrations of the reduced and oxidized species, respectively, R is the gas constant, T is the temperature (K), F is Faraday's constant, and n is defined by Equation 2. [At room temperature, 298 K, 2.303(RT/nF) = 59 mV/n.] Equation 3 indicates that the electrode potential, E, is equal to the couple's redox potential, ER, when the concentrations of the oxidized and reduced species at the electrode/electrolyte interface are equal. It can be shown that this situation is approximately obtained when the electrode is at a potential that is halfway between the anodic and cathodic peak potentials. This potential is referred to as the cyclic voltammetric half-wave potential, E1/2. More precisely, it can be demonstrated that the half-wave potential and the standard redox potential are related by Equation 4:

E1/2 = ER − (RT/nF) ln[(γOx/γRed)(DRed/DOx)^(1/2)]          (4)

where DOx and DRed are the diffusion coefficients of the oxidized and reduced species, respectively, and the γ values represent the activity coefficients of the two halves of the couple. While this result is intuitively appealing, a solid mathematical proof of the relationship is complex and beyond the scope of this unit. The mathematical details are well presented in a variety of textbooks, however (Gileadi et al., 1975; Bard and Faulkner, 1980; Gosser, 1993). By its very nature, the reversible redox reaction cannot cause a substantial change in the connectivity or shape of a molecular system. As a result, the diffusion coefficients of the oxidized and reduced species are expected to be similar, as are the activity coefficients—in which case Equation 4 reduces to E1/2 = ER. Even if there is some variation in the diffusion coefficients, the square-root functionality invariably produces a ratio close to 1, and thus the second term in Equation 4 can safely be ignored. Likewise, at the low concentrations of electroactive species typically employed in cyclic voltammetry, the activity coefficients can safely be ignored. Once a system has been demonstrated to be reversible (or quasireversible, as discussed later), the redox potential can be read directly off the cyclic voltammogram.

For the reversible mechanism, the idealized cyclic voltammetric response is illustrated in Figure 2B. Within the context of the Nicholson and Shain diagnostic criteria, this cyclic voltammetric response provides a coupled oxidation and reduction wave separated by 60/n mV independent of the scan rate. The exact theoretical value for the peak-to-peak potential separation is not critical, since this value is based on a Monte Carlo approximation and is somewhat dependent on a variety of factors that are often not controlled. In a real electrochemical cell the ideal value is typically not achieved.
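The Nernst relation of Equation 3 and the half-wave potential of Equation 4 can be evaluated in a few lines of code; the function names and sample values below are our own illustrative assumptions.

```python
import math

R, F = 8.314, 96485.0    # gas constant J/(mol K), Faraday constant C/mol

def red_ox_ratio(e, e_r, n=1, temp=298.0):
    """[Red]/[Ox] at the electrode surface for potential e (Equation 3)."""
    return 10 ** (n * F * (e_r - e) / (2.303 * R * temp))

def half_wave(e_r, d_red, d_ox, g_ox=1.0, g_red=1.0, n=1, temp=298.0):
    """E1/2 of Equation 4; reduces to e_r for matched D's and activities."""
    return e_r - (R * temp / (n * F)) * math.log(
        (g_ox / g_red) * math.sqrt(d_red / d_ox))

print(red_ox_ratio(0.0, 0.0))                  # 1.0: equal concentrations at E = ER
print(round(half_wave(0.25, 1e-5, 1e-5), 6))   # 0.25: equal D's leave E1/2 = ER
```

Shifting the electrode 59/n mV negative of ER drives the surface ratio [Red]/[Ox] to roughly 10, which is the per-decade behavior implied by the 2.303(RT/nF) factor.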
The peak heights of the anodic and cathodic waves should be equivalent, and the current function (for either the anodic or the cathodic wave) is expected to be invariant with scan rate. Under such conditions the peak current is given by Equation 5 (Bard and Faulkner, 1980):

ip = (2.69 × 10⁵) n^(3/2) A Do^(1/2) ν^(1/2) Co          (5)

where n is as defined by Equation 2, A is the area of the working electrode in cm², Do is the diffusion coefficient of the electroactive species in cm²/s, ν is the scan rate in V/s, and Co is the bulk electrolyte concentration in mol/cm³. (Note that these are not the standard units of concentration.) One important caveat must be noted here: the peak-to-peak separation is not solely dependent on the charge-transfer mechanism; cell resistances, for example, will increase the peak-to-peak separation beyond the anticipated 60/n mV. Thus, the practical application of the diagnostic
is a constant small peak-to-peak separation (≈100/n mV) with scan rate. It is also important to realize that the impact of cell resistance on potential (V = iR) is scan-rate dependent, since the magnitude of the observed current increases with scan rate. Thus, under conditions of high cell resistance (for example, when a nonaqueous electrolyte is used), a reversible couple may yield a scan-rate-dependent peak-to-peak potential variation significantly greater than the ideal 60/n-mV separation. It is therefore critical to evaluate all three of the diagnostics over a reasonable range of scan rates before reaching a mechanistic conclusion.

The Totally Irreversible Reaction. The opposite extreme of a reversible reaction is the totally irreversible reaction. As illustrated by Equation 6, such a reaction is kinetically sluggish in one direction, producing a charge-transfer-limited current. The reaction is assigned a rate constant k; however, it is important to note that k is dependent on the overpotential, η, where η is the difference between the redox potential of the couple and the potential at which the reaction is being observed (ER − Eelectrode). It is convenient to define a heterogeneous charge-transfer rate constant, ks, which is the value of k when η = 0.

Ox + ne⁻ → Red    (6)
For this case, a mass transport–limited current is only achieved at large overpotentials, since the rate constant for charge transfer is small for reasonable values of the electrode potential. The concentrations of redox species near the electrode are never in Nernstian equilibrium. Thus, one cannot determine either the redox potential or the diffusion coefficients associated with this system. However, careful fitting of the cyclic voltammetric data can provide the heterogeneous charge-transfer rate constant, ks, and activation-barrier symmetry factor as initially discussed by Nicholson and Shain (1964) and reviewed by Bard and Faulkner (Bard and Faulkner, 1980; Gosser, 1993). In the case of a very small value of ks, the irreversible cyclic voltammogram is easily identified, since it consists of only one peak independent of the scan rate employed. This peak will shift to higher potential as the scan rate is increased. For moderate but still limiting values of ks one will observe a large peak-to-peak separation that is scan-rate dependent. The E1/2 value will also vary with scan rate. In this case, it is important to rule out unexpected cell resistance as the source of the potential dependence, before concluding that the reaction is irreversible. Often modern digital potentiostats are provided with software that allows the direct measurement (and real time correction) of cell resistance via a current-interrupt scheme in which the circuit is opened for a short (millisecond) time period and the cell voltage is measured. Analog potentiostats often have an iR compensation circuit (Kissinger and Heineman, 1996).
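As a numerical illustration of the reversible peak-current expression (Equation 5), the sketch below estimates ip for a typical small electrode. All parameter values are hypothetical but representative, and the scan rate is taken in V/s, the units for which the 2.69 × 10⁵ constant is usually tabulated:

```python
import math

def randles_sevcik_ip(n, area_cm2, d_cm2_s, scan_v_s, conc_mol_cm3):
    """Reversible peak current (amperes) from the Equation 5 (Randles-Sevcik) form."""
    return (2.69e5 * n**1.5 * area_cm2
            * math.sqrt(d_cm2_s) * math.sqrt(scan_v_s) * conc_mol_cm3)

# Hypothetical but typical values: 1 mm^2 (0.01 cm^2) electrode, D = 1e-5 cm^2/s,
# 1 mM electroactive species (1e-6 mol/cm^3), 100 mV/s scan rate.
ip = randles_sevcik_ip(n=1, area_cm2=0.01, d_cm2_s=1.0e-5,
                       scan_v_s=0.1, conc_mol_cm3=1.0e-6)
print(f"ip = {ip * 1e6:.1f} uA")  # prints: ip = 2.7 uA

# The current function ip / o^(1/2) is scan-rate independent for a reversible
# couple -- the first Nicholson and Shain diagnostic:
cf = [randles_sevcik_ip(1, 0.01, 1.0e-5, v, 1.0e-6) / math.sqrt(v)
      for v in (0.01, 0.1, 1.0)]
assert max(cf) - min(cf) < 1e-12
```

The few-microampere result is consistent with the text's point that small electrodes keep currents, and hence iR distortion, low.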
The Quasireversible Reaction. Although arbitrary, it is practically useful to define reversible charge-transfer systems as those having a heterogeneous charge-transfer rate constant in excess of 0.3 o^(1/2), while totally irreversible systems are those exhibiting rate constants less than 2 × 10⁻⁵ o^(1/2). (Note that this definition is quite surprising in that the requisite minimum rate constant for a reversible reaction depends on the potentiostat being employed. As the slew rate of the system increases, the ability to see large charge-transfer rate constants is enhanced. This is unfortunate, in that it clouds the distinction between thermodynamics and kinetics.) This definition produces a large number of reactions having intermediate rate constants (2 × 10⁻⁵ o^(1/2) ≤ ks ≤ 0.3 o^(1/2)), which are referred to as quasireversible systems. These systems will appear reversible or irreversible depending on the scan rate employed. For sufficiently large scan rates, the rate of interfacial charge transfer will be limiting and the system will appear irreversible. For slower scan rates, the system response time will allow the Nernst equation to control the interfacial concentrations and the system will appear reversible. Depending on the scan rate employed, one can determine systemic thermodynamic parameters (redox potential, n, and the diffusion coefficient) or kinetic parameters. As in the reversible case, the current function for quasireversible systems tends to be scan-rate independent. The peak-to-peak potential dependence is also often an inconclusive indicator. However, the potential of the peak current(s) as a function of scan rate is an excellent diagnostic. At low scan rates, the peak potential can approach the theoretical 30/n-mV separation from the half-wave potential in a scan-rate-independent manner; however, as the scan rate is increased and the system enters the irreversible region, the peak potential shifts with log o. This dependence is given by Equation 7 (Gileadi, 1993):

Ep^Irr = E1/2 − s [0.52 + log(Do^(1/2)/ks) + log(o^(1/2)/s^(1/2))]    (7)

where s is the Tafel slope, a measure of the kinetic barrier, and the other terms are as previously defined. Note that the slope of Ep versus 0.5 log o allows one to determine s, while D and E1/2 can be obtained from cyclic voltammograms taken in the reversible region, allowing one to determine k(E1/2) (see Fig. 5C for an example of this type of behavior).

Nonreversible Charge-Transfer Reactions. In contrast to mechanistically "irreversible reactions," which indicate a kinetic barrier to charge transfer, a mechanistically "nonreversible reaction" refers to a complex reaction mechanism in which one or more chemical steps are coupled to the charge-transfer reaction (Nicholson and Shain, 1965; Polcyn and Shain, 1966a,b; Saveant, 1967a,b; Brown and Large, 1971; Andrieux et al., 1980; Bard and Faulkner, 1980; Gosser, 1993; Rieger, 1994). For example, consider the generic chemical reaction shown as Equation 8:

Ox + ne⁻ ⇌ Red
Red → Product (rate constant k)    (8)

where k represents the rate constant for a nonelectrochemical transformation such as the formation or destruction of
a chemical bond. This reaction couples a follow-up chemical step to a reversible charge-transfer process and is thus referred to as an EC process (electrochemical step followed by a chemical step). Consider the effect of this coupling on the cyclic voltammetric response of the Ox/Red system. At very fast scan rates, Ox can be converted to Red and Red back to Ox before any appreciable Product is formed. Under these conditions, the Nicholson and Shain diagnostics will appear reversible. However, as the scan rate is slowed, the redox couple will spend a longer period in the Red state, and the formation of Product will diminish the amount of Red below the "reversible" level. Therefore, at slow scan rates the ratio of the forward to return peak currents will increase above unity (the reversible value), and the current function for the return wave will decrease below the reversible level as Red is consumed and thus becomes unavailable for back-conversion to Ox. The peak-to-peak potential is not expected to change for the mechanistic case presented. This set of diagnostics is unique and can be used to demonstrate the presence of an EC mechanism. Likewise, one can consider mechanisms where a preliminary chemical step is coupled to a follow-up electrochemical step (a CE mechanism) or where multiple chemical and electrochemical steps are coupled. In many of the cases that have been considered to date, the Nicholson and Shain diagnostics provide for a unique identification of the mechanism if viewed over a sufficiently large scan-rate window. An excellent selection of mechanisms and their expected Nicholson and Shain diagnostics is presented in Brown and Large (1971).
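The pattern-recognition logic behind the Nicholson and Shain diagnostics can be caricatured in a few lines of code. The sketch below is an illustrative toy, not a validated analysis tool; the trend labels and the mapping are a simplified restatement of the qualitative rules discussed in this section:

```python
def classify_mechanism(current_fn_trend, peak_sep_trend, peak_ratio_trend):
    """Toy lookup of charge-transfer mechanism from Nicholson-Shain diagnostics.

    current_fn_trend : behavior of ip/o^(1/2) with scan rate ('flat' or 'varies')
    peak_sep_trend   : peak-to-peak separation with scan rate ('constant' or 'grows')
    peak_ratio_trend : forward/return peak-current ratio ('unity' or 'deviates')
    """
    if (current_fn_trend == "flat" and peak_sep_trend == "constant"
            and peak_ratio_trend == "unity"):
        return "reversible"
    if peak_sep_trend == "grows" and peak_ratio_trend == "unity":
        # appears reversible at slow scan, irreversible at fast scan
        return "quasireversible or irreversible (rule out cell resistance!)"
    if peak_ratio_trend == "deviates" and peak_sep_trend == "constant":
        # coupled chemical step, e.g., the EC mechanism of Equation 8
        return "nonreversible (coupled chemistry, e.g., EC)"
    return "inconclusive -- widen the scan-rate window"

print(classify_mechanism("flat", "constant", "unity"))     # prints: reversible
print(classify_mechanism("flat", "constant", "deviates"))  # EC-type diagnosis
```

A real analysis would, of course, establish each trend over the 2 to 3 orders of magnitude in scan rate emphasized later in this unit.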
PRACTICAL ASPECTS OF THE METHOD

Since a cyclic voltammetric experiment is fundamentally a kinetic experiment, the presence of a well-defined internal "clock" is essential. That is, the time dependence is provided experimentally by the selected scan rate, since the input parameter E(t) is implicitly a time parameter. However, in order for this implicit time dependence to be of utility, it must be calibrated against a chemically relevant parameter. The parameter used for this purpose is the diffusion of an electroactive species in the electrolyte. Thus, the cyclic voltammetric experiment is fundamentally a "quiet" experiment, in which convective components must be eliminated and the diffusion condition well defined. This is done by considering the geometry of the electrochemical cell, the shape of the electrode under investigation, and the time scale of the experiment. Additionally, because the essence of the experiment is the application of an electrode potential combined with the monitoring of the induced current, it is important to know to high precision the time-dependent electrode potential function, E(t), for all values of t. This capability is established by using a potentiostat to control an electrochemical cell having a three-electrode configuration. Successful interpretation of the electrochemical experiment can only be achieved if the experimenter has a good knowledge of the characteristics and limitations of the potentiostat and cell configuration employed.
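The potential program E(t) referred to above is a simple triangular ramp between the initial and switching potentials. A minimal sketch (all potentials and the scan rate are hypothetical inputs) makes the implicit time axis explicit:

```python
def cv_potential(t, e_init, e_switch, scan_rate):
    """Triangular E(t) for one cyclic voltammetric cycle.

    scan_rate in V/s; the potential ramps linearly from e_init to e_switch
    and then back at the same rate.
    """
    t_switch = abs(e_switch - e_init) / scan_rate
    sign = 1.0 if e_switch > e_init else -1.0
    if t <= t_switch:                                        # forward sweep
        return e_init + sign * scan_rate * t
    return e_switch - sign * scan_rate * (t - t_switch)      # return sweep

# 0.0 V -> +0.5 V -> 0.0 V at 100 mV/s: the switching potential is reached
# at t = 5 s and the scan returns to the initial potential at t = 10 s.
assert abs(cv_potential(5.0, 0.0, 0.5, 0.1) - 0.5) < 1e-12
assert abs(cv_potential(10.0, 0.0, 0.5, 0.1)) < 1e-12
```

Because E and t are linearly related in each half cycle, any current recorded against E can equally be read against t, a fact used later in the baseline discussion of Figure 3.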
Electrochemical Cells

Cyclic voltammetric experiments employ a "three-electrode" cell containing a working electrode, a counterelectrode (or auxiliary electrode), and a reference electrode. The working electrode is the electrode of interest; this electrode has a well-defined potential for all values of E(t). The counterelectrode is simply the second electrode that is requisite for a complete circuit. The reference electrode is in fact not an electrode at all, but rather an electrochemical half cell used to establish a well-defined potential against which the working electrode potential can be measured. Typically, a saturated calomel electrode (SCE, a mercury/mercurous couple) or a silver/silver chloride electrode is utilized for this purpose. Both of these reference half cells are commercially available from standard chemical supply houses. While the consideration of reference half cells is important to any electrochemical experiment, it is beyond the scope of this unit; the reader is referred to the literature for a consideration of this topic (Gileadi, 1993; Gosser, 1993; Rieger, 1994). The exact geometry of the three electrodes, along with the shape and size of the working electrode, will determine the internal cell resistance and capacitance. These electrical parameters will set an upper limit for both the current flow (via Ohm's law: i = V/R, where V is voltage and R is resistance) and the cell response time (via the capacitive time constant of the cell). The contact between the reference half cell and the remainder of the electrochemical cell is typically supplied through a high-impedance frit, ceramic junction, or capillary.
A high-impedance junction is employed for two purposes: to eliminate chemical contamination between the contents of the reference electrode electrolyte and the electrolyte under investigation and to guarantee a minimal current flow between the reference and working electrodes; the voltage drop associated with this current, referred to as iR drop (Ohm’s law), is uncompensated by standard potentiostat circuitry. If this voltage drop becomes substantial (see Equation 9 and discussion below), the data will be highly distorted and not representative of the chemical thermodynamics/kinetics under investigation.
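The practical impact of uncompensated resistance is easy to estimate, since the error is simply Ohm's law applied to the measured current. A minimal sketch with hypothetical cell values:

```python
def ir_error_mV(current_A, r_uncomp_ohm):
    """Uncompensated iR error (mV) added to the observed electrode potential."""
    return 1000.0 * current_A * r_uncomp_ohm

# 10 uA of peak current through 100 ohm of uncompensated resistance gives a
# 1 mV error, which is negligible; the same current through a hypothetical
# 10 kohm nonaqueous cell gives 100 mV, enough to badly distort the
# voltammogram and mimic quasireversible behavior.
print(ir_error_mV(10e-6, 100))     # ~1 mV
print(ir_error_mV(10e-6, 10_000))  # ~100 mV
```

This is why the text stresses checking peak-to-peak separations over a range of scan rates: the iR error grows with the peak current, and hence with o^(1/2), whereas a true kinetic limitation has its own characteristic scan-rate signature.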
Potentiostats and Three-Electrode Electrochemical Cells

At its heart, the potentiostat is a negative-feedback device that monitors the potential drop between the working electrode and the reference electrode. If this value deviates from a preselected value, a bias is applied between the working and counterelectrodes. The working/counterelectrode voltage drop is increased until the measured working electrode versus reference electrode potential returns to the preset value (Gileadi et al., 1975; Gileadi, 1993). In order to carry out a cyclic voltammetric experiment, a potential waveform generator must be added to the potentiostat. The waveform generator produces the potential stimulus presented by Equation 1. Depending on the instrument being utilized, the waveform generator may be internal to the instrumentation or provided as a separate unit. The cyclic voltammetric "figure of merit" of a potentiostat will depend on the size of the power supply utilized
and the rise time (or slew rate) of the power supply. The rise time determines the maximum scan rate that can be used in fashioning the E(t) waveform. In addition to the power-supply slew rate, the cell requirement of a nonconvective system, coupled with practical limits on the rate at which the potential of the working electrode can be varied (due to the resistance-capacitance (RC) time constant associated with a metal/electrolyte interface), provides upper and lower limits for the accessible scan-rate range. Unless the cell is carefully insulated from the environment, thermal and mechanical convective currents will set in for scan rates much below 1 to 2 mV/s; this establishes a practical lower limit for the scan rate. The upper limit is dependent on the size of the electrode and the resistance of the cell. For typical cyclic voltammetric systems (electrode size ≈1 mm²), scan rates above 10 V/s tend to introduce complications associated with the cell RC time constant. However, scan rates as high as 100,000 V/s have been reported in specially engineered cells employing ultramicroelectrodes and small junction potentials. More realistically, careful limitation of cell resistance allows one to achieve maximum scan rates in the 10- to 100-V/s range. The size of the power supply determines the potentiostat compliance voltage (the largest voltage that can be applied between the working and counterelectrodes). This voltage determines the potentiostat's ability to control the potential of the working electrode. If one is working in a single-compartment cell with an aqueous electrolyte containing a relatively high salt concentration (≈0.1 M salt), then a relatively modest power supply (≈10 V) will provide access to all reasonable working electrode potentials. However, in order to carry out cyclic voltammetric studies in high-resistance cells (i.e., those with low salt concentrations, multiple electrochemical compartments, and/or nonaqueous electrolytes), a compliance voltage on the order of 100 V may be required. It is extremely important to note when using a high-compliance-voltage potentiostat that the potential reported by the potentiostat is the voltage drop between the reference electrode and the working electrode. Although this value will never exceed ≈3 V, in order to achieve this condition the potentiostat may have to apply 100 V between the counter- and working electrodes. Since the leads to these electrodes are typically exposed, the researcher must use utmost care to avoid touching the leads, even when the potentiostat is reporting a low working electrode potential. If the experimenter is accidentally inserted into the circuit between the working and counterelectrodes, the full output voltage of the power supply may run through the experimenter's body. In order to determine the potential of the working electrode with regard to the reference electrode, the potentiostat employs an electrometer. The value measured by the electrometer is used both to control the potentiostat feedback loop and to produce the potential axis in the cyclic voltammogram. As such, it is assumed that the electrometer reading accurately reflects the potential of the working electrode, φ. In fact, the electrometer reading, Eobs, is better represented by Equation 9:

Eobs = φ + iR    (9)
where iR is the uncompensated voltage drop between the working and reference electrodes. In general, application of a voltage between two electrodes (in this case, the working and reference) may cause a current to flow. This would result in a large value for the iR term (and be deleterious to the reference half cell). This problem is circumvented by employing a high-impedance junction between the working electrode and the reference electrode, as noted earlier. This junction ensures that the value of i, and thus of iR, will be quite small. In this case Eobs ≈ φ and the potentiostat reading is reliable. However, if R is allowed to become excessively large, then even for a small value of i the iR term cannot be ignored, and an error is introduced into the cyclic voltammogram. High scan rates produce large peak currents, not only exacerbating the iR drop but also introducing a phase-lag problem associated with the cell's RC time constant. Both of these effects can severely distort the cyclic voltammogram. If this occurs, it can be remedied by switching to a low-impedance reference electrode. Decreasing the cell R improves both the iR and RC responses, at the cost of destabilizing the potential of the reference half cell.

The Working Electrode

The working electrode may be composed of any conducting material. It must be recognized that the shape, area, and internal resistance of this electrode affect the resulting current (Nicholson and Shain, 1964; Bard and Faulkner, 1980; Kissinger and Heineman, 1996), and therefore these parameters must be controlled if the cyclic voltammetric data are to be used analytically. Almost all of the kinetic modeling that has been carried out for cyclic voltammetric conditions assumes a semi-infinite linear-diffusion paradigm. In order to meet this condition, one needs a relatively small (≈1 mm²) planar electrode. The small size assures low currents.
This both limits complications associated with iR drop and provides well-defined diffusion behavior. If the above conditions cannot be met, then a correction factor can be employed prior to data analysis (Kissinger and Heineman, 1996). In cases where cyclic voltammetric studies are applied to solution species, typical working electrode materials include platinum, gold, mercury, and various carbon materials ranging from carbon pastes and cloths to glassy carbon and pyrolytic graphite. These materials are selected because they are inert with respect to corrosion processes under typical electrochemical conditions. A second selection criterion is the electrode material's electrocatalytic nature (or lack thereof). Platinum, broadly speaking, presents a catalytic interface. This is particularly true of reactions involving the hydrogen couple. As a result, this material is most often utilized as the counterelectrode, thereby ensuring that this electrode does not kinetically limit the observed current. For the same reason, platinum also tends to be the primary choice for the working electrode. The other materials noted are employed as working electrodes because they either are electrocatalytic for a specific redox couple of interest, provide exceptional corrosion inertness in a specific electrolyte of interest, or present a high overpotential with respect to interfering redox couples. With
respect to this latter attribute, carbon and mercury are of interest, since both have a high overpotential for proton reduction. As such, one can access potentials significantly negative of the water redox potential in aqueous electrolyte using a carbon or mercury working electrode. Typically, potentials that are either more negative than the electrolyte reduction potential or more positive than the electrolyte oxidation potential are not accessible, since the large current associated with the electrolysis of the electrolyte masks any other currents. Carbon also presents an interesting interface for oxidation processes in aqueous electrolytes, since it also has a high overpotential for water oxidation. Mercury, on the other hand, is not useful at positive potentials, since it is not inert in this region, oxidizing to Hg²⁺. In addition to standard metal-based electrodes, a variety of cyclic voltammetric studies have been reported for conducting polymer electrodes, semiconducting electrodes, and high-Tc superconducting electrodes. An alternate basis for selecting an electrode material is its mechanical properties. For example, the mercury electrode is a liquid electrode obtained by causing mercury to flow through a glass capillary. The active electrode area is at the outlet of the capillary, where a droplet of mercury forms, expands, and eventually drops off. This has two major electrochemical consequences. First, the electrode surface is renewed periodically (which can be helpful if surface poisoning by solution species is an issue); second, the area of the electrode increases in a periodic manner. One can control (as a function of time) the electrode area and lifetime by controlling the pressure applied to the capillary. Obviously, this type of control is not available when using a solid electrode. Similarly, certain solid materials offer unique mechanical characteristics providing enhanced electrode properties.
For example, carbon electrodes are available in a variety of crystalline and bulk morphological forms. Unlike many materials, carbon is available as a fiber, a woven cloth, or a paper. The availability of carbon as a paper has recently become important in the search for new electrode materials using combinatorial materials techniques. Large-area carbon-paper electrodes can be used to hold a well-defined library of potential new electrode materials made by constructing a stoichiometric distribution of alloys of two or more materials, as described by Mallouk et al. (Reddington et al., 1998). Each alloy electrode is prepared as a small dot on a carbon-paper electrode. Each carbon-paper electrode holds a grid of >100 dots that provide a gradient of stoichiometries over the chemical phase-space being investigated. A survey marker is then needed to determine which dots are catalytic when the whole carbon sheet is used as an electrode. (In Mallouk's case a fluorescent pH indicator was used to identify electrodes that experienced a large pH shift in the electrolyte at modest potentials.) Potential new electrocatalyst formulations can then be cut out of the carbon grid and mounted as individual electrodes, for which a complete electroanalysis can then be carried out. In this manner a wide expanse of compositional phase-space can be searched for new electrocatalytic materials in a minimum amount of time. As a result, complex alloys containing three or more constituents can be evaluated as potential new electrode materials.
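A combinatorial library of the kind described above is straightforward to enumerate in code. The sketch below (an illustration only; the step size is a hypothetical choice, not taken from the Mallouk work) generates a regular ternary composition grid, of the sort that could be printed as dots on a carbon-paper electrode:

```python
from itertools import product

def ternary_grid(step=10):
    """All (a, b, c) compositions in percent with a + b + c = 100 on a regular grid."""
    pts = range(0, 101, step)
    return [(a, b, 100 - a - b) for a, b in product(pts, pts) if a + b <= 100]

# A coarse 10% step already gives 66 distinct three-component dots;
# a 5% step gives 231, comfortably exceeding the >100-dot grids described above.
print(len(ternary_grid(10)))  # prints: 66
print(len(ternary_grid(5)))   # prints: 231
```

Extending the same enumeration to four or more constituents shows why a survey marker is essential: the number of dots grows rapidly, and individually characterizing each one by cyclic voltammetry first would defeat the purpose of the combinatorial search.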
Electrolytes

The basic requirements of an electrolyte are the abilities to support the flow of current and to solubilize the redox couple of interest. In the context of a cyclic voltammetric experiment, solubility becomes central, both because of the resistance limitation expressed by Equation 9 and because of the need to maintain a diffusion-limited current. It is most useful to consider the electrolyte as composed of three components: a solvent system, a supporting electrolyte, and an electroactive species. The electroactive species is the redox couple of interest. In the case of a species in solution, the species is typically present in millimolar concentrations. Below 0.1 mM, the current associated with this species is below the sensitivity of the cyclic voltammetric experiment. At concentrations above 10 mM it is difficult to maintain a diffusion-limited system. The low concentration of electroactive material employed in the cyclic voltammetric experiment precludes the presence of sufficient charge carriers to support the anticipated current. Therefore, a salt must be added to the system to lower the electrolyte resistivity. This salt, which is typically present in the 0.1 to 1 M concentration range, is referred to as the supporting electrolyte. The supporting electrolyte cannot be electroactive over the potential range under investigation; since this substance is present in significantly higher concentration than the electroactive species, any redox activity associated with the supporting electrolyte will mask the chemistry of interest. Water, because of its high dielectric constant and solubilizing properties, is the electrolyte solvent of choice. However, organic electrolytes are often needed to solubilize specific electroactive materials or to provide a chemically nonreactive environment for the redox couple under investigation. Because of its good dielectric and solvent properties, acetonitrile is a primary organic electrolyte solvent for cyclic voltammetry.
Acetonitrile will dissociatively dissolve a variety of salts, most typically perchlorate- and hexafluorophosphate-based systems. Tetraalkylammonium salts are often the supporting electrolytes of choice in this solvent, with tetra-n-butylammonium perchlorate (which dissolves well in acetonitrile, is electrochemically inert over a large potential window, and is commercially available at high purity) being a popular option. One important limitation of acetonitrile systems is their affinity for water. Acetonitrile is extremely hygroscopic, and thus water will be present in the electrolyte unless the solvent is initially dried by reflux in the presence of a drying agent and then stored under an inert gas. Extremely dry acetonitrile can be prepared using a procedure published by Billon (1959). A variety of other solvents have been employed for specific electrochemical applications. The types of solvents available, compatible supporting electrolytes, solvent drying procedures, and solvent electrochemistry are nicely summarized by Mann (1969).
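The concentration guidance in this section can be condensed into a quick sanity check for a planned experiment. The bounds in this sketch simply restate the ranges given in the text; the function itself is an illustrative helper, not a standard tool:

```python
def check_electrolyte(c_analyte_M, c_support_M):
    """Flag concentration choices outside the ranges discussed in this section."""
    warnings = []
    if c_analyte_M < 1e-4:
        warnings.append("analyte below ~0.1 mM: current may fall below CV sensitivity")
    if c_analyte_M > 1e-2:
        warnings.append("analyte above ~10 mM: diffusion-limited condition hard to maintain")
    if not (0.1 <= c_support_M <= 1.0):
        warnings.append("supporting electrolyte outside the usual 0.1 to 1 M range")
    return warnings

# A typical recipe: 1 mM electroactive species in 0.1 M supporting
# electrolyte passes with no warnings.
print(check_electrolyte(1e-3, 0.1))  # prints: []
```

The large excess of supporting electrolyte also suppresses migration of the electroactive species in the cell's electric field, so that diffusion alone governs mass transport, as the cyclic voltammetric models assume.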
DATA ANALYSIS AND INITIAL INTERPRETATION

As noted earlier, one of the major attributes of cyclic voltammetric experiments is the ability to determine the
charge-transfer mechanism with relatively little data analysis using a pattern-recognition approach. In accord with the procedures established by Nicholson and Shain, initial data analysis is achieved by considering the response of current and peak-to-peak potential shifts as a function of the potential scan rate (Nicholson and Shain, 1964; Bard and Faulkner, 1980; Gosser, 1993). Three diagnostic plots were suggested by these authors:

A plot of the current function (ip/o^(1/2)) versus scan rate;
A plot of the peak-to-peak potential separation versus scan rate;
A plot of the ratio of the anodic peak current to the cathodic peak current versus scan rate.

These three diagnostic plots have become the cornerstones of cyclic voltammetric analysis. The current function required for the first plot is the peak current divided by the square root of the scan rate. This functional form is related to the fact that, in the absence of kinetic limitations, the rate of charge transfer is diffusion controlled. Fick's second law of diffusion introduces an o^(1/2) dependence (where o is the scan rate) into the current response. Thus, division of the peak current by o^(1/2) removes the diffusional dependence from the cyclic voltammogram's current response. Both the first and third diagnostics require measurement of peak currents. This represents a special challenge in cyclic voltammetry due to the unusual nature of the baseline. This is most easily seen from the representation in Figure 3, which shows an idealized cyclic voltammogram plotted both in its normal i-E form and as i versus t. Recall that both forms are mathematically equivalent, since E and t are linearly related by Equation 1. Note that the first wave, observed starting at a potential of Ei, has a peak current that is easily obtained (line 1), since the baseline is well established. However, it is important to realize that the peak current for the return wave is not given by line 2. This is easily seen by looking at the time-dependent plot.
Here it can be seen that the return wave has a baseline that is very different from that of the initial wave, and thus the measurement of peak current is complex. In addition, it has been noted that the baseline established for the return wave is a function of the switching potential. Several approaches have been suggested to resolve this baseline problem. Nicholson (1966) has suggested a mathematical solution that produces the true peak current based on the values of the lines shown in Figure 3A. Using this approach the ratio of peak currents is obtained as Equation 10:

ic/ia = line 1/line 2 + 0.485 (line 3/line 4) + 0.086    (10)

where ic and ia are the cathodic and anodic peak currents, respectively. A critical, and often-forgotten, aspect of the Nicholson and Shain analysis is that it is necessary to observe the diagnostics over 3 orders of magnitude in scan rate in order to come to a definitive conclusion. This is often difficult to do. While in many cases one can still reach reasonable mechanistic conclusions, the uncertainty in the conclusions increases as the scan-rate range is limited. Certainly, determination of a charge-transfer mechanism using less than 2 orders of magnitude in scan rate cannot be justified. Since a system that is chemically reactive may change with time, it is useful to randomize the order of the scan rates employed. This way, the time dependence of the system will not be confused with the scan-rate dependence.

Figure 3. (A) Ideal one-electron reversible cyclic voltammogram showing measurements from an i = 0 baseline. (B) Same as (A) but showing the time dependence of the data and identifying the true baselines for the anodic and cathodic cyclic voltammetric peaks.

An Example

One mechanism that has received much attention in recent years is the electrocatalytic mechanism. In its simplest form it can be expressed as in Equation 11:

Ox + ne⁻ ⇌ Red
Red + Z → Ox + Product (rate constant k)    (11)
The electrocatalytic mechanism involves an initial charge transfer from the electrode to Ox, followed by a bimolecular solution charge transfer between Red and a dissolved reactant (Z) having a rate constant k. This follow-up step regenerates Ox, so that the Ox/Red couple is never consumed. This mechanism is often referred to as
"mediated charge transfer." It allows one to circumvent an interfacial activation barrier for the direct oxidation of Z at the electrode. In addition, it tends to endow the reaction with a degree of specificity, since the interaction of Z with Red can be tailored by one's choice of molecular system. As such, this mechanism forms the basis for many modern electrochemical sensors. One example of this application is given in Figure 4, which demonstrates a sensor for the detection of the amino acid sarcosine as a clinical marker for kidney function. Sarcosine is unreactive at solid metal electrodes. The experiment presented here utilizes a graphite electrode and a phosphate-buffered (pH 7.0) electrolyte containing 6 × 10⁻⁶ M sarcosine oxidase (SOX), a naturally occurring enzyme that selectively oxidizes sarcosine according to the reactions shown in Equation 12. The redox mediator in this case is [Mo(CN)8]⁴⁻, which has been bonded to the electrode surface. The sensor mechanism is a variation on the mechanism shown in Equation 11. Note that in this case, the oxidized form of the mediator is the active species.

[Mo(CN)8]⁴⁻ ⇌ [Mo(CN)8]³⁻ + e⁻
[Mo(CN)8]³⁻ + SOX → [Mo(CN)8]⁴⁻ + SOX⁺
SOX⁺ + Sarcosine → SOX + Sarcosine⁺
Sarcosine⁺ → chemical products    (12)
Figure 4. Cyclic voltammograms of a graphite electrode coated with a [Mo(CN)8]⁴⁻-containing polymer in an aqueous electrolyte containing 0.05 M phosphate buffer (pH 7), 0.35 M KCl supporting electrolyte, and 6 × 10⁻⁶ M sarcosine oxidase. A scan rate of 2 mV/s was employed, with the scan being initiated at 0.00 V versus SCE and proceeding to positive potentials. Scan (A) shows a quasireversible wave for the one-electron oxidation of the Mo complex. Scan (B) is the same as (A) with the addition of 50 mM sarcosine. The increase in the anodic wave and decrease in the cathodic wave are indicative of a mediated charge-transfer mechanism. This electrode system has been used to analyze for creatinine, a species involved in kidney function.
Scan (A) in Figure 4 shows the cyclic voltammetric response of the [Mo(CN)8]4− system. Because of its low concentration, the addition of sarcosine oxidase has no perceptible effect on the cyclic voltammetric response. The electrocatalytic chain shown in Equation 12 is activated by the addition of the analyte, sarcosine, as shown in scan (B). The response is highly selective for the presence of sarcosine due to the selectivity of both the [Mo(CN)8]4− mediator and the enzyme. In addition, the system is very sensitive to low concentrations of sarcosine. The shift from a reversible system, scan (A), to an electrocatalytic system, scan (B), is readily monitored by cyclic voltammetry, nicely illustrating the qualitative utility of this technique in determining reaction mechanisms. Note that both the disappearance of the cathodic wave and the enhancement of the anodic wave are requisite for the mediation mechanism (Equation 11). Finally, it should be noted that the exact scan-rate dependence observed will be a function of k, the chemical rate constant for all coupled processes. Hence, one can obtain this quantity from a quantitative analysis of the scan-rate dependence. Thus, for chemically coupled electrochemical systems, cyclic voltammetry can be utilized as a kinetic tool for determination of the nonelectrochemical rate constant. Classically, this analysis has involved curve fitting data sets to the numerically derived functional forms of the cyclic voltammetric wave shape. This approach has been well documented and discussed from both theoretical and applied viewpoints by Gosser (1993). A more powerful kinetic analysis is available by comparison of data sets to digitally simulated cyclic voltammograms. Until recently this approach required extensive programming skills.
However, digital simulation programs are now commercially available that allow one to import a cyclic voltammetric data set and correlate it with a simulated voltammogram based on a proposed mechanism (Feldberg, 1969; Speiser, 1996). Both Princeton Applied Research (PAR) EG&G and BioAnalytical Systems (BAS) offer digital simulators that interface with their data-collection software or operate on a freestanding PC data file. Such programs run on personal computers and bring high-powered kinetic and mechanistic cyclic voltammetric analysis to all users. The power of cyclic voltammetry, however, still resides in the fact that mechanistic and kinetic/thermodynamic data for a wide variety of systems can be directly obtained with little effort from raw data sets via the pattern-matching processes highlighted in this review.

Another Example

The cyclic voltammetric response of a real system is shown in Figure 5. The inorganic compound under study is shown in Figure 5A. The molecular system consists of two iron sites sandwiching a central Pt(IV). Each of the iron sites is composed of a ferrocyanide unit, [Fe(CN)6]4−, a well-studied, one-electron charge-transfer couple. The compound is held together by two bridging cyanide (CN) ligands. The study presented here involved an aqueous electrolyte composed of 0.1 M NaNO3 and 2.5 mM electroactive complex. A platinum working electrode and an SCE reference
CYCLIC VOLTAMMETRY
Figure 5. (A) Model of {[Fe(II)(CN)6][Pt(IV)(NH3)4][Fe(II)(CN)6]}4−. (B) A cyclic voltammogram of the complex in part (A) taken at 100 mV/s using a platinum working electrode. The scan initiates at 0.0 V versus SCE and proceeds in the positive direction. (C) Shift in anodic peak potential as scan rate is increased, showing reversible behavior.

When Jph ≫ J0, as is generally the case, the 1 in Equation 22 can be neglected. We then obtain

J = Jph − J0 exp(qE/kT)    (23)
We can then define the open-circuit voltage Voc as the absolute value of the voltage present when no net current flows and obtain

Voc = (kT/q) ln(Jph/J0)    (24)

Basic Electrochemical Cell Design. Current density–potential data for semiconductor electrodes are typically obtained using a potentiostat (Fig. 3). This instrument ensures that the measured J-E properties are characteristic of the semiconductor-liquid interface and not of the counter electrode–liquid contact that is needed to complete the electrical circuit in the electrochemical cell. A three-electrode arrangement consisting of a semiconductor (working) electrode, a counter electrode, and a reference electrode is typically used to acquire data. The potentiostat uses feedback circuitry and applies the voltage needed between the working electrode and counter electrode to obtain the desired potential difference between the working and reference electrodes. The potentiostat then records the current flowing through the working electrode–counter electrode circuit at this specific applied potential. Nominally, no current flows through the reference electrode, which only acts as a point of reference for the system. The scan rate should be 50 mV s^-1 or slower in order to minimize hysteresis arising from diffusion of species to the electrode surface during the J-E scan. The electrochemical data are collected directly as the current vs. the applied potential. Electrode areas are, of course, needed to obtain current densities from the measured values of the current. The projected geometric area of the electrode is usually obtained by photographing the electrode and a microruler simultaneously under a microscope and digitally integrating the area defined by the exposed semiconductor surface.
This voltage is significant in the field of solar energy conversion, as it represents the maximum free energy that can be extracted from a semiconductor-liquid interface. Equation 24 brings out several important features of the open-circuit voltage. First, Voc increases logarithmically with the light intensity, because Jph is linearly proportional to the absorbed photon flux. Second, the open-circuit voltage of a system increases (logarithmically) as J0 decreases. Chemically, such behavior is reasonable, because J0 represents the tendency for the system to return to charge transfer equilibrium. Third, Equation 24 emphasizes that a mechanistic understanding of J0 is crucial to controlling Voc. Only through changes in J0 can systematic, chemical control of Voc be established for different types of semiconductor-liquid junctions. Another parameter that is often used to describe illuminated semiconductor-liquid junctions is the short-circuit photocurrent density Jsc. Short-circuit conditions imply V = 0. From Equation 22, the net current density at short circuit (Jsc) equals Jph. The short-circuit current density provides a measure of the collection efficiency of photogenerated carriers.
Figure 3. Circuit consisting of a simple potentiostat and an electrochemical cell. A potential is set between the working and reference electrode, and the current flow from the counter electrode to the working electrode is measured.
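The diode-law relations above (Equations 22 to 24) are easy to explore numerically. The sketch below is illustrative only; the values chosen for J0 and Jph are assumptions, not data from the text.

```python
import math

# Thermal voltage kT/q at room temperature (~298 K), in volts
KT_Q = 0.0257

def open_circuit_voltage(j_ph, j_0):
    """Equation 24: Voc = (kT/q) ln(Jph/J0), valid when Jph >> J0."""
    return KT_Q * math.log(j_ph / j_0)

# Illustrative current densities in A/cm^2 (not taken from the text).
j_0 = 1e-9  # equilibrium exchange current density
for j_ph in (1e-3, 1e-2, 1e-1):  # Jph scales linearly with light intensity
    print(f"Jph = {j_ph:.0e} A/cm^2 -> Voc = {open_circuit_voltage(j_ph, j_0):.3f} V")
```

Each tenfold increase in light intensity raises Voc by only (kT/q) ln 10, about 59 mV at room temperature, which is the logarithmic dependence Equation 24 predicts.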
Reference Electrodes. Reference electrodes are constructed according to conventional electrochemical protocols. For example, two types of reference electrodes are an aqueous (or nonaqueous) saturated calomel electrode (SCE) and a nonaqueous ferrocenium-ferrocene electrode. A simple SCE can be constructed by first sealing a platinum wire through one leg of an H-shaped hollow glass structure. The platinum wire is then covered with mercury, and a ground mixture of approximately equal amounts of mercury and calomel (Hg2Cl2) dispersed into a small amount of saturated potassium chloride solution is then placed on top of the mercury. The remainder of the tube is filled with saturated potassium chloride solution, and the other leg of the structure, which contacts the solution, is capped with a fritted plug. Prior to use, the nonaqueous SCE should be calibrated against a reference electrode with a known potential, such as an aqueous SCE prepared in the same fashion. For work in nonaqueous solvents, a convenient reference electrode is the ferrocenium-ferrocene reference. This electrode consists of a glass tube with a fritted plug at the bottom. The tube is filled with a ferrocene-ferrocenium-electrolyte solution made using the same solvent and electrolyte that is to be used in the electrochemical experiment. A platinum wire is inserted into the top of the solution to provide for a stable reference potential measurement. When both forms of the redox couple are present in the electrochemical cell, an even simpler procedure can be used to construct a reference electrode. A platinum wire can be inserted into the electrolyte–redox couple solution or into a Luggin capillary that is filled with the electrolyte–redox couple solution (see Luggin capillaries, below). This wire then provides a stable reference potential that is equal to the Nernstian potential of the electrochemical cell. At any convenient time, the potential of this reference can be determined vs. 
another reference electrode, such as an SCE, through insertion of the SCE into the cell. This approach not only is convenient but also is useful when water and air exposure is to be minimized, as is the case for reactive semiconductor surfaces in contact with deoxygenated nonaqueous solvents.

Luggin Capillaries. The cell resistance can be reduced by minimizing the distance between the working and reference electrodes. These small distances can be achieved through the use of a Luggin capillary as a reference electrode. The orifice diameter of the capillary should generally be about 0.1 mm. A convenient method to form such a structure is to pull a disposable laboratory pipette under a flame and then to use a caliper to measure and then break the pipette glass at the point that corresponds to the desired orifice radius. The pipette is then filled with the reference electrode solution of interest, and the flow of electrolyte out of the pipette is minimized by capping the top of the pipette with a rubber septum. The contact wire is then inserted through the septum and into the electrolyte. Under some conditions, a syringe needle connected to an empty syringe can be inserted through the septum to facilitate manipulation of the pressure in the head space of the pipette. This procedure can be used to minimize mixing
between the solution in the pipette and the solution in the electrochemical cell.

Illumination of Semiconductor-Liquid Contacts

Monochromatic Illumination. Low-intensity monochromatic illumination can be obtained readily from a white light source and a monochromator. This is useful for obtaining spectral response data to measure the diffusion length or the optical properties of the semiconductor electrode, as described in more detail under Measurement of Semiconductor Band Gaps Using Semiconductor-Liquid Interfaces. Laser illumination can also be used to provide monochromatic illumination. However, care should be taken to diffuse the beam such that the entire electrode surface is as uniformly illuminated as possible. Because the photovoltage is a property of the incident light intensity, careful measurement of the photovoltage requires maintaining a uniform light intensity across the entire electrode surface. This protocol has not been adhered to in numerous measurements of the J-E properties of semiconductor electrodes, and the photovoltages quoted in such investigations are therefore approximate values at best. To control the light intensity from the laser, neutral density filters can be used to attenuate the incident beam before it strikes the electrode surface. Regardless of whether the monochromatic light is obtained from a white light–monochromator combination or from a laser, measurement of the incident photon power is readily achieved with pyranometers, photodiodes, thermopiles, or other photon detectors that are calibrated in their response at the wavelengths of interest.

Polychromatic Illumination. For polychromatic illumination, solar simulators provide the most reproducible laboratory method for measuring J-E properties under standard, ‘‘solar-simulated’’ illumination. A less expensive method is to use tungsten-halogen ELH-type projector bulb lamps.
However, their intensity-wavelength profile, like that of almost any laboratory light source, is not very well matched to the solar spectrum observed at the surface of the earth. Calibration of the light intensity produced by this type of source should not be done with a spectrally flat device such as a thermopile. Since laboratory sources typically produce more photons in the visible spectral region than does the sun at the same total illumination power, maintaining a constant power from both illumination sources tends to yield higher photocurrents, thus producing overestimates in efficiency of photoelectrochemical cells in the laboratory, relative to their true performance under an actual solar spectral distribution in the field. An acceptable measurement method instead involves calibration of the effective incident power produced by the laboratory source through use of a photodetector whose spectral response characteristics are very similar to that of the photoelectrochemical cell of concern. Preferably, the response properties of the photodetector are linear with light intensity and the absolute response of the detector is known under a standard solar spectral
SEMICONDUCTOR PHOTOELECTROCHEMISTRY
distribution and illumination power. The absolute detector response can be obtained either by measurements performed under a calibrated solar simulator or by measurement of the output of the detector in actual sunlight. If sunlight is used, another calibration is then required to determine the actual solar power striking the plane of the detector at the time of the measurement. Useful primary or secondary reference detectors for this purpose are silicon cells that have been calibrated on balloon flights by NASA. Spectrally flat radiometers, such as those produced by Eppley for the purpose of measuring the absolute spectral power striking a specific location on the surface of the earth, are also useful for determining the solar power under conditions used to calibrate the detector to be used in the laboratory. If these primary or secondary reference detectors are routinely available, it is of course also possible to determine the J-E properties of the photoelectrochemical cell directly in sunlight, as opposed to having to establish a reference detector measurement and then recalibrate a laboratory light source to produce the equivalent effective spectral power for the purposes of the measurement.

Effects of Cell Configuration. Under potentiostatic control, the concentrations of both forms of the redox species need not be as high as might be required to sustain identical performance in an actual, field-operating photoelectrochemical cell. This occurs because a two-electrode photovoltaic-type cell configuration requires sufficiently high concentrations of both forms of the redox couple dissolved in the solution to suppress mass transport limitations without mechanical stirring of the electrolyte. In a three-electrode cell with an n-type semiconductor electrode, the primary consideration is that sufficient redox donor be present such that the anodic current is limited by the light intensity and not by the mass transport of donor to the electrode surface.
A high concentration of redox acceptor is not required to achieve electrode stability and often is undesirable when the oxidized form of the redox material absorbs significantly in the visible region of the spectrum. The concentration overpotential that results from a low concentration of electron acceptor in the electrolyte can be assessed and corrected for analytically using Equation 26 below. In contrast, the performance of an actual energy conversion device using a two-electrode cell configuration is so dependent on the properties of the working electrode, the counter electrode, the electrolyte, the cell thickness, and the properties of the various optical interfaces in the device that many design trade-offs are involved and are unique to a particular cell configuration used in the device assessment. Emphasis here has been placed on determining the properties of the semiconductor electrode in ‘‘isolation,’’ using a potentiostat, so that a comparison from electrode to electrode can be performed without considering the details of the device configuration used in each measurement.

Data Analysis and Initial Interpretation

Typical Raw Current–Potential Data. A representative example of a current-potential curve is shown in Figure 4,
Figure 4. Representative example of current-voltage data for a semiconductor-liquid interface. The system consists of a silicon electrode in contact with a methanol solution containing lithium chloride and oxidized and reduced forms of benzyl viologen. In this example, the current has been divided by the surface area of the electrode, yielding a current density as the ordinate. The curve has not been corrected for cell resistance or concentration overpotential.
which displays data for an n-type silicon electrode in contact with a redox-active methanol solution. In this figure, the current has been divided by the surface area of the electrode to allow for quantitative analysis of the data. To extract meaningful results from the current-potential curve, it is also necessary to perform corrections for concentration overpotential and solution resistance, as discussed below.

Corrections to J-E Behavior: Concentration Overpotentials. Attention to electrochemical cell design is critical to minimize concentration overpotentials, mass transport restrictions on the available current density, and uncompensated resistance drops between the working and reference electrodes. Even with good cell design, in nonaqueous solvents the J-E curves must generally be corrected for concentration overpotential losses as well as for uncompensated ohmic resistance losses to obtain the inherent behavior of the semiconductor-liquid contact. To minimize mass transport limitations on the current, the electrolyte should be vigorously stirred during the J-E measurement. For a given redox solution, the limiting anodic current density Jl,a and the limiting cathodic current density Jl,c should be determined using a platinum foil electrode placed in exactly the same configuration as the semiconductor working electrode. The areas of the two electrodes should also be comparable. If the redox couple is known to be electrochemically reversible, the platinum-electrode data can then be used to obtain the cell parameters needed to perform the necessary corrections to the J-E data of the semiconductor electrode (see Steady-State J-E Data to Determine Kinetic Properties of Semiconductor-Liquid Interfaces). Alternatively, the semiconductor electrode can be fabricated into a disk configuration, and can be rotated in the electrolyte. Under these conditions, the mass transport
parameters can be determined analytically, and the limiting current density (Bard et al., 1980) is

Jl,c = 0.620 n F D0^(2/3) ωrde^(1/2) ν^(-1/6) [A]b    (25)
where F is Faraday's constant, D0 is the diffusion coefficient, ωrde is the angular velocity of the electrode, ν is the kinematic viscosity of the solution (0.01 cm^2 s^-1 for dilute aqueous solutions near 20°C), and [A]b is the bulk concentration of oxidized acceptor species. A similar equation yields the limiting anodic current density based on the parameters for the reduced form of the redox species. This procedure allows control over the limiting current densities instead of merely measuring their values in a mechanically stirred electrolyte solution. Laminar flow typically ceases to exist above Reynolds numbers (defined as the product of ω and the disk radius of the electrode divided by ν) of 2 × 10^5 (Bard et al., 1980), so for electrode radii of 1 mm, this corresponds to a typical upper limit on the rotation velocity of 1 to 2 × 10^7 rpm. Beyond this limit, Equation 25 does not describe the mass transport to the electrode. Smaller electrodes can increase this limit on ω, but use of smaller electrodes is generally not advisable, because edge effects become important and can distort the measured electrochemical properties of the solid-liquid contact by hindering diffusion of minority carriers and allowing recombination at the edges of the semiconductor crystal. Once the limiting current densities and the J-E data are collected for a reversible redox system at a metal electrode, the concentration overpotential ηconc can be determined (Bard et al., 1980):

ηconc = (kT/nq) [ln((Jl,a − J)/Jl,a) − ln((Jl,c − J)/Jl,c)]    (26)
These values can then be used to correct the data at a semiconductor electrode to yield the proper J-E dependence of the solid-liquid contact in the absence of such concentration overpotentials.

Corrections to J-E Behavior: Series Resistance Overpotentials. Even with good cell design, measurement of the cell resistance Rsoln is required to perform another correction to the J-E data. Values for Rsoln can be extracted from the real component of the impedance in the high-frequency limit of Nyquist plots (see ELECTROCHEMICAL TECHNIQUES FOR CORROSION QUANTIFICATION) for the semiconductor electrode or can be determined from steady-state measurements of the ohmic polarization losses of a known redox couple at the platinum electrode. In the former method, Rsoln is simply taken as the real part of the impedance in the high-frequency limit of the Nyquist plot. In the latter method, the current-potential properties of a platinum electrode are determined under conditions where the platinum electrode is in an identical location to that of the semiconductor electrode. After correction of the data for concentration polarization, Rsoln can be obtained from the inverse slope of the platinum current-potential data near the equilibrium potential of the solution.
The final corrected potential Ecorr is then calculated from E, the concentration overpotential ηconc, Rsoln, and the current I using (Bard et al., 1980)

Ecorr = E − ηconc − IRsoln    (27)
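Equations 25 to 27 chain together into a simple correction routine. The sketch below assumes the reconstructed forms given above, and every numerical input (diffusion coefficient, rotation rate, electrode area, cell resistance) is an illustrative assumption, not a value from the text.

```python
import math

KT_Q = 0.0257   # thermal voltage kT/q at ~298 K, in volts
F = 96485.0     # Faraday constant, C/mol

def levich_limiting_current(n, D, omega, nu, conc):
    """Equation 25 (Levich): Jl = 0.620 n F D^(2/3) omega^(1/2) nu^(-1/6) [A]b.
    D and nu in cm^2/s, omega in rad/s, conc in mol/cm^3; returns A/cm^2."""
    return 0.620 * n * F * D ** (2.0 / 3.0) * math.sqrt(omega) * nu ** (-1.0 / 6.0) * conc

def concentration_overpotential(j, jla, jlc, n=1):
    """Equation 26: eta_conc = (kT/nq)[ln((Jl,a - J)/Jl,a) - ln((Jl,c - J)/Jl,c)]."""
    return (KT_Q / n) * (math.log((jla - j) / jla) - math.log((jlc - j) / jlc))

def corrected_potential(e, j, jla, jlc, area, r_soln, n=1):
    """Equation 27: Ecorr = E - eta_conc - I*Rsoln, with I = J * electrode area."""
    return e - concentration_overpotential(j, jla, jlc, n) - j * area * r_soln

# Illustrative: 5 mM one-electron couple, 1000 rpm rotating disk,
# aqueous solution (nu ~ 0.01 cm^2/s), D ~ 5e-6 cm^2/s.
omega = 1000 * 2 * math.pi / 60  # rpm -> rad/s
jl = levich_limiting_current(1, 5e-6, omega, 0.01, 5e-6)
print(f"limiting current density ~ {jl * 1e3:.2f} mA/cm^2")

# Correct one (E, J) point measured at a 0.1 cm^2 electrode with Rsoln = 50 ohm.
e_corr = corrected_potential(e=0.30, j=0.5 * jl, jla=jl, jlc=-jl, area=0.1, r_soln=50.0)
print(f"corrected potential = {e_corr:.3f} V")
```

Note that Jl,c is entered as a negative number, following the sign convention implicit in Equation 26.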
The measured value of I is divided by the projected geometric area of the electrode and plotted vs. Ecorr to obtain a plot of the J-E behavior of the desired semiconductor-liquid contact.

Measurement of Jsc and Voc of Illuminated Semiconductor-Liquid Contacts. The open-circuit photovoltage and short-circuit photocurrent should be measured directly using four-digit voltmeters connected to the photoelectrochemical cell as opposed to estimating their values from a time-dependent scan of the J-E data. This steady-state measurement eliminates any bias that might arise due to the presence of hysteresis in the current-potential behavior. Also, in some cases, the light-limited photocurrent is not reached at short circuit; in this case, both the light-limited photocurrent value and the short-circuit photocurrent value are of experimental interest and should be measured separately.

Sample Preparation

Electrodes for photoelectrochemical measurements should be constructed to allow exposure of the front face of the semiconductor to the solution while providing concealment of the back contact and of the edges of the electrode. This is readily accomplished using an insulating material that is inert toward both the etchant and the working solution of interest. The area of the electrode should be large enough to allow ready measurement of the bulk surface area but should be small enough to limit the total current flowing through the electrochemical cell (because larger currents require larger corrections for the cell resistance). Because of these trade-offs, electrode areas are typically 0.1 to 1 cm^2. Ohmic contacts vary widely between semiconductors, and several books are available for identifying the ohmic contact of choice for a given semiconductor (Willardson et al., 1981; Pleskov et al., 1986; Finklea, 1988).
Although most ohmic contacts are prepared by evaporating or sputtering a metal on the back surface of the semiconductor, some semiconductors are amenable to more convenient methods such as using a scribe to rub a gallium-indium eutectic on the back surface of the solid. This latter procedure is commonly used to make an ohmic contact to n-type silicon. The quality of an ohmic contact can be verified by making two contacts, separated by a contact-free region, on one side of an electrode and confirming that there is only a slight resistance between these contacts as measured by a J-E curve collected between these contact points. The proper choice of a chemical etch depends on the semiconductor, its orientation, and the desired surface properties. Generally, an ideal etchant produces an atomically smooth surface with no electrical surface defects. Fluoride-based etches are most commonly used with
silicon: a 40% (w/w) ammonium fluoride solution is well suited for (111)-oriented Si and a solution of HF is appropriate for (100)-oriented Si (Higashi et al., 1991, 1993). For many III–V semiconductors, an etch in 0.05% (v/v) Br2 followed by a rinse in a solution of NH4OH produces abrupt discontinuities in the dielectric at the solid-air interface (Aspnes et al., 1981). An exhaustive literature provides information on additional etches for these and other semiconductors, and the reader is referred to these references for further information (Wilson et al., 1979; Aspnes et al., 1981; Higashi et al., 1993). There are also several published reports of effective dry etching methods (Higashi et al., 1993; Gillis et al., 1997). Because many semiconductors are reactive in aerobic environments, it is often necessary to carry out experiments under anaerobic conditions using nonaqueous solvents. Air-sensitive experiments can be performed in specialized glassware that is continuously purged with an inert gas or in an inert-atmosphere glove box specially modified for electrical feedthroughs. Although outlined here for current-voltage measurements, the electrode preparation techniques are applicable not only to these measurements but also to most other techniques discussed in this unit.
MEASUREMENT OF SEMICONDUCTOR BAND GAPS USING SEMICONDUCTOR-LIQUID INTERFACES

Principles of the Method

The basic photoelectrochemical techniques discussed above can be used to determine many important properties of the semiconductor and of the semiconductor-liquid contact. Monitoring the wavelength dependence of the photocurrent produced at a semiconductor-liquid contact can provide a nondestructive, routine method for determining some important optical properties of a semiconductor. Specifically, the value of the band gap energy and whether the electronic transition is optically allowed or forbidden can be obtained from measurement of the spectral response of the photocurrent at a semiconductor-liquid contact. Two types of optical transitions are commonly observed for semiconductors: direct gaps and indirect gaps. Near the absorption edge, the absorption coefficient α can be expressed as (Pankove, 1975; Sze, 1981; Schroder, 1990)

α ∝ (hν − Eg)^b    (28)

where h is Planck's constant, ν is the frequency of the light incident onto the semiconductor, and b is the coefficient for optical transitions. The absorption coefficient is obtained from Beer's law, in which the ratio of transmitted, Φ, to incident, Φ0, photon flux for a sample of thickness d is (Pankove, 1975)

Φ/Φ0 = exp(−αd)    (29)
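Equation 29 inverts directly to give α from a transmittance measurement. A minimal sketch (the film thickness and transmittance are illustrative assumptions, not values from the text):

```python
import math

def absorption_coefficient(transmitted, incident, thickness_cm):
    """Invert Beer's law (Equation 29): alpha = -ln(Phi/Phi0)/d, in cm^-1."""
    return -math.log(transmitted / incident) / thickness_cm

# Illustrative: 10% of the incident light passes through a 1-micron (1e-4 cm) film.
alpha = absorption_coefficient(0.10, 1.0, 1e-4)
print(f"alpha = {alpha:.3g} cm^-1")
```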
For optically allowed, direct gap transitions, b = 1/2, whereas for indirect, optically forbidden transitions, b = 2 (Sze, 1981). Typically, α is determined through measurements of the extinction of an optical beam through a known thickness of the semiconductor sample. Beer's law is applied to solve for α from the measured ratio of the incident and transmitted intensities of light through the sample at each wavelength of interest. For a direct band gap semiconductor, a plot of α^2 vs. photon energy will give the band gap as the x intercept. If the semiconductor has an indirect band gap, a plot of α^(1/2) vs. photon energy will have two linear regions, one corresponding to absorption with phonon emission and one corresponding to absorption with phonon capture. The average of the two intercepts is the energy of the indirect band gap. One-half the difference between the two intercepts is the phonon energy emitted or captured during the band gap excitation. An alternative method for determining the absorption coefficient vs. the wavelength is to determine the real and imaginary parts of the refractive index from reflectance, transmittance, or ellipsometric data and then to use the Kramers-Kronig relationship to determine the absorption coefficient of the solid over the wavelength region of interest (Sze, 1981; Lewis et al., 1989; Adachi, 1992).

Practical Aspects of the Method

The photocurrent response at a semiconductor-liquid interface can also be used to determine α as a function of wavelength and thus to determine Eg and b for that semiconductor. This method avoids having to make tedious optical transmission measurements of a solid sample in a carefully defined optical configuration. According to the Gärtner equation, the photocurrent is given as (Sze, 1981; Lewis et al., 1989; Schroder, 1990)

Iph = qΦ0(1 − R*)[1 − exp(−αW)/(1 + αL)]    (30)
where L is the minority carrier diffusion length and R* is the optical reflectivity of the solid. In a semiconductor sample with a very short minority carrier diffusion length and with αW ≪ 1, this equation simplifies to (Sze, 1981; Schroder, 1990)

Iph = qΦ0(1 − R*)(αW)   for αL ≪ 1    (31)
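With a set of α values in hand (from Equation 29 or Equation 31), the direct-gap construction described under Principles of the Method reduces to a straight-line fit whose x intercept is Eg. The sketch below uses synthetic data for a hypothetical direct gap semiconductor; the assumed Eg and prefactor are illustrative, not from the text.

```python
# Direct-gap analysis: fit alpha^2 vs photon energy; the x intercept is Eg
# (Equation 28 with b = 1/2, so alpha^2 is linear in hv above the gap).
def band_gap_from_direct_fit(energies_eV, alphas):
    """Least-squares line through (hv, alpha^2); returns -intercept/slope = Eg."""
    xs = list(energies_eV)
    ys = [a * a for a in alphas]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
            sum((x - mean_x) ** 2 for x in xs)
    intercept = mean_y - slope * mean_x
    return -intercept / slope  # where the fitted line crosses alpha^2 = 0

# Synthetic data obeying alpha = C (hv - Eg)^(1/2) with an assumed Eg = 1.40 eV.
Eg_true, C = 1.40, 1.0e4
hv = [1.45, 1.50, 1.55, 1.60, 1.65]          # photon energies, eV
alpha = [C * (e - Eg_true) ** 0.5 for e in hv]  # absorption coefficients, cm^-1
print(f"recovered Eg = {band_gap_from_direct_fit(hv, alpha):.3f} eV")
```

The same fitting routine, applied to α^(1/2) instead of α^2, yields the two linear regions of an indirect gap material.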
Under these conditions, α can be measured directly from the photocurrent at each wavelength. These values can then be plotted against the photon energy to determine the band gap energy and transition profile, direct or indirect, for the semiconductor under study. The only parameter that needs to be controlled experimentally is Φ0 at the various wavelengths of concern. Methods for determining and controlling Φ0, which are common to this method and to other methods that use illuminated semiconductor-liquid contacts, are described in detail under Diffusion Length Determination Using Semiconductor-Liquid Contacts.

Data Analysis and Initial Interpretation

Figure 5 shows the spectral dependence of the photocurrent at an n-type MoS2 electrode (Tributsch et al., 1977). The principal analysis step in determining the band gap energy or any other intrinsic parameter, such as the
diffusion length, from this spectrum depends on accurately transforming the wavelength-dependent photocurrent information to a corresponding absorption coefficient. With a table of absorption coefficients, photocurrent densities, and wavelengths, one can plot the absorption coefficient vs. photon energy and, by application of the methods described above, extract the band gap energy. The use of photocurrents to determine minority carrier diffusion lengths is discussed under Diffusion Length Determination Using Semiconductor-Liquid Contacts, below, and the subtleties of transforming this photocurrent data into absorption coefficients are illustrated.

Figure 5. Spectral response of MoS2 photocurrents (n type) in the anodic saturation region. (Reprinted with permission from Tributsch et al., 1977.)

DIFFUSION LENGTH DETERMINATION USING SEMICONDUCTOR-LIQUID CONTACTS

Principles of the Method

The minority carrier diffusion length L is an extremely important parameter of a semiconductor sample. This quantity describes the mean length over which photogenerated minority carriers can diffuse in the bulk of the solid before they recombine with majority carriers. The value of L affects the bulk diffusion-recombination limited Voc and the spectral response properties of the solid. The ASTM (American Society for Testing and Materials) method of choice for measurement of diffusion length is the surface photovoltage method. A conceptually similar methodology can, however, be used when a liquid provides the electrical contact to the semiconductor. Use of a semiconductor-liquid contact has the advantage of allowing a reproducible analysis of the surface condition as well as control over the potential across the semiconductor during the experiment. In either mode, the method works well only for silicon and other indirect gap materials. We assume that the semiconductor is n type, so we are therefore interested in measuring the diffusion length of holes, Lp. Simplification of the Gärtner equation, assuming that αW ≪ 1, yields the following expression for the wavelength dependence of the open-circuit photovoltage (Sze, 1981; Lewis et al., 1989; Schroder, 1990):

Voc ∝ [1 + (Lp α)^(-1)]^(-1)    (32)

Thus, provided that Lp ≫ W and α^(-1) ≫ W, a plot of (Voc)^(-1) vs. α^(-1) will yield a value of the inverse of the slope that is equal to Lp. An additional check on the data is that the x intercept of the plot should equal −Lp (Schroder, 1990). Measurement of Lp using the surface photovoltage method in air requires that the semiconductor be capacitively coupled through an insulating dielectric such as mica to an optically transparent conducting electrode. However, for a semiconductor-liquid contact, either the photovoltage or photocurrent can be determined as a function of wavelength λ, and in this implementation the method is both nondestructive and convenient.
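The analysis implied by Equation 32 is again a straight-line fit: since (Voc)^-1 is proportional to 1 + (Lp α)^-1, a line through (Voc)^-1 (or any quantity proportional to it) vs. α^-1 gives Lp as the ratio of intercept to slope, with the x intercept at −Lp. A sketch with synthetic data for an assumed Lp (all values illustrative, not from the text):

```python
# Surface-photovoltage-style analysis (Equation 32): fit a line to
# (Voc)^-1 vs alpha^-1 and recover Lp = intercept/slope.
def diffusion_length(inv_alphas_cm, inv_vocs):
    xs, ys = list(inv_alphas_cm), list(inv_vocs)
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
            sum((x - mean_x) ** 2 for x in xs)
    intercept = mean_y - slope * mean_x
    return intercept / slope

# Synthetic data with an assumed hole diffusion length Lp = 20 microns (2e-3 cm).
Lp_true = 2e-3
inv_alpha = [5e-4, 1e-3, 2e-3, 4e-3]                 # alpha^-1 values, cm
inv_voc = [1.0 + ia / Lp_true for ia in inv_alpha]   # proportional to (Voc)^-1
print(f"recovered Lp = {diffusion_length(inv_alpha, inv_voc) * 1e4:.1f} microns")
```

Because the data enter only through the intercept-to-slope ratio, the unknown proportionality constant in Equation 32 drops out of the recovered Lp.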
Practical Aspects of the Method

Both the surface photovoltage method and the determination of the optical properties of the semiconductor require accurate measurement of the wavelength dependence of the quantum yield for carrier collection at the semiconductor-liquid contact. The quantum yield Φ(λ) is the ratio of the rate at which a specimen forces electrons through an external circuit, I(λ)/q, to the rate at which photons are incident upon its surface:

Φ(λ) = [I(λ)/q] / [Γ0(λ) As]   (33)
In this equation, Γ0(λ) represents the flux of monochromatic light, which is assumed to be constant over the area of the specimen, As. Typically, the quantum yield is measured at short circuit. Commercial silicon photodiodes have very stable quantum yields of >0.7 throughout most of the visible spectrum, making them a nearly ideal choice for a calibration reference. The quantum yield of the experimental specimen can be calculated as follows:

Φcell(λ) = Φref(λ) [Icell(λ) As,ref] / [Iref(λ) As,cell]   (34)

Experimentally, the excitation monochromator is scanned to record Icell(λ); then the experimental specimen is replaced with the reference and Iref(λ) is recorded. A significant source of error in this method, however, is the drift in the intensity of the light source over time, which can affect both Icell(λ) and Iref(λ). A preferred experimental setup is shown in Figure 6. In this arrangement, the ratio of the photocurrent response of the experimental specimen to that of an uncalibrated photodiode, Icell(λ)/Iuncal(λ), is recorded as the experimental variable. It is not necessary to know the quantum yield or area of the uncalibrated photodiode, which merely acts to calibrate the light intensity at each wavelength. The geometry of the uncalibrated photodiode with respect to the light source and the pick-off plate should be arranged such that the surface of the uncalibrated diode is
SEMICONDUCTOR PHOTOELECTROCHEMISTRY
Although various equations have been proposed for the relationship between a and l for silicon, the ASTM standard recommends the following for a stress-relieved, polished silicon wafer (Schroder, 1990):
Figure 6. Spectral response measurement system consisting of a white light source, a monochromator, a beam splitter, and a calibrated photodiode.
illuminated by a small fraction of the light that is incident on the main specimen. If t(λ) is the ratio of the light diverted to the uncalibrated photodiode relative to that which reaches the main specimen, then

Iuncal(λ) = q Φuncal(λ) t(λ) Γ0(λ) As,uncal   (35)
The photocurrent response of the calibrated photodiode relative to that of the same uncalibrated photodiode, Iref(λ)/Iuncal(λ), must also be determined. When Icell(λ) and Iref(λ) in Equation 34 are replaced with the ratios Icell(λ)/Iuncal(λ) and Iref(λ)/Iuncal(λ), respectively, the unknown terms Φuncal(λ) and As,uncal divide out.

In the foregoing discussion, we have assumed that the excitation light is continuous, e.g., furnished by a xenon arc lamp or a tungsten filament source coupled to a monochromator. Pulsed light sources can also be used, and the most convenient method is to have the excitation beam interrupted by a mechanical chopper, with a lock-in amplifier used to monitor the photocurrent. If the response time of the specimen is short compared to the chopping period, then this method yields exactly the same information as the DC experiment. However, lock-in detection offers a substantial advantage in the signal-to-noise ratio, which can be important at low light and/or low current levels. Also, for experiments that are not carried out at short circuit, but at some applied bias, this method provides for an automatic subtraction of the dark current. In addition, when a lock-in detection method is used, it is possible to measure the differential quantum yield of a signal that is superimposed on top of a DC illumination source. This information is very useful for materials that have quantum yields that are dependent on the incident light intensity. Finally, when the response time of the system is on the same order as the chopping period, variation of the photocurrent with the chopping period and analysis of the data in the frequency domain can yield detailed information about the kinetic processes occurring in the semiconductor bulk and at the semiconductor-liquid interface (Peter, 1990).
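The calibration arithmetic of Equations 34 and 35 can be sketched in a few lines of Python. This is an illustrative sketch only: the function name and all numerical values are hypothetical, not taken from the text.

```python
def quantum_yield(ratio_cell, ratio_ref, phi_ref, area_ref, area_cell):
    """Specimen quantum yield from photocurrent ratios (Equation 34).

    ratio_cell -- I_cell(lambda)/I_uncal(lambda), recorded with the specimen
    ratio_ref  -- I_ref(lambda)/I_uncal(lambda), recorded with the reference
    phi_ref    -- known quantum yield of the calibrated reference diode

    Because both currents are ratioed against the same uncalibrated diode,
    its unknown quantum yield and area divide out (cf. Equation 35).
    """
    return phi_ref * (ratio_cell / ratio_ref) * (area_ref / area_cell)


# Illustrative numbers only: equal areas, reference yield 0.75
phi_cell = quantum_yield(ratio_cell=0.40, ratio_ref=0.50, phi_ref=0.75,
                         area_ref=0.10, area_cell=0.10)
print(round(phi_cell, 3))  # 0.6
```

In practice this calculation would be repeated at each wavelength of the monochromator scan.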
Data Analysis and Initial Interpretation To determine the diffusion length of a solid using photoelectrochemical methods, it is important to have an accurate relationship between the excitation wavelength and the absorption coefficient of the solid, because any error in this relationship leads to the calculation of incorrect values for the diffusion length in the analysis of the experimental data.
α = 5263.67 − 11442.5λ⁻¹ + 5853.68λ⁻² + 399.58λ⁻³   (36a)

where the wavelength is in units of micrometers and the absorption coefficient is in units of reciprocal centimeters. This relationship is valid for the wavelength range of 0.7 to 1.1 µm, which is typically the range used for determining diffusion lengths. For a non-stress-relieved silicon wafer, the relationship is

α = −10696.4 + 33498.2λ⁻¹ − 36164.9λ⁻² + 13483.1λ⁻³   (36b)

Recent expressions have also been developed that accurately fit published absorption data for GaAs and InP. These expressions are

α = (286.5λ⁻¹ − 237.13)²   (36c)

for GaAs in the 0.75- to 0.87-µm wavelength range and

α = (252.1λ⁻¹ − 163.2)²   (36d)

for InP in the 0.8- to 0.9-µm wavelength range. For quantitative results, the wavelength dependence of the optical reflectivity of the surface, R*, must be known. Silicon has a weak dependence of reflectance on wavelength, which is given by the following empirical fit to the data:

1 − R* = 0.6786 + 0.03565λ⁻¹ − 0.03149λ⁻²   (37)
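The silicon fit of Equation 36a and the extrapolation of the quantum yield to zero (the Figure 7 procedure) can be sketched as follows. This is an illustrative sketch using synthetic quantum yield data, since no tabulated measurements are given here; the target value Lp = 0.01 cm is invented.

```python
import numpy as np


def alpha_si(lam_um):
    """Absorption coefficient (cm^-1) of stress-relieved, polished Si
    (Equation 36a); wavelength in micrometers, valid for 0.7 to 1.1 um."""
    lam = np.asarray(lam_um, dtype=float)
    return 5263.67 - 11442.5 / lam + 5853.68 / lam**2 + 399.58 / lam**3


def diffusion_length(inv_alpha_cm, phi):
    """Lp (cm) from a linear fit of quantum yield vs. 1/alpha:
    the x intercept of the fitted line lies at -Lp (cf. Figure 7)."""
    slope, intercept = np.polyfit(inv_alpha_cm, phi, 1)
    return intercept / slope  # magnitude of the x intercept


# Synthetic linear data for a hypothetical sample with Lp = 100 um = 0.01 cm
lam = np.linspace(0.8, 1.0, 5)
inv_alpha = 1.0 / alpha_si(lam)
phi = 0.9 * (inv_alpha + 0.01)  # proportional to (1/alpha + Lp)

print(round(diffusion_length(inv_alpha, phi), 4))  # 0.01
```

With real data, each Φ point would first be corrected for the surface reflectivity of Equation 37.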
Figure 7 illustrates the procedure for determining the diffusion length from a plot of Φ vs. 1/α for silicon (Schroder, 1990). The minority carrier diffusion length is readily obtained by extrapolating the quantum yield data to the x intercept and taking the magnitude of 1/α when Φ = 0. Since the quantum yield is measured at different wavelengths, the photon flux is adjusted to assure a constant photovoltage at each measurement.

Other Methods for Determination of Diffusion Length. Many other methods for the measurement of minority carrier diffusion length involve determination of the minority carrier lifetime τ and the minority carrier diffusion constant D and application of the relationship L = (Dτ)^1/2 to determine the minority carrier diffusion length. One can, for example, monitor the bulk band-to-band radiative luminescence lifetime under conditions where the surface nonradiative recombination processes are negligible (Yablonovitch et al., 1987). Another technique uses the absorption of microwave radiation by free carriers. In this method, pulsed excitation of a semiconductor sample inside a microwave cavity produces a transient increase
ELECTROCHEMICAL TECHNIQUES
liquid interface. Because all of the dopants in the depletion region are assumed to be ionized, the charge in the depletion region at a potential E can be expressed as

Q = [2 q ε ε0 Nd (Vbi + E − kT/q)]^1/2   (38)

Taking the derivative of Q with respect to E yields an expression for the differential capacitance Cd of the semiconductor:
Figure 7. Plot of Φ vs. the inverse absorption coefficient for three Si diodes with different diffusion lengths. The minority carrier diffusion lengths are obtained from the value of |1/α| when the quantum yield is zero. (Reprinted with permission from Schroder, 1990.)
in microwave absorption, and the absorption decay yields the carrier lifetime (Yablonovitch et al., 1986). The analysis of the data to extract a value for t, and thus to determine Lp, is similar in all cases, and details can be found, e.g., in the book by Many et al. (1965). Semiconductorliquid contacts are very useful in certain circumstances, as in, e.g., the iodine-treated silicon surface in methanol, for which the surface recombination velocity is so low that the observed recombination is almost always dominated by nonradiative decay processes in the bulk of the silicon sample (Msaad et al., 1994). In these cases, measurement of the photogenerated carrier decay rate by any convenient method directly yields the minority carrier lifetime, and thus the minority carrier diffusion length, of the sample of concern. These values can also be spatially profiled across the semiconductor-liquid contact to obtain important information concerning the uniformity of the material properties of the bulk solid of interest.
Cd = dQ/dE = [q ε ε0 Nd / (2(Vbi + E − kT/q))]^1/2   (39)
where ε is the relative permittivity of the semiconductor and ε0 the permittivity of free space. A plot of Cd⁻² vs. E (i.e., a Mott-Schottky plot) should thus yield a straight line that has a slope of 2(q εs Nd)⁻¹ (with εs = εε0) and an x intercept of kT/q − Vbi.

Equivalent Circuit Model for Measuring Differential Capacitance of Semiconductor-Liquid Contact. The most common procedure for obtaining differential capacitance vs. potential data is to apply a small AC (sinusoidal) voltage at the DC potential of interest. An analysis of the resulting cell impedance and phase angle in response to this sinusoidal perturbation yields the value of the semiconductor differential capacitance Cd. This method, of course, requires that Cd can be accessed from the experimentally measured electrical impedance behavior of a semiconductor-liquid contact. In general, an equivalent circuit is required to relate the measured impedance values to physical properties of the semiconductor-liquid contact. The conventional description of the equivalent circuit for a semiconductor-liquid junction can be represented as in Figure 8A (Gerischer, 1975). The subscript s refers to
DIFFERENTIAL CAPACITANCE MEASUREMENTS OF SEMICONDUCTOR-LIQUID CONTACTS

Principles of the Method

Differential capacitance measurements of semiconductor-liquid contacts are very useful in obtaining values for the dopant density of the bulk semiconductor that forms the semiconductor-liquid contact. In addition, such measurements have been found to be of great use in determining doping profiles of heterojunctions (Seabaugh et al., 1989) and of epitaxial layers of semiconductors (Leong et al., 1985) fabricated for use in light-emitting diodes, transistors, solar cells, and other optoelectronic devices. To obtain an expression for the differential capacitance vs. potential properties of a semiconductor-liquid contact, we refer again to Equations 3 to 6, which describe the basic electrostatic equilibrium conditions at a semiconductor-
Figure 8. (A) Circuit of a semiconductor-liquid junction and (B) a simplification of that circuit.
the semiconductor bulk, sc to the space charge, ss to surface states, H to the Helmholtz layer, and soln to solution. This circuit is reasonable because there will clearly be a capacitance and resistance for the space-charge region of the semiconductor. Because surface states represent an alternate pathway for current flow across the interface, Css and Rss are in parallel with the elements for the space-charge region. The Helmholtz elements are in series with both the space-charge and surface-states components, since current flows through the Helmholtz layer regardless of the pathway of current flow through the semiconductor. Finally, the series resistance of the electrode and contact and the solution series resistance are included as distinct resistive components in this circuit. A possible simplification that can often be justified combines the resistances of the solution and of the electrode into one series resistor (Figure 8B). If the Helmholtz layer resistance is small, the RC portion of the Helmholtz layer circuit impedance is dominated by CH. Furthermore, CH is usually much larger than Csc. Therefore, CH can typically be neglected, because the smallest capacitance value for a set of capacitors connected in series dominates the total capacitance of the circuit. Additionally, if the AC impedance measurement is performed at a high frequency, such that surface states do not accept or donate charge as rapidly as the space-charge region, then the elements associated with surface state charging-discharging processes can also be neglected. Alternatively, if the surface state density of the semiconductor is low enough, the contributions of Css and Rss are negligible. At sufficiently high frequencies, the impedance of this simplified circuit can be written in terms of G = Rs⁻¹ and B = (ωCscRs²)⁻¹, where G is the conductance and B is the susceptance (both in units of siemens). This yields the desired determination of Cd vs. E from the impedance response of the system.
Once Cd is determined, Equation 39 is used to obtain a value for Nd. When using this method, Cd⁻² vs. E should be determined at several different AC frequencies ω to verify that the slope and intercept do not change as the frequency is varied.
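This analysis can be sketched numerically. The sketch below combines the capacitance extraction from the imaginary impedance (the relation that appears below as Equation 40) with the Mott-Schottky fit of Equation 39; the circuit values and doping level are invented for illustration, not taken from the text.

```python
import math
import numpy as np

Q = 1.602e-19     # elementary charge, C
EPS0 = 8.854e-14  # vacuum permittivity, F/cm
KT_Q = 0.0257     # kT/q at room temperature, V


def c_sc(z_im, r_sc, freq_hz):
    """Space-charge capacitance from the imaginary impedance of the
    simplified parallel-RC circuit (Equation 40); assumes Rs << Rsc."""
    omega = 2.0 * math.pi * freq_hz
    return (1.0 + math.sqrt(1.0 - 4.0 * z_im**2 / r_sc**2)) / (2.0 * omega * z_im)


def mott_schottky(E_V, c_F_cm2, eps_rel):
    """Fit Csc^-2 vs. E (Equation 39): slope 2/(q eps_s Nd), x intercept
    kT/q - Vbi.  Returns (Nd in cm^-3, Vbi in V)."""
    slope, intercept = np.polyfit(E_V, np.asarray(c_F_cm2) ** -2.0, 1)
    Nd = 2.0 / (Q * eps_rel * EPS0 * slope)
    Vbi = KT_Q + intercept / slope  # x intercept is -intercept/slope
    return Nd, Vbi


# Synthetic n-Si junction: Nd = 1e16 cm^-3, Vbi = 0.60 V, eps = 11.7
eps_rel, Nd_true, Vbi_true = 11.7, 1.0e16, 0.60
E = np.array([0.0, 0.2, 0.4, 0.6])
inv_c2 = 2.0 * (Vbi_true + E - KT_Q) / (Q * eps_rel * EPS0 * Nd_true)
Nd_fit, Vbi_fit = mott_schottky(E, inv_c2 ** -0.5, eps_rel)
```

Repeating the fit at several AC frequencies, as recommended above, would reveal any frequency dispersion in the recovered Nd and Vbi.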
impedance, Zim, which is defined by Zim = Ztot sin θ, can be calculated. For the simplified three-element equivalent circuit illustrated in Figure 8, the dependencies of Ztot vs. AC signal frequency (Bode plot) and Zim vs. Zre (Nyquist plot) ideally obey patterns that not only permit extraction of the electrical elements in the circuit but also indicate the frequency range over which the cell acts capacitively. Conversely, deviations from these ideal patterns allow evaluation of the appropriateness of the theoretical equivalent circuit as a description of the experimental cell. A typical Bode plot for a silicon electrode in contact with a redox-active solution is shown in Figure 9. At high frequencies f of the input AC voltage signal, the capacitive reactance χc [i.e., the effective impedance of the capacitor, given by χc = (2πfCsc)⁻¹] is small relative to Rsc, so most of the current flows through the Csc pathway, and the observed impedance is therefore simply Rs. At low frequencies, χc is high relative to Rsc, so most of the current flows through the Rsc pathway, and the observed impedance is Rs + Rsc. In the intermediate-frequency range, as is shown in Figure 9, increments in frequency translate directly into changes in χc, so the magnitude of Ztot is dictated by Csc. The Mott-Schottky measurements should be performed in this capacitive frequency regime, which ideally has a slope of −1 in a log-log Bode plot. A Nyquist plot is distinguished from the Bode plot in that it illustrates explicitly the branching between Zre and Zim as a function of frequency. A Nyquist plot for a silicon electrode in contact with a redox-active solution is
Practical Aspects of the Method

In a typical Mott-Schottky experiment, the DC bias to the electrochemical cell is delivered by a potentiostat, and the AC voltage signal is supplied by an impedance analyzer that is interfaced to the potentiostat. The DC potential applied to the semiconductor junction, which is usually in the range of 0 to 1 V vs. E(A/A⁻), is always in the direction of reverse bias, since forward-bias conditions induce Faradaic current to flow and effectively force the cell to respond resistively instead of capacitively. The input AC signal, usually 5 to 10 mV in amplitude, must be small to ensure that the output signal behaves linearly around the DC bias point and is not convoluted with the electric field in the semiconductor (Morrison, 1980). The quantities that are measured by the impedance analyzer are the total impedance Ztot and the phase angle θ between Ztot and the input AC signal. From these data, the real component of the impedance, Zre, which is defined by Zre = Ztot cos θ, and the imaginary component of the
Figure 9. Typical Bode plots taken at intermediate frequencies where the impedance is dominated by capacitive circuit elements. The electrode-solution system is identical to that used to obtain the data in Figure 4. The open circles represent data taken at +0.2 V vs. E(A/A⁻) and the filled circles represent data taken at +0.8 V vs. E(A/A⁻).
Figure 10. Nyquist plot for the system described in Figure 4. The data were collected at an applied DC bias of 0.45 V.
shown in Figure 10. At high frequency, Zre ≈ Rs. As the frequency is incrementally decreased, the contribution of Zim rises until the magnitude of the capacitive reactance is identical to the resistance Rsc. At this point, equal current flows through the circuit elements Csc and Rsc. As the frequency is decreased further, the system behaves increasingly resistively, until the impedance is entirely dictated by Rsc. In practice, since the frequency may not reach a point where the system is predominantly resistive, it is generally easier to extract the value of Rsc from a Nyquist plot than from a Bode plot. Circle-fitting algorithms can provide reasonable estimates of Rsc if the Nyquist plot possesses a sufficient arc over the frequency range explored experimentally.

Data Analysis and Initial Interpretation

Assuming the three-element equivalent circuit depicted in Figure 8B,
Csc = [1 + (1 − 4Zim²/Rsc²)^1/2] / (2ωZim)   (40)
where the angular frequency ω = 2πf. It is also assumed that Rs ≪ Rsc, so that Rs contributes negligibly to the measured impedance. Typically, since Rs is on the order of 10² Ω and Rsc is on the order of 10⁵ to 10⁷ Ω, Equation 40 gives an accurate value for Csc. Therefore, measurement of Zim at various frequencies and DC biases, as well as extraction of the Rsc value at each DC bias from the circular fit of the Nyquist plot at that potential, allows calculation of Csc. The subsequent compilation of Csc⁻² vs. E for each frequency yields the Mott-Schottky plot, as shown in Figure 11 for a typical silicon-solution interface. Linear regression of these data points and extrapolation of the linear fits to the x intercepts at Csc⁻² = 0 gives the value of Vbi for each frequency. Differential capacitance measurements can readily be performed with a potentiostat in conjunction with an
Figure 11. Mott-Schottky plots for the system described in Figure 4. The data shown are for two different concentrations of acceptor species and for two different acquisition frequencies, 100 and 2.5 kHz.
impedance analyzer, both of which are set up as described above, yielding direct values of the imaginary and real components of the impedance. Alternatively, AC voltammetry can be performed using a potentiostat in conjunction with a waveform generator and a lock-in amplifier. In this technique, the magnitude of the AC component of the current (i.e., Ztot) and the phase angle with respect to the input AC signal are measured at a given frequency. The real and imaginary parts of the impedance can then be extracted as described above. This powerful technique therefore requires no additional capabilities beyond those that are required for basic measurements of the behavior of semiconductor photoelectrodes.

In aqueous solutions, useful electrolytes are those that yield anodic dissolution of the semiconductor under illumination but do not etch the solid or produce a passivating oxide layer in the absence of such illumination. For Si, the appropriate electrolyte is NaF-H2SO4 (Sharpe et al., 1980), whereas for GaAs, suitable electrolytes are Tiron (dihydroxybenzene-3,5-disulfonic acid disodium salt) and ethylenediaminetetraacetic acid (EDTA)-0.2 M NaOH (Blood, 1986). A variety of nonaqueous solutions can also be used to perform differential capacitance measurements, although more attention needs to be paid to series resistance effects in such electrolytes. Either solid-state or solid-liquid junctions can be utilized to measure the dopant density of the bulk semiconductor from C⁻²-vs.-E plots, as described above. Solid-liquid contacts offer the additional opportunity to conveniently obtain a depth profile of the semiconductor doping level in the same experimental setup. A more comprehensive review of the technique of electrochemical profiling is given by Blood (1986). To characterize the dopant profile through the solid, an anodic potential is applied to the semiconductor electrode such that the surface is partially dissolved.
The thickness of material dissolved from the electrode, Wd, is given by the
time integral of the current density J passed (Ambridge et al., 1975):

Wd = [M/(n F ρ)] ∫ J dt   (41)
where M is the molecular weight of the semiconductor, n is the number of holes required to oxidize one atom of the electrode material to a solution-phase ion, F is the Faraday constant, and ρ is the density of the solid. A separate Mott-Schottky plot is then obtained at each depth of etching through the material. The depth of the dopant density measurement is given by W′ = W + Wd. Thus, periodic dissolutions and Mott-Schottky determinations can yield a complete profile of Nd as a function of distance into the solid.

Problems

The most prevalent problem in Mott-Schottky determinations is frequency dispersion in the x intercepts of the linear regressions. Nonlinearity in these plots may originate from various experimental artifacts (Fajardo et al., 1997). If the area of the semiconductor is too small, edge effects may dominate the observed impedance. The area of the counter electrode must also be large relative to that of the working electrode (as a rule of thumb, by a factor of 10), to ensure that any electrochemical process occurring at the counter electrode does not introduce additional electrical elements into the equivalent circuit. Additionally, the redox component that is oxidized or reduced at the counter electrode must have a sufficient concentration in solution (roughly 5 mM), or this redox process may be manifested as an additional resistance that is a major component of the total cell impedance. Finally, the measuring resistor within the potentiostat should be adjusted in response to the magnitude of the currents passing through the cell in order to obtain the most accurate impedance value possible.

TRANSIENT DECAY DYNAMICS OF SEMICONDUCTOR-LIQUID CONTACTS

Principles of the Method

Solid-liquid junctions are very useful in determining the electrical properties of the semiconductor surface. Principal methods in this endeavor involve measurement of the transient carrier decay dynamics, electrochemical photocapacitance spectroscopy (EPS), and laser spot scanning (LSS) techniques.
Under conditions where bulk recombination is slow relative to surface recombination, measurements at the solid-liquid interface can provide valuable information regarding the concentration and energetics of electrical trap sites at the semiconductor surface. Furthermore, the presence of a liquid allows, in principle, manipulation of these trap levels through chemical reactions induced by addition of reagents to the liquid phase. Methods for determining the surface recombination velocity of solid-liquid interfaces are therefore described here. The rate of nonradiative recombination mediated through surface states can be described under many
experimental conditions by the steady-state Shockley-Read-Hall (SRH) rate equation (Hall, 1952; Shockley et al., 1952; Schroder, 1990; Ahrenkiel et al., 1993):

RSRH(surface) = (ns ps − ni²) / [(ns + n1,s)/(Nt,s kp,s) + (ps + p1,s)/(Nt,s kn,s)]   (42)
where ni is the intrinsic carrier concentration of the semiconductor, Nt,s is the density of surface states (in units of cm⁻²), ns and ps are the surface electron and hole concentrations, and kn,s and kp,s are the capture coefficients for electrons and holes, respectively, by surface states. Each capture coefficient is the product of the thermal velocity v of the carrier in the solid and the capture cross-section σ for each kinetic event, such that (Many et al., 1965; Krüger et al., 1994a)

kp,s = vp σp   and   kn,s = vn σn   (43)
The symbols n1,s and p1,s in Equation 42 represent the surface concentrations of electrons in the conduction band and holes in the valence band, respectively, when the Fermi level is at the trap energy. Values for n1,s and p1,s can be obtained through use of the principle of detailed balance, which, when applied to a system at equilibrium in the dark, yields (Many et al., 1965; Blakemore, 1987) n1;s ¼ Nc exp½ðEcb Et Þ=kT
ð44aÞ
p1;s ¼ Nv exp½ðEt Evb Þ=kT
ð44bÞ
and
where Et is the energy of the surface trapping state. When other recombination processes are minimal and the initial carrier concentration profiles are uniform throughout the sample, surface recombination should dominate the decay properties of the sample. Under these conditions, the observed minority carrier decay dynamics are given by (Ahrenkiel et al., 1993)

Rate = −dΔp/dt = Δp/τs,l = RSRH   (45)
Furthermore, under such conditions, the fundamental filament decay lifetime τs,l is given by (Schroder, 1990; Ahrenkiel et al., 1993)

τs,l = d/Slow   with   Slow ≈ Sp = Nt,s kp,s   (46)
where d is the thickness of the sample and Slow is the surface recombination velocity (in units of centimeters per second) at low-level injection (Δn ≪ Nd and Δp ≪ Nd). The other limiting situation is obtained under uniform illumination at an intensity high enough to produce high-level injection conditions (i.e., Δn ≫ Nd and Δp ≫ Nd). By using Equation 42 with injected carrier concentrations Δn ≫ Nd and Δp ≫ Nd and, because equal numbers of
electrons and holes are created by the optical injection pulse, taking Δn = Δp, the recombination rate is

Rate = −dΔp/dt = Δp/τs,h = RSRH   (47)
Now, however, the filament decay lifetime is (Blakemore, 1987; Schroder, 1990)

τs,h = d/Shigh   with   Shigh = Sp Sn/(Sp + Sn)   (48)
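A minimal numerical illustration of Equation 48 (all values invented for illustration):

```python
def s_high(s_p, s_n):
    """High-injection surface recombination velocity (Equation 48), cm/s."""
    return s_p * s_n / (s_p + s_n)


# Equal capture coefficients: S_high is half of S_low (= S_p here), so the
# high-injection lifetime d/S_high is twice the low-injection lifetime.
print(s_high(100.0, 100.0))  # 50.0
print(s_high(200.0, 50.0))   # 40.0
```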
For conditions where kp,s = kn,s, Shigh = Slow/2, and the surface recombination decay lifetime under high-level injection, τs,h, is equal to 2τs,l.

Practical Aspects of the Method

The carrier concentration decays can be monitored using a number of methods. Luminescence is a convenient and popular probe for direct band gap semiconductors, while for materials like silicon, conductivity methods are often employed. The two methods are complementary in that the photoluminescence signal decays when either the minority or majority carrier is captured, whereas conductivity signals are weighted by the individual mobilities of the carrier types. Thus, if all of the holes in silicon were to be trapped, the band gap luminescence signal would vanish, whereas the conductivity signal would still retain about 75% of its initial amplitude (because, for silicon, μn ≈ 3μp). Either method can be used to probe the carrier concentration dynamics, and both are sometimes used on the same sample to ensure internal consistency of the methodology. A somewhat more specialized method of monitoring the carrier concentrations has also recently been developed using ohmic-selective contacts on silicon, where the photovoltage developed between these contacts yields a probe of the carrier concentration decay dynamics in the sample of concern (Tan et al., 1994a).

Measurements of Surface Recombination Velocity Using Time-Resolved Photoluminescence Methods. One of the most reliable methods for monitoring minority carrier decay dynamics is the time-resolved photoluminescence (TRPL) method. The use of time-correlated single-photon counting has added further sensitivity to the TRPL technique and has allowed for picosecond-scale lifetime measurements. To perform time-correlated single-photon TRPL, one starts with pulsed, short-duration monochromatic laser light tuned to an energy greater than the band gap energy of the material under study.
This laser source can be produced by the output of a pulse-compressed Nd-yttrium-aluminum-garnet (Nd:YAG)-pumped dye laser, if picosecond timing resolution suffices, or from the fundamental mode of a solid-state Ti-sapphire laser, if even shorter time resolution is desired. The timing characteristics of the laser pulse can be determined through use of an autocorrelator. Any chirping in the light source must be removed by passing the light through a few meters of optical fiber.
Prior to reaching the sample surface, the laser light is split by means of a beam splitter (Fig. 12). The less intense optical beam emanating from the beam splitter is directed into a photodiode, while the more intense beam is directed onto the semiconductor sample. The photodiode and sample must be positioned at equal distances from the beam splitter to ensure correct timing in the experiment. The voltage output of the photodiode is directly connected to the START input of a time-to-amplitude converter (TAC). Thus, with each laser pulse, the sample is illuminated and the TAC is triggered on (start). Both the beam diameter and the polarization of the incident light are experimental parameters used for controlling the light intensity, and thus the injection level, of the semiconductor. The light emitted by the sample due to radiative recombination processes is collected and focused onto a spectrometer that is tuned to the wavelength of the transition to be monitored. The light output from this spectrometer is then directed onto a single-photon detector. When a single photon is detected, the resulting voltage output produced by the detector is used to trigger the STOP input of the TAC. Once triggered into the off position, the TAC will in turn produce a voltage pulse whose magnitude is linearly dependent on the time duration between the START and STOP signals. The distribution of voltage-pulse magnitudes is thus a distribution in time of the photons emitted from the sample after initial illumination. By allowing a computer-controlled multichannel analyzer (MCA) to display a histogram, with respect to magnitude, of the distribution of the voltage pulses that are produced by the TAC, a single-photon TRPL spectrum is obtained (Ahrenkiel et al., 1993; Kenyon et al., 1993).
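The decay histogram produced by the MCA can then be reduced to a lifetime. The sketch below fits a noise-free synthetic monoexponential decay (all values invented) and converts the lifetime to a surface recombination velocity via Equation 46, assuming surface-dominated recombination and an assumed sample thickness:

```python
import numpy as np


def trpl_lifetime(t_s, counts):
    """Decay lifetime from a log-linear least-squares fit of a
    single-exponential TRPL transient (counts must be > 0)."""
    slope, _ = np.polyfit(t_s, np.log(counts), 1)
    return -1.0 / slope


# Synthetic, already-deconvolved decay: tau = 2 us, 10^4 peak counts
tau_true = 2.0e-6
t = np.linspace(0.0, 1.0e-5, 50)
counts = 1.0e4 * np.exp(-t / tau_true)

tau = trpl_lifetime(t, counts)
S_low = 0.05 / tau  # Equation 46 with an assumed thickness d = 0.05 cm
```

With real, noisy histograms the deconvolution against the system response function described below must precede any such fit.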
Figure 12. Time-resolved single-photon photoluminescence spectrometer.
To enhance the signal-to-noise ratio in a TRPL spectrum, it is necessary to place a pulse-height discriminator between the single-photon detector and the STOP input of the TAC. This discriminator ensures that low-voltage pulses that arise from electronic or thermal noise and/or high-voltage pulses that can be produced by multiphoton events do not accidentally trigger the STOP signal on the TAC. To obtain the true TRPL spectrum for minority carriers, this experimentally obtained TRPL spectrum must be deconvoluted with respect to the system response function of the entire apparatus. The system response function must be determined independently for the setup of concern and is typically measured by exciting a diffuse reflector or scattering solution instead of the semiconductor. A variety of numerical methods allow generation of the corrected photoluminescence decay curve from the observed TRPL decay and the system response function (Love et al., 1984). Deconvolution is especially important if the TRPL decay time is comparable to the system response time. The success of the single-photon counting method is due to the development of detectors with sufficient gain and time response that a single photon produced by an electron-hole recombination event can be detected within a system response time of 30 ps. Two varieties of these detectors are photomultiplier tubes (PMTs) and microchannel plates (MCPs). A limitation of these detectors, however, is that only photon energies ≥1.1 eV can be detected. Recently, advancements in the timing resolution of single-photon avalanche photodiodes (SPADs) have allowed for detection of photons in the near-infrared region and have raised the possibility of performing TRPL with materials having a band gap of <1.1 eV.

Negative feedback occurs at the insulating regions, as is apparent by current lower than iT,∞.
Figure 5B, showing GC imaging, was acquired with the substrate set at a potential just negative enough (+420 mV) to generate a small concentration (about 20 µM) of Fe(CN)₆⁴⁻ while the tip was set at a positive potential (+600 mV) to detect, by oxidation, Fe(CN)₆⁴⁻. In this case, the GC ability to detect small concentrations at a surface is illustrated, but at the cost of a lower resolution in comparison to the feedback mode. Good results in GC imaging rely on placing the tip as close as possible to the surface to avoid diffusional broadening of the detected species. Stirring may improve resolution by minimizing diffusional broadening.

In GC imaging, the tip current provides an image of the distribution of the electroactive species near a surface. At a voltammetric tip, the tip current map can be converted to a concentration map by use of the equation for the limiting current at a disk electrode (Equation 1). To do this quantitatively requires knowledge of the species' diffusion coefficient and the number of electrons involved in the overall reaction. In addition, the tip potential must be located on the limiting current plateau. The singular advantage of the disk electrode in feedback imaging is not present in GC imaging, and so electrodes of conical or hemispherical geometry can be used. In this case, a version of the limiting current equation appropriate to the tip geometry must be employed (Wightman and Wipf, 1989; Mirkin et al., 1992).

The potentiometric GC mode relies on an ISE as the tip. The main advantage of the potentiometric GC mode is that the tip is a passive sensor of the target ion activity and does not cause the depletion or feedback effects noted above. In addition, the tip can be used to detect ions that would be difficult or inconvenient to detect voltammetrically, such as Ca²⁺ or NH₄⁺. The selectivity of the tip for the ion of interest is an advantage, but it requires separate tips for each species.
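For a disk tip held on the limiting-current plateau, the current-to-concentration conversion described above can be sketched using the steady-state limiting current at a disk ultramicroelectrode, i = 4nFDCa. The numerical values below are illustrative assumptions, not measurements from the text:

```python
F = 96485.0  # Faraday constant, C/mol


def gc_concentration(i_lim_A, n, D_cm2_s, a_cm):
    """Concentration (mol/cm^3) from the limiting current at a disk
    ultramicroelectrode of radius a, using i = 4 n F D C a."""
    return i_lim_A / (4.0 * n * F * D_cm2_s * a_cm)


# Illustrative: 1.2543 nA at a 5-um-radius tip, D = 6.5e-6 cm^2/s, n = 1
c = gc_concentration(1.2543e-9, 1, 6.5e-6, 5.0e-4)
print(round(c * 1.0e6, 3))  # concentration in umol/cm^3 (i.e., mM): 1.0
```

For conical or hemispherical tips, the prefactor 4 would be replaced by the geometry-appropriate constant, as the text notes.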
The response of an ISE is, in general, a voltage proportional to the logarithm of the ion's activity, but special consideration must be made for potentiometric GC to provide a rapid response with microscopically small tips (Morf and Derooij, 1995).

PRACTICAL ASPECTS OF THE METHOD

SECM Instrumentation

Figure 6 shows a block diagram of an SECM instrument. Minimally, the instrument consists of two components: one to position the tip in three dimensions and one to measure the tip signal.

Figure 6. A scanning electrochemical microscope.

A commercial instrument designed specifically for SECM has recently become available; however, most SECM investigators build their own. The one in use in the author's laboratory will serve as a model for a discussion of the instrument (cf. Fig. 6). The tip is attached to a three-axis translator stage with motion provided by linear piezoelectric motors (Inchworm motors, Burleigh Instruments). The Inchworm motors provide scan rates greater than 100 μm/s over a 2.5-cm range of motion with a minimum resolution of less than 100 nm as well as backlash-free operation. An interface box (Model 6000, Burleigh) controls the speed, direction, and axis selection of the motors by accepting TTL (transistor-transistor logic) level signals from a personal computer. The motors operate in an open-loop configuration, and the speed and total movement of each axis are controlled by a TTL clock signal. Use of the open-loop configuration requires that each axis be calibrated to relate the actual movement per clock pulse (i.e., nanometers per pulse). A closed-loop version of the Inchworm motor is also available. A bipotentiostat (EI-400, Cypress Systems) is used to control the tip potential and amplify the tip current. Use of a bipotentiostat, which allows simultaneous control of two working electrodes versus common reference and auxiliary electrodes, is convenient when control of the substrate potential is also required. The bipotentiostat should be sufficiently sensitive to allow measurements of the low current flow at microelectrodes, which may be in the picoampere range. Commercial bipotentiostats designed for use with rotating ring-disk electrodes are not suitable without some user modification to decrease noise and increase sensitivity. Use of a Faraday cage around the SECM cell and tip to reduce environmental noise is advisable. For potentiometric SECM operation, an electrometer is required to buffer the high impedance of the microscopic tips.
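The open-loop calibration described above amounts to simple bookkeeping: each axis gets a movement-per-clock-pulse figure, and moves are commanded as pulse counts. The sketch below uses invented calibration numbers for illustration, not Burleigh specifications.

```python
# Open-loop axis bookkeeping: commanded motion is a pulse count derived
# from a per-axis calibration. The nm/pulse figures are invented for
# illustration, not Burleigh specifications.
NM_PER_PULSE = {"x": 4.2, "y": 4.5, "z": 3.9}

def pulses_for_move(axis, distance_um):
    """Clock pulses needed to move distance_um micrometers on an axis."""
    return round(distance_um * 1000.0 / NM_PER_PULSE[axis])

print(pulses_for_move("z", 10.0))   # pulse count for a 10-um z move
```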
Custom software is used to program the tip movement and to display the tip and substrate signals in real time. For positioning the tip during GC imaging, a video microscope system is especially useful, since the tip can be observed approaching the substrate on the video monitor while the SECM operator is moving the tip. Descriptions of other SECM instruments are available in the literature (Bard et al., 1994; Wittstock et al., 1994b).
SCANNING ELECTROCHEMICAL MICROSCOPY
The SECM tip is perhaps the most important part of the SECM instrument and, at this time, disk-shaped electrodes must be constructed by the investigator. A construction method for disk electrodes with a radius of 0.6 μm is described below (see Protocol: SECM Tip Preparation).
tip-substrate separation (Borgwarth et al., 1994, 1995a). The transient tip current is affected by tip-substrate separation but not by substrate conductivity and, thus, can be used to position the tip automatically.

Applications in Corrosion Science
Constant-Current Imaging

Most SECM images have been acquired in the constant-height mode, where the tip is rastered in a reference plane above the sample surface. The feedback mode requires a tip-substrate spacing of about one tip diameter during imaging. With submicrometer tips, it is difficult to avoid tip crashes due to vibration, sample topography, and sample tilt. Constant-current imaging can be used to avoid tip crashing and improve resolution by using a servomechanical amplifier to keep the tip-substrate distance constant while imaging. With surfaces known to be completely insulating or conducting, implementation of a constant-current mode is straightforward and similar in methodology to STM studies (Binnig and Rohrer, 1982). For example, any deviation of the tip current from the reference current is corrected by using a piezoelectric element to position the tip. A current larger than the reference level causes the tip to move toward an insulating surface and away from a conducting surface. However, this methodology cannot work when a surface has both insulating and conducting regions. Unless some additional information is provided about the substrate type or the tip-substrate separation, the tip will crash as it moves from an insulating to a conducting region or vice versa. One method for providing constant-current imaging is to impose a small-amplitude (100 nm or less), high-frequency vertical modulation of the tip position, i.e., tip position modulation (TPM), during feedback imaging (Wipf et al., 1993). The phase of the resulting ac current shifts by 180° when the tip is moved from an insulating to a conducting surface, providing unambiguous detection of the sample conductivity. This phase shift is used to set the proper tip motion and reference current level on the servoamplifier. The normal tip feedback signal is restored by filtering out the small ac signal.
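The servo logic just described can be summarized in a few lines. In this hypothetical sketch (gains and units are invented), the conducting/insulating flag stands in for the TPM phase measurement that identifies the surface type.

```python
# Minimal sketch of the constant-current servo logic. The conducting flag
# stands in for the TPM phase test; the gain is an arbitrary example value.
def z_correction(i_tip, i_ref, conducting, gain=0.01):
    """Tip height correction; positive values move the tip toward the surface.

    Over a conductor the current rises on approach, so a current above the
    reference means retract; over an insulator the current falls on
    approach, so the same error means move closer.
    """
    error = i_tip - i_ref
    return (-gain if conducting else gain) * error

print(z_correction(1.1, 1.0, conducting=True))    # negative: retract
print(z_correction(1.1, 1.0, conducting=False))   # positive: approach
```

A real servoamplifier implements this continuously in analog or digital hardware; the point here is only the sign flip between surface types, which is exactly the ambiguity the TPM phase measurement resolves.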
A second method uses the hydrodynamic force present at a tip vibrated horizontally near the substrate surface (Ludwig et al., 1995). As the tip approaches the substrate, the amplitude of a vibration resonance mode is diminished and sensed by monitoring the diffraction pattern generated by a laser beam shining on the vibrating tip. Adjustment of the tip position to maintain a constant diffraction signal at a split photodiode allows ‘‘constant-current’’ imaging. Since a specific electrochemical response is not required, the hydrodynamic force method can be used for either GC or feedback imaging at a substrate. However, a special tip and cell design is required, and the method does not provide a measure of the tip-substrate separation. Another hydrodynamic method uses resonance frequency changes of a small tuning fork attached to the SECM tip (James et al., 1998). This relies on the presence of a shear-force damping as the tip approaches the surface. A third constant-current method, the ‘‘picking mode,’’ monitors the current caused by convection during a rapid tip approach from a large
A number of researchers have recognized the possibility of using SECM for investigations of localized corrosion at metallic samples. The SECM can be used to make topographic images of samples in the feedback mode, but in this case, the more interesting images will be GC images of local ion concentrations near the corroding surface. Voltammetric or amperometric GC methods can be used to detect electroactive species produced or depleted at corroding surfaces. Examples in the literature include Fe ions (Wipf, 1994); O2, Co, and Cr ions (Gilbert et al., 1993); and Cu and Ni ions (Küpper and Schultze, 1997b). Absolute determination of the concentration of the species is made using Equation 1 to calculate the limiting current of the SECM tip. However, the presence of multiple electroactive species may interfere with the detection. Use of a potential-time program (e.g., pulse or sweep voltammetry methods) can improve specificity and provide for multiple-species detection at a stationary tip (Küpper and Schultze, 1997a). Alternatively, potentiometric GC SECM images of corroding surfaces can yield information about local pH values. Changes in pH of

A species with a 1-ms lifetime can diffuse across a 1-μm gap, while a gap of 0.1 μm would allow a species with a 10-μs lifetime to react with the substrate. Further control of the reaction conditions is possible by addition of a scavenger species to solution. The "scavenger" reacts with the tip-generated species as they diffuse from the tip region. For example, a pH-buffered solution will minimize the extent of pH changes caused by OH⁻ generation at the tip. Likewise, an oxidizing agent can be neutralized by addition of a second reducing agent to scavenge any of the oxidant leaving the tip region. Another substrate modification method is the "direct" mode. In this mode, the tip is used as a microscopic auxiliary electrode for the substrate reaction (Hüsser et al., 1989).
With the tip positioned close to the substrate, any Faradaic reaction at the substrate is limited to the region near the tip. Any electrochemical reaction that can be produced at the bulk surface can be performed, at least in principle, at a local region of the surface using the direct mode. For example, metal etching, metal deposition, and polymer deposition are all possible at microscopic regions.
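The lifetime/gap argument above follows from the diffusion length, roughly √(2Dτ). A quick numerical check, with an assumed generic aqueous diffusion coefficient:

```python
# Quick check of the lifetime/gap argument: a tip-generated species can
# cross the tip-substrate gap if sqrt(2*D*tau) is comparable to the gap.
# D is an assumed generic aqueous diffusion coefficient.
import math

D = 1.0e-5   # cm^2/s (assumed)

def diffusion_length_um(lifetime_s):
    """Approximate diffusion length, in micrometers, for a given lifetime."""
    return math.sqrt(2.0 * D * lifetime_s) * 1.0e4   # cm -> um

print(f"{diffusion_length_um(1e-3):.2f} um for a 1-ms lifetime")
print(f"{diffusion_length_um(1e-5):.2f} um for a 10-us lifetime")
```

A 1-ms lifetime gives a diffusion length of about 1.4 μm and a 10-μs lifetime about 0.14 μm, consistent with the micrometer-scale gaps discussed above.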
The SECM tip can be used to modify a surface on a microscopic scale by manipulating the local chemical environment or by production of a localized electric field. An example is the deposition of conducting polymer patterns from solutions of a monomer. Using a cathodic tip in the direct mode, polyaniline lines about 2 μm wide were drawn on Pt electrodes coated with a thin ionically conducting polymer (i.e., Nafion) containing the anilinium cation. The ionic polymer served to concentrate the electric field in order to oxidize the anilinium cation at the metal surface (Wuu et al., 1989). Changing the local solution composition can also be used to drive a reaction that leads to formation of polymer structures. Electrolytic reduction of water at the tip in a solution of aniline in H2SO4 increases the local pH and leads to local oxidation of aniline and subsequent formation of polyaniline at a substrate electrode. Patterns about 1 μm wide could be formed at writing speeds of 1 μm/s (Zhou and Wipf, 1997). Polypyrrole structures on electrodes can be produced by using voltage pulses in an aqueous pyrrole solution at the tip to produce pyrrole radical cations that subsequently polymerize on a substrate surface (Kranz et al., 1995a). Using this method, a conducting polypyrrole line was grown between two electrodes across a 100-μm insulating gap (Kranz et al., 1995b), and a 200-μm-high polypyrrole tower was grown (Fig. 10; Kranz et al., 1996). In another example, generation of the oxidizing agent Br2 at the SECM tip formed patterns by polymerizing a thin film of a 2,5-bis(1-methylpyrrol-2-yl)-thiophene monomer deposited on a substrate, and dissolution of the unreacted monomer left the polymer
Figure 10. Scanning electron micrograph of a polypyrrole tower grown by SECM from an aqueous solution of pyrrole using a 10-μm-diameter Pt microelectrode (tower height = 200 μm, width = 70 μm). (Reprinted from Kranz et al., 1996, by permission of Wiley-VCH Verlag GmbH.)
ELECTROCHEMICAL TECHNIQUES
pattern (Borgwarth et al., 1995b). Due to the smaller tip size, STM deposition typically produces smaller feature sizes. The main advantage of SECM, yet to be fully realized, is the greater control of deposition conditions allowed by precise electrochemical control of the local solution composition. Metal and metal oxide structures are readily deposited from solution using SECM. Use of the direct mode to reduce metal ions dissolved in a thin, ionically conducting polymer film deposits Au, Ag, and Cu lines as thin as 0.3 μm (Craston et al., 1988; Hüsser et al., 1988, 1989). SECM tip-induced pH changes have been used to deposit nickel hydroxide (Shohat and Mandler, 1994) and silver patterns (Borgwarth et al., 1995b). Thin Au patterns can be drawn on electronic conductors by oxidizing a gold SECM tip in Br⁻ solution to produce AuBr₄⁻ ions; reduction of the ions at the sample surface produces the metal film (Meltzer and Mandler, 1995a). Three-dimensional microfabrication of a 10-μm-thick Ni column and spring was demonstrated by use of an SECM-like device in the direct mode (Madden and Hunter, 1996). Localized etching of metals and semiconductors to form patterns of lines or spots is possible by electrolytic generation of a suitable oxidizing agent at the SECM tip. For example, Cu can be etched using a tip-generated oxidant such as Os(bpy)₃³⁺ (bpy = 2,2′-bipyridyl) generated from Os(bpy)₃²⁺ (Mandler and Bard, 1989), and GaAs and Si semiconductors can be etched by Br2 generated at the tip by oxidation of Br⁻ (Mandler and Bard, 1990b; Meltzer and Mandler, 1995b). Additionally, the SECM tip current observed during the etching process is a feedback current and can be used to monitor the rate of etching. In this case, the mediator is regenerated by the redox reaction at the substrate rather than by direct ET, and so the feedback current monitors the rate of oxidative dissolution directly.
The behavior of the feedback current observed during etching was used to postulate a hole injection mechanism for etching of n-doped GaAs by tip-generated Br2 (Mandler and Bard, 1990a). The oxidative etching kinetics of Cu by tip-generated Ru(bpy)₃³⁺ was explored in the same manner (Macpherson et al., 1996). Dissolution of nonconducting material is an area in which the ability of the SECM to produce a controlled chemical environment near the material surface provides a significant advancement. Dissolution rates of even very soluble materials can be obtained by placing the tip near (1 to 10 μm) a dissolving surface. If the solution near the surface is saturated with the dissolving material and that material is electroactive, tip electrolysis produces a local undersaturation of the dissolving material. Quantitative details about the rate and mechanism of the dissolution reaction are available from the tip current as a function of time and distance to the sample (Unwin and Macpherson, 1995). The dissolution of the (010) surface of potassium ferrocyanide trihydrate in aqueous solution is an example in which the dissolution process was found to be second order in interfacial undersaturation (Macpherson and Unwin, 1995). In a separate study, the dissolution rate constant for AgCl in aqueous potassium nitrate solution was shown to be in excess of 3 cm/s (Macpherson et al., 1995b).
The spatial resolution of the SECM was employed to examine and characterize two very different Cu²⁺ dissolution processes on a copper sulfate pentahydrate crystal in H2SO4. At a crystalline interface where the dislocation density is large, and thus the average distance between dislocation sites is much smaller than the tip size, dissolution is rapid and follows a first-order rate law in undersaturation (Macpherson and Unwin, 1994a). In contrast, at the (100) face of the crystal, where the average distance between dislocation sites is larger than the tip size, an initial rapid dissolution is followed by an oscillatory dissolution process (Fig. 11A; Macpherson and Unwin, 1994b). The oscillatory dissolution is modeled by assuming that nucleation sites for dissolution are only produced above a critical undersaturation value of Cu²⁺. Thus, the production of Cu²⁺ by dissolution autoinhibits further dissolution, leading to oscillations. A micrograph (Fig. 11B) of the dissolution pit shows five ledges, in agreement with the five oscillation cycles observed in this experiment. In a related experiment, a gold SECM probe tip began oscillatory dissolution in a 1.5 M HCl solution when it was moved close to a Pt substrate electrode (Mao et al., 1995). In this case, reduction of the tip-generated AuCl₄⁻ produced a periodic excess of Cl⁻ ion in the tip-substrate gap, which led to the oscillations.
PROBLEMS

A common problem in feedback SECM is a drift in the value of iT,∞ during an experiment due to a change in the mediator concentration by solution evaporation, a chemical reaction, or a change in electrode activity with time. For quantitative measurements, the value of iT,∞ should be checked several times during the experiment to verify stability. Note also that iT,∞ must be checked at a sufficiently large distance. An L (= d/a) value of 10 will still produce an IT value of 1.045 over a conductive surface. To get IT values within 1% of the value at L = ∞, L should be >100. Selecting a mediator for use in feedback imaging is important to obtain good images. Ideally, mediators should be stable in both the oxidized and reduced forms and over a wide range of solution conditions and have rapid ET kinetics. A common difficulty is that a mediator may undergo a slow chemical reaction, causing a deactivation of the tip response over the time scale of an SECM imaging session. Tip deactivation is a problem, since removal of the tip from the SECM instrument to polish it and return it to the identical position at the sample surface is difficult. Although commonly used in electrochemical experiments, the ferricyanide ion, Fe(CN)₆³⁻, can often cause slow deactivation of the tip signal (Pharr and Griffiths, 1997). A popular and very stable mediator that can be used in aqueous solution at pH 1.0,

whereas for donor-acceptor and free-to-bound transitions it is anti-Stokes Raman scattering). The probability of these transitions occurring is, according to quantum mechanical perturbation theory, given by αmn², where

αmn = ∫ ψm α ψn dV        (7)
Figure 3. Rayleigh and Raman scattering. 1 + 2, Rayleigh scattering; 1 + 3, Stokes Raman scattering; 4 + 5, anti-Stokes Raman scattering.
OPTICAL IMAGING AND SPECTROSCOPY
Figure 4. Raman spectrum of impure, polyphase (monoclinic + tetragonal) ZrO2. (C. S. Kumai, unpublished results, 1998.)
Here n is the first vibrational excited state and m is the vibrational ground state. Since the population of energy levels in a collection of atoms/molecules is governed by the Maxwell-Boltzmann distribution function, the population of the vibrational ground state will always be greater than the population of the first vibrational excited state. As a result, the intensity of the Stokes-shifted Raman scattering will always be greater than the intensity of the anti-Stokes-shifted Raman scattered radiation. This effect, which is predicted quantum mechanically but not classically, is illustrated in Figure 4, which presents the Raman spectrum of polycrystalline ZrO2 (mixed monoclinic and tetragonal). The Raman spectrum is depicted in terms of the shift (in wavenumbers) in the wavelength of the scattered radiation with respect to that of the incident light. Thus the Rayleigh peak is seen in the center, and the anti-Stokes and Stokes lines are to the left and right, respectively. Note that for each peak in the Stokes Raman section of the spectrum there is a corresponding peak of lower intensity in the anti-Stokes Raman section.
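The Boltzmann argument can be made quantitative: the anti-Stokes/Stokes intensity ratio is governed, to first order, by the relative population of the first excited vibrational level, exp(−hcν̃/kT). The sketch below evaluates this for an assumed 470 cm⁻¹ mode at room temperature (an illustrative value, not a measured ZrO2 assignment).

```python
# Relative population of the first excited vibrational level, which sets
# the anti-Stokes/Stokes intensity ratio to first order. The 470 cm^-1
# mode is an assumed example value.
import math

h = 6.626e-34    # Planck constant, J s
c = 2.998e10     # speed of light, cm/s (so wavenumbers enter directly)
k = 1.381e-23    # Boltzmann constant, J/K

def excited_fraction(wavenumber_cm, T=300.0):
    """Population of the first excited state relative to the ground state."""
    return math.exp(-h * c * wavenumber_cm / (k * T))

print(f"{excited_fraction(470.0):.3f}")   # roughly one-tenth at room temperature
```

A population ratio of about 0.1 explains why the anti-Stokes peaks in Figure 4 are visible but markedly weaker than their Stokes counterparts.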
where ℏ = h/2π and h is Planck's constant. These two vibrational states are plotted in Figure 5. By inspection,

∫_{−∞}^{+∞} ψ1(x) dx = 0        (12)
That is, performing the integration by summing up the infinite number of terms given by

Σ_n ψ1(x_n) Δx_n        (13)
(where Δx_n = x_n − x_{n−1}), it is seen that for every positive contribution (to the sum) given by ψ1(x_i) Δx_i there is a term equal in magnitude but opposite in sign [ψ1(−x_i) Δx_i = −ψ1(x_i) Δx_i]. Thus, the entire sum in Equation 13 is zero. Similarly, by inspection,

∫_{−∞}^{+∞} ψ0(x) dx ≠ 0        (14)
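The even/odd integral argument of Equations 12 to 15 is easy to verify numerically. The sketch below uses unnormalized Gaussians in reduced units (an assumption for illustration) and mirrors the Riemann-sum reasoning of Equation 13.

```python
# Numerical check: the odd function psi1 integrates to zero over a
# symmetric interval, the even function psi0 does not. Unnormalized
# Gaussians in reduced units are assumed for illustration.
import math

def psi0(x):
    return math.exp(-x * x)          # even: psi0(-x) = psi0(x)

def psi1(x):
    return x * math.exp(-x * x)      # odd: psi1(-x) = -psi1(x)

def riemann_sum(f, lo=-10.0, hi=10.0, n=20000):
    """Midpoint Riemann sum, mirroring the summation of Equation 13."""
    dx = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * dx) * dx for i in range(n))

print(abs(riemann_sum(psi1)) < 1e-9)   # True: contributions cancel in pairs
print(riemann_sum(psi0) > 1.0)         # True: all contributions share a sign
```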
Overview of Group Theoretical Analysis of Vibrational Raman Spectroscopy

The use of group theory in determining whether or not a given integral can have a value of zero may be illustrated using the ground vibrational state ψ0(x) and first vibrational excited state ψ1(x) of a harmonic oscillator. The equations describing these two vibrational states are, respectively,

ψ0(x) = exp(−2π²mνx²/h)        (10)

ψ1(x) = (4π²mν/h)^{1/2} x exp(−2π²mνx²/h)        (11)
Figure 5. Ground-state vibrational wave function and first excited-state vibrational wave function of harmonic oscillator.
RAMAN SPECTROSCOPY OF SOLIDS
That is, every term ψ0(x_i) Δx_i in the sum Σ_n ψ0(x_n) Δx_n has the identical sign and, hence,

∫_{−∞}^{+∞} ψ0(x) dx ≈ Σ_n ψ0(x_n) Δx_n ≠ 0        (15)
The above analysis of the integrals in Equations 12 and 13 can be repeated using group theory. To begin, there needs to be a mathematical description of what is meant by the symmetry of a function. The difference between the symmetry of ψ1(x) and ψ0(x) can be expressed as follows. Imagine that there is a mirror plane parallel to the YZ plane that passes through the origin, (0, 0, 0); then the operation of this mirror plane on the function ψ0(x) is described by the phrase "a mirror plane acting on ψ0(x) yields ψ0(x)." If the effect of the mirror plane acting on ψ0(x) is represented by a matrix [R], then

Effect of mirror plane acting on ψ0(x) = [R][ψ0(x)]        (16)

In this case, [R] = [I] is the identity matrix. Similarly, for ψ1(x), the effect of a mirror plane acting on ψ1(x) equals −ψ1(x). In matrix notation, this statement is given as

Effect of mirror plane acting on ψ1(x) = [R]ψ1(x) = −[I]ψ1(x)        (17)
Note that the effect of the mirror operation on x is to produce −x and that ψ1(−x) = −ψ1(x). The character of the one-dimensional matrix in Equation 16 is +1 and that in Equation 17 is −1. Consider now a molecule that exhibits a number of symmetry elements (e.g., mirror planes, axes of rotation, center of inversion). If the operation of each of these on a function leaves the function unchanged, then the function is said to be "totally symmetric" with respect to the symmetry elements of the molecule (or, in other words, the function is totally symmetric with respect to the molecular point group, which is the collection of all the molecule's symmetry elements). For the totally symmetric case, the character of each one-dimensional (1D) matrix representing the effect of the operation of each symmetry element on the function is +1 for all symmetry operations of the molecule. For such a function, the value of the integral over all space of ψ(x) dx will be nonzero. If the operation of any symmetry element of the molecule (in the molecular point group) generates a matrix that has a character other than +1, the function is not totally symmetric and the integral of ψ(x) dx over all space will be zero. Now consider the application of group theory to vibrational spectroscopy. A vibrational Raman transition generally consists of the incident light causing a transition from the vibrational ground state to a vibrational excited state. The transition may occur if

∫ ψ1(x) αij ψ0(x) dx ≠ 0        (18)
Table 1. Character Table of Point Group C2v

C2v     E      C2(z)    σv(xz)    σv'(yz)    Basis Functions
A1      +1     +1       +1        +1         z; x², y², z²
A2      +1     +1       −1        −1         xy
B1      +1     −1       +1        −1         x; xz
B2      +1     −1       −1        +1         y; yz
That is, the integral may have a nonzero value if the integrand product is totally symmetric over the range of integration. The symmetry of a function that is the product of two functions is totally symmetric if the symmetries of the two functions are the same. In addition, for the sake of completeness, the last statement will be expanded using group theoretical terminology, which will be defined below (see Example: Raman Active Vibrational Modes of α-Al2O3). The symmetry species of a function that is the product of two functions will contain the totally symmetric irreducible representation if the symmetry species of one function contains a component of the symmetry species of the other. The ground vibrational state is totally symmetric. Hence, the integrand is totally symmetric and the vibrational mode is Raman active if the symmetry of the excited vibrational state ψ1(x) is the same as, or contains, the symmetry of the polarizability operator αij. As shown below (see Appendix), the symmetry of the operator αxy is the same as that of the product xy. Thus, the integrand ψ1 αxy ψ0 is totally symmetric if ψ1 has the same symmetry as xy (recall that ψ0, the vibrational ground state, is totally symmetric). A molecular point group is a mathematical group (see Appendix) whose members consist of all the symmetry elements of the molecule. Character tables summarize much of the symmetry information about a molecular point group. Complete sets of character tables may be found in dedicated texts on group theory (e.g., Bishop, 1973). The character table for the point group C2v is presented in Table 1. In the far right-hand column are listed functions of interest in quantum mechanics and vibrational spectroscopy in particular. The top row lists the different symmetry elements contained in the point group and illustrated in Figure 6. Here, C2 is a twofold rotational axis that is parallel to the z axis.
The parameters σxz and σyz are mirror planes parallel to the xz and yz planes, respectively. In each row beneath the top row is listed a set of numbers. Each number is the character of the matrix that represents the effect of the symmetry operation acting on any of the functions listed at the right-hand end of the same row. In the point group C2v, e.g., the functions z, x², y², and z² have the same symmetry, which in the terminology of group theory is identified as A1. The functions y and yz have the same symmetry, B2. As shown in Figure 6, the water molecule, H2O, exhibits the point group symmetry C2v. There are three atoms in H2O, and hence there are 3N − 6 = 3 normal vibrational modes, which are depicted in Figure 7. These exhibit the symmetries A1, A1, and B2. That is, the vibrational wave functions of the first excited state of each of these modes
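This character-table bookkeeping can be mechanized. The sketch below encodes Table 1 and tests whether the product of two symmetry species is totally symmetric, which is the Raman-activity criterion developed above; the B2 example corresponds to the H2O mode detected through α_yz (yz transforms as B2).

```python
# Character-table arithmetic for C2v (Table 1): a product of two symmetry
# species is totally symmetric (A1) exactly when the species coincide,
# which decides Raman activity of a mode via the alpha_ij operators.
C2V = {                    # characters under E, C2(z), sigma_v(xz), sigma_v'(yz)
    "A1": (1, 1, 1, 1),
    "A2": (1, 1, -1, -1),
    "B1": (1, -1, 1, -1),
    "B2": (1, -1, -1, 1),
}

def product_is_totally_symmetric(a, b):
    """Multiply characters operation by operation and compare with A1."""
    return tuple(x * y for x, y in zip(C2V[a], C2V[b])) == C2V["A1"]

# The B2 mode of H2O is Raman active through alpha_yz (yz transforms as B2):
print(product_is_totally_symmetric("B2", "B2"))   # True
print(product_is_totally_symmetric("B2", "A1"))   # False
```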
Figure 6. Symmetry operations of H2O (point group C2v).
possess the symmetries A1, A1, and B2, respectively. Since x², y², and z² exhibit A1 symmetry, so too do αxx, αyy, and αzz. Hence the product of αxx, αyy, or αzz with ψ1 for the vibrational modes with A1 symmetry would be totally symmetric, meaning that the integral over all space of ψA1 αxx(or yy,zz) ψ0 dV may be nonzero, meaning that this vibrational mode is Raman active (the integrand ψA1 αxx(or yy,zz) ψ0 dV is the same as the integrand in Equation 7). Now consider the Raman vibrational spectra of solids. One mole of a solid will have 3 × 6.023 × 10²³ − 6 normal modes of vibration. Fortunately, it is not necessary to consider all of these. As demonstrated below (see Appendix), the only Raman active modes will be those near the center of the Brillouin zone (BZ; k = 0). Solids with only one atom per unit cell have no optical modes at the center of the BZ and hence are Raman inactive (see Appendix). Solids with
more than one atom per unit cell, e.g., silicon and Al2O3, are Raman active. The symmetry of a crystal is described by its space group. However, as demonstrated below (see Appendix), for the purposes of vibrational spectroscopy, the symmetry of the crystal is embodied in its crystallographic point group. This is an extremely important and useful conclusion. As a consequence, the above recipe for deciding the Raman activity of the vibrational modes of H2O can be applied to the vibrational modes of any solid whose unit cell is multiatomic.
PRACTICAL ASPECTS OF THE METHOD The weakness of Raman scattering is the characteristic that most strongly influences the equipment and experimental techniques employed in Raman spectroscopy. A conventional Raman facility consists of five major components: (1) a source of radiation, (2) optics for illuminating the sample and collecting the Raman scattered radiation, (3) a spectrometer for dispersing the Raman scattered radiation, (4) a device for measuring the intensity of the Raman scattered light, and (5) a set of components that control the polarization of the incident radiation and monitor the polarization of the Raman scattered radiation (Chase, 1991; Ferraro and Nakamoto, 1994). Sources of Radiation
Figure 7. Normal vibrational modes of H2O.
Prior to the introduction of lasers, the primary source of monochromatic radiation for Raman spectroscopy was the strongest excitation line in the visible region (blue, 435.83 nm) of the mercury arc. The power density of the monochromatic radiation incident on the sample from a mercury arc is very low. To compensate, relatively large samples and complicated collection optics are necessary to, respectively, generate and collect as many Raman scattered photons as possible. Collecting Raman photons that are scattered in widely different directions precludes investigating the direction and polarization characteristics of the
Raman scattered radiation. This is valuable information that is needed to identify the component(s) of the crystal's polarizability tensor that is (are) responsible for the Raman scattering (Long, 1977). In general, laser radiation is unpolarized. However, in ionized gas lasers, such as argon and krypton ion lasers, a flat window with parallel faces and oriented at the Brewster angle (θBrewster) relative to the incident beam is positioned at the end of the tube that contains the gas. This window is called a Brewster window. The Brewster angle is also called the polarizing angle and is given by the arctangent of the ratio of the indices of refraction of the two media that form the interface (e.g., for air/glass, θBrewster = tan⁻¹ 1.5 ≈ 57°). If an unpolarized beam of light traveling in air is incident on a planar glass surface at the Brewster angle, the reflected beam will be linearly polarized with its electric field vector perpendicular to the plane of incidence. (The plane of incidence contains the direction of propagation of the light and the normal to the interface between the two media.) A beam of light that is linearly polarized with its electric field vector in the plane of incidence and is incident on a Brewster window at θBrewster will be entirely transmitted. Thus, the light that exits from a laser tube that is capped with a Brewster window is linearly polarized (Fowles, 1989; Hecht, 1993). Lasers have greatly increased the use of Raman spectroscopy as a tool for both research and chemical identification. In particular, continuous-wave gas lasers, such as argon, krypton, and helium-neon, provide adequately powered, monochromatic, linearly polarized, collimated, small-diameter beams that are suitable for obtaining Raman spectra from all states of matter and for a wide range of sample sizes using relatively simple systems of focusing and collection optics.
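The quoted air/glass figure follows directly from the arctangent relation; evaluating it numerically gives about 56.3°, commonly rounded to 57°.

```python
# Brewster (polarizing) angle from the arctangent of the refractive-index
# ratio, theta_B = arctan(n2/n1), for an air/glass interface.
import math

def brewster_deg(n1, n2):
    """Polarizing angle for light in medium n1 incident on medium n2."""
    return math.degrees(math.atan(n2 / n1))

print(f"{brewster_deg(1.0, 1.5):.1f} degrees")   # about 56.3 degrees
```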
Optics

As the light exits the laser, it generally first passes through a filter set to transmit a narrow band of radiation centered at the laser line (e.g., an interference filter). This reduces the intensity of extraneous radiation, such as the plasma lines that also exit from the laser tube. A set of mirrors and/or lenses will make the laser light incident on the sample at the desired angle and spot size. The focusing lens reduces the beam's diameter to

d = 4λf/(πD)        (19)
where λ is the wavelength of the laser radiation, f is the focal length of the lens, and D is the diameter of the unfocused laser beam. Note that for a given focusing lens and incident light, the size of the focused spot is inversely proportional to the size of the unfocused beam. Thus, a smaller spot size can be generated by first expanding the laser beam to the size of the focusing lens. The distance l over which the beam is focused is proportional to the square of the focused beam's diameter (Hecht, 1993):
l = 16λf²/(πD²) = πd²/λ        (20)
Thus, to reduce the diameter of the focal spot from an argon laser (λ = 514.5 nm) to 1 μm, the distance between the focusing lens and the sample must be accurate to within (1/2)π(1 μm)²/(0.5145 μm) ≈ 3 μm. The specularly reflected light is of no interest in Raman spectroscopy. The inelastically scattered light is collected using a fast lens (i.e., low f number, defined as the focal length per diameter of the lens). Often, the lens from a 35-mm camera is a very cost-effective collection lens. The collection lens collimates the Raman scattered light and transmits it toward a second lens that focuses the light into the entrance slit of the spectrometer. The f number of the second lens should match that of the spectrometer. Otherwise, the spectrometer's grating will either not be completely filled with light, which will decrease the resolving power (which is directly proportional to the size of the grating), or be smaller than the beam of radiation, which will result in stray light that may reflect off surfaces inside the spectrometer and reduce the signal-to-noise ratio. There are many sources of stray light in a Raman spectrum. One of the major sources is elastically scattered light (it has the same wavelength as the incident radiation), caused by Rayleigh scattering by the molecules/atoms responsible for Raman scattering and Mie scattering from larger particles such as dust. Rayleigh scattering always accompanies Raman scattering and is several orders of magnitude more intense than Raman scattering. The Rayleigh scattered light comes off the sample at all angles and is captured by the collection optics along with the Raman scattered light and is directed into the spectrometer. This would not be a problem if the Rayleigh scattered light exited the spectrometer only at the 0-cm⁻¹ position.
However, because of imperfections in the gratings and mirrors within the spectrometer, a portion of the Rayleigh scattered light exits the spectrometer in the same region as Raman scattered radiation that is shifted in the range of 0 to 250 cm⁻¹ from the incident radiation. Even though only a fraction of the Rayleigh scattered light is misdirected in this way, the consequence is significant because the intensity of the Rayleigh component is so much greater than that of the Raman component. Raman peaks located within 250 cm⁻¹ of the laser line can be detected only if the intensity of the Rayleigh scattered light is strongly reduced. The intensity of stray light that reaches the detector can be diminished by passing the collimated beam of elastically and inelastically scattered light collected from the sample through a notch filter. After exiting the notch filter, the light enters a lens that focuses it into the entrance of the spectrometer. A notch filter derives its name from the narrow band of wavelengths that it filters out. In a Raman experiment, the notch filter is centered at the exciting laser line and is placed in front of the spectrometer. (Actually, as mentioned above and explained below, the notch filter should be located in front of the lens that focuses the scattered light into the spectrometer.) The notch filter then removes the Rayleigh scattered radiation (as well as Brillouin scattered light and Mie scattering from dust particles). Depending on the width of the notch, it may also remove a significant amount of the Raman spectrum.
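The arithmetic behind this argument can be made concrete. In the sketch below the specific numbers are illustrative assumptions, not measured values: the Rayleigh line is taken as 10⁶ times the Raman intensity (the text says only "several orders of magnitude"), and the fraction of Rayleigh light misdirected inside the spectrometer is assumed to be 10⁻⁴.

```python
# Illustrative magnitudes only (assumptions, not measurements):
raman = 1.0                      # Raman peak intensity, arbitrary units
rayleigh = 1.0e6 * raman         # assumed Rayleigh line, ~6 orders of magnitude larger
stray_fraction = 1.0e-4          # assumed fraction misdirected by imperfect optics

# Without a notch filter, the stray Rayleigh light still dwarfs the Raman signal:
print(rayleigh * stray_fraction / raman)         # stray light is ~100x the Raman peak

# A notch filter of optical density OD attenuates the laser line by 10**OD:
OD = 6.0
print(rayleigh * 10**-OD * stray_fraction / raman)  # stray light now negligible
```

This is why even a small misdirected fraction buries peaks near 0 cm⁻¹ unless the Rayleigh line is attenuated by many orders of magnitude first.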
OPTICAL IMAGING AND SPECTROSCOPY
Dielectric notch filters consist of multiple thin layers of two materials with different indices of refraction. The materials are arranged in alternating layers so that the variation of index of refraction with distance is a square wave. Typically, dielectric filters have a very wide notch with diffuse edges and have nonuniform transmission in the region outside the notch. As a consequence, a dielectric notch filter removes too much of the Raman spectrum (the low-wavenumber portion) and distorts a significant fraction of the portion it transmits. In contrast, holographic notch filters have relatively narrow widths, with sharp edges and fairly uniform transmission in the region outside the notch (Carrabba et al., 1990; Pelletier and Reeder, 1991; Yang et al., 1991; Schoen et al., 1993).

A holographic notch filter (HNF) is basically an interference filter. In one manufacturing process (Owen, 1992), a photosensitive material consisting of a dichromated gelatin film is placed on top of a mirror substrate. Laser light that is incident on the mirror surface interferes with its reflected beam and forms a standing-wave pattern within the photosensitive layer. The angle between the incident and reflected beams determines the fringe spacing of the hologram. If the incident beam is normal to the mirror, the fringe spacing will be equal to the wavelength of the laser light. Chemical processing generates within the hologram-exposed material an approximately sinusoidal variation of index of refraction with distance through the thickness of the layer. The wavelength of the modulation of the index of refraction determines the central wavelength of the filter. The amplitude of the modulation and the total thickness of the filter determine the bandwidth and optical density of the filter.
The precise shape and location of the ''notch'' of the filter are angle tunable, so the filter should be located between the lens that collects and collimates the scattered light and the lens that focuses the Raman scattered radiation into the spectrometer, such that only collimated light passes through the filter. A holographically generated notch filter can have a relatively narrow, sharp-edged band of wavelengths that will not be transmitted. The band is centered at the wavelength of the exciting laser line and may have a total width of 300 cm⁻¹. The narrow width of the notch and its high optical density permit the measurement of both the Stokes and anti-Stokes components of a spectrum, as is illustrated in the Raman spectrum of zirconia presented in Figure 4. In addition to a notch filter, it is also possible to holographically generate a bandpass filter, which can be used to remove the light from the plasma discharge in the laser tube (Owen, 1992). In contrast to dielectric filters, holographic bandpass filters can have a bandpass that is five times narrower and can transmit up to 90% of the laser line; dielectric bandpass filters typically transmit only 50% of the laser line.
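It is often useful to translate a notch width quoted in wavenumbers into a wavelength interval. A minimal sketch, using the differential relation dλ ≈ λ²·dν̃ that follows from ν̃ = 1/λ, converts the 300 cm⁻¹ total notch width at the 514.5-nm argon line into nanometers:

```python
def shift_to_nm(lambda_nm, delta_wavenumber_cm):
    """Convert a wavenumber interval (cm^-1) near wavelength lambda_nm
    into a wavelength interval (nm), using d(lambda) ~ lambda**2 * d(nu)."""
    lambda_cm = lambda_nm * 1e-7          # nm -> cm
    return lambda_cm**2 * delta_wavenumber_cm * 1e7   # cm -> nm

# A 300 cm^-1 notch centered on the 514.5-nm laser line:
print(shift_to_nm(514.5, 300))   # ~7.9 nm total width
```

The same conversion shows why a notch that seems narrow in nanometers can still swallow the low-wavenumber portion of a Raman spectrum.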
RAMAN SPECTROSCOPY OF SOLIDS

Optical Alignment

Alignment of the entire optical system is most easily accomplished by use of a second laser, which need only have an intensity of ~1 mW. The spectrometer is set to transmit the radiation of the second laser, which is located at the exit slit of the spectrometer. For convenience, instead of having to remove the light detector, the light from the second laser can enter through a second, nearby port and be made to hit a mirror that is inserted inside the spectrometer, just in front of the exit slits. The light reflected from the mirror then passes through the entire spectrometer and exits at the entrance slits, a process termed ''back illumination.'' The back-illuminated beam travels the same path through the spectrometer (from just in front of the exit slits to the entrance slits) that the Raman scattered radiation from the sample will travel in getting to the light detector. Consequently, the system is aligned by sending the back-illuminated beam through the collection optics and into coincidence with the exciting laser beam on the sample's surface.

Spectrometer

The dispersing element is usually a grating, and the spectrometer may typically have one to three gratings. Multiple gratings are employed to reduce the intensity of the Rayleigh line. That is, light that is dispersed by the first grating is collimated and made incident on a second grating. However, the overall intensity of the dispersed light that exits from the spectrometer decreases as the number of gratings increases. The doubly dispersed light has a higher signal-to-noise ratio, but its overall intensity is significantly lower than that of singly dispersed light. Since the intensity of Raman scattered radiation is generally very weak (often orders of magnitude weaker than the Rayleigh line), there is an advantage to decreasing the intensity of the Rayleigh light that enters the spectrometer by using a single grating in combination with a notch filter rather than by using multiple gratings (see Optics). Modern spectrometers generally make use of interference or holographic gratings. These are formed by exposing photosensitive material to the interference pattern produced by reflecting a laser beam at normal incidence off a mirror surface or by intersecting two coherent beams of laser light. Immersing the photosensitized material in a suitable solvent, in which the regions exposed to the maximum light intensity experience either enhanced or retarded rates of dissolution, produces the periodic profile of the grating. The surface of the grating is then typically coated with a thin, highly reflective metal coating (Hutley, 1982). The minimum spacing of grooves that can be generated by interference patterns is directly proportional to the wavelength of the laser light. The minimum groove spacing that can be formed using the 458-nm line of an argon ion laser is 0.28 μm (3500 grooves/mm). The line spacing of a grating dictates one of its most important characteristics, its resolving power. The resolving power of a grating is a measure of the smallest change of wavelength that the grating can resolve:

λ/Δλ = mW/d    (21)
where λ is the wavelength at which the grating is operating, m is the diffraction order, W is the width of the grating, and d is the spacing of the grating grooves (or steps). This relation indicates the importance of just filling the grating with light, as well as the influence of the groove spacing, generally expressed as the number of grooves per millimeter, on the resolving power. A second parameter of interest for characterizing the performance of a grating is its absolute efficiency, which is a measure of the fraction of the light incident on the grating that is diffracted into the required order. It is a function of the shape of the groove, the angle of incidence, the wavelength of the incident light, the polarization of the light, the reflectance of the material that forms the grating, and the particular instrument that houses the gratings (Hutley, 1982). The efficiency of a grating can be measured or calculated with the aid of a computer. The absolute efficiency is measured by taking the ratio of the flux in the diffracted beam to the flux in the incident beam. Generally, it is not necessary for a Raman spectroscopist to measure and/or calculate the absolute efficiencies of gratings; such information may be available from the manufacturer of the gratings.

Measurement of the Dispersed Radiation

A device for measuring the intensity of the dispersed radiation is located just outside the exit port of the spectrometer. Since the intensity of the Raman scattered radiation is so weak, a very sensitive device is required to measure it. Three such devices are the photomultiplier tube (PMT), the photodiode array, and the charge-transfer device. The key elements in a PMT are a photosensitive cathode, a series of dynodes, and an anode (Long, 1977; Ferraro and Nakamoto, 1994). The PMT is located at the exit of the spectrometer, which is set to transmit radiation of a particular wavelength.
For Stokes-shifted Raman spectroscopy, this wavelength is longer than that of the exciting laser. As photons of this energy exit the spectrometer, they are focused onto the photocathode of the PMT. For each photon that is absorbed by the photocathode, an electron is ejected and is accelerated toward the first dynode, whose potential is ~100 V positive with respect to the photocathode. For each electron that hits the first dynode, several are ejected and are accelerated toward the second dynode. A single photon entering the PMT may cause a pulse of 10⁶ electrons at the anode. Long (1977) describes the different procedures for relating the current pulse at the anode to the intensity of the radiation incident on the PMT. The efficiency of the PMT is increased through the use of a photoemission element consisting of a semiconductor whose surface is coated with a thin layer (~2 nm) of material with a low work function (e.g., a mixture of cesium and oxygen; Fraser, 1990). For a heavily doped p-type semiconductor coated with such a surface layer, the electron affinity in the bulk semiconductor is equal to the difference between its band-gap energy and the surface layer work function. If the work function is smaller than the band
gap, the semiconductor has a negative electron affinity. This means that the lowest energy of an electron in the conduction band in the bulk of the semiconductor is higher than the energy of an electron in vacuum, which increases the number of electrons emitted per absorbed photon (i.e., the quantum efficiency). The gain of the PMT can be increased by treating the dynode surface in a similar fashion. A dynode with negative electron affinity emits a greater number of electrons per incident electron. The overall effect is a higher number of electrons at the anode per photon absorbed at the cathode. It is important to note that the PMT needs to be cooled to reduce the number of thermally generated electrons at the photocathode and dynodes, which add to the PMT ''dark count.'' Heat is generally extracted from the PMT by a thermoelectric cooler, and the heat extracted by the cooler is generally conducted away by flowing water.

A PMT is well suited to measuring the weak signal that exits a spectrometer that has been set to pass Raman scattered radiation of a single energy. The entire Raman spectrum is measured by systematically changing the energy of the light that passes through the spectrometer (i.e., rotating the grating) and measuring its intensity as it exits from the spectrometer. For a double monochromator fitted with a PMT, it may take minutes to tens of minutes to generate a complete Raman spectrum of one sample.

In a PMT, current is generated as photons hit the cathode. In contrast, each active segment in a photodiode array (PDA) or a charge-coupled device (CCD) stores charge (rather than generating current) that is created by photons absorbed at that location. The Raman spectrum is generated from the spatial distribution of charge that is produced in the device. In both a PDA and a CCD, photons are absorbed and electron-hole pairs are created. The junction in which the electron-hole pairs are created is different in the two devices.
In a PDA, the junction is a reverse-biased p-n junction. In a CCD, the junction is the depletion zone in the semiconductor at the semiconductor-oxide interface of a metal-oxide-semiconductor structure. When the reverse-biased p-n junction is irradiated by photons with an energy greater than the band gap, electron-hole pairs are generated from the absorbed photons. Minority carriers from the pairs formed close (i.e., within a diffusion distance) to the charge-depleted layer of the junction are separated from their oppositely charged partners and driven across the junction by its electric field. The increase in reverse saturation current density is proportional to the light intensity hitting the photodiode. To minimize the ''dark counts,'' the photodiode must be cooled to reduce the number of thermally generated electron-hole pairs. If the junction is at equilibrium and is irradiated with photons of energy greater than the band gap, electron-hole pairs generated in the junction are separated by the built-in electric field; the separated electrons and holes lower the magnitude of the built-in field. If an array of diodes is distributed across the exit of the spectrometer, the distribution of charge that is created in the array provides a measure of the intensity of the radiation that is dispersed by the spectrometer across the focal plane for the exiting radiation.
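The mapping from array position to Raman shift can be sketched in a few lines. The function below is a minimal illustration assuming a hypothetical 1024-element array with perfectly linear dispersion; the center pixel, the dispersion of 0.5 cm⁻¹ per pixel, and the 520 cm⁻¹ center shift are all invented parameters, not properties of any particular instrument.

```python
def pixel_to_shift(pixel, center_pixel, dispersion_cm1_per_pixel, center_shift_cm1):
    """Map a diode/pixel index to a Raman shift (cm^-1), assuming
    (hypothetically) linear dispersion across the array."""
    return center_shift_cm1 + (pixel - center_pixel) * dispersion_cm1_per_pixel

# Hypothetical 1024-element array centered on 520 cm^-1 at 0.5 cm^-1/pixel:
shifts = [pixel_to_shift(p, 512, 0.5, 520.0) for p in range(1024)]
print(shifts[0], shifts[512], shifts[-1])   # 264.0 520.0 775.5
```

Each diode thus reports the intensity at one shift value, and the whole array spans roughly a 512 cm⁻¹ window in a single exposure.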
The time to measure a Raman spectrum can be greatly reduced by measuring the entire spectrum at once, rather than one energy value at a time, as is the case with a PMT. Multichannel detection is accomplished by, e.g., placing a PDA at the exit of a spectrometer. There are no slits at the exit, which is filled by the PDA. Each tiny diode in the array measures the intensity of the light that is dispersed to that location. Collectively, the entire array measures the whole spectrum (or a significant fraction of it) all at once. Generally, an individual photodiode in a PDA is not as good a detector as a PMT. However, for some experiments the multichannel advantage compensates for the lower-quality detector. Unfortunately, a PDA does not always provide the sensitivity at low light intensities that is needed in Raman spectroscopy. A CCD is a charge-transfer device that combines the multichannel detection advantage of a PDA with a sensitivity at low light intensities that rivals that of a PMT (Bilhorn et al., 1987a,b; Epperson, 1988). Charge-coupled devices are made of metal-oxide-semiconductor elements and, in terms of light detection, function analogously to photographic film. That is, photons that hit and are absorbed at a particular location of the CCD are stored at that site in the form of photogenerated electrons (for a p-type semiconductor substrate) or photogenerated holes (for an n-type semiconductor substrate). One example of a CCD is a heavily doped, p-type silicon substrate whose surface is coated with a thin layer of SiO2, on top of which is a series of discrete, closely spaced metal strips, referred to as gates. The potential of each adjacent metal strip is set at a different value so that the width of the depletion layer in the semiconductor varies periodically across its surface. When a positive potential is applied to a metal strip, the mobile holes in the p-type silicon are repelled from the surface region, creating a depleted layer.
If a high enough potential is applied to a metal strip, significant bending of the bands in the region of the semiconductor close to the oxide layer causes the bottom of the conduction band to approach the Fermi level. Electrons occupy states in the conduction band near the surface forming an inversion layer, i.e., an n-type surface region in the bulk p-type semiconductor. Photons that are absorbed in the surface region generate electron-hole pairs in which the electrons and holes are driven apart by the field in the depletion layer. The holes are expelled from the depletion layer and the electrons are stored in the potential well at the surface. As the intensity of light increases, the number of stored electrons increases. The light exiting the spectrometer is dispersed and so hits the CCD at locations indicative of its wavelength. The dispersed light creates a distribution of charge in the depletion layer from which the sample Raman spectrum is generated. The distribution of charge corresponding to the Raman spectrum is sent to a detector by shifting the periodic variation in potential of the metal strips in the direction of the detector. In this manner the charge stored in the depletion layer of the semiconductor adjacent to a particular metal strip is shifted from one strip to the next. Charge arrives at the detector in successive packets, corresponding to the sequential distribution of the metal gates across the face of the CCD.
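The ''bucket brigade'' readout described above can be caricatured in a few lines of code. This is a toy sketch, not a device model: each list element stands for the photoelectron packet stored under one gate, and each loop iteration stands for one clock cycle in which every packet advances one gate toward the readout node.

```python
def clock_out(wells):
    """Read out a CCD row: on each clock cycle every stored charge packet
    shifts one gate toward the readout node, and the packet at the end
    of the row is delivered to the detector."""
    wells = list(wells)           # photoelectron counts under each gate
    readout = []
    while wells:
        readout.append(wells.pop())   # packet nearest the output node arrives
        # the remaining packets have each advanced one gate toward the output
    return readout

stored = [5, 0, 17, 3]        # charge accumulated under four gates during exposure
print(clock_out(stored))      # [3, 17, 0, 5] -- packets arrive in reverse spatial order
```

The essential point the sketch captures is that charge is moved, not copied: after readout the wells are empty and the packet sequence preserves the spatial (and hence spectral) order of the exposure.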
An illustration of the increased efficiency of measuring Raman spectra that has emerged in the past 10 years is provided by a comparison of the time required to measure a portion (200 to 1500 cm⁻¹) of the surface-enhanced Raman spectrum (see the following paragraph) of thin passive films grown on samples of iron immersed in aqueous solutions. Ten to 15 min (Gui and Devine, 1991) was required to generate the spectrum by using a double spectrometer (Jobin-Yvon U1000 with 1800 grooves/mm holographic gratings) and a PMT (RCA C31034 GaAs), while the same spectrum can now be acquired in 5 s (Oblonsky and Devine, 1995) using a single monochromator (Spex 270M), in which the stray light is reduced by a notch filter (Kaiser Optics Super Holographic Notch Filter centered at 647.1 nm) and the intensity of the dispersed Raman scattered radiation is measured with a CCD (Spectrum One, 298 × 1152 pixels). The enhanced speed in generating the spectrum makes it possible to study the time evolution of samples.

In the previous paragraph, mention was made of surface-enhanced Raman scattering (SERS), which is the greatly magnified intensity of the Raman scattered radiation from species adsorbed on the surfaces of specific metals with either roughened surfaces or colloidal dimensions (Chang and Furtak, 1982). SERS is not a topic of this unit, but a great deal of interest during the past 15 years has been devoted to the use of SERS in studies of surface adsorption and surface films.

Polarization

By controlling the polarization of the incident radiation and measuring the intensity of the Raman scattered radiation as a function of its polarization, it is possible to identify the component(s) of the polarizability tensor of a crystal that are responsible for the different peaks in the Raman spectrum (see Example: Raman Active Vibrational Modes of α-Al2O3, below). This information will then identify the vibrational mode(s) responsible for each peak in the spectrum.
The procedure is illustrated below for α-Al2O3 (see Example: Raman Active Vibrational Modes of α-Al2O3). The configuration of the polarizing and polarization-measuring components is discussed in Scherer (1991).

Fourier Transform Raman Spectroscopy

The weak intensity of Raman scattering has already been mentioned as one of the major shortcomings of Raman spectroscopy. Sample heating and fluorescence are two other phenomena that can increase the difficulty of obtaining Raman spectra. Sample heating is caused by absorption of energy during illumination with laser beams of high power density. Fluorescence is also caused by absorption of laser energy by the sample (including impurities in the sample). When the excited electrons drop back down to lower energy levels, they emit light whose energy equals the difference between the energies of the initial (excited) and final states. The fluorescence spectrum is (practically speaking) continuous and covers a wide range of values. The intensity of the fluorescence can be much greater than that of the Raman scattered radiation.
Both sample heating and fluorescence can be minimized by switching to exciting radiation whose energy is too low to be absorbed. For many materials, optical radiation in the red or near-IR range will not be absorbed. If such radiation is to be used in Raman spectroscopy, two consequences must be recognized and addressed. First, since Raman scattering is a light-scattering process, its intensity varies as λ⁻⁴. Hence, by switching to longer-wavelength exciting radiation, the intensity of the Raman spectrum will be significantly lowered. Second, the sensitivity of most PMTs to red and especially near-infrared radiation is very low. Consequently, the use of long-wavelength radiation in Raman spectroscopy has been coupled to the use of Fourier transform Raman spectroscopy (FTRS) (Parker, 1994). In FTRS, an interferometer is used in place of a monochromator. The Raman scattered radiation is fed directly into a Michelson interferometer and from there it enters the detector. The Raman spectrum is obtained by taking the cosine Fourier transform of the intensity of the radiation that reaches the detector, which varies in magnitude as the path difference between the moving mirror and the fixed mirror within the interferometer changes. Since the Raman scattered radiation is not dispersed, a much higher Raman intensity reaches the detector in FTRS than is the case for conventional Raman spectroscopy. The much higher intensity of the Raman scattered radiation in FTRS permits the use of longer-wavelength incident radiation (e.g., λ = 1064 nm from a Nd:YAG laser), which may eliminate fluorescence.
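The λ⁻⁴ penalty for moving to near-IR excitation is easy to quantify. A minimal sketch comparing the two laser lines named in the text:

```python
def relative_raman_intensity(lambda1_nm, lambda2_nm):
    """Scattered intensity at lambda2 relative to lambda1, from the
    lambda**-4 dependence of Raman scattering."""
    return (lambda1_nm / lambda2_nm) ** 4

# Moving from the 514.5-nm argon-ion line to the 1064-nm Nd:YAG line:
r = relative_raman_intensity(514.5, 1064.0)
print(round(1 / r))   # ~18-fold weaker scattering at 1064 nm
```

An order-of-magnitude loss of this size is part of why FTRS, with its undispersed throughput advantage, is paired with long-wavelength excitation.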
DATA ANALYSIS AND INITIAL INTERPRETATION

Raman scattering may result from incident radiation inducing transitions in electronic, vibrational, and rotational states of the scattering medium. Peaks in the Raman spectrum of a solid obtained using visible exciting radiation are generally associated with vibrational modes of the solid. The vibrations may be subdivided into internal modes that arise from the molecules or ions that make up the solid and external modes that result from collective modes (k = 0) of the crystal. In either case, the presence of a peak in the Raman spectrum due to the vibration requires that the Raman transition be symmetry allowed, i.e., that the integrand in Equation 7 have a nonzero value. In addition, for a measurable Raman intensity, the integral in Equation 7 must have a large enough magnitude. It is generally difficult to calculate the intensity of a Raman peak, but it is rather straightforward to use group theoretical techniques to determine whether or not the vibrational mode is symmetry allowed.

Example: Raman Active Vibrational Modes of α-Al2O3

The extremely useful result that the crystallographic point group provides all the information needed to know the symmetries of the Raman active vibrational modes of the crystal is derived below (see Appendix). The left-hand column in the character table for each point group
lists a set of symbols such as A1, B1, and B2 for the point group C2v. The term A1 is the list of characters of the matrices that represent the effect of each symmetry operation of the point group on any function listed in the far right-hand column of the character table. A remarkable result of group theory is that the list of characters of the matrices that represent the effect of each symmetry operation of the point group on ''any'' function can be completely described by a linear combination of the few lists presented in the character table. For example, for the point group C2v, the symmetry of the function x² is described by A1. This means that when the effect of each of the symmetry operations operating on x² is represented by a matrix, the characters of the four matrices are 1, 1, 1, 1. In fact, within the context of the point group C2v, the symmetry of any function is given by aA1 + bB1 + cB2, where a, b, and c are integers. In the language of group theory, A1, B1, and B2 are called the irreducible representations of the point group C2v. The linear combination of irreducible representations that describes the symmetry of a function is called the symmetry species of the function. The irreducible representations of the normal vibrational modes of a crystal are most easily identified by the correlation method (Fately et al., 1971). The analysis for the lattice vibrations of α-Al2O3 is summarized here; the details are contained in Fately et al. (1971). The space group of α-Al2O3 is D3d⁶. There are two molecules of α-Al2O3 per Bravais cell. Each atom in the Bravais cell has its own symmetry, termed the site symmetry, which is a subgroup of the full symmetry of the Bravais unit cell. The site symmetry of the aluminum atoms is C3 and that of the oxygen atoms is C2. From the character table for the C3 point group, displacement in the z direction is a basis for the A irreducible representation.
Displacements parallel to the x and y axes are bases for the E irreducible representation. The correlation tables for the species of a group and its subgroups (Wilson et al., 1955) correlate the A species of C3 to A1g, A2g, A1u, and A2u of the crystal point group D3d. Displacement of the oxygen atom in the z direction is a basis of the A irreducible representation of the C2 point group, which is the site symmetry of the oxygen atoms in α-Al2O3. Displacements of the oxygen atom parallel to the x and y axes are bases for the B irreducible representation of C2. The correlation table associates the A species of C2 with A1g, Eg, A1u, and Eu of D3d; the B species of C2 correlates with A2g, Eg, A2u, and Eu of D3d. After accounting for the number of degrees of freedom contributed by each irreducible representation of C2 and C3 to an irreducible representation of D3d and removing the irreducible representations associated with rigid translation of the entire crystal, the irreducible species for the optical modes of the corundum crystal are determined to be

2A1g + 2A1u + 3A2g + 2A2u + 5Eg + 4Eu    (22)
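A quick bookkeeping check confirms this decomposition. The 10-atom Bravais cell (two formula units of Al2O3) has 3N = 30 degrees of freedom; the optical species of Eq. 22 plus the three acoustic modes (one A2u and one doubly degenerate Eu translation) must account for all of them:

```python
# Degeneracies of the D3d irreducible representations (E species are 2-fold):
degeneracy = {"A1g": 1, "A1u": 1, "A2g": 1, "A2u": 1, "Eg": 2, "Eu": 2}

optical = {"A1g": 2, "A1u": 2, "A2g": 3, "A2u": 2, "Eg": 5, "Eu": 4}  # Eq. 22
acoustic = {"A2u": 1, "Eu": 1}   # rigid translations of the crystal

n_optical = sum(n * degeneracy[s] for s, n in optical.items())
n_acoustic = sum(n * degeneracy[s] for s, n in acoustic.items())
print(n_optical, n_acoustic, n_optical + n_acoustic)   # 27 3 30

# Of the optical species, only A1g and Eg are Raman active in D3d:
raman_active = optical["A1g"] + optical["Eg"]
print(raman_active)   # 7 modes -> at most seven Raman peaks
```

The total of 30 matches 3 × 10 degrees of freedom, and the seven Raman active modes (2 A1g + 5 Eg) are exactly those observed in the corundum spectra discussed below.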
Consulting the character table of the point group D3d indicates that the Raman active vibrational modes would span
Figure 8. Directions and polarizations of incident and scattered radiation for the x(zz)y configuration.
the symmetry species A1g and Eg. The basis functions for these symmetry species are listed in the character table:

x² + y² and z² for A1g    (23)

(x² − y², xy) and (xz, yz) for Eg    (24)

Thus there are a total of seven Raman active modes, so there may be as many as seven peaks in the Raman spectrum. The question now is which peaks correspond to which vibrational modes. Figure 8 depicts a Cartesian coordinate system, and a crystal of corundum is imagined to be located at the origin with the crystal axes parallel to x, y, and z. If light is incident on the crystal in the x direction and polarized in the z direction, and scattered light is collected in the y direction after having passed through a z polarizer, then the experimental conditions of incidence and collection are described in the Porto notation (Porto and Krishnan, 1967) as x(zz)y. The first term inside the parentheses indicates the polarization of the incident light and the second term denotes the polarization of the scattered light that is observed. In general, if z-polarized light is incident on a crystal in the x direction, then the induced polarization of the molecule will be given by [px, py, pz] = [αxz Ez, αyz Ez, αzz Ez]. Light scattered in the y direction would be the result of the induced polarization components px and pz. If the light scattered in the y direction is passed through a z polarizer, then the source of the observed radiation is the induced polarization along the z direction. Consequently, only the αzz component of the polarizability contributes to the scattered light in the x(zz)y case. Thus, peaks present in the Raman spectrum must originate from Raman scattering by vibrational modes that span the A1g irreducible species.

Figure 9 presents the Raman spectra reported by Porto and Krishnan (1967) for single-crystal corundum under a variety of incidence and collection conditions. Figure 9A is the Raman spectrum in the x(zz)y condition that was discussed above, so the two peaks present in the spectrum correspond to vibrational modes with A1g symmetry. The peaks present in the spectrum in Figure 9B can originate from vibrational modes with either A1g or Eg symmetry. Consequently, the peaks present in Figure 9B but missing in Figure 9A have Eg symmetry. The spectra in Figures 9C-E exhibit only peaks with Eg symmetry. Collectively, the spectra reveal all seven Raman active modes and distinguish between the A1g modes and the Eg modes.

The magnitude of the Raman shift in wavenumbers of each peak with respect to the exciting laser line can be related to the energy of the vibration. Combined with knowledge of the masses of the atoms involved in the vibrational mode and, e.g., the harmonic oscillator assumption, it is possible to calculate the force constant associated with the bond between the atoms participating in the vibration. If one atom in the molecule were replaced by another without changing the symmetry of the molecule, the peak would shift in accordance with the different strength of the bond and the mass of the new atom compared to the original atom. This illustrates why the Raman spectrum can provide information about bond strength and alloying effects.

Other Examples of the Use of Raman Spectroscopy in Studies of Solids
Raman spectroscopy can provide information on the amorphous/crystalline nature of a solid. Many factors can contribute to the widths of peaks in the Raman spectrum of a solid; microcrystallinity can be the greatest peak-broadening factor. For a macroscopic single crystal of silicon at room temperature, the full width at half-maximum (FWHM) of the peak at 522 cm⁻¹ is 3 cm⁻¹ (Pollak, 1991). The breadth of the peak increases dramatically as the size of the crystal decreases below ~10 nm. The cause of peak broadening in microcrystals is the finite width of the central peak in the Fourier transform of a phonon in a finite-size crystal. In an infinitely large crystal, the Fourier transform of a phonon consists of a single sharp line at the phonon frequency. The peak width Δk (i.e., FWHM) in a crystal of dimension L is given approximately by

Δk = 2π(v/c)/L    (25)
where v is the phonon velocity and c is the velocity of light. Assuming v = 1 × 10⁵ cm/s, Δk ≈ 20 cm⁻¹ for L = 10 nm and Δk ≈ 200 cm⁻¹ for L = 1 nm. In the extreme case the silicon is amorphous, the k = 0 selection rule breaks down, and all phonons are Raman active. The Raman spectrum then resembles the phonon density of states and is characterized by two broad peaks centered at 140 and 480 cm⁻¹ (Pollak, 1991). Consequently, the Raman spectrum can distinguish between the crystalline and amorphous structures of a solid and can indicate the grain size of microcrystalline solids.

Raman spectroscopy can be used to nondestructively measure the elastic stress in crystalline samples. Elastic straining of the lattice will alter the spring constant of a chemical bond and, hence, will shift the frequency of the
RAMAN SPECTROSCOPY OF SOLIDS
711
Figure 9. Raman spectra of corundum as a function of crystal and laser light polarization orientations (Porto and Krishnan, 1967).
vibrational mode associated with the strained bond. The direction and magnitude of the shift in peak location depend on the sign (tensile or compressive) and magnitude, respectively, of the strain. Raman spectroscopy has been used to measure the residual stresses in thin films deposited on substrates with lattice mismatches and in thermally cycled composites, where the strains result from differences in the thermal expansion coefficients of the two materials (Pollak, 1991). Another example of the use of Raman spectroscopy to nondestructively investigate the mechanical behavior of a solid is its use in studies of Al2O3-ZrO2 composites (Clarke and Adar, 1982). These two-phase structures exhibit improved mechanical toughness due to transformation of the tetragonal ZrO2 to the monoclinic form. The transformation occurs in the highly stressed region ahead of a growing crack and adds to the energy that must be expended in the propagation of the crack. Using Raman
microprobe spectroscopy, it was possible to measure, with a spatial resolution of 1 μm, the widths of the regions on either side of a propagating crack in which the ZrO2 transformed. The size of the transformed region is an important parameter in theories that predict the increased toughness expected from the transformation of the ZrO2. Phase transformations resulting from temperature or pressure changes in an initially homogeneous material have also been studied by Raman spectroscopy (Ferraro and Nakamoto, 1994). Raman spectra can be obtained from samples at temperatures and pressures that are markedly different from ambient conditions. All that is needed is the ability to irradiate the sample with a laser beam and to collect the scattered light. This can be accomplished by having an optically transparent window (such as silica, sapphire, or diamond) in the chamber that houses the sample and maintains the nonambient conditions. Alternatively, a fiber optic can be used to transmit the
712
OPTICAL IMAGING AND SPECTROSCOPY
incident light to the sample and the scattered light to the spectrometer. The influence of composition on the Raman spectrum is in evidence in semiconductor alloys such as Ga₁₋ₓAlₓAs (0 < x < 1), which exhibit both GaAs-like and AlAs-like longitudinal optical (LO) and transverse optical (TO) phonon modes. The LO modes have a greater dependence on composition than do the TO modes. The AlAs-like LO mode shifts by 50 cm⁻¹ and the GaAs-like mode shifts by 35 cm⁻¹ as x varies from 0 to 1. Thus the Raman spectrum of Ga₁₋ₓAlₓAs (0 < x < 1) can be used to identify the composition of the alloy (Pollak, 1991). Similarly, the Raman spectra of mixtures of HfO2 and ZrO2 are strong functions of the relative amounts of the two components. Depending on the particular mode, the Raman peaks shift by 7 to 60 cm⁻¹ as the amount of HfO2 increases from 0 to 100% (Ferraro and Nakamoto, 1994). Raman spectroscopy has long been used to characterize the structure of polymers: identifying functional groups, end groups, crystallinity, and chain orientation. One word of caution: as might be expected of any Raman study of organic matter, fluorescence problems can arise in Raman investigations of polymers (Bower and Maddams, 1989; Bulkin, 1991; Rabolt, 1991).

Raman Spectroscopy of Carbon

Because carbon can exist in a variety of solid forms, ranging from diamond to graphite to amorphous carbon, Raman spectroscopy is the single most effective characterization tool for analyzing carbon. Given the wide spectrum of structures exhibited by carbon, it provides a compelling example of the capability of Raman spectroscopy to analyze the structures of a large variety of solids. The following discussion illustrates the use of Raman spectroscopy to distinguish between the various structural forms of carbon and describes the general procedure for quantitatively analyzing the structure of a solid by Raman spectroscopy.
Figure 10, which was originally presented in an article by Robertson (1991), who compiled the data of a number of researchers, presents the Raman spectra of diamond, large-crystal graphite, microcrystalline graphite, glassy carbon, and several forms of amorphous carbon. Diamond has two atoms per unit cell and therefore has a single internal vibrational mode (3N − 5 = 1). This mode is Raman active and is located at 1332 cm⁻¹. Graphite, with four atoms per unit cell, has six internal vibrational modes (3N − 6 = 6), two of which are Raman active. The rigid-layer mode of graphite spans the symmetry species E2g and occurs at 50 cm⁻¹, which is too close to the incident laser line to be accessible with most Raman collection optics and spectrometers. The second Raman active mode of graphite is centered at 1580 cm⁻¹, and it too spans the symmetry species E2g. The relative displacements of the carbon atoms during this in-plane mode are presented next to the Raman spectrum in Figure 10. Thus, Raman spectroscopy can easily distinguish between diamond and large-crystal graphite. The third spectrum from the top of Figure 10 was obtained from microcrystalline graphite. The next lower spectrum is similar and belongs to glassy carbon. There
Figure 10. First-order Raman spectra of diamond, highly oriented pyrolytic graphite (hopg), microcrystalline graphite, glassy C, plasma-deposited a-C:H, sputtered a-C, and evaporated a-C. (From Robertson, 1991.)
are two peaks in these spectra, one at 1580 cm⁻¹ and the other at 1350 cm⁻¹. Given these peak positions, originally there was some thought that the structures were a mixture of graphitelike (sp2 bonding) and diamondlike (sp3 bonding) components. However, experiments that revealed the effects of heat treatments on the relative intensities of the two peaks clearly showed that the material was polycrystalline graphite and that the peak at 1350 cm⁻¹ is a consequence of the small size of the graphite crystals. The new peak at 1350 cm⁻¹ is labeled the disorder, or "D," peak, as it is a consequence of the breakdown of the crystal momentum conservation rule. In the Raman spectrum of an infinite crystal, the peak at 1350 cm⁻¹ is absent. This peak results from the vibrational mode sketched next to the spectrum for microcrystalline graphite in Figure 10, and it spans the symmetry species A1g. Its Raman activity is symmetry forbidden in a crystal of infinite size. Finite crystal size disrupts translational symmetry, and so the D mode is Raman active in microcrystalline graphite. The ratio of the integrated intensities of the peaks associated with the D and G modes varies inversely with the size of the crystal:

I(D mode)/I(G mode) = k/d    (26)
where d is the crystal size. This relationship may be used, for example, to calculate the relative sizes of two microcrystals and the relative mean grain diameters of two
Figure 11. Phonon density of states for graphite.
polycrystalline aggregates, so long as the grain size is greater than 12 Å. The behavior of crystals smaller than 12 Å begins to resemble that of amorphous solids. An amorphous solid of N atoms may be thought of as a giant molecule of N atoms. All 3N − 6 of its internal vibrational modes are Raman active, and the Raman spectrum is proportional to the phonon density of states. This is demonstrated by a comparison of the bottom three spectra in Figure 10 with the calculated phonon density of states of graphite in Figure 11. At this point several important distinctions between x-ray diffraction and Raman spectroscopy can be appreciated. Like Raman spectroscopy, x-ray diffraction is also able to distinguish between crystals, microcrystals, and amorphous forms of the same material. Raman spectroscopy, however, not only is able to distinguish between the various forms but also readily provides fundamental information about the amorphous state as well as the crystalline state. This is due to the fact that x-ray diffraction provides information about the long-range periodicity of the arrangement of atoms in the solid, while Raman spectroscopy provides information concerning the bonds between atoms and the symmetry of the atomic arrangement. On the other hand, while x-ray diffraction is in principle applicable to every crystal, Raman spectroscopy is only applicable to those solids with unit cells containing two or more atoms.

Quantitative Analysis

The quantitative analysis of a spectrum requires fitting each peak to an expression of Raman scattered intensity vs. frequency of the scattered light. Typically, a computer is used to fit the data to some mathematical expression of the intensity of the Raman scattered radiation as a function of its frequency with respect to that of the incident laser line. One example of a peak-fitting relationship is the expression (DiDomenico et al., 1968)

dI/d(Δω) = (const)Γω₀ / {[ω₀² − (Δω)²]² + 4Γ²ω₀²(Δω)²}    (27)
where dI/d(Δω) is the average Raman scattered intensity per unit frequency and is proportional to the strength of the signal at the Raman shift Δω; Γ is the damping constant; ω₀ is the frequency of the undamped vibrational mode; and Δω is the frequency shift from the laser line. The calculated fit yields the values of Γ and ω₀ for each peak in the spectrum. An example of the quantitative analysis of a Raman spectrum is provided in the work of Dillon et al. (1984), who used Raman spectroscopy to investigate the growth of graphite microcrystals in carbon films as a function of annealing temperature. The data for the D and G peaks of graphite were fitted to Equation 27, which was then integrated to give the integrated intensity of each peak. The ratio of the integrated intensity of the D mode to the integrated intensity of the G mode was then calculated and plotted as a function of annealing temperature. The ratio increased with annealing temperature over the range from 400° to 600°C, and the widths of the two peaks decreased over the same range. The results suggest that either the number or the size of the crystal grains was increasing over this temperature range. The integrated intensity ratio of the D-to-G mode reached a maximum and then decreased at higher temperatures, indicating growth in the grain size during annealing at higher temperatures. Dillon et al. also used the peak positions defined by Equation 27 to monitor the shift in peak locations with annealing temperature. The results indicated that the annealed films were characterized by threefold rather than fourfold coordination. In summary, a single-phase, unknown material can be identified by the locations of peaks and their relative intensities in the Raman spectrum. The location of each peak and its breadth can be defined by fitting the peaks to a quantitative expression of peak intensity vs. Raman shift. The same expression will provide the integrated intensity of each peak. Ratios of integrated intensities of peaks belonging to different phases in a multiphase sample will be proportional to the relative amounts of the two phases.
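The fitting procedure just described can be sketched in a few lines. This is a minimal illustration (not the actual analysis of Dillon et al.), using SciPy's curve_fit with the damped-oscillator lineshape of Equation 27; the "measured" spectrum and all parameter values are synthetic, invented for the example:

```python
# Sketch of peak fitting with the damped-oscillator lineshape of Equation 27,
# followed by numerical integration to get an integrated peak intensity.
# The "measured" spectrum here is synthetic; all values are illustrative.
import numpy as np
from scipy.optimize import curve_fit

def lineshape(dw, const, gamma, w0):
    """dI/d(dw) = const*gamma*w0 / ([w0^2 - dw^2]^2 + 4*gamma^2*w0^2*dw^2)."""
    return const * gamma * w0 / ((w0**2 - dw**2) ** 2 + 4 * gamma**2 * w0**2 * dw**2)

# Synthetic "G peak" near 1580 cm^-1 with a little noise.
rng = np.random.default_rng(0)
shift = np.linspace(1500, 1660, 400)          # Raman shift, cm^-1
true = lineshape(shift, 5e9, 0.01, 1580.0)
data = true + rng.normal(0, 0.02 * true.max(), shift.size)

popt, _ = curve_fit(lineshape, shift, data, p0=(1e9, 0.02, 1575.0))
const_fit, gamma_fit, w0_fit = popt
area = float(np.sum(lineshape(shift, *popt)) * (shift[1] - shift[0]))  # integrated intensity

print(f"fitted w0 = {w0_fit:.1f} cm^-1, damping = {gamma_fit:.4f}, area = {area:.3g}")
```

A D peak fitted the same way would give a second area, and the ratio of the two areas is the quantity plotted against annealing temperature in the study cited above.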
PROBLEMS

Many problems can arise in the generation of a Raman spectrum. Peaks that are not part of the spectrum of the sample can sometimes appear. Among the factors that can produce spurious peaks are plasma lines from the laser and cosmic rays hitting the light detector. The occurrence of peaks caused by cosmic rays increases with the time required for acquisition of the spectrum. The peaks from cosmic rays can be identified by their sharpness and their nonreproducibility. Peaks attributed to plasma lines are also sharp, but not nearly as sharp as those caused by cosmic rays. A peak associated with a plasma line will vanish when the spectrum is generated using a different laser. An all too common problem encountered in Raman spectroscopy is the occurrence of fluorescence, which completely swamps the much weaker Raman signal. The fluorescence can be caused by the component of interest or by impurities in the sample. Fluorescence can be minimized by decreasing the concentration of the guilty impurity or by long-time exposure of the impurity to the exciting laser line, which "burns out" the fluorescence. Absorption of the exciting laser line by the impurity results in its thermal decomposition.
Fluorescence is generally red shifted from the exciting laser line. Consequently, the anti-Stokes Raman spectrum may be less affected by fluorescence than the Stokes Raman spectrum. The intensity of the anti-Stokes Raman spectrum is much weaker than that of the Stokes Raman, however, so this is not always a viable approach to avoiding the fluorescence signal. Increasing the wavelength of the exciting laser line may also reduce fluorescence, as was mentioned above (see Fourier Transform Raman Spectroscopy). One potential problem with the latter approach is tied to the wavelength dependence of the intensity of the scattered light, which varies inversely as the fourth power of the wavelength. Shifting to exciting radiation with a longer wavelength may reduce fluorescence, but it will also decrease the overall intensity of the scattered radiation. Fluctuations in the intensity of the incident laser power during the generation of a Raman spectrum are particularly problematic for spectra measured using a single-channel detector. Errors in the relative intensities of different peaks can result from fluctuations in the laser power. Similarly, fluctuations in laser power will cause variations in the intensities of successive measurements of the same spectrum using a multiple-channel detector. These problems will be largely averted if the laser can be operated in a constant-light-intensity mode rather than in a controlled (electrical) power input mode. If the sample absorbs the incident radiation, problems may occur even if the absorption does not result in fluorescence. For example, if an argon laser (λ = 514.5 nm) were used to generate the Raman spectrum of an adsorbate on the surface of a copper or gold substrate, a significant reduction in the Raman intensity would result because of the strong absorption of green light by copper and gold. In this case, better results would be obtained by switching to a krypton laser (λ = 647.1 nm) or a helium-neon laser (λ = 632.8 nm).
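The cost of the λ⁻⁴ dependence can be put in numbers. A small sketch, taking as an illustrative case the move from a 514.5-nm argon-ion line to the 1064-nm Nd:YAG line commonly used in FT-Raman work:

```python
# The lambda^-4 dependence of Raman scattered intensity: moving the excitation
# from 514.5 nm (argon ion) to 1064 nm (Nd:YAG) to escape fluorescence costs
# roughly a factor of 18 in signal.
def relative_intensity(lam_new_nm, lam_ref_nm):
    """Scattered intensity at lam_new relative to lam_ref (I ~ 1/lambda^4)."""
    return (lam_ref_nm / lam_new_nm) ** 4

print(f"{relative_intensity(1064.0, 514.5):.3f}")  # ~0.055, i.e. ~18x weaker
```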
Absorption of the incident radiation can also lead to increases in the temperature of the sample. This may have a major deleterious effect on the experiment if the higher temperature causes a change in the concentration of an adsorbate or in the rate of a chemical or electrochemical reaction. If optically induced thermal effects are suspected, the spectra should be generated using several different incident wavelengths. Temperature changes can also cause significant changes in the Raman spectrum. There are a number of sources of the temperature dependency of Raman scattering. First, the intensity of the Stokes component relative to the anti-Stokes component decreases as temperature increases. For studies that make use of only the Stokes component, an increase in temperature will cause a decrease in intensity. A second cause of the temperature dependency of a Raman spectrum is the broadening of peaks as the temperature increases. This cause of peak broadening is dependent on the decay mechanism of the phonons. Broadening of the Raman peak can also result from hot bands. Here the higher temperature results in a higher population of the excited state. As a result, the incident photon can induce the transition from the first excited state to the second excited state. Because of anharmonic effects,
the energy required for this transition is different from the energy required for the transition from the ground state to the first excited state. As a result, the peak is broadened. Hence, Raman peaks are much sharper at lower temperatures. Consequently, it may be necessary to obtain spectra at temperatures well below room temperature in order to minimize peak overlap and to identify distinct peaks. Temperature changes can also affect the Raman spectrum through the temperature dependencies of the phonons themselves. For example, a phonon will have a strong temperature dependency as the temperature is raised close to that of a structural phase transition if at least one component of the oscillation coincides with the displacement during the phase transition. Temperature changes can also affect the phonon frequency. As the temperature increases, the anharmonic character of the vibration causes the well in the plot of potential energy vs. atomic displacement to be narrower than is the case for a purely harmonic oscillation. The atomic displacements are therefore smaller, which results in a higher frequency, since the amplitude of displacement is inversely proportional to frequency. In addition, thermal expansion increases the average distance between the atoms as temperature increases, leading to a decrease in the strength of the interatomic interactions. This results in a decrease in the phonon frequency with increasing temperature. In summary, superior Raman spectra are generally obtained at lower temperatures, and it may be necessary in some cases to obtain spectra at temperatures that are considerably lower than room temperature. Changes in the temperature of the room in which the Raman spectra are measured can cause changes in the positions of optical components (lenses, filters, mirrors, gratings) both inside and outside the spectrometer.
Such effects are important when attempting to accurately measure the locations of peaks in a spectrum or, e.g., stress-induced shifts in peak locations. One of the more expensive mistakes that can be made in systems using single-channel detectors is the inadvertent exposure of the PMT to the Rayleigh line. The high intensity of the Rayleigh scattered radiation can "burn out" the PMT, resulting in a significant increase in PMT dark counts. Either closing the shutter in front of the PMT or shutting off the high voltage to the PMT will prevent damage to the PMT from exposure to high light intensity. Typically, accidents of this type occur when the frequency of the incident radiation is changed and the operator neglects to note this change at the appropriate point in the software that runs the experiment and controls the operation of the PMT shutter.
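The temperature dependence of the Stokes/anti-Stokes intensity ratio discussed above can be estimated numerically. This sketch uses the standard textbook expression (a Boltzmann population factor with a frequency-to-the-fourth prefactor), not a formula given in this unit, and the laser line and Raman shift chosen are purely illustrative:

```python
# Estimate of the Stokes / anti-Stokes intensity ratio vs. temperature, using
# the standard relation
#   I_S / I_aS = ((nu0 - nu)/(nu0 + nu))**4 * exp(h*c*nu / (kB*T))
# where nu0 is the laser wavenumber and nu the Raman shift (both in cm^-1).
# Numbers below (514.5-nm laser, 520 cm^-1 shift) are illustrative.
import math

H = 6.626e-34      # Planck constant, J*s
C = 2.998e10       # speed of light, cm/s
KB = 1.381e-23     # Boltzmann constant, J/K

def stokes_to_antistokes(nu_cm1, T_K, laser_nm=514.5):
    nu0 = 1e7 / laser_nm                     # laser wavenumber, cm^-1
    prefactor = ((nu0 - nu_cm1) / (nu0 + nu_cm1)) ** 4
    return prefactor * math.exp(H * C * nu_cm1 / (KB * T_K))

for T in (100, 300, 500):
    print(f"T = {T} K: I_S/I_aS = {stokes_to_antistokes(520.0, T):.1f}")
```

The ratio falls steadily with temperature, consistent with the statement above that the Stokes intensity relative to the anti-Stokes intensity decreases as the sample warms.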
ACKNOWLEDGMENTS

It is a pleasure to thank J. Larry Nelson and Wylie Childs of the Electric Power Research Institute for their long-term support and interest in the use of Raman spectroscopy in corrosion investigations. In addition, Gary Chesnut and David Blumer of ARCO Production and Technology have encouraged the use of surface-enhanced Raman spectroscopy in studies of corrosion inhibition.
Both organizations have partially supported the writing of this unit. The efforts and skills of former and current graduate students, in particular Jing Gui, Lucy J. Oblonsky, Christopher Kumai, Valeska Schroeder, and Peter Chou, have greatly contributed to my continually improved understanding and appreciation of Raman spectroscopy.
LITERATURE CITED

Altmann, S. L. 1991. Band Theory of Solids: An Introduction from the Point of View of Symmetry (see p. 190). Oxford University Press, New York.
Atkins, P. W. 1984. Molecular Quantum Mechanics. Oxford University Press, New York.
Bilhorn, R. B., Sweedler, J. V., Epperson, P. M., and Denton, M. B. 1987a. Charge transfer device detectors for analytical optical spectroscopy—operation and characteristics. Appl. Spectrosc. 41:1114–1125.
Bilhorn, R. B., Epperson, P. M., Sweedler, J. V., and Denton, M. B. 1987b. Spectrochemical measurements with multichannel integrating detectors. Appl. Spectrosc. 41:1125–1135.
Bishop, D. 1973. Group Theory and Chemistry. Clarendon Press, Oxford.
Bower, D. I. and Maddams, W. F. 1989. The Vibrational Spectroscopy of Polymers. Cambridge University Press, Cambridge.
Bulkin, B. J. 1991. Polymer applications. In Analytical Raman Spectroscopy (J. G. Grasselli and B. J. Bulkin, eds.) pp. 45–57. John Wiley & Sons, New York.
Carrabba, M. M., Spencer, K. M., Rich, C., and Rauh, D. 1990. The utilization of a holographic Bragg diffraction filter for Rayleigh line rejection in Raman spectroscopy. Appl. Spectrosc. 44:1558–1561.
Chang, R. K. and Furtak, T. E. (eds.). 1982. Surface Enhanced Raman Scattering. Plenum Press, New York.
Chase, D. B. 1991. Modern Raman instrumentation and techniques. In Analytical Raman Spectroscopy (J. G. Grasselli and B. J. Bulkin, eds.) pp. 45–57. John Wiley & Sons, New York.
Clarke, D. R. and Adar, F. 1982. Measurement of the crystallographically transformed zone produced by fracture in ceramics containing tetragonal zirconia. J. Am. Ceramic Soc. 65:284–288.
DiDomenico, M., Wemple, S. H., Porto, S. P. S., and Bauman, R. P. 1968. Raman spectrum of single-domain BaTiO3. Phys. Rev.
Dillon, R. O., Woollam, J. A., and Katkanant, V. 1984. Use of Raman scattering to investigate disorder and crystallite formation in as-deposited and annealed carbon films. Phys. Rev. B.
Epperson, P. M., Sweedler, J. V., Bilhorn, R. B., Sims, G. R., and Denton, M. B. 1988. Applications of charge transfer devices in spectroscopy. Anal. Chem. 60:327–335.
Fateley, W. G., McDevitt, N. T., and Bailey, F. F. 1971. Infrared and Raman selection rules for lattice vibrations: The correlation method. Appl. Spectrosc. 25:155–173.
Ferraro, J. R. and Nakamoto, K. 1994. Introduction to Raman Spectroscopy. Academic Press, San Diego, Calif.
Fowles, G. R. 1989. Introduction to Modern Optics. Dover Publications, Mineola, N.Y.
Fraser, D. A. 1990. The Physics of Semiconductor Devices. Oxford University Press, New York.
Gui, J. and Devine, T. M. 1991. In-situ vibrational spectra from the passive film on iron in buffered borate solution. Corr. Sci. 32:1105–1124.
Hamermesh, M. 1962. Group Theory and Its Application to Physical Problems. Dover Publications, New York.
Hecht, J. 1993. Understanding Lasers: An Entry-Level Guide. IEEE Press, Piscataway, N.J.
Hutley, M. C. 1982. Diffraction Gratings. Academic Press, New York.
Kettle, S. F. A. 1987. Symmetry and Structure. John Wiley & Sons, New York.
Koster, G. F. 1957. Space groups and their representations. In Solid State Physics, Advances in Research and Applications, Vol. 5 (F. Seitz and D. Turnbull, eds.) pp. 173–256. Academic Press, New York.
Lax, M. 1974. Symmetry Principles in Solid State and Molecular Physics. John Wiley & Sons, New York.
Leech, J. W. and Newman, D. J. 1969. How To Use Groups. Methuen, London.
Long, D. A. 1977. Raman Spectroscopy. McGraw-Hill, New York.
Mariot, L. 1962. Group Theory and Solid State Physics. Prentice-Hall, Englewood Cliffs, N.J.
Meijer, P. H. E. and Bauer, E. 1962. Group Theory—The Application to Quantum Mechanics. North-Holland Publishing, Amsterdam, The Netherlands.
Oblonsky, L. J. and Devine, T. M. 1995. Surface enhanced Raman spectroscopic study of the passive films formed in borate buffer on iron, nickel, chromium and stainless steel. Corr. Sci. 37:17–41.
Owen, H. 1992. Holographic optical components for laser spectroscopy applications. SPIE 1732:324–332.
Parker, S. F. 1994. A review of the theory of Fourier-transform Raman spectroscopy. Spectrochim. Acta 50A:1841–1856.
Pelletier, M. J. and Reeder, R. C. 1991. Characterization of holographic band-reject filters designed for Raman spectroscopy. Appl. Spectrosc. 45:765–770.
Pollak, F. H. 1991. Characterization of semiconductors by Raman spectroscopy. In Analytical Raman Spectroscopy (J. G. Grasselli and B. J. Bulkin, eds.) pp. 137–221. John Wiley & Sons, New York.
Porto, S. P. S. and Krishnan, R. S. 1967. Raman effect of corundum. J. Chem. Phys. 47:1009–1012.
Rabolt, J. F. 1991. Anisotropic scattering properties of uniaxially oriented polymers: Raman studies. In Analytical Raman Spectroscopy (J. G. Grasselli and B. J. Bulkin, eds.) pp. 45–57. John Wiley & Sons, New York.
Robertson, J. 1991. Hard amorphous (diamond-like) carbons. Prog. Solid State Chem. 21:199–333.
Rossi, B. 1957. Optics. Addison-Wesley, Reading, Mass.
Scherer, J. R. 1991. Experimental considerations for accurate polarization measurements. In Analytical Raman Spectroscopy (J. G. Grasselli and B. J. Bulkin, eds.) pp. 45–57. John Wiley & Sons, New York.
Schoen, C. L., Sharma, S. K., Henlsley, C. E., and Owen, H. 1993. Performance of a holographic supernotch filter. Appl. Spectrosc. 47:305–308.
Weyl, H. 1931. The Theory of Groups and Quantum Mechanics. Dover Publications, New York.
Wilson, E. B., Decius, J. C., and Cross, P. C. 1955. Molecular Vibrations: The Theory of Infrared and Raman Vibrational Spectra. McGraw-Hill, New York.
Yang, B., Morris, M. D., and Owen, H. 1991. Holographic notch filter for low-wavenumber Stokes and anti-Stokes Raman spectroscopy. Appl. Spectrosc. 45:1533–1536.
KEY REFERENCES

Altmann, 1991. See above. In Chapter 11, Bloch sums are used for the eigenvectors of a crystal's vibrational Hamiltonian.
Atkins, 1984. See above. Provides a highly readable introduction to quantum mechanics and group theory.
Fateley et al., 1971. See above. Recipe for determining the symmetry species of vibrational modes of crystals.
Ferraro and Nakamoto, 1994. See above. Provides a more current description of equipment and a more extensive and up-to-date discussion of applications of Raman spectroscopy than are provided in the text by Long (see below).
Long, 1977. See above. Provides a thorough introduction to Raman spectroscopy.
Pollak, 1991. See above. A comprehensive review of the use of Raman spectroscopy to investigate the structure and composition of semiconductors.

APPENDIX: GROUP THEORY AND VIBRATIONAL SPECTROSCOPY

This appendix was written, in part, to provide a road map that someone interested in vibrational spectroscopy might want to follow in migrating through the huge number of topics that are covered by the vast number of textbooks on group theory (see Literature Cited). Although necessarily brief in a number of areas, it does cover the following topics in some depth, selected on the basis of their importance and/or the rareness with which they are fully discussed elsewhere: (1) How character tables are used to identify the Raman activity of vibrational modes is demonstrated. (2) That the ij component, αij, of the polarizability tensor has the same symmetry as the product function xixj is demonstrated. Character tables list quadratic functions such as x², yz and identify the symmetry species (defined below) spanned by each. The same symmetry species will be spanned by the Raman active vibrational modes. (3) The proper rotations found in crystallographic point groups are identified. (4) A general approach, developed by Lax (1974) and using multiplier groups, is presented that elegantly establishes the link between all space groups, both symmorphic and nonsymmorphic, and the crystallographic point groups.

Vibrational Selection Rules

Quantum mechanically, a vibrational Raman transition may occur if the integral in Equation 7 has a nonzero value. This can quickly be determined by the use of group theoretical techniques, which are summarized in the form of a Raman selection rule. The Raman selection rule is a statement of the symmetry that a vibrational mode must possess in order that it may be Raman active. Before discussing the symmetry-based selection rules for Raman scattering, note that there is also a restriction, based on the principles of energy and momentum conservation, that is especially helpful in considering the Raman activity of solids. The requirements of energy and momentum conservation are expressed as follows:

ħωi = ħωs ± ħωv    (28)

ħki = ħks ± ħkv    (29)

where ħki, ħks, and ħkv are the momentum vectors of the incident and scattered radiation and the crystal phonon, respectively. For optical radiation, |ki| is on the order of 10⁴ to 10⁵ cm⁻¹. For example, for λ = 647.1 nm, which corresponds to the high-intensity red line of a krypton ion laser,

k = 2π/λ = 9.71 × 10⁴ cm⁻¹    (30)

For a crystal with a₀ = 0.25 nm, kv,max ≈ 2.51 × 10⁸ cm⁻¹. Hence, for Raman scattering, kv ≪ kv,max. In other words, kv must be near the center of the Brillouin zone (BZ). The phonon dispersion curves for a simple, one-dimensional, monoatomic lattice and a one-dimensional diatomic lattice are presented in Figures 12 and 13. Since there are no available phonon states of nonzero frequency at the center of the BZ in a monatomic lattice [for a three-dimensional (3D) lattice as well as a one-dimensional lattice], such structures are not Raman active. Consequently, all metals are Raman inactive. Diatomic lattices, whether homonuclear or heteronuclear, do possess phonon modes at the center of the BZ (see Fig. 13) and are Raman active. Thus, crystals such as diamond, silicon, gallium nitride, and aluminum oxide are all Raman active.

Figure 12. Dispersion of longitudinal wave in a linear monoatomic lattice (nearest-neighbor interactions only).

Figure 13. Dispersion of longitudinal wave in a linear diatomic lattice.
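The order-of-magnitude comparison behind Equation 30 can be checked numerically. A short sketch, taking kv,max = 2π/a₀ as in the text:

```python
# Why first-order Raman scattering only probes zone-center phonons:
# compare the photon wavevector k = 2*pi/lambda with the maximum phonon
# wavevector in a crystal, here taken (as in the text) as k_max = 2*pi/a0.
import math

LAMBDA_CM = 647.1e-7   # krypton-ion red line, 647.1 nm expressed in cm
A0_CM = 0.25e-7        # lattice parameter, 0.25 nm expressed in cm

k_photon = 2 * math.pi / LAMBDA_CM     # ~9.71e4 cm^-1 (Equation 30)
k_max = 2 * math.pi / A0_CM            # ~2.51e8 cm^-1

print(f"k_photon = {k_photon:.3g} cm^-1")
print(f"k_max    = {k_max:.3g} cm^-1")
print(f"ratio    = {k_photon / k_max:.1e}")   # phonon must sit near BZ center
```

The ratio is of order 10⁻⁴, which is why the scattering phonon must lie essentially at the Brillouin-zone center.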
A Raman spectrum may consist of several peaks in the plot of the intensity of the scattered radiation versus its shift in wavenumber with respect to the incident radiation, as illustrated in Figure 4. If the immediate task is to identify the material responsible for the Raman spectrum, then it is only necessary to compare the measured spectrum with the reference spectra of candidate materials. Since the Raman spectrum acts as a fingerprint of the scattering material, once a match is found, the unknown substance can be identified. Often the identity of the sample is known, and Raman spectroscopy is being used to learn more about the material, e.g., the strength of its atomic bonds or the presence of elastic strains in a crystalline matrix. Such information is present in the Raman spectrum, but it must be analyzed carefully in order to characterize the structure of the material. Once a vibrational Raman spectrum is obtained, the first task is to identify the vibrational modes that contribute to each peak in the spectrum. Group theoretical techniques enormously simplify the analyses of vibrational spectra. At this point, the rules of group theory that are used in vibrational spectroscopy will be cited and examples of their use will then be given. In doing so, it is worth recalling the warning issued by David Bishop in the preface to his textbook on group theory (Bishop, 1973, p. vii): "The mathematics involved in actually applying, as opposed to deriving, group theoretical formulae is quite trivial. It involves little more than adding and multiplying. It is in fact possible to make the applications, by filling in the necessary formulae in a routine way, without even understanding where the formulae have come from. I do not, however, advocate this practice." Although the approach taken in the body of this unit ignores Bishop's advice, the intent is to demonstrate the ease with which vibrational spectra can be analyzed by using the tools of group theory.
One researcher used the automobile analogy to illustrate the usefulness of this approach to group theory: driving an automobile can be a very useful practice, and it can be accomplished without any knowledge of the internal combustion engine. Hopefully, the elegance of the methodology, which should be apparent from reading this appendix, will encourage the reader to acquire a firm understanding of the principles of group theory by consulting any number of texts that provide an excellent introduction to the subject (see Literature Cited, e.g., Hamermesh, 1962; Koster, 1957; Leech and Newman, 1969; Mariot, 1962; Meijer and Bauer, 1962; Weyl, 1931).

Group Theoretical Tools for Analyzing Vibrational Spectra

Point Groups and Matrix Representations of Symmetry Operations. The collection of symmetry operations of a molecule makes up its point group. A point group is a collection of symmetry operations that are linked to one another by the following rules: (1) the identity operation is a member of the set; (2) the operations multiply associatively; (3) if R and S are elements, then so is RS; and (4) the inverse of each element is also a member of the set. In satisfying these requirements, the point groups meet the criteria
for a mathematical group so that molecular symmetry can be investigated with the aid of group theory. At this juncture, point groups, which describe the symmetry of molecules, and the use of group theoretical techniques to identify the selection rules for Raman active, molecular vibrations will be described. Although the symmetry of crystals is described by space groups, it will be demonstrated below that the Raman active vibrational modes of a crystal can be identified with the aid of the point group associated with the crystal space group. That is, the Raman spectrum of a crystal is equated to the Raman spectrum of a single unit cell, which is treated like a molecule. Specifically, the symmetry of the unit cell is described by one of 32 point groups. All that is presented about point groups and molecular symmetry will be directly applicable to the use of symmetry and group theory for analyses of vibrational spectra of solids. After completing the discussion of point groups and molecular symmetry, the link between point groups and vibrational spectroscopy of solids will be established. There are a total of fourteen different types of point groups, seven with one principal axis of rotation; groups within these seven types form a series in which one group can be formed from one other group by the addition of a symmetry operation. The seven other types of point groups do not constitute a series and involve multiple axes of higher order. The symmetries of the vast number of different molecules can be completely represented by a small number of point groups. The water molecule belongs to the point group C2v, which consists of the symmetry elements σv(xz), σv(yz), and C2, as illustrated in Figure 6, plus the identity operation E. Each symmetry operation can be represented by a matrix that describes how the symmetry element acts on the molecule. First, a basis function that represents the molecule is identified.
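The four rules listed above can be checked explicitly for C2v. The sketch below assumes the standard C2v multiplication table (the operation labels and products are not quoted from the text) and verifies each group axiom in turn:

```python
# Sketch: verify that {E, C2, sv, sv'} of C2v satisfy the four group axioms.
# Assumption: the multiplication table below is the standard C2v table,
# with table[a][b] meaning "apply b first, then a".
table = {
    "E":   {"E": "E",   "C2": "C2",  "sv": "sv",  "sv'": "sv'"},
    "C2":  {"E": "C2",  "C2": "E",   "sv": "sv'", "sv'": "sv"},
    "sv":  {"E": "sv",  "C2": "sv'", "sv": "E",   "sv'": "C2"},
    "sv'": {"E": "sv'", "C2": "sv",  "sv": "C2",  "sv'": "E"},
}
ops = list(table)

# (1) identity is a member and acts trivially
assert all(table["E"][g] == g and table[g]["E"] == g for g in ops)
# (3) closure: every product RS is again a member of the set
assert all(table[a][b] in ops for a in ops for b in ops)
# (2) associativity: (ab)c == a(bc) for every triple
assert all(table[table[a][b]][c] == table[a][table[b][c]]
           for a in ops for b in ops for c in ops)
# (4) every element has an inverse (here each operation is its own inverse)
assert all(any(table[a][b] == "E" for b in ops) for a in ops)
print("C2v satisfies the four group axioms")
```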
Examples of basis functions for the water molecule are: (1) the 1s orbitals of the two hydrogen atoms and the 2s orbital of the oxygen atom; (2) the displacement coordinates of the molecule during normal-mode vibrations; and (3) the 1s orbitals of the hydrogen atoms and the 2p_{x,y,z} orbitals of oxygen. In fact, there are any number of possible basis functions, although some are more efficient and effective than others in representing the molecule. The number of functions in the basis defines the dimensions of the matrix representations of the symmetry operations. In the case of a basis consisting of three components, e.g., the two 1s orbitals of the hydrogen atoms and the 2s orbital of oxygen, each matrix would be 3 × 3. The action of the symmetry operation on the molecule would then be given by an equation of the form

$$
[f_1\ f_2\ f_3] = \begin{pmatrix} d_{11} & d_{12} & d_{13} \\ d_{21} & d_{22} & d_{23} \\ d_{31} & d_{32} & d_{33} \end{pmatrix} \begin{bmatrix} 2s \\ 1s_a \\ 1s_b \end{bmatrix} \tag{31}
$$
where [f_1, f_2, f_3] represents the basis function following the action of the symmetry operation and 1s_a represents the 1s orbital on the ''a'' hydrogen atom. If the symmetry operation under consideration is the twofold axis of rotation,
OPTICAL IMAGING AND SPECTROSCOPY
Figure 14. Matrix representation of the symmetry operation R is given as D(R). In the form depicted, D(R) can be visualized as the ''direct sum'' of a 2 × 2 and a 1 × 1 matrix.
then [f_1, f_2, f_3] is [2s, 1s_b, 1s_a] and the matrix representation of the twofold axis of rotation for this basis function is

$$
D(C_2) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix} \tag{32}
$$
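The matrix representations just constructed can be generated and checked in a few lines. The sketch below assumes the basis ordering (2s, 1s_a, 1s_b) and takes the molecular plane to be σv(xz), so that C2 and σv'(yz) swap the two hydrogen 1s orbitals while E and σv(xz) leave all three orbitals fixed:

```python
# Sketch (assumptions: basis ordered (2s, 1s_a, 1s_b); molecular plane
# taken as sigma_v(xz)). Build the 3x3 representation matrices for C2v
# and extract their characters (traces).
I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]        # D(E) and D(sigma_v(xz))
swap_H = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]   # D(C2) of Equation 32; also D(sigma_v'(yz))
D = {"E": I, "C2": swap_H, "sv(xz)": I, "sv'(yz)": swap_H}

def character(m):
    """Character = trace of the representation matrix."""
    return sum(m[i][i] for i in range(len(m)))

chars = {op: character(m) for op, m in D.items()}
print(chars)   # characters of this three-dimensional (reducible) representation
```

Under these assumptions the characters come out as 3, 1, 3, 1 for E, C2, σv(xz), σv'(yz), a reducible representation that is decomposed in the next subsection.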
In instances in which one component of the basis function is unchanged by all of the symmetry operations, each matrix representation has the form depicted in Figure 14. The matrix can be thought of as consisting of the sum of two matrices, a 1 × 1 matrix and a 2 × 2 matrix. This type of sum is not an arithmetic operation and is referred to as a ''direct sum.'' The process of converting all of the matrices that represent each of the symmetry operations for a given basis to the direct sum of matrices of smaller dimensions is referred to as ''reducing the representation.'' In some cases each 2 × 2 matrix representation of the symmetry operations is also diagonalized, and the representation can be reduced to the direct sum of two 1 × 1 matrices. When the matrix representations have been converted to the direct sum of matrices with the smallest possible dimensions, the process of reducing the representation is complete. Obviously, there are an infinite number of matrix representations that could be developed for each point group, each arising from a different basis. Fortunately, and perhaps somewhat surprisingly, most of the information about the symmetry of a molecule is contained in the characters of the matrix representations. Knowledge of the full matrix is not required in order to use the symmetry of the molecule to help solve complex problems in quantum mechanics. Often, only the characters of the matrices representing the group symmetry operations are needed. The many different matrix representations of the symmetry operations can be reduced via similarity transformations to block diagonalized form. In block diagonalized form, each matrix is seen to consist of the direct sum of a number of irreducible matrices. An irreducible matrix is one for which there is no similarity transformation that is capable of putting it into block diagonalized form.
A reduced representation consists of a set of matrices, each of which represents one symmetry operation of the point group and each of which is in the identical block diagonalized form. Now it turns out that the character of a matrix representation is the most important piece of information concerning the symmetry of the molecule, as mentioned above. The list composed of the characters of the matrix representations of the symmetry operations of the group is referred to as the symmetry species of the representation. For each point group there is a small number of
unique symmetry species that are composed of the characters of irreducible matrices representing each of the symmetry operations. These special symmetry species are called irreducible representations of the point group. For each point group the number of irreducible representations is equal to the number of symmetry classes. Remarkably, of the unlimited number of representations of a point group, corresponding to the unlimited number of bases that can be employed, the symmetry species of each representation can be expressed by a linear combination of symmetry species of irreducible representations. The list of the symmetry species of each irreducible representation therefore concisely indicates what is unique about the symmetry of the particular point group. This information is provided in character tables of the point groups.

Character Tables. The character table for the point group C2v is presented in Table 1. The top row lists the symmetry elements in the point group. The second through the fourth rows list the point group irreducible representations. The first column is the symbol used to denote the irreducible representation, e.g., A1. The numbers in the character tables are the characters of the irreducible matrix representations of the symmetry operations for the various irreducible symmetry species. The right-hand column lists basis functions for the irreducible symmetry species. For example, if z is used as a basis function for the point group C2v, the matrix representations of the various symmetry operations will all be 1 × 1 matrices with character 1. The resultant symmetry species, therefore, is 1, 1, 1, 1 and is labeled A1. On the other hand, if xy is the basis function, the matrix representations of all of the symmetry operations will again consist of 1 × 1 matrices. The resultant symmetry species is 1, 1, −1, −1 and is labeled A2.
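As a concrete illustration, a reducible set of characters can be decomposed into the irreducible representations of Table 1 with the standard reduction formula of group theory, n_i = (1/h) Σ_R χ(R) χ_i(R), which is textbook material not quoted in this unit. The sketch below assumes the characters 3, 1, 3, 1 of the (2s, 1s_a, 1s_b) basis of water, computed with the molecular plane taken as σv(xz):

```python
# Sketch: reduce a C2v representation with the standard reduction formula
#   n_i = (1/h) * sum_R chi(R) * chi_i(R)
# (an assumption: this formula is standard group theory, not from the text).
# C2v character table (Table 1), operations ordered E, C2, sv(xz), sv'(yz):
irreps = {
    "A1": [1,  1,  1,  1],
    "A2": [1,  1, -1, -1],
    "B1": [1, -1,  1, -1],
    "B2": [1, -1, -1,  1],
}
chi = [3, 1, 3, 1]   # assumed: characters of the (2s, 1s_a, 1s_b) basis
h = 4                # order of C2v

counts = {name: sum(c * ci for c, ci in zip(chi, chars)) // h
          for name, chars in irreps.items()}
print(counts)        # how many times each irrep appears
```

The decomposition comes out as 2A1 ⊕ B1, i.e., the three-orbital basis reduces to two totally symmetric combinations plus one antisymmetric combination.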
If a function is said to have the same symmetry as x, then that function will also serve as a basis for the irreducible representation B1 in the point group C2v. There are many possible basis functions for each irreducible representation. Those listed in the character tables have special significance for the analyses of rotational, vibrational, and electronic spectra. This will be made clear below. When a function serves as a basis of a representation, it is said to span that representation.
Vibrational Selection Rules. Group theory is especially helpful in deciding whether or not a particular integral, taken over a symmetric range, has a nonzero value. It turns out that if the integrand can serve as a basis for A1, the totally symmetric irreducible representation, then the integral may have a nonzero value. If the integrand does not serve as a basis for the A1 irreducible representation, then the integral will necessarily be zero. The symmetry species that is spanned by the product of two functions is obtained by forming the ''direct product'' of the symmetry species spanned by each function. Tables are available that list the direct products of all possible combinations of irreducible representations for each point group (see, e.g., Wilson et al., 1955; Bishop, 1973; Atkins, 1984; Kettle, 1987).
RAMAN SPECTROSCOPY OF SOLIDS
Thus, if we want to know whether or not a Raman transition from the ground-state vibrational mode n to the first vibrationally excited state m can occur, we need to examine the six integrals of the form

$$
\int \psi_m \,\alpha_{ij}\, \psi_n \, dV \tag{33}
$$
one for each of the six independent values of α_ij (or fewer depending on the symmetry of the molecule or crystal). If any one is nonzero, the Raman transition may occur. If all are zero, the Raman transition will be symmetry forbidden. The Raman selection rule states that a Raman transition between two vibrational states n and m is allowed if the product ψ_m ψ_n of the two wave functions describing the m and n vibrational states has the same symmetry species as at least one of the six components of α_ij. This rule reflects the fact that for the integral ∫ f_1 f_2 f_3 dV to have a nonzero value when taken over the symmetric range of variables V, it is necessary that the integrand, which is the product of the three functions f_1, f_2, and f_3, span a symmetry species that is or contains the totally symmetric irreducible representation of the point group. The symmetry species spanned by the product of three functions is obtained from the direct product of the symmetry species of each function (see, e.g., Wilson et al., 1955; Bishop, 1973; Atkins, 1984; Kettle, 1987). For the symmetry species of a direct product to contain the totally symmetric irreducible representation, it is necessary that the two functions span the same symmetry species or have a common irreducible representation in their symmetry species. The function x² acts as a basis for the A1 irreducible representation of the point group C2v. A basis for the irreducible representation B1 is given as x. The function x²·x would not span the irreducible representation A1, and so its integral over all space would be zero. The function xz is also a basis for the irreducible representation B1 in the point group C2v. Hence, the function x·xz (= x²z) spans the irreducible representation A1 and its integral over all space could be other than zero.
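This direct-product bookkeeping is easy to automate: for one-dimensional species the characters simply multiply component by component. The sketch below (C2v characters as in Table 1) reproduces the two examples just given:

```python
# Sketch: test the integral selection rule via direct products of C2v
# symmetry species. Characters multiply component by component; for
# one-dimensional species the product contains A1 only if it equals A1.
irreps = {
    "A1": (1,  1,  1,  1),
    "A2": (1,  1, -1, -1),
    "B1": (1, -1,  1, -1),
    "B2": (1, -1, -1,  1),
}

def direct_product(*names):
    chars = [1, 1, 1, 1]
    for n in names:
        chars = [c * ci for c, ci in zip(chars, irreps[n])]
    return tuple(chars)

def contains_A1(chars):
    return chars == irreps["A1"]

# x^2 spans A1 and x spans B1: the product spans B1, so the integral vanishes
print(contains_A1(direct_product("A1", "B1")))   # False
# x spans B1 and xz spans B1: x * xz = x^2 z spans A1, so it may be nonzero
print(contains_A1(direct_product("B1", "B1")))   # True
```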
Thus, for the integrand to span the totally symmetric irreducible representation of the molecular point group, it is necessary that the representation spanned by the product ψ_m ψ_n be the same as the representation spanned by α_ij. This last statement is extremely useful for analyzing the probability of a Stokes Raman transition from the vibrational ground state n to the first vibrational excited state m or for an anti-Stokes Raman transition from the first vibrational excited state m to the vibrational ground state n. The vibrational ground state spans the totally symmetric species. Consequently, the direct product ψ_m ψ_n spans the symmetry species of the first vibrational excited state m. Hence, the Raman selection rule means that the symmetry species of α_ij must be identical to the symmetry species of the first vibrational excited state. The symmetry of α_ij can be obtained by showing that it transforms under the action of a symmetry operation of the point group in the same way as a function of known symmetry. Here, α_ij relates p_i to e_j. If a symmetry operation R is applied to the molecule/crystal, then the applied electric field is transformed to Re = D(R)e, where D(R) is the
matrix representation of R. The polarizability, which expresses the proportionality between the applied electric field and the induced polarization of the molecule/crystal, must transform in a manner related to, but in general different from, that of the applied electric field, because the direction of p will generally be different from the direction of e. The polarization of the crystal induced by the applied electric field is transformed to Rp = D(R)p as a result of the action of the symmetry operation R. Consequently, the polarizability transforms as RRα = D(R)D(R)α. The transformation of α can be derived as follows (taken from Bishop, 1973). If the symmetry operation R transforms the coordinate system from x to x′, then (summations are over all repeated indices)

$$
p_i(x') = \sum_j \alpha_{ij}(x')\, e_j(x') = \sum_j \alpha_{ij}(x') \sum_k D_{jk}(R)\, e_k(x) \tag{34}
$$

$$
p_l(x) = \sum_m D_{ml}(R)\, p_m(x') \tag{35}
$$

$$
p_l(x) = \sum_m D_{ml}(R) \sum_j \alpha_{mj}(x') \sum_k D_{jk}(R)\, e_k(x) \tag{36}
$$
Substituting into the last equation the expression

$$
p_l(x) = \sum_k \alpha_{lk}(x)\, e_k(x) \tag{37}
$$
gives

$$
\alpha_{lk}(x) = \sum_m \sum_j D_{ml}(R)\, D_{jk}(R)\, \alpha_{mj}(x') \tag{38}
$$
This describes the action of the symmetry operation R on the polarizability α. Now,

$$
x_l = \sum_m D_{ml}(R)\, x_m \tag{39}
$$

and

$$
x_k = \sum_j D_{jk}(R)\, x_j \tag{40}
$$

so

$$
x_l x_k = \sum_m \sum_j D_{ml}(R)\, D_{jk}(R)\, x_m x_j \tag{41}
$$
Comparison of Equations 38 and 41 indicates that α_lk transforms as the function x_l x_k. Any symmetry species spanned by the function x_m x_j will also be spanned by α_mj. Thus, any vibrational mode that spans the same symmetry species as x_l x_k will be Raman active. This information is included in the character tables of the point groups. In the column that lists the basis functions for the various symmetry species of the point group are listed quadratic functions such as x², yz, x² + y², x² + y² − 2z², .... By way of illustration, the character table indicates that if a molecule with point group symmetry C2v has a vibrational mode that spans the symmetry species A1, it will be Raman active since this symmetry species is also spanned
by the quadratic function x². Any vibrational mode that spans a symmetry species that has at least one quadratic function (e.g., x², yz) as a basis will be Raman active.
Raman Active Vibrational Modes of Solids

It is now possible to illustrate how Raman spectroscopy can identify the symmetry species of the vibrational modes of a single crystal. To begin, it is important to recognize that the symmetry of a molecule may be lowered when it is part of a crystal. This can be appreciated by considering a simpler case of symmetry reduction: the change in symmetry of a sulfate anion as a consequence of its adsorption on a solid surface. As illustrated in Figure 15, the sulfate anion consists of oxygen nuclei located on the four corners of a tetrahedron and the sulfur nucleus positioned at the geometric center of the tetrahedron. The anion exhibits Td point group symmetry. If the sulfate is adsorbed on the surface of a solid through a bond formed between the solid and one of the oxygen anions of the sulfate, then the symmetry of the sulfate is lowered to C3v. If two oxygen-surface bonds are formed, in either a bridging or nonbridging configuration, the symmetry of the sulfate is further lowered to C2v. When present in a crystal, a molecule or anion will exhibit either the same or lower symmetry than the free molecule or anion. The distinctive feature of the symmetry of a crystal is the presence of translational symmetry elements. In fact, it is possible to generate the entire lattice of a crystal by the repeated operation of the three basic primitive translation vectors that define the unit cell. When all possible arrays of lattice points in three dimensions are considered, it turns out that there are only fourteen distinctive types of lattices, called Bravais lattices. The Bravais lattice may not completely describe the symmetry of a crystal. The unit cell may have an internal structure that is not completely
specified by the symmetry elements of the bare lattice. The symmetry of the crystal is completely specified by its space group, which is a mathematical group that contains translational operations and that may contain as many as three additional types of symmetry elements: rotations (proper and improper), which are the point group elements, screw axes, and glide planes. The latter two elements may be conceptually subdivided into a rotation (proper = screw axis; improper = glide plane) plus a rational fraction of a primitive lattice translation. Lattices can be brought into themselves by the operations of certain point groups. Each Bravais lattice type is compatible with only a particular set of point groups, which are 32 in number and are referred to as crystallographic point groups. The appropriate combinations of the 14 Bravais lattices and the 32 crystallographic point groups result in the 230 three-dimensional space groups. The restrictions that a lattice imposes on the rotational symmetry elements of the space group can be readily illustrated. Each lattice point is located at the tip of a primitive lattice vector given by

$$
\mathbf{T} = n_1 \mathbf{t}_1 + n_2 \mathbf{t}_2 + n_3 \mathbf{t}_3 \tag{42}
$$
where the t_i are the basic primitive translation vectors and the n_i are positive and negative integers including zero. The rotation of the lattice translates the points to new locations given by

$$
\mathbf{T}' = R\,\mathbf{T} = R_{ij}\, n_j\, \mathbf{t}_i \tag{43}
$$
where the R_ij n_j must be integers for all values of n_j, which are integers. Hence, the R_ij must be integers as well. For a proper or improper rotation of θ about the z axis,

$$
R = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & \pm 1 \end{pmatrix} \tag{44}
$$
The requirement of integral values for each element R_ij means, in particular, that

$$
\operatorname{Tr} R = 2\cos\theta \pm 1 = \text{integer} \tag{45}
$$
The plus sign corresponds to a proper rotation, while the minus sign corresponds to an improper rotation (e.g., reflection or inversion). Thus, the requirement of integral values of R_ij limits the possible rotation angles θ to

$$
\cos\theta = \frac{n \mp 1}{2} \;\Rightarrow\; \theta = 0,\ \pi/3,\ \pi/2,\ 2\pi/3,\ \pi \tag{46}
$$

Figure 15. Influence of bonding on the symmetry of sulfate: (A) free sulfate; (B) unidentate sulfate; (C) bidentate sulfate; (D) bidentate sulfate in the bridging configuration.
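The restriction expressed by Equations 45 and 46 can be checked numerically. The sketch below scans candidate n-fold rotations and keeps only those whose trace is an integer:

```python
import math

# Sketch: keep the n-fold rotations (theta = 2*pi/n) whose trace
# 2*cos(theta) + 1 (Equation 45, proper rotation) is an integer,
# illustrating why only 1-, 2-, 3-, 4-, and 6-fold axes are
# compatible with a lattice.
allowed = []
for n in range(1, 25):                      # candidate n-fold rotations
    theta = 2 * math.pi / n
    trace = 2 * math.cos(theta) + 1
    if abs(trace - round(trace)) < 1e-9:    # integral trace?
        allowed.append(n)
print(allowed)   # -> [1, 2, 3, 4, 6]
```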
Crystallographic point groups may therefore contain the following proper rotations: C1, C2, C3, C4, and C6, where Cn indicates a rotation angle of 360°/n about an axis of the crystal. Within each space group is an invariant subgroup consisting of the primitive translational operations. An invariant group is one in which a conjugate transformation
produces another member of the group; i.e., if for all elements X and t of the group G, X⁻¹tX = t′, where t and t′ are elements of the same subgroup of G, then this subgroup is an invariant subgroup of G. A translational operation is represented by the symbol (ε|t), which is a particular case of the more general symbol (α|a) for a rotation α followed by a translation a. The notation ε represents a rotation of 0°. The invariance of the translational subgroup is demonstrated as follows. The operators act according to

$$
(\alpha|a)x = \alpha x + a, \qquad (\beta|b)(\alpha|a)x = \beta\alpha x + \beta a + b = (\beta\alpha|\beta a + b)x
$$

Since $(\alpha|a)^{-1}(\alpha|a) = (\varepsilon|0)$, the inverse $(\alpha|a)^{-1}$ must be given by $(\alpha^{-1}|-\alpha^{-1}a)$. Then

$$
\begin{aligned}
(\alpha|a)^{-1}(\varepsilon|t)(\alpha|a)x &= (\alpha|a)^{-1}(\varepsilon|t)(\alpha x + a) = (\alpha|a)^{-1}(\alpha x + t + a) \\
&= \alpha^{-1}\alpha x + \alpha^{-1}t + \alpha^{-1}a - \alpha^{-1}a \\
&= x + \alpha^{-1}t = (\varepsilon|\alpha^{-1}t)x = (\varepsilon|t')x
\end{aligned} \tag{47}
$$
Thus, a conjugate transformation of a translation operator (ε|t) using a general rotation-plus-translation operator (α|a) that is also a member of the group G produces another simple translation operator (ε|t′), demonstrating that the translational subgroup is invariant under conjugation. The invariant property of the translational subgroup is exploited quite heavily in developing expressions for the irreducible representations of space groups. The difficulty in dealing with the symmetry of a crystal and in analyzing the vibrational modes of a crystal stems from the fact that the translational subgroup is infinite in size. Stated slightly differently, a crystal consists of a large number of atoms, so the number of normal vibrational modes is huge, i.e., on the order of 10²³. Fortunately, group theory provides a method for reducing this problem to a manageable size. Most textbooks dealing with applications of group theory to crystals develop factor group expressions for space groups. The factor group is isomorphic with the crystallographic point group provided the space group is symmorphic. A symmorphic space group consists of rotational and translational symmetry elements. A nonsymmorphic space group contains at least one symmetry element (α|a) in which the rotation α, all by itself, and/or the translation a, all by itself, are not symmetry elements of the space group. Of the 230 three-dimensional space groups, only 73 are symmorphic. Consequently, the use of factor groups is of no benefit for analyzing the vibrational modes of crystals in the 157 nonsymmorphic space groups. Since the use of factor groups is so common, albeit of limited value, the approach will be summarized here and then a more general approach that addresses both symmorphic and nonsymmorphic space groups will be developed.
The translational factor group G|T of the symmorphic space group G consists of

$$
G|T = T + \sum_{i=2}^{n} s_i T \tag{48}
$$

where the s_i are the nontranslational symmetry elements other than the identity element (s₁ = E) of the symmorphic space group G, whose crystallographic point group is of order n, and T is the set of translational symmetry elements of G. The translational factor group is distinct from the group G because T is treated as a single entity, rather than as the infinite number of lattice translational operations that it is in the group G. The factor group has a different multiplication rule than the space group, and T acts as the identity element. As long as G is symmorphic, the factor group G|T is obviously isomorphic with the crystallographic point group of G. Because of the isomorphic relationship of the two groups, the irreducible representations of the crystallographic point group will also serve as the irreducible representations of the factor group G|T. The irreducible representations of the factor group provide all the symmetry information about the crystal that is needed to analyze its vibrational motion. On the other hand, if G is nonsymmorphic, then it contains at least one element s_j of the form [α|v(α)], where α and/or v(α) are not group elements. As a consequence, G|T is no longer isomorphic with any crystallographic point group. The value of this approach, which makes use of factor groups, is that the total number of group elements has been reduced from g × 10²³, where g is the number of point group operations, to g. Furthermore, the factor group G|T is isomorphic with the crystallographic point group. Consequently, the irreducible representations of the point group serve as irreducible representations of the factor group. How does this help in the analyses of vibrational modes? Recall that the only Raman active modes are characterized by k = 0 (i.e., approximately infinite wavelength) and nonzero frequency.
The modes at k = 0 consist of corresponding atoms in each unit cell moving in phase. The number of normal modes of this type of motion is given by 3N, where N is the number of atoms in the unit cell. The symmetry of the unit cell is just that of the factor group. Thus, the irreducible representations of the crystallographic point group provide the irreducible representations of the Raman active vibrational modes of the crystal. As mentioned above, a more general approach to the group theoretical analysis of crystal vibrational modes will be developed in place of the use of factor groups. This approach is enunciated by Lax (1974). The group multiplication of symmetry elements of space groups is denoted by

$$
\begin{aligned}
[\alpha|v(\alpha)][\beta|v(\beta)]\,r &= [\alpha|v(\alpha)][\beta r + v(\beta)] = \alpha\beta r + \alpha v(\beta) + v(\alpha) \\
&= \alpha\beta r + v(\alpha\beta) + \alpha v(\beta) + v(\alpha) - v(\alpha\beta) \\
&= \alpha\beta r + v(\alpha\beta) + t \\
&= (\varepsilon|t)(\alpha\beta|v(\alpha\beta))\,r
\end{aligned} \tag{49}
$$

where

$$
t = \alpha v(\beta) + v(\alpha) - v(\alpha\beta)
$$
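The multiplication rule of Equation 49 can be verified with a small numerical sketch. The two-dimensional glide operation below is a hypothetical illustration (not an example from the text); squaring it yields a pure lattice translation (ε|t), with t given by the formula above:

```python
# Sketch (hypothetical 2D example): Seitz operators (alpha|v) act as
# r -> alpha r + v and compose as (alpha|va)(beta|vb) = (alpha beta | alpha vb + va),
# i.e., (epsilon|t)(alpha beta|v(alpha beta)) with t = alpha v(beta) + v(alpha) - v(alpha beta).

def mat_vec(m, v):
    return tuple(sum(m[i][j] * v[j] for j in range(2)) for i in range(2))

def compose(a, b):
    """(alpha|va)(beta|vb) = (alpha beta | alpha vb + va)."""
    (ma, va), (mb, vb) = a, b
    mab = tuple(tuple(sum(ma[i][k] * mb[k][j] for k in range(2))
                      for j in range(2)) for i in range(2))
    return (mab, tuple(x + y for x, y in zip(mat_vec(ma, vb), va)))

M = ((-1, 0), (0, 1))        # mirror x -> -x
g = (M, (0.0, 0.5))          # glide: mirror plus half a lattice translation

prod = compose(g, g)         # here v(alpha beta) = v(E) = 0, so prod = (E|t)
t = tuple(x + y for x, y in zip(mat_vec(M, g[1]), g[1]))   # alpha v(beta) + v(alpha)
assert prod == (((1, 0), (0, 1)), t)
print("glide squared is a pure lattice translation:", t)
```

The glide squared gives t = (0, 1), a full primitive lattice translation, exactly as Equation 49 predicts for a nonsymmorphic element.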
Next, consider the operation of (ε|t)[αβ|v(αβ)] on a Bloch function exp(ik·r)u_k(r). Functions of this form serve as eigenvectors of a crystal's vibrational Hamiltonian that is expressed in normal-mode coordinates, with u_k(r) representing the displacement in the kth normal vibrational mode of the harmonic oscillator at r (Altmann, 1991). Therefore, the operation of two successive symmetry elements [α|v(α)] and [β|v(β)] on the Bloch function is given by

$$
(\varepsilon|t)[\alpha\beta|v(\alpha\beta)]\, e^{i\mathbf{k}\cdot\mathbf{r}}\, u_k(\mathbf{r}) \tag{50}
$$

Applying [αβ|v(αβ)] to both functions that constitute the Bloch function converts the above expression to

$$
(\varepsilon|t)\left\{ \exp\!\left[i\mathbf{k}\cdot(\alpha\beta|v(\alpha\beta))^{-1}\mathbf{r}\right] (\alpha\beta|v(\alpha\beta))\, u_k(\mathbf{r}) \right\} \tag{51}
$$

Now focus attention on the argument of the exponential in the previous expression. It may be rewritten as

$$
i\mathbf{k}\cdot\left((\alpha\beta)^{-1}\,\middle|\,-(\alpha\beta)^{-1}v(\alpha\beta)\right)\mathbf{r} = i\mathbf{k}\cdot(\alpha\beta)^{-1}\left[\mathbf{r} - v(\alpha\beta)\right] = i(\alpha\beta)\mathbf{k}\cdot\left[\mathbf{r} - v(\alpha\beta)\right] \tag{52}
$$

(1) Substituting this expression for the argument of the exponential back into Equation 51, (2) operating (αβ|v(αβ)) on r in u_k(r), and (3) including (ε|t) in the arguments of both functions of the Bloch function convert Equation 51 to

$$
\exp\!\left\{i(\alpha\beta)\mathbf{k}\cdot(\varepsilon|t)^{-1}\left[\mathbf{r} - v(\alpha\beta)\right]\right\} u_k\!\left\{(\varepsilon|t)^{-1}[\alpha\beta|v(\alpha\beta)]^{-1}\mathbf{r}\right\} \tag{53}
$$

Now,

$$
u_k\!\left\{(\varepsilon|t)^{-1}[\alpha\beta|v(\alpha\beta)]^{-1}\mathbf{r}\right\} = u_k\!\left\{[\alpha\beta|v(\alpha\beta)]^{-1}\mathbf{r}\right\} \tag{54}
$$

because

$$
u_k(\mathbf{r} - \mathbf{t}) = u_k(\mathbf{r}) \tag{55}
$$

where t is a lattice vector, as u_k(r) has the periodicity of the lattice. Equation 53 now becomes

$$
\begin{aligned}
&\exp\!\left\{i(\alpha\beta)\mathbf{k}\cdot\left[\mathbf{r} - v(\alpha\beta) - \mathbf{t}\right]\right\} u_k\!\left\{[\alpha\beta|v(\alpha\beta)]^{-1}\mathbf{r}\right\} \\
&\quad = \exp\!\left[-i(\alpha\beta)\mathbf{k}\cdot\mathbf{t}\right] \exp\!\left\{i(\alpha\beta)\mathbf{k}\cdot\left[\mathbf{r} - v(\alpha\beta)\right]\right\} u_k\!\left\{[\alpha\beta|v(\alpha\beta)]^{-1}\mathbf{r}\right\} \\
&\quad = \exp\!\left[-i(\alpha\beta)\mathbf{k}\cdot\mathbf{t}\right] (\alpha\beta|v(\alpha\beta))\,\psi(\mathbf{r};\mathbf{k})
\end{aligned} \tag{56}
$$

where ψ(r; k) is the Bloch function. At this point it is no longer possible to continue in a completely general fashion. To include in Equation 56 all possible values of k, it would be necessary to develop two separate expressions of Equation 56: one for symmorphic space groups and the other for nonsymmorphic space groups. Alternatively, a single expression for Equation 56 can be developed for both symmorphic and nonsymmorphic space groups if we consider only one value of k: the long-wavelength limit, k = 0. For k = 0, the above group multiplication becomes

$$
(\alpha|v(\alpha))(\beta|v(\beta)) = (\alpha\beta|v(\alpha\beta)) \tag{57}
$$

which is the same group multiplication rule obeyed by symmetry elements of point groups (i.e., groups consisting of elements of the form [α|0]). Thus, the representations of the space group elements [α|v(α)] are identical to the representations of the point group elements [α|0].

THOMAS M. DEVINE
University of California
Berkeley, California

ULTRAVIOLET PHOTOELECTRON SPECTROSCOPY

INTRODUCTION

Photoemission and Inverse Photoemission

Ultraviolet photoelectron spectroscopy (UPS) probes electronic states in solids and at surfaces. It relies on the process of photoemission, in which an incident photon provides enough energy to bound valence electrons to release them into vacuum. Their energy E, momentum ℏk, and spin s provide the full information about the quantum numbers of the original valence electron using conservation laws. Figure 1 depicts the process in an energy diagram. Essentially, the photon provides energy but negligible momentum (due to its long wavelength λ = 2π/|k|), thus shifting all valence states up by a fixed energy (''vertical'' or ''direct'' transitions). In addition, secondary
Figure 1. Photoemission process (Smith and Himpsel, 1983).
Figure 2. Photoemission and inverse photoemission as probes of occupied and unoccupied valence states: φ = work function (Himpsel and Lindau, 1995).
processes, such as energy loss of a photoelectron by creating plasmons or electron-hole pairs, produce a background of secondary electrons that increases toward lower kinetic energy. It is cut off at the vacuum level E_V, where the kinetic energy goes to zero. The other important energy level is the Fermi level E_F, which becomes the upper cutoff of the photoelectron spectrum when translated up by the photon energy. The difference φ = E_V − E_F is the work function. It can be obtained by subtracting the energy width of the photoelectron spectrum from the photon energy. For reviews on photoemission, see Cardona and Ley (1978), Smith and Himpsel (1983), and Himpsel and Lindau (1995). Information specific to angle-resolved photoemission is given by Plummer and Eberhardt (1982), Himpsel (1983), and Kevan (1992). Photoemission is complemented by a sister technique that maps out unoccupied valence states, called inverse photoemission or bremsstrahlung isochromat spectroscopy (BIS). This technique is reviewed by Dose (1985), Himpsel (1986), and Smith (1988). As shown in Figure 2, inverse photoemission represents the reverse of the photoemission process, with an incoming electron and an outgoing photon. The electron drops into an unoccupied state, and the energy is released by the emission of a photon. Both photoemission and inverse photoemission operate at photon energies in the ultraviolet (UV), starting with the work function threshold at 4 eV and reaching up to 50- to 100-eV photon energy, where the cross-section of valence states has fallen off by an order of magnitude and the momentum information begins to get blurred. At kinetic energies of 1 to 100 eV, the electron mean free path is only a few atomic layers, making it possible to detect surface states as well as bulk states.
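The work-function extraction described above is simple arithmetic; a sketch with illustrative (hypothetical) numbers, assuming a He I source:

```python
# Sketch: phi = h*nu - (spectral width between the Fermi-level and
# vacuum-level cutoffs), as described above. The numbers are illustrative
# assumptions, not values from the text.
def work_function(photon_energy_eV, spectrum_width_eV):
    return photon_energy_eV - spectrum_width_eV

# e.g., a 21.2-eV He I source and a hypothetical 16.7-eV-wide spectrum:
print(round(work_function(21.2, 16.7), 2))   # -> 4.5 (eV)
```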
Characterization of Valence Electrons

In atoms and molecules, the important parameters characterizing a valence electron are its binding energy (or ionization potential) plus its angular momentum (or symmetry label in molecules). This information is augmented by the vibrational and rotational fine structures, which shed light onto the interatomic potential curves (Turner et al., 1970). In a solid, it becomes necessary to consider not only energy but also momentum. Electrons in a crystalline solid are completely characterized by a set of quantum numbers. These are energy E, momentum ℏk, point group symmetry (i.e., angular symmetry), and spin. This information can be summarized by plotting E(k) band dispersions with the appropriate labels for point group symmetry and spin. Disordered solids such as random alloys can be characterized by average values of these quantities, with disorder introducing a broadening of the band dispersions. Localized electronic states (e.g., the 4f levels of rare earths) exhibit flat E(k) band dispersions. Angle-resolved photoemission, combined with a tunable and polarized light source, such as synchrotron radiation, is able to provide the full complement of quantum numbers. In this respect, photoemission and inverse photoemission are unique among the methods of determining the electronic structure of solids. Before getting into the details of the technique, its capabilities are illustrated in Figure 3, which shows how much information can be extracted by various techniques about the band structure of Ge. Optical spectroscopy integrates over the momentum and energy of the photoelectron and leaves only the photon energy hν as a variable. The resulting spectral features represent regions of momentum and energy space near critical points at which the density of transitions is high (Fig. 3A). By adjusting an empirical band structure to the data (Chelikowski and Cohen, 1976; Smith et al., 1982), it is possible to extract rather accurate information about the strongest critical points. Angle-integrated photoemission goes one step further by detecting the electron energy in addition to the photon energy (Fig. 3B). Now it becomes possible to sort out whether spectral features are due to the lower state or the upper state of the optical transition. To extract all the information on band dispersion, it is necessary to resolve the components of the electron momentum parallel and perpendicular to the surface, k∥ and k⊥, by angle-resolved photoemission with variable photon energy (Fig. 3C). Photoemission and inverse photoemission data can, in principle, be used to derive a variety of electronic properties of solids, such as the optical constants [from the E(k) relations; see Chelikowski and Cohen (1976) and Smith et al. (1982)], conductivity (from the optical constants), the electron lifetime [from the time decay constant τ or from the line width δE; see Olson et al. (1989) and Haight (1995)], the electron mean free path [from the attenuation length λ or from the line width δk; see Petrovykh et al. (1998)], the group velocity of the electrons [via the slope of the E(k) relation; see Petrovykh et al. (1998)], the magnetic moment (via the band filling), and the superconducting gap (see Olson et al., 1989; Shen et al., 1993; Ding et al., 1996).

Energy Band Dispersions

How are energy band dispersions determined in practice? A first look at the task reveals that photoemission (and inverse photoemission) provides just the right number of independent measurable variables to establish a unique correspondence to the quantum numbers of an electron in a solid. The energy E is obtained from the kinetic energy of the electron. The two momentum components parallel to
OPTICAL IMAGING AND SPECTROSCOPY
Figure 4. Mapping bulk and surface bands of copper by angle-resolved photoemission. Variable photon energy from monochromatized synchrotron radiation makes it possible to tune the momentum component perpendicular to the surface, k⊥, and to distinguish surface states from bulk states by their lack of k⊥ dispersion (Knapp et al., 1979; Himpsel, 1983).
Figure 3. Comparison of various spectroscopies applied to the band structure of germanium. (A) The optical reflectivity R(E) determines critical points near the band gap, (B) angle-integrated photoemission adds critical points farther away, and (C) angle-resolved photoemission and inverse photoemission provide the full E(k) band dispersion (Phillip and Ehrenreich, 1963; Grobman et al., 1975; Chelikowski and Cohen, 1976; Wachs et al., 1985; Hybertsen and Louie, 1986; Ortega and Himpsel, 1993).
the surface, k∥, are derived from the polar and azimuthal angles θ and φ of the electron. The third momentum component, k⊥, is varied by tuning the photon energy hν. This can be seen in Figure 4, in which angle-resolved photoemission spectra at different photon energies hν are plotted together with the relevant portion of the band structure. The parallel momentum component k∥ is kept zero by detecting photoelectrons in normal emission; the perpendicular component k⊥ changes as the photon energy of the vertical interband transitions is increased. A complete band structure determination requires a tunable photon source, such as synchrotron radiation (or a tunable photon detector in inverse photoemission). Surface states, such as the state near EF, labeled S1 in Figure 4, do not exhibit a well-defined k⊥ quantum number. Their binding energy does not change when the k⊥ of the upper state is varied by changing hν. This lack of k⊥ dispersion is one of the characteristics of a surface state. Other clues are that a surface state is located in a gap of bulk states with the same k∥ and that a surface state is sensitive to contamination. To complete the set of quantum numbers, one needs the point group symmetry and, in ferromagnets, the spin. The former is obtained via dipole selection rules from the polarization of the photon, the latter from the spin polarization of the photoelectron. For two-dimensional (2D) states in thin films and at surfaces, the determination of energy bands is almost trivial since only E and k∥ have to be determined. These quantities obey the conservation laws

El = Eu − hν
(1)
ULTRAVIOLET PHOTOELECTRON SPECTROSCOPY
Competitive and Related Techniques
and

k∥,l = k∥,u − g∥        (2)
where g∥ is a vector of the reciprocal surface lattice, u denotes the upper state, and l the lower state. These conservation laws can be derived from the invariance of the crystal with respect to translation in time and in space (by a surface lattice vector). For the photon, only its energy hν appears in the balance because the momentum of a UV photon is negligible compared to the momentum of the electrons. The subtraction of a reciprocal lattice vector simply corresponds to plotting energy bands in a reduced surface Brillouin zone, i.e., within the unit cell in k∥ space. For a three-dimensional (3D) bulk energy band, the situation becomes more complicated since the momentum component perpendicular to the surface is not conserved during the passage of the photoelectron across the surface energy barrier. However, k⊥ can be varied by changing the photon energy hν, and extremal points, such as the Γ and L points in Figure 4, can thus be determined. A discussion of the practical aspects and capabilities of various band mapping schemes is given below (see Practical Aspects of the Method) and in several reviews (Plummer and Eberhardt, 1982; Himpsel, 1983; Kevan, 1992). Experimental energy bands are compiled in Landolt-Börnstein (1989, 1994). In ferromagnets the bands are split into two subsets, one with majority spin, the other with minority spin (see Fig. 5). The magnetic exchange splitting δEex between majority and minority spin bands is the key to magnetism. It causes the majority spin band to be filled more than the minority band, thus creating the spin imbalance that produces the magnetic moment. An overview of the band structure of ferromagnets and magnetic thin-film structures is given in Himpsel et al. (1998).
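In practice, Equations 1 and 2 are applied by converting the measured kinetic energy and emission angle of the photoelectron into k∥. A minimal sketch of this standard conversion (the numerical result follows from k = √(2mE)/ℏ; the function names are illustrative, not from the original):

```python
import math

HBAR = 1.054571817e-34  # J*s
M_E = 9.1093837015e-31  # electron mass, kg
EV = 1.602176634e-19    # J per eV

def k_parallel(e_kin_ev, theta_deg):
    """Parallel momentum (1/Angstrom) of a photoelectron with kinetic
    energy e_kin_ev (eV) detected at polar angle theta_deg off normal."""
    k_total = math.sqrt(2.0 * M_E * e_kin_ev * EV) / HBAR   # 1/m
    return k_total * math.sin(math.radians(theta_deg)) * 1e-10  # -> 1/Angstrom

def lower_state_energy(e_upper_ev, photon_ev):
    """Energy conservation, Equation 1: E_l = E_u - h*nu."""
    return e_upper_ev - photon_ev

# Example: a 16-eV photoelectron detected 45 degrees off normal
print(round(k_parallel(16.0, 45.0), 3))                 # -> 1.449 (1/Angstrom)
print(round(lower_state_energy(16.0, 21.2), 1))         # -> -5.2 (eV below the upper state)
```

The prefactor works out to k∥ ≈ 0.5123 √(E_kin/eV) sin θ per angstrom, which is the form usually quoted for band mapping.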
Figure 5. Using multidetection of energy and angle to map out the E(k) band dispersion of Ni near the Fermi level EF from a Ni(001) crystal. The majority and minority spin bands are split by the ferromagnetic exchange splitting δEex. High photoemission intensity is shown in dark (Petrovykh et al., 1998).
The relation of UPS to optical spectroscopy has been mentioned in the context of Figure 3 already. Several other spectroscopies involve core levels but provide information about valence electrons as well. Figure 6 shows the simplest processes, which involve transitions between two levels only, one of them a core level. More complex phenomena involve four levels, such as Auger electron spectroscopy (see AUGER ELECTRON SPECTROSCOPY) and appearance potential spectroscopy. Core-level photoelectron spectroscopy determines the energy of a core level relative to the Fermi level EF (Fig. 6A) and is known as x-ray photoelectron spectroscopy (XPS) or electron spectroscopy for chemical analysis (ESCA). A core electron is ionized, and its energy is obtained by subtracting the photon energy
Figure 6. Spectroscopies based on core levels. (A) X-ray photoelectron spectroscopy measures the binding energy of core levels, (B) core-level absorption spectroscopy detects transitions from a core level into unoccupied valence states, and (C) core-level emission spectroscopy detects transitions from occupied valence states into a core hole. In contrast to UPS, these spectroscopies are element-specific but cannot provide the full momentum information.
from the kinetic energy of the photoelectron. The energy shifts of the substrate and adsorbate levels are a measure of the charge transfer and chemical bonding at the surface. Core-level absorption spectroscopy determines the pattern of unoccupied orbitals by exciting them optically from a core level (Fig. 6B). It is also known as near-edge x-ray absorption fine structure (NEXAFS; see XAFS SPECTROSCOPY) or x-ray absorption near-edge structure (XANES). Instead of measuring transmission or reflectivity, the absorption coefficient is determined in surface experiments by collecting secondary products, such as photoelectrons, Auger electrons, and core-level fluorescence. The short escape depth of electrons compared to photons provides surface sensitivity. These spectra resemble the density of unoccupied states, projected onto specific atoms and angular momentum states. In a magnetic version of this technique, magnetic circular dichroism (MCD), the difference in the absorption between parallel and antiparallel alignment of the electron and photon spin is measured. Compared to UPS, the momentum information is lost in core-level absorption spectroscopy, but element sensitivity is gained. Also, the finite width of the core level is convolved with the absorption spectrum and limits the resolution. Core-level emission spectroscopy can be viewed as the reverse of absorption spectroscopy (Fig. 6C). Here, the valence orbital structure is obtained from the spectral distribution of the characteristic x rays emitted during the recombination of a core hole with a valence electron. The core hole is created by either optical or electron excitation. As with photoemission and inverse photoemission, core-level emission spectroscopy complements core-level absorption spectroscopy by mapping out occupied valence states projected onto specific atoms.
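The bookkeeping behind Figure 6A is a simple energy balance: the binding energy of a core level follows from the photon energy and the measured kinetic energy. A sketch, referenced to the Fermi level (the Al Kα photon energy is standard; the kinetic energy and work-function value below are illustrative assumptions):

```python
def binding_energy(photon_ev, e_kin_ev, work_function_ev):
    """Core-level binding energy relative to the Fermi level:
    E_B = h*nu - E_kin - phi, where phi is the spectrometer work function."""
    return photon_ev - e_kin_ev - work_function_ev

# Example: Al K-alpha excitation (1486.6 eV), assumed work function 4.5 eV,
# and an assumed measured kinetic energy of 1199.1 eV
print(round(binding_energy(1486.6, 1199.1, 4.5), 1))  # -> 283.0 (eV)
```

A chemical shift of the adsorbate level then shows up directly as a shift of this number between, e.g., the clean and reacted surface.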
PRINCIPLES OF THE METHOD

The Photoemission Process

The most general theory of photoemission is given by an expression of the golden rule type, i.e., a differential cross-section containing a matrix element between the initial and final states ψi and ψf and a phase space sum (see Himpsel, 1983; Dose, 1985; Smith, 1988):

dσ ∝ √Ekin Σi |⟨f|A·p + p·A|i⟩|² δ(Ef − Ei − hν)        (3)

The factor √Ekin containing the kinetic energy Ekin of the photoelectron represents the density of final states, the scalar product of the vector potential A of the photon and the momentum operator p = −iℏ∂/∂r is the dipole operator for optical excitation, and the δ function represents energy conservation. This is often called the one-step model of photoemission, as opposed to the three-step model, which approximates the one-step model by a
sequence of three simpler steps. For practical purposes (see Practical Aspects of the Method; Himpsel, 1983), we have to consider mainly the various selection rules inherent in this expression, such as the conservation of energy (Equation 1), parallel momentum (Equation 2), and spin, together with the point group selection rules. Since there is no clear-cut selection rule for the perpendicular momentum, it is often determined approximately by using a nearly free electron final state. While selection rules provide clear yes-no decisions, there are also more subtle effects of the matrix element that can be used to bring out specific electronic states. The atomic symmetry character determines the energy dependence of the cross-section (Yeh and Lindau, 1985), allowing a selection of specific orbitals by varying the photon energy. For example, the s,p states in transition and noble metals dominate the spectra near the photoelectric threshold, while the d states turn on at about 10 eV above threshold. It takes photon energies of about 30 eV above threshold to make the f states in rare earths visible. Resonance effects at a threshold for a core-level excitation can also enhance particular orbitals. Conversely, the cross-section for states with a radial node exhibits so-called Cooper minima, at which the transitions become almost invisible. The wave functions of the electronic states in solids and at surfaces are usually approximated by the ground-state wave functions obtained from a variety of schemes, e.g., empirical tight binding and plane-wave schemes (Chelikowski and Cohen, 1976; Smith et al., 1982) and first-principles local density approximation (LDA; see Moruzzi et al., 1978; Papaconstantopoulos, 1986). Strictly speaking, one should use the excited-state wave functions and energies that represent the hole created in the photoemission process or the extra electron added to the solid in the case of inverse photoemission.
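The structure of Equation 3 can be made concrete with a toy calculation: each allowed transition contributes intensity at Ef = Ei + hν, weighted by its squared matrix element and the √Ekin density-of-states factor, with the δ function broadened into a Lorentzian to mimic finite lifetime. All numbers below (level energies, matrix elements, widths) are invented purely for illustration:

```python
import math

def edc(e_kin_grid, levels, photon_ev, width=0.1):
    """Toy energy distribution curve from a golden-rule sum.
    levels: list of (initial_energy_eV, squared_matrix_element) pairs.
    The delta function of Eq. 3 is broadened into a Lorentzian of HWHM `width`."""
    spectrum = []
    for e_kin in e_kin_grid:
        total = 0.0
        for e_i, m2 in levels:
            e_f = e_i + photon_ev  # energy conservation, Eq. 1
            # Lorentzian replacing delta(E_f - E_i - h*nu)
            total += m2 * (width / math.pi) / ((e_kin - e_f) ** 2 + width ** 2)
        spectrum.append(math.sqrt(max(e_kin, 0.0)) * total)  # sqrt(E_kin) factor
    return spectrum

# Two fictitious initial states at -1.0 and -3.0 eV, He(I) photon energy 21.2 eV
grid = [i * 0.05 for i in range(300, 440)]  # final-state energies 15.0 .. 21.95 eV
spec = edc(grid, [(-1.0, 1.0), (-3.0, 0.5)], 21.2)
peak = grid[spec.index(max(spec))]
print(round(peak, 2))  # -> 20.2, i.e., -1.0 eV + 21.2 eV
```

The stronger peak lands at the energy dictated by the δ function, and its weight is set by the matrix element, which is exactly the bookkeeping the selection-rule discussion above exploits.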
Such quasiparticle calculations have now become feasible and provide the most accurate band dispersions to date, e.g., the so-called GW calculations, which calculate the full Green’s function G of the electron/hole and the fully screened Coulomb interaction W but still neglect vertex and density gradient corrections (Hybertsen and Louie, 1986; see also BONDING IN METALS). Particularly in the case of semiconductors, the traditional ground-state methods are unable to determine the fundamental band gap from first principles, with Hartree-Fock overestimating it and LDA underestimating it, typically by a factor of 2. The band width comes out within 10% to 20% in LDA calculations. Various types of wave functions can be involved in the photoemission process, as shown in Figure 7. The states that are propagating in the solid lead to vertical transitions that conserve all three momentum components and are used for mapping out bulk bands. Evanescent states conserve the parallel momentum only and are more sensitive to the surface. At elevated temperatures one has to consider phonon-assisted transitions, which scramble the momentum information about the initial state completely. The materials property obtained from angle-integrated UPS is closely related to the density of states and is often interpreted as such. Strictly speaking, one measures
Figure 7. Various types of wave functions encountered in UPS. Propagating bulk states give rise to transitions that conserve all three k components; evanescent states ignore the k⊥ component (Smith and Himpsel, 1983).
the energy distribution of the joint density of initial and final states, which includes the optical matrix element (Grobman et al., 1975). Angle-resolved UPS spectra provide the E(k) band dispersions that characterize electrons in a solid completely. Since the probing depth is several atomic layers in UPS, it is possible to determine these properties for the bulk as well as for the surface.

Energy Bands in Solids

The mapping of energy bands in solids and at surfaces is based on the conservation laws for energy and parallel momentum (Equations 1 and 2). Figures 3, 4, and 5 give examples. Presently, energy bands have been mapped for practically all elemental solids and for many compounds. Such results are compiled in Landolt-Börnstein (1989, 1994). For the ferromagnets, see Himpsel et al. (1998). Since energy band dispersions comprise the full information about electrons in solids, all other electronic properties can in principle be obtained from them. Such a program has been demonstrated for optical properties (Chelikowski and Cohen, 1976; Smith et al., 1982). For ferromagnetism, the most significant band parameters are the exchange splitting δEex between majority and minority spin bands and the distance between the top of the majority spin d bands and the Fermi level (Stoner gap). The exchange splitting is loosely related to the magnetic moment (about 1 eV splitting per Bohr magneton), the Stoner gap to the minimum energy for spin-flip excitations (typically from a few tenths of an electron volt down to zero).

Surface States

The electronic structure of a surface is characterized by 2D energy bands that give the relation between E and k∥. The momentum perpendicular to the surface, k⊥, ceases to be a good quantum number. The states truly specific to the surface are distinguished by their lack of interaction with bulk states, which usually requires that they are located at points in the E(k∥) diagram where bulk states of the same symmetry are absent. This is revealed in band diagrams where the E(k∥) band dispersions of surface states are superimposed on the regions of bulk bands projected along k⊥. On metals, one finds fairly localized, d-like surface states and delocalized s,p-like surface states. The d states carry most of the spin polarization in ferromagnets. Their cross-section starts dominating relative to s,p states as the photon energy is increased. An example of an s,p-like surface state is given in Figure 4. On the Cu(111) surface a pz-like surface state appears close to the Fermi level (S1). It is located just above the bottom of the L2′–L1 gap of the bulk s,p band. A very basic type of s,p surface state is the image state in a metal. The negative charge of an electron outside a metal surface induces a positive image charge that binds the electron to the surface. In semiconductors, surface states may be viewed as broken bond orbitals.

Electronic Phase Transitions

The electronic states at the Fermi level are a crucial factor in many observed properties of a solid. States within a few kT of the Fermi level determine the transport properties, such as electrical conductance, where k is the Boltzmann constant and T the temperature in kelvin. They also drive electronic phase transitions, such as superconductivity, magnetism, and charge density waves. Typically, these phase transitions open a gap at the Fermi level of a few multiples of kTC, where TC is the transition temperature. Occupied states near the Fermi level move down in energy by half the gap, which lowers the total energy. In recent years, the resolution of UPS experiments has reached a level where sub-kT measurements have become nearly routine. The best-known examples are measurements of the gap in high-temperature superconductors (Olson et al., 1989; Shen et al., 1993; Ding et al., 1996). Compared to other techniques that probe the superconducting gap, such as infrared absorption and tunneling, angle-resolved photoemission provides its k dependence.
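The "few kT" energy scale quoted here can be checked numerically: the derivative of the Fermi function has a full width at half-maximum of roughly 3.5 kT (this figure also appears later in the discussion of practical aspects), which sets the thermal smearing a photoemission experiment must resolve near EF. A self-contained check, in eV:

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant, eV/K

def fermi(e_ev, t_kelvin):
    """Fermi-Dirac occupation; energy measured from the Fermi level."""
    return 1.0 / (math.exp(e_ev / (K_B * t_kelvin)) + 1.0)

def fermi_edge_fwhm(t_kelvin):
    """FWHM of -df/dE; analytically 2*ln(3 + 2*sqrt(2)) * kT, about 3.53 kT."""
    return 2.0 * math.log(3.0 + 2.0 * math.sqrt(2.0)) * K_B * t_kelvin

print(round(fermi_edge_fwhm(300.0), 3))  # -> 0.091 eV at room temperature
print(round(fermi(-fermi_edge_fwhm(300.0) / 2.0, 300.0), 2))  # -> 0.85, the 85% point
```

The half-height points of the derivative fall at the 15% and 85% levels of the Fermi function, so a superconducting gap of a few kTC is only visible once the resolution is pushed well below this thermal width.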
Atoms and Molecules

Atoms and molecules in the gas phase exhibit discrete energy levels that correspond to their orbitals. There is additional fine structure due to transitions between different vibrational and rotational states in the lower and upper states of the photoemission process. This fine structure disappears when molecules are adsorbed at surfaces. Nevertheless, the envelope function of the molecular orbitals frequently survives and facilitates the fingerprinting of adsorbed species. A typical molecular photoelectron spectrum is shown in Figure 8 (Turner et al., 1970), and a typical spectrum of an adsorbed molecular fragment in Figure 9 (Sutherland et al., 1997). Ultraviolet photoelectron spectroscopy has been used to study surface reactions in heterogeneous catalysis and
Figure 8. UPS spectrum of the free CO molecule, showing several molecular orbitals and their vibrational fine structure. Note that the reference level is the vacuum level EV for atoms and molecules, whereas it is the Fermi level EF for solids (Turner et al., 1970).
in semiconductor surface processing. For example, the methyl and ethyl groups adsorbed on a silicon surface in Figure 9 exhibit the same C 2s orbital splittings as in the gas phase (compare Pireaux et al., 1986), allowing a clear identification of the fragments that remain on the surface after reaction with dimethylsilane and diethylsilane. Generally, one distinguishes the weak adsorption on inert surfaces at low temperatures (physisorption regime) from strong adsorption on reactive surfaces at room temperature (chemisorption). During physisorption, molecular orbitals are shifted rigidly toward higher energies by about 1 to 2 eV due to dielectric screening of the valence holes created in the photoemission process. Essentially, surrounding electrons in the substrate lower their energy by moving toward the valence hole on the adsorbate molecule, and the energy gained is transferred to the emitted photoelectron. In the chemisorption regime, certain molecular orbitals react with the substrate and experience an additional chemical shift. A well-studied example is the adsorption of CO on transition metals, where the chemically active 5σ orbital is shifted relative to the more inert 4σ and 1π orbitals. The orientation of the adsorbed molecules can be determined from the angle and polarization dependence of the UPS spectra using dipole selection rules. If the CO molecule is adsorbed with its axis perpendicular to the surface and photoelectrons are detected along that direction, the σ orbitals are seen with the electric field vector E of the light along the axis and the π orbitals with E perpendicular to it. Similar selection rules apply to mirror planes and make it possible to distinguish even and odd wave functions.
PRACTICAL ASPECTS OF THE METHOD

Light Sources
Figure 9. Fingerprinting of methyl and ethyl groups deposited on a Si(100) surface via reaction with dimethylsilane and diethylsilane. The lowest orbitals are due to the C 2s electrons. The number of carbon atoms in the chain determines the number of C 2s orbitals (Sutherland et al., 1997).
To be able to excite photoelectrons across a valence band that is typically 10 to 20 eV wide, the photon energy has to exceed the valence band width plus the work function (typically 3 to 5 eV). This is one reason conventional lasers have been of limited applicability in photoelectron spectroscopy, except for two-photon pump-probe experiments with short pulses (Haight, 1995). The most common source of UV radiation is based on a capillary glow discharge using the He(I) line. It provides monochromatic radiation at 21.2 eV with a line width as narrow as 1 meV (Baltzer et al., 1993; a powerful electron cyclotron resonance source is marketed by Gammadata). This radiation originates from the 2p-to-1s transition in neutral He atoms. Emission can also be produced from He ions, whose primary He(II) emission line is at an energy of 40.8 eV. Similarly, emission lines of Ne(I) at 16.8 eV and Ne(II) at 26.9 eV are being used in photoelectron spectroscopy. During the last two decades, synchrotron radiation has emerged as both a powerful and convenient excitation source in photoelectron spectroscopy (Winick and Doniach, 1980; Koch, 1983; Winick et al., 1989; some synchrotron light sources are listed in Appendix A). Synchrotron radiation is emitted by relativistic electrons kept in a circular orbit by bending magnets. In recent years, undulators have been developed that contain 10 to 100 bends in a row. For a well-focused electron beam and a highly perfect magnetic field, photons emitted from all bends are superimposed coherently. In this case, the amplitudes add up, as opposed to the intensities, and a corresponding increase in spectral brilliance by 2 to 4 orders of magnitude is
achievable. The continuous synchrotron radiation spectrum shifts its weight toward higher energies with increasing energy of the stored electrons and increasing magnetic fields. A variety of electron storage rings exists worldwide that are dedicated to synchrotron radiation. Three spectral regions can be distinguished by their scientific scope: For the photon energies of 5 to 50 eV required by UPS, storage ring energies of 0.5 to 1 GeV combined with undulators are optimum. For core-level spectroscopies at 100- to 1000-eV photon energy, a storage ring energy of 1.5 to 2 GeV is the optimum. For studies of the atomic structure of solids, even shorter wavelengths are required, which correspond to photon energies of 5 to 10 keV and storage ring energies of 6 to 8 GeV combined with an undulator. Bending magnets emit higher photon energies than undulators due to their higher magnetic field. Synchrotron radiation needs to be monochromatized for photoelectron spectroscopy. The typical range of UPS is covered best by normal-incidence monochromators, which provide photon energies up to 30 eV with optimum intensity and up to 50 eV with reduced intensity, while suppressing higher energy photons from higher order diffracted light and harmonics of the undulators. Resolutions of better than 1 meV are routinely achievable now. Synchrotron radiation has a number of desirable properties. The most widely utilized properties are a tunable photon energy and a high degree of polarization. Tunable photon energy is necessary for reaching the complete momentum space by varying k? , for distinguishing surface from bulk states, and for adjusting relative cross-sections of different orbitals. Linear polarization (in the plane of the electron orbit) is critical for determining the point group symmetry of electrons in solids. Circular polarization (above and below the orbit plane) allows spin-specific transitions in magnetic systems. 
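The brilliance gain from an undulator follows from coherent superposition: amplitudes from N bends add in phase, so the intensity scales as N² rather than the N obtained by adding intensities incoherently, a net gain of a factor N. A toy phasor sum (the value of N and the jitter model are illustrative, not from the original):

```python
import cmath, random

def intensity(n_bends, phase_jitter_rad=0.0, seed=0):
    """|sum of N unit phasors|^2; optional Gaussian phase jitter models an
    imperfect magnetic field or a poorly focused electron beam."""
    rng = random.Random(seed)
    amp = sum(cmath.exp(1j * rng.gauss(0.0, phase_jitter_rad))
              for _ in range(n_bends))
    return abs(amp) ** 2

n = 100
coherent = intensity(n)        # perfect undulator: N^2
incoherent = n * intensity(1)  # N independent bends: N * 1
print(round(coherent / incoherent))  # -> 100, i.e., 2 orders of magnitude
```

For N between 10 and 100, and with further gains from the reduced source size and divergence, this is consistent with the 2 to 4 orders of magnitude quoted above.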
Some synchrotron facilities are listed in Appendix A, and figures of merit to consider regarding light sources and detectors for UPS are listed in Appendix B.

Electron Spectrometers

The most common types of photoelectron spectrometers have been the cylindrical mirror analyzer (CMA) for high-throughput, angle-integrated UPS and the hemispherical analyzer for high-resolution, angle-resolved UPS. Popular double-pass CMAs were manufactured by PHI. With hemispherical spectrometers, resolutions in the low-millielectron-volt range are achievable. For band mapping, the perpendicular momentum k⊥ is commonly obtained by assuming a nearly free electron final band, an approximation that works best at higher photon energies (>30 eV), where the kinetic energy of the photoelectrons becomes large compared to the Fourier components of the lattice
Figure 10. Location of direct transitions in k space, shown for a Ni(001) surface along the [110] emission azimuth. The circular lines represent the location of possible upper states at a given kinetic energy using empty lattice bands. The lines fanning out toward the top represent different emission angles from the [001] sample normal (Himpsel, 1983).
potential. Various "absolute" methods are available to refine the k values. These methods determine the k⊥ component by triangulation from different crystal faces or by mirror symmetry properties of high-symmetry lines (see Plummer and Eberhardt, 1982; Himpsel, 1983; Kevan, 1992). The thermal broadening of the Fermi function that cuts off photoelectron spectra at high energy is proportional to kT. Taking the full width at half-maximum of the derivative of the Fermi function, one obtains a width of about 3.5kT, which is 0.09 eV at room temperature and 1 meV at liquid He temperature. The half-height points of the derivative correspond to the 15% and 85% levels of the Fermi function.

Sensitivity Limits

A typical measurement volume in UPS is an area 1 mm² times a depth of 1 nm. In photoelectron microscopes, it is possible to reduce the sampling area to dimensions of 0. It can be seen that the curves do not quite have a periodic form. The curve front moves toward a lower ψ value with increasing cycle number. In the case where k2 is high (0.3), the curves show a spiral shape. To determine the value of the refractive index n2 for a nonabsorbing film (k2 = 0), a large number of theoretical curves have to be drawn and compared with the experimental values. The data-fitting procedure is much more complicated for absorbing films that have a complex refractive index. In such a case, both n and k values have to be changed, and families of curves have to be drawn for the relationship between Δ and ψ. From these curves, the one that best fits the experimentally observed curves can be used to determine n2 and k2. One can use any standard fitting tool for determining the minimum error between the experimental and predicted values or use some of the ellipsometry-specific methods in the literature (Erfemova and Arsov, 1992).
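The curve-fitting procedure just described can be sketched for the standard ambient/film/substrate (three-phase) model: compute tan ψ · e^(iΔ) = Rp/Rs from the Fresnel coefficients for trial values of n2 and k2, and keep the pair that minimizes the error against the measured (Δ, ψ). All numerical values below (wavelength, angle of incidence, substrate index, film thickness) are illustrative assumptions, and the sign convention for Δ follows from the chosen Fresnel forms:

```python
import cmath, math

def delta_psi(n_film, d_nm, n_sub, wavelength_nm=632.8, phi_deg=70.0, n_amb=1.0):
    """(Delta, psi) in degrees for an ambient/film/substrate stack."""
    phi1 = math.radians(phi_deg)
    s1 = n_amb * cmath.sin(phi1)                    # Snell invariant n*sin(phi)
    c1 = cmath.cos(phi1)
    c2 = cmath.sqrt(1 - (s1 / n_film) ** 2)         # cos(phi) in the film
    c3 = cmath.sqrt(1 - (s1 / n_sub) ** 2)          # cos(phi) in the substrate

    def r_p(na, ca, nb, cb):                        # Fresnel, p polarization
        return (nb * ca - na * cb) / (nb * ca + na * cb)

    def r_s(na, ca, nb, cb):                        # Fresnel, s polarization
        return (na * ca - nb * cb) / (na * ca + nb * cb)

    beta = 2 * cmath.pi * d_nm / wavelength_nm * n_film * c2  # film phase thickness
    x = cmath.exp(-2j * beta)
    rp = ((r_p(n_amb, c1, n_film, c2) + r_p(n_film, c2, n_sub, c3) * x)
          / (1 + r_p(n_amb, c1, n_film, c2) * r_p(n_film, c2, n_sub, c3) * x))
    rs = ((r_s(n_amb, c1, n_film, c2) + r_s(n_film, c2, n_sub, c3) * x)
          / (1 + r_s(n_amb, c1, n_film, c2) * r_s(n_film, c2, n_sub, c3) * x))
    rho = rp / rs                                   # tan(psi) * exp(i*Delta)
    return math.degrees(cmath.phase(rho)), math.degrees(math.atan(abs(rho)))

# Synthetic "measurement" from a known film, then a brute-force grid search
n_sub = 3.85 - 0.02j                                # assumed substrate index
target = delta_psi(1.46 - 0.0j, 100.0, n_sub)       # film with n2 = 1.46, k2 = 0
best = min(
    ((n / 100.0, k / 100.0) for n in range(130, 171) for k in range(0, 11)),
    key=lambda nk: sum(
        (a - b) ** 2
        for a, b in zip(delta_psi(nk[0] - 1j * nk[1], 100.0, n_sub), target)
    ),
)
print(best)  # -> (1.46, 0.0): the grid search recovers the known film
```

In practice one would use a proper least-squares minimizer rather than a grid, but the structure is the same: a forward model for (Δ, ψ) plus an error criterion.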
SAMPLE PREPARATION

Ellipsometric measurements can be carried out only with specimens that have a high degree of reflectivity. The degrees of smoothness and flatness are both factors that must be considered in the selection of a substrate. An indication of the extent of regularity of the whole surface is obtained by determination of the P and A values at several
Metal specimens with high reflectivity and purity at the surface can be prepared as films by vapor deposition in vacuum. The substrates generally used are glass microscope slides, which are first cleaned by washing and vapor-phase degreasing before being introduced into the vacuum chamber, where they are cleaned by ion bombardment with argon. Adjusting parameters such as the rate and time of evaporation regulates the thickness of the evaporated film. Many other variables have an impact on the properties of evaporated films. They include the nature and pressure of the residual gas in the chamber, the intensity of the atomic flux condensing on the surface, the nature of the target surface, the temperature of the evaporation source and the energy of the impinging atoms, and contamination of the surface by evaporated supporting material from the source.

Mechanical Polishing

Mechanical polishing is one of the most widely used methods of surface preparation. Generally, mechanical polishing consists of several steps. The specimens are initially abraded with silicon carbide paper of various grades, from 400 to 1000 grit. After that, successive fine mechanical polishing on silk disks with diamond powders and sprays (down to 1/10 µm grain size) is practiced to yield a mirror finish. This results in a bright and smooth surface suitable for optical measurements. However, it is well known that mechanical treatments change the structure of surface layers to varying depths depending on the material and the treatment. Mechanically polished surfaces do not have the same structural and optical properties as the bulk material. As a result of mechanical polishing, surface crystallinity is impaired. In addition, there exists a thin deformed layer extending from the surface to the interior due to the friction and heating caused by mechanical polishing. Such changes are revealed by the deviations of the optical properties of the surface from those of the bulk.
Hence mechanical polishing is utilized as the sole preparation method primarily if the specimen will be used as a substrate for ellipsometric study of some other surface reaction, e.g., adsorption, thermal or electrochemical formation of oxide films, or polymer electrodeposition.

Chemical Polishing

This method of surface preparation is generally used after mechanical polishing in order to remove the damaged layer produced by the mechanical action. A specimen is immersed in a bath for a controlled time and subsequently ultrasonically washed in an alcohol solution. The final cleaning is extremely important to remove bath components that may otherwise be trapped on the surface.
Various baths of different chemical compositions and immersion conditions (temperature of bath and time of treatment depending on the nature of the metal specimen) are described in the literature (Tegart, 1960). During chemical polishing, oxygen evolution is possible along with the chemical dissolution of the metal substrate. In such cases, the polished surfaces may lose their reflectivity, becoming less suitable for optical measurements. A major problem associated with chemical polishing is the possible formation of viscous surface layers that leave impurities on the surface. Care must be taken to identify and remove impurities like S, Cl, N, O, and C that might be introduced by the chemicals used.

Electrochemical Polishing

Electrochemical polishing is a better way to prepare metal surfaces than mechanical or chemical polishing. The metal is placed in an electrochemical cell as the anode, while a Pt or carbon electrode with a large surface area is used as the cathode. Passage of current causes dissolution of parts of the metal surface, the peaks dissolving at a greater rate than the valleys, which leads to planarization and hence a reflective surface.

Electrochemical Dissolution of Natural Oxide Films

Dissolution of naturally formed oxide films can be carried out by cathodic polarization. It is very important to determine an optimum value of the cathodic potential for each metal in order to minimize the thickness of the oxide layer. For potentials more anodic than the optimum value, the dissolution is very slow and the natural oxide film is not dissolved completely. Potentials more cathodic than the optimum increase the risk of hydrogen contamination at the surface, its penetration into the substrate, and its inclusion in the metal lattice. The formation of thin hydrated layers is also possible.
PROBLEMS

Errors and uncertainties can enter an ellipsometric measurement from various sources: the equipment, the measuring system, and the measurement process itself. Most of these errors are correctable with certain precautions and corrections during or after the experiment. A frequent source of error is the presence of a residual oxide layer on the surface of the film under study. The presence of even a very thin film (50 Å) can cause the measured refractive index to be markedly different from the actual substrate value (Archer, 1962). Hence, in the case of an absorbing substrate and a nonabsorbing thin film, four optical parameters need to be determined: the refractive index and the thickness of the film, n2 and d2, and the complex refractive index coefficients of the substrate, n3 and k3. Since only two parameters, Δ and ψ, are measured, it is not possible to determine all four parameters using a single measurement. To estimate the optical properties of the substrate, multiple measurements are made (So and Vedam, 1972) using (1) various film thickness values, (2) various angles of incidence, or (3) various incident media.
One set of substrate optical properties consistent with all the multiple readings can then be found. Errors in measurement also occur due to surface roughness in the substrate (or the film). Predictions of the optical parameters for surfaces with even very minor surface roughness (50 Å) have shown large errors in the determined refractive indices (Festenmaker and McCrackin, 1969). Hence, it is imperative that one has a good knowledge of the degree of impurities or abnormalities that a sample might have. Sample preparation is crucial to the accuracy of the results, and care must be taken to determine a method most suited for the particular application. The polarizer, compensator, and analyzer are all subject to possibly incorrect settings and hence azimuth errors. The analysis can be freed from azimuth errors by performing the analysis and averaging the results with components set at angles that are 90° apart, described as zone averaging by McCrackin et al. (1963). The magnitudes of Δ and ψ vary as a function of the angle of incidence, and several types of systematic errors encountered during angle-of-incidence measurements produce errors in Δ and ψ. Many studies suggest that the derived constants are not true constants, but depend on the incidence angle (Vasicek, 1960; Martens et al., 1963). The beam area divided by the cosine of the angle of incidence gives the area of the surface under examination; hence, even a small change in the incident angle significantly changes the area under examination. As a matter of convenience, it is suggested that ellipsometric measurements be made with an angle of incidence near the principal angle, i.e., the angle at which the relative phase shift between the parallel and perpendicular components of reflected light, Δ, is 90°.
The value of this principal angle depends mainly on the refractive index of the substrate, and lies between 60° and 80° for typical metal substrates. Other sources of error include those arising from the deflection of the incident light beam due to an imperfection in the polarizer, multiply reflected beams, and incompletely polarized incident light. The importance of reliable and accurate optical components should therefore be stressed, since most of these errors are tied to the quality of the equipment one procures. It should also be noted that the steps after the experimental determination of Δ and ψ are also susceptible to error. One typically has to input the values of Δ and ψ into a numerical algorithm to compute the optical properties, and one should know both the precision of the input data and the accuracy of the computation. There are also cases in which the approximations in the relations used to compute the physical parameters are more extensive than necessary and introduce more uncertainty than a conservative approach would.
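The angles Δ and ψ discussed above are defined through the ratio of the Fresnel reflection coefficients, ρ = rp/rs = tan ψ · e^(iΔ). The sketch below computes them for a bare interface between an incident medium and a (possibly absorbing) substrate; the indices and angles are illustrative assumptions, not values from the text, and sign conventions vary between texts.

```python
import cmath
import math

def delta_psi(n1, n2, phi1_deg):
    """Ellipsometric angles Delta and psi (in degrees) for reflection at a
    bare n1/n2 interface at angle of incidence phi1 (n2 may be complex)."""
    phi1 = math.radians(phi1_deg)
    cos1 = math.cos(phi1)
    # Snell's law gives the (generally complex) refraction-angle cosine.
    sin2 = n1 * math.sin(phi1) / n2
    cos2 = cmath.sqrt(1 - sin2 ** 2)
    # Fresnel reflection coefficients for p- and s-polarized light.
    rp = (n2 * cos1 - n1 * cos2) / (n2 * cos1 + n1 * cos2)
    rs = (n1 * cos1 - n2 * cos2) / (n1 * cos1 + n2 * cos2)
    rho = rp / rs  # rho = tan(psi) * exp(i * Delta)
    psi = math.degrees(math.atan(abs(rho)))
    delta = math.degrees(cmath.phase(rho)) % 360.0
    return delta, psi

# Transparent substrate below the Brewster angle: Delta comes out 180 deg.
d, p = delta_psi(1.0, 1.5, 20.0)
```

Scanning the angle of incidence for an absorbing substrate (e.g., an assumed n2 = 3.0 − 3.5j) and locating where Δ crosses 90° gives the principal angle discussed above.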
LITERATURE CITED

Archer, R. J. 1962. Determination of the properties of films on silicon by the method of ellipsometry. J. Opt. Soc. Am. 52:970.
Archer, R. J. 1964. Measurement of the physical adsorption of vapors and the chemisorption of oxygen and silicon by the method of ellipsometry. In Ellipsometry in the Measurement of Surfaces and Thin Films (E. Passaglia, R. R. Stromberg, and J. Kruger, eds.) pp. 255–272. National Bureau of Standards, Washington, D.C., Miscellaneous Publication 256.
Archer, R. J. 1968. Manual on Ellipsometry. Gaertner Scientific, Chicago.
Arsov, Lj. 1985. Dissolution electrochimique des films anodiques du titane dans l'acide sulfurique. Electrochim. Acta 30:1645–1657.
Azzam, R. M. A. and Bashara, N. M. 1977. Ellipsometry and Polarized Light. North-Holland Publishing, New York.
Brown, E. B. 1965. Modern Optics. Reinhold Publishing, New York.
Cahan, B. D. and Spainer, R. F. 1969. A high speed precision automatic ellipsometer. Surf. Sci. 16:166.
Drevillon, B., Perrin, J., Marbot, R., Violet, A., and Dalby, J. L. 1982. Fast polarization modulated ellipsometer using a microprocessor system for digital Fourier analysis. Rev. Sci. Instrum. 53(7):969–977.
Drude, P. 1889. Ueber Oberflächenschichten. Ann. Phys. 272:532–560.
Erfemova, A. T. and Arsov, Lj. 1992. Ellipsometric in situ study of titanium surfaces during anodization. J. Phys. France II:1353–1361.
Faber, T. E. and Smith, N. V. 1968. Optical measurements on liquid metals using a new ellipsometer. J. Opt. Soc. Am. 58(1):102–108.
Festenmaker, C. A. and McCrackin, F. L. 1969. Errors arising from surface roughness in ellipsometric measurement of the refractive index of a surface. Surf. Sci. 16:85–96.
Gagnaire, J. 1983. Ellipsometric study of anodic oxide growth: Application to the titanium oxide systems. Thin Solid Films 103:257–265.
Ghezzo, M. 1969. Method for calibrating the analyzer and the polarizer in an ellipsometer. Br. J. Appl. Phys. (J. Phys. D) 2:1483–1490.
Graves, R. H. W. 1971.
Ellipsometry using imperfect polarizers. Appl. Opt. 10:2679.
Hauge, P. S. and Dill, F. H. 1973. Design and operation of ETA, an automated ellipsometer. IBM J. Res. Dev. 17:472–489.
Hayfield, P. C. S. 1963. American Institute of Physics Handbook. McGraw-Hill, New York.
Heavens, O. S. 1955. Optical Properties of Thin Solid Films. Dover, New York.
Hristova, E., Arsov, Lj., Popov, B., and White, R. 1997. Ellipsometric and Raman spectroscopic study of thermally formed films on titanium. J. Electrochem. Soc. 144:2318–2323.
Jasperson, S. N. and Schnatterly, S. E. 1969. An improved method for high reflectivity ellipsometry based on a new polarization modulation technique. Rev. Sci. Instrum. 40(6):761–767.
Martens, F. P., Theroux, P., and Plumb, R. 1963. Some observations on the use of elliptically polarized light to study metal surfaces. J. Opt. Soc. Am. 53(7):788–796.
Mathieu, H. J., McClure, D. E., and Muller, R. H. 1974. Fast self-compensating ellipsometer. Rev. Sci. Instrum. 45:798–802.
McCrackin, F. L. 1969. A FORTRAN program for analysis of ellipsometer measurements. Technical Note 479. National Bureau of Standards, pp. 1–76.
McCrackin, F. L., Passaglia, E., Stromberg, R. R., and Steinberg, H. L. 1963. Measurement of the thickness and refractive index of very thin films and the optical properties of surfaces by ellipsometry. J. Res. Natl. Bur. Stand. A67:363–377.
Menzel, H. D. (ed.). 1960. Fundamental Formulas of Physics, Vol. 1. Dover Publications, New York.
Nick, D. C. and Azzam, R. M. A. 1989. Performance of an automated rotating-detector ellipsometer. Rev. Sci. Instrum. 60(12):3625–3632.
Ord, J. L. 1969. An ellipsometer for following film growth. Surf. Sci. 16:147.
Schmidt, E. 1970. Precision of ellipsometric measurements. J. Opt. Soc. Am. 60(4):490–494.
Shewchun, J. and Rowe, E. C. 1971. Ellipsometric technique for obtaining substrate optical constants. J. Appl. Phys. 41(10):4128–4138.
Smith, P. H. 1969. A theoretical and experimental analysis of the ellipsometer. Surf. Sci. 16:34–66.
So, S. S. and Vedam, K. 1972. Generalized ellipsometric method for the absorbing substrate covered with a transparent-film system. Optical constants for silicon at 3655 Å. J. Opt. Soc. Am. 62(1):16–23.
Steel, M. R. 1971. Method for azimuthal angle alignment in ellipsometry. Appl. Opt. 10:2370–2371.
Tegart, W. J. 1960. Polissage électrolytique et chimique des métaux au laboratoire et dans l'industrie. Dunod, Paris.
Vasicek, A. 1960. Optics of Thin Films. North-Holland Publishing, New York.
Winterbottom, A. W. 1946. Optical methods of studying films on reflecting bases depending on polarization and interference phenomena. Trans. Faraday Soc. 42:487–495.
Zaininger, K. H. and Revesz, A. G. 1964. Ellipsometry—a valuable tool in surface research. RCA Rev. 25:85–115.
KEY REFERENCES

Azzam, R. M. A. (ed.). 1991. Selected Papers on Ellipsometry. SPIE Milestone Series, Vol. MS 27. SPIE Optical Engineering Press, Bellingham, Wash.

A collection of many of the path-breaking publications up to 1990 in the field of ellipsometry and ellipsometric measurements. Gives a very good historical perspective of the developments in ellipsometry.

Azzam and Bashara, 1977. See above.

Gives an excellent theoretical basis and provides an in-depth analysis of the principles and practical applications of ellipsometry.

McCrackin et al., 1963. See above.

Provides a good explanation of the practical aspects of measuring the thickness and refractive indices of thin films. Includes development of a notation for identifying different null pairs for the polarizer/analyzer rotations. Also provides a method to calibrate the azimuth scales of the ellipsometer divided circles.

Passaglia, E., Stromberg, R. R., and Kruger, J. (eds.). 1964. Ellipsometry in the Measurement of Surfaces and Thin Films. Symposium Proceedings, Washington, D.C., 1963. National Bureau of Standards Miscellaneous Publication 256.

Presents practical aspects of ellipsometric measurements in various fields.
OPTICAL IMAGING AND SPECTROSCOPY
APPENDIX: GLOSSARY OF TERMS AND SYMBOLS

A      Analyzer setting with respect to the plane of incidence, deg
C      Compensator setting with respect to the plane of incidence, deg
d      Film thickness, cm
Ei     Amplitude of the incident electric wave
Er     Amplitude of the reflected electric wave
Ex     Amplitude of the electric wave along the x axis
i      √−1
k      Absorption index (imaginary part of the refractive index)
n      Real part of the refractive index
n̂      Complex refractive index
P      Polarizer setting with respect to the plane of incidence, deg
Q      Compensator setting with respect to the plane of incidence, deg
R      Reflectivity
rp     Fresnel reflection coefficient for light polarized parallel to the plane of incidence
rs     Fresnel reflection coefficient for light polarized perpendicular to the plane of incidence
Δ      Relative phase change, deg
δ      Change of phase of the beam crossing the film
ε̂      Complex dielectric function
λ      Wavelength, cm
ν      Frequency, s⁻¹
ρ      Ratio of complex reflection coefficients
σ̂      Complex optical conductivity
φ      Angle of incidence, deg
tan ψ  Relative amplitude attenuation
ω      Angular velocity, rad/s

LJ. ARSOV
University Kiril and Metodij
Skopje, Macedonia
M. RAMASUBRAMANIAN B. N. POPOV University of South Carolina Columbia, South Carolina
IMPULSIVE STIMULATED THERMAL SCATTERING

INTRODUCTION

Impulsive stimulated thermal scattering (ISTS) is a purely optical, non-contacting method for characterizing the acoustic behavior of surfaces, thin membranes, coatings, and multilayer assemblies (Rogers et al., 2000a), as well as bulk materials (Nelson and Fayer, 1980). The method has emerged as a useful tool for materials research in part because: (1) it enables accurate, fast, and nondestructive measurement of important acoustic (direct) and elastic (derived) properties that can be difficult or impossible to evaluate in thin films using other techniques; (2) it can be applied to a wide range of materials that occur in
microelectronics, biotechnology, optics, and other areas of technology; and (3) it does not require specialized test structures or impedance-matching fluids, which are commonly needed for conventional mechanical and acoustic tests. Further, recent advances in experimental design have simplified the ISTS measurement dramatically, resulting in straightforward, low-cost setups, and even in the development of a commercial ISTS photoacoustic instrument that requires no user adjustments of lasers or optics. With this tool, automated single-point measurements, as well as scanning-mode acquisition of images of acoustic and other physical properties, are routine. ISTS, which is based on transient grating (TG) methods (Nelson and Fayer, 1980; Eichler et al., 1986), is a spectroscopic technique that measures the acoustic properties of thin films over a range of acoustic wavelengths. It uses mild heating produced by crossed picosecond laser pulses to launch coherent, wavelength-tunable acoustic modes and thermal disturbances. The time-dependent surface ripple produced by these motions diffracts a continuous-wave probing laser beam that is overlapped with the excited region of the sample. Measuring the temporal variation of the intensity of the diffracted light yields the frequencies and damping rates of acoustic waves that propagate in the plane of the film. It also determines the arrival times of acoustic echoes generated by subsurface reflections of longitudinal acoustic wavepackets that are launched at the surface of the film. The wavelength dependence of the acoustic phase velocities (i.e., the product of the frequency and the wavelength) of in-plane modes, which is known as the acoustic dispersion, is determined either from a single measurement that involves the excitation of acoustic waves with a well-defined set of wavelengths, or from the combined results of a series of measurements that each determine the acoustic response at a single wavelength.
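Since the phase velocity is simply the product of frequency and wavelength, assembling a dispersion curve from a series of single-wavelength measurements is straightforward bookkeeping. The sketch below uses hypothetical (wavelength, frequency) pairs, not values from the text:

```python
# Hypothetical (acoustic wavelength in um, frequency in MHz) pairs from a
# series of single-wavelength ISTS measurements (illustrative values only).
data_um_mhz = [(30.0, 110.0), (20.0, 165.0), (12.0, 270.0), (6.0, 520.0)]

# Phase velocity (m/s) is the product of frequency (Hz) and wavelength (m).
dispersion = [(lam_um, lam_um * 1e-6 * f_mhz * 1e6)
              for lam_um, f_mhz in data_um_mhz]
# Each entry: (acoustic wavelength in um, phase velocity in m/s)
```

A plot of the second column against the first is the acoustic dispersion that is later fit with waveguide models.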
Interpreting this dispersion with suitable models of the acoustic waveguide physics yields viscoelastic (e.g., Young's modulus, Poisson's ratio, acoustic damping rates, stress, etc.) and/or other physical properties (e.g., density, thickness, presence or absence of adhesion, etc.) of the films. The acoustic echoes, which are recorded in the same measurements, provide additional information that can simplify this modeling. The ISTS data also generally contain information from nonacoustic (e.g., thermal, electronic, etc.) responses. This unit, however, focuses only on ISTS measurement of acoustic motions in thin films. It begins with an overview of other related measurement techniques. It then describes the ISTS acoustic data and demonstrates how it can be used to determine: (1) the stress and flexural rigidity in thin membranes (Rogers and Bogart, 2000; Rogers et al., 2000b; Rogers and Nelson, 1995); (2) the elastic constants of membranes and supported films (Rogers and Nelson, 1995; Duggal et al., 1992; Shen et al., 1996; Rogers and Nelson, 1994; Rogers et al., 1994a); and (3) the thicknesses of single or multiple films in multilayer stacks (Banet et al., 1998; Gostein et al., 2000).

Competitive and Related Techniques

Acoustic properties of thin films are most commonly evaluated with conventional ultrasonic tests, which involve a
source of ultrasound (e.g., a transducer), a propagation path, and a detector. The ability to excite and detect acoustic waves with wavelengths that are short enough for thin film evaluation (i.e., wavelengths comparable to or smaller than the film thickness) requires high-frequency transducers/detectors fabricated directly on the sample, or coupled efficiently to it with impedance-matching liquids or gels. Both of these approaches restrict the range of structures that can be examined; they limit the usefulness of conventional acoustic tests for thin film measurement. Photoacoustic methods overcome the challenge of acoustic coupling by using laser light to excite and probe acoustic disturbances without contacting the sample. In addition to ISTS, there are two other general classes of photoacoustic techniques for measuring acoustics in thin films. In the first, a single excitation pulse arrives at the surface of the sample and launches, through mild heating, a longitudinal (i.e., compressional) acoustic wavepacket that propagates into the depth of the structure (Thomsen et al., 1986; Eesley et al., 1987; Wright and Kawashima, 1991). Parts of this acoustic disturbance reflect at buried interfaces, such as the one between the film and its support or between films in a complex multilayer stack. A variably delayed probe pulse measures the time dependence of the optical reflectivity or the slope of the front surface of the sample in order to determine the time of arrival of the various acoustic echoes. Data from this type of measurement are similar in information content to the acoustic echo component of the ISTS signal. In both cases, the data reveal the out-of-plane longitudinal acoustic velocities when the thicknesses of the films are known. The measured acoustic reflectivity can also be used to determine properties (e.g., density) that are related to the change in acoustic impedance that occurs at the interface. 
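For echo measurements of the kind just described, the reduction to an out-of-plane velocity and modulus is direct: the round-trip time t_rt through a film of known thickness h gives v_o = 2h/t_rt, and with the density ρ the compressional modulus follows as c_o = ρv_o², the relation used later in this unit. The numbers below are assumed purely for illustration:

```python
def out_of_plane_modulus(thickness_m, roundtrip_s, density_kg_m3):
    """Out-of-plane longitudinal velocity (m/s) and compressional modulus
    (Pa) from an acoustic echo: v_o = 2 * h / t_rt, c_o = rho * v_o**2."""
    v_o = 2.0 * thickness_m / roundtrip_s
    return v_o, density_kg_m3 * v_o ** 2

# Assumed values: 200-nm film, 80-ps echo round trip, density 8900 kg/m^3.
v_o, c_o = out_of_plane_modulus(200e-9, 80e-12, 8900.0)
```

Note that the thickness must be known independently here; conversely, if the velocity is known, the same relation yields the film thickness from the echo timing.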
This technique has the disadvantage that it requires the excitation pulses to be strongly absorbed by the sample. It also typically relies on expensive and complex femtosecond laser sources and relatively slow detection schemes that use probe pulses. Although it can be used to measure thicknesses accurately, it does not yield information on transversely polarized (i.e., shear) acoustic waves or on modes that propagate in the plane of the film. As a result, the only elastic property that can be evaluated easily is the out-of-plane compressional modulus. Another method uses a cylindrically focused excitation pulse as a broadband line source for surface propagating waves (Neubrand and Hess, 1992; Hess, 1996). Examining the changes in position, intensity, or phase of at least one other laser beam that strikes the sample at a location spatially separated from the excitation region provides a means for probing these waves. The data enable reliable measurement of surface acoustic wave velocities over a continuous range of acoustic wavelengths (i.e., the dispersion) when the separation between the excitation and probing beams (or between the probing beams themselves) is known precisely. The measured dispersion can be used, with suitable models for the acoustic physics, to extract other properties (e.g., density, thickness, elastic properties) of the samples. This technique has the disadvantage that out-of-plane acoustic properties are not probed directly. Also, data that include multiple acoustic velocities at a single wavelength (e.g., multiple modes in an acoustic waveguide) can be difficult to interpret. We note that this method and the one described in the previous paragraph share the "source, propagation path, receiver" approach of conventional acoustic testing techniques. They are similar also in their generation of an essentially single-cycle acoustic pulse or "wavepacket" that includes a wide range of wavevectors and corresponding frequencies. The combined information from these two techniques is present in the ISTS data, in a separable and more easily analyzed form. Although it is not strictly a photoacoustic method, surface Brillouin scattering (SBS; Nizzoli and Sandercock, 1990) is perhaps more closely related to ISTS than the techniques described above. An SBS experiment measures the spectral properties of light scattered at a well-defined angle from the sample. The spectrum reveals the frequencies of incoherent, thermally populated acoustic modes with wavelengths that satisfy the associated phase-matching condition for scattering into the chosen angle. The information obtained from SBS and ISTS measurements is similar. An important difference is that the ISTS technique uses coherent, laser-excited phonons rather than incoherent, thermally populated ones. ISTS signals are therefore much stronger than those in SBS, and they can be detected rapidly in the time domain. This form of detection enables acoustic damping rates, for example, to be evaluated accurately without the deconvolution procedures that are necessary to interpret spectra collected with the sensitive Fabry-Perot filters that are commonly used in SBS. Also, with ISTS it is possible simultaneously to excite and monitor acoustic modes with more than one wavelength, to determine their phases, and to measure them in real time as they propagate across the sample.
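The time-domain character of ISTS detection mentioned above can be illustrated with a short numerical sketch: a signal of the functional form given later under Principles of the Method (a slow thermal decay plus damped acoustic cosines, with made-up parameters) yields the mode frequency directly from its power spectrum.

```python
import numpy as np

# Illustrative (not measured) parameters: one thermal decay plus two
# damped acoustic modes, as in the damped-cosine signal model.
t = np.arange(0.0, 200e-9, 0.05e-9)             # 200-ns record, 50-ps steps
thermal = 2.0 * np.exp(-5e6 * t)                # slow thermal-grating decay
modes = [(1.0, 250e6, 4e6), (0.5, 410e6, 6e6)]  # (B_i, f_i in Hz, gamma_i)

response = thermal.copy()
for b, f, g in modes:
    response += b * np.exp(-g * t) * np.cos(2.0 * np.pi * f * t)

# The power spectrum of the response peaks at the acoustic mode frequencies.
spectrum = np.abs(np.fft.rfft(response - response.mean())) ** 2
freqs = np.fft.rfftfreq(t.size, d=t[1] - t[0])
peak_hz = freqs[np.argmax(spectrum)]            # strongest mode, ~250 MHz
```

In nonheterodyne detection the recorded signal is proportional to the square of this response, so cross terms at sum, difference, and doubled frequencies also appear in the spectrum, a point taken up again later in this unit.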
These and other capabilities, which are useful for accurately evaluating films or other structures with dimensions comparable to the acoustic wavelength, are absent from traditional forms of SBS. Finally, in some cases, certain properties that can be derived from ISTS measurements (e.g., elastic constants, density, thickness, stress, etc.) can be determined with other, nonacoustic methods. Although a complete description of all of the possible techniques is beyond the scope of this unit, we list a few of the more established methods.

1. Elastic constants can be determined with uniaxial pull-testers, nanoindenters (Pharr and Oliver, 1992), and specialized micromechanical test structures (Allen et al., 1987).

2. Stress, and sometimes elastic constants, are typically measured with tests that use deflections or vibrations of drumhead membranes (Maden et al., 1994; Vlassak and Nix, 1992) or cantilevered beams (Mizubayashi et al., 1992), or these are inferred from strain evaluated using x-ray diffraction (Clemens and Bain, 1992; also see X-RAY AND NEUTRON DIFFUSE SCATTERING MEASUREMENTS).

3. Thicknesses of transparent films are often determined with reflectometers or ellipsometers (see ELLIPSOMETRY). For opaque films, thickness is evaluated
using stylus profilometry or grazing incidence x-ray reflection; in the case of conducting films, it is determined indirectly from measurements of sheet resistance. Vinci and Vlassak (1996) present a review of techniques for measuring the mechanical properties of thin films and membranes.
PRINCIPLES OF THE METHOD

As mentioned above (see Introduction), ISTS uses short (relative to the time scale of the material response of interest) pulses of light from an excitation laser to stimulate acoustic motions in a sample. The responses are measured with a separate probing laser. Figure 1 schematically illustrates the mechanisms for excitation and detection in the simplest case. Here, a single pair of excitation pulses crosses at an angle θ at the surface of a thin film on a substrate. The optical interference of these pulses produces a sinusoidal variation in intensity with a period Λ given by

Λ = λe / [2 sin(θ/2)]    (1)
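Equation 1 can be evaluated directly. The sketch below (with assumed example numbers, not values from the text) gives the interference period, and hence the acoustic wavelength, for a given excitation wavelength and crossing angle:

```python
import math

def grating_period(wavelength, crossing_angle_deg):
    """Interference period from Equation 1: Lambda = lambda_e / (2 sin(theta/2)).
    Returns the period in the same units as the excitation wavelength."""
    theta = math.radians(crossing_angle_deg)
    return wavelength / (2.0 * math.sin(theta / 2.0))

# Example: 532-nm pulses crossed at 6.1 degrees give a period of about 5 um.
period_nm = grating_period(532.0, 6.1)
```

Note that smaller crossing angles produce longer acoustic wavelengths, which is how the wavelength is tuned in practice.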
The wavelength of the excitation light (λe) is chosen so that it is partly absorbed by the sample; this absorption induces heating in the geometry of the interference pattern. The resulting spatially periodic thermal expansion launches coherent, monochromatic counter-propagating surface acoustic modes with wavelength Λ. It also simultaneously generates acoustic waves that propagate into the bulk of the sample and produce the acoustic echoes mentioned previously (Shen et al., 1996; Crimmins et al., 1998). In thin films, the former and latter responses typically occur on nanosecond and picosecond time scales, respectively. On microsecond time scales, the thermally induced strain slowly relaxes via thermal diffusion. The temporal nature of the surface motions that result from these responses can be determined in real time with each shot of the excitation laser by measuring the intensity of diffraction of a continuous wave probe laser with a fast detector and transient recorder. For motions that exceed the bandwidth of conventional detection electronics (typically 1 to 2 GHz), it is possible to use either variably delayed picosecond probing pulses to map out the response in a point-by-point fashion (Shen et al., 1996; Crimmins et al., 1998) or a continuous wave probe with a streak camera for ultrafast real-time detection. It is also possible to use, instead of the streak camera, a scanning spectral filter to resolve the motions in the frequency domain (Maznev et al., 1996). This unit focuses primarily on the in-plane acoustic modes, partly because they can be excited and detected rapidly in real time using commercially available, low-cost laser sources and electronics. Also, these modes provide a route to measuring a wide range of thin film elastic constants and other physical properties. The following functional form describes the response, R(t), when the thermal and the in-plane acoustic motions both contribute:

R(t) ∝ Atherm exp(−τt) + Σi Bi exp(−γi t) cos(ωi t)    (2)
Figure 1. Schematic illustration of the ISTS measurement. Crossed laser pulses form an optical interference pattern and induce coherent, monochromatic acoustic and thermal motions with wavelengths that match this pattern. The angle between the pulses determines the wavelength of the response. A continuous wave probing laser diffracts from the ripple on the surface of the sample. Measuring the intensity of the diffracted light with a fast detector and transient recorder reveals the time dependence of the motions.
where Atherm and Bi are the amplitudes of the thermal and acoustic responses, respectively, and the ωi are the frequencies of the acoustic modes. The summation extends over the excited acoustic modes, i, of the system. The thermal and acoustic decay rates are, respectively, τ and γi. In the simplest diffraction-based detection scheme, the measured signal is proportional to the diffraction efficiency, which is determined by the square of R(t). These responses, as well as the acoustic echoes, are all recorded in a single measurement of the area of the sample that is illuminated by the excitation and probing lasers. The ISTS measurement provides spatial resolution that is comparable to this area, which is typically a circular or elliptical region with a characteristic dimension (e.g., diameter or major axis) of 50 to 500 μm. In certain situations, the effective resolution can be significantly better than this length scale (Gostein et al., 2000). Figure 2 shows data from a polymer film on a silicon substrate. The onset of diffraction coincides with the arrival of the excitation pulses at t = 0. The slow decay of signal is associated with thermal diffusion (Fig. 2B); the oscillations in Fig. 2A are due to acoustic modes that propagate
Figure 2. Typical ISTS data from a thin polymer film (thickness 4 μm) on a silicon substrate. Part (A) shows the in-plane acoustic response, which occurs on a nanosecond time scale and has a single well-defined wavelength (8.32 μm) determined by the color and crossing angle of the excitation pulses. The oscillations in the signal reveal the frequencies of the different acoustic waveguide modes that are excited in this measurement. The inset shows the power spectrum. The frequencies are determined by the acoustic wavelength and the mechanical properties of the film, the substrate, and the nature of the interface between them. The acoustic waves eventually damp out and leave a nonoscillatory component of signal that decays on a microsecond time scale (B). This slow response is associated with the thermal grating. Its decay rate is determined by the wavelength of the response and the thermal diffusivity of the structure. The dashed lines in (B) indicate the temporal range displayed in (A).
in the plane of the film. The frequencies and damping rates of the acoustic modes, along with the acoustic wavelength determined from Equation 1, define the real and imaginary parts of the phase velocities. On a typically faster (picosecond) time scale it is also possible to resolve responses due to longitudinal waves that reflect from the film/substrate interface. Figure 3 shows both types of acoustic responses evaluated in a single measurement on a thin metal film on a silicon substrate.

Figure 3. Typical ISTS data from an ultrathin metal film (thickness 200 nm) on a silicon substrate. Part (A) shows the arrival of two acoustic echoes produced by longitudinal acoustic wavepackets that are launched at the surface of the film and reflect at the interface between the film and the substrate and at the interface between the film and the surrounding air. Propagation of acoustic modes in the plane of the film causes oscillations in the signal on a nanosecond time scale (B). Thermal diffusion produces the overall decay in signal that occurs on a nanosecond time scale. The dashed lines in (B) indicate the temporal range displayed in (A).

Although the measured acoustic frequencies and echoes themselves can be important (e.g., for filters that use surface acoustic waves or thin film acoustic resonances, respectively), the intrinsic elastic properties of the films are often of interest. The acoustic echoes yield, in a simple way, the out-of-plane compressional modulus, co, when the density, ρ, and thickness, h, are known. The measured round-trip time in this case defines, with the thickness, the out-of-plane longitudinal acoustic velocity, vo; the modulus is co = ρvo². Extracting moduli from the in-plane acoustic responses is more difficult because thin films form planar acoustic waveguides that couple in- and out-of-plane compressional and shearing motions (Farnell and Adler, 1972; Viktorov, 1976). An advantage of this characteristic is that, in principle, the dispersion of the waveguide modes can be used to determine a set of anisotropic elastic constants as well as film thicknesses and densities. Determining these properties requires an accurate measurement of the dispersion of the waveguide and a detailed understanding of how
Figure 4. Power spectra from data collected in ISTS measurements on a thin membrane at several different acoustic wavelengths between 30 μm and 6 μm, labeled 1 to 8. The variation of the frequency with wavelength defines the acoustic dispersion. Analyzing this dispersion with physical models of the waveguide acoustics of the membrane yields the elastic constants and other properties.
acoustic waves propagate in layered systems. There are two approaches for determining the dispersion. One involves a series of measurements with different angles between the excitation pulses to determine the acoustic response as a function of wavelength. Figure 4 shows the results of measurements that determine the dispersion of a thin unsupported membrane using this method (Rogers and Bogart, 2000). The other approach uses a single measurement performed with specialized beam-shaping optics that generate more than two excitation pulses (Rogers, 1998). Figure 5 shows data that were collected in an
Figure 5. ISTS data from a thin film of platinum on a silicon wafer. This measurement used a system of optics to generate and cross six excitation pulses at the surface of the sample. The complex acoustic disturbance launched by these pulses is characterized by six different wavelengths. The power spectrum reveals the frequency components of the signal. A single measurement of this type defines the acoustic dispersion.
Figure 6. Optical system for ISTS measurements. Pulses from an excitation laser pass through a beam-shaping optic that splits the incident pulse into an array of diverging pulses. This optic contains a set of patterns, each of which produces a different number and distribution of pulses. A pair of imaging lenses crosses some of the pulses at the surface of a sample. The optical interference pattern produced in this way defines the wavelengths of acoustic motions stimulated in the sample. Diffraction of a probe laser is used to monitor the time dependence of these motions. A pair of lenses collects the diffracted signal light and directs it to a fast photodetector. The wavelengths of the acoustic waves can be changed simply by translating the beam-shaping optic.
ISTS measurement with six crossed excitation pulses. The response in this case contains contributions from acoustic modes with six different wavelengths, which are defined by the geometry of the six-pulse interference pattern. In both cases, the size of the excitation region is chosen, through selection of appropriate lenses, to be at least several times larger than the longest acoustic wavelength. This geometry ensures precise definition of the spatial periods of the interference patterns and, therefore, of the acoustic wavelengths. The simplicity and general utility of the modern form of ISTS derives from engineering advances in optical systems that enable rapid measurement of acoustic dispersion with the two approaches described above (Rogers et al., 1997; Maznev et al., 1998; Rogers and Nelson, 1996; Rogers, 1998). Figure 6 schematically illustrates one version of the optical setup. Specially designed, phase-only beam shaping optics produce, through diffraction, a pair or an array of excitation pulses from a single incident pulse. A selected set of the diffracted pulses pass through a pair of imaging lenses and cross at the sample surface. Their interference produces simple or complex intensity patterns with geometries defined by the beam shaping optic and the imaging lenses. For the simple two-beam interference described by Equation 1, this optic typically
IMPULSIVE STIMULATED THERMAL SCATTERING
consists of a binary phase grating optimized for diffraction at the excitation wavelength. Roughly 80% of the light that passes through this type of grating diffracts into the +1 and −1 orders. The imaging lenses cross these two orders at the sample to produce a sinusoidal intensity pattern whose period, when the magnification of the lenses is unity, is half the period of the grating. More complex beam-shaping optics produce more than two excitation pulses and, therefore, interference patterns that are characterized by more than one period. In either case, a useful beam-shaping optic contains many (20 to 50) different spatially separated diffracting patterns. The excitation geometry can then be adjusted simply by translating this optic so that the incident excitation pulses pass through different patterns. With this approach, the response of the sample at many different wavelengths can be determined rapidly, without moving any of the other elements in the optical system. Detection is accomplished by imaging one or more diffracted probe laser beams onto the active area of a fast detector. When the intensity of these beams is measured directly, the signal is proportional to the product of the square of the amplitude of the out-of-plane acoustic displacements (i.e., the square of Equation 2) and the intensity of the probing light. Heterodyne detection approaches, which measure the intensity of the coherent optical interference of the signal beams with collinear reference beams generated from the same probe laser, provide enhanced sensitivity. They also simplify data interpretation, since the heterodyne signal, S(t), is linear in the material response. In particular, in the limit that the intensity of the reference beam, Ir, is large compared to the diffracted signal,

S(t) ∝ |√Ip R(t) + √Ir e^(iφ)|² ≈ Ir + 2√(Ip Ir) R(t) cos φ    (3)
where φ is the phase difference between the reference and diffracted beams and Ip is the intensity of the probing beam. A general scheme for heterodyne detection that uses the beam-shaping optic to produce the excitation pulses, as well as to generate the reference beam, is a relatively new and remarkably simple approach that makes this sensitive detection method suitable for routine use (Maznev et al., 1998; Rogers and Nelson, 1996). Figure 7 schematically illustrates the optics for measurements on transparent samples; a similar setup can be used in reflection mode. With heterodyne and nonheterodyne detection, peaks in the power spectrum of the signal define frequencies of the acoustic responses. In the case of nonheterodyne signals, sums and differences and twice these frequencies (i.e., cross terms that result from squaring Equation 2) also appear. Figure 8 compares responses measured with and without heterodyne detection. The measured dispersion can be combined with models of the waveguide acoustics to determine intrinsic material properties. The general equation of motion for a nonpiezoelectric material is given by

∂²uj/∂t² = (cijkl/ρ) ∂²uk/∂xi∂xl + (1/ρ) ∂/∂xi [σ(r)ik ∂uj/∂xk]    (4)
Figure 7. Optical system for ISTS measurements with heterodyne detection. Both the probe and the excitation laser beams pass through the same beam-shaping optic. The excitation pulses are split and recombined at the surface of the sample to launch the acoustic waves. The probe light is also split by this optic into more than one beam. For this illustration, the beam shaping optic produces a single pair of probing and excitation pulses. One of the probe beams is used to monitor the material response. The other emerges collinear with the signal light and acts as a reference beam for heterodyne amplification of signal generated by diffraction of the beam used for probing. The beam-shaping optic and imaging lenses ensure overlap of the signal and reference light. An optical setup similar to the one illustrated here can be used for measurements in reflection mode.
where c is the elastic stiffness tensor, u is the displacement vector, ρ is the density of the material, and σ(r) is the residual stress tensor. Solutions to this equation, with suitable boundary conditions at all material interfaces, define the dispersion of the velocities of waveguide modes in arbitrary systems (Farnell and Adler, 1972; Viktorov, 1976). Inversion algorithms based on these solutions can be used to determine the elastic constants, densities, and/or film thicknesses from the measured dispersion. For thickness determination, the elastic constants and densities are typically assumed to be known; they are treated as fixed parameters in the inversion. Similarly, for elastic constant evaluation, the thicknesses and densities are treated as fixed parameters. It is important to note, however, that only the elastic constants that determine the velocities of modes with displacement components in the vertical (i.e., sagittal) plane can be determined, because ISTS probes only these modes. Elastic constants that govern the propagation of shear acoustic waves polarized in the plane of the film, for example, cannot be evaluated. Figure 9 illustrates calculated distributions of displacements in the lowest six sagittal modes for a polymer film that is strongly bonded to a silicon substrate. As this figure illustrates, the modes involve coupled in- and out-of-plane shearing and compressional motions in the film and the substrate. The elastic constants and densities of the film and substrate materials and the thickness of the film determine the spatial characters and velocities of the modes. Their relative contributions to the ISTS signal are determined by their excitation efficiency and by their diffraction efficiency; the latter is dictated by the amount of surface ripple that is associated with their motion.
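The practical difference between direct and heterodyne detection described above can be illustrated with a short numerical sketch. All parameters below (response model, sampling rate, frequencies, intensities) are illustrative assumptions, not values from the text; the point is only that the heterodyne signal of Equation 3 is linear in R(t), while the squared, nonheterodyne signal redistributes spectral weight to combination frequencies:

```python
import numpy as np

# Toy ISTS response (illustrative parameters, not from the text): a decaying
# thermal background plus one damped acoustic oscillation, R(t).
fs = 2.0e9                          # sampling rate, Hz
t = np.arange(0, 2.0e-6, 1.0 / fs)  # 2-microsecond record
f_ac = 50.0e6                       # acoustic frequency, Hz
R = np.exp(-t / 1.0e-6) + 0.5 * np.exp(-t / 4.0e-7) * np.cos(2 * np.pi * f_ac * t)

Ip, Ir, phi = 1.0, 100.0, 0.0       # probe intensity, reference intensity, phase

direct = Ip * R**2                                    # nonheterodyne: quadratic in R(t)
het = Ir + 2.0 * np.sqrt(Ip * Ir) * np.cos(phi) * R   # Equation 3, large-Ir limit

freqs = np.fft.rfftfreq(t.size, 1.0 / fs)

def amp(signal, f0):
    """Spectral amplitude of the signal at the bin nearest f0."""
    spec = np.abs(np.fft.rfft(signal - signal.mean()))
    return spec[np.argmin(np.abs(freqs - f0))]

# The heterodyne spectrum is dominated by the acoustic frequency itself;
# squaring the response (nonheterodyne case) shifts relative weight to the
# second harmonic 2*f_ac, one of the cross terms discussed in the text.
print(amp(het, f_ac) > amp(het, 2 * f_ac))                # True
print(amp(direct, 2 * f_ac) / amp(direct, f_ac)
      > amp(het, 2 * f_ac) / amp(het, f_ac))              # True
```

The broadening of the nonheterodyne acoustic peak by the decaying thermal background, noted in the caption of Figure 8, also appears in this toy model.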
PRACTICAL ASPECTS OF THE METHOD

The setup illustrated in Figure 6 represents the core of the experimental apparatus. The entire system can either
OPTICAL IMAGING AND SPECTROSCOPY
Figure 8. ISTS signal and power spectrum of the signal from an electroplated copper film (thickness 1.8 μm) on a layer of silicon dioxide (thickness 0.1 μm) on a silicon substrate, at an acoustic wavelength of 7 μm. Part (A) shows the signal obtained without heterodyne amplification. The fact that the signal is quadratic with respect to the material displacement results in what appears as a fast decay of the acoustic oscillations. This effect is caused by the decay of the background thermal response rather than a decay in the acoustic component of the signal itself. Part (B) shows the heterodyne signal; the spectrum contains two peaks corresponding to Rayleigh and Sezawa waves, i.e., the two lowest-order modes of the layered structure. The signal without heterodyning shows these frequencies and combinations of them, due to the quadratic dependence of the signal on the response. The peak width in the spectrum of the signal in (A) is considerably larger than that in (B). This effect is due to the contribution of the thermal decay to the peak width in the nonheterodyne case. Note also that the heterodyne signal appears to drop below the baseline signal measured before the excitation pulses arrive. This artifact results from the insensitivity of the detector to DC levels of light.
be obtained commercially or most of it can be assembled from conventional, off-the-shelf components (lenses, lasers, etc.). The beam-shaping optics can be obtained as special order parts from commercial digital optics vendors. Alternatively, they can be custom built using relatively simple techniques for microfabrication (Rogers, 1998).
Figure 9. Computed displacement distributions for the lowest six waveguide modes in a polymer film supported by a silicon substrate. These modes are known as Rayleigh or Sezawa modes. They are excited and probed efficiently in ISTS measurements because they involve sagittal displacements that couple strongly both to the thermal expansion induced by the excitation pulses and to surface ripple, which is commonly responsible for diffracting the probe laser. The spatial nature of these modes and their velocities are defined by the mechanical properties of the film and the substrate, the nature of the interface between them, and the ratio of the acoustic wavelength to the film thickness.
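Equation 4 is the starting point for the waveguide calculations behind Figure 9. As a minimal sketch, for plane waves u ~ exp[i(k·x − ωt)] in a homogeneous isotropic solid with no residual stress, Equation 4 reduces to the 3 × 3 Christoffel eigenvalue problem ρv²uj = cijkl ni nl uk. The elastic constants and density below are illustrative (roughly fused-silica-like), not values from the text:

```python
import numpy as np

# Christoffel eigenproblem obtained from Equation 4 for plane waves in a
# homogeneous solid with no residual stress:
#   rho * v^2 * u_j = c_ijkl * n_i * n_l * u_k,  n = propagation direction.
# Illustrative isotropic constants (roughly fused-silica-like, assumed):
c11, c44 = 78.5e9, 31.2e9        # Pa
c12 = c11 - 2.0 * c44            # isotropy condition
rho = 2200.0                     # kg/m^3

# Full stiffness tensor of an isotropic solid:
# c_ijkl = c12*d_ij*d_kl + c44*(d_ik*d_jl + d_il*d_jk)
d = np.eye(3)
C = (c12 * np.einsum('ij,kl->ijkl', d, d)
     + c44 * (np.einsum('ik,jl->ijkl', d, d) + np.einsum('il,jk->ijkl', d, d)))

def velocities(n):
    """Phase velocities (m/s) of the three bulk modes along unit vector n."""
    gamma = np.einsum('i,ijkl,l->jk', n, C, n)   # symmetric Christoffel matrix
    return np.sqrt(np.linalg.eigvalsh(gamma) / rho)

v = velocities(np.array([0.0, 0.0, 1.0]))
# Isotropic solid: two degenerate shear speeds sqrt(c44/rho) and one
# longitudinal speed sqrt(c11/rho), independent of direction.
print(v)
```

The film/substrate waveguide modes of Figure 9 are obtained by matching such bulk solutions across interfaces with the appropriate boundary conditions, which is beyond this sketch.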
The excitation pulses are typically derived from a flash-lamp- or diode-pumped Nd:YAG laser, although any source of laser pulses with good coherence and mode properties can be used. The pulse duration must be short compared to the temporal period of the excited acoustic wave. In many instances, pulses shorter than 300 psec
are suitable. Pulse energies in the range of one or a few μJ provide adequate excitation for most samples. Nonlinear crystals can be used to double, triple, quadruple, or even quintuple the output of the Nd:YAG (wavelength 1064 nm) to produce 532-, 355-, 266-, or 213-nm light. The color is selected so that a sufficient fraction of the incident light is absorbed by the sample. Ultraviolet (UV) light is generally useful in this regard, although the expense and experimental complexity of using multiple nonlinear crystals represent a disadvantage of UV light when a Nd:YAG laser is used. Nonlinear frequency conversion can be avoided by depositing thin absorbing films onto samples that are otherwise too transparent or reflective to examine directly at the fundamental wavelength of the laser. The probe laser can be pulsed or continuous wave. For most of the data presented in this unit, we used a continuous-wave infrared (850 nm) diode laser with a power of 200 mW. Its output is electronically gated so that it emits only during the material response, which typically lasts no longer than 100 μsec after the excitation pulses strike the sample. Alignment of the optics (which is performed at the factory for commercial instruments) ensures that the probing beam overlaps the crossed excitation beams and that signal light reaches the detector. The size of the beams at the crossing point is typically in the range of one to several hundred microns. The alignment of the probe laser can be aided by the use of a pinhole aperture to locate the crossing point of the excitation beams. Routine measurement is accomplished by placing the surface of the sample at the intersection of the excitation and probing laser beams, moving the beam-shaping optic to the desired pattern(s), recording data, and interpreting the data. The sample placement is most easily achieved by moving the sample along the z direction (Fig.
6) to maximize the measured signal, which can be visualized in real time on a digitizing oscilloscope. The surface normal of the sample is adjusted to be parallel to the bisector of the excitation beams. Recorded data typically consist of an average of responses measured for a specified number of excitation-probing events. The strength of the signal, the required precision, and the necessary measurement time determine the number of averages: between one hundred and one thousand is typical.

NQR, unlike NMR, is applicable only to nuclei with spin >1/2 rather than to spin-1/2 nuclei. The internal structure of materials can also be imaged by x-ray tomography and ultrasonic methods, which depend on the existence of inhomogeneities in the distribution of x-ray absorbers or acoustic properties, respectively. The stress and pressure dependences of the NQR frequency are generally much stronger than those of the NMR frequency, and therefore NQR is a superior method for mapping the temperature or stress distribution across a sample. Such stress distributions can be determined for optically transparent materials by using polarized light; for opaque materials, there may be no alternative methods. This unit aims to be a general and somewhat cursory review of the theory and practice of modern one-dimensional (1D) and two-dimensional (2D) NQR and NQR imaging. It will certainly not be comprehensive. For a more general grounding in the theory of the method, one may best go back to the classic works of Abragam (1961) and Das and Hahn (1958); more modern reviews are referenced elsewhere in the text.

PRINCIPLES OF THE METHOD

Nuclear Moments

Elementary electrostatics teaches us that the nucleus, like any electromagnetic distribution, can be physically treated from the outside as a series of electric and magnetic moments.
The moments of the nucleus follow a peculiar alternating pattern; nuclei in general have an electric monopole (the nuclear charge), a magnetic dipole moment, an electric quadrupole moment, a magnetic octopole moment, and an electric hexadecapole moment. The last two, while they have been observed (Liao and Harbison, 1994), have little practical significance and will not be examined further. The magnetic dipole moment of the nucleus enables NMR, which is treated elsewhere (NUCLEAR MAGNETIC RESONANCE IMAGING). It is the electric quadrupole moment of the nucleus that enables NQR. Whether or not a nucleus has an electric quadrupole moment depends on its total spin angular momentum, I, a property that is quantized in half-integer multiples of the quantum of action, h (Planck's constant), and often
RESONANCE METHODS
referred to in shorthand form as the nuclear spin. Nuclei with spin zero, such as 12C, have neither a magnetic dipole nor an electric quadrupole moment. Nuclei with spin 1/2 have magnetic dipole moments but not electric quadrupole moments, while nuclei with spin 1 or higher have both magnetic dipole moments and electric quadrupole moments. For reasons connected with the details of nuclear structure, and particularly the paucity of odd-odd nuclei (those with odd numbers of both protons and neutrons), there are few nuclei with integer spin: the naturally occurring spin 1 nuclei are 2H, 6Li, and 14N, while 10B, 40K, and 50V are the single examples of spin 3, 4, and 6, respectively. The quadrupole moments of 2H and 6Li are extremely small, making direct nuclear quadrupole resonance impossible except through the use of SQUIDs (superconducting quantum interference devices; see Practical Aspects of the Method and TECHNIQUES TO MEASURE MAGNETIC DOMAIN STRUCTURES). The higher integer spin nuclei are of no experimental importance, and therefore the only integer spin nucleus that has been studied much by NQR is 14N. In contrast, half-integer spin nuclei are liberally spread across the periodic table, and they have therefore been the subject of most NQR work. They range from the very common spin 3/2, for which there are scores of different stable isotopes, through the comparatively rare spin 9/2, which is present in a few, such as 87Sr and 209Bi. Nuclear electric quadrupole moments are typically about eight orders of magnitude smaller than molecular electric quadrupole moments. In Figure 1, they are listed in SI units of C m², though it is common to divide out the charge of the electron and give them in units of area. They vary from the tiny moment of 1.6 × 10⁻⁵⁰ C m² for 6Li to the comparatively huge value of 6.1 × 10⁻⁴⁷ C m² in 179Hf.
They can be directly measured by atomic beam methods, but are frequently known only with very rough precision; for example, recent measurements of the moment of 137Ba have varied from Q = 0.228 barns (1 barn = 10⁻²⁸ m²) to Q = 0.35 barns (to convert Q in barns to eQ in C m², multiply by 1.602 × 10⁻⁴⁷). An extensive tabulation of measurements of nuclear moments is available in Raghavan (1989). As will be discussed below, such moments cannot easily be independently determined by NQR.

Nuclear Couplings

The Electric Field Gradient. Group theory dictates that nuclear moments couple only with the derivative of the local electric or magnetic field that possesses the same symmetry. This means that the nuclear magnetic dipole couples only with the magnetic field, and the electric quadrupole couples with the electric field gradient. In NMR, the magnetic field is usually externally applied and in most cases dwarfs the local field from the paramagnetism of the sample. However, in NQR, the local electric field gradient dwarfs any experimentally feasible external field gradient. As a ballpark estimate of the local field gradient (∇E) in an ionic crystal, we can compute the magnitude of the field gradient at a distance of r = 0.2 nm from a unipositive ion of charge qr, using

∇E = (3r̂r̂ − 1) qr / (4πε₀r³)    (1)

where ε₀ is the permittivity of the vacuum and r̂ is a unit vector. This yields a gradient of 3.6 × 10²⁰ V m⁻², or between 8 and 9 orders of magnitude greater than the external field gradients that can currently be produced in the laboratory. NQR is therefore carried out using only the local field gradient of the sample. The electric field gradient ∇E is a traceless symmetric second-rank tensor with five independent elements that can be expressed in terms of two magnitudes and three
Figure 1. Spins and quadrupole moments (in units of 10⁻⁵⁰ C m²) of quadrupolar nuclei.
NUCLEAR QUADRUPOLE RESONANCE
Euler angles defining its directionality. The magnitude of the field gradient tensor along its zz direction in its principal axis system, (∇E)zz, is often given the shorthand notation eq, with e, the electric charge, factored out. The asymmetry parameter η is defined by

η = [(∇E)yy − (∇E)xx] / (∇E)zz    (2)

The Electric Quadrupole Hamiltonian. The coupling between the electric quadrupole moment and the ambient electric field gradient is expressed mathematically by the electric quadrupole coupling Hamiltonian
H = [e²qQ / (4hI(2I − 1))] [3Iz² − I(I + 1) + (η/2)(I+² + I−²)]    (3)

The quantity e²qQ/h is the so-called electric quadrupole coupling constant, and is a measure of the strength of the coupling, dimensioned, as is usual in radiofrequency spectroscopy, in frequency units. Iz, I+, and I− are nuclear spin operators defined in the eigenbasis of the field gradient tensor.

Electric field gradients can be computed for nuclear sites in ionic lattices and in covalently bound molecules. For lattice calculations, point-charge models are rarely adequate, and in fact often give results that are entirely wrong. In most cases, we find that fixed and induced dipole and quadrupole moments need to be introduced for each atom or residue in order to get accurate values. For example, in calculations of the field gradient at the 14N and cationic sites in Ba(NO3)2 and Sr(NO3)2 (Kye, 1998), we include the charges, the quadrupole moment of the nitrate ion (treated as a single entity and calculated by ab initio methods), the anisotropic electric polarizability of the nitrate ion (obtained from the refractive indices of a series of nitrates via the Clausius–Mossotti equation), and the quadrupole polarizability of the cations (calculated by other authors ab initio). We omit the electric polarizability of the cations only because they sit at a center of crystal symmetry and the net dipole moment must inevitably be zero; otherwise, this would be an important contribution. Briefly, after calculating the field and field gradient for each site using point charges, we compute the induced dipole at the nitrate as a vector using the nitrate polarizability tensor, and recalculate the field iteratively until convergence is obtained. We repeat this procedure again after introducing the quadrupole moment of the nitrate and quadrupole polarizabilities of the cations. Convergence of the lattice sums may be facilitated by an intelligent choice of the unit cells. The contributions of the various terms to the total field gradient at the nitrate in Ba(NO3)2 and Pb(NO3)2 are shown in Figure 2. In covalent molecules, field gradients at the nucleus may be adequately obtained using computational modeling methods.

The Sternheimer Effect and Electron Deformation Densities. One major complication of lattice calculations of electric field gradients is the Sternheimer antishielding phenomenon. Briefly, an atom or ion in the presence of a field gradient deforms into an induced quadrupole. The induced quadrupole does not merely perturb other atoms in the lattice; it also changes the field gradient within the electron distribution, usually increasing it, and often by a large amount. The field gradient experienced by the nucleus is related to the external field gradient by

∇Enuclear = (1 − γ∞) ∇Eexternal    (4)
where γ∞ is the Sternheimer antishielding factor. Antishielding factors can be calculated by ab initio methods, and are widely available in the literature. They vary from small negative values for ions isoelectronic with helium, to positive values of several hundred for heavy ions. A representative selection of such factors for common ions is given in Table 1.

Energy Levels at Zero Field

Spin 1. At zero field, the electric quadrupole coupling is generally the only important component of the spin Hamiltonian. If the asymmetry parameter is zero, the Hamiltonian is already diagonal in the Cartesian (field gradient) representation, and yields nuclear energy levels of

E±1 = e²qQ/(4h);    E0 = −e²qQ/(2h)    (5)

There are two degenerate transitions, between m = −1 → m = 0 and m = 0 → m = +1, appearing at a frequency

ωQ = E±1 − E0 = 3e²qQ/(4h)    (6)

Figure 2. Contributions of the point charges, induced dipole moments, and fixed and induced quadrupole moments to the total electric field gradient in ionic nitrates (from Kye and Harbison, 1998).

If η ≠ 0, the degeneracy is lifted and the Hamiltonian is no longer diagonal in any Cartesian representation, meaning that the nuclear spin states are no longer pure eigenstates of the angular momentum along any direction, but rather are linear combinations of such eigenstates. The three energies are given by the equation

E± = (e²qQ/4h)(1 ± η);    E0 = −e²qQ/(2h)    (7)
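The zero-field spin-1 levels (Equations 5 and 7) and the transition frequencies derived from them (Equations 6 and 8) can be checked by numerically diagonalizing the quadrupole Hamiltonian of Equation 3. The coupling constant and asymmetry parameter below are arbitrary example values:

```python
import numpy as np

# Numerical check of the spin-1 zero-field expressions by diagonalizing the
# quadrupole Hamiltonian (Equation 3).  qcc stands for e^2*q*Q/h; the 1 MHz
# value is arbitrary and only sets the scale.
I = 1.0
qcc = 1.0e6          # Hz
eta = 0.3            # asymmetry parameter (arbitrary example value)

m = np.arange(I, -I - 1, -1)                 # magnetic quantum numbers 1, 0, -1
Iz = np.diag(m)
# raising operator in the |I,m> basis: <m+1|I+|m> = sqrt(I(I+1) - m(m+1))
Ip = np.diag(np.sqrt(I * (I + 1) - m[1:] * (m[1:] + 1)), 1)
Im = Ip.T

A = qcc / (4.0 * I * (2.0 * I - 1.0))        # prefactor of Equation 3
H = A * (3.0 * Iz @ Iz - I * (I + 1) * np.eye(3)
         + 0.5 * eta * (Ip @ Ip + Im @ Im))

E = np.sort(np.linalg.eigvalsh(H))           # E0 < E- < E+
w_minus, w_plus = E[1] - E[0], E[2] - E[0]
w_zero = E[2] - E[1]

# Compare with the closed forms (Equations 7 and 8):
print(np.isclose(w_plus, 0.75 * qcc * (1 + eta / 3.0)))   # True
print(np.isclose(w_minus, 0.75 * qcc * (1 - eta / 3.0)))  # True
print(np.isclose(w_zero, 0.5 * qcc * eta))                # True
```

The same construction, with larger spin matrices, reproduces the spin-3/2 and spin-5/2 results quoted later in this section.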
Table 1. Sternheimer Quadrupole Antishielding Factors for Selected Ions

Ion     Free Ion                                                Crystal Ion
Li+     −0.261a, −0.257b,c, −0.256d, −0.248e                    −0.282a, −0.271f
B3+     −0.146a, −0.145b,c,d, −0.142e                           −0.208a, −0.189b
N5+     −0.101a                                                 −0.110a
Na+     5.029a, 5.072b, 4.50c,d,e                               7.686a, 4.747f
Al3+    2.434a, 2.462b, 2.59d, 2.236c                           5.715a, 3.217f
Cl−     82.047a, 83.5b, 53.91c, 49.28g, 63.21h                  38.915a, 27.04f
K+      18.768a, 19.16b, 17.32i, 12.17c, 12.84g, 18.27h         28.701a, 22.83f
Ca2+    14.521a, 13.95b, 12.12b, 13.32h                         25.714a, 20.58f
Br−     195.014a, 210.0b, 99.0g, 123.0i                         97.424a
Rb+     51.196a, 54.97b, 49.29g, 47.9j                          77.063a
Nb5+    22.204a                                                 28.991a
I−      331.663a, 396.60b, 178.75g, 138.4k                      317.655a

a Sen and Narasimhan (1974). b Feiock and Johnson (1969). c Langhoff and Hurst (1965). d Das and Bersohn (1956). e Lahiri and Mukherji (1966). f Burns and Wikner (1961). g Wikner and Das (1958). h Lahiri and Mukherji (1967). i Sternheimer (1963). j Sternheimer and Peierls (1971). k Sternheimer (1966).
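The ballpark estimate of Equation 1, and the antishielding correction of Equation 4, can be reproduced directly. The Sternheimer factor used below is an arbitrary illustrative value, not one of the tabulated ones, and a real lattice calculation would sum over many sites and add the dipole and quadrupole terms described in the text:

```python
import numpy as np

# Ballpark field gradient of Equation 1: a nucleus 0.2 nm from a unipositive
# point charge, then scaled by an antishielding factor as in Equation 4.
e = 1.602176634e-19          # C
eps0 = 8.8541878128e-12      # F/m
r_vec = np.array([0.0, 0.0, 0.2e-9])   # source ion on the z axis, m
r = np.linalg.norm(r_vec)
n = r_vec / r

# (grad E)_ab = (3 n_a n_b - delta_ab) * q / (4 pi eps0 r^3)
gradE = (3.0 * np.outer(n, n) - np.eye(3)) * e / (4.0 * np.pi * eps0 * r**3)
print(gradE[2, 2])           # about 3.6e20 V m^-2, the value quoted in the text
print(np.trace(gradE))       # traceless, as required of a field gradient tensor

gamma = -10.0                           # illustrative antishielding factor only
gradE_nuc = (1.0 - gamma) * gradE       # Equation 4: an 11-fold enhancement here
```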
And there are now three observable transitions, two of which correspond to the degenerate transitions for η = 0, and one which generally falls at very low frequency:

ω± = (3e²qQ/4h)(1 ± η/3);    ω0 = e²qQη/(2h)    (8)

Observation of a single NQR transition at zero field for 14N is an indication that the asymmetry parameter is zero (or that one transition has been missed). Observation of a pair of transitions for any species permits the q.c.c. and η to be extracted, and thence all of the magnitude information for the tensor to be obtained. The low-frequency transition is seldom observed, and in any case contains redundant information.

Spin 3/2. The energy levels of spin 3/2 nuclei are given by

E±3/2 = (e²qQ/4h)(1 + η²/3)^(1/2);    E±1/2 = −(e²qQ/4h)(1 + η²/3)^(1/2)    (9)

As can be seen, the 1/2 and −1/2 states are degenerate, as are the 3/2 and −3/2 states. This degeneracy is not lifted by a nonzero asymmetry parameter, which mixes the degenerate states without splitting them. The single quadrupolar frequency is

ωQ = (e²qQ/2h)(1 + η²/3)^(1/2)    (10)

Since this frequency is a function of both e²qQ/h and η, those quantities cannot be separately measured by simple NQR, though they can be obtained by a variety of two-dimensional methods (see Practical Aspects of the Method).

Spin 5/2. At η = 0, the energy levels of the spin 5/2 nucleus are given by

E±1/2 = −e²qQ/(5h);    E±3/2 = −e²qQ/(20h);    E±5/2 = e²qQ/(4h)    (11)

Transitions between the m = ±1/2 and ±3/2 states, and between the ±3/2 and ±5/2 states, are allowed. These appear at

ω1 = 3e²qQ/(20h);    ω2 = 3e²qQ/(10h)    (12)

and obviously lead to a 2:1 ratio of frequencies. If the asymmetry parameter is nonzero, there are still two transition frequencies, but these deviate from a 2:1 ratio. Solving for them requires finding the roots of a cubic secular equation. The algebraic solutions have been published (Creel et al., 1980; Yu, 1991); more useful are the q.c.c. and η in terms of the measured frequencies:

e²qQ/h = (20/3)[(ν2² + ν2ν1 + ν1²)/7]^(1/2) cos(φ/3)
η = √3 tan(φ/3)    (13)
cos φ = 7^(3/2) (ν2 + 2ν1)(ν2 − ν1)(2ν2 + ν1) / [20(ν2² + ν2ν1 + ν1²)^(3/2)]
with ν1,2 = ω1,2/2π. Thus, measurement of both frequencies for a spin 5/2 nucleus such as 127I gives a complete set of magnitude information about the quadrupolar tensor. In addition to these two transitions, for η ≠ 0 the 1/2 → 5/2 transition becomes weakly allowed, and can occasionally be observed. However, if the two single-quantum frequencies are already known, this nearly forbidden transition gives only redundant information.

Higher-Spin Nuclei. Spin 7/2 nuclei have three, and spin 9/2 nuclei four, observable transitions at zero field. At η = 0, these frequencies are in simple algebraic ratio, and can be easily obtained from the Hamiltonian given in Abragam (1961). If η ≠ 0, the situation is more complicated, although analytical expressions for the spin 7/2 frequencies are still available by solution of a quartic secular equation (Creel, 1983). Any pair of frequencies (if assigned) is sufficient to give e²qQ/h and η.

First-Order Zeeman Perturbation and Zeeman Line Shapes

Most NQR spectra are collected not at zero field, but in the earth's field. In practice, in most cases, the effect of this small field is unobservable, save for certain systems with extremely narrow NQR lines. In these cases, most frequently highly ordered ionic crystals, Zeeman effects may be observed. An external field introduces a preferential external reference frame for the system, and for a single crystal (the simplest case) it therefore makes the spectrum dependent on the relative orientation of the magnetic field and ∇E. For spin 3/2 systems (which are most frequently studied), the effect is to split the single transition into a pair of doublets whose splittings depend on the size of the magnetic field and its orientation with respect to the field-gradient tensor.
Analytical expressions for the positions of these lines have been determined (Creel, 1983); in addition, certain preferred orientations of the tensor relative to the field give no splitting, and the loci of these cones of zero splitting can be used to get the field-gradient tensor orientation, which is not directly available from NQR at zero field (Markworth et al., 1987). In powders, Zeeman-perturbed spectra of spin 3/2 nuclei have a characteristic lineshape with four maxima; these line shapes can be used to determine the asymmetry parameter (Morino and Toyama, 1961). Figure 3 shows the Zeeman-perturbed spectra of potassium chlorate (KClO3), an excellent test sample for NQR, which has a 35Cl resonance at 28.08 MHz. The spectra were obtained in a probe with an external field coil oriented parallel to the Earth's field, and are shown for a series of coil currents. As can be seen, the minimum linewidth is obtained with a current of 0.2 A, where the external field presumably nulls out the Earth's field precisely. At higher or lower values of the current, characteristic Zeeman-perturbed line shapes are obtained.

Spin Relaxation in NQR

As in NMR, NQR relaxation is generally parameterized in terms of two relaxation times: T1, the longitudinal relaxation time, which is the time constant for relaxation
Figure 3. Effect of small external magnetic fields (expressed in terms of external coil current) on the lineshape of potassium chlorate resonance recorded at 28.85 MHz. A negative current corresponds to an applied field in the direction of the terrestrial magnetic field.
of the populations of the diagonal elements of the electric quadrupole coupling tensor in its eigenbasis, and T2, the transverse relaxation time, which governs the relaxation of off-diagonal elements. Except in the case of spin 3/2, more than one T1 and T2 are generally observed. The longitudinal relaxation time is probably in most cases dominated by relaxation via very high-frequency motions of the crystal, typically librations of bonds or groups. Phenomenologically, it is generally observed to have an Arrhenius-like dependence on temperature, even in the absence of a well-defined thermally activated motional mechanism. The transverse relaxation time, T2, must be distinguished from T2*, the reciprocal of the NQR line width. The latter is generally dominated by inhomogeneous broadening, which is a significant factor in all but the most perfectly crystalline systems. The processes giving rise to T2 in NQR have, to our knowledge, not been systematically investigated, but it has been our observation that, in many cases, T2 is similar in value to T1, and is therefore likely a result of the same high-frequency motions. T1 and T2 may be measured by the same pulse sequences used in high-field NMR: T1 by an inversion-recovery pulse sequence (π–τ–π/2–acquire); T2 using a single spin echo or a train of spin echoes. T1 values at room
temperature range from hundreds of microseconds to tens of milliseconds for half-integer spin nuclei with appreciable quadrupole coupling constants, such as 35Cl, 63Cu, 79Br, or 127I; for nuclei with smaller quadrupole moments, such as 14N, they can range from a few milliseconds to (in rare cases) hundreds of seconds. T1 values generally increase by one or two orders of magnitude on cooling to 77 K. One phenomenon in which spin-relaxation times in NQR can be used to get detailed dynamical information is the "bleach-out" effect seen for trichloromethyl groups in the solid state. At low temperatures, three chlorine resonances are usually seen for such groups; however, as the temperature is raised, the resonances broaden and weaken, eventually disappearing, as a result of reorientation of the group about its three-fold axis. Depending on the environment of the grouping, "bleach-out temperatures" can be as low as −100°C or as high as 70°C, as is observed in the compound p-chloroaniline trichloroacetate (Markworth et al., 1987). Such exchange effects are quantum-mechanically very different from exchange effects in NMR, since reorientation of the group dramatically reorients the axis of quantization of the nuclear spin Hamiltonian, and is therefore in the limit of the "strong collision" regime. Under these circumstances, the relaxation time T1 is to a very good approximation equal to the correlation time for the reorientation, making extraction of that correlation time trivial. Similar phenomena are expected for other large-angle, low-frequency motions, but have otherwise been reported only for 14N.

PRACTICAL ASPECTS OF THE METHOD

Spectrometers for Direct Detection of NQR Transitions

Frequency-Swept Continuous Wave Detection. For its first 30 years, the bulk of NQR spectroscopy was done using superregenerative spectrometers, either home-built or commercial (Decca Radar).
The superregenerative oscillator was a solution to the constraint that an NQR spectrometer must be capable of sweeping over a wide range of frequencies, while in the 1950s and 1960s most radiofrequency devices (e.g., amplifiers, phase modulators and detectors, and receivers) were narrow-band and tunable only over a limited range. Briefly, the superregenerative circuit is one where the sample coil forms the inductive element in the tank circuit of the primary oscillator, which is frequency-swept using a motor-driven variable capacitor. Absorption of radiofrequency energy by the NQR spins causes a damping of the oscillator, which can be detected by a bridge circuit. The result is a frequency-swept spectrum in which NQR transitions are detected as the second derivative of their lineshape. The superregenerative circuit has the virtues of cheapness and robustness; however, it has all the disadvantages of continuous-wave detection: it does not lend itself easily to signal averaging, or to the more sophisticated 2D approaches discussed below. It is probably useful only for pure samples of molecular weights < 1000 Da.

Frequency-Swept Fourier-Transform Spectrometers. Most modern NQR spectroscopy likely is done on commercial
Figure 4. Probes for Fourier transform NQR. (A) Inductively matched series tank single-resonance probe; (B) inductively matched series/parallel tank double-resonance probe.
solid-state NMR instruments, as the equipment requirements (save for the magnet) are very similar to those of NMR. Critical elements are a fast digitizer and a probe that can be tuned over a wide frequency range; two suitable probe circuits are shown in Figure 4. The double-resonance circuit (Fig. 4B) allows irradiation of spins >3/2 at two frequencies corresponding to distinct but connected NQR resonances. Double-resonance methods have not yet found significant application in materials research, and so are not described at length here; the interested reader is referred to Liao and
Harbison (1994). The circuit consists of a series tank and a parallel tank in series. In principle, either coil could contain the sample; in practice we use the series inductor L1. A typical probe configuration uses coil inductances of L1 = 2.3 μH and L2 = 1.2 μH, with a small two- or three-turn matching inductor L3. The circuit is essentially a pair of tank circuits whose coupling varies with the separation of their resonance frequencies. Again, high-voltage 1.5 to 30 pF variable capacitors are used for C1 and C2. The two-fold difference between L1 and L2 ensures that the coupling between the two tanks is only moderate, so that resonance ν1 resides primarily in tank circuit L1C1 and ν2 in L2C2. With this configuration, ν1 tunes between 17 and 44 MHz and ν2 between 47 and 101 MHz, although the two ranges are not entirely independent. L3 acts as a matching inductor for both resonances; differential adjustment of the matching inductance can be achieved by adjusting the coupling between the two inductors, either by partially screening the two coils with an intervening aluminum plate or even by adjusting their relative orientation with a pair of pliers. One notable feature that the probe lacks is any isolation between the two resonances; in fact, it has but a single port. This is because, unlike in high-field NMR, it is seldom necessary to pulse one resonance while observing a second. We can therefore combine the output of both radiofrequency channels before the final amplifier (which is broad-band and linear) and send the single amplifier output to a standard quarter-wave duplexer, which in practice is broad-band enough to handle the typical 2:1 frequency ratio used in NQR double-resonance experiments. Using single-output-stage amplification and omitting isolation makes it far easier to design a probe that tunes both channels over a wide frequency range.
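The tuning ranges quoted above can be rationalized from the isolated-tank resonance formula f = 1/(2π√(LC)); a minimal sketch, assuming the quoted inductances are in microhenries (the coupling between the two tanks then shifts these estimates toward the quoted 17 to 44 MHz range for ν1):

```python
import math

def tank_resonance_mhz(l_henry, c_farad):
    """Resonance frequency of an isolated LC tank, f = 1/(2*pi*sqrt(L*C)), in MHz."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_henry * c_farad)) / 1e6

# Series tank L1C1 of the double-resonance probe; L1 = 2.3 uH is taken
# from the text, and the capacitor limits are the quoted 1.5-30 pF range.
L1 = 2.3e-6  # H
for c_pf in (30.0, 1.5):
    f = tank_resonance_mhz(L1, c_pf * 1e-12)
    print(f"C1 = {c_pf:4.1f} pF -> f ~ {f:5.1f} MHz")
```

With the capacitor at its extremes this isolated tank spans roughly 19 to 86 MHz; the moderate coupling to the second tank pulls the usable range for ν1 down to the 17 to 44 MHz quoted in the text.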
SQUID Detectors. To overcome the sensitivity problems of NQR at low frequency, a DC superconducting quantum interference device (SQUID)-based amplifier can be used (Clarke, 1994; TonThat and Clarke, 1996). This device detects magnetic flux directly rather than its time derivative, which makes the signal intensity a linear rather than a quadratic function of frequency and greatly improves sensitivity at low frequencies compared to conventional amplification. SQUID NMR spectrometers can operate in either continuous wave (Connor et al., 1990) or pulsed (TonThat and Clarke, 1996) mode, and the latest instruments can attain a bandwidth of up to 5 MHz (TonThat and Clarke, 1996). One drawback of the technique is cost, a result of the more complex design of the probe and spectrometer; SQUID spectrometers are not yet commercially available. Experiments are performed at 4.2 K in liquid helium. Examples of nuclei observed using this technique are 27Al in Al2O3[Cr3+] at 714 kHz (TonThat and Clarke, 1996), 129Xe in laser-polarized solid xenon (Ton That et al., 1997), and 14N in materials such as amino acids (Werner-Zwanzinger et al., 1994), cocaine hydrochloride (Yesinowski et al., 1995), and ammonium perchlorate (Clarke, 1994). Note that the detection of nitrogen transitions can be performed by cross-relaxation to protons
coupled to nitrogen by the dipole-dipole interaction (Werner-Zwanzinger et al., 1994; Yesinowski et al., 1995).

Indirect Detection of NQR Transitions by Field Cycling Methods

Where the relaxation properties of a material are favorable (long intrinsic proton relaxation time, short quadrupole relaxation time), indirect detection methods give a significant improvement in signal-to-noise ratio, since detection is via the proton spins, which have a large gyromagnetic ratio and relatively narrow linewidth. Such methods are therefore often favored over direct detection for spins with low NQR frequencies and low magnetic moments, e.g., 14N or 35Cl. The term ''field cycling'' comes from the fact that the sample experiences different magnetic fields during the course of the experiment; the field intensity is cycled between transients. Field cycling techniques were first developed during the 1950s (Pound, 1951). During a simple cycle, the sample is first magnetized at a high field, B0, then brought rapidly but adiabatically to a low field or to zero field. During the evolution period, the magnetization is allowed to oscillate under the local interactions. The sample is finally brought back to high field, where the signal is detected. The field strength during the detection period can be identical to that of the preparation period, which is convenient; it can also be lower, as in ''soak field'' techniques (Koening and Schillinger, 1969). Other cycles involving more than two different field strengths have also been developed, such as the ''zero-field technique'' (Bielecki et al., 1983). Field switching can be implemented either by electronic switching or by mechanically moving the sample. The latter is simpler and can be implemented on any spectrometer, but is unsuitable for experiments requiring very rapid field switching. Electronic switches are limited by Faraday's law of induction, which dictates the maximum value of dB0/dt attainable for a given static B0 field.
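The inductive limit on electronic switching can be made concrete with a back-of-envelope sketch. Since B0 is proportional to the magnet current and the voltage across the coil obeys V = L dI/dt, the maximum tolerable voltage caps the ramp rate; all numbers below are illustrative assumptions, not values from the text:

```python
# Rough estimate of the minimum electronic field-switching time for a
# resistive magnet, limited by the maximum voltage the supply/switch can
# sustain across the coil inductance (V = L dI/dt, with B0 proportional to I).
# Inductance, current, and voltage values are purely illustrative.
L_coil = 0.5      # magnet inductance, H (assumed)
I_full = 100.0    # current producing the full polarizing field B0, A (assumed)
V_max = 200.0     # maximum voltage across the coil, V (assumed)

dI_dt_max = V_max / L_coil        # maximum ramp rate, A/s
t_switch = I_full / dI_dt_max     # minimum time to ramp B0 to zero, s
print(f"minimum ramp time ~ {t_switch * 1e3:.0f} ms")
```

For these (hypothetical) values the ramp cannot be made faster than a quarter of a second, which illustrates why mechanically shuttling the sample, despite being slower to engineer precisely, is sometimes competitive.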
Field cycling can be applied in the solution state and in liquid crystals as well as in the solid state. As a competitor to direct NQR, quadrupolar nuclei can be observed directly; examples include deuterium (Thayer and Pines, 1987) and nitrogen-14 (Selinger et al., 1994). Indirect detection (''level crossing'') is also feasible by observing the signal of an abundant nucleus (typically protons) dipolar-coupled to the quadrupolar nucleus (Millar et al., 1985). An example of level crossing, used to record the nutation spectrum of tris-sarcosine calcium chloride at 600 kHz (Blinc and Selinger, 1992), is given in a later section. The large number of applications of field cycling NMR is outside the scope of this unit; more detailed information can be found in reviews by Kimmich (1980) and Noack (1986), and in Literature Cited.

One-Dimensional Fourier Transform NQR: Implementation

The hardest task in Fourier transform NQR is finding the resonance. While the chemistry of the species being studied sometimes gives insight into where the resonances might lie—e.g., an organochlorine can be expected
to fall between 30 and 40 MHz—in general, this range will be far wider than the bandwidth of an NQR probe or the effective bandwidth of a high-power pulse. Some frequency sweeping will therefore have to be done. In a modern solid-state NMR spectrometer, the output power will usually be fairly constant over a range of 10 to 20 MHz, and the other components will generally be broad-band. Sweeping the frequency therefore involves incrementing the spectrometer frequency by some fairly large fraction of the probe bandwidth, retuning the probe to that frequency, and acquiring a free-induction decay or, preferably, a spin echo at that frequency. Spectrometer frequencies are generally under computer control; the task of retuning the probe can be accomplished manually or, preferably, automatically, by controlling the probe's tuning capacitor from the spectrometer via a stepper motor and using this motor to minimize reflected power. The result of such a sweep is a series of spin echoes collected as a function of spectrometer frequency; these echoes, either phased or in magnitude mode, can be stacked in an array such as that shown in Figure 5, which is a set of spin echoes collected for the 35Cl signal of the polymer poly-4-chlorostyrene. A slice along the spine of the spin echoes is the NQR spectrum collected at the resolution of the frequency sweep. If the line is narrower than this step frequency, the individual spin-echo slice in which the line appears may be Fourier transformed; if it is broad, the slice through the spin echoes suffices.

Figure 5. Frequency-swept spin-echo NQR. Each slice in the time dimension is a spin echo, recorded at a particular frequency; a slice parallel to the frequency axis gives the NQR spectrum.

Two-dimensional Zero-Field NQR

Zeeman-Perturbed Nuclear Resonance Spectroscopy (ZNQRS). For 3/2 spins, the ±3/2 → ±1/2 transition depends on both the quadrupolar coupling constant, e²qQ/h, and the asymmetry parameter η. The resonance frequency of the single line is given by

νQ = (e²qQ/2h)(1 + η²/3)^(1/2)   (14)

That single resonance frequency is insufficient to determine the two parameters separately. To overcome the problem, Ramachandran and Oldfield (1984) applied a small static Zeeman field, H0, parallel to H1, the radiofrequency field, as a perturbation to remove the degeneracy of the quadrupolar levels. The resulting splitting creates singularities, first described by Morino and Toyama (1961), at

ν = νQ ± (1 + η)ν0   (15)

and

ν = νQ ± (1 − η)ν0   (16)

where ν0 is the Larmor frequency. Note that applying the static field H0 perpendicular to H1 does not change the position of the singularities but does affect the intensity distribution. By measuring the splitting, the asymmetry parameter can be determined. Ramachandran and Oldfield's two-dimensional version of this experiment was implemented using both one-pulse and spin-echo sequences (see Fig. 6, panels A and B). The field is turned on during the preparation period, during which the spins evolve under both the Zeeman and quadrupole interactions. The field is turned off during acquisition. The 2D spectrum is obtained by incrementing t1 in regular intervals, followed by a 2D Fourier transform.

Figure 6. 2D NQR pulse sequences: (A) Zeeman-perturbed NQR one-pulse; (B) Zeeman-perturbed NQR spin echo; (C) zero-field nutation; (D) RF pulse train method; (E) level-crossing double-resonance NQR nutation; (F) 2D exchange NQR.

The projection
on the ω2 axis shows the zero-field NQR spectrum. The Zeeman NQR powder pattern is observed by projection along the ω1 axis, which allows the determination of the asymmetry parameter η. The limitation of the method is that it requires a rapidly switchable, homogeneous Zeeman field.

Zero-Field Nutation Nuclear Resonance Spectroscopy. Another method to determine the asymmetry parameter has been described (Harbison et al., 1989; Harbison and Slokenbergs, 1990) in which no Zeeman field is applied. The absence of a static field makes the frequency spectrum orientation independent. To determine η, it is necessary to obtain an orientation-dependent spectrum; however, it is unnecessary to introduce an extra perturbation such as a Zeeman field, because the sample radiofrequency coil itself introduces an external preferential axis and thus an orientation dependence. During an RF pulse in a zero-field NQR experiment, a 3/2 spin undergoes nutation about the unique axis of the EFG (Bloom et al., 1955). The strength of the effective H1 field depends on the relative orientation of the coil and quadrupolar axes and goes to zero when the two axes are parallel. The nutation frequency is given by

ωN = (√3 ωR sin θ)/2   (17)

where ωR = γH1, θ is the angle between the coil axis and the unique axis of the electric field gradient tensor, and γ is the gyromagnetic ratio of the nucleus. The voltage induced in the coil by the precessing magnetization after the pulse is proportional to sin θ. Figure 6, panel C, shows the 2D nutation NQR pulse sequence. For a single crystal, the NQR free-precession signal is

F(t1, t2, θ) ∝ sin θ sin(√3 ωR t1 sin θ/2) sin(ωQ t2)   (18)

where ωQ is the quadrupolar frequency, t1 is the time for which the RF pulse is applied, and t2 is the acquisition time.
For an isotropic powder, the nutation spectrum is obtained by powder integration over θ, followed by a complex Fourier transformation in the second dimension and an imaginary Fourier transform in the first:

G(ω1, ω2) ∝ ∫0^π ∫∫ sin²θ sin(√3 ωR t1 sin θ/2) sin(ωQ t2) e^(−iω1t1) e^(−iω2t2) dt2 dt1 dθ   (19)

For an axially asymmetric tensor, the angular factor sin θ must be replaced by a factor R(θ, φ), where θ and φ are the polar angles relating the coil axis to the quadrupolar tensor (Pratt et al., 1975):

R(θ, φ) = [4η² cos²θ + (9 + η² + 6η cos 2φ) sin²θ]^(1/2)   (20)

The 2D frequency-domain spectrum then becomes

G(ω1, ω2) ∝ ∫0^π ∫0^2π ∫∫ sin θ R(θ, φ) sin[ωR t1 R(θ, φ)/(2√3)] sin(ωQ t2) e^(−iω1t1) e^(−iω2t2) dt2 dt1 dφ dθ   (21)

The on-resonance nutation spectrum in the ω1 dimension (see Fig. 7) shows three singularities, ν1, ν2, and ν3, whose frequencies are given by

2πν1 = ηωR/[√3 (1 + η²/3)^(1/2)]
2πν2 = (3 − η)ωR/[2√3 (1 + η²/3)^(1/2)]
2πν3 = (3 + η)ωR/[2√3 (1 + η²/3)^(1/2)]   (22)

The asymmetry parameter can be determined from the positions of ν2 and ν3 alone (Harbison et al., 1989):

η = 3(ν3 − ν2)/(ν3 + ν2)   (23)

Figure 7. Fourier transform of the on-resonance nutation spectrum and corresponding MEM spectrum (the slices through the 2D spectrum parallel to the F1 axis at F2 = νNQR) obtained at 77 K for νNQR = 36.772 MHz in C3N3Cl3. The linear prediction filter used to obtain the MEM spectrum from N = 162 time-domain data points was m = 0.95 N (from Mackowiak and Katowski, 1996).

Off-resonance effects can be calculated using a similar procedure (Mackowiak and Katowski, 1996). The off-resonance nutation frequency is given by

νN^off = ξ/π = [(νN^on)² + (Δν)²]^(1/2)   (24)

where νN^on is the on-resonance nutation frequency and Δν is the resonance offset. The asymmetry parameter is determined from the relation

η = 3{[(ν3^off)² − Δν²]^(1/2) − [(ν2^off)² − Δν²]^(1/2)}/{[(ν3^off)² − Δν²]^(1/2) + [(ν2^off)² − Δν²]^(1/2)}   (25)
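The way the nutation experiment resolves the underdetermination of Equation 14 can be sketched numerically. The coupling constants and RF field strength below are illustrative values, not data from the text:

```python
import math

def nqr_freq(cq, eta):
    """Spin-3/2 NQR frequency, Eq. 14: nuQ = (e2qQ/2h) * sqrt(1 + eta**2/3)."""
    return 0.5 * cq * math.sqrt(1.0 + eta ** 2 / 3.0)

def nutation_singularities(omega_r, eta):
    """Powder nutation singularities nu1 < nu2 < nu3 of Eq. 22,
    with omega_r = gamma * H1 in angular-frequency units."""
    root = math.sqrt(3.0 + eta ** 2)  # = sqrt(3) * sqrt(1 + eta**2/3)
    nu1 = eta * omega_r / (2.0 * math.pi * root)
    nu2 = (3.0 - eta) * omega_r / (4.0 * math.pi * root)
    nu3 = (3.0 + eta) * omega_r / (4.0 * math.pi * root)
    return nu1, nu2, nu3

def eta_from_singularities(nu2, nu3):
    """Recover the asymmetry parameter from nu2 and nu3 alone, Eq. 23."""
    return 3.0 * (nu3 - nu2) / (nu3 + nu2)

# Two different (e2qQ/h, eta) pairs give nearly the same single-line nuQ,
# illustrating why the 1D spectrum cannot fix both parameters...
print(nqr_freq(34.0, 0.0), nqr_freq(33.44, 0.32))  # both close to 17 MHz

# ...while the nutation singularities separate them cleanly.
n1, n2, n3 = nutation_singularities(2 * math.pi * 50e3, 0.32)  # H1 ~ 50 kHz
print(eta_from_singularities(n2, n3))  # recovers eta = 0.32
```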
Another way to do an off-resonance nutation experiment is to start acquisition at some constant delay, t0, after the RF pulse (Sinyavski, 1991; Mackowiak and Katowski, 1996). The nutation spectrum then consists of three lines. The line at the offset frequency Δν is independent of the EFG parameters
and the asymmetry parameter can be calculated from the two other lines using

η = 3{[(ν3^off)² − 2Δν²]^(1/2) − [(ν2^off)² − 2Δν²]^(1/2)}/{[(ν3^off)² − 2Δν²]^(1/2) + [(ν2^off)² − 2Δν²]^(1/2)}   (26)

The spectral resolution of an off-resonance experiment decreases with the frequency offset, by up to 80% for a 100-kHz offset. Experimental nutation spectra are occasionally truncated and noisy; the positions of the singularities are then often poorly resolved, which makes an accurate determination of η difficult. As an alternative to the 2D Fourier transform, the maximum entropy method (MEM) can be used to process the time-domain data (Sinyavski, 1991; Mackowiak and Katowski, 1993, 1996). The basis of the maximum entropy method is to maximize the entropy of the spectrum while maintaining the correlation between the inverse Fourier transform of the spectrum and the free induction decay (FID). The process works by successive iterations, calculating the χ² correlation between the inverse Fourier transform of the trial spectrum and the FID; the spectrum is modified so that its entropy increases without drastically increasing χ², until convergence is reached. One inherent advantage of this method is that the MEM spectrum is virtually noiseless and free of artifacts. Mackowiak and Katowski used the Burg algorithm to determine the lineshape that maximizes the entropy for the nutation experiment (Stephenson, 1988; Mackowiak and Katowski, 1993, 1996). In this unit we do not discuss the mathematics involved in MEM, but we can point readers to a review by D.S.
Stephenson on linear prediction and maximum entropy methods in NMR spectroscopy (Stephenson, 1988; also see references therein). MEM has been applied very successfully in high-field NMR for both 1D and 2D data reconstruction. Figure 7 shows on-resonance simulated and experimental 2D Fourier transform (FT) and MEM spectra. The noise and resolution improvements of the MEM method are clearly visible. The singularities ν2 and ν3 can easily be measured from the MEM spectrum, and the asymmetry parameter is determined as before using Equation 23. The quadrupolar frequency is very temperature sensitive: rapid recycling can cause significant sample heating and introduce unwanted frequency shifts during the nutation experiment. Thus, recycle delays much longer than necessary for spin relaxation are used (Harbison et al., 1989; Harbison and Slokenbergs, 1990; Mackowiak and Katowski, 1996), together with active gas cooling. Changes in the RF excitation pulse length between slices can also cause undesirable effects, and for weak NQR lines acquiring the 2D NQR nutation data can take a long time. To reduce the experiment time, Sinyavski et al. (1996) reported a sequence of identical short RF pulses separated by time intervals τ. The NQR signal induced in the RF coil just after turnoff of the nth pulse is

G(ntw) = ⟨Ix⟩n sin θ cos φ + ⟨Iy⟩n sin θ sin φ + ⟨Iz⟩n cos θ   (27)
where ⟨Ix,y,z⟩n are the expectation values of the magnetization along the coordinate axes, calculated using a wave-function approach (Pratt et al., 1975; Sinyavski et al., 1996). In the general case, the expression in Equation 27 is a sum of signals with various phases. For the case when

ω0τ + ωtw = 2πk,  with k = 0, 1, 2, …   (28)
that is, when the phase accumulated due to the resonance offset is an integer multiple of 2π, the free induction decay obtained is identical to that found in the two-dimensional nutation experiment. Here, tw is the time delay between pulses in the nutation pulse train. In addition, if the NQR signal is measured at some constant delay, td, after each pulse in the sequence, then the signal after synchronous detection is

G(ntw) = aR²(θ, φ){sin(2nξtw) cos(ωtd + ψ) + (ω/2ξ)[1 − cos(2nξtw)] sin(ωtd + ψ)}/2ξ   (29)

where R(θ, φ) is defined in Equation 20, ω is the frequency offset, ψ is a constant phase shift of the reference signal for the synchronous detector, and

2ξ = [ω² + 4μ²]^(1/2),  with  μ = (γH1/4)R(θ, φ)/(3 + η²)^(1/2)   (30)

Equation 29 describes the ''discrete interferogram'' of a single nutation line at the frequency 2ξ (Sinyavski et al., 1996). The pulse sequence for the pulse-train method is shown in Figure 6, panel D. For the on-resonance condition (ω = 0), tw and τ can easily be chosen so that the condition in Equation 28 is satisfied; for ω ≠ 0, Equation 28 can also be satisfied. If the sum of the intervals tw + τ in the pulse train is an integer multiple of the data sampling period, the nutation interferogram D(i) can be reconstructed from the raw data by the simple algorithm

D(i) = D[c + (i − 1)(p\n)]   (31)

where i = 1, 2, …, n; n is the number of pulses; p is the number of data points; the backslash denotes integer division (i.e., a\b is the largest integer less than a/b); and c is a constant that can be determined by trial (Sinyavski et al., 1996). If the NQR line is broad, dead-time problems occur and a spin-echo signal can be used instead. The equivalent of Equation 29 for the echo signal is

G(ntw) = aR²(θ, φ)[1 − (ω/2ξ)²][1 − cos(2ξt′w)]{sin(2nξtw) cos ψ + (ω/4ξ)[1 − cos(2nξtw)] sin ψ}/4ξ   (32)

where t′w is the delay between nutation pulses. The maximum time-saving factor can be roughly estimated as equal to the number of pulses contained in a pulse train (Sinyavski et al., 1996).
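The index arithmetic of the reconstruction algorithm (Equation 31) is easy to sketch in code; the synthetic raw record and the offset c below are illustrative assumptions:

```python
def reconstruct_interferogram(raw, n_pulses, c):
    """Pick one point per pulse out of the raw record, per Eq. 31:
    D(i) = D[c + (i - 1)*(p \\ n)], where p is the number of raw data
    points, n the number of pulses, and \\ denotes integer division."""
    p = len(raw)
    stride = p // n_pulses  # the integer division p \ n
    return [raw[c + (i - 1) * stride] for i in range(1, n_pulses + 1)]

# Toy raw record: 12 points and 4 pulses give a stride of 3; with the
# trial constant c = 1 the reconstruction picks indices 1, 4, 7, 10.
raw = list(range(12))
print(reconstruct_interferogram(raw, 4, 1))  # [1, 4, 7, 10]
```

In practice c is found by trial, as the text notes, since it depends on where within each inter-pulse window the sampled point of interest falls.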
Level-Crossing Double Resonance NQR Nutation Spectroscopy. When the NQR signal of 3/2-spin nuclei is weak (low natural abundance), or if the quadrupolar frequency is too low to be observable, a double-resonance technique known as level-crossing double resonance NQR nutation (Blinc and Selinger, 1992) can be used to retrieve the nutation spectrum. In this experiment, the quadrupolar nuclei are observed via their effect on signals from other nuclei (typically protons). Dipolar coupling must be present between the two nuclei, and the spin-lattice relaxation times have to be relatively long. Figure 6, panel E, shows the field cycling and pulse sequence used for the experiment. The sample is first polarized in a strong magnetic field, H0. The system is then brought adiabatically to zero magnetic field (either by decreasing H0 or by moving the sample out of the magnet). At a specific field strength, level crossing occurs between the Zeeman levels of the 1/2 spins (protons) and the Zeeman-perturbed quadrupolar levels (Blinc et al., 1972; Edmonds and Speight, 1972; Blinc and Selinger, 1992). The level crossing polarizes the quadrupolar nuclei (decreasing the population difference N1/2 − N3/2) and decreases the magnetization of the protons. An RF pulse of length t1 is then applied at zero field at a frequency ω = ωQ + δ, where ωQ is the pure quadrupolar frequency. The proton system is remagnetized and the proton FID acquired after a 90° pulse. The 2D spectrum is acquired by varying t1 and δ. The proton signal SH(t1) is proportional to the population difference (Blinc and Selinger, 1992):

SH(t1) ∝ (N1/2 − N3/2) = A cos[ω(θ, φ, δ)t1]   (33)

where A is a constant and ω(θ, φ, δ) = γQ H1 R(θ, φ)/[2(3 + η²)^(1/2)] is the nutation frequency of the quadrupolar nuclei. For a powder, Equation 33 is integrated over the solid angle Ω:

SH(0) − SH(t1) = ∫ cos[ω(θ, φ, δ)t1] dΩ/4π   (34)

The Fourier transform of SH(t1) gives the nutation spectrum, with singularities given by Equation 22; the ω2 dimension shows the zero-field NQR spectrum. This technique has been successfully applied to determine e²qQ/h and η for tris-sarcosine calcium chloride, whose NQR frequency is as low as 600 kHz (Blinc and Selinger, 1992).

2D Exchange NQR Spectroscopy. 2D exchange NMR spectroscopy was first suggested by Jeener et al. (1979), and the technique has been widely used in high field. Rommel et al. (1992a) reported the first NQR application of the 2D exchange experiment. The 2D exchange experiment (Rommel et al., 1992a; Nickel and Kimmich, 1995) consists of three identical RF pulses separated by intervals t1 and tm (Fig. 6, panel F). The first pulse produces spin coherence evolving during t1; the second pulse partially transfers the magnetization components orthogonal to the rotating-frame RF field into components along the local direction of the EFG. The mixing time, tm, is set to be much longer than t1, so that most of the exchange occurs during this interval. The
last pulse generates the FID. The NQR signal oscillates along the axis of the RF coil. By incrementing t1 and performing a double Fourier transform, one obtains the 2D spectrum S(ν1, ν2; tm), which shows diagonal peaks and cross-peaks, the latter corresponding to nuclei undergoing exchange. The theoretical treatment is based on the fictitious spin-1/2 formalism (Abragam, 1961; Goldman, 1990; Nickel and Kimmich, 1995); the matrix treatment of high-field 2D exchange NMR (Ernst et al., 1987) has to be modified for the specifics of the NQR experiment. During the exchange process, in contrast to its high-field equivalent (where only the resonance frequency changes), the direction of quantization (given by the direction of the EFG) can also change. This reduces the cross-peak intensities by projection losses from the initial to the final direction of quantization of the exchange process (Nickel and Kimmich, 1995). It is important for 2D exchange that all resonances participating in the exchange be excited simultaneously; phase-alternating pulses producing tunable sidebands can achieve this (Rommel et al., 1992a). If the exchanging resonances fall outside the feasible single-resonance bandwidth, a double-resonance configuration can be used instead.

Spatially Resolved NQR and NQR Imaging

A logical step following 2D NQR experiments is to perform spatially resolved NQR and NQR imaging, which would be of particular value for the characterization of solid materials. Imaging is a very recent technique in NQR spectroscopy that only a handful of research groups are developing. The main obstacle to NQR imaging is that, even if the quadrupole resonances are narrow, it is practically impossible to shift the NQR frequency, since it depends only on the interaction between the quadrupole moment and the electric field gradient at the nucleus.

Magnetic Field Gradient Method. The first NQR experiment that allowed retrieval of spatial information was performed recently by Matsui et al.
(1990). If the NQR resonance is relatively sharp, the half-width of a Zeeman-perturbed spectrum is proportional to the strength of the small static Zeeman field applied (Das and Hahn, 1958). Thus, the height of the powder pattern at the resonance frequency is reduced in proportion to the Zeeman field strength (Matsui et al., 1990). Figure 3 shows the changes in the spectral lineshape of potassium chlorate for different Zeeman field strengths. The sample coil is placed between two Helmholtz coils oriented perpendicular to the terrestrial magnetic field; changing the current applied through the coils changes the field strength. Note that because the Zeeman field is parallel to the terrestrial field, a shimming effect appears for a current of 0.2 A, reducing the half-width by approximately a factor of two (unpublished results); this effect was not observed in the work of Matsui et al. (1990). If a magnetic field gradient is applied, the reduction effect can be used for imaging, since it then depends on spatial location (Matsui et al., 1990). Given N discrete quadrupolar spin densities ρ(Xn), the observed signal in the one-dimensional experiment will be the sum of all the powder
patterns affected by the field gradient. The spectral height at the resonance frequency ω0 becomes

H(Xn0) = Σ_{n=1}^{N} W(Xn − Xn0) ρ(Xn)   (35)
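Recovering the spin densities from measured spectral heights amounts to inverting the linear system implied by Equation 35. A small numerical sketch follows; the reduction function W used here is an illustrative Gaussian, not measured data:

```python
import numpy as np

# Illustrative reduction function W: the residual contribution of a spin at
# offset dx from the observation point (assumed Gaussian here; in practice
# W is measured by applying uniform Zeeman fields of known strength).
def W(dx, width=1.0):
    return np.exp(-(dx / width) ** 2)

x = np.arange(8, dtype=float)                       # N = 8 discrete positions
rho_true = np.array([0, 1, 3, 2, 0, 0, 1, 0], dtype=float)

# Forward model, Eq. 35: H(x_m) = sum_n W(x_n - x_m) * rho(x_n)
A = W(x[None, :] - x[:, None])
H = A @ rho_true

# Solving the N x N linear system recovers the spin densities; note that
# this deconvolution amplifies small errors in amplitude and location.
rho_est = np.linalg.solve(A, H)
print(np.allclose(rho_est, rho_true))  # True
```

With noiseless synthetic data the inversion is exact; with experimental data the ill-conditioning noted in the text would call for regularization.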
The function W(Xn − Xn0) represents the spectral height reduction at Xn. The reduction function can be measured by applying uniform Zeeman fields of known strength. The N discrete spin densities can then be determined by solving the system of N linear equations given by Equation 35 at the resonance frequency (Matsui et al., 1990). As the field gradient increases, the plot of spectral heights approaches the projection of the sample, because the signal contribution of neighboring locations decreases with increasing gradient. Determining the spin densities is a deconvolution, and small experimental errors (e.g., in amplitude and location) are amplified by the inversion.

Rotating Frame NQR Imaging. Rotating-frame NQR imaging (ρ NQRI) is similar to the rotating-frame zeugmatography proposed by Hoult (1979). This technique has the advantage of using pure NQR, without any magnetic field or magnetic field gradient, in contrast to the previous method. Rotating-frame zeugmatography is a flip-angle encoding technique used in NMR imaging: nonuniform radiofrequency fields are applied so that the flip angle of an RF pulse depends on position with respect to the RF field gradient, and the RF coils are designed to produce constant field gradients. For NQR, only the amplitude-encoding form of the method is applicable; because the transverse magnetization in NQR oscillates rather than precesses, the phase-encoding variant (Hoult, 1979) cannot be used. The first application of ρ NQRI was developed by Rommel et al. (1991b) in order to determine one-dimensional profiles of solid samples. In this experiment, an anti-Helmholtz coil (transmitter coil) produces an RF field whose distribution is given by

B1(z) = (μ0IR²/2){[R² + (z + z0)²]^(−3/2) − [R² + (z − z0)²]^(−3/2)}   (36)

where R is the coil radius, z0 = R√3/2 is half the distance between the coils, and 2I is the current amplitude through the coils. The receiver coil is placed coaxial with the transmitter coil (Rommel et al., 1991b); with this arrangement, transmitter and receiver coils are electrically decoupled. Later coil designs (Nickel et al., 1991) prefer a surface coil acting as both transmitter and receiver, with the sample placed in only one half of the transmitter coil. The pulse sequence for the experiment is very similar to the 2D nutation experiment (Harbison et al., 1989) described earlier in this unit: free induction decay signals are excited with increasing RF pulse length. Since ρ NQRI is restricted to solid samples, a simple way to access the second dimension is to rotate the sample in small angle increments, the rotation axis being perpendicular to the RF field gradient (Nickel et al., 1991; Kimmich et al., 1992; Rommel et al., 1992b). Using a surface coil, with the sample placed outside the coil, creates the RF field gradient and allows two-dimensional spatial encoding. The 2D image is reconstructed using the projection/reconstruction procedure proposed by Lauterbur (1973). A stepping motor rotates the sample, and the surface coil is placed perpendicular to the rotation axis. The spatial information is amplitude-encoded in the FID signals by the gradient of the RF pulse field B1. The RF gradient G1 is aligned along the coil z axis and considered constant (Nickel et al., 1991):

G1(z) = ∂B1(z)/∂z   (37)

The excitation pulse is characterized by the effective pulse length tp = tw·a, where a is the transmitter attenuation factor and tw is the proper pulse length; tp can be varied by varying either tw or a. For 3/2 spins, the ''pseudo'' FID given by Equation 21 for the nutation experiment (Harbison et al., 1989) can be rewritten as

S(tp) = (√3/2)(1/π) ∫0^π ∫0^2π ∫0^∞ ρ̃(z) R(θ, φ) sin θ sin[√3 tp ω1(z) R(θ, φ)/(2(1 + η²/3)^(1/2))] dz dφ dθ   (38)

or

S(tp) = c ∫0^∞ (1/π) ρ̃(z) f(z, tp) dz   (39)

The proper distribution is ρ̃(z) = g(ω, z)ρ(z), where g(ω, z) is the local lineshape acting as a weighting factor, ω1(z) = γB1(z), and R(θ, φ) is defined by Equation 20. For a constant gradient, dω1(z)/dz = γG1. Introducing tp = td e^u, z = z0 e^(−v), and x = u − v leads to

S̃(u) ≡ S(td e^u) = c ∫0^∞ fa(v) fb(u − v) dv   (40)

The deconvolution can then be performed using

fa(v) = Fu^(−1)[Fu{S(td e^u)}/Fx{fb(x)}]   (41)

fa(v) can be derived numerically, since both fb(x) and S(td e^u) are known. The profile is given by ρ̃(z) = fa(v)/z with v = ln(z0/z). The 2D image is reconstructed from the profiles of all the orientations using the following steps (Nickel et al., 1991; Kimmich et al., 1992; Rommel et al., 1992b). The pseudo-FIDs are baseline- and phase-corrected so that the imaginary part can be zeroed with no loss of information, and the profiles are determined. The profiles are
NUCLEAR QUADRUPOLE RESONANCE
centered with respect to the rotation axis. If G_1(z) is not constant, the profiles must be stretched or compressed by a coordinate transformation so that the abscissae are linearly related to the space coordinate. The profile ordinates must be corrected by a factor G_1(z)/B_1(z), because the RF pickup sensitivity of the coil depends on z in the same way as B_1 and because the signal intensity is weighted by a factor 1/G_1(z). Finally, the image is reconstructed by the back-projection method (Lauterbur, 1973). Robert et al. (1994) used the maximum entropy method (MEM) as an alternative to the Fourier transform to process the pseudo-FIDs. The MEM procedure gives better resolution and avoids the noise and artifacts of the Fourier method. The drawback is the much longer time required by MEM to process the data (30 s per FID, compared with the millisecond scale of the fast Fourier transform, or FFT). A Hankel transform can also be applied instead of the FFT or MEM (Robert et al., 1994). To reduce the acquisition time, a procedure similar to the 2D nutation NQR pulse-train method (Sinyavski et al., 1996), discussed previously, can be used (Robert et al., 1996). Theoretical calculations show that the pseudo-FIDs recorded using this single-experiment (SEXI) method (Robert et al., 1996) are identical to those obtained using the multiple-experiment procedure. Data processing and image reconstruction are performed as described previously, using MEM processing and the back-projection algorithm. The SEXI method reduces acquisition time by a factor of 50. So far, theoretical treatments and experiments using the SEXI method have been applied only to spin-3/2 nuclei, while the multiple-experiment method can also be applied to spin-5/2 nuclei (Robert et al., 1996). The combination of ρ-NQRI with Matsui's Zeeman-perturbed NQR technique, described earlier, adds a third dimension to the imaging process through slice selection (Robert and Pusiol, 1996a,b).
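The profile recovery by logarithmic resampling and Fourier deconvolution (Equations 40 and 41) can be sketched numerically. The following is a minimal NumPy illustration; the function name, the uniform logarithmic grid, and the eps regularizer are assumptions for the sketch, not part of the published method:

```python
import numpy as np

def deconvolve_profile(S_u, f_b, eps=1e-6):
    """Recover f_a from S(u) = (f_a * f_b)(u) by Fourier division (cf. Eq. 41).

    S_u : pseudo-FID resampled on a uniform grid in u = ln(t_p/t_d)
    f_b : response function sampled on the same logarithmic grid
    eps : hypothetical regularizer damping near-zero spectral components of f_b
    """
    F_S = np.fft.fft(S_u)
    F_b = np.fft.fft(f_b, n=S_u.size)
    # Division in the Fourier domain inverts the convolution of Eq. 40
    return np.real(np.fft.ifft(F_S / (F_b + eps)))
```

With v = ln(z_0/z), the recovered f_a(v) then maps onto the one-dimensional profile through \tilde{\rho}(z) = f_a(v)/z.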
The width of the Zeeman-perturbed NQR spectrum is proportional to the local Zeeman field (Matsui et al., 1990). In a zero-crossing magnetic field gradient, the nuclei in the zero-field region show sharp resonances, while nuclei experiencing a field show weak and broad resonances. Therefore, the sample slice localized within the B = 0 plane shows an unchanged spectrum, while slices away from the zero-field plane show broad or even vanishing spectra. If the field gradient is large enough, signals outside the zero-field slice virtually disappear. Two Helmholtz coils of similar diameter, separated by a distance L and carrying current in opposite directions, produce the magnetic field gradient; other coil designs can of course be used. Varying the electric current ratio between the Helmholtz coils shifts the zero-field plane. Each slice can be imaged as described previously using the SEXI method. Robert and Pusiol (1996a,b) applied ρ-NQRI with slice selection to a test object filled with p-dichlorobenzene. Results are shown in Figure 8 for two different orientations of the test object. In both cases, the geometry of the test object is well resolved, and slice selection proves a substantial improvement of the ρ-NQRI method. Recently, a new imaging technique developed by Robert and Pusiol (1997) was reported. This method, instead of
Figure 8. (A) Schematic arrangement of the surface RF and the selective magnetic gradient coils, together with the cross-section of the object used for the test experiment. The selective static field gradient (∇B) is applied in a direction normal to the desired image plane. (B) Top: pseudo-FID and profile along the cylindrical symmetry axis of the object imaged without external magnetic field. Middle: magnetic gradient selecting the right-hand cylinder. Bottom: after shifting the B = 0 plane towards the central cylinder. (C) Coil and test object arrangement for the 2D imaging experiment, and cross-section images for the selection of the central cylinder (top) and of the external p-dichlorobenzene disks (bottom). From Robert and Pusiol (1996a).
retrieving the second spatial dimension by rotating the sample, used a second surface RF coil perpendicular to the first one. In the "bidimensional rotating-frame imaging technique" (2D ρ-NQRI), orthogonal RF gradients along the x and y axes are applied. A constant time interval T is introduced between the two pulses t_x and t_y; this is necessary to remove the transverse coherence created by the first pulse (Robert and Pusiol, 1997). The 2D experiment is carried out by incrementing t_x and t_y. Alternatively, the second pulse can be replaced by the SEXI pulse train described previously (Robert et al., 1996), and the complete 2D signal is obtained by incrementing only t_x. The resulting FIDs depend on both the x and y coordinates:

    F(t_x, t_y) \propto \int_0^{\infty} dx \int_0^{\infty} dy \, \rho(x, y)\, S_1(t_x)\, S_2(t_y)    (42)
RESONANCE METHODS
where

    S_1(t_x) = \int_0^{2\pi} d\phi_1 \int_0^{\pi} \sin\theta_1 \, \cos[\omega_1 \lambda(\theta_1,\phi_1)\, t_x]\, d\theta_1    (43)

and

    S_2(t_y) = \int_0^{2\pi} d\phi_2 \int_0^{\pi} \sin\theta_2 \, \lambda(\theta_2,\phi_2) \sin[\omega_2 \lambda(\theta_2,\phi_2)\, t_y]\, d\theta_2    (44)
λ(θ,φ) is proportional to R(θ,φ) from Equation 20. The corresponding 2D image is produced by determining the two-dimensional spin density function ρ(x, y). The 2D image reconstruction is performed by a "true" 2D version of the maximum entropy method after RF field correction (Robert and Pusiol, 1997).

Temperature, Stress, and Pressure Imaging.

As mentioned earlier, temperature strongly affects the quadrupolar frequency. Although this is an inconvenience for the spatially resolved imaging techniques, where the temperature must remain constant during acquisition to avoid undesired artifacts or loss of resolution, the temperature shift, as well as the effect of applying pressure or stress to the sample, can in fact be exploited to detect temperature and pressure gradients, giving spectroscopic resolution in the second dimension. The spatial resolution in the first dimension can be obtained using the initial ρ-NQRI technique (Rommel et al., 1991a; Nickel et al., 1994), with no rotation of the sample and with a surface coil to produce the RF field gradient. To study the effect of temperature (Rommel et al., 1991b), a temperature gradient can be applied by placing water baths at different temperatures on each side of the sample. Data are collected in the usual way and are processed by 2D fast Fourier transform (FFT; Brigham, 1974). Figure 9, panel A, shows the ρ-NQRI image of the test object and clearly reveals the temperature gradients on the three sample layers through their corresponding frequency shifts. Using this technique, temperature gradients can be resolved to 1°C/mm. MEM and deconvolution methods can greatly improve the resolution, and experimental times can be much reduced by using the pulse-train method (Robert et al., 1996) instead of the multiple-experiment method. To study pressure and stress, a similar procedure can be applied (Nickel et al., 1994).
The probe contains two sample compartments, the first serving as a reference while pressure is exerted on the second. Images are recorded for different applied pressures. If the sample is "partially embedded" in soft rubber or another soft matrix, broadening due to the localized distribution of stress does not appear, and only frequency shifts are observed in the image. If the pressure coefficient (in kHz/MPa) is known, pressures can be calculated from the corresponding frequency shifts. Figure 9, panel B, shows the shifting effect of applied pressure on the test sample. In the case of harder matrices or pure crystals, the frequency shift is accompanied by a line broadening caused by the local
Figure 9. (A) Two-dimensional representation of temperature imaging experiments. The vertical axis represents the spatial distribution of the (cylindrical) sample, whereas the horizontal axis is the line shift corresponding to the local temperature (from Rommel et al., 1991a). (B) Spatial distribution of the 127I NQR spectra of lithium iodate embedded in rubber in a two-compartment sample arrangement. Top: no applied pressure. Middle and bottom: applied pressures of 7 MPa and 14 MPa, respectively (from Nickel et al., 1994).
stress distribution within the sample. This broadening can even hide the shifting effect of the applied stress or pressure, so that the image shows only a decrease in intensity (Nickel et al., 1994).

Field Cycling Methods.

Field cycling techniques are of potential interest for imaging materials containing quadrupolar nuclei dipolar-coupled to an abundant spin-1/2 nucleus (e.g., protons), and for detecting signals when the quadrupolar frequency is too low or the signal too weak to be observed directly by pure NQR spectroscopy. Recently, Lee et al. (1993) applied field cycling to the one-dimensional 11B imaging of a phantom of boric acid. The experiment consisted of a period at high field, during which the proton magnetization is built up, followed by a
rapid passage to zero field, during which the magnetization is transferred to the quadrupolar nuclei. RF irradiation at the quadrupolar frequency is applied, and the remaining magnetization is transferred back to the protons by rapidly bringing the sample back to high field. Finally, the proton FID is obtained after a solid echo. Spatial resolution is obtained by translating the small irradiation RF coil, which produces the field gradient, along the sample (Lee et al., 1993). The percentage of "recovered magnetization" is obtained from two reference experiments, the maximum and minimum values of the recovered magnetization corresponding to "normal" and long residence times at zero field with no irradiation. Despite the low resolution, the image accurately represents the object, and the experiment demonstrated the applicability of the technique. Using more elaborate coil designs, the same research group (Lee and Butler, 1995) produced 14N 2D images spatially resolved in one dimension and frequency-resolved in the other. The image was produced by stepping both the RF coil position and the irradiation frequency at zero field. It may be worth noting that field cycling imaging could also be achieved by performing the "level crossing double resonance NQR nutation spectroscopy" experiment (Blinc and Selinger, 1992), described earlier, with a surface RF coil arrangement to access the spatial information as in the ρ-NQRI techniques. If applicable, this would lead to improved image resolution. So far, in contrast to its high-field equivalent, and especially compared with medical imaging, the spatial resolution of the NQR imaging techniques is still fairly poor (∼1 mm). Nevertheless, recent advances argue convincingly for the future importance of NQR imaging in materials science. Resolution improvements can be achieved by new coil arrangements and deconvolution algorithms. The limiting factor in resolution is ultimately the intrinsic linewidth of the signal.
The future of NQR imaging may reside in the development of line-narrowing techniques.

DATA ANALYSIS AND INITIAL INTERPRETATION

Most modern NMR spectrometers contain a full package of one- and two-dimensional data processing routines, and the interested reader is referred to the spectrometer documentation. Very briefly, processing time-domain NMR data (free induction decays, or FIDs) is done in five steps.

1. Baseline correction is applied to remove the DC offset in the FID.

2. Usually, the first few points of the free induction decay are contaminated by pulse ringdown and give rise to artifacts. They must be removed by left-shifting all data sets by several points.

3. Exponential multiplication is usually performed to increase the signal-to-noise ratio. This is done by multiplying the real and imaginary parts by the function exp(−Ct), where t is the time in seconds and C is a constant in Hz. The larger C is, the faster the decay and the larger the broadening in the spectrum after the Fourier transform. This is
equivalent to convolving the frequency-domain spectrum with a Lorentzian function. Other apodization functions, such as sine-bell and Gaussian multiplication, can also be used.

4. Spectral resolution can also be improved by doubling the size of the data set. This is done by filling the second half of the time-domain data with zeros (zero filling).

5. Fourier transformation converts the data from the time domain to the frequency domain, where the spectrum is analyzed. The theory of the fast Fourier transform (FFT) has been described in detail elsewhere and will not be discussed here; we refer the reader instead to Brigham (1974).

Two-dimensional data processing in general repeats this procedure in a second dimension, although there are some mathematical niceties involved. Any modern handbook of NMR spectroscopy, such as Ernst et al. (1987), deals with the details.
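The five steps can be sketched in a few lines of NumPy. The function below is an illustrative sketch only (the function name, parameter names, and defaults are assumptions, not spectrometer code):

```python
import numpy as np

def process_fid(fid, dwell, shift_pts=4, lb=50.0):
    """Illustrative one-dimensional FID processing (hypothetical helper).

    fid      : complex ndarray, raw free induction decay
    dwell    : dwell time (s) between points
    shift_pts: number of ringdown-contaminated points to discard
    lb       : line-broadening constant C (s^-1) for exp(-C t) apodization
    """
    fid = fid - fid[-fid.size // 8:].mean()   # 1. baseline (DC) correction, estimated from the tail
    fid = fid[shift_pts:]                     # 2. left-shift: drop pulse-ringdown points
    t = np.arange(fid.size) * dwell
    fid = fid * np.exp(-lb * t)               # 3. exponential multiplication (Lorentzian broadening)
    fid = np.pad(fid, (0, fid.size))          # 4. zero filling: double the data-set size
    return np.fft.fftshift(np.fft.fft(fid))   # 5. FFT to the frequency domain
```

The apodization constant trades signal-to-noise against resolution, exactly as described in step 3.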
PROBLEMS

Sensitivity

As with most radiofrequency spectroscopy methods, the overriding problem is sensitivity. Boltzmann populations and magnetic moments of quadrupolar nuclei are small, and their direct detection, either by nuclear induction or by SQUID methods, often stretches spectrometers to the limit. Offsetting these sensitivity difficulties is the fact that longitudinal relaxation times are often short, and so very rapid signal averaging is often possible; in fact, typical data acquisition rates are limited not by relaxation but by thermal heating of the sample by the RF pulses. Sensitivity depends on a host of factors: linewidth, relaxation times, the quadrupole coupling constant and gyromagnetic ratio of the nucleus, the amount of sample, and, of course, the quality of the spectrometer. As an order-of-magnitude estimate, a "typical" NQR nucleus (63Cu) with a quadrupole coupling constant in the range 20 to 25 MHz can probably be detected in amounts of 1 to 10 mM.

Spurious Signals

As in any form of spectroscopy, it is essential to distinguish between signals arising from the sample and artifactual resonances from the sample container, probe, or elsewhere in the experimental apparatus. Artifacts extraneous to the sample can generally be detected by adequate controls, and with experience can be recognized and discounted; our probes, for example, have an impressive-looking NQR signal at 32.0 MHz, apparently due to a ceramic in the capacitors! A harder problem is identifying and assigning multiple NQR resonances in the same material; in contrast to NMR, at zero field the identity of a resonance cannot simply be determined from its frequency. The particular nucleus giving rise to an NQR signal can be identified with certainty if there exists another isotope of that element. For example, natural-abundance copper has two isotopes, 63Cu and 65Cu, with
a well-known ratio of quadrupole moments. Similar pairs are 35Cl and 37Cl, 79Br and 81Br, 121Sb and 123Sb, etc. Where a second isotope does not exist, or where additional confirmation is required, the gyromagnetic ratio of the nucleus can be measured with a nutation experiment (see Practical Aspects of the Method). Two signals from the same nucleus arising from different chemical species are the most difficult problem to deal with. Such signals generally cannot be assigned by independent means, but must be inferred from the composition and chemistry of the sample. This is an area where much more work needs to be done.

Dynamics

The broadening and disappearance of NQR lines caused by dynamical processes on a time scale comparable to the NQR frequency is a useful means of quantifying dynamics, if such processes are recognized; however, they may also cause expected NQR signals to be very weak or absent. We suspect the tendency of NQR signals to be idiosyncratic in intensity is largely due to this effect. If NQR lines are missing or unaccountably weak, and dynamics is suspected to be the culprit, cooling the sample will often alleviate the problem.

Heterogeneous Broadening

Finally, another limitation on the detectability of NQR signals is heterogeneous broadening. In pure samples of perfect crystals, NQR lines can often be very narrow, but such systems are seldom of interest to the materials scientist; more often the material contains considerable physical or chemical heterogeneity. The spectrum of poly-4-chlorostyrene in Figure 5 is an example of chemically heterogeneous broadening; the coupling constants of the chlorines are distributed over 1 MHz due to local variations in the chemical environment of the aromatic chlorines. Similar broadening is observed, for example, in the copper NQR signals of high-Tc superconductors. Physical heterogeneous broadening is induced by variations in temperature or strain over the sample.
Simply acquiring NQR data too fast will generally induce inhomogeneous RF heating of the sample, and since NQR resonances are exquisitely temperature dependent, the result is a highly broadened and shifted line. Strain broadening can be demonstrated simply by grinding a crystalline NQR sample with a mortar and pestle; the strain induced by grinding will often broaden the lines by tens or hundreds of kilohertz. Obviously, the former effect can be avoided by limiting the rate of data acquisition, and the latter by careful sample handling. When working with crystalline solids, we avoid even pressing the material into the sample tube, packing it instead by agitation.
ACKNOWLEDGMENTS

We thank Young-Sik Kye for permission to include unpublished results. The data in Figure 5 were collected by Solomon Arhunmwunde, who died under tragic circumstances in 1996. This project was funded by NSF under grant number MCB 9604521.

LITERATURE CITED

Abragam, A. 1961. The Principles of Nuclear Magnetism. Oxford University Press, New York.

Bielecki, A., Zax, D., Zilm, K., and Pines, A. 1983. Zero field nuclear magnetic resonance. Phys. Rev. Lett. 50:1807–1810.

Blinc, R. and Selinger, J. 1992. 2D methods in NQR spectroscopy. Z. Naturforsch. 47a:333–341.

Blinc, R., Mali, M., Osredkar, R., Prelesnik, A., Selinger, J., Zupancic, I., and Ehrenberg, L. 1972. Nitrogen-14 NQR (nuclear quadrupole resonance) spectroscopy of some amino acids and nucleic bases via double resonance in the laboratory frame. J. Chem. Phys. 57:5087.

Bloom, M., Hahn, E. L., and Herzog, B. 1955. Free magnetic induction in nuclear quadrupole resonance. Phys. Rev. 97:1699–1709.

Brigham, E. O. 1974. The Fast Fourier Transform. Prentice-Hall, Englewood Cliffs, N.J.

Burns, G. and Wikner, E. G. 1961. Antishielding and contracted wave functions. Phys. Rev. 121:155–158.

Clarke, J. 1994. Low frequency nuclear quadrupole resonance with SQUID amplifiers. Z. Naturforsch. 49a:5–13.

Connor, C., Chang, J., and Pines, A. 1990. Magnetic resonance spectrometer with a dc SQUID detector. Rev. Sci. Instrum. 61:1059–1063.

Creel, R. B. 1983. Analytic solution of fourth degree secular equations: I = 3/2 Zeeman-quadrupole interactions and I = 7/2 pure quadrupole interaction. J. Magn. Reson. 52:515–517.

Creel, R. B., Brooker, H. R., and Barnes, R. G. 1980. Exact analytic expressions for NQR parameters in terms of the transition frequencies. J. Magn. Reson. 41:146–149.

Das, T. P. and Bersohn, R. 1956. Variational approach to the quadrupole polarizability of ions. Phys. Rev. 102:733–738.

Das, T. P. and Hahn, E. L. 1958. Nuclear Quadrupole Resonance Spectroscopy. Academic Press, New York.

Edmonds, D. T. and Speight, P. A. 1972. Nuclear quadrupole resonance of 14N in pyrimidines, purines and their nucleosides. J. Magn. Reson. 6:265–273.

Ernst, R. R., Bodenhausen, G., and Wokaun, A. 1987. Principles of Nuclear Magnetic Resonance in One and Two Dimensions. Clarendon, Oxford.

Feiock, F. D. and Johnson, W. R. 1969. Atomic susceptibilities and shielding factors. Phys. Rev. 187:39–50.

Goldman, M. 1990. Spin 1/2 description of spins 3/2. Adv. Magn. Reson. 14:59.

Harbison, G. S. and Slokenbergs, A. 1990. Two-dimensional nutation echo nuclear quadrupole resonance spectroscopy. Z. Naturforsch. 45a:575–580.

Harbison, G. S., Slokenbergs, A., and Barbara, T. S. 1989. Two-dimensional zero field nutation nuclear quadrupole resonance spectroscopy. J. Chem. Phys. 90:5292–5298.

Hoult, D. I. 1979. Rotating frame zeugmatography. J. Magn. Reson. 33:183–197.

Jeener, J., Meier, B. H., Bachmann, P., and Ernst, R. R. 1979. Investigation of exchange processes by two-dimensional NMR spectroscopy. J. Chem. Phys. 71:4546.

Kimmich, R. 1980. Field cycling in NMR relaxation spectroscopy: Applications in biological, chemical and polymer physics. Bull. Magn. Reson. 1:195.
Kimmich, R., Rommel, E., Nickel, P., and Pusiol, D. 1992. NQR imaging. Z. Naturforsch. 47a:361–366.

Koenig, S. H. and Schillinger, W. S. 1969. Nuclear magnetic relaxation dispersion in protein solutions. J. Biol. Chem. 244:3283–3289.

Kye, Y.-S. 1998. The nuclear quadrupole coupling constant of the nitrate ion. Ph.D. thesis, University of Nebraska at Lincoln.

Lahiri, J. and Mukherji, A. 1966. Self-consistent perturbation. II. Calculation of quadrupole polarizability and shielding factor. Phys. Rev. 141:428–430.

Lahiri, J. and Mukherji, A. 1967. Electrostatic polarizability and shielding factors for ions of argon configuration. Phys. Rev. 155:24–25.

Langhoff, P. W. and Hurst, R. P. 1965. Multipole polarizabilities and shielding factors from Hartree-Fock wave functions. Phys. Rev. 139:A1415–A1425.

Lauterbur, P. C. 1973. Image formation by induced local interactions: Examples employing nuclear magnetic resonance. Nature 242:190–191.

Lee, Y. and Butler, L. G. 1995. Field cycling 14N imaging with spatial and frequency resolution. J. Magn. Reson. A112:92–95.

Lee, Y., Michaels, D. C., and Butler, L. G. 1993. 11B imaging with field cycling NMR as a line narrowing technique. Chem. Phys. Lett. 206:464–466.

Liao, M. Y. and Harbison, G. S. 1994. The nuclear hexadecapole interaction of iodine-127 in cadmium iodide measured using zero-field two-dimensional nuclear magnetic resonance. J. Chem. Phys. 100:1895–1901.

Mackowiak, M. and Katowski, P. 1993. Application of maximum entropy methods in NQR data processing. Appl. Magn. Reson. 5:433–443.

Mackowiak, M. and Katowski, P. 1996. Enhanced information recovery in 2D on- and off-resonance nutation NQR using the maximum entropy method. Z. Naturforsch. 51a:337–347.

Markworth, A., Weiden, N., and Weiss, A. 1987. Microcomputer-controlled 4-π-Zeeman split NQR spectroscopy of 35Cl in single-crystal p-chloroanilinium trichloroacetate: Crystal structure and activation energy for the bleaching-out process. Ber. Bunsenges. Phys. Chem. 91:1158–1166.

Matsui, S., Kose, K., and Inouye, T. 1990. An NQR imaging experiment on a disordered solid. J. Magn. Reson. 88:186–191.

Millar, J. M., Thayer, A. M., Bielecki, A., Zax, D. B., and Pines, A. 1985. Zero field NMR and NQR with selective pulses and indirect detection. J. Chem. Phys. 83:934–938.

Morino, Y. and Toyama, M. 1961. Zeeman effect of the nuclear quadrupole resonance spectrum in crystalline powder. J. Chem. Phys. 35:1289–1296.

Nickel, P. and Kimmich, R. 1995. 2D exchange NQR spectroscopy. J. Mol. Struct. 345:253–264.

Nickel, P., Rommel, E., Kimmich, R., and Pusiol, D. 1991. Two-dimensional projection/reconstruction rotating-frame NQR imaging (ρ NQRI). Chem. Phys. Lett. 183:183–186.

Nickel, P., Robert, H., Kimmich, R., and Pusiol, D. 1994. NQR method for stress and pressure imaging. J. Magn. Reson. A111:191–194.

Noack, F. 1986. NMR field cycling spectroscopy: Principles and applications. Prog. NMR Spectrosc. 18:171–276.

Pound, R. V. 1951. Nuclear spin relaxation times in a single crystal of LiF. Phys. Rev. 81:156.

Pratt, J. C., Raghunathan, P., and McDowell, C. A. 1975. Transient response of a quadrupolar system in zero applied field. J. Magn. Reson. 20:313–327.

Raghavan, P. 1989. Table of nuclear moments. At. Data Nucl. Data Tables 42:189–291.

Ramachandran, R. and Oldfield, E. 1984. Two-dimensional Zeeman nuclear quadrupole resonance spectroscopy. J. Chem. Phys. 80:674–677.

Robert, H. and Pusiol, D. 1996a. Fast ρ-NQR imaging with slice selection. Z. Naturforsch. 51a:353–356.

Robert, H. and Pusiol, D. 1996b. Slice selection in NQR spatially resolved spectroscopy. J. Magn. Reson. A118:279–281.

Robert, H. and Pusiol, D. 1997. Two-dimensional rotating-frame NQR imaging. J. Magn. Reson. 127:109–114.

Robert, H., Pusiol, D., Rommel, E., and Kimmich, R. 1994. On the reconstruction of NQR nutation spectra in solids with powder geometry. Z. Naturforsch. 49a:35–41.

Robert, H., Minuzzi, A., and Pusiol, D. 1996. A fast method for the spatial encoding in rotating-frame NQR imaging. J. Magn. Reson. A118:189–194.

Rommel, E., Kimmich, R., and Pusiol, D. 1991a. Spectroscopic rotating-frame NQR imaging (ρ NQRI) using surface coils. Meas. Sci. Technol. 2:866–871.

Rommel, E., Nickel, P., Kimmich, R., and Pusiol, D. 1991b. Rotating-frame NQR imaging. J. Magn. Reson. 91:630–636.

Rommel, E., Nickel, P., Rohmer, F., Kimmich, R., Gonzales, C., and Pusiol, D. 1992a. Two-dimensional exchange spectroscopy using pure NQR. Z. Naturforsch. 47a:382–388.

Rommel, E., Kimmich, R., Robert, H., and Pusiol, D. 1992b. A reconstruction algorithm for rotating-frame NQR imaging (ρ NQRI) of solids with powder geometry. Meas. Sci. Technol. 3:446–450.

Selinger, J., Zagar, V., and Blinc, R. 1994. 1H-14N nuclear quadrupole double resonance with multiple frequency sweeps. Z. Naturforsch. 49a:31–34.

Sen, K. D. and Narasimhan, P. T. 1974. Polarizabilities and antishielding factors in crystals. In Advances in Nuclear Quadrupole Resonance, Vol. 1 (J. A. S. Smith, ed.). Heyden and Sons, London.

Sinyavski, N., Ostafin, M., and Mackowiak, M. 1996. Rapid measurement of nutation NQR spectra in powders using an rf pulse train. Z. Naturforsch. 51a:363–367.

Stephenson, D. S. 1988. Linear prediction and maximum entropy methods in NMR spectroscopy. Prog. Nucl. Magn. Reson. Spectrosc. 20:516–626.

Sternheimer, R. M. 1963. Quadrupole antishielding factors of ions. Phys. Rev. 130:1423–1424.

Sternheimer, R. M. 1966. Shielding and antishielding effects for various ions and atomic systems. Phys. Rev. 146:140.

Sternheimer, R. M. and Peierls, R. F. 1971. Quadrupole antishielding factor and the nuclear quadrupole moments of several alkali isotopes. Phys. Rev. A3:837–848.

Thayer, A. M. and Pines, A. 1987. Zero field NMR. Acc. Chem. Res. 20:47–53.

TonThat, D. M. and Clarke, J. 1996. Direct current superconducting quantum interference device spectrometer for pulsed magnetic resonance and nuclear quadrupole resonance at frequencies up to 5 MHz. Rev. Sci. Instrum. 67:2890–2893.

TonThat, D. M., Ziegeweid, M., Song, Y. Q., Munson, E. G., Appelt, S., Pines, A., and Clarke, J. 1997. SQUID detected NMR of laser-polarized xenon at 4.2 K and at frequencies down to 200 Hz. Chem. Phys. Lett. 272:245–249.

Werner-Zwanziger, U., Ziegeweid, M., Black, B., and Pines, A. 1994. Nitrogen-14 SQUID NQR of L-Ala-L-His and of serine. Z. Naturforsch. 49a:1188–1192.
Wikner, E. G. and Das, T. P. 1958. Antishielding of nuclear quadrupole moment of heavy ions. Phys. Rev. 109:360–368.

Yesinowski, J. P., Buess, M. L., Garroway, A. N., Ziegeweid, M., and Pines, A. 1995. Detection of 14N and 35Cl in cocaine base and hydrochloride using NQR, NMR and SQUID techniques. Anal. Chem. 67:2256–2263.

Yu, H.-Y. 1991. Studies of NQR spectroscopy for spin-5/2 systems. M.S. thesis, SUNY at Stony Brook.
KEY REFERENCES

Abragam, 1961. See above.

Still the best comprehensive text on NMR and NQR theory.

Das and Hahn, 1958. See above.

A more specialized review of the theory of NQR: old but still indispensable.

Harbison et al., 1989. See above.

The first true zero-field multidimensional NMR experiment.

Robert and Pusiol, 1997. See above.

A good source for references on NQR imaging.
APPENDIX: GLOSSARY OF TERMS AND SYMBOLS

eq: the most distinct principal value of the electric field gradient at the nucleus
eQ: electric quadrupole moment of the nucleus
h: Planck's constant
H1: applied radiofrequency field in tesla
I: nuclear spin
T1, T2: longitudinal and transverse relaxation times
γ: gyromagnetic ratio of the nucleus
γ∞: Sternheimer antishielding factor of the atom or ion
η: asymmetry parameter of the electric field gradient at the nucleus
νQ: NQR resonance frequency in Hz
ωQ: NQR resonance frequency in rad s−1
ωR: intrinsic precession frequency in rad s−1; ωR = γH1

BRUNO HERREROS
GERARD S. HARBISON
University of Nebraska
Lincoln, Nebraska
ELECTRON PARAMAGNETIC RESONANCE SPECTROSCOPY INTRODUCTION Electron paramagnetic resonance (EPR) spectroscopy, also called electron spin resonance (ESR) or electron magnetic resonance (EMR), measures the absorption of electromagnetic energy by a paramagnetic center with one or more unpaired electrons (Atherton, 1993; Weil et al., 1994). In the presence of a magnetic field, the degeneracy of the electron spin energy levels is removed and transitions between
Figure 1. Energy-level splitting diagram for an unpaired electron in the presence of a magnetic field, interacting with one nucleus with I = 1/2. The energy separation between the two levels for the unpaired electron is linearly proportional to the magnetic field strength, B. Coupling to the nuclear spin splits each electron spin energy level into two. Transitions between the two electron energy levels are stimulated by microwave radiation when hν = gβB, where β is the electron Bohr magneton. Only transitions with ΔmS = ±1, ΔmI = 0 are allowed, so interaction with nuclear spins causes the signal to split into 2nI + 1 lines, where n is the number of equivalent nuclei with spin I. The microwave magnetic field, B1, is perpendicular to B. If the line shape is determined by relaxation, it is Lorentzian. The customary display in EPR spectroscopy is the first derivative of the absorption line shape. Sometimes the dispersion signal is detected instead of the absorption because the dispersion signal saturates less readily than the absorption. The dispersion signal is related to the absorption signal by the Kramers-Kronig transform.
the energy levels can be induced by supplying energy. When the energy of the microwave photons equals the separation between the energy levels of the unpaired electrons, there is absorption of energy by the sample and the system is said to be at "resonance" (Fig. 1). The fundamental equation that describes the experiment for a paramagnetic center with one unpaired electron is hν = gβB, where h is Planck's constant, ν is the frequency of the microwaves, g is a characteristic of the sample, β is the Bohr magneton, and B is the strength of the magnetic field in which the sample is placed. Typically the experiment is performed with magnetic fields such that the resonance energies are in the microwave region. If the paramagnetic center is tumbling rapidly in solution, g is a scalar quantity. When the paramagnetic center is immobilized, as in solid samples, most samples exhibit g anisotropy, and g is then represented by a matrix. Hyperfine splitting of the signal occurs due to interaction with nuclear spins and can therefore be used to identify the number and types of nuclear spins in proximity to the paramagnetic center.
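As a worked example of the resonance condition hν = gβB and the 2nI + 1 hyperfine rule, the snippet below computes the resonance field for a free-electron g value; the 9.5 GHz microwave frequency is a typical assumed X-band value, not one specified in the text:

```python
# Resonance field from h*nu = g*beta*B (CODATA constants).
h = 6.62607015e-34        # Planck constant, J s
beta = 9.2740100783e-24   # Bohr magneton, J/T
g = 2.0023                # free-electron g value
nu = 9.5e9                # assumed X-band microwave frequency, Hz
B_res = h * nu / (g * beta)   # resonance field in tesla, roughly 0.34 T (about 3400 G)

# Hyperfine splitting: n equivalent nuclei of spin I give 2nI + 1 lines.
def hyperfine_lines(n, I):
    return int(2 * n * I + 1)
```

For example, coupling to a single I = 1/2 nucleus gives a doublet, and two equivalent I = 1/2 nuclei give a 1:2:1 triplet.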
The information content of EPR arises from the ability to detect a signal and from characteristics of the signal, including integrated intensity, hyperfine splitting by nuclear spins, g value, line shape, and electron spin relaxation time. The characteristics of the signal may depend on environmental factors, including temperature, pressure, solvent, and other chemical species. Various types of EPR experiments can optimize information concerning these observables. The following list includes some commonly asked materials science questions that one might seek to answer based on these observables and some corresponding experimental design considerations. Detailed background information on the physical significance and theoretical basis for the observable parameters is provided in the standard texts cited (Poole, 1967, 1983; Pake and Estle, 1973; Eaton and Eaton, 1990a, 1997a; Weil et al., 1994). Some general information and practical considerations are given in this unit. Whenever unpaired electrons are involved, EPR is potentially the best physical technique for studying the system. Historically, the majority of applications of EPR have been to the study of organic free radicals and transition metal complexes (Abragam and Bleaney, 1970; Swartz et al., 1972; Dalton, 1985; Pilbrow, 1990). Today, these applications continue, but in the context of biological systems, where the organic radicals are naturally occurring radicals, spin labels, and spin-trapped radicals, and the transition metals are in metalloproteins (Berliner, 1976, 1979; Eaton et al., 1998). Applications to the study of materials are extensive, but deserve more attention than they have had in the past. 
Recent studies include the use of EPR to monitor the age of archeological artifacts (Ikeya, 1993), characterize semiconductors and superconductors, measure the spatial distribution of radicals in processed polymers, monitor photochemical degradation of paints, and characterize the molecular structure of glasses (Rudowicz et al., 1998). One might seek to answer the following types of questions.

1. Are there paramagnetic species in the sample? The signal could be due to organic radicals, paramagnetic metal ions, defects in materials (Lancaster, 1967) such as dangling bonds or radiation damage, or paramagnetic species intentionally added as probes. The primary concern in this case would be the presence or absence of a signal. The search for a signal might include obtaining data over a range of temperatures, because some signals can only be detected at low temperatures and others are best detected near room temperature.

2. What is the concentration of paramagnetic species? Does the concentration of paramagnetic species change as a function of the procedure used to prepare the material or of sample handling? Spectra would need to be recorded in a quantitative fashion (Eaton and Eaton, 1980, 1990a, 1992, 1997a).

3. What is the nature of the paramagnetic species? If it is due to a paramagnetic metal, what metal? If it is due to an organic radical, what is the nature of the radical? Is more than one paramagnetic species present in the sample?
One would seek to obtain spectra that are as well resolved as possible. To examine weak interactions with nuclear spins that are too small compared with line widths to be resolved in the typical EPR spectrum, but which might help to identify the species that gives rise to the EPR signal, one might use techniques such as ENDOR (electron nuclear double resonance; Atherton, 1993) or ESEEM (electron spin echo envelope modulation; Dikanov and Tsvetkov, 1992).

4. Does the sample undergo phase transitions that change the environment of the paramagnetic center? Experiments could be run as a function of temperature, pressure, or other parameters that cause the phase transition.

5. Are the paramagnetic centers isolated from each other or present in pairs, clusters, or higher aggregates? EPR spectra are very sensitive to interactions between paramagnetic centers (Bencini and Gatteschi, 1990). Strong interactions are reflected in line shapes. Weaker interactions are reflected in relaxation times.

6. What is the mobility of the paramagnetic centers? EPR spectra are sensitive to motion on the time scale of anisotropies in resonance energies that arise from anisotropies in g values and/or electron-nuclear hyperfine interactions. This time scale typically is microseconds to nanoseconds, depending on the experiment. Thus, EPR spectra can be used as probes of local mobility and microviscosity (Berliner and Reuben, 1989).

7. Are the paramagnetic centers uniformly distributed through the sample or spatially localized? This issue would be best addressed with an EPR imaging experiment (Eaton and Eaton, 1988a, 1990b, 1991, 1993a, 1995a, 1996b, 1999a; Eaton et al., 1991; Sueki et al., 1990).

Bulk magnetic susceptibility (GENERATION AND MEASUREMENT OF MAGNETIC FIELDS & MAGNETIC MOMENT AND MAGNETIZATION) also can be used to study systems with unpaired electrons, and can be used to determine the nature of the interactions between spins in concentrated spin systems.
Rather large samples are required for bulk magnetic susceptibility measurements, whereas EPR typically is limited to rather small samples. Bulk susceptibility and EPR are complementary techniques. EPR has the great advantage that it uniquely measures unpaired electrons in low concentrations, and in fact is most informative for systems that are magnetically dilute. It is particularly powerful for identification of paramagnetic species present in the sample and characterization of the environment of the paramagnetic species. In this unit, we seek to provide enough information so that a potential user can determine whether EPR is likely to be informative for a particular type of sample, and which type of EPR experiment would most likely be useful.
PRINCIPLES OF EPR

The fundamental principles of EPR are similar to those of nuclear magnetic resonance (NMR; Carrington and
McLachlan, 1967) and magnetic resonance imaging (MRI), which are described elsewhere in this volume (NUCLEAR MAGNETIC RESONANCE IMAGING). However, several major differences between the properties of unpaired electron spins and of nuclear spins result in substantial differences between NMR and EPR spectroscopy. First, the magnetogyric ratio of the electron is 658 times that of the proton, so for the same magnetic field the frequency for electron spin resonance is 658 times the frequency for proton resonance. Many EPR spectrometers operate in the 9 to 9.5 GHz frequency range (called "X-band"), which corresponds to resonance for an organic radical (g ≈ 2) at a magnetic field of 3200 to 3400 gauss [G; 1 G = 10⁻⁴ tesla (T)]. Second, electron spins couple to the electron orbital angular momentum, resulting in shorter relaxation times than are observed in NMR. Electron spin relaxation times are strongly dependent on the type of paramagnetic center. For example, typical room temperature electron spin relaxation times range from 10⁻⁶ s for organic radicals to as short as 10⁻¹² s for low-spin Fe(III). Third, short relaxation times can result in very broad lines. Even for organic radicals with relatively long relaxation times, EPR lines frequently are relatively broad due to unresolved coupling to neighboring nuclear spins. Line widths for detectable EPR signals range from fractions of a gauss to tens or hundreds of gauss. Since 1 G corresponds to about 2.8 MHz, these EPR line widths correspond to 10⁶ to 10⁸ Hz, which is much greater than NMR line widths. Fourth, coupling to the electron orbital angular momentum also results in larger spectral dispersion for EPR than for NMR. A room temperature spectrum of an organic radical might extend over 10 to 100 G. However, the spectrum for a transition metal ion might extend over one hundred to several thousand gauss.
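The X-band figures quoted above follow directly from the resonance condition hν = gβB. A small numerical sketch (the 9.4 GHz operating frequency is an illustrative choice within X-band; the constants are the standard CODATA values):

```python
# Numerical check of the X-band figures: the resonance field from
# h*nu = g*beta*B, and the ~2.8 MHz/G field-to-frequency conversion.
H = 6.62607015e-34        # Planck constant, J s
BETA = 9.2740100783e-24   # Bohr magneton, J/T

def resonance_field_gauss(freq_hz, g):
    """Field satisfying h*nu = g*beta*B, in gauss (1 G = 1e-4 T)."""
    return 1e4 * H * freq_hz / (g * BETA)

def gauss_to_hz(width_gauss, g=2.0023):
    """Convert a field-swept line width to frequency units."""
    return g * BETA * (width_gauss * 1e-4) / H

b0 = resonance_field_gauss(9.4e9, 2.0023)  # ~3354 G, inside the 3200-3400 G range
mhz_per_gauss = gauss_to_hz(1.0) / 1e6     # ~2.80 MHz per gauss
```

The same conversion shows why a 100 G line corresponds to a spread of roughly 3 × 10⁸ Hz, far broader than typical NMR lines.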
Finally, the larger magnetic moment of the unpaired electron also results in larger spin-spin interactions, so high-resolution spectra require lower concentrations for EPR than for NMR. These differences make the EPR measurement much more technically challenging than the NMR measurement. For example, pulsed Fourier transform NMR provides major advances in sensitivity; however, pulsed Fourier transform spectroscopy has more restricted applicability in EPR because a pulse of finite width cannot excite the full spectrum for many types of paramagnetic samples (Kevan and Schwartz, 1979; Kevan and Bowman, 1990). In a typical EPR experiment the sample tube is placed in a structure that is called a resonator. The most common type of resonator is a rectangular cavity. This name is used because microwaves from a source are used to set up a standing wave pattern. In order to set up the standing wave pattern, the resonator must have dimensions appropriate for a relatively narrow range of microwave frequencies. Although the use of a resonator enhances the signal-to-noise (S/N) ratio for the experiment, it requires that the microwave frequency be held approximately constant and that the magnetic field be swept to achieve resonance. The spectrometer detects the microwave energy reflected from the resonator. When spins are at resonance, energy is absorbed from the standing wave pattern, the reflected energy decreases, and a signal is detected. In a typical continuous wave (CW) EPR experiment, the
Figure 2. Block diagram of an EPR spectrometer. The fundamental modules of a CW EPR spectrometer include: the microwave system comprising source, resonator, and detector; the magnet system comprising power supply and field controller; the magnetic field modulation and phase-sensitive detection system; and the data display and manipulation system. Each of these subsystems can be largely independent of the others. In modern spectrometers there is a trend toward increasing integration of these units via interfaces to a computer. The computer is then the controller of the spectrometer and also provides the data display and manipulation. In a pulse system a pulse timing unit is added and magnetic field modulation is not used.
magnetic field is swept and the change in reflected energy (the EPR signal) is recorded. To further improve S/N, the magnetic field is usually modulated (100 kHz modulation is commonly used), and the EPR signal is detected using a phase-sensitive detector at this modulation frequency. This detection scheme produces a first derivative of the EPR absorption signal. A sketch of how magnetic field modulation results in a derivative display is found in Eaton and Eaton (1990a, 1997a). Because taking a derivative is a form of resolution enhancement and EPR signals frequently are broad, it is advantageous for many types of samples to work directly with the first-derivative display. Thus, the traditional EPR spectrum is a plot of the first derivative of the EPR absorption as a function of magnetic field. A block diagram for a CW EPR spectrometer is given in Figure 2.

Types of EPR Experiments

One can list over 100 separate types of EPR spectra. One such list is in an issue of the EPR Newsletter (Eaton and Eaton, 1997b). For this unit we focus primarily on experiments that can be done with commercial spectrometers. For the near future, those are the experiments that are most likely to be routinely available to materials scientists. In the following paragraphs we outline some of the more common types of experiments and indicate what characteristics of the sample might indicate the desirability of a particular type of experiment. Experimental details are reserved for a later section.

1. CW Experiments at X-band with a Rectangular Resonator
This category represents the vast majority of experiments currently performed, whether for routine analysis or for research, in materials or in biomedical areas. In this experiment the microwaves are on continuously and the magnetic field is scanned to achieve resonance. This is the experiment that was described in the introductory
Figure 3. Room temperature CW spectrum of a nitroxyl radical obtained with a microwave frequency of 9.09 GHz. The major three-line splitting is due to coupling to the nitroxyl nitrogen (I = 1) and the small doublet splitting is due to interaction with a unique proton (I = 1/2). The calculated spectrum was obtained with hyperfine coupling to the nitrogen of 14.7 G, coupling to the proton of 1.6 G, and a Gaussian line width of 1.1 G due to additional unresolved proton hyperfine coupling. The g value (2.0056) is shifted from 2.0023 because of spin-orbit coupling involving the nitrogen heteroatom.
paragraphs and which will be the focus of much of this unit. Examples of spectra obtained at X-band in fluid solution and for an immobilized sample are shown in Figures 3 and 4. The parameters obtained by simulation of these spectra are indicated in the figure captions.

2. CW Experiments at Frequencies Other than X-band
Until relatively recently, almost all EPR was performed at approximately 9 GHz (X-band). However, experiments at
lower frequencies can be advantageous in dealing with larger samples and samples composed of materials that absorb microwaves strongly at X-band, as well as for resolving nitrogen hyperfine splitting in certain Cu(II) complexes (Hyde and Froncisz, 1982). Experiments at higher microwave frequencies are advantageous in resolving signals from species with small g-value differences (Krinichnyi, 1995; Eaton and Eaton, 1993c, 1999b), in obtaining signals from
Figure 4. CW spectrum of a vanadyl porphyrin in a 2:1 toluene:chloroform glass at 100 K, obtained with a microwave frequency of 9.200 GHz. The anisotropy of the g and hyperfine values results in different resonant conditions for the molecules as a function of orientation with respect to the external field. For this random distribution of orientations, the EPR absorption signal extends from about 2750 to 3950 G. The first-derivative display emphasizes regions of the spectrum where there are more rapid changes in slope, so "peaks" in the first-derivative curve occur at extrema in the powder distribution. These extrema define the values of g and A along the principal axes of the magnetic tensors. The hyperfine splitting into eight approximately equally spaced lines is due to the nuclear spin (I = 7/2) of the vanadium nucleus. The hyperfine splitting is greater along the normal to the porphyrin plane (z axis) than in the perpendicular plane. The calculated spectrum was obtained with gx = 1.984, gy = 1.981, gz = 1.965, Ax = 55 × 10⁻⁴ cm⁻¹, Ay = 53 × 10⁻⁴ cm⁻¹, and Az = 158 × 10⁻⁴ cm⁻¹.
paramagnetic centers with more than one unpaired electron, and in analyzing magnetic interactions in magnetically concentrated samples (Date, 1983). In principle, experiments at higher microwave frequency should have higher sensitivity than experiments at X-band, due to the larger difference in the Boltzmann populations of the electron spin states. Sensitivity varies with (frequency)^(11/4) if the microwave magnetic field at the sample is kept constant and the size of the resonator is scaled inversely with frequency (Rinard et al., 1999). This is a realistic projection for very small samples, as might occur in some materials research applications. To achieve the expected increase in sensitivity (Rinard et al., 1999) will require substantial engineering improvements in the sources, resonators, and detectors used at higher frequency. Computer simulations of complicated spectra as a function of microwave frequency provide a much more stringent test of the parameters than can be obtained at a single frequency.

3. Electron-Nuclear Double Resonance (ENDOR)
Due to the large line widths of many EPR signals, it is difficult to resolve couplings to nuclear spins. However, these couplings often are key to identifying the paramagnetic center that gives rise to the EPR signal. By simultaneously applying radio frequency (RF) and microwave frequency energies, one can achieve double resonance of both the electron and nuclear spins. The line widths of the signals in this experiment are much narrower than typical EPR line widths and the total number of lines in the spectrum is smaller, so it is easier to identify the nuclear spins that are interacting with the electron spin in an ENDOR spectrum (Box, 1977; Schweiger, 1982; Atherton, 1993; Piekara-Sady and Kispert, 1994; Goslar et al., 1994) than in an experiment without the RF, i.e., a "normal" CW spectrum.

4. Pulsed/Fourier Transform EPR
Measurements of relaxation times via pulsed or time-domain techniques typically are much more accurate than continuous-wave measurements. Pulse sequences also can be tailored to obtain detailed information about the spins. However, the much shorter relaxation times for electron spins than for nuclear spins limit pulse sequences to shorter times than can be used in pulsed NMR (Kevan and Schwartz, 1979). A technique referred to as electron spin echo envelope modulation (ESEEM) is particularly useful in characterizing interacting nuclear spins, including applications in materials (Kevan and Bowman, 1990). Interaction between inequivalent unpaired electrons can cause changes in relaxation times that depend on the distance between the spins (Eaton and Eaton, 1996a).

5. Experiments with Resonators Other than the Rectangular Resonator
New lumped-circuit resonators, such as the loop-gap resonator (LGR) and others designed on this principle (Hyde and Froncisz, 1986) can be adapted to a specific
experiment to optimize aspects of the measurement (Rinard et al., 1993, 1994, 1996a, 1996b). These structures are particularly important for pulsed experiments and for experiments at frequencies lower than X-band.
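The (frequency)^(11/4) sensitivity scaling quoted under item 2 implies very large projected gains at high frequency. A one-line sketch, with a 9.5 GHz reference frequency assumed for illustration:

```python
# Projected sensitivity gain at higher microwave frequency, using the
# (frequency)^(11/4) scaling of Rinard et al. (1999). Assumes a constant
# microwave magnetic field at the sample and a resonator scaled inversely
# with frequency; the 9.5 GHz reference is an illustrative choice.
def sensitivity_gain(freq_ghz, ref_ghz=9.5):
    return (freq_ghz / ref_ghz) ** (11.0 / 4.0)

gain_95ghz = sensitivity_gain(95.0)  # a ten-fold frequency step: ~560x projected gain
```

As the text notes, realizing such gains in practice awaits engineering improvements in high-frequency sources, resonators, and detectors.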
PRACTICAL ASPECTS OF THE CW METHOD

The selection of the microwave power and modulation amplitude for recording CW EPR signals is very important in obtaining reliable results. We therefore give a brief discussion of issues related to the selection of these parameters. Greater detail concerning the selection of parameters can be found in Eaton and Eaton (1990a, 1997a).

Microwave Power

If microwave power is absorbed by the sample at a rate that is faster than the sample can dissipate energy to the lattice (i.e., faster than the electron spin relaxation rate), the EPR signal will not be proportional to the number of spins in the sample. This condition is called saturation. The power that can be used without saturating the sample depends on the relaxation rate. To avoid saturation of spectra, much lower powers must be used for samples that have long relaxation times, such as organic radicals, than for samples with much shorter relaxation times, such as transition metals. Relaxation times can range over more than 18 orders of magnitude, from hours to 10⁻¹⁴ s, with a wide range of temperature dependence, from T⁹ to temperature-independent, and a wide variety of magnetic field dependence and/or dependence on position in the EPR spectrum (Du et al., 1995). One can test for saturation by recording signal amplitude at a series of microwave powers, P, and plotting signal amplitude as a function of √P. This plot is called a power-saturation curve. The point where the signal dependence on √P begins to deviate from linearity is called the onset of saturation. Data must be recorded at a power level below the onset of saturation in order to use integrated data to determine the spin concentration in the sample. Higher powers also cause line broadening, so powers should be kept below the onset of saturation if data analysis is based on line shape information.
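The power-saturation test described above can be sketched numerically. The saturation model and the half-saturation power below are illustrative assumptions, not properties of any particular sample:

```python
import math

# Sketch of a power-saturation check: signal amplitude is recorded at a
# series of powers and compared with the linear dependence on sqrt(P)
# expected in the absence of saturation.

P_HALF_MW = 10.0  # assumed power (mW) characterizing the onset of saturation

def signal_amplitude(p_mw):
    """Model amplitude: linear in sqrt(P) at low power, then saturating."""
    return math.sqrt(p_mw) / (1.0 + p_mw / P_HALF_MW)

def saturation_onset(powers_mw, tolerance=0.05):
    """First power where the amplitude falls `tolerance` below the
    low-power linear extrapolation amplitude = slope * sqrt(P)."""
    slope = signal_amplitude(powers_mw[0]) / math.sqrt(powers_mw[0])
    for p in powers_mw:
        if signal_amplitude(p) < (1.0 - tolerance) * slope * math.sqrt(p):
            return p
    return None

powers = [0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1.0, 2.0, 5.0, 10.0, 20.0]
onset = saturation_onset(powers)
```

In practice the amplitudes come from repeated scans at each power setting; the comparison against the low-power √P line is the same.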
Above the onset of saturation, the signal intensity typically increases with increasing power and goes through a maximum. If the primary goal is to get maximum S/N at the expense of line shape or quantitation, then one can operate at the power level that gives maximum signal amplitude, but under such conditions quantitation of spins can be erroneous.

Modulation Amplitude

Provided that the modulation amplitude is less than 1/10 of the peak-to-peak line width of the first-derivative EPR signal to be recorded, increasing the modulation amplitude improves S/N without distorting the line shape. However, as the modulation amplitude is increased further, it causes broadening and distortion (Poole, 1967). If the key information to be obtained from the spectrum is based on the line shape, care must be taken in selecting the modulation
amplitude. Unlike the problems that occur when too high a microwave power is used to record the spectrum, increasing the modulation amplitude does not change the linear relationship between integrated signal intensity and modulation amplitude. Thus, if the primary information to be obtained from a spectrum is the integrated signal intensity (to determine spin concentration), it can be useful to increase the modulation amplitude, and thereby improve S/N, at the expense of some line shape distortion. The maximum spectral amplitude, and therefore the best S/N, occurs at a peak-to-peak modulation amplitude about 2 times the peak-to-peak width of the derivative EPR spectrum.

Sensitivity

The most important issue concerning practical applications of EPR is sensitivity (Eaton and Eaton, 1980, 1992; Rinard et al., 1999). In many analytical methods, one can describe sensitivity simply in terms of minimum detectable concentrations. The problem is more complicated for EPR because of the wide range of line widths and large differences in power saturation among species of interest. Vendor literature cites sensitivity as about 0.8 × 10¹⁰/B spins/G, where B is the line width of the EPR signal. This is a statement of the minimum number of detectable spins with a S/N of 1. It is based on the assumption that spectra can be recorded at a power of 200 mW, with a modulation amplitude of 8 G, and that there is no hyperfine splitting that divides the intensity of the signal into multiple resolved lines. To apply this statement to estimate sensitivity for another sample requires taking account of the actual conditions appropriate for that sample. Those conditions depend upon the information that one requires from the spectra. One can consider two types of conditions, as described below.

Case 1. Selection of microwave power, modulation amplitude, and time constant of the signal detection system that provide undistorted line shapes.
This mode of operation is crucial if one wishes to obtain information concerning mobility of the paramagnetic center or partially resolved hyperfine interactions. Typically this would also require a S/N significantly greater than 1. This case might be called "minimum number of detectable spins with desired spectral information." It would require the use of a microwave power that did not cause saturation and a modulation amplitude that did not cause line broadening.

Case 2. Selection of microwave power, modulation amplitude, and time constant of the signal detection system that provide the maximum signal amplitude, at the expense of line shape information.

This mode of operation might be selected, for example, in some spin trapping experiments, where one seeks to determine whether an EPR signal is present and one can obtain adequate information for identification of the species from observed hyperfine couplings even in the presence of line shape distortions that come from power broadening or overmodulation. This case might be called "minimum number of detectable spins."
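Both cases rest on the field-modulation detection scheme described under Principles of EPR. A numerical sketch (an assumed Lorentzian line with illustrative center field, width, and modulation amplitude) shows that, for modulation amplitude small compared with the line width, the phase-sensitive detector output is proportional to the first derivative of the absorption:

```python
import math

# For small modulation amplitude, the first Fourier harmonic of the
# modulated absorption signal (the lock-in output) is proportional to
# dL/dB. All parameters are illustrative.

def absorption(b, b0=3400.0, width=2.0):
    """Lorentzian absorption centered at b0 with half-width `width` (gauss)."""
    x = (b - b0) / width
    return 1.0 / (1.0 + x * x)

def first_harmonic(b, mod_amp=0.2, n=400):
    """Lock-in output: first cosine Fourier component over one modulation cycle."""
    total = 0.0
    for k in range(n):
        theta = 2.0 * math.pi * k / n
        total += absorption(b + 0.5 * mod_amp * math.cos(theta)) * math.cos(theta)
    return 2.0 * total / n

def derivative(b, h=1e-4):
    """Numerical first derivative of the absorption line."""
    return (absorption(b + h) - absorption(b - h)) / (2.0 * h)

# The lock-in trace tracks (mod_amp/2) times the first derivative.
ratio_low = first_harmonic(3399.0) / derivative(3399.0)
ratio_high = first_harmonic(3401.0) / derivative(3401.0)
```

Both ratios come out close to mod_amp/2 = 0.1, which is why increasing the modulation amplitude (up to the distortion limit discussed above) increases the recorded first-derivative amplitude linearly.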
To correct the minimum number of detectable spins for microwave power, multiply 0.8 × 10¹⁰/B by

√(200/Psample)    (1)

where Psample is the power incident on the critically coupled resonator, in mW. To correct for changes in modulation amplitude (MA), multiply by 8 (G)/MAsample (in gauss). For samples with narrow lines and long relaxation times, the minimum number of detectable spins may be higher than 0.8 × 10¹⁰/B by several orders of magnitude.

Calibration

Quality assurance requires a major effort in EPR (Eaton and Eaton, 1980, 1992). Because the properties of the sample affect the characteristics of the resonator, and because the microwave magnetic field varies over the sample, selecting a standard for quantitation of the number of spins requires understanding of the spin system and of the spectrometer [Poole, 1967, 1983; Wilmshurst, 1968 (see especially THERMAL ANALYSIS); Alger, 1968; Czoch and Francik, 1989]. In general, the standard should be as similar as possible to the sample to be measured, in the same solvent or solid host, and in a tube that is as close as possible to the same size. The EPR tube has to be treated as a volumetric flask, and calibrated accordingly. Determining the number of spins in the sample requires subtraction of any background signal, and then integrating twice (derivative → absorption → area). If there is background slope, the double integration may not be very accurate. All of this requires that the EPR spectrum be in digital form. If the line shape is the same for the sample and the reference, then peak heights can be used instead of integrated areas. This is sometimes difficult to ensure, especially in the wings of the spectrum, which may constitute much of the area under the spectrum. However, for fairly widespread applications, such as spin trapping and spin labeling, reasonable estimates of spin concentration can be obtained by peak-height comparison with gravimetrically prepared standard solutions. Very noisy spectra with well-defined peak positions might best be quantitated using simulated spectra that best fit the experimental spectra.
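The corrections can be combined in a short sketch. It assumes, as discussed above, that the signal amplitude scales as √P below saturation and linearly with modulation amplitude, so the minimum detectable number scales inversely with both; the sample parameters chosen are illustrative:

```python
import math

# Minimum detectable spins, from the vendor figure of 0.8e10/B spins
# (S/N = 1 at 200 mW and 8 G modulation amplitude), corrected for the
# actual operating power and modulation amplitude.

def min_detectable_spins(linewidth_gauss, power_mw, mod_amp_gauss):
    base = 0.8e10 / linewidth_gauss             # vendor figure at 200 mW, 8 G
    power_factor = math.sqrt(200.0 / power_mw)  # weaker signal at lower power
    mod_factor = 8.0 / mod_amp_gauss            # weaker signal at lower MA
    return base * power_factor * mod_factor

# A narrow-line organic radical run at low power and low modulation
# amplitude to avoid saturation and overmodulation:
n_min = min_detectable_spins(linewidth_gauss=1.0, power_mw=2.0, mod_amp_gauss=0.1)
```

For these conditions the detectable minimum is several orders of magnitude above the vendor figure, as the text notes for narrow-line, slowly relaxing samples.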
Another important calibration is that of the magnetic field scan. The best standard is an NMR gaussmeter, but note that the magnetic field varies across the pole face, and the gaussmeter probe has to be as close to the sample as possible, ideally inside the resonator. A few carefully measured samples that can be prepared reproducibly have been reported in the literature (Eaton and Eaton, 1980, 1992) and are good secondary standards for magnetic field calibration. The magnetic field control device on electromagnet-based EPR spectrometers is a Hall probe. These are inherently not very accurate devices, but EPR field control units contain circuitry to correct for their inaccuracies. Recent Bruker field controllers, for example, use a ROM containing the characteristics of the specific Hall probe with which it is matched. Such systems provide sufficient accuracy for most EPR measurements.
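The quantitation procedure outlined under Calibration (background subtraction, then integrating twice: derivative → absorption → area) can be sketched with synthetic data; the Lorentzian test line and its parameters are illustrative assumptions:

```python
# Double integration of a first-derivative EPR spectrum by the trapezoid
# rule on an evenly spaced field axis: subtract a (here constant) baseline,
# integrate once to recover the absorption, integrate again for the area.

def double_integral(field_g, deriv, baseline=0.0):
    corrected = [y - baseline for y in deriv]
    db = field_g[1] - field_g[0]
    absorption = [0.0]
    for i in range(1, len(corrected)):
        absorption.append(absorption[-1] + 0.5 * (corrected[i] + corrected[i - 1]) * db)
    area = 0.0
    for i in range(1, len(absorption)):
        area += 0.5 * (absorption[i] + absorption[i - 1]) * db
    return area

# Synthetic first derivative of a unit-amplitude Lorentzian of half-width W:
# d/dB [1 / (1 + ((B - B0)/W)^2)] = -2x / (W * (1 + x^2)^2), x = (B - B0)/W
B0, W = 3400.0, 2.0
fields = [3300.0 + 0.1 * i for i in range(2001)]  # 3300 to 3500 G
deriv = [-2 * ((b - B0) / W) / (W * (1 + ((b - B0) / W) ** 2) ** 2) for b in fields]
area = double_integral(fields, deriv)  # close to the analytic pi*W, minus the cut-off wings
```

The slight shortfall relative to the analytic area comes from truncating the Lorentzian wings at the scan limits, which is exactly the wings problem the text warns about.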
METHOD AUTOMATION

Early EPR spectrometers did not include computers. Many, but not all, of those systems have been retrofitted with computers for data acquisition. Historically, the operator spent full time at the EPR spectrometer console while the spectrometer was being operated. In the next generation, a computer was attached to the EPR spectrometer for data acquisition, but the spectrometer was still substantially manually operated. Even when computers were built into the spectrometer, there was a control console at which the operator sat. The newer spectrometers are operated from the computer, and have a console of electronic modules that control various functions of the magnet, microwave system, and data acquisition. The latest pulsed EPR spectrometers are able to execute long (including overnight) multipulse experiments without operator intervention. The most recent cryogenic temperature-control systems also can be programmed and can control the cryogen valve in addition to the heater. There is one research lab that has highly automated ENDOR spectrometers (Corradi et al., 1991). Except for these examples, EPR has not been automated to a significant degree. For many samples, the sample positioning is sufficiently critical, as is the resonator tuning and coupling, that automation of sample changing for unattended operation is less feasible than for many other forms of spectroscopy, including NMR. Automation would be feasible for multiple samples that shared the same geometry, dielectric constant, and microwave loss factor, if they could be run near room temperature. The only currently available spectrometer with an automatic sample changer is one designed by Bruker for dosimetry. This system is customized to handle a particular style of alanine dosimeter and includes software to quantitate the radical signal and to prepare reports. References to computer use for acquisition and interpretation are given by Vancamp and Heiss (1981) and Kirste (1994).
DATA ANALYSIS AND INITIAL INTERPRETATION

The wide range of queries one may make of the spin system, listed elsewhere in this unit, makes the data analysis and interpretation diverse. An example of a spectrum of an organic radical in fluid solution is shown in Figure 3. Computer simulation of the spectrum gives the g value for the radical and the nuclear hyperfine splittings, which can be used to identify the nuclear spins with the largest couplings to the unpaired electrons. An example of a spectrum of a vanadyl complex immobilized in 1:1 toluene:chloroform at 100 K is shown in Figure 4. The characteristic 8-line splitting pattern permits unambiguous assignment as a vanadium species (I = 7/2). Computer simulation of the spectrum gives the anisotropic components of the g and A values, which permits a more detailed characterization of the electronic structure of the paramagnetic center than could be obtained from the isotropic averages of these parameters observed in fluid solution. As discussed above, under instrument calibration, a first stage in data analysis and interpretation is to determine how many spins are present, as well as the magnitude of line widths and hyperfine splittings. The importance of quantitation should not be underestimated. There are many reports in the literature of detailed analysis of a signal that accounts for only a very small fraction of the spins in the sample, the others having been overlooked. The problem is particularly acute in samples that contain both sharp and broad signals. The first-derivative display tends to emphasize the sharp signals, and it is easy to overlook broad signals unless spectra are recorded under a wide range of experimental conditions. The software available as part of the spectrometer systems is increasingly powerful and versatile. Additional software is available, some commercially and some from individual labs (easily locatable via the EPR Society software exchange; http://www.ierc.scs.uiuc.edu), for simulation of more specialized spin systems. The understanding of many spin systems is now at the stage where a full simulation of the experimental line shape is a necessary step in interpreting an EPR spectrum. For S = 1/2 organic radicals in fluid solution, one should be able to fit spectra within experimental error if the radical has been correctly identified. However, there will always remain important problems for which the key is to understand the spin system, and for these systems simulation may push the state of the art. For example, for many high-spin Fe(III) systems, there is little information about zero-field splitting (ZFS) terms, so one does not even know which transitions should be included in the simulation. In pulsed EPR measurements, the first step after quantitation is to determine the relaxation times and, if relevant, the ESEEM frequencies in the echo decay by Fourier transformation of the time-domain signal. Software for these analyses and for interpretation of the results is still largely resident within individual research groups, especially for newly evolved experiments.
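To first order, the hyperfine analysis behind spectra such as Figure 3 amounts to adding a shift a·mI for each coupled nucleus. A sketch using the couplings from the Figure 3 caption (the 3240 G center field is an illustrative assumption for 9.09 GHz and g = 2.0056):

```python
# First-order stick-spectrum positions: each nucleus with coupling a (gauss)
# and spin I splits every existing line into 2I + 1 lines spaced by a.
# Couplings a_N = 14.7 G (I = 1) and a_H = 1.6 G (I = 1/2) are from the
# Figure 3 caption; the center field is an illustrative assumption.

def hyperfine_positions(center_g, couplings):
    """couplings: list of (a_gauss, nuclear_spin_I) pairs."""
    positions = [center_g]
    for a, spin_i in couplings:
        m_values = [-spin_i + k for k in range(int(2 * spin_i) + 1)]
        positions = [b + a * m for b in positions for m in m_values]
    return sorted(positions)

lines = hyperfine_positions(3240.0, [(14.7, 1.0), (1.6, 0.5)])
# 3 x 2 = 6 lines: the nitrogen triplet, each member split into a proton doublet
```

Counting lines this way (2I + 1 per nucleus, multiplied over nuclei) is often the quickest route to identifying which nuclear spins couple to the unpaired electron.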
SAMPLE PREPARATION

Samples for study by EPR can be solid, dissolved solid, liquid, or, less frequently, gas. In the gas phase the electron spin couples to the rotational angular momentum, so EPR spectroscopy has been used primarily for very small molecules. Thus, the focus of this unit is on samples in solid or liquid phases. For most EPR resonators at X-band or lower frequency, the sample is placed in a 4-mm outside diameter (o.d.) quartz tube. Usually, synthetic fused silica is used to avoid paramagnetic metal impurities. The weaker the sample signal, the higher the quality of quartz required. Pyrex or Kimax (trade names of Corning and Kimble) have very strong EPR signals. Often, oxygen must be removed from the sample because the sample reacts with oxygen, or because the paramagnetic O2 broadens EPR signals by Heisenberg exchange during collisions. The oxygen broadening is a problem for high-resolution spectra of organic radicals in solution and for the study of carbonaceous materials. This problem is so severe that one can turn it around and use oxygen broadening as a measure of oxygen concentration. EPR oximetry is a powerful analytical tool (Hyde and Subczynski, 1989).
For some problems one would want to study the concentrated solid, which may not exist in any other form. Perhaps one wants to examine the spin-spin interaction as a function of temperature. In this case, one would put the solid directly into the EPR tube, either as a powder or as a single crystal, which might be oriented. There are a number of caveats for these types of samples. First, one must avoid overloading the resonator. More sample is not always better. With a "standard" TE102 (transverse electric 102 mode cavity) rectangular resonator (which until recently was the most common EPR resonator), one cannot use more than 10¹⁷ spins (Goldberg and Crowe, 1977). If the number of spins is too high, one violates the criterion that the energy absorption on resonance is a small perturbation of the energy reflected from the resonator. This is not an uncommon problem, especially with "unknown" samples, which might contain, for example, large quantities of iron oxide. A similar problem occurs with samples that are lossy, i.e., that absorb microwaves other than by magnetic resonance (Dalal et al., 1981). Water is a prime example. For aqueous samples, and especially biological samples, there are special flat cells to keep the lossy material in a nodal plane of the microwave field in the resonator. There are special resonators to optimize the S/N for lossy samples. Usually, these operate with a TM (transverse magnetic) mode rather than a TE mode. Lower frequency can also be advantageous for lossy samples. There are many examples of lossy materials commonly encountered in the study of materials. Carbonaceous materials often have highly conducting regions. Some semiconductors are sufficiently conducting that only small amounts of sample can be put in the resonator without reducing the Q (the resonator quality factor) so much that the spectrometer cannot operate.
These problems are obvious to the experienced operator, because general-purpose commercial EPR spectrometers provide a readout of the microwave power as a function of frequency during the "tune" mode of setting up the measurement. The power absorbed by the resonator appears as a "dip" in the power-versus-frequency display. The broader the dip, the lower the resonator Q. One useful definition of Q is the resonant frequency divided by the half-power bandwidth of the resonator, Q = ν/Δν. A lossy or conducting sample lowers Q, which is evident to the operator as a broadening of the dip. If the sample has much effect on Q, care must be taken in quantitation of the EPR signal, since the signal amplitude is proportional to Q. Ideally, one would measure the Q for every sample if spin quantitation is a goal. No commercial spectrometer gives an accurate readout of Q, so the best approach is to make the Q effect of the samples under comparison as nearly the same as possible. Magnetically concentrated solids can exhibit the full range of magnetic interactions, including antiferromagnetism, ferromagnetism, and ferrimagnetism, in addition to paramagnetism, many of which are amenable to study by EPR (Bencini and Gatteschi, 1990; Stevens, 1997). Because of the strength of magnetic interactions in magnetically concentrated solids, ferrimagnetic resonance and ferromagnetic resonance have become separate subfields of magnetic resonance. High-field EPR is especially valuable for studies of magnetically concentrated solids, because it is possible in some cases to achieve external magnetic fields on the order of or greater than the internal magnetic fields. If the goal is the study of isolated spins, they need to be magnetically dilute. In practice, this means that the spin concentration in a doped solid or a liquid or solid solution should be less than about 1 mM (6 × 10^17 spins per cm^3). For narrow-line spectra, such as defect centers in solids or organic radicals in liquid or solid solution, concentrations below 1 mM are required to achieve minimum line width. For accurate measurement of relaxation times, concentrations have to be lower than those required for accurate measurement of line widths in CW spectra. For some materials, it is convenient to prepare a solution and cool it to obtain an immobilized sample. If this is done, it is important to select a solvent or solvent mixture that forms a glass, not crystals, when it is cooled. Crystallization excludes solute from the lattice and thereby generates regions of locally high concentration, which can result in poor resolution of the spectra. The discussion of lossy samples hints at part of the problem of preparing fluid solution samples. The solvent has to be nonlossy, or the sample has to be very small and carefully located. Some solvents are more lossy at a given temperature than others. Most solvents are nonlossy when frozen. Broadening by oxygen is usually not detectable in frozen-solution line widths, but there can still be an impact on relaxation times. Furthermore, paramagnetic O2 can itself yield an EPR signal in frozen solution (it is detectable in the gas phase and in the solid, but not in fluid solution) and can be mistaken for signals from the sample under study. Care must be taken when examining powdered crystalline solids to obtain a true powder (random orientation) line shape.
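The dip-width definition of Q lends itself to a simple numerical estimate. The sketch below is hypothetical (the function name and the synthetic Lorentzian dip are invented for this illustration; commercial spectrometers do not expose such a readout): it locates the resonant frequency at the dip minimum and measures the half-power bandwidth.

```python
import numpy as np

def resonator_q(freqs_hz, reflected_power):
    # Q = nu / delta-nu: resonant frequency over half-power bandwidth of the dip.
    i_min = int(np.argmin(reflected_power))
    nu0 = freqs_hz[i_min]
    baseline = (reflected_power[0] + reflected_power[-1]) / 2.0
    half_power = (reflected_power[i_min] + baseline) / 2.0
    below = np.where(reflected_power <= half_power)[0]
    delta_nu = freqs_hz[below[-1]] - freqs_hz[below[0]]
    return nu0 / delta_nu

# Synthetic Lorentzian dip at 9.5 GHz with a true Q of 3000:
nu0_true, q_true = 9.5e9, 3000.0
f = np.linspace(nu0_true - 2e7, nu0_true + 2e7, 4001)
hwhm = nu0_true / (2.0 * q_true)
dip = 1.0 - 1.0 / (1.0 + ((f - nu0_true) / hwhm) ** 2)

q_est = resonator_q(f, dip)  # close to the true Q of 3000
```

A lossy sample broadens the dip, lowering the estimate, which is why comparative quantitation requires matched Q.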
If the particles are not small enough, they may preferentially orient when placed in the sample tube. At high magnetic fields, magnetically anisotropic solids can align in the magnetic field. Either of these effects can yield spectra that are representative of special crystal orientations rather than a powder (spherical) average. The combination of these effects can dominate attempts to obtain spectra at high magnetic fields, such as at 95 GHz and above, where very small samples are needed and magnetic fields are strong enough to macroscopically align microcrystals. Although the experiments are more time-consuming, very detailed information concerning the electronic structure of the paramagnetic center can be obtained by examining single crystals. In these samples, data are obtained as a function of the orientation of the crystal, which provides a full mapping of the orientation dependence of the g-matrix and of the electron-nuclear hyperfine interaction (Byrn and Strouse, 1983; Weil et al., 1994).
SPECIMEN MODIFICATION

The issue of specimen modification is inherent in the discussion of sample preparation. EPR is appropriately described as a nondestructive technique. If the sample fits the EPR resonator, it can be recovered unchanged.
RESONANCE METHODS
However, some of the detailed sample preparations discussed above do result in modification of the sample. Grinding to achieve a true powder-average spectrum destroys the original crystal. Sometimes grinding itself introduces defects that yield EPR signals, and this could be a study in itself. Sometimes grinding exposes surfaces to reactants that convert the sample into something different. Similarly, dissolving a sample may irreversibly change it, due to reactivity or to association or dissociation equilibria. Cooling or heating a single crystal sometimes causes phase transitions that shatter the crystal. Unfortunately, freezing a solvent may shatter the quartz EPR tube, due to differences in coefficients of expansion, causing loss of the sample. These potential problems notwithstanding, EPR is a nondestructive and noninvasive methodology for the study of materials. Consider the alternatives to EPR imaging for determining the spatial distribution of paramagnetic centers. Even using EPR to monitor the radical concentration, one would grind or etch away the surface of the sample, monitoring changes in the signal after each step of surface removal. The sample is destroyed in the process. With EPR imaging, the sample is still intact and can be used for further study, e.g., additional irradiation or heat treatment (Sueki et al., 1995).

PROBLEMS

How to Decide Whether and How EPR Should Be Used

What do you want to learn about the sample? Is EPR the right tool for the task? This section attempts to guide thinking about these questions. Is there an EPR signal? The first thing we learn via EPR is whether the sample in a particular environment exhibits resonance at the selected frequency, magnetic field, temperature, microwave power, and time after a pulse. This is a nontrivial observation, and by itself may provide almost the full answer to a question about the material. One should be careful, however, about jumping to conclusions based on only qualitative EPR results.
With care it is possible to find an EPR signal in almost any sample, but that signal may not be relevant to the information desired about the sample. There are spins everywhere. Dirt and dust in the environment, even indoors, will yield an EPR signal, often with strong Mn(II) signals and large broad signals due to iron oxides. Quartz sample tubes and Dewars contain impurities that yield EPR signals close to g = 2 and slightly higher. Most often, these are "background" interferences that have to be subtracted for quantitative work. What if the sample does not exhibit a nonbackground EPR signal? Does this mean that there are no unpaired spins in the sample? Definitely not! The experienced spectroscopist is more likely to find a signal than the novice, because of the extensive number of tradeoffs needed to optimize an EPR spectrum (sample positioning, microwave frequency, resonator coupling, magnetic field sweep, microwave power, modulation frequency and amplitude, and time constant). The sample might have relaxation times such that the EPR signal is not observable at the
temperature selected (Eaton and Eaton, 1990a). In general, relaxation times increase with decreasing temperature, and there will be some optimum temperature at which the relaxation time is sufficiently long that the lines are not immeasurably broad, but not so long that the line is saturated at all available microwave power levels. Furthermore, the failure to observe an EPR signal could be a fundamental property of the spin system. If the electron spin, S, of the paramagnetic center of interest is half-integer, usually at least the transition between the m_s = ±1/2 levels will be observable at some temperature. However, for integer spin systems (S = 1, 2, 3) the transition may not be observable at all in the usual spectrometer configuration, even if the right frequency, field, temperature, and so forth are chosen (Abragam and Bleaney, 1970; Pilbrow, 1990). This is because such spin transitions are forbidden when the microwave magnetic field, B1, is perpendicular to the external magnetic field, B0, as in the usual arrangement. For integer spin systems one needs B1 parallel to B0, and a special cavity is required. Fortunately, these are commercially available at X-band. Thus, if the question is "does this sample contain paramagnetic Ni(II)?", a special parallel-mode resonator is needed. An overview of EPR-detectable paramagnetic centers and corresponding parameters can be found in Eaton and Eaton (1990a). Another reason for not observing an EPR signal, or at least the signal from the species of interest, is that spin-spin interaction has resulted in a singlet-triplet separation such that only one of these states is significantly populated under the conditions of measurement. A classic example is dimeric Cu(II) complexes, which have singlet ground states, so the signal disappears as the sample is cooled to He temperatures.
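The temperature dependence of such a singlet-triplet system follows from Boltzmann statistics. The sketch below, a minimal illustration with a hypothetical 300 cm^-1 singlet-triplet gap (the gap value and function name are assumptions, not data from the text), shows why the triplet signal of a singlet-ground-state dimer vanishes at liquid-helium temperature:

```python
import math

K_B_CM = 0.695  # Boltzmann constant in cm^-1 per K

def triplet_fraction(gap_cm, temp_k):
    # Population of the triplet (degeneracy 3) lying gap_cm above a singlet
    # ground state: f = 3 exp(-gap/kT) / (1 + 3 exp(-gap/kT))
    boltz = 3.0 * math.exp(-gap_cm / (K_B_CM * temp_k))
    return boltz / (1.0 + boltz)

f_room = triplet_fraction(300.0, 298.0)   # ~40% of dimers EPR-active at room T
f_helium = triplet_fraction(300.0, 4.2)   # essentially zero at He temperature
```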
Similarly, in fullerene chemistry it was found that the EPR signal from C60^2- is due to a thermally excited triplet state and could only be seen in a restricted temperature range (Trulove et al., 1995). Even if the goal is to measure relaxation times with pulsed EPR, one would usually start with CW survey spectra to ensure the integrity of the sample. Usually, pulsed EPR is less sensitive on a molar spin basis than CW EPR when both are optimized, because of the better noise filtering that can be done in CW EPR. However, if the relaxation time is long, it may not be possible to obtain a valid slow-passage CW EPR spectrum, and alternative techniques will be superior. Sometimes rapid-passage CW EPR, or dispersion spectra instead of absorption spectra, can be the EPR methodology of choice, especially for broad signals. However, for EPR signals that are narrow relative to the bandwidth of the resonator, pulsed FT EPR may provide both better S/N and more fidelity in line shape than CW EPR. This is the case for the signal of the E′ center in the γ-irradiated fused quartz sample that has been proposed as a standard sample for time-domain EPR (Eaton and Eaton, 1993b; available from Wilmad Glass). How many species are there? In many forms of spectroscopy one can monitor isosbestic points to determine limits on the number of species present. Note that, since the common CW EPR spectrum is a derivative display, one should in general not interpret spectral positions that
are constant as a function of some variable as isosbestic points. Because of the derivative display they are isoclinic points (points of the same slope), not points of equal absorption. Integrated spectra need to be used to judge the relative behavior of multiple species in the way that has become common in UV-visible electronic spectroscopy. Relatively few paramagnetic species yield single-line EPR spectra, because of electron-nuclear coupling and electron-electron coupling. If the nuclear coupling is well resolved in a CW spectrum, it is a straightforward exercise to figure out the number of nuclear spins of each nuclear moment. However, unresolved coupling is common, and more powerful (and expensive) techniques have to be used. Electron-nuclear double resonance (ENDOR), which applies RF at the nuclear resonant frequency simultaneously with microwaves resonant at a particular EPR transition, is a powerful tool for identifying nuclear couplings not resolved in CW EPR spectra (Kevan and Kispert, 1976; Box, 1977; Dorio and Freed, 1979; Schweiger, 1982; Piekara-Sady and Kispert, 1994; Goslar et al., 1994). A good introduction is given in Atherton (1993). Nuclear couplings too small to be observed by ENDOR commonly can be observed with spin-echo EPR, where they result in electron spin echo envelope modulation (ESEEM, also abbreviated ESEM; Dikanov and Tsvetkov, 1992). This technique requires a pulsed EPR spectrometer. ESEEM is especially powerful for determining the nuclei in the environment of an electron spin, and is extensively used in metalloprotein studies. Where are the species that yield the EPR signals? The ENDOR and ESEEM techniques provide answers about the immediate environments of the paramagnetic centers, but there may be a larger-scale question: are the paramagnetic species on the surface of the sample (Sueki et al., 1995), in the center, uniformly distributed, or localized in high-concentration regions throughout the sample?
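The integration step mentioned above (derivative display to absorption, and absorption to area) is the routine numerical route to spin quantitation. A minimal sketch, assuming an idealized noiseless derivative trace and a flat baseline (the function name and the synthetic Gaussian line are invented for this illustration):

```python
import numpy as np

def double_integral(field_mT, deriv):
    # First integration: recover the absorption line from the derivative display.
    db = field_mT[1] - field_mT[0]
    absorption = np.cumsum(deriv) * db
    absorption -= absorption[0]  # crude baseline anchor
    # Second integration (trapezoid): the area, proportional to the number of spins.
    return float(np.sum((absorption[1:] + absorption[:-1]) / 2.0) * db)

# Synthetic first-derivative spectrum of a Gaussian line at 350 mT with area 5:
b = np.linspace(340.0, 360.0, 2001)
true_area, b0, sigma = 5.0, 350.0, 1.0
line = true_area * np.exp(-((b - b0) / sigma) ** 2 / 2.0) / (sigma * np.sqrt(2.0 * np.pi))
spectrum = np.gradient(line, b)      # what the CW spectrometer records

area = double_integral(b, spectrum)  # recovers approximately the true area
```

In practice baseline correction dominates the accuracy of such double integrals, which is why signal-area measurement is a topic in its own right (Eaton and Eaton, 1980).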
These questions are also answerable via various EPR techniques. EPR imaging can tell where in the sample the spins are located (Eaton et al., 1991). EPR imaging is analogous to NMR imaging, but is most often performed with slowly stepped magnetic field gradients and CW spectra, whereas most NMR imaging is done with pulsed field gradients and time-domain (usually spin echo) spectra (see NUCLEAR MAGNETIC RESONANCE IMAGING). The differences in technique are due to the differences in spectral width and relaxation times. Low-frequency (L-band) spectrometers are available from JEOL and Bruker. Uniform versus nonuniform distributions of paramagnetic centers can be ascertained by several observations. Nonuniformity is a more common problem than is generally recognized. Dopants (impurities) in solids usually distort the geometry of the lattice and sometimes the charge balance, so they are often not distributed in a truly random fashion. Frozen solutions (at whatever temperature) often exhibit partial precipitation and/or aggregation of the paramagnetic centers. The use of pure solvents rarely gives a good glass. For example, even flash freezing does not yield glassy water under normal laboratory conditions; a cosolvent such as glycerol is needed to form a glass. If there is even partial devitrification, it is likely that the paramagnetic species (an impurity in the solvent) will be localized
at grain boundaries. Local high concentrations sometimes can be recognized by broadening of the spectra (if one knows what the "normal" spectrum should look like), or even by exchange narrowing of the spectra when concentrations are very high. Time-domain (pulse) spectra are more sensitive than CW spectra to these effects, and provide a convenient measure through a phenomenon called instantaneous diffusion (Eaton and Eaton, 1991), which is revealed when the two-pulse echo decay rate depends on the pulse power. The observation of instantaneous diffusion is an indication that the spin concentration is high. If the bulk concentration is known to be low enough that instantaneous diffusion is not likely to be observed (e.g., less than 1 mM for broad spectra and less than 0.1 mM for narrow spectra), then the observation of instantaneous diffusion indicates that there are locally high concentrations of spins, i.e., that the spin distribution is microscopically nonuniform. Multiple species can show up as multiple time constants in pulsed EPR, but there may be reasons other than multiple species for the multiple time constants. Many checks are needed, and the literature of time-domain EPR should be consulted (Standley and Vaughan, 1969; Muus and Atkins, 1972; Kevan and Schwartz, 1979; Dikanov and Tsvetkov, 1992; Kevan and Bowman, 1990). One of the most common questions is "how long will it take to run an EPR spectrum?" The answer depends strongly on what one wants to learn from the sample, and can range from a few minutes to many weeks. Even the simple question of whether there are any unpaired electrons present may take quite a bit of effort to answer, unless one already knows a lot about the sample.
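The pulse-power test for instantaneous diffusion can be illustrated with a toy model in which the echo decay rate gains a term proportional to the local spin concentration and to sin^2(θ2/2), where θ2 is the turning angle of the second pulse. All constants and names below are invented for illustration and are not quantitative:

```python
import math

INTRINSIC_RATE = 0.5  # illustrative 1/T_m, in inverse microseconds
K_ID = 0.3            # illustrative instantaneous-diffusion constant, 1/(mM * us)

def echo_decay_rate(conc_mM, theta2_rad):
    # Toy model: measured two-pulse echo decay rate = intrinsic rate plus an
    # instantaneous-diffusion term that scales with local concentration and
    # with sin^2(theta2/2), i.e., with pulse power.
    return INTRINSIC_RATE + K_ID * conc_mM * math.sin(theta2_rad / 2.0) ** 2

full = echo_decay_rate(5.0, math.pi)          # pi pulse, locally 5 mM aggregates
reduced = echo_decay_rate(5.0, math.pi / 2)   # attenuated second pulse
dilute = echo_decay_rate(0.05, math.pi)       # truly dilute 0.05 mM sample
# full > reduced: a power-dependent decay rate flags locally high concentration;
# for the dilute sample the power dependence is negligible.
```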
The spins may have relaxation times so long that they are difficult to observe without saturation (e.g., defect centers in solids) or so short that they cannot be observed except at very low temperature, where the relaxation times become longer [e.g., high-spin Co(II) in many environments]. At the other extreme, column fractions of a nitroxyl-spin-labeled protein can be monitored for radicals about as fast as the samples can be put in the spectrometer. This is an example of an application that could be automated. Similarly, alanine-based radiation dosimeters, if of a standard geometry, can be loaded automatically into a Bruker spectrometer designed for this application, and the spectra can be run in a few minutes. If, on the other hand, one wants to know the concentration of high-spin Co(II) in a sample, the need for quantitative sample preparation, accurate cryogenic temperature control, careful background subtraction, and skillful setting of instrument parameters leads to a rather time-consuming measurement. Current research on EPR is usually reported in the Journal of Magnetic Resonance, Applied Magnetic Resonance, Chemical Physics Letters, Journal of Chemical Physics, and journals focusing on application areas, such as Macromolecules, Journal of Non-Crystalline Solids, Inorganic Chemistry, and numerous biochemical journals. Examination of the current literature will suggest applications of EPR to materials science beyond those briefly mentioned in this unit. The Specialist Periodical Report on ESR of the Royal Society of Chemistry is the best source of annual updates on progress in EPR.
Reports of two workshops on the future of EPR (in 1987 and 1992) and a volume celebrating the first 50 years of EPR (Eaton et al., 1998) provide a vision of the future directions of the field (Eaton and Eaton, 1988b, 1995b). Finally, it should be pointed out that merely obtaining an EPR spectrum properly, and characterizing its dependence on the factors discussed above, is just the first step. Its interpretation in terms of the physical properties of the material of interest is the real intellectual challenge and payoff.

LITERATURE CITED
Eaton, G. R. and Eaton, S. S. 1988a. EPR imaging: Progress and prospects. Bull. Magn. Reson. 10:22–31.
Eaton, G. R. and Eaton, S. S. 1988b. Workshop on the future of EPR (ESR) instrumentation: Denver, Colorado, August 7, 1987. Bull. Magn. Reson. 10:3–21.
Eaton, G. R. and Eaton, S. S. 1990a. Electron paramagnetic resonance. In Analytical Instrumentation Handbook (G. W. Ewing, ed.) pp. 467–530. Marcel Dekker, New York.
Eaton, S. S. and Eaton, G. R. 1990b. Electron spin resonance imaging. In Modern Pulsed and Continuous-Wave Electron Spin Resonance (L. Kevan and M. Bowman, eds.) pp. 405–435. Wiley-Interscience, New York.
Abragam, A. and Bleaney, B. 1970. Electron Paramagnetic Resonance of Transition Ions. Oxford University Press, Oxford.
Eaton, G. R., Eaton, S., and Ohno, K., eds. 1991. EPR Imaging and in Vivo EPR, CRC Press, Boca Raton, Fla. Eaton, S. S. and Eaton, G. R. 1991. EPR imaging, In Electron Spin Resonance Specialist Periodical Reports (M. C. R. Symons, ed.) 12b:176–190. Royal Society of London, London.
Alger, R. S. 1968. Electron Paramagnetic Resonance: Techniques and Applications. Wiley-Interscience, New York.
Eaton, S. S. and Eaton, G. R. 1992. Quality assurance in EPR. Bull. Magn. Reson. 13:83–89.
Atherton, N. M. 1993. Principles of Electron Spin Resonance. Prentice Hall, London.
Eaton, G. R. and Eaton, S. S. 1993a. Electron paramagnetic resonance imaging. In Microscopic and Spectroscopic Imaging of the Chemical State (M. Morris, ed.) pp. 395–419. Marcel Dekker, New York. Eaton, S. S. and Eaton, G. R. 1993b. Irradiated fused quartz standard sample for time domain EPR. J. Magn. Reson. A102:354– 356.
Baden-Fuller, A. J. 1990. Microwaves: An Introduction to Microwave Theory and Techniques, 3rd ed. Pergamon Press, Oxford.
Bencini, A. and Gatteschi, D. 1990. EPR of Exchange Coupled Systems. Springer-Verlag, Berlin.
Berliner, L. J., ed. 1976. Spin Labeling: Theory and Applications. Academic Press, New York.
Berliner, L. J., ed. 1979. Spin Labeling II. Academic Press, New York.
Berliner, L. J. and Reuben, J., eds. 1989. Spin Labeling: Theory and Applications. Plenum, New York.
Box, H. C. 1977. Radiation Effects: ESR and ENDOR Analysis. Academic Press, New York.
Byrn, M. P. and Strouse, C. E. 1983. g-Tensor determination from single-crystal ESR data. J. Magn. Reson. 53:32–39.
Carrington, A. and McLachlan, A. D. 1967. Introduction to Magnetic Resonance. Harper and Row, New York.
Corradi, G., Söthe, H., Spaeth, J.-M., and Polgar, K. 1991. Electron spin resonance and electron-nuclear double resonance investigation of a new Cr3+ defect on an Nb site in LiNbO3:Mg:Cr. J. Phys. Condens. Matter 3:1901–1908.
Czoch, R. and Francik, A. 1989. Instrumental Effects of Homodyne Electron Paramagnetic Resonance Spectrometers. Wiley-Halsted, New York.
Dalal, D. P., Eaton, S. S., and Eaton, G. R. 1981. The effects of lossy solvents on quantitative EPR studies. J. Magn. Reson. 44:415–428.
Dalton, L. R., ed. 1985. EPR and Advanced EPR Studies of Biological Systems. CRC Press, Boca Raton, Fla.
Date, M., ed. 1983. High Field Magnetism. North-Holland Publishing Co., Amsterdam.
Dikanov, S. A. and Tsvetkov, Yu. D. 1992. Electron Spin Echo Envelope Modulation (ESEEM) Spectroscopy. CRC Press, Boca Raton, Fla.
Dorio, M. M. and Freed, J. H., eds. 1979. Multiple Electron Resonance Spectroscopy. Plenum Press, New York.
Du, J.-L., Eaton, G. R., and Eaton, S. S. 1995. Temperature, orientation, and solvent dependence of electron spin-lattice relaxation rates for nitroxyl radicals in glassy solvents and doped solids. J. Magn. Reson. A115:213–221.
Eaton, S. S. and Eaton, G. R. 1980.
Signal area measurements in EPR. Bull. Magn. Reson. 1:130–138.
Eaton, S. S. and Eaton, G. R. 1993c. Applications of high magnetic fields in EPR spectroscopy. Magn. Reson. Rev. 16:157–181.
Eaton, G. R. and Eaton, S. S. 1995a. Introduction to EPR imaging using magnetic field gradients. Concepts in Magnetic Resonance 7:49–67.
Eaton, S. S. and Eaton, G. R. 1995b. The future of electron paramagnetic resonance spectroscopy. Bull. Magn. Reson. 16:149–192.
Eaton, S. S. and Eaton, G. R. 1996a. Electron spin relaxation in discrete molecular species. Current Topics in Biophysics 20:9–14.
Eaton, S. S. and Eaton, G. R. 1996b. EPR imaging. In Electron Spin Resonance Specialist Periodical Reports (M. C. R. Symons, ed.) 15:169–185. Royal Society of London, London.
Eaton, S. S. and Eaton, G. R. 1997a. Electron paramagnetic resonance. In Analytical Instrumentation Handbook (G. W. Ewing, ed.), 2nd ed. pp. 767–862. Marcel Dekker, New York.
Eaton, S. S. and Eaton, G. R. 1997b. EPR methodologies: Ways of looking at electron spins. EPR Newsletter 9:15–18.
Eaton, G. R., Eaton, S. S., and Salikhov, K., eds. 1998. Foundations of Modern EPR. World Scientific Publishing, Singapore.
Eaton, G. R. and Eaton, S. S. 1999a. ESR imaging. In Handbook of Electron Spin Resonance (C. P. Poole, Jr., and H. A. Farach, eds.), vol. 2. Springer-Verlag, New York.
Eaton, S. S. and Eaton, G. R. 1999b. Magnetic fields and high frequencies in ESR spectroscopy. In Handbook of Electron Spin Resonance (C. P. Poole, Jr., and H. A. Farach, eds.), vol. 2. Springer-Verlag, New York.
Goldberg, I. B. and Crowe, H. R. 1977. Effect of cavity loading on analytical electron spin resonance spectrometry. Anal. Chem. 49:1353.
Goslar, J., Piekara-Sady, L., and Kispert, L. D. 1994. ENDOR data tabulation. In Handbook of Electron Spin Resonance (C. P. Poole, Jr., and H. A. Farach, eds.). AIP Press, New York.
Hyde, J. S. and Froncisz, W. 1982. The role of microwave frequency in EPR spectroscopy of copper complexes. Ann. Rev. Biophys. Bioeng. 11:391–417.
Hyde, J. S. and Froncisz, W. 1986. Loop gap resonators. In Electron Spin Resonance Specialist Periodical Reports (M. C. R. Symons, ed.) 10:175–185. Royal Society, London.
Hyde, J. S. and Subczynski, W. K. 1989. Spin-label oximetry. In Spin Labeling: Theory and Applications, (L. J. Berliner, and J. Reuben, eds.) Plenum, New York.
Rudowicz, C. Z., Yu, K. N., and Hiraoka, H., eds. 1998. Modern Applications of EPR/ESR: From Biophysics to Materials Science. Springer-Verlag, Singapore.
Ikeya, M. 1993. New Applications of Electron Spin Resonance. Dating, Dosimetry, and Microscopy. World Scientific, Singapore. Kevan, L. and Kispert, L. D. 1976. Electron Spin Double Resonance Spectroscopy. John Wiley & Sons, New York.
Schweiger, A. 1982. Electron Nuclear Double Resonance of Transition Metal Complexes with Organic Ligands, Structure and Bonding vol. 51. Springer-Verlag, New York. Standley, K. J. and Vaughan, R. A. 1969. Electron Spin Relaxation Phenomena in Solids. Plenum Press, New York.
Kevan, L. and Bowman, M. K., eds. 1990. Modern Pulsed and Continuous-Wave Electron Spin Resonance. Wiley-Interscience, New York. Kevan, L. and Schwartz, R. N., eds. 1979. Time Domain Electron Spin Resonance. John Wiley & Sons, New York.
Stevens, K. W. H. 1997. Magnetic Ions in Crystals. Princeton University Press, Princeton, N.J.
Kirste, B., 1994. Computer techniques. In Handbook of Electron Spin Resonance (C. P. Poole, Jr., and H. A. Farach, eds.). AIP Press, NY.
Sueki, M., Austin, W. R., Zhang, L., Kerwin, D. B., Leisure, R. G., Eaton, G. R., and Eaton, S. S. 1995. Determination of depth profiles of E′ defects in irradiated vitreous silica by electron paramagnetic resonance imaging. J. Appl. Phys. 77:790–794.
Krinichnyi, V. I. 1995. 2-mm Wave Band EPR Spectroscopy of Condensed Systems. CRC Press, Boca Raton, Fla.
Lancaster, G. 1967. Electron Spin Resonance in Semiconductors. Plenum, New York.
Muus, L. T. and Atkins, P. W., eds. 1972. Electron Spin Relaxation in Liquids. Plenum Press, New York.
Pake, G. E. and Estle, T. L. 1973. The Physical Principles of Electron Paramagnetic Resonance, 2nd ed. W. A. Benjamin, Reading, Mass.
Piekara-Sady, L. and Kispert, L. D. 1994. ENDOR spectroscopy. In Handbook of Electron Spin Resonance (C. P. Poole, Jr., and H. A. Farach, eds.). AIP Press, New York.
Pilbrow, J. R. 1990. Transition Ion Electron Paramagnetic Resonance. Oxford University Press, London.
Poole, C. P., Jr. 1967. Electron Spin Resonance: A Comprehensive Treatise on Experimental Techniques, pp. 398–413. John Wiley & Sons, New York.
Poole, C. P., Jr. 1983. Electron Spin Resonance: A Comprehensive Treatise on Experimental Techniques, 2nd ed. Wiley-Interscience, New York.
Quine, R. W., Eaton, G. R., and Eaton, S. S. 1987. Pulsed EPR spectrometer. Rev. Sci. Instrum. 58:1709–1724.
Quine, R. W., Eaton, S. S., and Eaton, G. R. 1992. A saturation recovery electron paramagnetic resonance spectrometer. Rev. Sci. Instrum. 63:4251–4262.
Quine, R. W., Rinard, G. A., Ghim, B. T., Eaton, S. S., and Eaton, G. R. 1996. A 1-2 GHz pulsed and continuous wave electron paramagnetic resonance spectrometer. Rev. Sci. Instrum. 67:2514–2527.
Rinard, G. A., Quine, R. W., Eaton, S. S., and Eaton, G. R. 1993. Microwave coupling structures for spectroscopy. J. Magn. Reson. A105:134–144.
Rinard, G. A., Quine, R. W., Eaton, S. S., Eaton, G. R., and Froncisz, W. 1994. Relative benefits of overcoupled resonators vs. inherently low-Q resonators for pulsed magnetic resonance. J. Magn. Reson. A108:71–81.
Rinard, G. A., Quine, R. W., Ghim, B. T., Eaton, S. S., and Eaton, G. R. 1996a. Easily tunable crossed-loop (bimodal) EPR resonator. J. Magn. Reson. A122:50–57.
Rinard, G. A., Quine, R. W., Ghim, B. T., Eaton, S. S., and Eaton, G. R. 1996b. Dispersion and superheterodyne EPR using a bimodal resonator. J. Magn. Reson. A122:58–63.
Rinard, G. A., Eaton, S. S., Eaton, G. R., Poole, C. P., Jr., and Farach, H. A. 1999. Sensitivity of ESR spectrometers: Signal, noise, and signal-to-noise. In Handbook of Electron Spin Resonance (C. P. Poole, Jr., and H. A. Farach, eds.), vol. 2. Springer-Verlag, New York.
Sueki, M., Eaton, G. R., and Eaton, S. S. 1990. Electron spin echo and CW perspectives in 3D EPR imaging. Appl. Magn. Reson. 1:20–28.
Swartz, H. M., Bolton, J. R., and Borg, D. C., eds. 1972. Biological Applications of Electron Spin Resonance. John Wiley & Sons, New York.
Trulove, P. C., Carlin, R. T., Eaton, G. R., and Eaton, S. S. 1995. Determination of the singlet-triplet energy separation for C60^2- in DMSO by electron paramagnetic resonance. J. Am. Chem. Soc. 117:6265–6272.
Vancamp, H. L. and Heiss, A. H. 1981. Computer applications in electron paramagnetic resonance. Magn. Reson. Rev. 7:1–40.
Weil, J. A., Bolton, J. R., and Wertz, J. E. 1994. Electron Paramagnetic Resonance: Elementary Theory and Practical Applications. John Wiley & Sons, New York.
Wilmshurst, T. H. 1968. Electron Spin Resonance Spectrometers. Plenum, New York. See especially Chapter 4.
KEY REFERENCES

Poole, 1967. See above.
Poole, 1983. See above.
The two editions of this book provide a truly comprehensive coverage of EPR. They are especially strong in instrumentation and technique, though most early applications to materials science are cited also. This is "the bible" for EPR spectroscopists.
Eaton and Eaton, 1990a. See above.
Eaton and Eaton, 1997a. See above.
These introductory chapters in two editions of the Analytical Instrumentation Handbook provide extensive background and references. They are a good first source for a person with no background in the subject.
Weil et al., 1994. See above.
Atherton, 1993. See above.
These two very good comprehensive textbooks have been updated recently. They assume a fairly solid understanding of physical chemistry. Detailed reviews of these texts are given in:
Eaton, G. R. 1995. J. Magn. Reson. A113:135–136. Detailed review of Principles of Electron Spin Resonance (Atherton, 1993).
Eaton, S. S. 1995. J. Magn. Reson. A113:137.
Detailed book review of Electron Paramagnetic Resonance: Elementary Theory and Practical Applications (Weil et al., 1994).
INTERNET RESOURCES

http://ierc.scs.uiuc.edu
The International EPR (ESR) Society strives to broadly disseminate information concerning EPR spectroscopy. This Web site contains a newsletter published for the Society by the staff of the Illinois EPR Research Center, University of Illinois. Supported by the National Institutes of Health.

http://www.biophysics.mcw.edu
EPR center at the Medical College of Wisconsin. Supported by the National Institutes of Health.
EPR spectrometers are expensive to purchase and to operate. They require a well-trained operator and, except for a few tabletop models, they take up a lot of floor space and use substantial electrical power and cooling water for the electromagnet. Newer EPR spectrometer systems are operated via a computer, so a new operator who is familiar with computers can learn operation much faster than on older equipment, where remembering a critical sequence of manual valves and knobs was part of doing spectroscopy. Even so, the use of a high-Q resonator results in much stronger interaction between the sample and the spectrometer than in most other analytical techniques, and considerable judgment is needed to obtain useful, and especially quantitatively meaningful, results.
http://spin.aecom.yu.edu
EPR center at Albert Einstein College of Medicine. Supported by the National Institutes of Health.

Commercial Instruments
Most materials science needs will be satisfied by commercial spectrometers. Some references are provided below to recent spectrometers built by various research labs; these provide details about design that may be helpful to the reader who wants to go beyond the level of this article. The following brief outline of commercial instruments is intended to guide the reader to the range of spectrometers available. The largest manufacturers, Bruker Instruments EPR Division and JEOL, market general-purpose spectrometers intended to fulfill most analytical needs. The focus is on X-band (9 to 10 GHz) CW spectrometers, with a wide variety of resonators to provide for many types of samples. Accessories facilitate control of the sample temperature from
The condition for observing CR is that the scattering time τ exceed the inverse cyclotron frequency,

    τ > Tc/2π = 1/ωc

or

    ωc τ = (eB/m*) τ = (eτ/m*) B = μB > 1    (2)
where Tc is the period of cyclotron motion and μ = eτ/m* is the DC mobility of the electron. Let us examine this CR observability condition for a realistic set of parameters. If m* = 0.1 m0 (where m0 = 9.1 × 10⁻³¹ kg) and B = 1 Tesla, then ωc = eB/m* ≈ 2 × 10¹² sec⁻¹. Thus one needs a microwave field with a frequency of fc = ωc/2π ≈ 3 × 10¹¹ Hz = 300 GHz (or a wavelength of λc = c/fc ≈ 1 mm). Then, in order to satisfy Equation 2, one needs a minimum mobility of μ = 1 m²/V-sec = 1 × 10⁴ cm²/V-sec. This value of mobility can be achieved only in a limited number of high-purity semiconductors at low temperatures, thereby posing a severe limit on the observation of microwave CR. From the resonance condition ω = ωc, it is obvious that if a higher magnetic field is available (see GENERATION AND MEASUREMENT OF MAGNETIC FIELDS) one can use a higher frequency (or a shorter wavelength), which should make Equation 2 easier to satisfy. Hence modern CR methods almost invariably use far-infrared (FIR) [or terahertz (THz)] radiation instead of microwaves. Strong magnetic
fields are available either in pulsed form (up to 10³ T) or in steady form from superconducting magnets (up to 20 T), water-cooled magnets (up to 30 T), or hybrid magnets (up to 45 T). In these cases, even at room temperature, Equation 2 may be fulfilled. Here we are only concerned with the methods of FIR-CR. The reader particularly interested in microwave CR is referred to Lax and Mavroides (1960). Although this unit is mainly concerned with the simplest case of free-carrier CR in bulk semiconductors, one can also study a wide variety of FIR magneto-optical phenomena with essentially the same techniques as CR. These phenomena ("derivatives" of CR) include: (a) spin-flip resonances, i.e., electron spin resonance and combined resonance; (b) resonances of bound carriers, i.e., internal transitions of shallow impurities and excitons; (c) polaronic coupling, i.e., resonant interactions of carriers with phonons and plasmons; and (d) 1-D and 2-D magnetoplasmon excitations. It should also be mentioned that in 2-D systems in the magnetic quantum limit, there are still unresolved issues concerning the effects of disorder and electron-electron interactions on CR (for a review, see, e.g., Petrou and McCombe, 1991; Nicholas, 1994).

It is important to note that all the early CR studies were carried out on semiconductors, not on metals. This is because of the high carrier concentrations present in metals, which preclude direct transmission spectroscopy except in the case of very thin films whose thickness is less than the depth of penetration (skin depth) of the electromagnetic fields. In bulk metals, special geometries are thus required to detect CR, the most important of which is the Azbel-Kaner geometry (Azbel and Kaner, 1958). In this geometry, both the DC magnetic field B and the AC electric field E are applied parallel to the sample surface, either B ∥ E or B ⊥ E. The electrons then execute a spiral motion along B, moving in and out of the skin depth, where E is present. Thus, whenever the electron enters the skin depth, it is accelerated by E, and if the phase of E is the same every time the electron enters the skin depth, the electron can resonantly absorb energy from the AC field. The condition for resonance here is nωc = ω (n = 1, 2, 3, ...). For more details on CR in metals, see, e.g., Mavroides (1972).

Many techniques can provide information on effective masses, but none can rival CR for directness and accuracy. Effective masses can be estimated from the temperature dependence of the amplitude of the galvanomagnetic effects, i.e., the Shubnikov–de Haas and de Haas–van Alphen effects. Interband magneto-optical absorption can determine the reduced mass mr = (1/me + 1/mh)⁻¹ of photo-created electrons and holes. Measurements of the infrared Faraday rotation due to free carriers can provide information on the anisotropy of ellipsoidal equienergy surfaces. The temperature dependence of the electronic specific heat provides a good measure of the density of levels at the Fermi level, which in turn is proportional to the effective mass. Nonresonant free-carrier absorption (see CARRIER LIFETIME: FREE CARRIER ABSORPTION, PHOTOCONDUCTIVITY, AND PHOTOLUMINESCENCE) can be used to estimate effective masses, but, of course, this simply represents a tail or shoulder of a CR absorption curve.
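The observability estimate following Equation 2 can be reproduced in a few lines. The sketch below (Python, SI units) uses only the parameter values quoted in the text (m* = 0.1 m0); it is illustrative, not tied to any particular material.

```python
import math

# Numerical check of the observability condition, Equation 2:
# omega_c * tau = mu * B > 1, for the parameters quoted in the text.
e  = 1.602e-19           # electron charge (C)
m0 = 9.11e-31            # free-electron mass (kg)
m_eff = 0.1 * m0         # assumed effective mass m* = 0.1 m0

def cyclotron(B):
    """Return omega_c (rad/s), f_c (Hz), lambda_c (m) and the minimum
    mobility (m^2/V-s) needed to satisfy omega_c * tau > 1 at field B (T)."""
    omega_c = e * B / m_eff          # cyclotron angular frequency
    f_c = omega_c / (2 * math.pi)    # resonance frequency
    lambda_c = 2.998e8 / f_c         # resonance wavelength
    mu_min = 1.0 / B                 # mobility for which mu * B = 1
    return omega_c, f_c, lambda_c, mu_min

for B in (1.0, 10.0):    # microwave CR at 1 T vs. FIR CR at 10 T
    w, f, lam, mu = cyclotron(B)
    print(f"B = {B:4.1f} T: f_c = {f/1e9:6.0f} GHz, "
          f"lambda = {lam*1e6:6.0f} um, mu_min = {mu*1e4:6.0f} cm^2/V-s")
```

At 1 T this reproduces the numbers in the text (roughly 300 GHz, 1 mm, 10⁴ cm²/V-sec), while at 10 T the required mobility drops tenfold, which is the quantitative content of the remark that higher fields make Equation 2 easier to satisfy.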
It is worth pointing out here that several different definitions of effective masses exist in the literature, and care must be taken when one discusses masses. The band-edge mass, defined as in Equation 1 at band extrema (e.g., k = 0 in most semiconductors), is the most important band parameter for characterizing a material. The specific-heat mass is directly related to the density of states at the Fermi level, and is thus also called the density-of-states mass. The cyclotron mass is defined as mc = (ħ²/2π) ∂A/∂E, where A is the momentum-space area enclosed by the cyclotron orbit; this definition follows naturally from the calculation of ωc in momentum space. The spectroscopic mass can be defined for any resonance peak, and is identical to the cyclotron mass when the resonance is due to free-carrier CR (also see Data Analysis and Initial Interpretation).

The basic theory and experimental methods of cyclotron resonance are presented in this unit. Basic theoretical background will first be presented (see Principles of the Method). A detailed description will then be given of the actual experimental procedures (see Practical Aspects of the Method). Finally, typical data analysis procedures are presented (see Data Analysis and Initial Interpretation).
PRINCIPLES OF THE METHOD

As described in the Introduction, the basic physics of CR is the interaction of electromagnetic (EM) radiation with charge carriers in a magnetic field. Here, more quantitative descriptions of this physical phenomenon will be presented, based on (1) a semiclassical model and (2) a quantum-mechanical model. In analyzing CR data, judicious combination, modification, and refinement of these basic models are necessary, depending upon the experimental conditions and the material under study, in order to obtain the maximum amount of information from a given set of data. The most commonly used method for describing the motion of charge carriers in solids perturbed by external fields is the effective-mass approximation (EMA), developed by many workers in the early history of the quantum theory of solids. The beauty of this method lies in the ability to replace the effect of the lattice periodic potential on electron motion by a mass tensor, the elements of which are determined by the unperturbed band structure. In other words, instead of considering electrons in a lattice, we may consider the motion of effective-mass particles, which obey simple equations of motion in the presence of external fields. Rigorous treatments and full justification of the EMA can be found in the early original papers (e.g., Wannier, 1937; Slater, 1949; Luttinger, 1951; Luttinger and Kohn, 1955).
Semiclassical Drude Description of CR

In many cases it is satisfactory to use the semiclassical Drude model (e.g., Ashcroft and Mermin, 1976) to describe the conductivity tensor of free carriers in a magnetic field (see MAGNETOTRANSPORT IN METALS AND ALLOYS). In this
model each electron is assumed to independently obey the equation of motion

    m* dv/dt + m* v/τ = −e(E + v × B)    (3)
where m* is the effective-mass tensor (see, e.g., Equation 1 for nondegenerate bands), v is the drift velocity of the electrons, τ is the scattering lifetime (which is assumed to be a constant), E is the AC electric field, and B is the DC magnetic field. The complex conductivity tensor σ is then defined by J = −nev = σE, where J is the current density and n is the carrier density. Assuming that the AC field and the drift velocity have the harmonically varying form E(t) = E0 exp(−iωt), v(t) = v0 exp(−iωt), one can easily solve Equation 3. In particular, for cubic materials and for B ∥ ẑ, σ is given by

    σ = | σxx  σxy  0   |
        | σyx  σyy  0   |    (4a)
        | 0    0    σzz |

    σxx = σyy = σ0 (1 − iωτ) / [(1 − iωτ)² + ωc²τ²]    (4b)

    σxy = −σyx = −σ0 ωcτ / [(1 − iωτ)² + ωc²τ²]    (4c)

    σzz = σ0 / (1 − iωτ)    (4d)

    σ0 = neμ = ne²τ/m*    (4e)
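The Drude tensor of Equations 4a to 4e is easy to evaluate numerically. The following sketch (Python with NumPy, exp(−iωt) convention; σ0 = 1 and all frequencies in units of ωc, purely illustrative values) builds σ(ω) and confirms that Re(σxx), and hence the absorbed power, peaks at ω = ωc once ωcτ ≫ 1:

```python
import numpy as np

# Numerical sketch of the Drude magneto-conductivity tensor (Equations 4a-4e)
# in the exp(-i*omega*t) convention. All parameter values are illustrative.

def sigma_tensor(omega, omega_c, tau, sigma0=1.0):
    D = (1 - 1j*omega*tau)**2 + (omega_c*tau)**2      # common denominator
    sxx = sigma0 * (1 - 1j*omega*tau) / D             # Equation 4b
    sxy = -sigma0 * omega_c * tau / D                 # Equation 4c
    szz = sigma0 / (1 - 1j*omega*tau)                 # Equation 4d
    return np.array([[ sxx, sxy, 0.0],
                     [-sxy, sxx, 0.0],
                     [ 0.0, 0.0, szz]])

# Absorbed power for linear polarization is proportional to Re(sigma_xx);
# with omega_c * tau = 10 it peaks at omega = omega_c (cf. Figure 1).
omega_c, tau = 1.0, 10.0
omegas = np.linspace(0.0, 2.0, 2001)
absorption = np.array([sigma_tensor(w, omega_c, tau)[0, 0].real
                       for w in omegas])
peak = omegas[absorption.argmax()]
print(f"Re(sigma_xx) peaks at omega/omega_c = {peak:.3f}")
```

Sweeping ωcτ down toward 1 broadens the peak and washes out the resonance, reproducing the trend shown in Figure 1.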
where μ is the carrier mobility and σ0 is the DC conductivity. Once we know the conductivity tensor, we can evaluate the power P absorbed by the carriers from the AC field as

    P = ⟨J(t) · E(t)⟩ = ½ Re(J · E*)    (5)

where ⟨...⟩ represents the time average and E* is the complex conjugate of E. For an EM wave linearly polarized in the x-direction, i.e., E = (Ex, 0, 0), Equation 5 simplifies to

    P = ½ Re(Jx Ex*) = ½ |E0|² Re(σxx)    (6)

Substituting Equation 4b into Equation 6, we obtain

    P(ω) = ½ |E0|² σ0 Re{ (1 − iωτ) / [(1 − iωτ)² + ωc²τ²] }
         = ¼ |E0|² σ0 { 1/[(ω − ωc)²τ² + 1] + 1/[(ω + ωc)²τ² + 1] }    (7)

This is plotted in Figure 1 for different values of the parameter ωcτ. It is evident from this figure that the absorption peak occurs at ω = ωc when ωcτ > 1.

Figure 1. The CR absorption power versus ω for different values of ωcτ. The traces are obtained from Equation 7. CR occurs at ω = ωc when ωcτ > 1. The absorption is expressed in units of |E0|²σ0/4.

Note that Equation 7 contains two resonances: one at ω = ωc and the other at ω = −ωc. These two resonances correspond to the two opposite senses of circular polarization. It can be shown (see, e.g., Palik and Furdyna, 1970) that in the Faraday geometry (q ∥ B ∥ ẑ, where q is the wavevector of the EM wave) a 3-D magneto-plasma can support only two propagating EM modes, represented by

    E± = (E0/√2) (ex ± i ey) exp[i(q± z − ωt)]    (8)

where ex and ey are the unit vectors in the x and y directions, respectively. The dispersion relations, q versus ω, for the two modes are obtained from

    q± = (ω/c) N±    (9a)

    N±² = κ± = κxx ± iκxy    (9b)

    κxx = κl + (i/ωε0) σxx = κl + (iσ0/ωε0)(1 − iωτ)/[(1 − iωτ)² + ωc²τ²]    (9c)

    κxy = (i/ωε0) σxy = −(iσ0/ωε0) ωcτ/[(1 − iωτ)² + ωc²τ²]    (9d)

where N± is the complex refractive index, κl is the relative dielectric constant of the lattice (assumed to be constant in the FIR), and κxx and κxy are components of the generalized dielectric tensor κ = κl I + (i/ωε0)σ, where I is the unit tensor and σ is the conductivity tensor (Equation 4). The positive sign corresponds to a circularly polarized FIR field rotating in the same sense as a negatively charged particle, and is traditionally referred to as cyclotron-resonance-active (CRA) for electrons. Similarly, the negative sign represents the opposite sense of circular polarization, cyclotron-resonance-inactive (CRI) for electrons. The CRA mode for electrons is the CRI mode for holes, and vice versa. For linearly polarized FIR radiation, which is an equal-weight mixture of the two modes, both terms in Equation 7 contribute to the absorption curve, as represented in Figure 1.
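The CRA/CRI asymmetry follows directly from Equations 9b to 9d: the combination κxx ± iκxy collapses algebraically to a single term resonant only for the "+" sense in the case of electrons. A minimal numerical illustration (Python with NumPy; the values of κl, σ0/ε0, and ωcτ are arbitrary):

```python
import numpy as np

# Illustration of Equations 9b-9d: sigma_xx +/- i*sigma_xy reduces to
# sigma0 / (1 - i*(omega -/+ omega_c)*tau), so kappa_+ (CRA for electrons)
# is resonant at omega = omega_c while kappa_- (CRI) is not.
kappa_l, s0_over_eps0 = 12.0, 5.0      # lattice constant, sigma0/eps0 (arb.)
omega_c, tau = 1.0, 100.0              # omega_c * tau = 100

def kappa(omega, sign):
    sigma_pm = s0_over_eps0 / (1 - 1j*(omega - sign*omega_c)*tau)
    return kappa_l + 1j*sigma_pm/omega

omegas = np.linspace(0.2, 2.0, 1801)
im_cra = np.array([kappa(w, +1).imag for w in omegas])  # dissipation, CRA mode
im_cri = np.array([kappa(w, -1).imag for w in omegas])  # dissipation, CRI mode

print(f"CRA absorption peaks at omega/omega_c = {omegas[im_cra.argmax()]:.3f}")
print(f"CRI/CRA peak ratio = {im_cri.max()/im_cra.max():.2e}")
```

Since the absorption coefficient of each mode scales with Im(N±), the tiny CRI/CRA ratio printed here is the quantitative statement that only one circular polarization is resonantly absorbed by electrons.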
Quantum Mechanical Description of CR

According to the EMA, if the unperturbed energy-momentum relation for band n, En(p), is known, then the allowed energies E of a crystalline system perturbed by a uniform DC magnetic field B = ∇ × A (where A is the vector potential) are given approximately by solving the effective Schrödinger equation

    Ĥ Fn(r) = Ên(−iħ∇ + eA) Fn(r) = E Fn(r)    (10)

Here the operator Ên(−iħ∇ + eA) means that we first replace the momentum p in the function En(p) with the kinematic (or mechanical) momentum π = p + eA (see, e.g., Sakurai, 1985) and then transform it into an operator by p → −iħ∇. The function Fn(r) is a slowly varying "envelope" wavefunction; the total wavefunction ψ(r) is given by a linear combination of the Bloch functions at p = 0 (or the cell-periodic functions) ψn0(r):

    ψ(r) = Σn Fn(r) ψn0(r)    (11)
For simplicity, let us consider a 2-D electron in a conduction band with a parabolic and isotropic dispersion En(p) = |p|²/2m* = (px² + py²)/2m*. The Hamiltonian in Equation 10 is then simply expressed as

    Ĥ0 = |π|²/2m* = (πx² + πy²)/2m*    (12)
These kinematic momentum operators obey the following commutation relations:

    [x, πx] = [y, πy] = iħ    (13a)

    [x, πy] = [y, πx] = [x, y] = 0    (13b)

    [πx, πy] = −iħ²/ℓ²    (13c)
where ℓ = (ħ/eB)^(1/2) is the magnetic length, which measures the spatial extent of the electronic (envelope) wavefunctions in magnetic fields (the quantum-mechanical counterpart of the cyclotron radius rc). Note the non-commutability of πx and πy (Equation 13c), in contrast to px and py in the zero-magnetic-field case. We now introduce the Landau-level raising and lowering operators

    a± = (ℓ/√2 ħ)(πx ± iπy)    (14)
for which we can show that

    [a−, a+] = 1    (15a)

    [Ĥ0, a±] = ±ħωc a±    (15b)
Combining Equations 13 to 15, we can see that a± connects the state |N⟩ with the states |N ± 1⟩ and that Ĥ0 = ħωc (a+a− + 1/2); hence the eigenenergies are

    E = (N + 1/2) ħωc,    N = 0, 1, 2, ...    (16)
These discrete energy levels are well known as Landau levels. If we now apply a FIR field with CRA (CRI) polarization (Equation 8) to the system then, in the electric dipole approximation, the Hamiltonian becomes

    Ĥ = (1/2m*) |p + eÂ + eÂ′|² ≈ (1/2m*) |π|² + (e/m*) π · Â′ = Ĥ0 + Ĥ′    (17)

where, from Equation 8, the vector potential Â′ for the FIR field is given by

    Â′± = (E0/√2 iω)(ex ± i ey) exp(−iωt)    (18)

Thus the perturbation Hamiltonian Ĥ′ is given by

    Ĥ′± = (e/m*)(πx A′x + πy A′y)
        = (eE0/√2 iωm*)(πx ± iπy) exp(−iωt)
        = (eħE0/iωm*ℓ) a± exp(−iωt)    (19)
We can immediately see that this perturbation, containing a±, connects the state |N⟩ with the states |N ± 1⟩, so that a sharp absorption occurs at ω = ωc.
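The ladder-operator algebra of Equations 13 to 19 can be verified with finite matrices. The sketch below (Python with NumPy; ħωc = 1 and a basis truncated at 12 Landau levels, both arbitrary choices) reproduces the eigenenergies of Equation 16 and the ΔN = ±1 selection rule:

```python
import numpy as np

# Finite-matrix check of the Landau-level algebra (Equations 13-19).
# Basis |0>, ..., |dim-1>; hbar*omega_c = 1; truncation size is arbitrary.
dim = 12
a_minus = np.diag(np.sqrt(np.arange(1, dim)), k=1)   # a-|N> = sqrt(N)|N-1>
a_plus  = a_minus.T                                  # raising operator

# H0 = a+ a- + 1/2 reproduces the Landau ladder E = N + 1/2 (Equation 16)
H0 = a_plus @ a_minus + 0.5 * np.eye(dim)
energies = np.sort(np.linalg.eigvalsh(H0))
print(energies[:4])                    # [0.5 1.5 2.5 3.5]

# The CRA perturbation H' ~ a+ (Equation 19) obeys the selection rule dN = +1:
N = 3
ket = np.zeros(dim); ket[N] = 1.0
amps = a_plus @ ket                    # <M|a+|N> nonzero only for M = N + 1
print(np.nonzero(amps)[0], amps[N + 1])   # [4] 2.0  (= sqrt(N+1))
```

The single nonzero matrix element, √(N+1) between |N⟩ and |N+1⟩, is exactly why the dipole absorption of Equation 19 is a single sharp line at ω = ωc for a parabolic band.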
PRACTICAL ASPECTS OF THE METHOD

CR spectroscopy, or the more general FIR magneto-spectroscopy, is performed in two distinct ways: Fourier-transform magneto-spectroscopy (FTMS) and laser magneto-spectroscopy (LMS). The former is wavelength-dependent spectroscopy and the latter is magnetic-field-dependent spectroscopy. Generally speaking, the two methods are complementary. Narrow spectral widths and high output powers are two of the main advantages of lasers. The former makes LMS suitable for investigating spectral features that cannot be resolved by FTMS, and the latter makes it suitable for studying absorption features in the presence of strong background absorption or reflection. It is also much easier to conduct polarization-dependent measurements by LMS than by FTMS. Moreover, LMS can be easily combined with strong pulsed magnets. Finally, with intense, short-pulse lasers such as the free-electron laser (FEL; see, e.g., Brau, 1990), LMS can be extended to nonlinear and time-resolved FIR magneto-spectroscopy. On the other hand, FTMS has some significant advantages over LMS. First, currently available FIR laser sources (except for FELs) can produce only discrete wavelengths, whereas FTMS uses light sources that produce continuous spectra. Second, it is needless to say that LMS can only detect spectral features that are magnetic-field dependent, so that it is unable to generate zero-field spectra. Third, LMS often overlooks or distorts features that have a very weak field dependence, in which case only FTMS can give unambiguous results, the
1s → 2p transition of shallow neutral donors being a good example (e.g., McCombe and Wagner, 1975). Finally, for studying 2-D electron systems, spectroscopy at fixed filling factors, namely at fixed magnetic fields, is sometimes crucial. In this section, after a brief review of FIR sources, these two modes of operation, FTMS and LMS, will be described in detail. In addition, short descriptions will be provided of two other, unconventional methods: cross-modulation (or photoconductivity), in which the sample itself is used as a detector, and optically detected resonance (ODR) spectroscopy, a recently developed, highly sensitive method. For the reader interested in a detailed description of FIR detectors and other FIR techniques, see, e.g., Kimmitt (1970), Stewart (1970), and Chantry (1971).

Far-Infrared Sources

The two "classic" sources of FIR radiation commonly used in FTMS are the globar and the Hg arc lamp. The globar consists of a rod of silicon carbide, usually 2 cm long and 0.5 cm in diameter. It is heated by electrical conduction; normally 5 A are passed through it, which raises its temperature to 1500 K. The globar is bright at wavelengths between 2 and 40 μm; beyond 40 μm its emissivity falls slowly, although it is still sufficient for spectroscopy up to 100 μm. The mercury lamp has higher emissivity than the globar at wavelengths longer than 100 μm. It is normally termed a "high-pressure" arc, although the actual pressure is only 1 to 2 atm. (Low-pressure gaseous discharges are not useful here because they emit discrete line spectra.) It is contained in a fused-quartz inner envelope. At the shorter wavelengths of the FIR, quartz is opaque, but it becomes very hot and emits thermal radiation. At the longer wavelengths, radiation from the mercury plasma is transmitted by the quartz and replaces the thermal radiation. Originally used by Rubens and von Baeyer (1911), the mercury arc lamp is still the most widely employed source in the FIR.
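To see why thermal sources are inherently weak in the FIR, one can evaluate Planck's law for an idealized 1500 K emitter. This is a hedged illustration: unit emissivity is assumed throughout, whereas a real globar's emissivity falls off beyond about 40 μm as noted above.

```python
import math

# Planck's law for an idealized 1500 K blackbody (a globar-like emitter),
# illustrating the weak long-wavelength output of thermal FIR sources.
h, c, k = 6.626e-34, 2.998e8, 1.381e-23   # SI constants

def planck(lam, T=1500.0):
    """Spectral radiance B_lambda in W sr^-1 m^-3 (unit emissivity assumed)."""
    x = h * c / (lam * k * T)
    return (2 * h * c**2 / lam**5) / math.expm1(x)

for lam_um in (20, 100, 200):
    print(f"{lam_um:4d} um : {planck(lam_um * 1e-6):.3e} W sr^-1 m^-3")

print("ratio 20 um / 200 um ~", round(planck(20e-6) / planck(200e-6)))
```

The radiance per unit wavelength drops by nearly four orders of magnitude between 20 and 200 μm, which is why the mercury-arc plasma emission, rather than thermal emission, dominates usable output at the long-wavelength end of the FIR.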
Three types of laser sources are currently available to FIR spectroscopists: molecular-gas lasers, the FEL, and the p-type germanium laser. The most frequently used among these are the hundreds of laser lines available from a large number of molecular gases. The low-pressure gas (HCN, H2O, D2O, CH3OH, CH3CN, etc.) flows through a glass or metal tube, where population inversion is achieved either through a high-voltage discharge or by optical excitation with a CO2 laser. Output powers range from a few mW to several hundred mW, depending on the line, gas pressure, pumping power, and whether continuous or pulsed excitation is used. The FEL, first operated in 1977, is an unusual laser source that converts the kinetic energy of free electrons into EM radiation. It is tunable over a wide range of frequencies, from millimeter waves to the ultraviolet. An FEL consists of an electron gun, an accelerator, an optical cavity, and a periodic array of magnets called an undulator or wiggler. The wavelength of the output optical beam is determined by (1) the kinetic energy of the incident electrons, (2) the spatial period of the wiggler, and (3) the strength of the wiggler magnets, all of which are continuously tunable. With the FEL's enormously high peak
powers (up to 1 GW) and short pulse widths (down to 200 fsec), a new class of nonequilibrium phenomena is currently being explored in the FIR. The p-Ge laser is a new type of tunable solid-state FIR laser (for a review, see Gornik, 1991; Pidgeon, 1994). Its operation relies on the fact that streaming holes in the valence band, in crossed strong electric and magnetic fields, can result in an inverted hot-carrier distribution. Two different lasing processes, employing light hole–light hole and light hole–heavy hole transitions, respectively, have been identified (the former is the first realization of a CR laser). The lasing wavelength is continuously tunable by adjusting the electric and magnetic fields. Lasing over a wide spectral range (75 to 500 μm) with powers up to almost 1 W has been reported.

Nonlinear optical effects also provide tunable sources in the FIR. Various schemes exist, but the most thoroughly studied method has been difference-frequency mixing using the 9.6- and 10.6-μm lines from two CO2 lasers (for a review, see Aggarwal and Lax, 1977). InSb is usually used as the mixing crystal. Phase matching is achieved either through the temperature dependence of the anomalous dispersion or by using the free-carrier contribution to the refractive index in a magnetic field. As the CO2 laser produces a large number of closely spaced lines near 9.6 and 10.6 μm, thousands of difference-frequency lines covering the FIR region from 70 μm to the millimeter range can be produced. However, the efficiency of this method is very low: considerable input laser powers are necessary to obtain output powers only in the mW range. Another important method for generating FIR radiation is optical parametric oscillation (OPO; see, e.g., Byer and Herbst, 1977). In this method, a birefringent crystal in an optical cavity is used to split the pump beam at frequency ω3 into simultaneous oscillations at two other frequencies, ω1 (idler) and ω2 (signal), where ω1 + ω2 = ω3.
This is achieved either spontaneously (through vacuum fluctuations) or by feeding in a beam at frequency ω1. The advantages of OPO are high efficiency, a wide tuning range, and an all-solid-state system. The longest wavelength currently available from OPO is 25 μm. More recently, remarkable advances in high-speed optoelectronic and NIR/visible femtosecond laser technology have enabled the generation and detection of ultrashort pulses of broadband FIR radiation (more frequently referred to as THz radiation or "T rays" in this context). The technique has proven to be extremely useful for FIR spectroscopic measurements in the time domain. Many experiments have shown that ultrafast photoexcitation of semiconductors and semiconductor heterostructures can be used to generate coherent charge oscillations, which emit transient THz EM radiation. This is currently an active topic of research, and the interested reader is referred to, e.g., Nuss and Orenstein (1998) and references therein.

Fourier Transform FIR Magneto-Spectroscopy

A Fourier-transform spectrometer is essentially a Michelson-type two-beam interferometer, the basic components of which are collimating optics, a fixed mirror, a beam splitter, and a movable mirror. The basic operating principle can be stated as follows. IR radiation emitted from the
light source is divided by the beam splitter into two beams of approximately equal intensity. One beam travels to the fixed mirror, and the other to the movable mirror. The two beams bounce back from the two mirrors and recombine at the beam splitter. When the movable mirror is at the zero-path-difference (ZPD) position, the output light intensity is maximum, since the two beams interfere constructively at all wavelengths. When the path difference x, measured from the ZPD position, is varied, an interference pattern as a function of x, called an interferogram, is obtained; the interferogram is the Fourier transform of the spectrum of the light passing through the interferometer. Hence, by taking the inverse Fourier transform of the interferogram using a computer, one obtains the spectrum. Two different types of FT spectrometers exist: (1) "slow-scan" (or step-scan) and (2) "fast-scan" (or rapid-scan) spectrometers. In a slow-scan FT spectrometer, a stepping motor drives the movable mirror. A computer controls the step size in multiples of the fundamental step size, the dwell time at each mirror position, and the total number of steps. The product of the step size and the number of steps determines the total path difference, and hence the spectral resolution. A mechanical chopper (see Fig. 2) usually chops the FIR beam. The AC signal at the chopping frequency from the detector is fed into a lock-in amplifier, and the reference signal from the chopper into the reference input of the lock-in amplifier. Data acquisition occurs at each movable-mirror position, and thus an interferogram is constructed as the magnitude of the output versus the position of the movable mirror. Computer Fourier analysis with a fast-Fourier-transform algorithm converts the interferogram into an intensity-versus-frequency distribution: the spectrum. Rapid-scan FT spectrometers operate quite differently, although the basic principles are the same.
The movable mirror of a rapid-scan FT machine is driven at a constant velocity. Instead of a mechanical chopper being used, the constant velocity of the mirror produces a sinusoidal intensity variation with a unique frequency for each spectral element ω. The modulation frequency is given by f = 2Vω, where V is the velocity of the mirror. High precision of parallel alignment between the two mirrors and constancy of the moving-mirror velocity are maintained in situ by a dynamic feedback control system. Signal sampling takes place at equally spaced mirror displacements, determined by the fringes of a He-Ne reference laser.

A slow-scan FTMS system for transmission CR studies is shown schematically in Fig. 2. FIR radiation generated by a Hg arc lamp inside the spectrometer is coupled out by a Cassegrain output mirror and guided through a 3/4-inch ("oversized") brass light pipe to a 45° mirror. The beam reflected by the mirror is then directed downward and passes through a white polyethylene window into a sample probe, which consists of a stainless-steel light pipe, a sample-holder/heater/temperature-sensor complex, metallic light cones, and a detector. The probe is sealed by the white polyethylene window and a stainless-steel vacuum jacket, and inserted into a superconducting-magnet cryostat. The beam is focused by a condensing cone onto the sample, located at the end of the light cone at the center of the field. A black polyethylene filter is placed in front of the sample in order to filter out the high-frequency part (≳500 cm⁻¹) of the radiation from the light source. The FIR light transmitted through the sample is further guided by light-pipe/light-cone optics into a detector, which is placed at the bottom of the light-pipe system, far away from the center of the magnet. If a cancellation coil is available, the detector is placed at the center of the cancellation coil, where B = 0. The sample and detector are cooled by helium exchange gas contained in the vacuum jacket of the probe. Figures 3A and 3B show a typical interferogram and spectrum, respectively.
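The interferogram-to-spectrum relationship described above can be demonstrated with a toy calculation: synthesize an interferogram from a known two-line spectrum, then recover the spectrum by the inverse (cosine) Fourier transform. All line positions, widths, and grid sizes below are illustrative.

```python
import numpy as np

# Toy model of Fourier-transform spectroscopy: build an interferogram from
# a known spectrum, then recover the spectrum by Fourier inversion.
N = 1024
nu = 0.5 * np.arange(N)                    # wavenumber axis, 0 ... 511.5 cm^-1
spectrum = (np.exp(-((nu - 120) / 10)**2)  # two illustrative emission lines,
            + 0.6 * np.exp(-((nu - 200) / 15)**2))  # kept below Nyquist

x = np.arange(N) / (N * 0.5)               # path differences (cm), commensurate
phase = 2 * np.pi * np.outer(x, nu)        # 2*pi*nu*x for every (x, nu) pair
interferogram = np.cos(phase) @ spectrum   # I(x) = sum_nu S(nu) cos(2 pi nu x)

# Inverse cosine transform recovers the spectrum (up to a factor of N/2);
# only the half below the Nyquist wavenumber is physically meaningful.
recovered = np.cos(phase.T) @ interferogram
half = N // 2
peak = nu[recovered[:half].argmax()]
print(f"strongest recovered line at {peak:.1f} cm^-1")   # 120.0 cm^-1
```

For a rapid-scan instrument the same spectral axis maps onto audio-range modulation frequencies f = 2Vω: for example, a mirror velocity of V = 0.05 cm/s modulates the 100 cm⁻¹ spectral element at 10 Hz.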
The spectrum obtained contains not only the spectral response (transmittance in this case) of the sample but also the combined effects of any absorption, filtering, and reflection encountered as the light travels from the source to the detector, in addition to the output intensity spectrum of the radiation source. Therefore, in most experimental situations, spectra such as that in Fig. 3B are ratioed to an appropriate background spectrum taken under a different condition: a different magnetic field, temperature, optical pumping intensity, or some other parameter that changes only the transmittance of the sample. In CR studies, spectra are usually ratioed to a zero-magnetic-field spectrum. In this way all the unwanted field-insensitive spectral structures are canceled out.

Figure 2. Schematic diagram of an experimental setup for CR studies with a (step-scan) Fourier transform spectrometer.

Laser FIR Magneto-Spectroscopy

LMS is generally easier to carry out than FTMS, although the experimental setup is almost identical to that of FTMS (the only difference is that the FT spectrometer is replaced by a laser in Fig. 2). This is partly because of the high power available from lasers compared with the conventional radiation sources employed in FTMS, and also because mathematical treatment of the recorded data is not required. Data acquisition simply consists of monitoring an output signal from the detector that is proportional to the amount of FIR light transmitted by the sample while
Figure 4. Example of CR with a laser in very high pulsed magnetic fields. (A) FIR transmission and magnetic field as functions of time. (B) Replot of the transmission signal as a function of magnetic field, where the two traces arise from the rising and falling portions of the field pulse shown in (A). Data obtained for n-type 3C-SiC.
Figure 3. (A) An interferogram obtained with the FTMS setup shown in Fig. 2. (B) Spectrum obtained after Fourier transforming the interferogram in (A). This spectrum contains the output intensity spectrum of the Hg arc lamp, the spectral response of all the components between the source and the detector (filters, lightpipes, light cones, etc.), the responsivity spectrum of the detector (Ge:Ge extrinsic photoconductor), as well as the transmission spectrum of the sample.
the magnetic field is swept. The signal changes with magnetic field, decreasing resonantly and showing minima at the resonant magnetic fields (see Fig. 4). Only magnetic-field-dependent features can thus be observed. If the laser output is stable while the field is being swept, no ratioing is necessary. A very important feature of LMS is that it can easily incorporate pulsed magnets, thus allowing observations of CR at very high magnetic fields (see Miura, 1984). Pulsed magnetic fields of 40 to 60 Tesla with millisecond pulse durations can be routinely produced at various laboratories. Stronger magnetic fields in the megagauss (1 megagauss = 100 Tesla) range can also be generated by special destructive methods (see, e.g., Herlach, 1984) at some institutes. These strong fields have been used to explore new physics in the ultra-quantum limit (see, e.g., Herlach, 1984; Miura, 1984) as well as to observe CR in wide-gap, heavy-mass, and low-mobility materials unmeasurable by other techniques (see, e.g., Kono et al., 1993). The cyclotron energy of electrons in megagauss fields can
exceed other characteristic energies in solids, such as the binding energies of impurities and excitons, optical phonon energies, plasma energies, and even the fundamental band gap in some materials, causing strong modifications in CR spectra. An example of megagauss CR is shown in Fig. 4. Here, the recorded traces of the magnetic field pulse and of the radiation transmitted through an n-type silicon carbide sample are plotted in Fig. 4A. The transmitted signal is then plotted as a function of magnetic field in Fig. 4B. The 36-μm line from a water-cooled H2O-vapor pulsed laser and a Ge:Cu impurity photoconductive detector were used. The magnetic field is generated by a destructive single-turn coil method (see, e.g., Herlach, 1984), in which one shot of large current (2.5 MA) from a fast capacitor bank (100 kJ, 40 kV) is passed through a thin single-turn copper coil. Although the coil is destroyed during the shot by the EM force and Joule heating, the sample can survive a number of such shots, so that repeated measurements can be made on the same sample. The resonance absorption is observed twice in one pulse, on the rising and falling slopes of the magnetic field, as seen in Fig. 4. It should be noted that the coincidence of the two traces (Fig. 4B) confirms a sufficiently fast response of the detector system.

Cross-Modulation (or Photoconductivity)

It is well known that the energy transferred from EM radiation to an electron system due to CR absorption induces significant heating of the system (free-carrier or Drude heating). This heating in turn induces a change in the conductivity tensor (e.g., an increase in 1/τ or m*), which causes (third-order) optical nonlinearity and modulation at other frequencies, allowing an observation of CR at a second frequency. Most conveniently, the DC conductivity σ(ω = 0) shows a pronounced change at the resonant magnetic field.
This effect is known as cross-modulation or CR-induced photoconductivity, and it has been described as a very sensitive technique for detecting CR (Zeiger et al., 1959; Lax and Mavroides, 1960). The beauty of this method is that the sample acts as its own detector, so that there is no detector-related noise. The disadvantage is that the detection mechanism(s) is not well understood, so that quantitative lineshape analysis is difficult, unlike the case of direct absorption. Either a decrease or an increase in conductivity is observed, depending on a number of experimental conditions. Although many suggestions concerning the underlying mechanism(s) have been proposed, a complete understanding has remained elusive. In contrast to the situation for CR, which is a free-carrier resonance, the mechanism of photoconductivity due to bound-carrier resonances is much more clearly understood (see CARRIER LIFETIME: FREE CARRIER ABSORPTION, PHOTOCONDUCTIVITY, AND PHOTOLUMINESCENCE). For example, a resonant absorption occurs due to the 1s-to-2p hydrogenic impurity transition in zero magnetic field in GaAs. Although the 2p state lies below the continuum, an electron in the 2p state can be thermally excited into the conduction band, increasing the conductivity (photo-thermal ionization). Other p-like excited states can also be studied in this manner, and these states evolve with increasing magnetic field. As a
result, one can map out the hydrogenic spectrum of the GaAs impurities simply by studying the photoconductivity of the sample. Since this is a null technique (i.e., there is a photoresponse only at resonances), it is much more sensitive than transmission studies.

Optically-Detected Resonance Spectroscopy

Recently, a great deal of development work has centered on a new type of detection scheme, Optically-Detected Resonance (ODR) spectroscopy in the FIR. This novel technique possesses several significant advantages over conventional CR methods, stimulating considerable interest in the community. With this technique, FIR resonances are detected through the change in the intensity of photoluminescence (PL; see CARRIER LIFETIME: FREE CARRIER ABSORPTION, PHOTOCONDUCTIVITY, AND PHOTOLUMINESCENCE)
while the magnetic field is swept, rather than by measuring FIR absorption directly. This technique, originally developed for the microwave region, was extended to the FIR by Wright et al. (1990) in studies of epitaxial GaAs, and subsequently by others. Remarkable sensitivity in comparison with conventional FIR methods has been demonstrated in studies of CR (e.g., Ahmed et al., 1992), impurity transitions (Michels et al., 1994; Kono et al., 1995), and internal transitions in excitons (Cerne et al., 1996; Salib et al., 1996). Since carriers are optically created, ODR enables detection of CR in ‘‘clean’’ systems with no intentional doping, with increased scattering time, and in materials for which doping is difficult. Furthermore, with ODR it is possible to select a specific PL feature in the spectrum, among various band-edge features, as the detection ‘‘channel.’’ It is then possible to use the specificity of the FIR spectrum to obtain information about recombination mechanisms and the interactions that give rise to the various lines. Figure 5A is a schematic of the experimental apparatus used for an ODR study (Kono et al., 1995). The sample is mounted in the Faraday geometry in a FIR lightpipe at the center of a 9-T superconducting magnet cooled to 4.2 K. PL is excited with the 632.8-nm line of a He-Ne laser via a 600-μm-diameter optical fiber. The signals are collected with a second 600-μm fiber and analyzed with a 0.25-m single-grating spectrometer/Si diode detector combination. A CO2-pumped FIR laser is used to generate the FIR radiation. The FIR laser power supply is chopped, and a lock-in amplifier is referenced to this chopping signal. A computer is used to simultaneously record the magnetic field values and the detected changes in the PL, and to step the monochromator to follow the center of the desired PL peak as it shifts with magnetic field.
Two scans of the change in PL intensity as a function of magnetic field (the ODR signal) for a FIR laser line of 118.8 μm are presented in Fig. 6. The upper trace shows a positive change (i.e., an increase) in the intensity of the free-exciton PL from a well-center-doped GaAs quantum well, whereas the lower trace shows a negative change (i.e., a decrease) in the intensity of the bound-exciton luminescence, demonstrating spectral specificity. In both scans, four different FIR resonances are clearly seen with an excellent signal-to-noise ratio (the sharp feature near 6 T is the electron CR; the other features are donor related).
CYCLOTRON RESONANCE
Figure 6. Two ODR spectra for a well-center-doped GaAs quantum well at a FIR wavelength of 118.8 μm. The upper trace shows a positive change in the intensity of the free exciton PL, whereas the lower trace shows a negative change in the intensity of the bound exciton luminescence, demonstrating the spectral specificity of ODR signals.
Figure 5. (A) Schematic diagram of an experimental setup for ODR spectroscopy. (B) Diagram of the FIR light cone, sample, and optical fiber arrangement used.
DATA ANALYSIS AND INITIAL INTERPRETATION

After minimizing noise to obtain as clean a spectrum as possible, and making sure that the spectrum is free from artifacts and multiple-reflection interference effects, one can analyze resonance ‘‘positions’’ (i.e., magnetic fields in LMS and frequencies in FTMS). For each resonance feature, an effective mass m* in units of m0 (the free-electron mass in vacuum) can be obtained in different unit systems as follows:

\[
\frac{m^*}{m_0} = \frac{eB}{m_0\,\omega} = \frac{0.1158\,B[\mathrm{T}]}{\hbar\omega[\mathrm{meV}]} = \frac{0.9337\,B[\mathrm{T}]}{\tilde{\nu}[\mathrm{cm^{-1}}]} = \frac{27.99\,B[\mathrm{T}]}{f[\mathrm{GHz}]} = 93.37\,B[\mathrm{T}]\,\lambda[\mathrm{m}] \tag{20}
\]
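As a quick numerical check, the unit-conversion prefactors in Equation 20 follow directly from fundamental constants. The sketch below (Python; not part of the original unit) assumes only CODATA values of e, m0, ħ, and c:

```python
import math

Q_E = 1.602176634e-19   # elementary charge, C
M0 = 9.1093837015e-31   # free-electron mass, kg
HBAR = 1.054571817e-34  # reduced Planck constant, J s
C_LIGHT = 2.99792458e8  # speed of light, m/s

def mass_from_energy(b_tesla, hw_mev):
    """Spectroscopic mass m*/m0 for a resonance at field B and photon energy hbar*omega (meV)."""
    omega = hw_mev * 1e-3 * Q_E / HBAR       # photon angular frequency, rad/s
    return Q_E * b_tesla / (M0 * omega)

def mass_from_wavelength(b_tesla, lam_m):
    """Spectroscopic mass m*/m0 for a resonance at field B and FIR wavelength lambda (m)."""
    omega = 2.0 * math.pi * C_LIGHT / lam_m
    return Q_E * b_tesla / (M0 * omega)

# The prefactors of Eq. 20 emerge directly:
print(mass_from_energy(1.0, 1.0))      # ~0.1158 (B in T, hbar*omega in meV)
print(mass_from_wavelength(1.0, 1.0))  # ~93.37  (B in T, lambda in m)
```

For example, a resonance at about 6 T at the 118.8-μm laser line yields m*/m0 ≈ 0.067, the familiar electron cyclotron mass in GaAs.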
Note that one can obtain an effective mass (a spectroscopic mass) for any resonance. For example, for resonances (1) and (2) for SiC in Fig. 4B one obtains m1 = 0.247 m0 and m2 = 0.406 m0, and for resonances (a) to (d) for doped GaAs quantum wells in Fig. 6 one obtains ma = 0.023 m0, mb = 0.044 m0, mc = 0.069 m0, and md = 0.079 m0, irrespective of their origins. However, the spectroscopic mass is identical to the cyclotron mass only when the feature is due to free-carrier CR; bound-carrier resonances, such as the donor-related features in Fig. 6, have different spectroscopic masses at different frequencies/fields. Hence one needs to know which spectroscopic feature(s) arises from free-carrier CR. This can be determined by two methods: temperature dependence and magnetic field (or frequency) dependence. Examining the temperature dependence of a spectral feature is the easiest way to check its origin. As a rule of thumb, features associated with bound carriers increase in intensity with decreasing temperature at the expense of free-carrier resonances. This is because in bulk semiconductors with doping levels below the Mott condition, free carriers freeze out onto impurities, leaving no electrical conductivity at the lowest temperatures. Thus, free-carrier CR grows with increasing temperature, but it also broadens with the resulting increase in carrier scattering. A more stringent test of whether a particular feature originates from CR is the response of its frequency versus magnetic field. In the case of LMS, one can take data only at discrete frequencies, but in the case of FTMS one can generate a continuous relationship between frequency and magnetic field. This is one of the advantages of FTMS over LMS, as discussed earlier. Therefore, LMS is
usually performed to study, at a fixed wavelength with a higher resolution than FTMS and with circular polarizers if necessary, those spectral features whose magnetic-field dependence is already known from FTMS. The resonance frequency versus magnetic field thus obtained for CR should be a straight line passing through the origin, provided that the nonparabolicity of the material is negligible and there are no localization effects (e.g., Nicholas, 1994). The slope of this straight line provides the spectroscopic mass, which is constant and identical to the cyclotron mass, and is also equal to the band-edge mass in this idealized situation. In the case of a nonparabolic band, the cyclotron mass gradually increases with magnetic field. This means that the slope of the frequency-versus-B line for CR becomes smaller (i.e., the line bends down) with increasing B. Impurity-related lines, on the other hand, extrapolate to finite frequencies at zero magnetic field, corresponding to the zero-field binding energies of the impurities. Different transitions have different slopes versus B, but all transitions originating from the same impurity atoms converge to an approximately common intercept at zero field. The most dominant donor-related line, the 1s to 2p+ transition, is sometimes called impurity-shifted CR (ICR), because its slope versus B becomes nearly equal to that of free-electron CR in the high-field limit, i.e., ħωc ≫ Ry, where Ry is the binding energy of the impurity at zero field (McCombe and Wagner, 1975). In many cases, multiple CR lines are observed in one spectrum (see, e.g., Dresselhaus et al., 1955; Otsuka, 1991; Petrou and McCombe, 1991; Nicholas, 1994). This generally indicates the existence of multiple types of carriers with different masses.
Possible origins include: multi-valley splitting in the conduction band, light holes and heavy holes in the valence band, splitting due to resonant electron-phonon coupling, nonparabolicity-induced spin splitting, Landau-level splitting (see THEORY OF MAGNETIC PHASE TRANSITIONS), and population of two sub-bands in a quantum well. Explanation of each of these phenomena is beyond the scope of this unit. As discussed in the Introduction, CR linewidth is a sensitive probe of carrier scattering phenomena. In general, the linewidth of a CR line is related to the scattering lifetime, and the integrated absorption intensity of a CR line is related to the density of carriers participating in CR. Thus, if the carrier density is constant, the product of the absorption linewidth and depth is constant, even though the width and depth are not individually constant. More quantitatively, if the observed lineshape is well fitted by a Lorentzian, in the small absorption and reflection approximation it may be compared with

\[
\frac{T(\omega,B)}{T(\omega,0)} = 1 - A(\omega,B) = 1 - \frac{1}{2}\,\frac{d\,n e^{2}\tau}{c\,\varepsilon_{0}\,\kappa_{l}^{1/2}\,m^{*}}\;\frac{1}{1+(\omega-\omega_{c})^{2}\tau^{2}} \tag{21}
\]

where T is transmission, A is absorption, d is the sample thickness, c is the speed of light, and the other symbols have been defined earlier. The half-width at half maximum (HWHM) is thus equal to 1/τ. The peak absorption depth gives an estimate of the carrier density. For a
more detailed and complete description of carrier transport studies using CR, see Otsuka (1991).
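The lineshape analysis above can be sketched numerically. The following Python fragment evaluates Equation 21 for an LMS-style field sweep at a fixed laser frequency and recovers the scattering time from the HWHM of the simulated line; all material parameters are illustrative, GaAs-like values assumed for this sketch, not taken from the text:

```python
import numpy as np

# Physical constants (SI)
Q_E = 1.602176634e-19    # elementary charge, C
M0 = 9.1093837015e-31    # free-electron mass, kg
EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m
C = 2.99792458e8         # speed of light, m/s

# Illustrative GaAs-like parameters (assumed, not from the text)
m_star = 0.067 * M0      # effective mass
tau = 1e-12              # scattering time, s
n = 1e21                 # carrier density, m^-3
d = 10e-6                # sample thickness, m
kappa_l = 12.9           # lattice dielectric constant
omega = 2 * np.pi * C / 118.8e-6  # fixed laser frequency (118.8-um line)

# LMS-style sweep: fixed omega, scan B; Eq. 21 gives the transmission ratio
B = np.linspace(0.0, 9.0, 2001)
omega_c = Q_E * B / m_star
A = (0.5 * d * n * Q_E**2 * tau / (C * EPS0 * np.sqrt(kappa_l) * m_star)
     / (1.0 + (omega - omega_c)**2 * tau**2))
T_ratio = 1.0 - A

B_res = B[np.argmax(A)]                # resonance field, where omega_c = omega
idx = np.where(A >= A.max() / 2.0)[0]  # half-maximum points of the line
dB = (B[idx[-1]] - B[idx[0]]) / 2.0    # HWHM in field
tau_est = m_star / (Q_E * dB)          # HWHM in omega_c equals 1/tau
print(f"resonance at {B_res:.2f} T, tau ~ {tau_est:.2e} s")
```

With these assumed parameters the resonance falls near 6 T, which is where the text places the electron CR for the 118.8-μm line, and the recovered τ matches the input value.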
SAMPLE PREPARATION

Cyclotron resonance does not require any complicated sample preparation unless it is combined with other experimental techniques. The minimum sample size depends on the design of the FIR optics used, i.e., how tightly the FIR beam can be focused onto the sample. Highly absorptive samples and samples with high carrier concentrations need to be polished down until they are thin enough for transmission studies. In any case, wedging the sample substrates by 2° to 3° is necessary to avoid multiple-reflection interference effects. Choosing the right sample is crucial to the success of a CR study. Samples with the highest possible carrier mobility are always preferable, if available. The DC mobility and density (see CONDUCTIVITY MEASUREMENT) provide a rough estimate of the expected CR lineshape.
PROBLEMS

Generally speaking, the FIR (or THz) frequency regime, where CR is usually observed, is a difficult spectral range in which to carry out sophisticated spectroscopy. This range lies in the so-called ‘‘technology gap’’ between electronics (≲100 GHz) and optics (≳10 THz) frequencies. The well-developed NIR/visible technology does not extend to this range; sources are dimmer and detectors are less sensitive. In addition, because of the lack of efficient nonlinear crystals, there are no amplitude or phase modulators in the FIR, except for simple mechanical choppers. Therefore, in many experiments one deals with small signals on a large noise background. In steady-state experiments with a step-scan FT spectrometer, lock-in techniques are always preferable. Modulating a property of the sample that changes only the size of the signal of interest—e.g., modulating the carrier density with tunable gate electrodes—has proven to be a very efficient way to detect small signals. The cross-modulation (or photoconductivity) technique is also frequently used to detect small signals, since it is a very sensitive method, as discussed earlier. Aside from this signal-to-noise problem inherent to the FIR, there are additional problems that CR spectroscopists may encounter. A particularly important problem in bulk semiconductors is the carrier freeze-out effect mentioned earlier. In most semiconductors, low-temperature FIR magneto-spectra are dominated by impurity transitions. At high temperatures, free carriers are liberated from the impurities, but at the same time CR often becomes too broad to be observed because of the increased scattering rate. One therefore has to be careful in choosing the right temperature range in which to study CR. In very pure semiconductors, the only way to obtain any CR signal is by optical pumping. In Si and Ge, whose carrier lifetimes are very long (milliseconds in high-quality samples), one can create a large number of carriers sufficient for steady-state
FIR absorption spectroscopy. In direct-gap semiconductors such as GaAs, carrier lifetimes are very short (∼1 ns), so that it is nearly impossible to create enough carriers for steady-state FIR experiments, although short-pulse NIR-FIR two-color spectroscopy with an FEL can capture transient FIR absorption by photo-created nonequilibrium carriers. In low-dimensional semiconductor systems, so-called modulation doping is possible, in which carriers are spatially separated from their parent impurities so that they do not freeze out even at the lowest temperatures. The use of a strong magnet introduces a new class of problems. As we have seen above, in all CR studies, whether FTMS or LMS, the transmission through the sample at finite magnetic field is compared with the transmission at zero magnetic field. The success of this ratioing relies on the assumption that only the sample changes its transmissivity with magnetic field. If anything else in the system changes any property with magnetic field, the method fails. Therefore, great care must be taken to ensure that no optical components have magnetic-field-dependent characteristics, that the FIR source and detector are not affected by the magnetic field, and that no component moves with magnetic field.
ACKNOWLEDGMENTS

The author would like to thank Prof. B. D. McCombe for useful discussions, comments, and suggestions. He is also grateful to Prof. N. Miura for critically reading the article, to Prof. R. A. Stradling and Prof. C. R. Pidgeon for useful comments, and to G. Vacca and D. C. Larrabee for proofreading the manuscript. This work was supported in part by NSF DMR-0049024 and ONR N00014-94-1-1024.

LITERATURE CITED

Aggarwal, R. L. and Lax, B. 1977. Optical mixing of CO2 lasers in the far-infrared. In Nonlinear Infrared Generation—Vol. 16 of Topics in Applied Physics (Y.-R. Shen, ed.) pp. 19–80. Springer-Verlag, Berlin.

Ahmed, N., Agool, I. R., Wright, M. G., Mitchell, K., Koohian, A., Adams, S. J. A., Pidgeon, C. R., Cavenett, B. C., Stanley, C. R., and Kean, A. H. 1992. Far-infrared optically detected cyclotron resonance in GaAs layers and low-dimensional structures. Semicond. Sci. Technol. 7:357–363.

Ashcroft, N. W. and Mermin, N. D. 1976. Solid State Physics. Holt, Rinehart and Winston, Philadelphia.

Azbel, M. Ya. and Kaner, E. A. 1958. Cyclotron resonance in metals. J. Phys. Chem. Solids 6:113–135.

Bowers, R. and Yafet, Y. 1959. Magnetic susceptibility of InSb. Phys. Rev. 115:1165–1172.

Brau, C. A. 1990. Free-Electron Lasers. Academic Press, San Diego.

Byer, R. L. and Herbst, R. L. 1977. Parametric oscillation and mixing. In Nonlinear Infrared Generation—Vol. 16 of Topics in Applied Physics (Y.-R. Shen, ed.) pp. 81–137. Springer-Verlag, Berlin.

Cerne, J., Kono, J., Sherwin, M. S., Sundaram, M., Gossard, A. C., and Bauer, G. E. W. 1996. Terahertz dynamics of excitons in GaAs/AlGaAs quantum wells. Phys. Rev. Lett. 77:1131–1134.
Chantry, G. W. 1971. Submillimeter Spectroscopy. Academic Press, New York.

Dresselhaus, G., Kip, A. F., and Kittel, C. 1953. Observation of cyclotron resonance in germanium crystals. Phys. Rev. 92:827.

Dresselhaus, G., Kip, A. F., and Kittel, C. 1955. Cyclotron resonance of electrons and holes in silicon and germanium crystals. Phys. Rev. 98:368–384.

Gornik, E. 1991. Landau emission. In Landau Level Spectroscopy—Vol. 27.2 of Modern Problems in Condensed Matter Sciences (G. Landwehr and E. I. Rashba, eds.) pp. 912–996. Elsevier Science, Amsterdam.

Herlach, F. ed. 1984. Strong and Ultrastrong Magnetic Fields and Their Applications—Vol. 57 of Topics in Applied Physics. Springer-Verlag, Berlin.

Hensel, J. C. and Suzuki, K. 1974. Quantum resonances in the valence bands of germanium. II. Cyclotron resonances in uniaxially stressed crystals. Phys. Rev. B 9:4219–4257.

Kawabata, A. 1967. Theory of cyclotron resonance linewidth. J. Phys. Soc. Japan 23:999–1006.

Kimmitt, M. F. 1970. Far-Infrared Techniques. Pion Limited, London.

Kittel, C. 1987. Quantum Theory of Solids. John Wiley & Sons, New York.

Kono, J., Miura, N., Takeyama, S., Yokoi, H., Fujimori, N., Nishibayashi, Y., Nakajima, T., Tsuji, K., and Yamanaka, M. 1993. Observation of cyclotron resonance in low-mobility semiconductors using pulsed ultra-high magnetic fields. Physica B 184:178–183.

Kono, J., Lee, S. T., Salib, M. S., Herold, G. S., Petrou, A., and McCombe, B. D. 1995. Optically detected far-infrared resonances in doped GaAs quantum wells. Phys. Rev. B 52:R8654–R8657.

Lax, B. and Mavroides, J. G. 1960. Cyclotron resonance. In Solid State Physics Vol. 11 (F. Seitz and D. Turnbull, eds.) pp. 261–400. Academic Press, New York.

Lax, B., Zeiger, H. J., Dexter, R. N., and Rosenblum, E. S. 1953. Directional properties of the cyclotron resonance in germanium. Phys. Rev. 93:1418–1420.

Luttinger, J. M. 1951. The effect of a magnetic field on electrons in a periodic potential. Phys. Rev. 84:814–817.

Luttinger, J. M. 1956. Quantum theory of cyclotron resonance in semiconductors: General theory. Phys. Rev. 102:1030–1041.

Luttinger, J. M. and Kohn, W. 1955. Motion of electrons and holes in perturbed periodic fields. Phys. Rev. 97:869–883.

Mavroides, J. G. 1972. Magneto-optical properties. In Optical Properties of Solids (F. Abeles, ed.) pp. 351–528. North-Holland, Amsterdam.

McCombe, B. D. and Wagner, R. J. 1975. Intraband magneto-optical studies of semiconductors in the far-infrared. In Advances in Electronics and Electron Physics (L. Marton, ed.) Vol. 37, pp. 1–78 and Vol. 38, pp. 1–53. Academic Press, New York.

Michels, J. G., Warburton, R. J., Nicholas, R. J., and Stanley, C. R. 1994. An optically detected cyclotron resonance study of bulk GaAs. Semicond. Sci. Technol. 9:198–206.

Miura, N. 1984. Infrared magnetooptical spectroscopy in semiconductors and magnetic materials in high pulsed magnetic fields. In Infrared and Millimeter Waves Vol. 12 (K. J. Button, ed.) pp. 73–143. Academic Press, New York.

Nicholas, R. J. 1994. Intraband optical properties of low-dimensional semiconductor systems. In Handbook on Semiconductors Vol. 2 ‘‘Optical Properties’’ (M. Balkanski, ed.) pp. 385–461. Elsevier, Amsterdam.
Nuss, M. C. and Orenstein, J. 1998. Terahertz time-domain spectroscopy. In Millimeter and Submillimeter Wave Spectroscopy of Solids (G. Grüner, ed.) pp. 7–44. Springer-Verlag, Berlin.

Otsuka, E. 1991. Cyclotron resonance. In Landau Level Spectroscopy—Vol. 27.1 of Modern Problems in Condensed Matter Sciences (G. Landwehr and E. I. Rashba, eds.) pp. 1–78. Elsevier Science, Amsterdam.

Palik, E. D. and Furdyna, J. K. 1970. Infrared and microwave magnetoplasma effects in semiconductors. Rep. Prog. Phys. 33:1193–1322.

Petrou, A. and McCombe, B. D. 1991. Magnetospectroscopy of confined semiconductor systems. In Landau Level Spectroscopy—Vol. 27.2 of Modern Problems in Condensed Matter Sciences (G. Landwehr and E. I. Rashba, eds.) pp. 679–775. Elsevier Science, Amsterdam.

Pidgeon, C. R. 1994. Free carrier Landau level absorption and emission in semiconductors. In Handbook on Semiconductors Vol. 2 ‘‘Optical Properties’’ (M. Balkanski, ed.) pp. 637–678. Elsevier, Amsterdam.

Pidgeon, C. R. and Brown, R. N. 1966. Interband magneto-absorption and Faraday rotation in InSb. Phys. Rev. 146:575–583.
Rubens, H. and von Baeyer, O. 1911. On extremely long waves, emitted by the quartz mercury lamp. Phil. Mag. 21:689–695.

Sakurai, J. J. 1985. Modern Quantum Mechanics. Addison-Wesley, Redwood City, California.

Salib, M. S., Nickel, H. A., Herold, G. S., Petrou, A., McCombe, B. D., Chen, R., Bajaj, K. K., and Schaff, W. 1996. Observation of internal transitions of confined excitons in GaAs/AlGaAs quantum wells. Phys. Rev. Lett. 77:1135–1138.

Slater, J. C. 1949. Electrons in perturbed periodic lattices. Phys. Rev. 76:1592–1601.

Stewart, J. E. 1970. Infrared Spectroscopy. Marcel Dekker, New York.

Suzuki, K. and Hensel, J. C. 1974. Quantum resonances in the valence bands of germanium. I. Theoretical considerations. Phys. Rev. B 9:4184–4218.

Wannier, G. H. 1937. The structure of electronic excitation levels in insulating crystals. Phys. Rev. 52:191–197.

Wright, M. G., Ahmed, N., Koohian, A., Mitchell, K., Johnson, G. R., Cavenett, B. C., Pidgeon, C. R., Stanley, C. R., and Kean, A. H. 1990. Far-infrared optically detected cyclotron resonance observation of quantum effects in GaAs. Semicond. Sci. Technol. 5:438–441.

Zeiger, H. J., Rauch, C. J., and Behrndt, M. E. 1959. Cross modulation of D.C. resistance by microwave cyclotron resonance. J. Phys. Chem. Solids 8:496–498.

KEY REFERENCES

Dresselhaus et al., 1955. See above.

This seminal article is still a useful reference not only for CR spectroscopists but also for students beginning to study semiconductor physics. Both experimental and theoretical aspects of CR in solids, as well as the band structure of these two fundamental semiconductors (Si and Ge), are described in great detail.

Landwehr, G. and Rashba, E. I. eds. 1991. Landau Level Spectroscopy—Vol. 27.1 and 27.2 of Modern Problems in Condensed Matter Sciences. Elsevier Science, Amsterdam.

These two volumes contain a number of excellent review articles on magneto-optical and magneto-transport phenomena in bulk semiconductors and low-dimensional semiconductor quantum structures.

Lax and Mavroides, 1960. See above.

This review article provides a thorough overview of early, primarily microwave, CR studies in the 1950s. A historical discussion is also given of CR of electrons and ions in ionized gases, which had been extensively investigated before CR in solids was first observed.

McCombe and Wagner, 1975. See above.

Describes a wide variety of far-infrared magneto-optical phenomena observed in bulk semiconductors in the 1960s and 1970s. Detailed descriptions are given of basic theoretical formulations and experimental techniques, as well as extensive coverage of experimental results.

J. KONO
Rice University
Houston, Texas

MÖSSBAUER SPECTROMETRY

INTRODUCTION

Mössbauer spectrometry is based on the quantum-mechanical ‘‘Mössbauer effect,’’ which provides a nonintuitive link between nuclear and solid-state physics. Mössbauer spectrometry measures the spectrum of energies at which specific nuclei absorb γ rays. Curiously, for one nucleus to emit a γ ray and a second nucleus to absorb it with efficiency, the atoms containing the two nuclei must be bonded chemically in solids. A young R. L. Mössbauer observed this efficient γ-ray emission and absorption process in 191Ir, and explained why the nuclei must be embedded in solids. Mössbauer spectrometry is now performed primarily with the nuclei 57Fe, 119Sn, 151Eu, 121Sb, and 161Dy. Mössbauer spectrometry can be performed with other nuclei, but only if the experimenter can accept short radioisotope half-lives, cryogenic temperatures, and the preparation of radiation sources in hot cells. Most applications of Mössbauer spectrometry in materials science utilize ‘‘hyperfine interactions,’’ in which the electrons around a nucleus perturb the energies of nuclear states. Hyperfine interactions cause very small perturbations of 10⁻⁹ to 10⁻⁷ eV in the energies of Mössbauer γ rays. For comparison, the γ rays themselves have energies of 10⁴ to 10⁵ eV. Surprisingly, these small hyperfine perturbations of γ-ray energies can be measured easily, and with high accuracy, using a low-cost Mössbauer spectrometer. Interpretations of Mössbauer spectra have few parallels with other methods of materials characterization. A Mössbauer spectrum looks at a material from the ‘‘inside out,’’ where ‘‘inside’’ means the Mössbauer nucleus. Hyperfine interactions are sensitive to the electronic structure at the Mössbauer atom, or at its nearest neighbors. The important hyperfine interactions originate with the electron density at the nucleus, the gradient of the electric field at the nucleus, or the unpaired electron spins at the nucleus. These three hyperfine interactions are called the ‘‘isomer shift,’’ ‘‘electric quadrupole splitting,’’ and ‘‘hyperfine magnetic field,’’ respectively. The viewpoint from the nucleus is sometimes too small to address problems in the microstructure of materials. Over the past four decades, however, there has been considerable effort to learn how the three hyperfine interactions respond to the environment around the nucleus. In general, it is found that Mössbauer spectrometry is best for identifying the electronic or magnetic structure at the Mössbauer atom itself, such as its valence, spin state, or magnetic moment. The Mössbauer effect is sensitive to the arrangements of surrounding atoms, however, because the local crystal structure will affect the electronic or magnetic structure at the nucleus. Different chemical and structural environments around the nucleus can often be assigned to specific hyperfine interactions. In such cases, measuring the fractions of nuclei with different hyperfine interactions is equivalent to measuring the fractions of the various chemical and structural environments in a material. Phase fractions and solute distributions, for example, can be determined in this way. Other applications of the Mössbauer effect utilize its sensitivity to vibrations in solids, its timescale for scattering, or its coherence. To date these phenomena have seen little use outside the international community of a few hundred Mössbauer spectroscopists. Nevertheless, some new applications for them have recently become possible with the advent of synchrotron sources for Mössbauer spectrometry. There have been a number of books written about the Mössbauer effect and its spectroscopies (see Key References). Most include reviews of materials research. These reviews typically demonstrate applications of the measurable quantities in Mössbauer spectrometry, and provide copious references. This unit is not a review of the field, but an instructional reference that gives the working materials scientist a basis for evaluating whether or not Mössbauer spectrometry may be useful for a research problem. Recent research publications on Mössbauer spectrometry of materials have involved, in descending order of numbers of papers: oxides, metals and alloys, organometallics, glasses, and minerals. For some problems, materials characterization by Mössbauer spectrometry is now ‘‘routine.’’ A few representative applications to materials studies are presented. These applications were chosen in part according to the taste of the author, who makes no claim to have reviewed the literature of approximately 40,000 publications utilizing the Mössbauer effect (see Internet Resources for the Mössbauer Effect Data Center Web site).

PRINCIPLES OF THE METHOD

Nuclear Excitations

Many properties of atomic nuclei and nuclear matter are well established, but these properties are generally not of importance to materials scientists. Since Mössbauer spectrometry measures transitions between states of nuclei, however, some knowledge of nuclear properties is necessary to understand the measurements. A nucleus can undergo transitions between quantum states, much like the electrons of an atom, and doing so involves large changes in energy. For example, the first
excited state of 57Fe is 14.41 keV above its ground state. The Mössbauer effect is sometimes called ‘‘nuclear resonant γ-ray scattering’’ because it involves the emission of a γ ray from an excited nucleus, followed by the absorption of this γ ray by a second nucleus, which becomes excited. The scattering is called ‘‘resonant’’ because the phase and energy relationships for the γ-ray emission and absorption processes are much the same as for two coupled harmonic oscillators. The state of a nucleus is described in part by the quantum numbers E, I, and Iz, where E is energy and I is the nuclear spin with orientation Iz along a z axis. In addition to these three internal nuclear coordinates, to understand the Mössbauer effect we also need spatial coordinates, X, for the nuclear center of mass as the nucleus moves through space or vibrates in a crystal lattice. These center-of-mass coordinates are decoupled from the internal excitations of the nucleus. The internal coordinates of the nucleus are mutually coupled. For example, the first excited state of the nucleus 57Fe has spin I = 3/2. For I = 3/2, there are four possible values of Iz, namely, −3/2, −1/2, +1/2, and +3/2. The ground state of 57Fe has I = 1/2 and two allowed values of Iz. In the absence of hyperfine interactions to lift the energy degeneracies of the spin levels, all allowed transitions between these spin levels occur at the same energy, giving a total cross-section for nuclear absorption, σ₀, of 2.57 × 10⁻¹⁸ cm². Although σ₀ is smaller by a factor of 100 than a typical projected area of an atomic electron cloud, σ₀ is much larger than the characteristic size of the nucleus. It is also hundreds of times larger than the cross-section for scattering a 14.41-keV photon by the atomic electrons at 57Fe. The characteristic lifetime of the excited state of the 57Fe nucleus, τ, is 141 ns, which is relatively long.
An ensemble of independent 57Fe nuclei that are excited simultaneously, by a flash of synchrotron light, for example, will decay at various times, t, with a probability per unit time of (1/τ) exp(−t/τ). The time uncertainty of the nuclear excited state, Δt, is related to the energy uncertainty of the excited state, ΔE, through the uncertainty relationship, ħ ≈ ΔE Δt. For Δt = τ = 141 ns, the uncertainty relationship gives ΔE = 4.7 × 10⁻⁹ eV. This is remarkably small—the energy of the nuclear excited state is extremely precise. A nuclear resonant γ-ray emission or absorption has an oscillator quality factor, Q, of 3 × 10¹². The purity of phase of the γ ray is equally impressive. In terms of information technology, it is possible in principle to transmit high-quality audio recordings of all the Beethoven symphonies on a single Mössbauer γ-ray photon. The technology for modulating and demodulating this information remains problematic, however. In the absence of significant hyperfine interactions, the energy dependence of the cross-section for Mössbauer scattering is of Lorentzian form, with a width determined by the small lifetime broadening of the excited-state energy:

\[
\sigma_j(E) = \frac{\sigma_0\, p_j}{1 + \left( \dfrac{E - E_j}{\Gamma/2} \right)^{2}} \tag{1}
\]
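The lifetime-broadening numbers quoted above, and the Lorentzian of Equation 1, can be checked in a few lines of Python (a sketch; the 57Fe values of σ₀, Γ, and τ are taken from the text):

```python
HBAR_EVS = 6.582119569e-16  # reduced Planck constant, eV s
TAU = 141e-9                # 57Fe excited-state lifetime, s
E_GAMMA = 14.41e3           # 57Fe gamma-ray energy, eV

delta_E = HBAR_EVS / TAU    # lifetime broadening, eV
Q = E_GAMMA / delta_E       # oscillator quality factor
print(delta_E, Q)           # ~4.7e-9 eV and ~3e12, as quoted above

def sigma(E, E_j, p_j=1.0, sigma_0=2.57e-18, gamma=4.7e-9):
    """Eq. 1: Lorentzian absorption cross-section (cm^2) for a line centred at E_j (eV)."""
    return sigma_0 * p_j / (1.0 + ((E - E_j) / (gamma / 2.0)) ** 2)

# On resonance the full cross-section is recovered; one half-width away it is halved.
print(sigma(E_GAMMA, E_GAMMA))            # 2.57e-18
print(sigma(E_GAMMA + 2.35e-9, E_GAMMA))  # ~1.285e-18
```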
818
RESONANCE METHODS
where, for 57Fe, Γ = ΔE = 4.7 × 10⁻⁹ eV, and Ej is the mean energy of the nuclear level transition (14.41 keV). Here pj is the fraction of nuclear absorptions that will occur with energy Ej. In the usual case, where the energy levels of the different Mössbauer nuclei are inequivalent and the nuclei scatter independently, the total cross-section is

\[
\sigma(E) = \sum_j \sigma_j(E) \tag{2}
\]
A Mössbauer spectrometry measurement is usually designed to measure the energy dependence of the total cross-section, σ(E), which is often a sum of Lorentzian functions of natural line width Γ. A highly monochromatic γ ray from a first nucleus is required to excite a second Mössbauer nucleus. The subsequent decay of the nuclear excitation need not occur by the reemission of a γ ray, however, and for 57Fe only 10.9% of the decays occur in this way. Most of the decays occur by ‘‘internal conversion’’ processes, in which the energy of the nuclear excited state is transferred to the atomic electrons. These electrons typically leave the atom, or rearrange their atomic states to emit an x ray. These conversion electrons or conversion x rays can themselves be used for measuring a Mössbauer spectrum. The conversion electrons offer the capability for surface analysis of a material. The surface sensitivity of conversion electron Mössbauer spectrometry can be as small as a monolayer (Faldum et al., 1994; Stahl and Kankeleit, 1997; Kruijer et al., 1997). More typically, electrons of a wide range of energies are detected, providing a depth sensitivity for conversion electron Mössbauer spectrometry of ∼100 nm (Gancedo et al., 1991; Williamson, 1993). It is sometimes possible to measure coherent Mössbauer scattering. Here the total intensity, I(E), from a sample is not the sum of independent intensity contributions from individual nuclei. One considers instead the total wave, Ψ(r,E), at a detector located at r. The total wave, Ψ(r,E), is the sum of the scattered waves, ψj, from the individual nuclei:

\[
\Psi(\mathbf{r},E) = \sum_j \psi_j(\mathbf{r},E) \tag{3}
\]
Equation 3 is fundamentally different from Equation 2, since wave amplitudes rather than intensities are added. Since we add the individual ψ_j, it is necessary to account precisely for the phases of the waves scattered by the different nuclei.

The Mössbauer Effect

Up to this point, we have assumed it possible for a second nucleus to become excited by absorbing the energy of a γ ray emitted by a first nucleus. A few such experiments were performed before Mössbauer's discovery, but they suffered from a well-recognized difficulty. As mentioned above, the energy precision of a nuclear excited state can be on the order of 10⁻⁸ eV. This is an extremely small energy target to hit with an incident γ ray. At room temperature, for example, vibrations of the nuclear center of mass have energies of 2.5 × 10⁻² eV/atom. If there were any change in the vibrational energy of the nucleus caused by γ-ray emission, the γ ray would be far too imprecise in energy to be absorbed by the sharp resonance of a second nucleus. Such a change seems likely, since the emission of a γ ray of momentum p_γ = E_γ/c requires the recoil of the emitting system with an opposite momentum (where E_γ is the γ-ray energy and c is the speed of light). A mass, m, will recoil after such a momentum transfer, and the kinetic energy of the recoil, E_recoil, will detract from the γ-ray energy

E_{recoil} = \frac{p_\gamma^2}{2m} = \frac{E_\gamma^2}{2mc^2} \qquad (4)
For the recoil of a single nucleus, we use the mass of a 57Fe nucleus for m in Equation 4, and find that E_recoil = 1.96 × 10⁻³ eV. This is again many orders of magnitude larger than the energy precision required for the γ ray to be absorbed by a second nucleus. Rudolf Mössbauer's doctoral thesis project was to measure nuclear resonant scattering in 191Ir. His approach was to use thermal Doppler broadening of the emission line to compensate for the recoil energy. A few resonant nuclear absorptions could be expected this way. To his surprise, the number of resonant absorptions was large, and it was even larger when his radiation source and absorber were cooled to liquid nitrogen temperature (where the thermal Doppler broadening is smaller). Adapting a theory developed by W. E. Lamb for neutron resonant scattering (Lamb, 1939), Mössbauer interpreted his observed effect and obtained the equivalent of Equation 19, below. Mössbauer further realized that by using small mechanical motions, he could provide Doppler shifts to the γ-ray energies and tune through the nuclear resonance. He did so, and observed a spectrum without thermal Doppler broadening. In 1961, R. L. Mössbauer won the Nobel prize in physics. He was 32. Mössbauer discovered (Mössbauer, 1958) that under appropriate conditions, the mass, m, in Equation 4 could be equal to the mass of the crystal. In such a case, the recoil energy is trivially small, the energy of the outgoing γ ray is precise to better than 10⁻⁹ eV, and the γ ray can be absorbed by exciting a second nucleus. The question is now how the mass, m, could be so large. The idea is that the nuclear mass is attached rigidly to the mass of the crystal. This sounds rather unrealistic, of course, and a better model is that the 57Fe nucleus is attached to the crystal mass by a spring. This is the problem of a simple harmonic oscillator, or equivalently the Einstein model of a solid with Einstein frequency ω_E.
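The free-nucleus recoil energy quoted above follows directly from Equation 4 with the numbers given in the text (14.41 keV γ ray, 57 u nuclear mass):

```python
# Numerical check of Equation 4 for 57Fe: E_recoil = E_gamma^2 / (2 m c^2),
# working in eV so that m c^2 is the nuclear rest energy.
E_gamma = 14.41e3              # eV, gamma-ray energy of 57Fe
m_c2 = 57 * 931.494e6          # eV, rest energy of a 57-u nucleus
E_recoil = E_gamma**2 / (2 * m_c2)
print(f"E_recoil = {E_recoil:.2e} eV")   # ~1.96e-3 eV
```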
The solution to this quantum mechanical problem shows that some of the nuclear recoils occur as if the nucleus were indeed attached rigidly to the crystal, but other γ-ray emissions occur by changing the state of the Einstein oscillator. Nearly all of the energy of the emitted γ ray comes from changes in the internal coordinates of the nucleus, independently of the motion of the nuclear center of mass. The concern about the change in the nuclear center-of-mass coordinates arises from the conservation of the
MÖSSBAUER SPECTROMETRY
momentum during the emission of a γ ray with momentum p_γ = E_γ/c. Eventually, the momentum of the γ-ray emission will be taken up by the recoil of the crystal as a whole. However, it is possible that the energy levels of a simple harmonic oscillator (comprising the Mössbauer nucleus bound to the other atoms of the crystal lattice) could be changed by the γ-ray emission. An excitation of this oscillator would depreciate the γ-ray energy by nħω_E if n phonons are excited during the γ-ray emission. Since ħω_E is on the order of 10⁻² eV, any change in oscillator energy would spoil the possibility for a subsequent resonant absorption. In essence, changes in the oscillator excitation (or phonons in a periodic solid) replace the classical recoil energy of Equation 4 in spoiling the energy precision of the emitted γ ray.

We need to calculate the probability that phonon excitation will not occur during γ-ray emission. Before γ-ray emission, the wavefunction of the nuclear center of mass is ψ_i(X), which can also be represented in momentum space through the Fourier transformation

\phi_i(p) = \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{\infty} \exp\left(\frac{-ipX'}{\hbar}\right) \psi_i(X')\, dX' \qquad (5)

\psi_i(X) = \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{\infty} \exp\left(\frac{+ipX}{\hbar}\right) \phi_i(p)\, dp \qquad (6)

The momentum-space representation can handily accommodate the impulse of the γ-ray emission, to provide the final state of the nuclear center of mass, ψ_f(X). Recall that the impulse is the time integral of the force, F = dp/dt, which equals the change in momentum. The analog of an impulse in momentum space is a translation in real space, such as X → X − X₀. This corresponds to obtaining a final state by a shift in origin of an initial eigenstate. With the emission of a γ ray having momentum p_γ, we obtain the final-state wavefunction from the initial eigenstate through a shift of origin in momentum space, φ_i(p) → φ_i(p − p_γ). We interpret the final state in real space, ψ_f(X), with Equation 6

\psi_f(X) = \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{\infty} \exp\left[\frac{i(p + p_\gamma)X}{\hbar}\right] \phi_i(p)\, dp \qquad (7)

Now, substituting Equation 5 into Equation 7

\psi_f(X) = \frac{1}{2\pi\hbar} \int_{-\infty}^{\infty} \exp\left[\frac{i(p + p_\gamma)X}{\hbar}\right] \int_{-\infty}^{\infty} \exp\left(\frac{-ipX'}{\hbar}\right) \psi_i(X')\, dX'\, dp \qquad (8)

Isolating the integration over momentum, p

\psi_f(X) = \int_{-\infty}^{\infty} \exp\left(\frac{ip_\gamma X}{\hbar}\right) \left[\frac{1}{2\pi\hbar} \int_{-\infty}^{\infty} \exp\left(\frac{ip(X - X')}{\hbar}\right) dp\right] \psi_i(X')\, dX' \qquad (9)

because this integration over p gives a Dirac delta function (times 2πħ)

\psi_f(X) = \int_{-\infty}^{\infty} \exp\left(\frac{ip_\gamma X}{\hbar}\right) \psi_i(X')\, \delta(X - X')\, dX' \qquad (10)

or

\psi_f(X) = \exp\left(\frac{ip_\gamma X}{\hbar}\right) \psi_i(X) \qquad (11)

The exponential in Equation 11 is a translation of the eigenstate ψ_i(X) in position for a fixed momentum transfer, p_γ. It is similar to the translation in time, t, of an eigenstate with fixed energy, E, which is exp(−iEt/ħ), or a translation in momentum for a fixed spatial translation, X₀, which is exp(−ipX₀/ħ). (If the initial state is not an eigenstate, p_γ in Equation 11 must be replaced by an operator.) For the nuclear center-of-mass wavefunction after γ-ray emission, we seek the amplitude of the initial-state wavefunction that remains in the final-state wavefunction. In Dirac notation

\langle i | f \rangle = \int_{-\infty}^{\infty} \psi_i^*(X)\, \psi_f(X)\, dX \qquad (12)

Substituting Equation 11 into Equation 12, and using Dirac notation

\langle i | f \rangle = \left\langle i \left| \exp\left(\frac{ip_\gamma X}{\hbar}\right) \right| i \right\rangle \qquad (13)

Using the convention for the γ-ray wavevector, k_γ ≡ 2πν/c = E_γ/ħc

\langle i | f \rangle = \langle i | \exp(ik_\gamma X) | i \rangle \qquad (14)

The inner product ⟨i|f⟩ is the projection of the initial state of the nuclear center of mass on the final state after emission of the γ ray. It provides the probability that there is no change in the state of the nuclear center of mass caused by γ-ray emission. The probability of this recoilless emission, f, is the square of the matrix element of Equation 14, normalized by all possible changes of the center-of-mass eigenfunctions

f = \frac{|\langle i | \exp(ik_\gamma X) | i \rangle|^2}{\sum_j |\langle j | \exp(ik_\gamma X) | i \rangle|^2} \qquad (15)

f = \frac{|\langle i | \exp(ik_\gamma X) | i \rangle|^2}{\sum_j \langle i | \exp(-ik_\gamma X) | j \rangle \langle j | \exp(+ik_\gamma X) | i \rangle} \qquad (16)

Using the closure relation Σ_j |j⟩⟨j| = 1 and the normalization ⟨i|i⟩ = 1, Equation 16 becomes

f = |\langle i | \exp(ik_\gamma X) | i \rangle|^2 \qquad (17)
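The matrix element of Equation 17 can be checked numerically for a harmonic-oscillator ground state. The sketch below works in dimensionless oscillator units (m = ħ = ω_E = 1, an assumption for convenience) and verifies that |⟨i|exp(ik_γX)|i⟩|² agrees with the closed form exp(−k_γ²⟨X²⟩) that appears in Equation 19 below:

```python
import numpy as np

# Ground state of a harmonic oscillator in units m = hbar = omega_E = 1;
# the wavevector k is an arbitrary illustrative value.
k = 1.3
x = np.linspace(-10.0, 10.0, 20001)
dx = x[1] - x[0]
psi0 = np.pi ** -0.25 * np.exp(-x**2 / 2)          # normalized ground state

# <i| exp(ikX) |i> evaluated by direct numerical integration (Equation 17)
amp = np.sum(psi0 * np.exp(1j * k * x) * psi0) * dx
f_numeric = abs(amp) ** 2

# Closed form exp(-k^2 <X^2>), with <X^2> = 1/2 in these units
x2 = np.sum(psi0**2 * x**2) * dx
f_closed = np.exp(-k**2 * x2)
print(f_numeric, f_closed)    # both ~0.430 for k = 1.3
```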
The quantity f is the "recoil-free fraction." It is the probability that, after the γ ray removes momentum p_γ from the nuclear center of mass, there will be no change in the lattice state function involving the nuclear center of mass. In other words, f is the probability that a γ ray will be emitted with no energy loss to phonons. A similar factor is required for the absorption of a γ ray by a nucleus in a second crystal (e.g., the sample). The evaluation of f is straightforward for the ground state of the Einstein solid. The ground-state wavefunction is

\psi_{CM}^0(X) = \left(\frac{m\omega_E}{\pi\hbar}\right)^{1/4} \exp\left(\frac{-m\omega_E X^2}{2\hbar}\right) \qquad (18)

Inserting Equation 18 into Equation 17, and evaluating the integral (which is the Fourier transform of a Gaussian function)

f = \exp\left(\frac{-\hbar^2 k_\gamma^2}{2m\hbar\omega_E}\right) = \exp\left(\frac{-E_R}{\hbar\omega_E}\right) = \exp\left(-k_\gamma^2 \langle X^2 \rangle\right) \qquad (19)

where E_R is the recoil energy of a free 57Fe nucleus, and ⟨X²⟩ is the mean-squared displacement of the nucleus when bound in an oscillator. It is somewhat more complicated to use a Debye model for calculating f with a distribution of phonon energies (Mössbauer, 1958). When the lattice dynamics are known, computer calculations can be used to obtain f from the full phonon spectrum of the solid, including the phonon polarizations. These more detailed calculations essentially confirm the result of Equation 19. The only nontrivial point is that low-energy phonons do not alter the result significantly. The recoil of a single nucleus does not couple effectively to long-wavelength phonons, and there are few of them, so their excitation is not a problem for recoilless emission. The condition for obtaining a significant number of "recoilless" γ-ray emissions is that the characteristic recoil energy of a free nucleus, E_R, be smaller than, or on the order of, the energy of the short-wavelength phonons in the solid. These phonon energies are typically estimated from the Debye or Einstein temperatures of the solid to be a few tens of meV.
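A quick evaluation of Equation 19 shows how favorable this condition is for 57Fe. The Einstein phonon energy of 25 meV below is an assumed, representative value from the "few tens of meV" range just quoted:

```python
import math

# Equation 19 for 57Fe: f = exp(-E_R / (hbar * omega_E)).
E_R = 1.96e-3            # eV, free-nucleus recoil energy (Equation 4)
E_PHONON = 25e-3         # eV, assumed Einstein phonon energy hbar * omega_E
f = math.exp(-E_R / E_PHONON)
print(f"f = {f:.3f}")    # ~0.92: most emissions are recoil-free
```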
Since E_R = 1.96 × 10⁻³ eV for 57Fe, this condition is satisfied nicely. It is not uncommon for most of the γ-ray emissions or absorptions from 57Fe to be recoil-free. It is helpful that the energy of the γ ray, 14.41 keV, is relatively low. Higher-energy γ rays cause E_R to be large, as seen from the quadratic relation in Equation 4. The energies of most γ rays are far greater than 14 keV, so Mössbauer spectrometry is not practical for most nuclear transitions.

Overview of Hyperfine Interactions

Given the existence of the Mössbauer effect, the question remains as to what it can do. The answer is given in two parts: what are the phenomena that can be measured, and what do these measurables tell us about materials? The four standard measurable quantities are the recoil-free fraction and the three hyperfine interactions: the isomer shift, the electric quadrupole splitting, and
the hyperfine magnetic field. To date, the three hyperfine interactions have proved the most useful measurable quantities for the characterization of materials by Mössbauer spectrometry. This overview provides a few rules of thumb for the types of information that can be obtained from hyperfine interactions. The section below (see More Exotic Measurable Quantities) describes quantities that are measurable, but that have seen fewer applications so far. For specific applications of hyperfine interactions to studies of materials, see Practical Aspects of the Method. The isomer shift is the easiest hyperfine interaction to understand. It is a direct measure of electron density, albeit at the nucleus and away from the electron density responsible for chemical bonding between the Mössbauer atom and its neighbors. The isomer shift changes considerably with the valence of the Mössbauer atom in the cases of 57Fe and 119Sn. It is possible to use the isomer shift to estimate the fraction of the Mössbauer isotope in different valence states, which may originate from different crystallographic site occupancies or from the presence of multiple phases in a sample. Valence analysis is often straightforward, and is probably the most common type of service work that Mössbauer spectroscopists provide for other materials scientists. The isomer shift has proven most useful for studies of ionic or covalently bonded materials such as oxides and minerals. Unfortunately, although the isomer shift is in principle sensitive to local atomic coordinations, it has usually not proven useful for structural characterization of materials, except when changes in valence are involved. The isomer shifts caused by most local structural distortions are generally too small to be useful. Electric field gradients (EFG) are often correlated with isomer shifts.
The existence of an EFG requires an asymmetric (i.e., noncubic) electronic environment around the nucleus, however, and this usually correlates with the local atomic structure. Again, like the isomer shift, the EFG has proven most useful for studies of oxides and minerals. Although interpretations of the EFG are not as straightforward as those of the isomer shift, the EFG is more capable of providing information about the local atomic coordination of the Mössbauer isotope. For 57Fe, the shifts in peak positions caused by the EFG tend to be comparable to, or larger than, those caused by the isomer shift. While isomer shifts are universal, hyperfine magnetic fields are confined to ferro-, ferri-, or antiferromagnetic materials. However, while isomer shifts tend to be small, hyperfine magnetic fields usually provide large and distinct shifts of the Mössbauer peaks. Because their effects are so large and varied, hyperfine magnetic fields often permit detailed materials characterizations by Mössbauer spectrometry. For body-centered cubic (bcc) Fe alloys, it is known how most solutes in the periodic table alter the magnetic moments and hyperfine magnetic fields at neighboring Fe atoms, so it is often possible to measure the distribution of hyperfine magnetic fields and determine the solute distribution about the 57Fe atoms. In magnetically ordered Fe oxides, the distinct hyperfine magnetic fields allow ready identification of phases, sometimes more readily than by x-ray diffractometry.
Even in cases where fundamental interpretations of Mössbauer spectra are impossible, the identification of the local chemistry around the Mössbauer isotope is often possible by "fingerprint" comparisons with known standards. Mössbauer spectrometers tend to have similar instrument characteristics, so quantitative comparisons with published spectra are often possible. A literature search for related Mössbauer publications is usually enough to locate standard spectra for comparison. The Mössbauer Effect Data Center (see Internet Resources) is another resource that can provide this information.

Recoil-Free Fraction

An obvious quantity to measure with the Mössbauer effect is its intensity, given by Equation 19 as the recoil-free fraction, f. The recoil-free fraction is reminiscent of the Debye-Waller factor for x-ray diffraction. It is large when the lattice is stiff and ω_E is large. Like the Debye-Waller factor, f is a weighted average over all phonons in the solid. Unlike the Debye-Waller factor, however, f must be determined from measurements with only one value of wavevector k, which is of course k_γ. It is difficult to obtain f from a single absolute measurement, since details of the sample thickness and absorption characteristics must be known accurately. Comparative studies may be possible with in situ experiments in which a material undergoes a phase transition from one state to another while the macroscopic shape of the specimen is unchanged. The usual way to determine f for a single-phase material is by measuring Mössbauer spectral areas as a function of temperature, T. Equation 19 shows that the intensity of the Mössbauer effect decreases with ⟨X²⟩, the mean-squared displacement of the nuclear motion. Since ⟨X²⟩ increases with T, measurements of spectral intensity versus T can provide the means for determining f, and hence the Debye or Einstein temperature of the solid.
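The temperature dependence just described can be sketched with the Einstein model, for which the thermal average is ⟨X²⟩(T) = (ħ/2mω_E) coth(ħω_E/2k_BT). The Einstein energy of 25 meV below is an assumed, representative value, not a measured one:

```python
import math

HBAR = 1.0546e-34              # J s
KB = 1.3806e-23                # J / K
M = 57 * 1.6605e-27            # kg, mass of a 57Fe nucleus
E_PHONON = 25e-3 * 1.602e-19   # J, assumed Einstein energy hbar * omega_E
K_GAMMA = 7.30e10              # 1/m, wavevector of the 14.41 keV gamma ray

def f_recoilless(T):
    """Equation 19 with the Einstein-model <X^2>(T)."""
    x2 = (HBAR**2 / (2 * M * E_PHONON)) / math.tanh(E_PHONON / (2 * KB * T))
    return math.exp(-K_GAMMA**2 * x2)

for T in (80, 300, 600):       # f falls monotonically as T rises
    print(T, round(f_recoilless(T), 3))
```

Fitting the measured spectral area to such a curve is one way to extract the Einstein (or Debye) temperature of the solid.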
Another effect that occurs with temperature provides a measure of ⟨v²⟩, where v is the velocity of the nuclear center of mass. This effect is sometimes called the "second-order Doppler shift," but it originates with special relativity. When a nucleus emits a γ ray and loses energy, its mass is reduced slightly. The phonon occupation numbers do not change, but the phonon energies are increased slightly owing to the diminished mass. This reduces the energy available to the γ-ray photon. This effect is usually of greater concern for absorption by the specimen, for which the energy shift is

\delta E_{therm} = -\frac{1}{2} \frac{\langle v^2 \rangle}{c^2} E_0 \qquad (20)
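The magnitude of Equation 20 can be estimated classically: with ⟨(1/2)mv²⟩ = (3/2)k_BT, the shift expressed in Doppler-velocity units is 3k_B/(2mc) per kelvin (a classical high-temperature sketch, not the full phonon calculation):

```python
# Classical estimate of the second-order Doppler shift per kelvin:
# |dE/E0| = <v^2> / (2 c^2) with <v^2> = 3 kB T / m, so the equivalent
# Doppler velocity is 3 kB T / (2 m c).
KB = 1.3806e-23            # J / K
M = 57 * 1.6605e-27        # kg, mass of a 57Fe nucleus
C = 2.998e8                # m / s
shift_per_K = 3 * KB / (2 * M * C)             # m/s per K
print(f"{shift_per_K * 1e3:.2e} mm/s per K")   # ~7.3e-4
```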
The thermal shift scales with the thermal kinetic energy in the sample, which is essentially a measure of temperature. For 57Fe, the shift is 7.3 × 10⁻⁴ mm/s per kelvin.

Isomer Shift

The peaks in a Mössbauer spectrum undergo observable shifts in energy when the Mössbauer atom is in different materials. These shifts originate from a hyperfine interaction involving the nucleus and the inner electrons of the atom. These "isomer shifts" are in proportion to the electron density at the nucleus. Two possibly unfamiliar concepts underlie the origin of the isomer shift. First, some atomic electron wavefunctions are actually present inside the nucleus. Second, the nuclear radius is different in the nuclear ground and excited states. In solving the Schrödinger equation for the radial wavefunctions of electrons around a point nucleus, it is found that for r → 0 (i.e., toward the nucleus) the electron wavefunctions go as r^l, where l is the angular momentum quantum number of the electron. For s electrons (1s, 2s, 3s, 4s, etc.), with l = 0, the electron wavefunction is quite large at r = 0. It might be guessed that the wavefunctions of s electrons could make some sort of sharp wiggle so they go to zero inside the nucleus, but this would cost too much kinetic energy. The s electrons (and some relativistic p electrons) are actually present inside the nucleus. Furthermore, the electron density is essentially constant over the size of the nucleus. The overlap of the s-electron wavefunction with the finite nucleus provides a Coulomb perturbation that lowers the nuclear energy levels. If the excited-state and ground-state energy levels were lowered equally, however, the energy of the nuclear transition would be unaffected, and the emitted (or absorbed) γ ray would have the same energy. It is well known that the radius of an atom changes when an electron enters an excited state. The same type of effect occurs for nuclei: the nuclear radius is different for the nuclear ground and excited states. For 57Fe, the effective radius of the nuclear excited state, R_ex, is smaller than the radius of the ground state, R_g, but for 119Sn it is the other way around. For the overlap of a finite nucleus with a constant charge density, the total electrostatic attraction is stronger when the nucleus is smaller.
This leads to a difference in energy between the nuclear excited state and ground state in the presence of a constant electron density |ψ(0)|². This shift in transition energy will usually be different for nuclei in the radiation source and nuclei in the sample, giving the following shift in position of the absorption peak in the measured spectrum

\delta E_{IS} = CZe^2 \left( R_{ex}^2 - R_g^2 \right) \left( |\psi_{sample}(0)|^2 - |\psi_{source}(0)|^2 \right) \qquad (21)

The factor C depends on the shape of the nuclear charge distribution, which need not be uniform or spherical. The sign of Equation 21 for 57Fe is such that with an increasing s-electron density at the nucleus, the Mössbauer peaks will be shifted to more negative velocity. For 119Sn, the difference in nuclear radii has the opposite sign. With increasing s-electron density at a 119Sn nucleus, the Mössbauer peaks shift to more positive velocity. There remains another issue in interpreting isomer shifts, however. In the case of Fe, the 3d electrons are expected to partly screen the nuclear charge from the 4s electrons. An increase in the number of 3d electrons at an 57Fe atom will therefore increase this screening, reducing the s-electron density at the 57Fe nucleus and causing a more positive isomer shift. The s-electron density at the
nucleus is therefore not simply proportional to the number of valence s electrons at the ion. The effect of this 3d electron screening is large for ionic compounds (Gütlich, 1975). In these compounds there is a series of trend lines for how the isomer shift depends on the 4s electron density, where the different trends correspond to the different numbers of 3d electrons at the 57Fe atom (Walker et al., 1961). With more 3d electrons, the isomer shift is more positive, but the isomer shift also becomes less sensitive to the number of 4s electrons at the atom. Determining the valence state of Fe atoms from isomer shifts is nevertheless generally a realistic type of experiment (see Practical Aspects of the Method). For metals it has been learned more recently that the isomer shifts do not depend on the 3d electron density (Akai et al., 1986). In Fe alloys, the isomer shift corresponds nicely to the 4s charge transfer, in spite of changes in the 3d electrons at the Fe atoms. For the first factor in Equation 21, a proposed choice for 57Fe is CZe²(R²_ex − R²_g) = −0.24 a₀³ mm/s (Akai et al., 1986), where a₀ is the Bohr radius of 0.529 Å.

Electric Quadrupole Splitting

The isomer shift, described in the previous section, is an electric monopole interaction. There is no static dipole moment of the nucleus. The nucleus does, however, have an electric quadrupole moment that originates with its asymmetrical shape. The asymmetry of the nucleus depends on its spin, which differs for the ground and excited states of the nucleus. In a uniform electric field, the shape of the nuclear charge distribution has no effect on the Coulomb energy. An EFG, however, will have different interaction energies for different alignments of the electric quadrupole moment of the nucleus. An EFG generally involves a variation with position of the x, y, and z components of the electric field vector.
In specifying an EFG, it is necessary to know, for example, how the x component of the electric field, V_x ≡ −∂V/∂x, varies along the y direction, V_yx ≡ −∂²V/∂y∂x [here V(x,y,z) is the electric potential]. The EFG involves all such partial derivatives, and is a tensor quantity. In the absence of competing hyperfine interactions, it is possible to choose freely a set of principal axes so that the off-diagonal elements of the EFG tensor are zero. By convention, we label the principal axes such that |V_zz| ≥ |V_yy| ≥ |V_xx|. Furthermore, because the Laplacian of the potential vanishes, V_xx + V_yy + V_zz = 0, only two parameters are required to specify the EFG. These are chosen to be V_zz and an asymmetry parameter, η ≡ (V_xx − V_yy)/V_zz. The isotopes 57Fe and 119Sn have an excited-state spin of I = 3/2 and a ground-state spin of I = 1/2. The shape of the excited nucleus is that of a prolate spheroid. This prolate spheroid will be oriented with its long axis pointing along the z axis of the EFG when I_z = ±3/2. There is no effect from the sign of I_z, since inverting a prolate spheroid does not change its charge distribution. The I_z = ±3/2 states have a low energy compared to the I_z = ±1/2 orientations of the excited state. In the presence of an EFG, the excited-state energy is split into two levels. Since I_z = ±1/2 for the ground state, however, the ground state
Figure 1. Energy level diagrams for 57Fe in an electric field gradient (EFG; left) or hyperfine magnetic field (HMF; right). For an HMF at the sample, the numbers 1 to 6 indicate progressively more energetic transitions, which give experimental peaks at progressively more positive velocities. Sign convention is that an applied magnetic field along the direction of lattice magnetization will reduce the HMF and the magnetic splitting. The case where the nucleus is exposed simultaneously to an EFG and HMF of approximately the same energies is much more complicated than can be presented on a simple energy level diagram.
energy is not split by the EFG. With an electric quadrupole moment for the excited state defined as Q, for 57Fe and 119Sn the quadrupole splitting of the energy levels is

E_Q = \pm \frac{1}{4} \, eQV_{zz} \left( 1 + \frac{\eta^2}{3} \right)^{1/2} \qquad (22)
where often there is the additional definition eq ≡ V_zz. The energy-level diagram is shown in Figure 1. By definition, η ≤ 1, so the asymmetry factor (1 + η²/3)^{1/2} can vary only from 1 to 1.155. For 57Fe and 119Sn, for which Equation 22 is valid, the asymmetry can usually be neglected, and the electric quadrupole interaction can be taken as a measure of V_zz. Unfortunately, it is not easy to determine the sign of V_zz (although this has been done by applying high magnetic fields to the sample). The EFG is zero when the electronic environment of the Mössbauer isotope has cubic symmetry. When the electronic symmetry is reduced, a single line in the Mössbauer spectrum appears as two lines separated in energy as described by Equation 22 (as shown in Fig. 1). When the 57Fe atom has a 3d electron structure with orbital angular momentum, V_zz is large. High- and low-spin Fe complexes can be identified by differences in their electric quadrupole splittings. The electric quadrupole splitting is also sensitive to the local atomic arrangements, such as ligand charge and coordination, but this sensitivity is not easy to interpret with simple calculations. The ligand field gives an enhanced effect on the EFG at the nucleus because the electronic structure of the Mössbauer atom is itself distorted by the ligand. This effect is termed "Sternheimer antishielding," and it enhances the EFG from the ligands by a factor of about 7 for 57Fe (Watson and Freeman, 1967).
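The narrow range of the asymmetry factor in Equation 22 is easy to verify directly:

```python
import math

# The factor (1 + eta^2/3)^(1/2) of Equation 22 varies only between
# 1 (eta = 0, axially symmetric EFG) and 1.155 (eta = 1).
def asym_factor(eta):
    return math.sqrt(1 + eta**2 / 3)

print(asym_factor(0.0))    # 1.0
print(asym_factor(1.0))    # ~1.1547
```

This is why the asymmetry parameter can usually be neglected for 57Fe and 119Sn, and the measured splitting treated as a measure of V_zz alone.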
Figure 2. Mössbauer spectrum from bcc Fe. Data were acquired at 300 K in transmission geometry with a constant-acceleration spectrometer (Ranger MS900). The points are the experimental data. The solid line is a fit to the data with six independent Lorentzian functions with unconstrained centers, widths, and depths. Also in the fit was a parabolic background function, which accounts for the fact that the radiation source was somewhat closer to the specimen at zero velocity than at the large positive or negative velocities. A 57Co source in Rh was used, but the zero of the velocity scale is the centroid of the Fe spectrum itself. Separation between peaks 1 and 6 is 10.62 mm/s.
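The six line positions of a sextet like that in Figure 2 can be reconstructed from two numbers given in this article: the outer-line separation of 10.62 mm/s, and the ground-state nuclear g-factor exceeding the excited-state value by a factor of 1.7145 (with opposite sign). The sketch below assumes the simple linear Zeeman expression for each transition, position = −a·m_ex + b·m_g in velocity units, where a and b are the excited- and ground-state Zeeman splittings:

```python
RATIO = 1.7145        # |g_ground / g_excited| for 57Fe (from the text)
SEP_16 = 10.62        # mm/s, separation of peaks 1 and 6 (from Fig. 2)

# The six allowed M1 transitions (m_ground -> m_excited)
transitions = [(-0.5, -1.5), (-0.5, -0.5), (-0.5, 0.5),
               (0.5, -0.5), (0.5, 0.5), (0.5, 1.5)]

# Excited-state Zeeman unit a (negative, since g_ex < 0); b has opposite sign.
a = -SEP_16 / (2 * (1.5 + 0.5 * RATIO))
b = -RATIO * a

positions = [-a * m_ex + b * m_g for (m_g, m_ex) in transitions]
print([round(p, 2) for p in positions])   # [-5.31, -3.06, -0.8, 0.8, 3.06, 5.31]
```

The resulting pattern of three symmetric pairs is the familiar bcc Fe sextet used for velocity calibration.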
Hyperfine Magnetic Field Splitting

The nuclear states have spin, and therefore associated magnetic dipole moments. The spins can be oriented with different projections along a magnetic field, so the energies of nuclear transitions are modified when the nucleus is in a magnetic field. The energy perturbations caused by this HMF are sometimes termed the "nuclear Zeeman effect," in analogy with the more familiar splitting of the energy levels of atomic electrons when there is a magnetic field at the atom. A hyperfine magnetic field lifts all degeneracies of the spin states of the nucleus, resulting in separate transitions identifiable in a Mössbauer spectrum (see, e.g., Fig. 2). The I_z range from −I to +I in increments of 1, being {−3/2, −1/2, +1/2, +3/2} for the excited state of 57Fe and {−1/2, +1/2} for the ground state. The allowed transitions between ground and excited states are set by selection rules. For the M1 magnetic dipole radiation of 57Fe, six transitions are allowed: {(−1/2 → −3/2), (−1/2 → −1/2), (−1/2 → +1/2), (+1/2 → −1/2), (+1/2 → +1/2), (+1/2 → +3/2)}. The allowed transitions are shown in Figure 1. Notice the inversion in the energy levels of the nuclear ground state. In ferromagnetic iron metal, the magnetic field at the 57Fe nucleus, the HMF, is 33.0 T in magnitude at 300 K. The enormous size of this HMF suggests immediately that it does not originate from the traditional mechanisms of solid-state magnetism. Furthermore, when an external magnetic field is applied to a sample of Fe metal, there is a decrease in the magnetic splitting of the measured Mössbauer peaks. This latter observation shows that the HMF at the 57Fe nucleus has a sign opposite to that of the lattice magnetization of Fe metal, so the HMF is given as −33.0 T. It is easiest to understand the classical contributions to the HMF, denoted H_mag, H_dip, and H_orb. The contribution H_mag is the magnetic field from the lattice magnetization, M, which is 4πM/3.
To this contribution we add any magnetic fields applied by the experimenter, and we subtract the demagnetization caused by the return flux. Typically,
H_mag < +0.7 T. The contribution H_dip is the classical dipole magnetic field caused by magnetic moments at atoms near the Mössbauer nucleus. In Fe metal, H_dip vanishes owing to cubic symmetry, but contributions of +0.1 T are possible when neighboring Fe atoms are replaced with nonmagnetic solutes. Finally, H_orb originates with any residual orbital magnetic moment of the Mössbauer atom that is not quenched when the atom is in a crystal lattice. This contribution is about +2 T (Akai, 1986), and it may not change significantly when Fe metal is alloyed with solute atoms, for example. These classical mechanisms make only minor contributions to the HMF. The big contribution to the HMF at a Mössbauer nucleus originates with the "Fermi contact interaction." Using the Dirac equation, Fermi and Segrè discovered a new term in the Hamiltonian for the interaction of a nucleus and an atomic electron

\mathcal{H}_{FC} = \frac{8\pi}{3}\, g_e g_N \mu_e \mu_N\, \mathbf{I} \cdot \mathbf{S}\, \delta(\mathbf{r}) \qquad (23)
Here I and S are spin operators that act on the nuclear and electron wavefunctions, respectively, μ_e and μ_N are the electron and nuclear magnetons, and δ(r) ensures that the electron wavefunction is sampled at the nucleus. Much like the electron gyromagnetic ratio, g_e, the nuclear gyromagnetic ratio, g_N, is a proportionality between the nuclear spin and the nuclear magnetic moment. Unlike the case for an electron, the nuclear ground and excited states do not have the same value of g_N; that of the ground state of 57Fe is larger by a factor of 1.7145. The nuclear magnetic moment is g_N μ_N I, so we can express the Fermi contact energy by considering this nuclear magnetic moment in an effective magnetic field, H_eff, defined as

H_{eff} = \frac{8\pi}{3}\, g_e \mu_e\, \mathbf{S}\, |\psi(0)|^2 \qquad (24)
where the electron spin is 1/2, and |ψ(0)|² is the electron density at the nucleus. If two electrons of opposite spin have the same density at the nucleus, their contributions will cancel and H_eff will be zero. A large HMF requires an unpaired electron density at the nucleus, expressed as |S| > 0. The Fermi contact interaction explains why the HMF is negative in 57Fe. As described above (see Isomer Shift), only s electrons of Fe have a substantial presence at the nucleus. The largest contribution to the 57Fe HMF is from the 2s electrons, however, which are spin-paired core electrons. The reason that spin-paired core electrons can make a large contribution to the HMF is that the 2s↑ and 2s↓ wavefunctions have slightly different shapes when the Fe atom is magnetic. The magnetic moment of an Fe atom originates primarily with its unpaired 3d electrons, so the imbalance in the numbers of 3d↑ and 3d↓ electrons must affect the shapes of the paired 2s↑ and 2s↓ electrons. These shapes of the 2s↑ and 2s↓ electron wavefunctions are altered by exchange interactions with the 3d↑ and 3d↓ electrons. The exchange interaction originates with the Pauli exclusion principle, which requires that a multielectron wavefunction be antisymmetric under the exchange
of electron coordinates. The process of antisymmetrization of a multielectron wavefunction produces an energy contribution from the Coulomb interaction between electrons called the "exchange energy," since it is the expectation value of the Coulomb energy for all pairs of electrons of like spin exchanged between their wavefunctions. The net effect of the exchange interaction is to decrease the repulsive energy between electrons of like spin. In particular, the exchange interaction reduces the Coulomb repulsion between the 2s↑ and 3d↑ electrons, allowing the more centralized 2s↑ electrons to expand outward, away from the nucleus. The same effect occurs for the 2s↓ and 3d↓ electrons, but to a lesser extent because there are fewer 3d↓ electrons than 3d↑ electrons in ferromagnetic Fe. The result is a higher density of 2s↓ than 2s↑ electrons at the 57Fe nucleus. The same effect occurs for the 1s shell, and the net result is that the HMF at the 57Fe nucleus is opposite in sign to the lattice magnetization (which is dominated by the 3d↑ electrons). The 3s electrons contribute to the HMF, but they lie at about the same mean radius as the 3d electrons, so their spin unbalance at the 57Fe nucleus is smaller. The 4s electrons, on the other hand, lie outside the 3d shell, and exchange interactions bring a higher density of 4s↑ electrons into the 57Fe nucleus, although not enough to overcome the effects of the 1s↓ and 2s↓ electrons. These 4s spin polarizations are sensitive to the magnetic moments at nearest-neighbor atoms, however, and provide a mechanism for the 57Fe atom to sense the presence of neighboring solute atoms. This is described below (see Solutes in bcc Fe Alloys).

More Exotic Measurable Quantities

Relaxation Phenomena. Hyperfine interactions have natural time windows for sampling electric or magnetic fields. This time window is the characteristic time, t_hf, associated with the energy of a hyperfine splitting, t_hf = ħ/E_hf.
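The scale of t_hf is easy to estimate from numbers already given in this article. Taking the full 57Fe magnetic splitting of 10.62 mm/s and converting it to energy through the Doppler relation gives a characteristic time of about a nanosecond:

```python
# Characteristic sampling time t_hf = hbar / E_hf, estimated for the full
# 10.62 mm/s magnetic splitting of 57Fe quoted earlier in this article.
HBAR_EV_S = 6.582e-16                    # eV s
E_GAMMA = 14.41e3                        # eV
ev_per_mm_s = E_GAMMA * 1e-3 / 2.998e8   # eV per (mm/s) of Doppler shift
E_hf = 10.62 * ev_per_mm_s               # ~5.1e-7 eV
t_hf = HBAR_EV_S / E_hf
print(f"t_hf = {t_hf:.1e} s")            # ~1.3e-9 s: nanosecond fluctuations
```

Field fluctuations much faster than this nanosecond scale are averaged over by the nucleus, while slower fluctuations distort or collapse the hyperfine structure.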
When a hyperfine electric or magnetic field undergoes fluctuations on the order of τhf or faster, observable distortions appear in the measured Mössbauer spectrum. The lifetime of the nuclear excited state does not play a direct role in setting the timescale for observing such relaxation phenomena. However, the lifetime of the nuclear excited state does provide a reasonable estimate of the longest characteristic time for fluctuations that can be measured by Mössbauer spectrometry. Sensitivity to changes in valence of the Mössbauer atom between Fe(II) and Fe(III) has been used in studies of the Verwey transition in Fe3O4, which occurs at 120 K. Above the Verwey transition temperature the Mössbauer spectrum comprises two sextets, but when Fe3O4 is cooled below the Verwey transition temperature the spectrum becomes complex (De Grave et al., 1993). Atomic diffusion is another phenomenon that can be studied by Mössbauer spectrometry (Ruebenbauer et al., 1994). As an atom jumps to a new site on a crystal lattice, the coherence of its γ-ray emission is disturbed. The shortening of the time for coherent γ-ray emission causes a broadening of the linewidths in the Mössbauer spectrum. In single crystals this broadening can be shown to occur by different amounts along different crystallographic
directions, and has been used to identify the atom jump directions and mechanisms of diffusion in Fe alloys (Feldwisch et al., 1994; Vogl et al., 1994; Sepiol et al., 1996). Perhaps the most familiar example of a relaxation effect in Mössbauer spectrometry is the superparamagnetic behavior of small particles. This phenomenon is described below (see Crystal Defects and Small Particles).

Phonons. The phonon partial density of states (DOS) has recently become measurable by Mössbauer spectrometry. Technically, nuclear resonant scattering that occurs with the creation or annihilation of a phonon is inelastic scattering, and therefore not the Mössbauer effect. However, techniques for measuring the phonon partial DOS have been developed as a capability of synchrotron radiation sources for Mössbauer scattering. The experiments are performed by detuning the incident photon energies above and below the nuclear resonance by 100 meV or so. This range of energy is far beyond the energy width of the Mössbauer resonance or any of its hyperfine interactions. However, it is in the range of typical phonon energies. The inelastic spectra so obtained are called "partial" phonon densities of states because they involve the motions of only the Mössbauer nucleus. The recent experiments (Seto et al., 1995; Sturhahn et al., 1995; Fultz et al., 1997) are performed with incoherent scattering (a Mössbauer γ ray into the sample, a conversion x ray out), and are interpreted in the same way as incoherent inelastic neutron scattering spectra (Squires, 1978). Compared to this latter, more established technique, the inelastic nuclear resonant scattering experiments have the capability of working with much smaller samples, owing to the large cross-section for nuclear resonant scattering. The vibrational spectra of monolayers of 57Fe atoms at interfaces of thin films have been measured in preliminary experiments.

Coherence and Diffraction.
Mössbauer scattering can be coherent, meaning that the phase of the incident wave is in a precise relationship to the phase of the scattered wave. For coherent scattering, wave amplitudes are added (Equation 3) instead of independent photon intensities (Equation 2). For the isotope 57Fe, coherency occurs only in experiments where a 14.41 keV γ ray is absorbed and a 14.41 keV γ ray is reemitted through the reverse nuclear transition. The waves scattered by different coherent processes interfere with each other, either constructively or destructively. The interference between Mössbauer scattering and x-ray Rayleigh scattering undergoes a change from constructive in-phase interference above the Mössbauer resonance to destructive out-of-phase interference below. This gives rise to an asymmetry in the peaks measured in an energy spectrum, first observed by measuring a Mössbauer energy spectrum in scattering geometry (Black and Moon, 1960). Diffraction is a specialized type of interference phenomenon. Of particular interest to the physics of Mössbauer diffraction is a suppression of internal conversion processes when diffraction is strong. With multiple transfers of energy between forward and diffracted beams, there is a nonintuitive enhancement in the rate of decay of the
MÖSSBAUER SPECTROMETRY
nuclear excited state (Hannon and Trammell, 1969; van Bürck et al., 1978; Shvyd'ko and Smirnov, 1989), and a broadening of the characteristic linewidth. A fortunate consequence for highly perfect crystals is that with strong Bragg diffraction, a much larger fraction of the reemissions from 57Fe nuclei occur by coherent 14.41 keV emission. The intensities of Mössbauer diffraction peaks therefore become stronger and easier to observe. For solving unknown structures in materials or condensed matter, however, it is difficult to interpret the intensities of diffraction peaks when there are multiple scatterings. Quantification of diffraction intensities with kinematical theory is an advantage of performing Mössbauer diffraction experiments on polycrystalline samples. Such samples also avoid the broadening of features in the Mössbauer energy spectrum that accompanies the speedup of the nuclear decay. Unfortunately, without the dynamical enhancement of coherent decay channels, kinematical diffraction experiments on small crystals suffer a serious penalty in diffraction intensity. Powder diffraction patterns have not been obtained until recently (Stephens et al., 1994), owing to the low intensities of the diffraction peaks. Mössbauer diffraction from polycrystalline alloys does offer a new capability, however, of combining the spectroscopic capabilities of hyperfine interactions to extract a diffraction pattern from a particular chemical environment of the Mössbauer isotope (Stephens and Fultz, 1997).

PRACTICAL ASPECTS OF THE METHOD

Radioisotope Sources

The vast majority of Mössbauer spectra have been measured with instrumentation as shown in Figure 3. The spectrum is obtained by counting the number of γ-ray photons that pass through a thin specimen as a function of the γ-ray energy. At energies where the Mössbauer effect is strong, a dip is observed in the γ-ray transmission.
The γ-ray energy is tuned with a drive that imparts a Doppler shift, ΔE, to the γ ray in the reference frame of the sample:

ΔE = (v/c) Eγ     (25)
where v is the velocity of the drive. A velocity of 10 mm/s will provide an energy shift, ΔE, of 4.8 × 10⁻⁷ eV to a 14.41 keV γ ray of 57Fe. Recall that the energy width of the Mössbauer resonance is 4.7 × 10⁻⁹ eV, which corresponds to 0.097 mm/s. An energy range of ±10 mm/s is usually more than sufficient to tune through the full Mössbauer energy spectrum of 57Fe or 119Sn. It is conventional to present the energy axis of a Mössbauer spectrum in units of mm/s. The equipment required for Mössbauer spectrometry is simple, and adequate instrumentation is often found in instructional laboratories for undergraduate physics students. In a typical coursework laboratory exercise, students learn the operation of the detector electronics and the spectrometer drive system in a few hours, and complete a measurement or two in about a week. (The understanding of the measured spectrum typically takes much longer.)

Figure 3. Transmission Mössbauer spectrometer. The radiation source sends γ rays to the right through a thin specimen into a detector. The electromagnetic drive is operated with feedback control by comparing a measured velocity signal with a desired reference waveform. The drive is cycled repetitively, usually so the velocity of the source varies linearly with time (constant acceleration mode). Counts from the detector are accumulated repetitively in short time intervals associated with memory addresses of a multichannel scaler. Each time interval corresponds to a particular velocity of the radiation source. Typical numbers are 1024 data points of 50-μs time duration and a period of 20 Hz.

Most components for the Mössbauer spectrometer in Figure 3 are standard items for x-ray detection and data acquisition. The items specialized for Mössbauer spectrometry are the electromagnetic drive and the radiation source. Abandoned electromagnetic drives and controllers are often found in university and industrial laboratories, and hardware manufactured since about 1970 by Austin Science Associates, Ranger Scientific, Wissel/Oxford Instruments, and Elscint, Ltd. are all capable of providing excellent results. Half-lives for radiation sources are: 57Co, 271 days; 119mSn, 245 days; 151Sm, 93 years; and 125Te, 2.7 years. A new laboratory setup for 57Fe or 119Sn work may require the purchase of a radiation source. Suppliers include Amersham International, Dupont/NEN, and Isotope Products. It is also possible to obtain high-quality radiation sources from the former Soviet Union. Specifications for the purchase of a new Mössbauer source, besides activity level (typically 20 to 50 mCi for 57Co), should include linewidth and sometimes levels of impurity radioisotopes. The measured energy spectrum from the sample is convoluted with the energy spectrum of the radiation source. For a spectrum with sharp Lorentzian lines of natural linewidth, Γ (see Equation 1), the convolution of the source and sample Lorentzian functions provides a measured
Lorentzian function of full width at half-maximum of 0.198 mm/s. An excellent 57Fe spectrum from pure Fe metal over an energy range of ±10 mm/s may have linewidths of 0.23 mm/s, although instrumental linewidths of somewhat less than 0.3 mm/s are not uncommon owing to technical problems with the purity of the radiation source and vibrations of the specimen or source. Radiation sources for 57Fe Mössbauer spectrometry use the 57Co radioisotope. The unstable 57Co nucleus absorbs an inner-shell electron, transmuting to 57Fe and emitting a 122-keV γ ray. The 57Fe nucleus thus formed is in its first excited state, and decays about 141 ns later by the emission of a 14.41-keV γ ray. This second γ ray is the useful photon for Mössbauer spectrometry. While the 122-keV γ ray can be used as a clock to mark the formation of the 57Fe excited state, it is generally considered a nuisance in Mössbauer spectrometry, along with emissions from other contamination radioisotopes in the radiation source. A Mössbauer radiation source is prepared by diffusing the 57Co isotope into a matrix material such as Rh, so that atoms of 57Co reside as dilute substitutional solutes on the fcc Rh crystal lattice. Being dilute, the 57Co atoms have a neighborhood of pure Rh, and therefore all 57Co atoms have the same local environment and the same nuclear energy levels. They will therefore emit γ rays of the same energy. Although radiation precautions are required for handling the source, the samples (absorbers) are not radioactive either before or after measurement in the spectrometer. Enrichment of the Mössbauer isotope is sometimes needed when, for example, the 2.2% natural abundance of 57Fe is insufficient to provide a strong spectrum. Although 57Fe is not radioactive, material enriched to 95% 57Fe costs approximately $15 to $30 per mg, so specimen preparation usually involves only small quantities of isotope.
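The mm/s-to-eV conversion of Equation 25 and the additivity of Lorentzian widths under convolution can both be checked numerically. This is an illustrative sketch with hard-coded constants; with two lines of 0.097 mm/s width the sum is 0.194 mm/s, close to the 0.198 mm/s quoted above (which corresponds to a slightly larger input width):

```python
import numpy as np

C_MM_PER_S = 2.998e11        # speed of light, in mm/s
E_GAMMA_EV = 14.41e3         # 14.41 keV gamma ray of 57Fe

def mm_per_s_to_ev(v):
    """Doppler shift (eV) for a drive velocity v (mm/s), per Equation 25."""
    return (v / C_MM_PER_S) * E_GAMMA_EV

print(mm_per_s_to_ev(10.0))                  # ~4.8e-7 eV at 10 mm/s
print(4.7e-9 * C_MM_PER_S / E_GAMMA_EV)      # natural width -> ~0.097 mm/s

# The convolution of two Lorentzians is again a Lorentzian whose FWHM is the
# sum of the two input widths.
def lorentzian(v, fwhm):
    g = fwhm / 2.0
    return (g / np.pi) / (v**2 + g**2)

v = np.linspace(-5.0, 5.0, 4001)             # velocity axis, mm/s
dv = v[1] - v[0]
measured = np.convolve(lorentzian(v, 0.097), lorentzian(v, 0.097), 'same') * dv

above = v[measured >= measured.max() / 2.0]  # region above half-maximum
print(above[-1] - above[0])                  # ~0.194 mm/s
```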
Biochemical experiments often require growing organisms in the presence of 57Fe. This is common practice for studies on heme proteins, for example. For inorganic materials, it is sometimes possible to study dilute concentrations of Fe by isotopic enrichment. It is also common practice to use 57Fe as an impurity, even when Fe is not part of the structure. Sometimes it is clear that the 57Fe atom will substitute on the site of another transition metal, for example, and the local chemistry of this site can be studied with 57Fe dopants. The same approach can be used with the 57Co radioisotope, but this is not a common practice because it involves the preparation of radioactive materials. With 57Co doping, the sample material itself serves as the radiation source, and the sample is moved with respect to a single-line absorber to acquire the Mössbauer spectrum. These "source experiments" can be performed with concentrations of 57Co in the ppm range, providing a potent local probe in the material. Another advantage of source experiments is that the samples are usually so dilute in the Mössbauer isotope that there is no thickness distortion of the measured spectrum. The single-line absorber, typically sodium ferrocyanide containing 0.2 mg/cm² of 57Fe, may itself have thickness distortion, but it is the same for all Doppler velocities. The net effect of absorber thickness is a broadening of spectral features without a distortion of intensities.
Synchrotron Sources

Since 1985 (Gerdau et al., 1985), it has become increasingly practical to perform Mössbauer spectrometry measurements with a synchrotron source of radiation, rather than a radioisotope source. This work has become more routine with the advent of Mössbauer beamlines at the European Synchrotron Radiation Facility at Grenoble, France, the Advanced Photon Source at Argonne National Laboratory, Argonne, Illinois, and the SPring-8 facility in Harima, Japan. Work at these facilities first requires success in an experiment approval process. Successful beamtime proposals will not involve experiments that can be done with radioisotope sources. Special capabilities that are offered by synchrotron radiation sources are the time structure of the incident radiation, its brightness and collimation, and the prospect of measuring energy spectra off-resonance to study phonons and other excitations in solids. Synchrotron radiation for Mössbauer spectrometry is provided by an undulator magnet device inserted in the synchrotron storage ring. The undulator has tens of magnetic poles, positioned precisely so that the electron accelerations in each pole are arranged to add in phase. This provides a high concentration of radiation within a narrow range of angle, somewhat like Bragg diffraction from a crystal. This highly parallel radiation can be used to advantage in measurements through narrow windows, such as in diamond anvil cells. The highly parallel synchrotron radiation should permit a number of new diffraction experiments, using the Mössbauer effect for the coherent scattering mechanism. Measurements of energy spectra are impractical with a synchrotron source, but equivalent spectroscopic information is available in the time domain. The method may be perceived as "Fourier transform Mössbauer spectrometry." A synchrotron photon, with time coherence less than 1 ns, can excite all resonant nuclei in the sample.
Over the period of time for nuclear decay, 100 ns or so, the nuclei emit photon waves with energies characteristic of their hyperfine fields. Assume that there are two such hyperfine fields in the solid, providing photons of energy Eγ⁰ + ε1 and Eγ⁰ + ε2. In the forward scattering direction, the two photon waves can add in phase. The time dependence of the photon at the detector is obtained by the coherent sum as in Equation 3:

T(t) = exp[−i(Eγ⁰ + ε1)t/ħ] + exp[−i(Eγ⁰ + ε2)t/ħ]     (26)
The photon intensity at the detector, I(t), has the time dependence

I(t) = T*(t)T(t) = 2{1 + cos[(ε2 − ε1)t/ħ]}     (27)
When the energy difference between levels, ε2 − ε1, is greater than the natural linewidth, Γ, the forward scattered intensity measured at the detector will undergo a number of oscillations during the time of the nuclear decay. These "quantum beats" can be Fourier transformed to provide energy differences between hyperfine levels of the nucleus (Smirnov, 1996). It should be mentioned that forward scattering from thick samples also shows a
phenomenon of "dynamical beats," which involve energy interchanges between scattering processes.
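The quantum-beat intensity of Equation 27 is easy to simulate. The splitting used below is illustrative; the beat period 2πħ/(ε2 − ε1) then falls well inside the ~100 ns decay time, so several oscillations are observed:

```python
import numpy as np

HBAR_EV_S = 6.582e-16      # hbar in eV*s
e2_minus_e1 = 4.8e-7       # eV, an illustrative hyperfine splitting

t = np.linspace(0.0, 100e-9, 2001)                       # 0 to 100 ns
intensity = 2.0 * (1.0 + np.cos(e2_minus_e1 * t / HBAR_EV_S))  # Equation 27

beat_period = 2.0 * np.pi * HBAR_EV_S / e2_minus_e1
print(beat_period)         # ~8.6e-9 s, so ~12 beats within 100 ns
```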
Valence and Spin Determination

The isomer shift, with supplementary information provided by the quadrupole splitting, can be used to determine the valence and spin of 57Fe and 119Sn atoms. The isomer shift is proportional to the electron density at the nucleus, but this is influenced by the different s- and p-donor acceptance strengths of surrounding ligands, their electronegativities, covalency effects, and other phenomena. It is usually best to have some independent knowledge about the electronic state of Fe or Sn in the material before attempting a valency determination. Even for unknown materials, however, valence and spin can often be determined reliably for the Mössbauer isotope. The 57Fe isomer shifts shown in Figure 4 are useful for determining the valence and spin state of Fe ions. If the 57Fe isomer shift of an unknown compound is +1.2 mm/s with respect to bcc Fe, for example, it is identified as high-spin Fe(II). Low-spin Fe(II) and Fe(III) compounds show very similar isomer shifts, so it is not possible to distinguish between them on the basis of isomer shift alone. Fortunately, there are distinct differences in the electric quadrupole splittings of these electronic states. For low-spin Fe(II), the quadrupole splittings are rather small, being in the range of 0 to 0.8 mm/s. For low-spin Fe(III) the electric quadrupole splittings are larger, being in the range 0.7 to 1.7 mm/s. The other oxidation states shown in Figure 4 are not so common, and tend to be of greater interest to chemists than materials scientists. The previous analysis of valence and spin state assumed that the material is not magnetically ordered. In cases where a hyperfine magnetic field is present, identification of the chemical state of Fe is sometimes even easier. Table 1 presents a few examples of hyperfine magnetic fields and isomer shifts for common magnetic oxides and oxyhydroxides (Simmons and Leidheiser, 1976). This table is given as a convenient guide, but the hyperfine parameters may depend on crystalline quality and stoichiometry (Bowen et al., 1993).

Table 1. Hyperfine Parameters of Common Oxides and Oxyhydroxides^a

Compound (Fe Site)        HMF (T)   Q.S.   I.S. (vs. Fe)   Temp. (K)
α-FeOOH                   50.0      0.25   +0.61           77
α-FeOOH                   38.2      0.25   +0.38           300
β-FeOOH                   48.5      0.64   +0.39           80
β-FeOOH                   0         0.62   +0.38           300
γ-FeOOH                   0         0.60   +0.35           295
δ-FeOOH (large cryst.)    42.0             +0.42           295
FeO                                 0.8    +0.93           295
Fe3O4 (Fe(III), A)        49.3             +0.26           298
Fe3O4 (Fe(II, III), B)    46.0             +0.67           298
α-Fe2O3                   51.8             +0.39           296
γ-Fe2O3 (A)               50.2             +0.18           300
γ-Fe2O3 (B)               50.3             +0.40           300

^a Abbreviations: HMF, hyperfine magnetic field; I.S., isomer shift; Q.S., quadrupole splitting; T, tesla.
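The isomer-shift/quadrupole-splitting logic above can be sketched as a small decision routine. The numeric thresholds are illustrative approximations chosen to match the examples in the text, not tabulated boundaries, and the 0.7 to 0.8 mm/s overlap between the low-spin species is resolved arbitrarily here:

```python
# Rough valence/spin assignment from isomer shift (IS, vs. bcc Fe) and
# quadrupole splitting (QS), both in mm/s. Thresholds are illustrative.
def classify_fe(is_mm_s, qs_mm_s):
    if is_mm_s > 0.9:                 # e.g., +1.2 mm/s
        return "high-spin Fe(II)"
    if is_mm_s < 0.5:                 # low-spin Fe(II) and Fe(III) overlap in IS
        if qs_mm_s <= 0.8:
            return "low-spin Fe(II)"  # QS typically 0 to 0.8 mm/s
        if qs_mm_s <= 1.7:
            return "low-spin Fe(III)" # QS typically 0.7 to 1.7 mm/s
    return "indeterminate from IS/QS alone"

print(classify_fe(1.2, 0.0))   # high-spin Fe(II)
print(classify_fe(0.1, 1.2))   # low-spin Fe(III)
```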
The isomer shifts for 119Sn compounds have a wider range than for 57Fe compounds. Isomer shifts for compounds with Sn(IV) ions range from −0.5 to +1.5 mm/s versus SnO2. For Sn(II) compounds, the range of isomer shifts is +2.2 to +4.2 mm/s versus SnO2. Within these ranges it is possible to identify other chemical trends. In particular, for Sn compounds there is a strong correlation of isomer shift with the electronegativity of the ligands. This correlation between isomer shift and ligand electronegativity is especially reliable for Sn(IV) ions. Within a family of Sn(IV) compounds of similar coordination, the isomer shift depends on the electronegativity of the surrounding ligands as −1.27χ mm/s, where χ is the Pauling electronegativity. The correlation with Sn(II) is less reliable, in part because of the different coordinations found for this ion. Finally, it should be mentioned that there have been a number of efforts to correlate the local coordination of 57Fe with the electric quadrupole splitting. These correlations are often reliable within a specific class of compounds, typically showing a semiquantitative relationship between quadrupole splitting and the degree of distortion of the local atomic structure.

Phase Analysis
Figure 4. Ranges of isomer shifts in Fe compounds with various valences and spin states, with reference to bcc Fe at 300 K. Thicker lines are more common configurations (Greenwood and Gibb, 1971; Gütlich, 1975).
When more than one crystallographic phase is present in a material containing 57Fe or 119Sn, it is often possible to determine the phase fractions at least semiquantitatively. Usually some supplemental information is required before quantitative information can be derived. For example, most multiphase materials contain several chemical elements. Since Mössbauer spectrometry detects only the Mössbauer isotope, to determine the volume fraction of each phase, it is necessary to know its concentration of Mössbauer isotope. Quantitative phase analysis tends to be most reliable when the material is rich in the Mössbauer atom. Phase fractions in iron alloys, steels, and iron oxides can often be measured routinely by Mössbauer
Figure 5. Mössbauer spectra of an alloy of Fe–8.9 atomic % Ni. The initial state of the material was ferromagnetic bcc phase, shown by the six-line spectrum at the top of the figure. This top spectrum was acquired at 23°C. The sample was heated in situ in the Mössbauer spectrometer to 600°C, for the numbers of hours marked on the curves, to form increasing amounts of fcc phase, evident as the single paramagnetic peak near 0.4 mm/s. This fcc phase is stable at 500°C, but not at 23°C, so the middle spectra were acquired at 500°C in interludes between heatings at 600°C for various times. At the end of the high-temperature runs, the sample temperature was again reduced to 23°C, and the final spectrum shown at the bottom of the figure showed that the fcc phase had transformed back into bcc phase. A trace of oxide is evident in all spectra as additional intensity around +0.4 mm/s at 23°C.
spectrometry (Schwartz, 1976; Simmons and Leidheiser, 1976; Cortie and Pollak, 1995; Campbell et al., 1995). Figure 5 is an example of phase analysis of an Fe-Ni alloy, for which the interest was in determining the kinetics of fcc phase formation at 600°C (Fultz, 1982). The fcc phase, once formed, is stable at 500°C but not at room temperature. To determine the amount of fcc phase formed at 600°C it was necessary to measure Mössbauer spectra at 500°C without an intervening cooling to room temperature for spectrum acquisition. Mössbauer spectrometry is well suited for detecting small amounts of fcc phase in a bcc matrix, since the fcc phase is paramagnetic, and all its intensity appears as a single peak near the center of the spectrum. Amounts of fcc phase ("austenite") of 0.5% can be detected in iron alloys and steels, and quantitative analysis of the fcc phase fraction is straightforward. The spectra in Figure 5 clearly show the six-line pattern of the bcc phase and the growth of the single peak at 0.4 mm/s from the fcc phase. These spectra show three other features that are common to many Mössbauer spectra. First, the spectrum at the top of Figure 5 shows a broadening of the outer lines of the sextet with respect to the inner lines (also see Fig. 2). This broadening originates with a distribution of hyperfine magnetic fields in alloys.
The different numbers of Ni neighbors about the various 57Fe atoms in the bcc phase cause different perturbations of the 57Fe HMF. Second, the Curie temperature of bcc Fe–8.9 atomic % Ni is 700°C. At the Curie temperature the average lattice magnetization is zero, and the HMF is also zero. At 500°C the alloy is approaching the Curie temperature, and shows a strong reduction in its HMF as evidenced by the smaller splitting of the six-line pattern with respect to the pattern at 23°C. Finally, at 500°C the entire spectrum is shifted to the left towards more negative isomer shift. This is the relativistic thermal shift of Equation 20. To obtain the phase fractions, the fcc and bcc components of the spectrum were isolated and integrated numerically. Isolating the fcc peak was possible by digital subtraction of the initial spectrum from spectra measured after different annealing times. The fraction of the fcc spectral component then needed two correction factors to convert it into a molar phase fraction. One factor accounted for the different chemical compositions of the fcc and bcc phases (the fcc phase was enriched in Ni to about 25%). A second factor accounted for the differences in recoil-free fraction of the two phases. Fortunately, the Debye temperatures of the two phases were known, and they differed little, so the differences in recoil-free fraction were not significant. The amount of fcc phase in the alloy at 500°C was found to change from 0.5% initially to 7.5% after 34 h of heating at 600°C.

Solutes in bcc Fe Alloys

The HMF in pure bcc Fe is −33.0 T for every Fe atom, since every Fe atom has an identical chemical environment of 8 Fe first-nearest-neighbors (1nn), 6 Fe 2nn, 12 Fe 3nn, etc. In ferromagnetic alloys, however, the 57Fe HMF is perturbed significantly by the presence of neighboring solute atoms. In many cases, this perturbation is about +2.5 T (a reduction in the magnitude of the HMF) for each 1nn solute atom.
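An additive perturbation model of this kind can be sketched in a few lines. The 1nn value follows the +2.5 T figure quoted above; the 2nn perturbation is an assumed illustrative number, not taken from Figure 6:

```python
# Additive model for the 57Fe HMF in a dilute bcc Fe-X alloy: each 1nn and
# 2nn solute shifts the field by DH1 and DH2.
H0 = -33.0     # T, HMF of pure bcc Fe (sign opposite to the magnetization)
DH1 = +2.5     # T, perturbation per 1nn solute (reduces the magnitude)
DH2 = +1.5     # T, assumed illustrative 2nn perturbation

def hmf(n1, n2):
    """HMF (T) at a 57Fe atom with n1 solutes in its 1nn shell and n2 in its 2nn shell."""
    return H0 + n1 * DH1 + n2 * DH2

print(hmf(0, 0))   # -33.0 T, solute-free environment
print(hmf(1, 0))   # -30.5 T: magnitude reduced by one 1nn solute
```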
A compilation of some HMF perturbations for 1nn solutes and 2nn solutes is presented in Figure 6. These data were obtained by analysis of Mössbauer spectra from dilute bcc Fe-X alloys (Vincze and Campbell, 1973; Vincze and Aldred, 1974; Fultz, 1993). In general, the HMF perturbations at 57Fe nuclei from nearest-neighbor solute atoms originate from several sources, but for nonmagnetic solutes such as Si, the effects are fairly simple to understand. When the Si atom substitutes for an Fe atom in the bcc lattice, a magnetic moment of 2.2 μB is removed (the Fe) and replaced with a magnetic hole (the Si). The 4s conduction electrons redistribute their spin density around the Si atom, and this redistribution is significant at 1nn and 2nn distances. The Fermi contact interaction and Heff (Equation 24) are sensitive to the 4s electron spin density, which has finite probability at the 57Fe nucleus. Another important feature of 3p, 4p, and 5p solutes is that their presence does not significantly affect the magnetic moments at neighboring Fe atoms. Bulk magnetization measurements on Fe-Si and Fe-Al alloys, for example, show that the magnetic moment of the material decreases approximately in proportion to the fraction of Al or Si in the alloy. The core electron
Figure 6. The hyperfine magnetic field perturbation, ΔH1X, at a Fe atom caused by one 1nn solute of type X, and the 2nn perturbation, ΔH2X, versus the atomic number of the solute. The vertical line denotes the column of Fe in the periodic table.
polarization, which involves exchange interactions between the unpaired 3d electrons at the 57Fe atom and its inner-shell s electrons, is therefore not much affected by the presence of Si neighbors. The dominant effect comes from the magnetic hole at the solute atom, which causes the redistribution of 4s spin density. Figure 6 shows that the nonmagnetic 3p, 4p, and 5p elements all cause about the same HMF perturbation at neighboring 57Fe atoms, as do the nonmagnetic early 3d transition metals.
For solutes that perturb significantly the magnetic moments at neighboring Fe atoms, especially the late transition metals, the core polarization at the 57Fe atom is altered. There is an additional complicating effect from the matrix Fe atoms near the 57Fe atom, whose magnetic moments are altered enough to affect the 4s conduction electron polarization at the 57Fe (Fultz, 1993). The HMF distribution can sometimes provide detailed information on the arrangements of solutes in nondilute bcc Fe alloys. For most solutes (that do not perturb significantly the magnetic moments at Fe atoms), the HMF at a 57Fe atom depends monotonically on the number of solute atoms in its 1nn and 2nn shells. Hyperfine magnetic field perturbations can therefore be used to measure the chemical composition or the chemical short-range order in an alloy containing up to 10 atomic % solute or even more. In many cases, it is possible to distinguish among Fe atoms having different numbers of solute atoms as first neighbors, and then determine the fractions of these different first-neighbor environments. This is considerably more information on chemical short-range order (SRO) than just the average number of solute neighbors, as provided by a 1nn Warren-Cowley SRO parameter, for example. An example of chemical short-range order in an Fe–26 atomic % Al alloy is presented in Figure 7. The material was cooled at a rate of 10⁶ K/s from the melt by piston-anvil quenching, producing a polycrystalline ferromagnetic alloy with a nearly random distribution of Al atoms on the bcc lattice. With low-temperature annealing, the material evolved toward its equilibrium state of D03 chemical order. The Mössbauer spectra in Figure 7A change significantly as the alloy evolves chemical order. The overlap of several sets of six-line patterns does confuse the physical picture, however, and further analysis requires the extraction of an HMF distribution from the experimental data.
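The first-neighbor statistics of a random solution, against which such extracted distributions are compared, follow the binomial distribution. A sketch using the composition and bcc 1nn coordination of the Fe–26 atomic % Al example:

```python
from math import comb

def p_binomial(n, x=0.26, z=8):
    """Probability of n solute atoms among z 1nn sites at solute fraction x."""
    return comb(z, n) * x**n * (1 - x)**(z - n)

# Random bcc Fe-26% Al: probabilities of n = 0..8 Al atoms in the 1nn shell.
random_alloy = [p_binomial(n) for n in range(9)]
print(random_alloy[0])    # P(0 Al 1nn), ~0.09
print(random_alloy[4])    # P(4 Al 1nn), ~0.10
# Perfect D03 order instead concentrates all weight at n = 0 and n = 4,
# in a 1:2 ratio of Fe sites.
```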
Software packages available for such work are described below (see Data Analysis). Figure 7B shows HMF distributions extracted from the three spectra of Figure 7A. At the top of Figure 7B are markers indicating the numbers of Al atoms in the 1nn shell of the 57Fe nucleus associated with the HMF. With low-temperature annealing, there is a clear increase in the numbers of 57Fe atoms with 0 and 4 Al neighbors, as expected when D03 order is evolving in the material. The perfectly ordered D03 structure has two chemical sites for Fe atoms, one with 0 Al neighbors and the other with 4 Al neighbors, in a 1:2 ratio. The HMF distributions were fit to a set of Gaussian functions to provide data on the chemical short range order in the alloys. These
Figure 7. (A) Conversion electron Mössbauer spectra from a specimen of bcc 57Fe and three specimens of disordered, partially ordered, and D03-ordered 57Fe3Al. (B) HMF distributions of the 57Fe3Al specimens. Peaks in the HMF distribution are labeled with numbers indicating the different numbers of 1nn Al neighbors about the 57Fe atom. (C) Probabilities for the 57Fe atom having various numbers, n, of Al atoms as 1nn.
data on chemical short-range order are presented in Figure 7C.

Crystal Defects and Small Particles

Since Mössbauer spectrometry probes local environments around a nucleus, it has often been proposed that Mössbauer spectra should be sensitive to the local atomic structures at grain boundaries and defects such as dislocations and vacancies. This is in fact true, but the measured spectra are an average over all Mössbauer atoms in a sample. Unless the material is chosen carefully so that the Mössbauer atom is segregated to the defect of interest, the spectral contribution from the defects is usually overwhelmed by the contribution from Mössbauer atoms in regions of perfect crystal. The recent interest in nanocrystalline materials, however, has provided a number of new opportunities for Mössbauer spectrometry (Herr et al., 1987; Fultz et al., 1995). The number of atoms at and near grain boundaries in nanocrystals is typically 35% for bcc Fe alloys with crystallite sizes of 7 nm or so. Such a large fraction of grain boundary atoms makes it possible to identify distinct contributions from Mössbauer atoms at grain boundaries, and to identify their local electronic environment. Mössbauer spectrometry can provide detailed information on some features of small-particle magnetism (Mørup, 1990). When a magnetically ordered material is in the form of a very small particle, it is easier for thermal energy to realign its direction of magnetization. The particle retains its magnetic order, but the change in axis of magnetization will disturb the shape of the Mössbauer spectrum if the magnetic realignment occurs on the time scale τs, which is ħ divided by the hyperfine magnetic field energy (see Relaxation Phenomena for discussion of the time window for measuring hyperfine interactions). An activation energy is associated with this "superparamagnetic" behavior, which is the magnetocrystalline anisotropy energy times the volume of the crystallite, KV.
The probability of activating a spin rotation in a small particle is the Boltzmann factor for overcoming the anisotropy energy, so the condition for observing a strong relaxation effect in the Mo¨ ssbauer spectrum is ts ¼ A expðkV=kB Tb Þ
average orientation, which serve to reduce the HMF by a modest amount. At increasing temperatures around Tb, however, large fluctuations occur in the magnetic alignment. The result is first a severe uncertainty of the HMF distribution, leading to a very broad background in the spectrum, followed by the growth of a paramagnetic peak near zero velocity. All of these effects can be observed in the spectra shown in Figure 8. Here, the biomaterial samples
ð28Þ
The temperature, Tb, satisfying Equation 28 is known as the ‘‘blocking temperature.’’ The prefactor of Equation 28, the attempt frequency, is not so well understood, so studies of superparamagnetic behavior often study the blocking temperature versus the volume of the particles. In practice, most clusters of small particles have a distribution of blocking temperatures, and there are often interactions between the magnetic moments at adjacent particles. These effects can produce Mo¨ ssbauer spectra with a wide variety of shapes, including very broad Lorentizian lines. At temperatures below Tb, the magnetic moments of small particles undergo small fluctuations in their alignment. These small-amplitude fluctuations can be considered as vibrations of the particle magnetization about an
Figure 8. Mo¨ ssbauer spectra from a specimen of haemosiderin, showing the effects of superparamagnetism with increasing temperature (Bell et al., 1984; Dickson, 1987).
¨ SSBAUER SPECTROMETRY MO
comprised a core of haemosiderin, an iron storage compound, encapsulated within a protein shell. A clear sixline pattern is observed at 4.2 K, but the splitting of these six lines is found to decrease with temperature owing to small amplitude fluctuations in magnetic alignment. At temperatures around 40 to 70 K, a broad background appears under the measured spectrum, and a paramagnetic doublet begins to grow in intensity with increasing temperature. These effects are caused by large thermal reorientations of the magnetization. Finally, by 200 K, the thermally induced magnetic fluctuations are all of large amplitude and of characteristic times too short to permit a HMF to be detected by Mo¨ ssbauer spectrometry.
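As a numerical illustration of Equation 28, the sketch below solves for the blocking temperature T_b. The anisotropy constant, particle size, and time scales are assumed, representative values for this illustration, not data taken from this unit.

```python
import math

k_B = 1.380649e-23   # Boltzmann constant, J/K

def blocking_temperature(K, diameter, t_s, A):
    """Solve Equation 28, t_s = A*exp(K*V/(k_B*T_b)), for T_b.

    K        -- magnetocrystalline anisotropy energy density, J/m^3
    diameter -- particle diameter, m (spherical particle assumed)
    t_s      -- measurement time scale, s
    A        -- prefactor (inverse attempt frequency), s
    """
    V = (math.pi / 6.0) * diameter**3   # volume of a sphere
    return K * V / (k_B * math.log(t_s / A))

# Assumed inputs: K ~ 5e4 J/m^3, an 8-nm particle, a Mossbauer time
# window of ~1e-8 s, and an attempt time of ~1e-11 s.
T_b = blocking_temperature(K=5e4, diameter=8e-9, t_s=1e-8, A=1e-11)
print(f"{T_b:.0f} K")   # on the order of 10^2 K for these inputs
```

Because T_b scales linearly with particle volume, halving the diameter lowers T_b by a factor of eight, which is why superparamagnetic studies emphasize the blocking temperature versus particle volume.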
DATA ANALYSIS AND INITIAL INTERPRETATION

Mössbauer spectra are often presented for publication with little or no processing. An obvious correction that can be applied to most transmission spectra is a correction for thickness distortion (see Sample Preparation). This correction is rarely performed, however, in large part because the thickness of the specimen is usually not known or is not uniform. The specimen is typically prepared to be thin, or at least this is assumed, and the spectrum is assumed to be representative of the Mössbauer absorption cross-section.

A typical goal of data analysis is to find individual hyperfine parameters, or more typically a distribution of hyperfine parameters, that characterize a measured spectrum. For example, the HMF distribution of Figure 2 should resemble a delta function centered at 330 kG. On the other hand, the HMF distribution of Figure 7B shows a number of peaks that are characteristic of different local chemical environments. Distributions of electric quadrupole splittings and isomer shifts are also useful for understanding heterogeneities in the local atomic arrangements in materials. Several software packages are available to extract distributions of hyperfine parameters from Mössbauer spectra (Hesse and Rübartsch, 1974; Le Caër and Dubois, 1979; Brand and Le Caër, 1988; Lagarec and Rancourt, 1997). These programs are often distributed by their authors, who may be located through the Mössbauer Information eXchange (see Internet Resources). The different programs extract hyperfine distributions from experimental spectra with different numerical approaches, but all will show how successfully the hyperfine distribution can be used to regenerate the experimental spectrum. In the presence of statistical noise, the reliability of these derived hyperfine distributions must be considered carefully. In particular, over small ranges of hyperfine parameters, the hyperfine distributions are not unique. For example, it may be unrealistic to distinguish one Lorentzian-shaped peak centered at a particular velocity from the sum of several peaks distributed within a quarter of a linewidth around this same velocity. This nonuniqueness can lead to numerical problems in extracting hyperfine distributions from experimental data. Some software packages use smoothing parameters to penalize the algorithm when it picks a candidate HMF distribution with
sharp curvature. When differences in hyperfine distributions are small, there is always an issue of their uniqueness. Sometimes the data analysis cannot distinguish between different types of hyperfine distributions. For example, a spectrum that has been broadened by an EFG distribution, or even an HMF distribution, can be fit perfectly with an IS distribution. The physical origin of hyperfine distributions may not be obvious, especially when the spectrum shows little structure. Application of an external magnetic field may be helpful in identifying a weak HMF, however. In general, distributions of all three hyperfine parameters (IS, EFG, HMF) will be present simultaneously in a measured spectrum. These parameters may be correlated; for example, nuclei having the largest HMF may have the largest (or smallest) IS. Sorting out these correlations is often a research topic in itself, although the software for calculating hyperfine distributions typically allows for simple linear correlations between the distributions. Both the EFG and the HMF define an axis of quantization for the nuclear spin. However, the direction of magnetization (for the HMF axis) generally does not coincide with the directions of the chemical bonds responsible for the EFG. The general case, with comparable hyperfine interaction energies of HMFs and EFGs, is quite complicated and well beyond the scope of this unit. Some software packages using model spin Hamiltonians are available to calculate spectra acquired under these conditions, however. In the common case when the HMF causes much larger spectral splittings than the EFG, the usual effect of the EFG in polycrystalline samples is a simple broadening of the peaks in the magnetic sextet, with no shifts in their positions.

With even modest experimental care, Mössbauer spectra can be highly reproducible from run to run. For example, the Mössbauer spectrum in Figure 2 was repeated many times over a period of several years. Almost all of these bcc Fe spectra had data points that overlaid on top of each other, point by point, within the accuracy of the counting statistics. Because of this reproducibility, it is tempting and often appropriate to try to identify spectral features with energy width smaller than the characteristic linewidth. An underexploited technique for data analysis is ''partial deconvolution'' or ''thinning.'' Since the lineshape of each nuclear transition is close to a Lorentzian function, and can be quite reproducible, it is appropriate to deconvolute a Lorentzian function from the experimental spectrum. This is formally the same as obtaining an IS distribution, but no assumptions about the origin of the hyperfine distributions are implied by this process. The net effect is to sharpen the peaks of the experimental Mössbauer spectrum, and this improvement in effective resolution can be advantageous when overlapping peaks are present in the spectra. The method does require excellent counting statistics to be reliable, however. Finally, in spite of the ready availability of computing resources, it is always important to look at differences in the experimental spectra themselves. Sometimes a digital subtraction of one normalized spectrum from another is an excellent way to identify changes in a material. In any
event, if no differences are detected by direct inspection of the data, changes in the hyperfine distributions obtained by computer software should not be believed. For this reason it is still necessary to show actual experimental spectra in research papers that use Mössbauer spectrometry.
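The ''thinning'' procedure described above can be sketched as Fourier-domain deconvolution of a Lorentzian kernel. This is a minimal illustration on a synthetic two-line spectrum, not the algorithm of any particular software package; the regularization constant eps stands in for the smoothing a real program would apply to suppress noise amplification.

```python
import numpy as np

def lorentzian(v, center, fwhm):
    """Unit-height Lorentzian line as a function of velocity v (mm/s)."""
    hwhm = fwhm / 2.0
    return hwhm**2 / ((v - center)**2 + hwhm**2)

# Synthetic "measured" spectrum: two overlapping lines, each broadened
# to twice the 0.097 mm/s natural 57Fe linewidth quoted in this unit.
v = np.linspace(-2.0, 2.0, 1024)
gamma = 0.097
spectrum = lorentzian(v, -0.15, 2 * gamma) + lorentzian(v, 0.15, 2 * gamma)

# Partial deconvolution: divide out one natural-width Lorentzian in the
# Fourier domain (Wiener-style damping keeps the division stable).
kernel = lorentzian(v, 0.0, gamma)
K = np.fft.fft(np.fft.ifftshift(kernel))
S = np.fft.fft(spectrum)
eps = 1e-3 * np.max(np.abs(K)) ** 2
thinned = np.real(np.fft.ifft(S * np.conj(K) / (np.abs(K) ** 2 + eps)))

# The valley between the two peaks deepens: the lines are sharpened.
mid = np.argmin(np.abs(v))
print(spectrum[mid] / spectrum.max(), thinned[mid] / thinned.max())
```

As the text warns, this sharpening trades resolution for noise sensitivity: with poor counting statistics the damping term must be made larger, and the resolution gain disappears.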
SAMPLE PREPARATION

A central concern for transmission Mössbauer spectrometry is the choice and control of specimen thickness. The natural thickness of a specimen is

t_N = (f_a n_a σ_a)^(-1)    (29)

where f_a is the recoil-free fraction of the Mössbauer isotope in the specimen, n_a is the number of Mössbauer nuclei per cm³, and σ_a is the cross-section in units of cm². The f_a can be estimated from Equation 19, for which it is useful to know that the γ-ray energies are 14.41 keV for 57Fe, 23.875 keV for 119Sn, and 21.64 keV for 151Eu. To obtain n_a it is important to know that the natural isotopic abundance is 2.2% for 57Fe, 8.6% for 119Sn, and 48% for 151Eu. The cross-sections for these isotopes are, in units of 10⁻¹⁹ cm², 25.7 for 57Fe, 14.0 for 119Sn, and 1.14 for 151Eu. Finally, the natural linewidths, Γ, of Equation 1, are 0.097 mm/s for 57Fe, 0.323 mm/s for 119Sn, and 0.652 mm/s for 151Eu.

The observed intensity in a Mössbauer transmission spectrum appears as a dip in count rate as the Mössbauer effect removes γ rays from the transmitted beam. Since this dip in transmission increases with sample thickness, thicker samples provide better signal-to-noise ratios and shorter data acquisition times. For quantitative work, however, it is poor practice to work with samples that are the natural thickness, t_N, or thicker, owing to an effect called ''thickness distortion.'' In a typical constant-acceleration spectrometer, the incident beam will have uniform intensity at all velocities of the source, and the top layer of sample will absorb γ rays in proportion to its cross-section (Equation 2). On the other hand, layers deeper within the sample will be exposed to a γ-ray intensity diminished at velocities where the top layers have absorbed strongly. The effect of this ''thickness distortion'' is to reduce the overall sample absorption at velocities where the Mössbauer effect is strong. Broadening of the Mössbauer peaks therefore occurs as the samples become thicker. This broadening can be modeled approximately as increasing the effective linewidth of Equation 1 from the natural Γ to Γ(1 + 0.135 t/t_N), where t is the sample thickness. However, it is important to note that in the tails of the Mössbauer peaks, where the absorption is weak, there is less thickness distortion. The peak shape in the presence of thickness distortion is therefore not a true Lorentzian function. Numerical corrections for the effects of thickness distortion are sometimes possible, but are rarely performed owing to difficulties in knowing the precise sample thickness and thickness uniformity. For quantitative work the standard practice is to use samples of thickness t_N/2 or so.

We calculate the natural thickness, t_N, for the case of natural Fe metal, which is widely used as a calibration standard for Mössbauer spectrometers. If there are no impurities in the Fe metal, its Mössbauer spectrum has sharp lines, as shown in Figure 2. The recoil-free fraction of bcc Fe is 0.80 at 300 K, and the other quantities follow Equation 29:

t_N = [0.80 × 0.022 × (7.86 g/cm³) × (1 mol / 55.85 g) × (6.023 × 10²³ atoms/mol) × (25 × 10⁻¹⁹ cm²) × (1/4)]⁻¹    (30)

= 11 × 10⁻⁴ cm = 11 µm    (31)
The factor of 1/4 in Equation 30 accounts for the fact that the absorption cross-section is split into six different peaks owing to the hyperfine magnetic field in bcc Fe. The strongest of these comprises 1/4 of the total absorption. Figure 2 was acquired with a sample of natural bcc Fe 25 µm in thickness. The linewidths of the inner two peaks are 0.235 mm/s, whereas those of the outer two are 0.291 mm/s. Although the outer two peaks are broadened by thickness distortion, effects of impurity atoms in the Fe are equally important. The widths of the inner two lines are probably a better measure of the spectrometer resolution.
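The arithmetic of Equations 29 to 31 is easy to check numerically; every constant below is quoted from the text, and the factor of 1/4 is the fraction of the absorption carried by the strongest line of the bcc Fe sextet.

```python
# Natural thickness of a bcc Fe calibration foil, Equations 29 to 31.
N_A      = 6.023e23   # atoms/mol, as used in Equation 30
rho      = 7.86       # g/cm^3, density of Fe
molar_m  = 55.85      # g/mol
abund_57 = 0.022      # natural isotopic abundance of 57Fe
f_a      = 0.80       # recoil-free fraction of bcc Fe at 300 K
sigma_a  = 25e-19     # cm^2, 57Fe absorption cross-section
frac     = 0.25       # strongest sextet line carries 1/4 of the absorption

n_a = (rho / molar_m) * N_A * abund_57      # 57Fe nuclei per cm^3
t_N = 1.0 / (f_a * n_a * sigma_a * frac)    # Equation 29, in cm
print(f"{t_N * 1e4:.1f} um")                # about 11 um
```

The result, 10.7 µm, rounds to the 11 µm of Equation 31. A 25-µm foil such as the one used for Figure 2 is therefore more than twice t_N, well above the t_N/2 rule of thumb for quantitative work, consistent with the thickness broadening noted above.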
LITERATURE CITED

Akai, H., Blügel, S., Zeller, R., and Dederichs, P. H. 1986. Isomer shifts and their relation to charge transfer in dilute Fe alloys. Phys. Rev. Lett. 56:2407–2410.

Bell, S. H., Weir, M. P., Dickson, D. P. E., Gibson, J. F., Sharp, G. A., and Peters, T. J. 1984. Mössbauer spectroscopic studies of human haemosiderin and ferritin. Biochim. Biophys. Acta 787:227–236.

Black, P. J. and Moon, P. B. 1960. Resonant scattering of the 14-keV iron-57 γ-ray, and its interference with Rayleigh scattering. Nature (London) 188:481–484.

Bowen, L. H., De Grave, E., and Vandenberghe, R. E. 1993. Mössbauer effect studies of magnetic soils and sediments. In Mössbauer Spectroscopy Applied to Magnetism and Materials Science, Vol. I (G. J. Long and F. Grandjean, eds.). pp. 115–159. Plenum, New York.

Brand, R. A. and Le Caër, G. 1988. Improving the validity of Mössbauer hyperfine parameter distributions: The maximum entropy formalism and its applications. Nucl. Instr. Methods Phys. Res. B 34:272–284.

Campbell, S. J., Kaczmarek, W. A., and Wang, G. M. 1995. Mechanochemical transformation of haematite to magnetite. Nanostruct. Mater. 6:735–738.

Cortie, M. B. and Pollak, H. 1995. Embrittlement and aging at 475°C in an experimental ferritic stainless-steel containing 38 wt.% chromium. Mater. Sci. Eng. A 199:153–163.

De Grave, E., Persoons, R. M., and Vandenberghe, R. E. 1993. Mössbauer study of the high temperature phase of Co-substituted magnetite. Phys. Rev. B 47:5881–5893.

Dickson, D. P. E. 1987. Mössbauer spectroscopic studies of magnetically ordered biological materials. Hyperfine Interact. 33:263–276.

Faldum, T., Meisel, W., and Gütlich, P. 1994. Oxidic and metallic Fe/Ni multilayers prepared from Langmuir-Blodgett films. Hyperfine Interact. 92:1263–1269.
Feldwisch, R., Sepiol, B., and Vogl, G. 1994. Elementary diffusion jump of iron atoms in intermetallic phases studied by Mössbauer spectroscopy. II. From order to disorder. Acta Metall. Mater. 42:3175–3181.

Fultz, B. 1982. A Mössbauer Spectrometry Study of Fe-Ni-X Alloys. Ph.D. Thesis, University of California at Berkeley.

Fultz, B. 1993. Chemical systematics of iron-57 hyperfine magnetic field distributions in iron alloys. In Mössbauer Spectroscopy Applied to Magnetism and Materials Science, Vol. I (G. J. Long and F. Grandjean, eds.). pp. 1–31. Plenum Press, New York.

Fultz, B., Ahn, C. C., Alp, E. E., Sturhahn, W., and Toellner, T. S. 1997. Phonons in nanocrystalline 57Fe. Phys. Rev. Lett. 79:937–940.

Fultz, B., Kuwano, H., and Ouyang, H. 1995. Average widths of grain boundaries in nanophase alloys synthesized by mechanical attrition. J. Appl. Phys. 77:3458–3466.

Gancedo, J. R., Gracia, M., and Marco, J. F. 1991. CEMS methodology. Hyperfine Interact. 66:83–94.

Gerdau, E., Rüffer, R., Winkler, H., Tolksdorf, W., Klages, C. P., and Hannon, J. P. 1985. Nuclear Bragg diffraction of synchrotron radiation in yttrium iron garnet. Phys. Rev. Lett. 54:835–838.

Greenwood, N. N. and Gibb, T. C. 1971. Mössbauer Spectroscopy. Chapman & Hall, London.

Gütlich, P. 1975. Mössbauer spectroscopy in chemistry. In Mössbauer Spectroscopy (U. Gonser, ed.). Chapter 2. Springer-Verlag, New York.

Hannon, J. P. and Trammell, G. T. 1969. Mössbauer diffraction. II. Dynamical theory of Mössbauer optics. Phys. Rev. 186:306–325.

Herr, U., Jing, J., Birringer, R., Gonser, U., and Gleiter, H. 1987. Investigation of nanocrystalline iron materials by Mössbauer spectroscopy. Appl. Phys. Lett. 50:472–474.

Hesse, J. and Rübartsch, A. 1974. Model independent evaluation of overlapped Mössbauer spectra. J. Phys. E: Sci. Instrum. 7:526–532.

Kruijer, S., Keune, W., Dobler, M., and Reuther, H. 1997. Depth analysis of phase formation in Si after high-dose Fe ion implantation by depth-selective conversion electron Mössbauer spectroscopy. Appl. Phys. Lett. 70:2696–2698.

Lagarec, K. and Rancourt, D. G. 1997. Extended Voigt-based analytic lineshape method for determining n-dimensional correlated hyperfine parameter distributions in Mössbauer spectroscopy. Nucl. Instr. Methods Phys. Res. B 128:266–280.

Lamb, W. E., Jr. 1939. Capture of neutrons by atoms in a crystal. Phys. Rev. 55:190–197.

Le Caër, G. and Dubois, J. M. 1979. Evaluation of hyperfine parameter distributions from overlapped Mössbauer spectra of amorphous alloys. J. Phys. E: Sci. Instrum. 12:1083–1090.

Mørup, S. 1990. Mössbauer effect in small particles. Hyperfine Interact. 60:959–974.

Mössbauer, R. L. 1958. Kernresonanzfluoreszenz von Gammastrahlung in Ir191. Z. Phys. 151:124.

Ruebenbauer, K., Mullen, J. G., Nienhaus, G. U., and Shupp, G. 1994. Simple model of the diffusive scattering law in glass-forming liquids. Phys. Rev. B 49:15607–15614.

Schwartz, L. H. 1976. Ferrous alloy phase transformations. In Applications of Mössbauer Spectroscopy, Vol. 1 (R. L. Cohen, ed.). pp. 37–81. Academic Press, New York.

Sepiol, B., Meyer, A., Vogl, G., Rüffer, R., Chumakov, A. I., and Baron, A. Q. R. 1996. Time domain study of Fe-57 diffusion using nuclear forward scattering of synchrotron radiation. Phys. Rev. Lett. 76:3220–3223.

Seto, M., Yoda, Y., Kikuta, S., Zhang, X. W., and Ando, M. 1995. Observation of nuclear resonant scattering accompanied by phonon excitation using synchrotron radiation. Phys. Rev. Lett. 74:3828–3831.

Shvyd'ko, Yu. V. and Smirnov, G. V. 1989. Experimental study of time and frequency properties of collective nuclear excitations in a single crystal. J. Phys. Condens. Matter 1:10563–10584.

Simmons, G. W. and Leidheiser, H., Jr. 1976. Corrosion and interfacial reactions. In Applications of Mössbauer Spectroscopy, Vol. 1 (R. L. Cohen, ed.). pp. 92–93. Academic Press, New York.

Smirnov, G. V. 1996. Nuclear resonant scattering of synchrotron radiation. Hyperfine Interact. 97/98:551–588.

Squires, G. L. 1978. Introduction to the Theory of Thermal Neutron Scattering. p. 54. Dover Publications, New York.

Stahl, B. and Kankeleit, E. 1997. A high-luminosity UHV orange type magnetic spectrometer developed for depth-selective Mössbauer spectroscopy. Nucl. Instr. Methods Phys. Res. B 122:149–161.

Stephens, T. A. and Fultz, B. 1997. Chemical environment selectivity in Mössbauer diffraction from 57Fe3Al. Phys. Rev. Lett. 78:366–369.

Stephens, T. A., Keune, W., and Fultz, B. 1994. Mössbauer effect diffraction from polycrystalline 57Fe. Hyperfine Interact. 92:1095–1100.

Sturhahn, W., Toellner, T. S., Alp, E. E., Zhang, X., Ando, M., Yoda, Y., Kikuta, S., Seto, M., Kimball, C. W., and Dabrowski, B. 1995. Phonon density-of-states measured by inelastic nuclear resonant scattering. Phys. Rev. Lett. 74:3832–3835.

van Bürck, U., Smirnov, G. V., Mössbauer, R. L., Parak, F., and Semioschkina, N. A. 1978. Suppression of nuclear inelastic channels in nuclear resonance and electronic scattering of γ-quanta for different hyperfine transitions in perfect 57Fe single crystals. J. Phys. C Solid State Phys. 11:2305–2321.

Vincze, I. and Aldred, A. T. 1974. Mössbauer measurements in iron-base alloys with nontransition elements. Phys. Rev. B 9:3845–3853.

Vincze, I. and Campbell, I. A. 1973. Mössbauer measurements in iron based alloys with transition metals. J. Phys. F Metal Phys. 3:647–663.

Vogl, G. and Sepiol, B. 1994. Elementary diffusion jump of iron atoms in intermetallic phases studied by Mössbauer spectroscopy. I. Fe-Al close to equiatomic stoichiometry. Acta Metall. Mater. 42:3175–3181.

Walker, L. R., Wertheim, G. K., and Jaccarino, V. 1961. Interpretation of the Fe57 isomer shift. Phys. Rev. Lett. 6:98–101.

Watson, R. E. and Freeman, A. J. 1967. Hartree-Fock theory of electric and magnetic hyperfine interactions in atoms and magnetic compounds. In Hyperfine Interactions (A. J. Freeman and R. B. Frankel, eds.). Chapter 2. Academic Press, New York.

Williamson, D. L. 1993. Microstructure and tribology of carbon, nitrogen, and oxygen implanted ferrous materials. Nucl. Instr. Methods Phys. Res. B 76:262–267.

KEY REFERENCES

Bancroft, G. M. 1973. Mössbauer Spectroscopy: An Introduction for Inorganic Chemists and Geochemists. John Wiley & Sons, New York.

Belozerski, G. N. 1993. Mössbauer Studies of Surface Layers. Elsevier/North Holland, Amsterdam.
Cohen, R. L. (ed.). 1980. Applications of Mössbauer Spectroscopy, Vols. 1 and 2. Academic Press, New York. These 1980 volumes by Cohen contain review articles on the applications of Mössbauer spectrometry to a wide range of materials and phenomena, with some exposition of the principles involved.

Cranshaw, T. E., Dale, B. W., Longworth, G. O., and Johnson, C. E. 1985. Mössbauer Spectroscopy and its Applications. Cambridge University Press, Cambridge.

Dickson, D. P. E. and Berry, F. J. (eds.). 1986. Mössbauer Spectroscopy. Cambridge University Press, Cambridge.

Frauenfelder, H. 1962. The Mössbauer Effect: A Review with a Collection of Reprints. W. A. Benjamin, New York. This book by Frauenfelder was written in the early days of Mössbauer spectrometry, but contains a fine exposition of principles. More importantly, it contains reprints of the papers that first reported the phenomena that are the basis for much of Mössbauer spectrometry. It includes an English translation of one of Mössbauer's first papers.

Gibb, T. C. 1976. Principles of Mössbauer Spectroscopy. Chapman and Hall, London.

Gonser, U. (ed.). 1975. Mössbauer Spectroscopy. Springer-Verlag, New York.

Gonser, U. (ed.). 1986. Microscopic Methods in Metals, Topics in Current Physics 40. Springer-Verlag, Berlin.

Gruverman, I. J. (ed.). 1976. Mössbauer Effect Methodology, Vols. 1–10. Plenum Press, New York.

Gütlich, P., Link, R., and Trautwein, A. (eds.). 1978. Mössbauer Spectroscopy and Transition Metal Chemistry. Springer-Verlag, Berlin.

Long, G. J. and Grandjean, F. (eds.). 1984. Mössbauer Spectroscopy Applied to Inorganic Chemistry, Vols. 1–3. Plenum Press, New York.

Long, G. J. and Grandjean, F. (eds.). 1996. Mössbauer Spectroscopy Applied to Magnetism and Materials Science, Vols. 1 and 2. Plenum Press, New York. These 1996 volumes by Long and Grandjean contain review articles on different classes of materials, and on different techniques used in Mössbauer spectrometry.

Long, G. J. and Stevens, J. G. (eds.). 1986. Industrial Applications of the Mössbauer Effect. Plenum, New York.

May, L. (ed.). 1971. An Introduction to Mössbauer Spectroscopy. Plenum Press, New York.

Mitra, S. (ed.). 1992. Applied Mössbauer Spectroscopy: Theory and Practice for Geochemists and Archaeologists. Pergamon Press, Elmsford, New York.

Thosar, B. V. and Iyengar, P. K. (eds.). 1983. Advances in Mössbauer Spectroscopy, Studies in Physical and Theoretical Chemistry 25. Elsevier/North Holland, Amsterdam.

Wertheim, G. 1964. Mössbauer Effect: Principles and Applications. Academic Press, New York.
INTERNET RESOURCES

http://www.kfki.hu/mixhp/
The Mössbauer Information eXchange, MIX, is a project of the KFKI Research Institute for Particle and Nuclear Physics, Budapest, Hungary. It is primarily for scientists, students, and manufacturers involved in Mössbauer spectroscopy and other nuclear solid-state methods.

http://www.unca.edu/medc
[email protected]
The Mössbauer Effect Data Center (University of North Carolina; J. G. Stevens, Director) maintains a library of most publications involving the Mössbauer effect, including hard-to-access publications. Computerized databases and database search services are available to find papers on specific materials.
BRENT FULTZ California Institute of Technology Pasadena, California
X-RAY TECHNIQUES

INTRODUCTION
X-ray scattering and spectroscopy methods can provide a wealth of information concerning the physical and electronic structure of crystalline and noncrystalline materials in a variety of external conditions and environments. X-ray powder diffraction, for example, is generally the first, and perhaps the most widely used, probe of crystal structure. Over the last nine decades, especially with the introduction of x-ray synchrotron sources during the last 25 years, x-ray techniques have expanded well beyond their initial role in structure determination. This chapter explores the wide variety of applications of x-ray scattering and spectroscopy techniques to the study of materials. Materials, in this sense, include not only bulk condensed matter systems, but liquids and surfaces as well. The term ''scattering'' is generally used to include x-ray measurements on noncrystalline systems, such as glasses and liquids, or poorly crystallized materials, such as polymers, as well as x-ray diffraction from crystalline solids. Information concerning long-range, short-range, and chemical ordering, as well as the existence, distribution, and characterization of various defects, is accessible through these kinds of measurements. Spectroscopy techniques generally make use of the energy dependence of the scattering or absorption cross-section to study chemical short-range order, identify the presence and location of chemical species, and probe the electronic structure through inelastic excitations.

X-ray scattering and spectroscopy provide information complementary to several other techniques found in this volume. Perhaps most closely related are the electron and neutron scattering methods described in Chapters 11 and 13, respectively. The utility of all three methods in investigations of atomic-scale structure arises from the close match of the wavelength of these probes to typical interatomic distances (a few Ångstroms). Of the three methods, the electron scattering interaction with matter is the strongest, so this technique is most appropriate to the study of surfaces or very thin samples. In addition, samples must be studied in an ultrahigh-vacuum environment. The relatively weak absorption of neutrons by most isotopes allows investigations of bulk samples. The principal neutron scattering interactions involve the nuclei of constituent elements and the magnetic moment of the outer electrons. Indeed, the cross-section for scattering from magnetic electrons is of the same order as scattering from the nuclei, so this technique is of great utility in studying the magnetic structures of magnetic materials. Neutron energies are typically on the order of a few meV to tens of meV, the same energy scale as many important elementary excitations in solids. Therefore inelastic neutron scattering has become a critical probe of elementary excitations, including phonon and magnon dispersion in solids. The principal x-ray scattering interaction in materials involves the atomic electrons, but is significantly weaker than the electron-electron scattering cross-section. X-ray absorption by solids is relatively strong at typical energies found in laboratory apparatus (8 keV), penetrating a few tens of microns into a sample. At x-ray synchrotron sources, x-ray beams of energies above 100 keV are available. At these high energies, the absorption of x-rays approaches small values comparable to neutron scattering. The energy dependence of the x-ray absorption cross-section is punctuated by absorption edges or resonances that are associated with atomic processes of the constituent elements. These resonances can be used to great advantage to investigate element-specific properties of the material and are the basis of many of the spectroscopic techniques described in this chapter.

Facilities for x-ray scattering and spectroscopy range from turn-key powder diffraction instruments for laboratory use to very sophisticated beamlines at x-ray synchrotron sources around the world. The growth of these synchrotron facilities in recent years, with corresponding increases in energy and scattering-angle resolution, energy tunability, polarization tunability, and raw intensity, has spurred tremendous advances in this field. For example, the technique of x-ray absorption spectroscopy (XAS) requires a source of continuous radiation that can be tuned over approximately 1 keV through the absorption edge of interest. Although possible using the bremsstrahlung radiation from conventional x-ray tubes, full exploitation of this technique requires the high-intensity, collimated beam of synchrotron radiation. As another example, the x-ray magnetic scattering cross-section is quite small (but finite) compared to the charge scattering cross-section. The high flux of x-rays available at synchrotron sources compared to neutron fluxes from reactors, however, allows x-ray magnetic scattering to effectively complement neutron magnetic scattering in several instances. Again, largely due to high x-ray fluxes and strong collimation, energy resolution in the meV range is achievable at current third-generation x-ray sources, paving the way for new x-ray inelastic scattering studies of elementary excitations. Third-generation sources include the Advanced Photon Source (APS) at Argonne National Laboratory and the Advanced Light Source (ALS) at Lawrence Berkeley National Laboratory in the United States, the European Synchrotron Radiation Facility (ESRF) in Europe, and SPring-8 in Japan.

ALAN I. GOLDMAN
X-RAY POWDER DIFFRACTION

INTRODUCTION

X-ray powder diffraction is used to determine the atomic structure of crystalline materials without the need for large (100-µm) single crystals. ''Powder'' can be a misnomer; the technique is applicable to polycrystalline phases
such as cast solids or films grown on a substrate. X-ray powder diffraction can be useful in a wide variety of situations. Below we list a number of questions that can be effectively addressed by this technique. This is an attempt to illustrate the versatility of x-ray powder diffraction and not, by any means, a complete list. Six experiments (corresponding to numbers 1 through 6 below), described later as concrete examples (see Practical Aspects of the Method), constitute an assortment of problems of varying difficulty and involvement that we have come across over the course of several years.

1. The positions and integrated intensities of a set of peaks in an x-ray powder diffraction pattern can be compared to a database of known materials in order to identify the contents of the sample and to determine the presence or absence of any particular phase.

2. A mixture of two or more crystalline phases can be easily and accurately analyzed in terms of its phase fractions, whether or not the crystal structures of all phases are known. This is called quantitative phase analysis, and it is particularly valuable if some or all of the phases are chemically identical and hence cannot be distinguished while in solution.

3. The crystal structure of a new or unknown material can be determined when a similar material with a known structure exists. Depending on the degree of similarity between the new and the old structure, this can be fairly straightforward.

4. The crystal structure of a new or unknown material can be solved ab initio even if no information about the material other than its stoichiometry is known. This case is significantly more difficult than the previous one, and it requires both high-resolution data and a significant investment of time and effort on the part of the investigator.

5. Phase transitions and solid-state reactions can be investigated in near real time by recording x-ray powder diffraction patterns as a function of time, pressure, and/or temperature.

6. Subtle structural details, such as lattice vacancies, of an otherwise known structure can be extracted. This also usually requires high-resolution data and a very high sample quality.
The variation of peak positions with sample orientation can be used to deduce information about the internal strain of a sample. This technique is not covered in this unit, and the interested reader is directed to references such as Noyan and Cohen (1987). Another related technique, not covered here, is texture analysis, the determination of the distribution of orientations in a polycrystalline sample.

X-ray powder diffraction is an old technique, in use for most of this century. The capabilities of the technique have recently grown for two main reasons: (1) the development of x-ray sources and optics (e.g., synchrotron radiation, Göbel mirrors) and (2) the increasing power of computers and software for the analysis of powder data. This unit discusses the fundamental principles of the technique, including the expression for the intensity of each diffraction peak, gives a brief overview of experimental techniques, describes several illustrative examples, gives a description of the procedures to interpret and analyze diffraction data, and discusses weaknesses and sources of possible errors.

Competitive and Related Techniques

Alternative Methods

1. Single-crystal x-ray diffraction requires single crystals of appropriate size (10 to 100 µm). Solution of single-crystal structures is usually more automated with appropriate software than powder structures. Single-crystal techniques can generally solve more complicated structures than powder techniques, whereas powder diffraction can determine the constituents of a mixture of crystalline solid phases.

2. Neutron powder diffraction generally requires larger samples than x rays. It is more sensitive to light atoms (especially hydrogen) than x rays are. Deuterated samples are often required if the specimen has a significant amount of hydrogen (see NEUTRON POWDER DIFFRACTION). Neutrons are sensitive to the magnetic structure (see MAGNETIC NEUTRON SCATTERING). Measurements must be performed at a special facility (reactor or spallation source).

3. Single-crystal neutron diffraction requires single-crystal samples of millimeter size, is more sensitive to light atoms (especially hydrogen) than are x rays, and often requires deuterated samples if the specimen has a significant amount of hydrogen. As in the powder case, it is sensitive to the magnetic structure and must be performed at a special facility (reactor or spallation source).

4. Electron diffraction can give lattice information for samples that have a strain distribution too great to be indexed by the x-ray powder technique. It provides spatial resolution for inhomogeneous samples and can view individual grains, but requires relatively sophisticated equipment (electron microscope; see LOW-ENERGY ELECTRON DIFFRACTION).

PRINCIPLES OF THE METHOD
If an x ray strikes an atom, it will be weakly scattered in all directions. If it encounters a periodic array of atoms, the waves scattered by each atom will reinforce in certain directions and cancel in others. Geometrically, one may imagine that a crystal is made up of families of lattice planes and that the scattering from a given family of planes will only be strong if the x rays reflected by each plane arrive at the detector in phase. This leads to a relationship between the x-ray wavelength λ, the spacing d between lattice planes, and the angle of incidence θ known as Bragg's law, λ = 2d sin θ. Note that the angle of deviation of the x ray is 2θ from its initial direction. This is fairly restrictive for a single crystal: for a given λ, even
if the detector is set at the correct 2θ for a given d spacing within the crystal, there will be no diffracted intensity unless the crystal is properly aligned to both the incident beam and the detector. The essence of the powder diffraction technique is to illuminate a large number of crystallites, so that a substantial number of them are in the correct orientation to diffract x rays into the detector. The geometry of diffraction in a single grain is described through the reciprocal lattice. The fundamental concepts are in any of the Key References or in introductory solid-state physics texts such as Kittel (1996) or Ashcroft and Mermin (1976); also see SYMMETRY IN CRYSTALLOGRAPHY. If the crystal lattice is defined by three vectors a, b, and c, there are three reciprocal lattice vectors defined as
a* = 2π(b × c)/(a · b × c)    (1)
and cyclic permutations thereof for b* and c* (in the chemistry literature the factor 2π is usually eliminated in this definition). These vectors define the reciprocal lattice, with the significance that any three integers (hkl) define a family of lattice planes with spacing d = 2π/|ha* + kb* + lc*|, so that the diffraction vector K = ha* + kb* + lc* satisfies |K| = 2π/d = 4π sin θ/λ (caution: most chemists and some physicists define |K| = 1/d = 2 sin θ/λ). The intensity of the diffracted beam is governed by the unit cell structure factor, defined as

F_hkl = Σ_j f_j exp(iK·R_j) exp(−W_j)    (2)
where R_j is the position of the jth atom in the unit cell, the summation is taken over all atoms in the unit cell, and f_j is the atomic scattering factor, tabulated in, e.g., the International Tables for Crystallography (Brown et al., 1992); it is equal to the number of atomic electrons at 2θ = 0 and decreases as a smooth function of sin θ/λ (there are "anomalous" corrections to this amplitude if the x-ray energy is close to a transition in a target atom). The Debye-Waller factor 2W is given by 2W = K²u_rms²/3, where u_rms is the (three-dimensional) root-mean-square deviation of the atom from its lattice position due to thermal and zero-point fluctuations. Experimental results are often quoted as the thermal parameter B, defined as 8π²u_rms²/3, so that the Debye-Waller factor is given by 2W = 2B sin²θ/λ². Note that thermal fluctuations of the atoms about their average positions weaken the diffraction lines but do not broaden them. As long as the diffraction is sufficiently weak (kinematic limit, assumed valid for most powders), the diffracted intensity is proportional to the square of the structure factor. In powder diffraction, it is always useful to bear in mind that the positions of the observed peaks indicate the geometry of the lattice, both its dimensions and any internal symmetries, whereas the intensities are governed by the arrangement of atoms within the unit cell. In a powder experiment, various factors act to spread the intensity over a finite range of the diffraction angle,
and so it is useful to consider the integrated intensity (power) of a given peak over the diffraction angle,

P_hkl = [P₀ λ³ R_e² l V_s / (16π R V²)] (1 + cos² 2θ)/(sin θ sin 2θ) F_hkl² M_hkl    (3)

where P₀ is the power density of the incident beam, R_e = 2.82 fm is the classical electron radius, l and R are the width of the receiving slit and the distance between it and the sample, and V_s and V are the effective illuminated volume of the sample and the volume of one unit cell. The term M_hkl is the multiplicity of the hkl peak, e.g., 8 for a cubic hhh and 6 for a cubic h00, and (1 + cos² 2θ)/(sin θ sin 2θ) is called the Lorentz polarization factor. The numerator takes the given form only for unpolarized incident radiation and in the absence of any other polarization-sensitive optical elements; it must be adapted, e.g., for a synchrotron-radiation source or a diffracted-beam monochromator. There are considerable experimental difficulties with measuring the absolute intensity of either the incident or the diffracted beam, and so the terms in the square brackets are usually lumped into a normalization factor, and one considers the relative intensity of different diffraction peaks or, more generally, the spectrum of intensity vs. scattering angle. An inherent limitation on the amount of information that can be derived from a powder diffraction spectrum arises from possible overlap of Bragg peaks with different (hkl), so that their intensities cannot be determined independently. This overlap may be exact, as in the coincidence of the cubic (511) and (333) reflections, or it may allow a partial degree of separation, as in the case of two peaks whose positions differ by a fraction of their widths. Because powder diffraction peaks generally become broader and more closely spaced at higher angles, peak overlap is a factor in almost every powder diffraction experiment.
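As a concrete illustration of Bragg's law, the structure factor of Equation 2, and the multiplicity and Lorentz polarization factors of Equation 3, the following Python sketch computes relative peak intensities for a hypothetical CsCl-type structure. The lattice parameter is an assumed value, and the scattering factors are crudely taken as the atomic numbers, with no angular falloff and no Debye-Waller factor, so the numbers are purely illustrative.

```python
import numpy as np

# Minimal sketch: relative powder intensities for a CsCl-type structure
# (simple cubic cell, Cs at (0,0,0), Cl at (1/2,1/2,1/2); a is illustrative).
a = 4.12             # lattice parameter, Å (assumed for the example)
wavelength = 1.5406  # Cu K-alpha1, Å

# (fractional coordinates, scattering factor f ~ Z at 2theta = 0)
atoms = [((0.0, 0.0, 0.0), 55.0),   # Cs
         ((0.5, 0.5, 0.5), 17.0)]   # Cl

def structure_factor(hkl):
    """F_hkl = sum_j f_j exp(2*pi*i*(h*x_j + k*y_j + l*z_j))."""
    return sum(f * np.exp(2j * np.pi * np.dot(hkl, r)) for r, f in atoms)

def relative_intensity(hkl, multiplicity):
    d = a / np.sqrt(sum(i * i for i in hkl))      # cubic d spacing
    theta = np.arcsin(wavelength / (2 * d))        # Bragg's law
    lp = (1 + np.cos(2 * theta) ** 2) / (np.sin(theta) * np.sin(2 * theta))
    return abs(structure_factor(hkl)) ** 2 * multiplicity * lp

# Mixed-parity h+k+l odd gives f_Cs - f_Cl (weak); even gives f_Cs + f_Cl
for hkl, m in [((1, 0, 0), 6), ((1, 1, 0), 12), ((1, 1, 1), 8)]:
    print(hkl, round(relative_intensity(hkl, m), 1))
```

Note how the (100) and (111) peaks are weak (|F| = 55 − 17 = 38) while (110) is strong (|F| = 72), even before the multiplicity and Lorentz polarization weighting.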
Perhaps the most important advance in powder diffraction during the last 30 years is the development of whole-pattern (Rietveld) fitting techniques for dealing with partially overlapped peaks, as discussed below. Diffraction peaks acquire a nonzero width from three main factors: instrumental resolution (not discussed here), finite grain size, and random strains. If the grains have a linear dimension L, then the full-width at half-maximum (FWHM) in 2θ, expressed in radians, of the diffraction line is estimated by the well-known Scherrer equation, FWHM(2θ) = 0.89λ/(L cos θ). This is a reflection of the fact that the length L of a crystal made up of atoms with period d can only be determined to within d. One should not take the numerical value too seriously, as it depends on the precise shape of the crystallites and the dispersion of the crystallite size distribution. If the crystallites have a needle or plate morphology, the size can be different for different families of lattice planes. On the other hand, if the crystallites are subject to a random distribution of fractional lattice strains having a FWHM of e_FWHM, the FWHM of the diffraction line will be 2 tan θ e_FWHM. It is sometimes asserted that size broadening produces a Lorentzian and strain a Gaussian lineshape, but there is no fundamental reason for this to be true, and counterexamples are frequently observed. If the sample peak width
exceeds the instrumental resolution, or can be corrected for that effect, it can be informative to make a plot (called a Williamson-Hall plot) of FWHM cos θ vs. sin θ. If the data points fall on a smooth curve, the intercept will give the particle size and the limiting slope the strain distribution. Indeed, the curve will be a straight line if both effects give a Lorentzian lineshape, because the shape of any peak would then be the convolution of the size and strain contributions. If the points in a Williamson-Hall plot are scattered, that may give useful information (or at least a warning) of anisotropic size or strain broadening. More elaborate techniques for the deconvolution of size and strain effects from experimental data are described in the literature (Klug and Alexander, 1974; Balzar and Ledbetter, 1993). There are two major problems in using powder diffraction measurements to determine the atomic structure of a material. First, as noted above, peaks overlap, so that the measured intensity cannot be uniquely assigned to the correct Miller indices (hkl). Second, even if the intensities were perfectly separated so that the magnitudes of the structure factors were known, one could not Fourier transform the measured structure factors to learn the atomic positions, because their phases are not known.
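The Williamson-Hall construction described above can be sketched numerically. In the following Python example the wavelength, grain size, and strain width are invented values; synthetic peak widths are generated from the Scherrer and strain terms (assumed additive, i.e., both Lorentzian) and then recovered from the intercept and slope of the straight-line fit.

```python
import numpy as np

# Williamson-Hall sketch: FWHM*cos(theta) = 0.89*lambda/L + 2*e*sin(theta)
wavelength = 1.5406   # Å (Cu K-alpha1)
L_true = 500.0        # grain size, Å (assumed for the example)
e_true = 2e-3         # fractional strain FWHM (assumed)

# Bragg angles of seven hypothetical peaks
theta = np.radians(np.array([10, 15, 20, 25, 30, 35, 40]))

# Synthetic widths (radians): Scherrer size term + strain term
fwhm = 0.89 * wavelength / (L_true * np.cos(theta)) + 2 * e_true * np.tan(theta)

# Linear fit of FWHM*cos(theta) against sin(theta)
slope, intercept = np.polyfit(np.sin(theta), fwhm * np.cos(theta), 1)
L_fit = 0.89 * wavelength / intercept   # Å, from the intercept
e_fit = slope / 2                       # strain, from the limiting slope

print(f"grain size ~ {L_fit:.0f} Å, strain FWHM ~ {e_fit:.1e}")
```

With real data the points scatter about the line, and, as noted above, systematic scatter can signal anisotropic size or strain broadening.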
PRACTICAL ASPECTS OF THE METHOD

The requirements to obtain a useful powder diffraction data set are conceptually straightforward: allow a beam of x rays to impinge on the sample and record the diffracted intensity as a function of angle. Practical realizations are governed by the desire to optimize various aspects of the measurement, such as the intensity, resolution, and discrimination against undesired effects (e.g., background from sample fluorescence). Most laboratory powder x-ray diffractometers use a sealed x-ray tube with a target of copper, molybdenum, or some other metal. About half of the x rays from such a tube are in the characteristic Kα line (λ = 1.54 Å for Cu, λ = 0.70 Å for Mo), and the remainder are in other lines and in a continuous bremsstrahlung spectrum. Rotating-anode x-ray sources can be approximately ten times brighter than fixed targets, with an attendant cost in complexity and reliability. In either case, one can either use the x rays emitted by the anode directly (so that diffraction of the continuous component of the spectrum contributes a smooth background under the diffraction peaks) or select the line radiation by a crystal monochromator (using diffraction to pass only the correct wavelength) or by an energy-sensitive detector. The Kα line is actually a doublet (1.54051 and 1.54433 Å for Cu), which can create the added complication of split peaks unless one uses a monochromator of sufficient resolving power to pass only one component. Synchrotron radiation sources are finding increasing application for powder diffraction, due to their high intensity, intrinsically good collimation (0.01° in the vertical direction) of x-ray beams, and tunability over a continuous spectrum, and due to the proliferation of user facilities throughout the world.
There are a large number of detectors suitable for powder x-ray diffraction. Perhaps the simplest is photographic film, which allows the collection of an entire diffractogram at one time and, with proper procedures, can be used to obtain quantitative intensities with a dynamic range up to 100:1 (Klug and Alexander, 1974). An updated form of photographic film is the x-ray imaging plate, developed for medical radiography, which is read out electronically (Miyahara et al., 1986; Ito and Amemiya, 1991). The simplest electronic detector, the Geiger counter, is no longer widely used because of its rather long dead time, which limits the maximum count rate. The gas-filled proportional counter offers higher count rates and some degree of x-ray energy resolution. The most widely used x-ray detector today is the scintillation counter, in which x rays are converted into visible light, typically in a thallium-doped NaI crystal, and then into electronic pulses by a photomultiplier tube. Various semiconductor detectors [Si:Li, positive-intrinsic-negative (PIN)] offer energy resolutions of 100 to 300 eV, sufficient to distinguish fluorescence from different elements and from the diffracted x rays, although their count rate capability is generally lower than that of scintillation counters. There are various forms of electronic position-sensitive detectors. Gas-filled proportional detectors can have a spatial resolution of a small fraction of a millimeter and are available as straight-line detectors, limited to several degrees of 2θ by parallax, or as curved detectors covering an angular range as large as 120°. They can operate at a count rate up to 10⁵ Hz over the entire detector, but one must bear in mind that the count rate in any one individual peak will be significantly less.
Also, not all position-sensitive detectors are able to discriminate against x-ray fluorescence from the sample, although there is one elegant design using Kr gas and x rays just exceeding the Kr K edge that addresses this problem (Smith, 1991). Charge-coupled devices (CCDs) are two-dimensional detectors that integrate the total energy deposited into each pixel and therefore may have a larger dynamic range and/or a faster time response (Clarke and Rowe, 1991). Some of the most important configurations for x-ray powder diffraction instruments are illustrated in Figure 1. The simple Debye-Scherrer camera in (A) records a wide range of angles on curved photographic film but suffers from limited resolution. Modern incarnations include instruments using curved position-sensitive detectors and imaging plates and are in use at several synchrotron sources. This geometry generally requires a thin rod-shaped sample, either poured as a powder into a capillary or mixed with an appropriate binder and rolled into the desired shape. The Bragg-Brentano diffractometer illustrated in (B) utilizes parafocusing from a flat sample to increase the resolution available from a diverging x-ray beam; in this exaggerated sketch, the distribution of Bragg angles is 3°, despite the fact that the sample subtends an angle of 15° from the source or detector. The addition of a diffracted-beam monochromator illustrated in (C) produces a marked improvement in performance by eliminating x-ray fluorescence from the sample. For high-pressure cells, with limited access for the x-ray beam, the energy-dispersive diffraction approach illustrated in (D) can be an attractive
Figure 1. Schematic illustration of experimental setups for powder diffraction measurements.
solution. A polychromatic beam is scattered through a fixed angle, and the energy spectrum of the diffracted x rays is converted to d spacings for interpretation. A typical synchrotron powder beamline such as X7A or X3B1 at the National Synchrotron Light Source (Brookhaven National Laboratory) is illustrated in (E). One particular wavelength is selected by the dual-crystal monochromator, and the x rays are diffracted again from an analyzer crystal mounted on the 2θ arm. Monochromator and analyzer are typically semiconductor crystals with rocking-curve widths of a few thousandths of a degree. Besides its intrinsically high resolution, this configuration offers the strong advantage that it truly measures the angle through which a parallel beam of radiation is scattered and so is relatively insensitive to parallax or sample transparency errors. The advantage of a parallel incident beam is available on laboratory sources by the use of a curved multilayer (Göbel) mirror (F), currently commercialized by Bruker Analytical X-ray Systems. One can also measure the diffracted angle, free from parallax, by use of a parallel-blade collimator as shown in (G). This generally gives higher intensity than an analyzer crystal at the cost of coarser angular resolution (0.1° to 0.03°), but this is often not a disadvantage, because diffraction peaks from real samples are almost always broader than the analyzer rocking curve. However, the analyzer crystal also discriminates against fluorescence.

There are a number of sources of background in a powder diffraction experiment. X rays can be scattered from the sample by a number of mechanisms other than Bragg diffraction: Compton scattering, thermal diffuse scattering, scattering by defects in the crystal lattice, multiple scattering in the sample, and scattering by noncrystalline components of the sample, by the sample holder, or even by air in the x-ray beam path. The x rays scattered by all of these mechanisms will have an energy that is identical to, or generally indistinguishable from, that of the primary beam, and so it is not possible to eliminate these effects by using energy-sensitive detectors. Another important source of background is x-ray fluorescence, the emission of x rays by atoms in the target that have been ionized by the primary x-ray beam. X-ray fluorescence is always of a longer wavelength (lower energy) than the primary beam and so can be eliminated by the use of a diffracting crystal between sample and detector or an energy-sensitive detector (with the precaution that the fluorescence radiation does not saturate it) or controlled by appropriate choice of the incident wavelength. It is not usually possible to predict the shape of this background, and so, once appropriate measures are taken to reduce it as much as practical, it is empirically separated from the relatively sharp powder diffraction peaks of interest. In data analysis such as the Rietveld method, one may take it as a piecewise linear or spline function between specified points or parameterize and adjust it to produce the best fit. There is some controversy about how to treat the statistical errors in a refinement with an adjustable background.

DATA ANALYSIS AND INTERPRETATION
A very important technique for analysis of powder diffraction data is the whole-pattern fitting method proposed by Rietveld (1969). It is based on the following properties of x-ray (and neutron) powder diffraction data: a powder diffraction pattern usually comprises a large number of peaks, many of which overlap, often very seriously, making the separate direct measurement of their integrated intensities difficult or impossible. However, it is possible to describe the shape of all Bragg peaks in the pattern by a small (compared to the number of peaks) number of profile parameters. This allows the least-squares refinement of an atomic model combined with an appropriate peak shape function, i.e., a simulated powder pattern, directly against the measured powder pattern. This may be contrasted to the single-crystal case, where the atomic structure is refined against a list of extracted integrated intensities. The Rietveld method is an extremely powerful tool for the structural analysis of virtually all types of crystalline materials not available as single crystals. The parameters refined in the Rietveld method fall into two classes: those that describe the shape and position of the Bragg peaks in the pattern (profile parameters) and those that describe the underlying atomic model (atomic or structural parameters). The former include the lattice parameters and those describing the shape and width of the Bragg peaks. In x-ray powder diffraction, a widely used peak shape function is the pseudo-Voigt function (Thompson et al., 1987), a fast-computing approximation to a convolution of a Gaussian and a Lorentzian (Voigt function). It uses only five parameters (usually called U, V, W, X, and Y) to describe the shape of all peaks in the powder pattern. In particular, the peak widths are a smooth function of the scattering angle 2θ. Additional profile parameters are often used to describe the peak
asymmetry at low angles due to the intersection of the curved Debye-Scherrer cone of radiation with a straight receiving slit, as well as corrections for preferred orientation. The structural parameters include the positions, types, and occupancies of the atoms in the structural model and isotropic or anisotropic thermal parameters (Debye-Waller factors). The power of the Rietveld method lies in the simultaneous refinement of both profile and atomic parameters, thereby maximizing the amount of information obtained from the powder data. The analysis of x-ray powder diffraction data can be divided into a number of separate steps. While some of these steps rely on the correct completion of the previous one(s), they generally constitute independent tasks to be completed by the experimenter. Depending on the issues to be addressed by any particular experiment, one, several, or all of these tasks will be encountered, usually in the order in which they are described here.

Crystal Lattice and Space Group Determination from X-Ray Powder Data

Obtaining the lattice parameters (unit cell) of an unknown material is always the first step on the path to a structure solution. While predicting the scattering angles of a set of diffraction peaks given the unit cell is trivial, the inverse task is not, because the powder method reduces the three-dimensional atomic distribution in real space to a one-dimensional diffraction pattern in momentum space. In all but the simplest cases, a computer is required, and a variety of programs are available for the task. They generally fall into two categories: those that attempt to find a suitable unit cell by trial and error (also often called the "semiexhaustive" method) and those that use an analytical approach based on the geometry of the reciprocal lattice. Descriptions of these approaches can be found below [see Key References, particularly Bish and Post (1989, pp. 188-196) and Klug and Alexander (1974, Section 6)].
Some widely used auto-indexing programs are TREOR (Werner et al., 1985), ITO (Visser, 1969), and DICVOL (Boultif and Louër, 1991). All of these programs take either a series of peak positions or the corresponding d spacings as input, and the accuracy of these values is crucial for this step to succeed. The programs may tolerate reasonably small random errors in the peak positions; however, even a small systematic error will prevent them from finding the correct (or any) solution. Hence, the data set used to extract peak positions must be checked carefully for systematic errors such as the diffractometer zero point, for example, by looking at pairs of Bragg peaks whose values of sin θ are related by integer multiples. Once the unit cell is known, the space group can be determined from systematic absences of Bragg peaks, i.e., Bragg peaks allowed by the unit cell but not observed in the actual spectrum. For example, the observed spectrum of a face-centered-cubic (fcc) material contains only Bragg peaks (hkl) for which h, k, and l are either all odd or all even. Tables listing possible systematic absences for each crystal symmetry class and correlating them with space groups can be found in the International Tables for Crystallography (Vos and Buerger, 1996).
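The forward task, predicting peak positions and applying the fcc reflection condition just described, can be sketched in a few lines of Python. The copper-like lattice parameter below is an assumed illustrative value, not taken from the text.

```python
import numpy as np
from itertools import product

# Sketch: predicted 2-theta positions for an fcc lattice.  The fcc condition
# is that h, k, l are all even or all odd ("unmixed" indices).
a = 3.615            # Å, illustrative copper-like cell
wavelength = 1.5406  # Å, Cu K-alpha1

def allowed_fcc(h, k, l):
    return len({h % 2, k % 2, l % 2}) == 1   # all even or all odd

peaks = []
for h, k, l in product(range(4), repeat=3):
    if (h, k, l) == (0, 0, 0) or not h >= k >= l or not allowed_fcc(h, k, l):
        continue                              # canonical order, skip absences
    d = a / np.sqrt(h * h + k * k + l * l)    # cubic d spacing
    s = wavelength / (2 * d)                  # sin(theta) from Bragg's law
    if s < 1:                                 # reflection must be reachable
        peaks.append((round(np.degrees(2 * np.arcsin(s)), 2), (h, k, l)))

for two_theta, hkl in sorted(peaks)[:4]:
    print(two_theta, hkl)
```

Running this lists the familiar (111), (200), (220), ... sequence; mixed-parity reflections such as (100) or (110) are systematically absent.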
Extracting Integrated Intensities of Bragg Peaks

The extraction of accurate integrated intensities of Bragg peaks is a prerequisite for several other steps in x-ray powder diffraction data analysis. However, the fraction of peaks in the observed spectrum that can be individually fitted is usually very small. The problem of peak overlap is partially overcome in the Rietveld method by describing the shapes of all Bragg peaks observed in the x-ray powder pattern by a small number of profile parameters. It is then possible to perform a refinement of the powder pattern without any atomic model, using (and refining) only the lattice and profile parameters to describe the positions and shapes of all Bragg peaks and letting the intensity of each peak vary freely; two variations are referred to as the Pawley (1981) and Le Bail (1988) methods. While this does not solve the problem of exact peak overlap [e.g., of the cubic (511) and (333) peaks], it is very powerful in separately determining the intensities of clustered, partially overlapping peaks. The integrated intensities of individual peaks that can be determined in this manner usually constitute a significant fraction of the total number of allowed Bragg peaks, and they can then be used as input to a search for candidate atom positions. Many Rietveld programs (see The Final Rietveld Refinement, below) feature an intensity extraction mode that can be used for this task. The resulting fit can also serve as an upper bound on the achievable goodness of fit in the final Rietveld refinement. EXTRA (Altomare et al., 1995) is another intensity extraction routine; it interfaces directly with SIRPOW.92 (see Search for Candidate Atom Positions, below), a program that utilizes direct methods. The program EXPO consists of both the EXTRA and SIRPOW.92 modules.
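A toy Python sketch can illustrate, in greatly simplified form, why fixing peak positions and shapes makes the extraction of overlapped intensities tractable: once the profile is determined by the lattice and profile parameters, the integrated intensities enter the model linearly and follow from linear least squares. All numbers here are invented, and a real Pawley or Le Bail refinement involves far more (many peaks, refined lattice and profile parameters, background).

```python
import numpy as np

# Two heavily overlapped Gaussian peaks with a SHARED width parameter.
# Positions and width are treated as known (fixed by lattice + profile
# parameters); only the integrated intensities are unknown, and they
# enter linearly, so a linear least-squares solve separates them.
two_theta = np.linspace(30.0, 32.0, 400)
pos = [30.9, 31.1]   # peak centers, degrees (separated by half their width)
width = 0.4          # shared FWHM, degrees

def profile(center):
    sigma = width / 2.3548                   # FWHM -> Gaussian sigma
    g = np.exp(-0.5 * ((two_theta - center) / sigma) ** 2)
    return g / g.sum()                       # unit integrated intensity

true_I = np.array([100.0, 60.0])             # invented "true" intensities
pattern = true_I[0] * profile(pos[0]) + true_I[1] * profile(pos[1])

# Design matrix with one column per peak; solve for the intensities
A = np.column_stack([profile(p) for p in pos])
I_fit, *_ = np.linalg.lstsq(A, pattern, rcond=None)
print(I_fit)
```

Despite the heavy overlap, the two intensities are recovered cleanly; with exactly coincident peaks the columns of the design matrix would become identical, which is the unavoidable degeneracy noted above.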
Search for Candidate Atom Positions When a suitable starting model for a crystal structure is not known and cannot easily be guessed, it must be determined from the x-ray powder data before any Rietveld refinements can be performed. The Fourier transform of the distribution of electrons in a crystal is the x-ray scattering amplitude. However, only the scattering intensity can be measured, and hence all phase information is lost. Nevertheless, the Fourier transform of the intensities can be useful in finding some of the atoms in the unit cell. A plot of the Fourier transform of the measured scattering intensities is called a Patterson map, and its peaks correspond to translation vectors between pairs of atoms in the unit cell. Obviously, the strongest peak will be at the origin, and depending on the crystal structure, it may be anywhere from simple to impossible to deduce atom positions from a Patterson map. Cases where there is one (or few) relatively heavy atom(s) in the unit cell are most favorable for this approach. Similarly, if the positions of most of the atoms in the unit cell (or at least of the heavy atoms) are already known, it may be reasonable to guess that the phases are dominated by the known part of the atomic structure. Then the differences between the measured intensities and those calculated from the known atoms together with the phase information calculated from the known atoms can be used to obtain a difference Fourier map that, if successful, will indicate the positions of the
remaining atoms. The ability to calculate and plot Patterson and Fourier maps is included in common Rietveld packages (see below). Another approach to finding candidate atom positions is the use of direct methods, originally developed (and widely used) for the solution of crystal structures from single-crystal x-ray data. A mathematical description is beyond the scope of this unit; the reader is referred to the literature (e.g., Giacovazzo, 1992, pp. 335-365). In direct methods, an attempt is made to derive the phases of the structure factors directly from the observed amplitudes through mathematical relationships. This is feasible because the electron density function is positive everywhere and consists of discrete atoms. These two properties, "positivity" and "atomicity," are used to establish likely relationships between the phases of certain groups of Bragg peaks. If the intensities of enough such groups of Bragg peaks have been measured accurately, this method yields the positions of some or all of the atoms in the unit cell. The program SIRPOW.92 (Altomare et al., 1994) is an adaptation of direct methods to x-ray powder diffraction data and has been used to solve a number of organic and organometallic crystal structures from such data. We emphasize that these procedures are not straightforward, and even for experienced researchers, success is far from guaranteed. However, the chance of success increases appreciably with the quality of the data, in particular with the resolution of the x-ray powder pattern.

The Final Rietveld Refinement

Once a suitable starting model is found, the Rietveld method allows the simultaneous refinement of structural parameters such as atomic positions, site occupancies, and isotropic or anisotropic Debye-Waller factors, along with lattice and profile parameters, against the observed x-ray powder diffraction pattern.
Since the refinement is usually performed using a least-squares algorithm [chi-square (χ²) minimization], the result will be a local minimum close to the set of starting parameters. It is the responsibility of the experimenter to confirm that this is the global minimum, i.e., that the model is in fact correct. There are criteria based on the goodness of fit of the final refinement that, if they are not fulfilled, almost certainly indicate a wrong solution. However, such criteria are not sufficient to establish the correctness of a model. For example, if an inspection of the final fit shows that a relatively small amount of the total residual error is noticeably concentrated in a few peaks rather than being distributed evenly over the entire spectrum, that may be an indication that the model is incorrect. More details on this subject are given in the International Tables for Crystallography (Prince and Spiegelmann, 1992) and in Young (1993) and Bish and Post (1989). A number of frequently updated Rietveld refinement program packages are available that can be run on various platforms, most of which are distant descendants of the original program by Rietveld (1969). Packages commonly in use today and freely distributed for noncommercial use by their authors include, among several, GSAS
(Larson and Von Dreele, 1994) and FULLPROF (Rodriguez-Carvajal, 1990). Notably, both include the latest descriptions of the x-ray powder diffraction peak shape functions (Thompson et al., 1987), analytical descriptions of the peak shape asymmetry at low angles (Finger et al., 1994), and an elementary description of anisotropic strain broadening for different crystal symmetries (Stephens, 1999). Both programs contain comprehensive documentation, which is nevertheless no substitute for the key references discussing the Rietveld method (Young, 1993; Bish and Post, 1989).

Estimated Standard Deviations in Rietveld Refinements

The assignment of estimated standard deviations (ESDs) to the atomic parameters obtained from Rietveld refinements is a delicate matter. In general, the ESDs calculated by a Rietveld program are measures of the precision (statistical variation between equivalent experiments) rather than the accuracy (discrepancy from the correct value) of any given parameter. The latter cannot, in principle, be determined experimentally, since the "truly" correct model describing the experimental data remains unknown. However, it is the accuracy that the experimenter is interested in, and it is desirable to make the best possible estimate of it, based both on statistical considerations and on the experience and judgment of the experimenter. The Rietveld program considers each data point of the measured powder pattern to be an independent measurement of the Bragg peak(s) contributing to it. This implicit assumption holds only if the difference between the refined profile and the experimental data arises from counting statistics alone (i.e., if χ² = 1 and the points of the difference curve form a random, uncorrelated distribution with center 0 and variance N). However, this is not true for real experiments, in which the dominant contributions to the difference curve are an imperfect description of the peak shapes and/or an imperfect atomic model.
Consequently, accepting the precision of any refined parameter as a measure of its accuracy cannot be justified and would usually be unreasonably optimistic. A number of corrections to the Rietveld ESDs have been proposed, and even though all are empirical in nature and there is no statistical justification for these procedures, they can be very valuable for obtaining estimates. If one or more extra parameters are introduced into a model, the fit to the data will invariably improve. The F statistic can be used to determine the likelihood that this improvement represents additional information (i.e., is statistically significant) and does not arise purely by chance (Prince and Spiegelmann, 1992). It can also be used to determine how far any given parameter must be shifted away from its ideal value in order to make a statistically significant difference in the goodness of fit. This provides an estimate of the accuracy of that parameter that is independent of its precision as calculated by the Rietveld program. There exists no rigid rule on whether precision or accuracy should be quoted when presenting results obtained from x-ray powder diffraction; it certainly depends on the motivation and outcome of the experiment. However, it is very important to distinguish between the two. We note that
842
X-RAY TECHNIQUES
the International Union of Crystallography has sponsored a Rietveld refinement round robin, during which identical samples of ZrO2 were measured and Rietveld refined by 51 participants (Hill and Cranswick, 1994), giving an empirical indication of the accuracy achievable with the technique.

Quantitative Phase Analysis

Quantitative phase analysis refers to the important technique of determining the amounts of various crystalline phases in a mixture. It can be performed in two ways, based either on the intensities of selected peaks or on a multiphase Rietveld refinement. In the first case, suppose that the sample is a flat plate that is optically thick to x rays. In a mixture of two phases with mass absorption coefficients μ1 and μ2 (cm²/g), the intensity of a given peak from phase 1 will be reduced from its value for a pure sample of phase 1 by the ratio

\[
\frac{I_1(x_1)}{I_1(\text{pure})} = \frac{x_1 \mu_1}{x_1 \mu_1 + (1 - x_1)\mu_2}
\tag{4}
\]
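Equation (4) is easily inverted to recover the weight fraction from a measured intensity ratio. The helper below is a minimal sketch (hypothetical function name, not part of any standard package) that just carries out the algebra:

```python
def weight_fraction_from_ratio(R, mu1, mu2):
    """Invert Eq. (4): given the measured ratio R = I1(x1)/I1(pure) and the
    mass absorption coefficients mu1, mu2 (cm^2/g) of the two phases,
    return the weight fraction x1 of phase 1."""
    # R*(x1*mu1 + (1 - x1)*mu2) = x1*mu1  =>  x1 = R*mu2 / (mu1*(1 - R) + R*mu2)
    return R * mu2 / (mu1 * (1.0 - R) + R * mu2)
```

Note that when μ1 = μ2 the relation reduces to x1 = R, i.e., pure dilution; the nonlinearity comes entirely from the change of the mixture's absorption with composition.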
where x1 is the weight fraction of phase 1. Note that the intensity is affected both by dilution and by the change of the sample's absorption constant. When the absorption coefficients are not known, they can be determined from experimental measurements, such as by "spiking" the mixture with an additional amount of one of its constituents. Details are given in Klug and Alexander (1974) and Bish and Post (1989). If the atomic structures of the constituents are known, a multiphase Rietveld refinement (see above) directly yields scale factors sj for each phase. The weight fraction wj of the jth phase can then be calculated as

\[
w_j = \frac{s_j Z_j m_j V_j}{\sum_i s_i Z_i m_i V_i}
\tag{5}
\]
where Zj is the number of formula units per unit cell, mj is the mass of a formula unit, and Vj is the unit cell volume. In either case, the expressions above are given under the assumption that the powders are sufficiently fine, i.e., that the product of the linear absorption coefficient μ (cm⁻¹, equal to the mass absorption coefficient μ/ρ times the density ρ) and the linear particle (not sample) size D is small (μD < 0.01). If not, one must make allowance for microabsorption, as described by Brindley (1945). It is worth noting that the sensitivity of an x-ray powder diffraction experiment to weak peaks originating from trace phases is governed by the signal-to-background ratio. Hence, in order to obtain maximum sensitivity, it is desirable to reduce the relative background as much as possible.
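Once a multiphase Rietveld refinement has converged, Eq. (5) is simple to evaluate. A minimal sketch (hypothetical function and argument names) might look like:

```python
def weight_fractions(phases):
    """Apply Eq. (5).  phases is a list of tuples
    (s, Z, m, V) = (Rietveld scale factor, formula units per cell,
    formula-unit mass, unit cell volume), one tuple per phase.
    Returns the weight fractions in the same order."""
    terms = [s * Z * m * V for (s, Z, m, V) in phases]
    total = sum(terms)
    return [t / total for t in terms]
```

Consistent units must be used across phases (e.g., atomic mass units for m and Å³ for V), since only the ratios of the products enter the result.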
SAMPLE PREPARATION

The preparation of samples to avoid unwanted artifacts is an important consideration in powder diffraction experiments. One issue is that preferred orientation (texture) should be avoided or controlled. The grains of a sample may tend to align, especially if they have a needle- or platelike morphology, so that reflections in certain directions are enhanced relative to others. Various measures, such as mixing the powder with a binder or an inert material chosen to randomize the grains, or pouring the sample sideways into the flat-plate sample holder, are in common use. [More details are given in, e.g., Klug and Alexander (1974), Bish and Post (1989), and Jenkins and Snyder (1996).] It is also possible to correct experimental data if one can model the distribution of crystallite orientations; this option is available in most common Rietveld programs (see The Final Rietveld Refinement, above). A related issue is that there must be a sufficient number of individual grains participating in the diffraction measurement to ensure a valid statistical sample. It may be necessary to grind and sieve the sample, especially in the case of strongly absorbing materials. However, grinding can introduce strain broadening into the pattern, and some experimentation is usually necessary to find the optimum means of preparing a sample. A useful test of whether a specimen in a diffractometer is sufficiently powdered is to scan the sample angle θ over several degrees, in steps of perhaps 0.01°, while leaving 2θ fixed at the value of a strong Bragg peak; fluctuations of more than a few percent indicate trouble. It is good practice to rock (flat plate) or twirl (capillary) the sample during data collection to increase the number of observed grains.

PROBLEMS

X rays are absorbed by any material through which they pass, and so the effective volume is generally not equal to the actual volume of the sample. Absorption constants are typically tabulated as the mass absorption coefficient μ/ρ (cm²/g), which makes it easy to work out the attenuation length in a sample as a function of its composition. A plot of the absorption lengths (in micrometers) of some "typical" samples vs. x-ray energy/wavelength is given in Figure 2. To determine the relative intensities of different
Figure 2. Calculated absorption lengths for several materials as a function of x-ray wavelength or energy.
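The attenuation lengths plotted in Figure 2 follow directly from tabulated mass absorption coefficients of the constituent elements, weighted by composition. A minimal sketch (hypothetical function name; the numbers in the test are illustrative, not tabulated values) is:

```python
def attenuation_length_um(mass_fractions, mass_abs_coeffs, density_g_cm3):
    """1/e attenuation length in micrometers for a compound sample.
    mass_fractions: {element: weight fraction} (should sum to 1)
    mass_abs_coeffs: {element: mu/rho in cm^2/g at the chosen wavelength}
    density_g_cm3: bulk density of the sample in g/cm^3."""
    # mu/rho of the mixture is the weight-fraction-averaged sum
    mu_rho_mix = sum(mass_fractions[el] * mass_abs_coeffs[el]
                     for el in mass_fractions)
    mu_linear = mu_rho_mix * density_g_cm3   # linear coefficient, cm^-1
    return 1.0e4 / mu_linear                 # convert cm to micrometers
```

Real μ/ρ values are wavelength dependent and are tabulated, e.g., in the International Tables for Crystallography.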
X-RAY POWDER DIFFRACTION
peaks, it is important either to arrange for the effective volume to be constant or to be able to make a quantitative correction. If the sample is a flat plate in symmetrical geometry, as illustrated in Figure 1B, the effective volume is independent of diffraction angle as long as the beam does not spill over the edge of the sample. If the sample is a cylinder and the absorption constant is known, the angle-dependent attenuation constant is tabulated, e.g., in the International Tables for Crystallography (Maslen, 1992). It is also important to ensure that the slits and other optical components of the diffractometer are properly aligned, so that the variation of illuminated sample volume with 2θ is well controlled and understood by the experimenter. Inasmuch as a powder diffraction experiment consists of "simply" measuring the diffracted intensity as a function of angle, one can classify sources of systematic error as arising from measurement of the intensity, from measurement of the angle, or from the underlying assumptions. Errors of intensity can arise from detector saturation, drift or fluctuations of the strength of the x-ray source, and the statistical fluctuations intrinsic to counting the random arrival of photons. Errors of angle can arise from mechanical faults or instability of the instrument or from displacement of the sample from the axis of the diffractometer (parallax). A subtle form of parallax can occur for flat-plate samples that are moderately transparent to x rays, because the origin of the diffracted radiation is located below the surface of the sample by a distance that depends on the diffraction angle. Another effect that can give rise to an apparent shift of diffraction peaks is the asymmetry caused by the intersection of the curved Debye-Scherrer cone of radiation with a straight receiving slit.
This geometric effect has been discussed by several authors (Finger et al., 1994, and references therein), and it can produce a significant shift in the apparent position of low-angle peaks if they are fitted with a model lineshape that does not properly account for it. A potential source of angle errors in very high precision work is the refraction of x rays; to an x ray, most of the electrons of a solid appear free, and if their number density is n, the index of refraction is given by

\[
n_{\text{refr}} = 1 - \frac{e^2 n \lambda^2}{2\pi m c^2}
\tag{6}
\]

so that the observed Bragg angle is slightly shifted from its value inside the sample.

The interpretation of powder intensities is based on a number of assumptions that may or may not correspond to experimental reality in any given case. The integrated intensity is proportional to the square of the structure factor only if the diffracted radiation is sufficiently weak that it does not remove a significant fraction of the incident beam. If the individual grains of the powder sample are too large, strong reflections will be effectively weakened by this "extinction" effect. The basic phenomenon is described in any standard crystallography text, and a treatment specific to powder diffraction is given by Sabine (1993). This can be an issue for grains larger than 1 μm, which is well below the size that can be easily guaranteed by passing the sample through a sieve. Consequently, it is not easy to assure a priori that extinction is not a problem. One might be suspicious of an extinction effect if a refinement shows that the strongest peaks are weaker than predicted by a model that is otherwise satisfactory.

To satisfy the basic premise of powder diffraction, there must be enough grains in the effective volume to produce a statistical sample. This may be a particularly serious issue in highly absorbing materials, for which only a small number of grains at the surface of the sample participate. One can test for this possibility by measuring the intensity of a strong reflection at constant 2θ as a function of sample rotation; if the intensity fluctuates by more than a few percent, it is likely that there is an insufficient number of grains in the sample. One can partly overcome this problem by moving the sample during the measurement to increase the number of grains sampled. A capillary sample can be spun around its axis, while a flat plate can be rocked by a few degrees about the diffracting position or rotated about its normal (or both) to achieve this aim.

If the grains of a powder sample are not randomly oriented, the powder diffraction pattern will be distorted: peaks in certain directions become stronger or weaker than they would be in the ideal case. This arises most frequently if the crystallites have a needle- or platelike morphology. [Sample preparation procedures to control this effect are described in, e.g., Klug and Alexander (1974), Bish and Post (1989), and Jenkins and Snyder (1996).] It is also possible to correct experimental data if one can model the distribution of crystallite orientations.

There are a number of standard samples that are useful for verifying and monitoring instrument performance and as internal standards. The National Institute of Standards and Technology sells several materials, such as SRM 640 (silicon powder), SRM 660 (LaB6, which has exceedingly sharp lines), SRM 674a (a set of five metal oxides of various absorption lengths, useful as quantitative analysis standards), and SRM 1976 (a corundum Al2O3 plate, with somewhat sharper peaks than the same compound in SRM 674a and certified relative intensities). Silver behenate powder has been proposed as a useful standard with a very large lattice spacing of c = 58.4 Å (Blanton et al., 1995).
EXAMPLES

Comparison Against a Database of Known Materials

An ongoing investigation of novel carbon materials yielded a series of samples with new and potentially interesting properties: (1) the strongest signal in a mass spectrometer was shown to correspond to the equivalent of an integer number of carbon atoms, (2) electron diffraction indicated that the samples were at least partially crystalline, and (3) while the material was predicted to consist entirely of carbon, the procedures for synthesis and purification had involved an organic Li compound. The top half of Figure 3 shows an x-ray powder diffraction pattern of one such sample, recorded at λ = 0.7 Å at beamline X3B1 of the National Synchrotron Light Source. The peak positions can be indexed (Werner et al., 1985) to a monoclinic unit cell: a = 8.36 Å, b = 4.97 Å, c = 6.19 Å, and β = 114.7°.
The Powder Diffraction File (PDF) database provided by the International Center for Diffraction Data (ICDD, 1998) allows searching based on several criteria, including types of atoms, positions of Bragg peaks, or unit cell parameters. Note that it is not necessary to know the unit cell of a compound to search the database against its diffraction pattern, nor is it necessary to know the crystal structures (or even the unit cells) of all the compounds included in the database. Searches for materials with an appropriate position of their strong diffraction peaks containing C only, or containing some or all of C, N, O, or H, did not result in any candidate materials. However, including Li in the list of possible atoms matched the measured pattern with the one in the database for synthetic Li2CO3 (zabuyelite). The peak intensities listed in the database, converted to 2θ for the appropriate wavelength, are shown in the bottom half of Figure 3, leading to the conclusion that the only crystalline fraction of the candidate sample is Li2CO3, not a new carbon material. This type of experiment is fairly simple, and both data collection and analysis can be performed quickly. Average sample quality is sufficient, and usually so are relatively low-resolution data from an x-ray tube. In fact, this test is routinely performed before any attempt to solve the structure of an unknown and presumably new material.

Figure 3. X-ray powder diffraction pattern of the "unknown" sample (top) and relative peak intensities of Li2CO3 as listed in the PDF database (bottom), both at λ = 0.7 Å.

Quantitative Phase Analysis

The qualitative and quantitative detection of small traces of polymorphs plays a major role in pharmacology, since many relevant molecules form two or more different crystal structures, and phase purity rather than chemical purity is required for scientific, regulatory, and patent-related legal reasons. Mixtures of the α and γ phases of the antiinflammatory drug indomethacin (C19H16ClNO4) provide a good example of the detection limit of x-ray powder diffraction experiments in this case (Dinnebier et al., 1996). Mixtures of 10, 1, 0.1, 0.01, and 0 wt% α phase (with the balance γ phase) were investigated. Using its five strongest Bragg peaks, 0.1% α phase can easily be detected; moreover, it became evident that the "pure" γ phase from which the mixtures were prepared contains traces of α, making the quantification of detection limits below 0.1% difficult. Figure 4 shows the strong (120) peak of the α phase and the weak (010) peak (0.15% intensity of its strongest peak) of the γ phase for three different concentrations of the α phase. While these experiments and the accompanying data analysis are not difficult to perform per se, it is probably necessary to have access to a synchrotron radiation source, with its high resolution, to obtain detection limits comparable to or even close to the values mentioned. The data of this example were collected on beamline X3B1 of the National Synchrotron Light Source.
Figure 4. Strong (120) peak of the α phase and weak (010) peak of the γ phase of the drug indomethacin for different concentrations of the α phase.
Figure 5. Atomic clusters in R(Al-Li-Cu) exhibiting nearly icosahedral symmetry.
Structure Determination in a Known Analog

The icosahedral quasicrystal phase i(Al-Li-Cu) is close to a crystalline phase, R(Al-Li-Cu), which has a very similar local structure, making the crystal structure of the latter an interesting possible basis for the structure of the former. The crystal symmetry [body-centered cubic (bcc)], space group (Im3), lattice parameter (a = 13.89 Å), and composition (Al5.6Li2.9Cu) suggest that the crystal structure of R(Al-Li-Cu) is similar to that of Mg32(Zn,Al)49 (Bergman et al., 1952, 1957). However, using the atomic parameters of the latter compound does not give a satisfactory fit to the data of the former. On the other hand, a Rietveld analysis (see Data Analysis and Interpretation, above) using those parameters as starting values converges and gives a stable refinement of occupancies and atom positions (Guryan et al., 1988). Such an experiment can be done with a standard laboratory x-ray source; this example was solved using Mo Kα radiation from a rotating-anode x-ray generator. Figure 5 shows the nearly icosahedral symmetry of the atomic clusters of which this material is composed.

Ab Initio Structure Determination

The Kolbe-Schmitt reaction, which proceeds from sodium phenolate, C6H5ONa, is the starting point for the synthesis of many pigments, fertilizers, and pharmaceuticals such as aspirin. However, replacing Na with heavier alkalis thwarts the reaction, a fact known since 1874 yet not understood to date. This provides motivation for the solution of the crystal structure of potassium phenolate, C6H5OK (Dinnebier et al., 1997).
A high-resolution x-ray powder diffraction pattern of a sample of C6H5OK was measured at room temperature, with λ = 1.15 Å and 2θ = 5° to 65° in steps of 0.005°, for a total counting time of 20 hr on beamline X3B1 of the National Synchrotron Light Source. Low-angle diffraction peaks had a FWHM of 0.013°. The following steps were necessary to obtain the structure solution of this compound. The powder pattern was indexed (Visser, 1969) to an orthorhombic unit cell, a = 14.10 Å, b = 17.91 Å, and c = 7.16 Å. Observed systematic absences of Bragg peaks, e.g., of all (h0l) peaks with h odd, were used to determine the space group, Pna21 (Vos and Buerger, 1996), and the number of formula units per unit cell (Z = 12) was determined from geometrical considerations. Next, using the Le Bail technique (Le Bail et al., 1988), i.e., fitting the powder pattern to lattice and profile parameters without any structural model, 300 Bragg peak intensities were extracted. Using these as input for the direct-methods program SIRPOW.92 (Altomare et al., 1994), it was possible to deduce the positions of potassium and some candidate oxygen atoms, but none of the carbon atoms. In the next stage, the positions of the centers of the phenyl rings were sought by replacing each ring with a diffuse pseudoatom having the same number of electrons but a high temperature factor. After Rietveld refining this model (see Data Analysis and Interpretation, above), one pseudoatom at a time was back-substituted by the corresponding phenyl ring, and the orientation of each phenyl ring was found via a grid search over all of its possible orientations. Upon completion of this procedure, the coordinates of all nonhydrogen atoms had been obtained. The final Rietveld refinement (see Data Analysis and Interpretation) was performed using the program GSAS (Larson and Von Dreele, 1994); the hydrogen atoms were included, and the model remained stable when all structural parameters were refined simultaneously.
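The systematic-absence screening used in the indexing step above can be automated over a peak list. The toy filter below (hypothetical names; it implements only the single reflection condition quoted in the text, namely that (h0l) reflections with h odd are absent) illustrates the idea:

```python
def h0l_allowed(h, k, l):
    """Reflection condition from the text: among (h0l) reflections,
    only those with h even are allowed.  Reflections outside the
    (h0l) class are not restricted by this particular condition."""
    if k == 0:
        return h % 2 == 0
    return True

# Screening an indexed peak list against this condition helps confirm
# or reject a candidate space group such as Pna21.
candidates = [(2, 0, 1), (3, 0, 1), (3, 1, 1)]
observedly_allowed = [hkl for hkl in candidates if h0l_allowed(*hkl)]
```

A full space-group determination checks all glide- and screw-axis conditions of each candidate group in the same way; this sketch shows only one of them.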
Figure 6 shows a plot of the final refinement of C6H5OK, and Figure 7 shows the crystal structure obtained. This example illustrates the power of high-resolution x-ray powder diffraction for the solution of rather complicated crystal structures. There are 24 nonhydrogen atoms in the asymmetric unit, a structure that until recently would have been impossible to solve without the availability of a single crystal. At the same time, it is worth emphasizing that such an ab initio structure solution from x-ray powder diffraction requires both a sample of extremely high quality and an instrument with very high resolution. By comparison, it was not possible to obtain the crystal structure of C6H5OLi from a data set with approximately 3 times fewer independently observed peaks, due to the poorer crystallinity of the sample. Even with a sample and data of sufficient quality, solving a structure ab initio from a powder sample (unlike the single-crystal case) requires experience and a significant investment of time and effort on the part of the researcher.

Time-Resolved X-Ray Powder Diffraction

Zeolites are widely used in a number of applications, for example, as catalysts and for gas separations. Their
Figure 6. Rietveld refinement of potassium phenolate, C6H5OK. Diamonds denote x-ray powder diffraction data; solid line denotes atomic model. Difference curve is given below.
Figure 7. Crystal structure of potassium phenolate, C6H5OK, solved ab initio from x-ray powder diffraction data.
structural properties, particularly the location of the extra-framework cations, are important in understanding their role in these processes. Samples of CsY zeolite dehydrated at 300°C and at 500°C show differences in their diffraction patterns (Fig. 8). Rietveld analysis (see Data Analysis and Interpretation, above) shows that the changes occur in the extra-framework cations. The basic framework contains large so-called supercages and smaller sodalite cages; on dehydration at 300°C, almost all of the Cs cations occupy the supercages while the Na cations occupy the sodalite cages. However, after dehydration at 500°C, a fraction of both Cs and Na ions populate each other's cage position, resulting in mixed occupancy for both sites (Poshni et al., 1997; Norby et al., 1998). To observe this transition while it occurs, the subtle changes in the CsY zeolite structure were followed during in situ dehydration under vacuum. The sample, loosely packed in a 0.7-mm quartz capillary, was ramped to a final temperature of 500°C over a period of 8 hr. Figure 9 shows the evolution of the diffraction pattern as a function of time during the dehydration process (Poshni et al., 1997). The data were collected at beamline X7B of the National Synchrotron Light Source with a translating image plate data collection stage. To investigate phase transitions and solid-state reactions in near real time by recording x-ray powder diffraction patterns as a function of time, pressure, and/or temperature, a number of conditions must be met. For data collection, a position-sensitive detector or a translating image plate stage is required. Also, since there can be no collimator slits or analyzer crystal between the sample and the detector/image plate, such an experiment is optimally carried out at a synchrotron radiation source. These have an extremely low vertical divergence, allowing for a reasonably good angular resolution in this detector
Figure 8. X-ray powder diffraction patterns of CsY zeolite dehydrated at 500°C (top) and 300°C (bottom).
geometry. The sample environment must be designed with care in order to provide, e.g., the desired control of pressure, temperature, and chemical environment, compatible with appropriate x-ray access and low background. Examples include a cryostat with a Be can, a sample heater, a diamond-anvil pressure apparatus, and a setup that allows a chemical reaction to take place inside the capillary.

Determination of Subtle Structural Details

The superconductor Rb3C60 has Tc ≈ 30 K. The fullerenes form an fcc lattice, and of the three Rb+ cations, one occupies the large octahedral (O) site at (1/2, 0, 0) and the remaining two occupy the small tetrahedral (T) sites at (1/4, 1/4, 1/4). The size mismatch between the smaller Rb+ ions and the larger octahedral site led to the suggestion that the Rb+ cations in the large octahedral site could be displaced from the site center, supported by some nuclear
magnetic resonance (NMR) and extended x-ray absorption fine structure (EXAFS) experiments (see XAFS SPECTROSCOPY). If true, such a displacement would have a significant impact on the electronic and superconducting properties. By comparison, the Rb–C distances of the Rb+ cation in the tetrahedral site are consistent with ionic radii. In principle, a single x-ray pattern cannot distinguish between static displacement of an atom from its site center (as proposed for this system) and dynamic thermal fluctuation about a site (as described by a Debye-Waller factor). Hence, to address this issue, an x-ray powder diffraction study of Rb3C60 at various temperatures was carried out at beamline X3B1 of the National Synchrotron Light Source (Bendele et al., 1998). At each temperature the data were Rietveld refined (see Data Analysis and Interpretation, above) against two competing (and otherwise identical) structural models: (1) the octahedral Rb+ cation is fixed at (1/2, 0, 0) and its isotropic
Figure 9. Evolution of the diffraction pattern during the in situ dehydration of CsY zeolite.
Figure 10. Electron distributions obtained from x-ray powder diffraction data in the octahedral site of Rb3C60 for T ¼ 300 K (A) and T ¼ 20 K (B).
Debye-Waller factor Bo is refined, and (2) the octahedral Rb+ cation is shifted an increasing distance ε away from (1/2, 0, 0) until the refined Bo becomes comparable to the temperature factor of the tetrahedral ion, Bt. In each case, both models result in an identical goodness of fit, and the direction of the assumed displacement ε has no effect whatsoever. Figure 10 shows the electron distributions in the octahedral site for both models (in each case convoluted with the appropriate Debye-Waller factor). While the width of this distribution (static or dynamic) is 1.32 Å at room temperature, it decreases to 0.56 Å upon cooling the sample to T = 20 K. If the octahedral Rb+ cations were truly displaced away from their site center (either at all temperatures or below some transition temperature Ttr), the amount of that displacement would act as an order parameter, i.e., it would increase or saturate at lower temperature, not decrease monotonically to zero as the data show it does. Hence, any off-center displacement of the octahedral Rb+ cations can be excluded by this x-ray powder diffraction experiment. Consequently, other possible causes for the features seen in the NMR and EXAFS experiments that had given rise to this suggestion must be sought. For such an experiment to be valid and successful, it must be possible to put very tight limits on the accuracy of the measured structural parameters. This requires that two conditions be met. First, the sample must have the highest possible quality, in terms of both purity and crystallinity, and the data must have both high resolution and very good counting statistics. Second, the accuracy of each measured parameter must be judged carefully, which is not trivial in x-ray powder diffraction. Note, in particular, the discussion under Estimated Standard Deviations in Rietveld Refinements, above.
LITERATURE CITED Altomare, A., Burla, M. C., Cascarano, G., Giacovazzo, C., Guagliardi, A., Moliterni, A. G. G., and Polidori, G. 1995. Extra: A program for extracting structure-factor amplitudes from powder diffraction data. J. Appl. Crystallogr. 28:842–846. See Internet Resources.
Altomare, A., Cascarano, G., Giacovazzo, C., Guagliardi, A., Burla, M. C., Polidori, G., and Camalli, M. 1994. SIRPOW.92—a program for automatic solution of crystal structures by direct methods optimized for powder data. J. Appl. Crystallogr. 27:435–436. See Internet Resources. Ashcroft, N. W. and Mermin, N. D. 1976. Solid State Physics. Holt, Rinehart and Winston, New York. Balzar, D. and Ledbetter, H. 1993. Voigt-function modeling in Fourier analysis of size- and strain-broadened X-ray diffraction peaks. J. Appl. Crystallogr. 26:97–103. Bendele, G. M., Stephens, P. W., and Fischer, J. E. 1998. Octahedral cations in Rb3C60: Reconciliation of conflicting evidence from different probes. Europhys. Lett. 41:553–558. Bergman, G., Waugh, J. L. T., and Pauling, L. 1952. Crystal structure of the intermetallic compound Mg32(Al,Zn)49 and related phases. Nature 169:1057. Bergman, G., Waugh, J. L. T., and Pauling, L. 1957. The crystal structure of the metallic phase Mg32(Al, Zn)49. Acta Crystallogr. 10:254–257. Bish, D. L. and Post, J. E. (eds.). 1989. Modern Powder Diffraction. Mineralogical Society of America, Washington, D.C. Blanton, T. N., Huang, T. C., Toraya, H., Hubbard, C. R., Robie, S. B., Louër, D., Göbel, H. E., Will, G., Gilles, R., and Raftery, T. 1995. JCPDS—International Centre for Diffraction Data round robin study of silver behenate. A possible low-angle X-ray diffraction calibration standard. Powder Diffraction 10:91–95. Boultif, A. and Louër, D. 1991. Indexing of powder diffraction patterns for low symmetry lattices by the successive dichotomy method. J. Appl. Crystallogr. 24:987–993. Brindley, G. W. 1945. The effect of grain or particle size on x-ray reflections from mixed powders and alloys, considered in relation to the quantitative determination of crystalline substances by x-ray methods. Philos. Mag. 36:347–369. Brown, P. J., Fox, A. G., Maslen, E. N., O'Keefe, M. A., and Willis, B. T. M. 1992. Intensity of diffracted intensities.
In International Tables for Crystallography, Vol. C (A. J. C. Wilson, ed.) pp. 476–516. Kluwer Academic Publishers, Dordrecht. Clarke, R. and Rowe, W. P. 1991. Real-time X-ray studies using CCDs. Synchrotron Radiation News 4(3):24–28. Dinnebier, R. E., Pink, M., Sieler, J., and Stephens, P. W. 1997. Novel alkali metal coordination in phenoxides: Powder diffraction results on C6H5OM [M = K, Rb, Cs]. Inorg. Chem. 36:3398– 3401.
Dinnebier, R. E., Stephens, P. W., Byrn, S., Andronis, V., and Zografi, G. 1996. Detection limit for polymorphs in drugs using powder diffraction (J. B. Hastings, ed.). p. B47. In BNL-NSLS 1995 Activity Report. National Synchrotron Light Source, Upton, N.Y.
Finger, L. W., Cox, D. E., and Jephcoat, A. P. 1994. A correction for powder diffraction peak asymmetry due to axial divergence. J. Appl. Crystallogr. 27:892–900.

Giacovazzo, C. (ed.). 1992. Fundamentals of Crystallography. Oxford University Press, Oxford.

Guryan, C. A., Stephens, P. W., Goldman, A. I., and Gayle, F. W. 1988. Structure of icosahedral clusters in cubic Al5.6Li2.9Cu. Phys. Rev. B 37:8495–8498.

Hill, R. J. and Cranswick, L. M. D. 1994. International Union of Crystallography Commission on Powder Diffraction Rietveld Refinement Round Robin. II. Analysis of monoclinic ZrO2. J. Appl. Crystallogr. 27:802–844.

International Center for Diffraction Data (ICDD). 1998. PDF-2 Powder Diffraction File Database. ICDD, Newtown Square, Pa. See Internet Resources.

Ito, M. and Amemiya, Y. 1991. X-ray energy dependence and uniformity of an imaging plate detector. Nucl. Instrum. Methods Phys. Res. A310:369–372.

Jenkins, R. and Snyder, R. L. 1996. Introduction to X-ray Powder Diffractometry. John Wiley & Sons, New York.

Kittel, C. 1996. Introduction to Solid State Physics. John Wiley & Sons, New York.

Klug, H. P. and Alexander, L. E. 1974. X-ray Diffraction Procedures for Polycrystalline and Amorphous Materials. John Wiley & Sons, New York.

Larson, A. C. and Von Dreele, R. B. 1994. GSAS: General Structure Analysis System. Los Alamos National Laboratory, publication LAUR 86-748. See Internet Resources.

Le Bail, A., Duroy, H., and Fourquet, J. L. 1988. Ab-initio structure determination of LiSbWO6 by X-ray powder diffraction. Mater. Res. Bull. 23:447–452.

Maslen, E. N. 1992. X-ray absorption. In International Tables for Crystallography, Vol. C (A. J. C. Wilson, ed.). pp. 520–529. Kluwer Academic Publishers, Dordrecht.

Miyahara, J., Takahashi, K., Amemiya, Y., Kamiya, N., and Satow, Y. 1986. A new type of X-ray area detector utilizing laser stimulated luminescence. Nucl. Instrum. Methods Phys. Res. A246:572–578.

Norby, P., Poshni, F. I., Gualtieri, A. F., Hanson, J. C., and Grey, C. P. 1998. Cation migration in zeolites: An in-situ powder diffraction and MAS NMR study of the structure of zeolite Cs(Na)-Y during dehydration. J. Phys. Chem. B 102:839–856.

Noyan, I. C. and Cohen, J. B. 1987. Residual Stress. Springer-Verlag, New York.

Pawley, G. S. 1981. Unit-cell refinement from powder diffraction scans. J. Appl. Crystallogr. 14:357–361.

Poshni, F. I., Ciraolo, M. F., Grey, C. P., Gualtieri, A. F., Norby, P., and Hanson, J. C. 1997. An in-situ X-ray powder diffraction study of the dehydration of zeolite CsY (J. B. Hastings, ed.). p. B84. In BNL-NSLS 1996 Activity Report. National Synchrotron Light Source, Upton, N.Y.

Prince, E. and Spiegelmann, C. H. 1992. Statistical significance tests. In International Tables for Crystallography, Vol. C (A. J. C. Wilson, ed.). pp. 618–621. Kluwer Academic Publishers, Dordrecht.

Rietveld, H. M. 1969. A profile refinement method for nuclear and magnetic structures. J. Appl. Crystallogr. 2:65–71.

Rodriguez-Carvajal, J. 1990. FULLPROF: A program for Rietveld refinement and pattern matching analysis. In Abstracts of the Satellite Meeting on Powder Diffraction of the XV Congress of the IUCr, p. 127. IUCr, Toulouse, France. See Internet Resources.

Sabine, T. M. 1993. The flow of radiation in a polycrystalline material. In The Rietveld Method (R. A. Young, ed.). pp. 55–61. Oxford University Press, Oxford.

Smith, G. W. 1991. X-ray imaging with gas proportional detectors. Synchrotron Radiation News 4(3):24–30.

Stephens, P. W. 1999. Phenomenological model of anisotropic peak broadening in powder diffraction. J. Appl. Crystallogr. 32:281–289.

Thompson, P., Cox, D. E., and Hastings, J. B. 1987. Rietveld refinement of Debye-Scherrer synchrotron X-ray data from Al2O3. J. Appl. Crystallogr. 20:79–83.

Visser, J. W. 1969. A fully automatic program for finding the unit cell from powder data. J. Appl. Crystallogr. 2:89–95.

Vos, A. and Buerger, M. J. 1996. Space-group determination and diffraction symbols. In International Tables for Crystallography, Vol. A (T. Hahn, ed.). pp. 39–48. Kluwer Academic Publishers, Dordrecht.

Werner, P.-E., Eriksson, L., and Westdahl, M. 1985. TREOR, a semi-exhaustive trial-and-error powder indexing program for all symmetries. J. Appl. Crystallogr. 18:367–370.

Young, R. A. (ed.). 1993. The Rietveld Method. Oxford University Press, Oxford.

KEY REFERENCES

Azaroff, L. V. 1968. Elements of X-ray Crystallography. McGraw-Hill, New York.

Gives comprehensive discussions of x-ray crystallography, albeit not particularly specialized to powder diffraction.

Bish and Post, 1989. See above.

An extremely useful reference for details on the Rietveld method.

Giacovazzo, 1992. See above.

Another comprehensive text on x-ray crystallography, not particularly specialized to powder diffraction.

International Tables for Crystallography (3 vols.). Kluwer Academic Publishers, Dordrecht.

An invaluable resource for all aspects of crystallography. Volume A treats crystallographic symmetry in direct space and includes tables of all crystallographic plane, space, and point groups. Volume B contains accounts of numerous aspects of reciprocal space, such as the structure-factor formalism, and Volume C contains mathematical, physical, and chemical information needed for experimental studies in structural crystallography.

Jenkins and Snyder, 1996. See above.

Klug and Alexander, 1974. See above.
Offers a modern approach to techniques, especially of phase identification and quantification; however, there is very little discussion of synchrotron radiation or of structure solution. The classic text about many aspects of the technique, particularly the analysis of mixtures, although several of the experimental techniques described are rather dated. Langford, J. I. and Loue¨ r, D. 1996. Powder diffraction. Rep. Prog. Phys. 59:131–234. Offers another approach to techniques that includes a thorough discussion of synchrotron radiation and of structure solution.
X-RAY TECHNIQUES
Young, 1993. See above.

Another valuable reference for details on the Rietveld refinements.
INTERNET RESOURCES

http://www.ba.cnr.it/IRMEC/SirWare.html

http://www.icdd.com

http://www.ccp14.ac.uk

An invaluable resource for crystallographic computing that contains virtually all freely available software for powder (and also single-crystal) diffraction for academia, including every program mentioned in this unit.

ftp://ftp.lanl.gov/public/gsas

ftp://charybde.saclay.cea.fr/pub/divers/fullp/
PETER W. STEPHENS State University of New York Stony Brook, New York
GOETZ M. BENDELE Los Alamos National Laboratory Los Alamos, New Mexico, and State University of New York Stony Brook, New York
SINGLE-CRYSTAL X-RAY STRUCTURE DETERMINATION

INTRODUCTION

X-ray techniques provide one of the most powerful, if not the most powerful, methods for deducing the crystal and molecular structure of a crystalline solid. In general, it is a technique that provides a large number of observations for every parameter to be determined, and the results obtained thereby can usually be considered very reliable. Significant advances have taken place, especially over the last twenty years, to make the collection of data and the subsequent steps of the structure determination amenable to the relative novice in the field. This has led to the adoption of single-crystal diffraction methods as a standard analytical tool. It is no accident that this has paralleled the development of the digital computer—modern crystallography is strongly dependent on the use of the digital computer. Typically, a small single crystal (a few tenths of a millimeter in dimension) of the material to be investigated is placed on a diffractometer and data are collected under computer-controlled operation. A moderate-sized unit cell (10 to 20 Å on edge) can yield 5000 or so diffraction maxima, which might require a few days' data collection time on a scintillation counter diffractometer or only hours of data collection time on the newer charge-coupled device (CCD) area detectors. From these data the fractional coordinates describing the positions of the atoms within the cell, some characteristics of their thermal motion, the cell dimensions, the crystal system, the space group symmetry, the number of formula units per cell, and a calculated density can be obtained. This "solving of the structure" may only take a
few additional hours or a day or two if no undue complications occur. We will concentrate here on the use of X-ray diffraction techniques as would be necessary for the normal crystal structure investigation. A discussion of the basic principles of single-crystal X-ray diffraction will be presented first, followed by a discussion of the practical applications of these techniques. In the course of this discussion, a few of the more widely used methods of overcoming the lack of measured phases, i.e., the "phase problem" in crystallography, will be presented. These methods are usually sufficient to handle most routine crystal structure determinations. Space will not permit discussion of other more specialized techniques that are available and can be used when standard methods fail. The reader may wish to refer to other books devoted to this subject (e.g., Ladd and Palmer, 1985; Stout and Jensen, 1989; Lipscomb and Jacobson, 1990) for further information on these other techniques and for a more in-depth discussion of diffraction methods.

Competitive and Related Techniques

Single-crystal X-ray diffraction is certainly not the only technique available to the investigator for the determination of the structure of materials. Many other techniques can provide information that is often complementary to that obtained from single-crystal X-ray experiments. Some of these other techniques are briefly described below. Crystalline powders also diffract X rays. In fact, it is convenient to view powder diffraction as single-crystal diffraction integrated over angular coordinates, thus retaining only sin θ dependence (assuming the powder contains crystallites in random orientation). Diffraction maxima with essentially the same sin θ values combine in the powder diffraction pattern. This results in fewer discrete observations and makes the unit cell determination and structure solution more difficult; this is especially true in lower symmetry cases.
The smaller number of data points above background translates into larger standard deviations for the determined atomic parameters. Since the angular information is lost, obtaining an initial model is often quite difficult if nothing is known about the structure. Therefore, the majority of quantitative powder diffraction investigations are done in situations where an initial model can be obtained from a related structure; the calculated powder pattern is then fit to the observed one by adjusting the atomic parameters, the unit cell parameters, and the parameters that describe the peak shape of a single diffraction maximum. (This is usually termed Rietveld refinement; see X-RAY POWDER DIFFRACTION and NEUTRON POWDER DIFFRACTION.) If an appropriate single crystal of the material could be obtained, single-crystal diffraction would be the preferred approach. Powder diffraction sometimes provides an alternate approach when only very small crystals can be obtained. Neutrons also undergo diffraction when they pass through crystalline materials. In fact, the theoretical descriptions of X-ray and neutron diffraction are very similar, primarily differing in the atomic scattering factor used (see NEUTRON POWDER DIFFRACTION and SINGLE-CRYSTAL
NEUTRON DIFFRACTION). The principal interaction of X rays with matter is electronic, whereas with neutrons two interactions predominate—one is nuclear and the other is magnetic (Majkrzak et al., 1990). The interaction of the neutron with the nucleus varies with the isotope and typically does not show the large variation seen with X rays, where the scattering is proportional to the atomic number. Therefore neutrons can be useful in distinguishing between atoms of neighboring atomic number or light atoms, especially hydrogen, in the presence of heavy atoms. The neutron scattering factors for hydrogen and deuterium are also markedly different, which opens up a number of interesting possible structural studies, especially of organic and biological compounds. As noted above, neutrons are also sensitive to magnetic structure (see MAGNETIC NEUTRON SCATTERING). In fact, neutron scattering data constitute the most significant body of experimental evidence regarding long-range magnetic ordering in solids. In practice, with the neutron sources that are presently available, it is necessary to use a considerably larger sample for a neutron diffraction investigation than is necessary for an X-ray investigation. Where sample preparation is not a problem, single-crystal X-ray studies are usually carried out initially, followed by a single-crystal neutron investigation to obtain complementary information. (It might be noted that for powder studies neutron diffraction can have some advantages over X-ray diffraction in terms of profile pattern fitting, due to the Gaussian character of the neutron diffraction peaks and the enhanced intensities that can be obtained at higher sin θ values; see NEUTRON POWDER DIFFRACTION.) Electron diffraction can also provide useful structural information on materials.
The mathematical description appropriate for the scattering of an electron by the potential of an atom has many similarities to the description used in discussing X-ray diffraction. Electrons, however, are scattered much more efficiently than either X rays or neutrons, so much so that electrons can penetrate only a few atomic layers in a solid and are likely to undergo multiple scattering events in the process. Hence, electron diffraction is most often applied to the study of surfaces (low-energy electron diffraction; LEED), in electron-microscopic studies of microcrystals, or in gas-phase diffraction studies. X-ray absorption spectroscopy (EXAFS) (Heald and Tranquada, 1990) provides an additional technique that can be used to obtain structural information (see XAFS SPECTROSCOPY). At typical X-ray energies, absorption is a much more likely event than the scattering process that we associate with single-crystal diffraction. When an X-ray photon is absorbed by an electron in an atom, the electron is emitted with an energy equal to that of the X-ray photon minus the electron-binding energy. Thus some minimum photon energy is necessary, and the X-ray absorption spectrum shows sharp increases in absorption as the various electron-binding energies are crossed. These sharp steps are usually termed X-ray edges and have characteristic values for each element. Accurate measurements have revealed that there is structure near the edge. This fine structure is explained by modulations in the final state of the photoelectron that are caused by backscattering from the surrounding atoms. The advent of synchrotron X-ray sources has accelerated the use of EXAFS, since they provide the higher intensities that are necessary to observe the fine structure. The local nature of EXAFS makes it particularly sensitive to small perturbations within the unit cell, and it can thus complement the information obtained in diffraction experiments. With EXAFS it is possible to focus in on a particular atom type, and changes can be monitored without determining the entire structure. It can be routinely performed on a wide variety of heavier elements. When long-range order does not exist, diffraction methods such as X-ray diffraction have less utility. EXAFS yields an averaged radial distribution function pertinent to the particular atom and can be applied to amorphous as well as crystalline materials. EXAFS can thus be considered complementary to single-crystal diffraction. Solid-state nuclear magnetic resonance (Hendrichs and Hewitt, 1980) and Mössbauer spectroscopy (Berry, 1990; see MOSSBAUER SPECTROMETRY) can sometimes provide additional useful structural information, especially for non-single-crystal materials.

PRINCIPLES OF X-RAY CRYSTALLOGRAPHY

When a material is placed in an X-ray beam, the periodically varying electric field of the X ray accelerates the electrons into periodic motion; each in turn emits an electromagnetic wave with a frequency essentially identical to that of the incident wave and with a definite phase relation to the source, which we will take as identical to that of the incident wave. To the approximation usually employed, all electrons are assumed to scatter with the same amplitude and phase and to do so independently of all other electrons. (The energy of the X ray, approximately 20 keV for Mo Kα, is usually large in comparison with the binding energy of the electron, except for the inner electrons of the heavier atoms.
It should be noted, however, that the anomalous behavior of first shell, or first- and second-shell electrons in heavy elements, can provide a useful technique for phase determination in some cases and has proven to be especially valuable in protein crystallography.) Interference can occur from electrons occupying different spatial positions, providing the spatial difference is comparable to the wavelength being employed. Interference effects can occur within an atom, as is shown in Figure 1, where the variation of scattering with angle for
Figure 1. Decrease of amplitude of scattering from an atom of atomic number Z as a function of scattering angle θ. The function f is the atomic scattering factor. All atoms show similarly shaped functions, the rate of decrease being somewhat less for atoms of larger Z; their inner electrons are closer to the nucleus.
a typical atom is plotted. These interference effects cause f, the atom's amplitude of scattering, to decrease as the scattering angle increases. Interference effects can also occur between atoms, and it is the latter that permits one to deduce the positions of the atoms in the unit cell, i.e., the crystal structure. It is mathematically convenient to use a complex exponential description for the X-ray wave,

    E = E0 exp[2πi(νt − x/λ + δ)]    (1)

where x is the distance, λ is the wavelength, ν is the frequency, t is the time, and δ is a phase factor for the wave at x = 0 and t = 0. Anticipating that we will be referring to intensities (i.e., EE*, E* being the complex conjugate of E) and only be concerned with effects due to different spatial positions, we will use the simpler form

    E = E0 exp(−2πix/λ)    (2)

For two electrons displaced in one dimension the intensity of scattering would be given by

    I = EE* = |E1 + E2|² = |E01 exp(−2πix1/λ) + E02 exp(−2πix2/λ)|²    (3)

Furthermore, since E01 would be expected to equal E02,

    I = |E0[exp(−2πix1/λ) + exp(−2πix2/λ)]|²
      = E0² |{1 + exp[−2πi(x2 − x1)/λ]} exp(−2πix1/λ)|²
      = E0² |1 + exp[−2πi(x2 − x1)/λ]|²    (4)

Generalizing to three dimensions, consider a wave (Fig. 2) incident upon an electron at the origin and a second electron at r. Then the difference in distance would be given by r·(s0 − s), where s0 and s represent directions of the incident and scattered rays and r is the distance between electrons:

    I = E0² |1 + exp[2πi r·(s − s0)/λ]|²    (5)

or for n electrons

    I = E0² |Σj=0…n−1 exp[2πi rj·(s − s0)/λ]|²    (6)

Since we will only require relative intensities, we define

    F = Σj=0…n−1 exp[2πi rj·(s − s0)/λ]    (7)

where F is termed the structure factor and

    I ∝ |F|²    (8)

Figure 2. (A) Unit vectors s0 and s represent directions of the incident and scattered waves, respectively. (B) Scattering from a point at the origin and a second point at r. The point of observation P is very far away compared with the distance r.

The structure factor expression can be further simplified by replacing the summation by an integral,

    F = ∫V ρ(r) exp[2πi r·(s − s0)/λ] dV    (9)

where ρ(r) is the electron density. If the assumption is made that the electron density is approximated by the sum of atom electron densities, then

    F = Σj=0…N fj exp[2πi rj·(s − s0)/λ]    (10)

where fj is the atomic scattering factor, i.e., the scattering that would be produced by an isolated atom of that atomic number, convoluted with effects due to thermal motion. To simplify this equation further, we now specify that the sample is a single crystal, and rj = (xj + p)a1 + (yj + m)a2 + (zj + n)a3. The vectors a1, a2, and a3 describe the repeating unit (the unit cell), xj, yj, zj are the fractional coordinates of atom j in the cell, and p, m, and n are integers. Since rj·(s − s0)/λ must be dimensionless, the (s − s0)/λ quantity must be of dimension reciprocal length and can be represented by

    (s − s0)/λ = hb1 + kb2 + lb3    (11)

where b1, b2, and b3 are termed the reciprocal cell vectors. They are defined such that ai·bi = 1 and ai·bj = 0 (i ≠ j). (It should be noted that another common convention is to use a, b, c to represent unit dimensions in direct space and a*, b*, c* to represent the reciprocal space quantities.) By substituting into the structure factor expression and assuming at least a few hundred cells in each direction, it can be shown (Lipscomb and Jacobson, 1990) that the restriction of p, m, and n to integer values yields nonzero diffraction maxima only when h, k, and l are integer. Hence the structure factor for single-crystal diffraction is usually written as

    Fhkl = s Σj=1…N fj exp[2πi(hxj + kyj + lzj)]    (12)

where the sum is now over only the atoms in the repeating unit and the other terms (those involving p, m, and n) contribute to a constant multiplying factor, s.
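The structure factor sum of Equation 12 is straightforward to evaluate numerically. The sketch below is purely illustrative: a hypothetical two-atom cell with constant scattering factors and s = 1 (real scattering factors fall off with sin θ/λ, as Figure 1 shows):

```python
import cmath

def structure_factor(hkl, atoms, s=1.0):
    """Evaluate F_hkl = s * sum_j f_j exp[2*pi*i*(h*x_j + k*y_j + l*z_j)] (Eq. 12).

    atoms is a list of (f_j, (x_j, y_j, z_j)) with fractional coordinates;
    the scattering factors f_j are taken as angle-independent constants here.
    """
    h, k, l = hkl
    return s * sum(f * cmath.exp(2j * cmath.pi * (h * x + k * y + l * z))
                   for f, (x, y, z) in atoms)

# Hypothetical cell: identical atoms at (0,0,0) and (1/2,1/2,1/2),
# i.e., a body-centered arrangement, with f = 1 for both.
atoms = [(1.0, (0.0, 0.0, 0.0)), (1.0, (0.5, 0.5, 0.5))]

F110 = structure_factor((1, 1, 0), atoms)  # h + k + l even: the waves add
F100 = structure_factor((1, 0, 0), atoms)  # h + k + l odd: the waves cancel
```

The body-centered arrangement makes Fhkl vanish whenever h + k + l is odd, a first taste of the systematic absences produced by centering and other translational symmetry elements.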
Figure 3. (A) Graphical representation of s/λ − s0/λ = h. (B) Path difference between successive planes a distance d apart is 2d sin θ, which yields a maximum when equal to nλ.
We now define a reciprocal lattice vector h as h = hb1 + kb2 + lb3. Also, from Equation 11,

    hλ = s − s0    (13)

and this is termed the Laue equation. The diffraction pattern from a single crystal therefore can be viewed as yielding a reciprocal lattice pattern (h, k, l all integer) that is weighted by |Fhkl|². It can be readily shown that h is a vector normal to a plane passing through the unit cell with intercepts a1/h, a2/k, and a3/l. The integers h, k, and l are termed crystallographic indices (Miller indices if they are relatively prime numbers; see SYMMETRY IN CRYSTALLOGRAPHY). If 2θ is defined as the angle between s and s0, then the scalar equivalent of the Laue equation is

    λ = 2d sin θ    (14)

i.e., Bragg's law, where |h| = 1/d, d being the distance between planes. (If Miller indices are used, a factor of n, the order of the diffraction, is included on the left side of Equation 14.) A convenient graphical representation of the Laue equation is given in Figure 3 and alternately in what is termed the Ewald construction in Figure 4. These are
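Bragg's law (Equation 14) is easy to exercise with a small worked example. The cell edge, indices, and wavelength below are assumed values chosen for illustration (for a cubic cell, d = a/√(h² + k² + l²)):

```python
import math

def two_theta_deg(d, wavelength):
    """Diffraction angle 2*theta in degrees from Bragg's law, lambda = 2 d sin(theta)."""
    return 2.0 * math.degrees(math.asin(wavelength / (2.0 * d)))

# Assumed cubic cell with a = 5.0 A, reflection (2 2 0).
a, (h, k, l) = 5.0, (2, 2, 0)
d = a / math.sqrt(h**2 + k**2 + l**2)   # interplanar spacing, ~1.768 A
tt = two_theta_deg(d, 0.7107)           # Mo K-alpha wavelength, 0.7107 A
```

With these numbers the reflection appears near 2θ ≈ 23.2°; a shorter wavelength or a larger d-spacing moves it to lower angle.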
especially useful when discussing diffractometers and other devices for the collection of X-ray data. Diffraction maxima will only occur when a reciprocal lattice point intersects the sphere of diffraction. The scattering from an isolated atom represents a combination of the scattering of an atom at rest convoluted with the effects due to thermal motion. These can be separated, and for an atom undergoing isotropic motion, a thermal factor expression

    Tj = exp(−Bj sin²θ/λ²)    (15)

can be used, where Bj = 8π²μ̄², μ̄² being the mean-square amplitude of vibration. For anisotropic motion, we use

    Tj = exp[−2π²(U11h²b1² + U22k²b2² + U33l²b3² + 2U12hkb1b2 + 2U13hlb1b3 + 2U23klb2b3)]    (16a)

or the analogous expression

    Tj = exp[−(β11h² + β22k² + β33l² + 2β12hk + 2β13hl + 2β23kl)]    (16b)

The U's are thermal parameters expressed in terms of mean-square amplitudes in angstroms, while the β's are the associated quantities without units. The structure factor is then written as

    Fhkl = s Σj=1…N fj exp[2πi(hxj + kyj + lzj)] Tj    (17)
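The isotropic thermal factor of Equation 15 damps high-angle intensities. A minimal sketch, using an assumed B of 2.5 Å² purely for illustration:

```python
import math

def isotropic_T(B, s_over_l):
    """Isotropic thermal factor T_j = exp(-B_j sin^2(theta)/lambda^2) (Eq. 15).

    s_over_l is sin(theta)/lambda in reciprocal angstroms; B is in A^2.
    """
    return math.exp(-B * s_over_l**2)

B = 2.5                        # assumed isotropic B value, A^2
T_low = isotropic_T(B, 0.1)    # low-angle reflection: mild attenuation
T_high = isotropic_T(B, 0.6)   # high-angle reflection: strong attenuation
```

Because the attenuation grows rapidly with angle, thermal motion effectively limits the resolution to which diffraction maxima can be observed.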
Figure 4. Ewald construction. Assume the reciprocal lattice net represents the h0l zone of a monoclinic crystal. Then the a2 axis [010] is perpendicular to the plane of the paper. As the crystal is rotated about this axis, various points of the reciprocal lattice cross the circle of reflection; as they do so, the equation s/λ − s0/λ = h is satisfied, and a diffracted beam occurs. Diffraction by the (101̄) plane is illustrated.
As we have seen, one form of the structure factor expression (Equation 9) involves ρ(r), the electron density function, in a Fourier series. Therefore, it would be expected that there exists an inverse Fourier series in which the electron density is expressed in terms of structure factor quantities. Indeed, the electron density function for the cell can be written as

    ρ(r) = (1/V) Σh Σk Σl Fhkl exp[−2πi(hx + ky + lz)]    (18)
In this expression V is the volume of the cell and the triple summation is in theory from −∞ to ∞ and in practice over all the structure factors. This series can be written in a number of equivalent ways. The structure factor can be expressed in terms of the real and imaginary components A and B as

    Fhkl = Ahkl + iBhkl    (19)

    Fhkl = s Σj fj cos 2π(hxj + kyj + lzj) Tj + is Σj fj sin 2π(hxj + kyj + lzj) Tj    (20)

and

    ρ(xyz) = (1/V) Σh Σk Σl [Ahkl cos 2π(hx + ky + lz) + Bhkl sin 2π(hx + ky + lz)]    (21)
Another form for the structure factor is

    Fhkl = |Fhkl| exp(2πiαhkl)    (22)

Then

    αhkl = tan⁻¹(Bhkl/Ahkl)    (23)

and

    |Fhkl| = (Ahkl² + Bhkl²)^(1/2)    (24)

The quantity αhkl is termed the crystallographic phase angle. It could also be noted that, using h̄k̄l̄ to designate −h, −k, −l,

    Ah̄k̄l̄ = Ahkl    (25)

and

    Bh̄k̄l̄ = −Bhkl    (26)

if the atomic scattering factor fj is real. Also,

    |Fh̄k̄l̄|² = |Fhkl|²    (27)
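The equality of |F|² for hkl and its negative (Equations 25 to 27) is easy to verify numerically when the scattering factors are real; the atoms below are arbitrary hypothetical values:

```python
import cmath

def F(hkl, atoms):
    """Structure factor of Eq. 12 with s = 1 and real, constant f_j."""
    h, k, l = hkl
    return sum(f * cmath.exp(2j * cmath.pi * (h * x + k * y + l * z))
               for f, (x, y, z) in atoms)

# Arbitrary non-centrosymmetric arrangement of two unequal atoms.
atoms = [(6.0, (0.10, 0.20, 0.35)), (8.0, (0.55, 0.70, 0.05))]

F_plus = F((2, 1, 3), atoms)
F_minus = F((-2, -1, -3), atoms)
# |F(hkl)|^2 equals |F(-h,-k,-l)|^2, while the phase angles alpha are
# equal in magnitude and opposite in sign.
```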
Equation 27 is termed Friedel’s law and as noted holds as long as fj is real. Equation 18 implies that the electron density for the cell can be obtained from the structure factors; a knowledge of the electron density would enable us to deduce the location and types of atoms making up the cell. The magnitude of the structure factor can be obtained from the intensities (Equation 8); however, the phase is not measured, and this gives rise to what is termed the phase problem in crystallography. Considerable effort has been devoted over the last half century to developing reliable methods to deduce these phases. One of the most widely used of the current methods is that developed in large part by J. Karle and H. Hauptman (Hauptman and Karle,
1953; Karle and Hauptman, 1956; Karle and Karle, 1966), which uses a statistical and probability approach to extract phases directly from the magnitudes of the structure factors. We will give a brief discussion of the background of such "direct methods" as well as a discussion of the heavy-atom method, another commonly employed approach. It should be noted, however, that a number of other techniques could be used if these methods fail to yield a good trial model; these will not be discussed here due to space limitations.

Symmetry in Crystals

Symmetry plays an important role in the determination of crystal structures; only those atomic positional and thermal parameters that are symmetry independent need to be determined. In fact, if the proper symmetry is not recognized and attempts are made to refine parameters that are symmetry related, correlation effects occur, and unreliable refinement and even divergence can result. What restrictions are there on the symmetry elements and their combinations that can be present in a crystal? (We will confine our discussion to crystals as generated by a single tiling pattern and exclude cases such as the quasi-crystal.) Consider a lattice vector a operated on by a rotation axis perpendicular to the plane of the paper, as shown in Figure 5. Rotation by α or −α will move the head of the vector to two other lattice points, and these can be connected by a vector b. Since b must be parallel to a, one can relate their lengths by b = ma, where m is an integer, or by b = 2a cos α. Thus m = 2 cos α and

    α = cos⁻¹(m/2)    (28)
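Equation 28 can be enumerated directly; this sketch simply evaluates α = cos⁻¹(m/2) for the integer values of m that keep |m/2| ≤ 1:

```python
import math

# Eq. 28: alpha = arccos(m/2) has solutions only for integer m with |m| <= 2.
angles = sorted({round(math.degrees(math.acos(m / 2))) for m in range(-2, 3)})

# Rotation orders compatible with a lattice (360/alpha for alpha > 0);
# alpha = 0 corresponds to the trivial order-1 axis.
orders = sorted({round(360 / a) for a in angles if a > 0})
```

Only the rotation angles 0°, 60°, 90°, 120°, and 180° (and their complements) survive, i.e., the crystallographic axis orders discussed in the text.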
Obviously the only allowed values for m are 0, ±1, ±2, which yield 0°, 60°, 90°, 120°, 180°, 240°, 270°, and 300° as possible values for the rotation angle α. This in turn implies rotation axes of order 1, 2, 3, 4, or 6. No other orders are compatible with the repeating character of the lattice. A lattice can also have inversion symmetry, as would be implied by Friedel's law (Equation 27). Coupling the inversion operation with a rotation axis produces the rotatory inversion axes: 1̄, 2̄, 3̄, 4̄, and 6̄. (These operations can be viewed as a rotation followed by an inversion. The 1̄ is just the inversion operation and 2̄ the mirror operation.) Various combinations of these symmetry elements can now be made. In doing so, 32 different point groups are produced (see SYMMETRY IN CRYSTALLOGRAPHY). Since in a lattice description the repeating unit is represented by a point and the lattice has inversion symmetry, those point
Figure 5. Rotation axis A(α) at a lattice point. Note that the vector b is parallel to a.
groups that differ only by an inversion center can be grouped together. This produces the seven crystal systems given in Table 1.

Table 1. Fourteen Bravais Lattices

Symmetry at a Lattice Point   Name                                Lattice Points Per Unit Cell   Elementary Cell Relations
Ci (1̄)                        Triclinic                           1
C2h (2/m)                     Primitive monoclinic                1                              α1 = α3 = 90°
                              End-centered monoclinic C           2
D2h (mmm)                     Primitive orthorhombic              1                              α1 = α2 = α3 = 90°
                              End-centered C, A, B                2
                              Face-centered F                     4
                              Body-centered I                     2
D4h (4/m mm)                  Primitive tetragonal                1                              a1 = a2; α1 = α2 = α3 = 90°
                                (equivalent to end-centered)
                              Body-centered I                     2
                                (equivalent to face-centered)
D3d (3̄m)                      Trigonal                            1                              a1 = a2 = a3; α1 = α2 = α3
                                                                                                 (rhombohedral axes; for hexagonal
                                                                                                 description, see text)
D6h (6/m mm)                  Hexagonal                           1                              a1 = a2; α1 = α2 = 90°, α3 = 120°
Oh (m3̄m)                      Primitive cubic                     1                              a1 = a2 = a3; α1 = α2 = α3 = 90°
                              Face-centered cubic F               4
                              Body-centered cubic I               2

For a number of crystal systems, in addition to the primitive cell (one lattice point per cell), there are also centered cell possibilities. (Such cells are not allowed if redrawing the centered cell to obtain a primitive one can be done without loss of symmetry consistent with the crystal system. Also, if redrawing or renaming axes converts one type to another, only one of these is included.) The notation used to describe centered cells is as follows: A—centered on the a2, a3 face; B—centered on the a1, a3 face; C—centered on the a1, a2 face; I—body centered; and F—face centered. As noted in the table, trigonal cells can be reported using a primitive rhombohedral cell or as a hexagonal cell. If the latter, then a triply primitive R-centered cell is used. R centering has lattice points at (1/3, 2/3, 2/3) and (2/3, 1/3, 1/3) as well as at (0, 0, 0). The next task is to convert these 32 point groups into the related space groups. This can be accomplished by generating not only space groups having these point group symmetries but also groups in which the point group operation is replaced by an "isomorphous" operation that has a translational element associated with it. However, this must be done in such a way as to preserve the multiplicative properties of the group. The symmetry operation isomorphous to the rotation axis is the screw axis. Screw axes are designated by Nm, where the rotation is through an angle 2π/N and the translation, in terms of fractions of the cell, is m/N and is parallel to the rotation axis. Thus 21 is a 180° rotation coupled with a 1/2 cell translation, 41 is a 90° rotation coupled with a 1/4 cell translation, and 62 represents a 60° rotation coupled
with a 1/3 cell translation. Note that 21 applied twice yields a cell translation along the rotation axis direction and hence produces a result isomorphous to the identity operation; performing a 62 six times yields a translation of two cell lengths. Mirror planes can be coupled with a translation and converted to glide planes. The following notation is used: a glide, translates 1/2 in a1; b glide, translates 1/2 in a2; c glide, translates 1/2 in a3; n glide, translates 1/2 along the face diagonal; and d glide, translates 1/4 along the face or body diagonal. Let us digress for a moment to continue to discuss notation in general. When referring to space groups, the Hermann-Mauguin notation is the one most commonly used (see SYMMETRY IN CRYSTALLOGRAPHY). The symbol given first indicates the cell type, followed by the symmetry in one or more directions as dictated by the crystal system. For the triclinic system, the only possible space groups are P1 and P1̄. For the monoclinic system, the cell type, P or C, is followed by the symmetry in the b direction. (The b direction is typically taken as the unique direction for the monoclinic system, although one can occasionally encounter c-unique monoclinic descriptions in the older literature.) If a rotation axis and the normal to a mirror plane are in the same direction, then a slash (/) is used to designate that the two operations are in the same direction. For the point group C2h, the isomorphous space groups in the monoclinic system include P2/m, P21/m, P2/c, P21/c, C2/m, and C2/c. In the orthorhombic system, the lattice type is followed by the symmetry along the a, b, and c directions. The space groups P222, Pca2, P212121, Pmmm, Pbca, Ama2,
Fdd2, and Cmcm are all examples of orthorhombic space groups. The notation for the higher-symmetry crystal systems follows in the same manner. The symmetry along the symmetry axis of order greater than 2 is given first, followed by the symmetry in the other directions as dictated by the point group. For tetragonal and hexagonal systems, the symmetry along the c direction is given, followed by that for a and then for the ab diagonal, when isomorphous with a D-type point group. For cubic space groups the order is the symmetry along the cell edge, followed by that along the body diagonal, followed by the symmetry along the face diagonal, where appropriate. A space group operation can be represented by

\begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} = R \begin{pmatrix} x \\ y \\ z \end{pmatrix} + t \qquad (29)

where R is a 3 x 3 matrix corresponding to the point group operation and t is a column vector containing the three components of the associated translation. As noted above, group properties require that operations in the space group combine in the same fashion as those in the isomorphous point group; this in turn usually dictates the positions in the cell of the various symmetry elements making up the space group. Consider, for example, the space group P21/c in the monoclinic system. This commonly occurring space group is derived from the C2h point group. Since in the point group a twofold rotation followed by the reflection is equivalent to the inversion, in the space group the 21 operation followed by the c glide must be equivalent to the inversion. The usual convention is to place an inversion at the origin of the cell. For this to be so, the screw axis has to be placed at z = 1/4 and the c glide at y = 1/4. Therefore, in P21/c, the following relations hold:

i:  (x, y, z) -> (-x, -y, -z)            (30a)
21: (x, y, z) -> (-x, 1/2 + y, 1/2 - z)  (30b)
c:  (x, y, z) -> (x, 1/2 - y, 1/2 + z)   (30c)

and these four equivalent positions are termed the general positions in the space group. In many space groups it is also possible for atoms to reside in "special positions," locations in the cell where a point group operation leaves the position invariant. Obviously these can occur only for symmetry elements that contain no translational components. In P21/c, the only special position possible is associated with the inversion. The number of equivalent positions in this case is two, namely, 0, 0, 0 and 0, 1/2, 1/2, or any of the translationally related pairs. By replacing each point group operation by any allowable isomorphous space operation, the 32 point groups yield the 230 space groups. A complete listing of all the space groups, with cell diagrams showing the placement of the symmetry elements and listings of the general and special position coordinates, can be found in Volume A of the International Tables for Crystallography (1983).

Examination of the symmetry of the diffraction pattern (the Laue symmetry) permits the determination of the crystal system. To further limit the possible space groups within the crystal system, one needs to determine those classes of reflections that are systematically absent (extinct). Consider the c-glide plane as present in P21/c. For every atom at (x, y, z), there is another at (x, 1/2 - y, 1/2 + z). The structure factor can then be written as

F_{hkl} = \sum_{j=1}^{N/2} f_j \{ \exp[2\pi i(hx_j + ky_j + lz_j)] + \exp[2\pi i(hx_j + k(1/2 - y_j) + l(1/2 + z_j))] \} \qquad (31)

For those reflections with k = 0,

F_{h0l} = \sum_{j=1}^{N/2} f_j \exp[2\pi i(hx_j + lz_j)]\,[1 + (-1)^l] \qquad (32)
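As a quick numerical illustration of this extinction condition, the following sketch evaluates the structure factor for an atom and its c-glide mate; the fractional coordinates and the constant scattering factor are invented, and any glide-related pair behaves the same way.

```python
import numpy as np

# Sketch of eqs. (31)-(32): the c glide makes h0l reflections with l odd vanish.
def structure_factor(h, k, l, atoms):
    """F(hkl) for atoms given as (f, x, y, z) in fractional coordinates."""
    return sum(f * np.exp(2j * np.pi * (h * x + k * y + l * z))
               for f, x, y, z in atoms)

x, y, z, f = 0.137, 0.222, 0.304, 6.0        # hypothetical atom
atoms = [(f, x, y, z),                        # atom at (x, y, z)
         (f, x, 0.5 - y, 0.5 + z)]            # its c-glide mate

F_even = structure_factor(3, 0, 2, atoms)     # h0l with l even: generally present
F_odd = structure_factor(3, 0, 3, atoms)      # h0l with l odd: systematically absent
print(abs(F_even), abs(F_odd))
```

Running this shows |F(303)| collapsing to zero while |F(302)| does not, exactly the h0l, l = 2n condition tabulated for a c glide perpendicular to b.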
Thus, for h0l reflections, all intensities with l odd would be absent. Different symmetry elements give different patterns of missing reflections as long as the symmetry element has a translational component. Point group elements show no systematic missing reflections. Table 2 lists the extinction conditions caused by various symmetry elements. In many cases an examination of the extinctions will uniquely determine the space group. In others, two possibilities may remain that differ only by the presence or absence of a center of symmetry. Statistical tests (Wilson, 1944) have been devised to detect the presence of a center of symmetry; one of the most reliable is that developed by Howells et al. (1950). It should be cautioned, however, that while such tests are usually reliable, one should not interpret their results as being 100% certain. Further discussion of space group symmetry can be found in SYMMETRY IN CRYSTALLOGRAPHY as well as in the literature (International Tables for Crystallography, 1983; Ladd and Palmer, 1985; Stout and Jensen, 1989; Lipscomb and Jacobson, 1990).

Crystal Structure Refinement

After data have been collected and the space group determined, a first approximation to the crystal structure must be found. (Methods to do so are described in the next section.) Such a model must then be refined to obtain a final model that best agrees with the experimental data. Although a Fourier series approach could be used (using electron density map or difference electron density map calculations), it has the disadvantage that the structural model could be affected by systematic errors due to series termination effects, related to truncation of data at the observation limit. Consequently, a least-squares refinement method is employed instead. Changes in the positional and thermal parameters are calculated that minimize the difference between |F_hkl^obs| and |F_hkl^calc|, or alternatively between I_hkl^obs and I_hkl^calc.
SINGLE-CRYSTAL X-RAY STRUCTURE DETERMINATION
Table 2. Extinctions Caused by Symmetry Elements^a

Class of     Condition for Nonextinction   Interpretation of                                    Symbol for
Reflection   (n, an integer)               Extinction                                           Symmetry Element
-----------------------------------------------------------------------------------------------------------------
hkl          h + k + l = 2n                Body-centered lattice                                I
hkl          h + k = 2n                    C-centered lattice                                   C
hkl          h + l = 2n                    B-centered lattice                                   B
hkl          k + l = 2n                    A-centered lattice                                   A
hkl          h + k, h + l, k + l = 2n      Face-centered lattice                                F
hkl          -h + k + l = 3n               Rhombohedral lattice indexed on hexagonal lattice    R
hkl          h - k + l = 3n                Hexagonal lattice indexed on rhombohedral lattice    H
0kl          k = 2n                        Glide plane perpendicular to a, translation b/2      b
0kl          l = 2n                        Glide plane perpendicular to a, translation c/2      c
0kl          k + l = 2n                    Glide plane perpendicular to a, translation (b+c)/2  n
0kl          k + l = 4n                    Glide plane perpendicular to a, translation (b+c)/4  d
h0l          h = 2n                        Glide plane perpendicular to b, translation a/2      a
h0l          l = 2n                        Glide plane perpendicular to b, translation c/2      c
h0l          h + l = 2n                    Glide plane perpendicular to b, translation (a+c)/2  n
h0l          h + l = 4n                    Glide plane perpendicular to b, translation (a+c)/4  d
hhl          l = 2n                        Glide plane perpendicular to (a-b), translation c/2  c
hhl          2h + l = 2n                   Glide plane perpendicular to (a-b), translation (a+b+c)/2   n
hhl          2h + l = 4n                   Glide plane perpendicular to (a-b), translation (a+b+c)/4   d
h00^b        h = 2n                        Screw axis parallel to a, translation a/2            21, 42
h00^b        h = 4n                        Screw axis parallel to a, translation a/4            41, 43
00l          l = 2n                        Screw axis parallel to c, translation c/2            21, 42, 63
00l          l = 4n                        Screw axis parallel to c, translation c/4            41, 43
00l          l = 3n                        Screw axis parallel to c, translation c/3            31, 32, 62, 64
00l          l = 6n                        Screw axis parallel to c, translation c/6            61, 65
hh0          h = 2n                        Screw axis parallel to (a+b), translation (a+b)/2    21

^a Adapted from Buerger, 1942.
^b Similarly for 0k0 reflections (k) and b translations.
It is common practice to refer to one portion of the data set as observed data and the other as unobserved. The terminology is a carry-over from the early days of crystallography, when the data were measured from photographic film by visual matching to a standard scale. Intensities could be measured only to some lower limit, hence the "observed" and "unobserved" categories. Although measurements are now made with digital counters (scintillation detectors or area detectors), the distinction between these categories is still made; often I > 3\sigma(I), or a similar condition involving |F|, is used for designating observed reflections. Such a distinction is still valuable when refining on |F| (see below) and when discussing R factors. Crystallographers typically refer to a residual index (R factor) as a gauge for assessing the validity of a structure. It is defined as

R = \sum_h \big| |F_h^{obs}| - |F_h^{calc}| \big| \Big/ \sum_h |F_h^{obs}| \qquad (33)

Atoms placed in incorrect positions usually produce R > 0.40. If positional parameters are correct, R's in the 0.20 to 0.30 range are common. Structures with refined isotropic thermal parameters typically give R of about 0.15, and least-squares refinement of the anisotropic thermal parameters will reduce this to 0.05 or less for a well-behaved structure. Positional and thermal parameters occur in trigonometric and exponential functions, respectively. To refine
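Equation (33) is simple to evaluate; here is a minimal sketch in which the observed and calculated structure factor magnitudes are invented for illustration.

```python
import numpy as np

# Sketch of the residual index R of eq. (33); |F_obs| and |F_calc| are invented.
F_obs = np.array([120.0, 85.0, 60.0, 42.0, 15.0])
F_calc = np.array([118.0, 88.0, 57.0, 44.0, 13.0])

R = np.sum(np.abs(F_obs - F_calc)) / np.sum(F_obs)
print(round(R, 3))   # -> 0.037, i.e., a 3.7% residual
```

A value this small would only be expected near the end of a refinement; an incorrect trial structure would typically give R above 0.40, as noted above.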
such nonlinear functions, a Taylor series expansion is used. If f is a nonlinear function of parameters p_1, p_2, ..., p_n, then f can be written as a Taylor series

f = f^0 + \sum_{i=1}^{n} \left( \frac{\partial f}{\partial p_i} \right) \Delta p_i + \text{higher-order terms} \qquad (34)

where f^0 represents the value of the function evaluated with the initial parameters and the \Delta p_i's are the shifts in these parameters. In practice, the higher-order terms are neglected; if \epsilon_j for observation j is defined as

\epsilon_j = f_j - f_j^0 - \sum_{i=1}^{n} \left( \frac{\partial f_j}{\partial p_i} \right) \Delta p_i \qquad (35)

then \sum_j w_j \epsilon_j^2 can be minimized to obtain the set of best shifts of the parameters in a least-squares sense. These in turn can be used to obtain a new set of f^0's, and the process repeated until all the shifts are smaller than some prescribed minimum. The weights w_j are chosen to be reciprocally related to the square of the standard deviation associated with that observation. In the X-ray case, as noted above, f can be I_hkl or |F_hkl|. If the latter, then

|F_{hkl}^{c}| = F_{hkl}^{c} \exp(-2\pi i \alpha_{hkl}) = A_{hkl}^{c} \cos \alpha_{hkl} + B_{hkl}^{c} \sin \alpha_{hkl} \qquad (36)
X-RAY TECHNIQUES
In this case, only observed reflections should be used to calculate parameter shifts, since the least-squares method assumes a Gaussian distribution of errors in the observations. (For net intensities that are negative, i.e., where the total intensity is somewhat less than the background measurement, |F_hkl^obs| is not defined. After refinement, structure factors can still be calculated for these unobserved reflections.) The set of least-squares equations can be obtained from these \epsilon_j equations as follows for m observations. Express

\begin{pmatrix}
\partial f_1/\partial p_1 & \partial f_1/\partial p_2 & \cdots & \partial f_1/\partial p_n \\
\vdots & \vdots & & \vdots \\
\partial f_m/\partial p_1 & \partial f_m/\partial p_2 & \cdots & \partial f_m/\partial p_n
\end{pmatrix}
\begin{pmatrix} \Delta p_1 \\ \Delta p_2 \\ \vdots \\ \Delta p_n \end{pmatrix}
=
\begin{pmatrix} f_1 - f_1^0 \\ f_2 - f_2^0 \\ \vdots \\ f_m - f_m^0 \end{pmatrix} \qquad (37)
It is convenient to represent these matrices by MP = Y. Then multiplying both sides by the transpose of M, M^T, gives M^T M P = M^T Y. Let A = M^T M and B = M^T Y. Since A is now a square matrix, P = A^{-1} B. Least-squares methods not only give a final set of parameters but also provide

\sigma^2(p_j) = a^{jj} \left( \frac{\sum_i w_i \epsilon_i^2}{m - n} \right) \qquad (38)

where \sigma(p_j) is the standard deviation associated with the parameter p_j, a^{jj} is the jth diagonal element of the inverse matrix A^{-1}, m is the number of observations, and n is the number of parameters. The square root of the quantity in parentheses is sometimes referred to as the standard deviation of an observation of unit weight, or the goodness-of-fit parameter. If the weights are correct, i.e., if the errors in the data are strictly random and correctly estimated, and if the crystal structure is properly being modeled, then the value of this quantity should approximately equal unity.
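The cycle described by eqs. (37) and (38) can be sketched with synthetic data and unit weights; the derivative matrix M, the "true" shifts, and the noise level below are all invented for illustration.

```python
import numpy as np

# Sketch of one least-squares cycle, eqs. (37)-(38), with unit weights.
rng = np.random.default_rng(1)
m, n = 40, 3                                  # m observations, n parameters
M = rng.normal(size=(m, n))                   # derivative (design) matrix df_j/dp_i
shifts_true = np.array([0.02, -0.05, 0.01])   # the shifts we hope to recover
Y = M @ shifts_true + rng.normal(scale=1e-4, size=m)   # f_j - f_j^0 with noise

A = M.T @ M                                   # normal matrix A = M^T M
B = M.T @ Y                                   # right-hand side B = M^T Y
P = np.linalg.solve(A, B)                     # parameter shifts, P = A^{-1} B

eps = Y - M @ P                               # residuals after applying the shifts
gof = np.sqrt((eps ** 2).sum() / (m - n))     # goodness of fit (unit weights)
sigma = np.sqrt(np.diag(np.linalg.inv(A))) * gof   # esd of each shift, eq. (38)
```

In a real refinement M holds the analytic derivatives of |F| (or I) with respect to the positional and thermal parameters, the weights are folded into A and B, and the cycle is repeated until the shifts become negligible.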
PRACTICAL ASPECTS OF THE METHOD In practice, the application of single-crystal X-ray diffraction methods for the determination of crystal and molecular structure can be subdivided into a number of tasks. The investigator must: (i) select and mount a suitable crystal; (ii) collect the diffracted intensities produced by the crystal; (iii) correct the data for various experimental effects; (iv) obtain an initial approximate model of the structure; (v) refine this model to obtain the best fit between the observed intensities and their calculated counterparts; and (vi) calculate derived results (e.g., bond distances and angles and stereographic drawings including thermal ellipsoid plots) from the atomic parameters associated with the final model. For further discussion of the selection and mounting of a crystal see Sample Preparation, below; for further details on methods of data collection, see Data Collection Techniques. The fiber or capillary holding the crystal is placed in a pin on a goniometer head that has x, y, z translations. (Preliminary investigation using film techniques—precession
or Weissenberg—could be carried out at this stage, if desired, but we will not discuss such investigations here.) The goniometer head is then placed on the diffractometer and adjusted using the instrument's microscope so that the center of the crystal remains fixed during axial rotations. The next task is to determine the cell dimensions, the orientation of the cell relative to the laboratory coordinate system, and the likely Bravais lattice and crystal system. This process is usually termed indexing. To accomplish this, an initial survey of the low- to mid-angle (theta) regions of reciprocal space is carried out. This can be done via a rotation photograph using a Polaroid cassette—the user then inputs the x, y positions of the diffraction maxima and the diffractometer searches in the rotation angle—or by initiating a search procedure that varies several of the angles on a four-circle diffractometer. Any peaks found are centered. Peak profiles give an indication of crystal quality—one would expect Gaussian-like peaks from a crystal of good quality. Sometimes successful structure determinations can be carried out on crystals that give split peaks or have a significant side peak, but it is best to work with crystals giving well-shaped peaks if possible. With an area detector, reflections can be collected from a few frames at different rotation angles. An indexing program is then employed that uses this information to determine the most probable cell and crystal system. (Although these indexing programs are usually quite reliable, the user must remember that the information is being obtained from a reasonably small number of reflections and some caution is advisable. The user should verify that all or almost all of the reflections were satisfactorily assigned indices by the program.) Computer programs (Jacobson, 1976, 1997) typically produce a reduced cell as a result of indexing.
The reduced cell is one that minimizes the lengths of a1, a2, and a3 that describe the repeating unit (and hence gives angles as close to 90 degrees as possible) (Niggli, 1928; Lawton and Jacobson, 1965; Santoro and Mighell, 1970). It also orders the axes according to a convention such as |a1| <= |a2| <= |a3|. This is followed by a Bravais lattice determination in which characteristic length and angle relationships are used to predict the presence of centered cells, and the axes are rearranged to be consistent with the likely crystal system. The next step would then be to measure intensities of some sets of reflections that would be predicted to be equal by the Laue group symmetry (to obtain additional experimental verification as to the actual Laue group of the crystal or a possible subgroup). Once the likely Laue symmetry has been determined, the data collection can begin. This author usually prefers to collect more than just a unique data set; the subsequent data averaging gives a good estimate of data quality. Data are typically collected to a 2-theta maximum of 50 to 60 degrees with Mo K-alpha radiation, the maximum chosen depending on the crystal quality and the amount of thermal motion associated with the atoms. (Greater thermal motion means weaker average intensities as 2-theta increases.) Once data collection has finished, the data will need to be corrected for geometric effects such as the Lorentz effect (the velocity with which the reciprocal lattice point moves
through the sphere of reflection) and the polarization effect; the X-ray beam becomes partially polarized in the course of the diffraction process. Usually data will also have to be corrected for absorption, since as the beam traverses the crystal, absorption occurs, as given by the relation e^(-mu t), where mu is the absorption coefficient and t is the path length. An "analytical" absorption correction can be made if the orientations of the crystal faces and their distances from the center of the crystal are carefully measured. Often empirical absorption corrections are employed instead. For data collected on a four-circle diffractometer, a series of psi scans can be used for this purpose. These are scans at different phi angle positions for reflections close to chi = 90 degrees. (For a reflection at chi = 90 degrees, diffraction is phi independent, and any variation in observed diffraction can be attributed to effects due to absorption.) For data collected on an area detector, computer programs determine the parameters associated with an analytic representation of the absorption surface using the observed differences found between the intensities of symmetry-related reflections. Corrections can also be made for a decrease in scattering caused by crystal decomposition effects, if necessary. This composite set of corrections is often referred to as data reduction. Following data reduction, the data can be averaged and then the probable space group or groups found by an examination of systematic absences in the data. Statistical tests, such as the Howells, Philips, and Rogers tests for the possible presence of a center of symmetry, can be carried out if more than one space group is indicated. Next the investigator typically seeks an initial trial model using a direct method program, or if a heavy-atom approach is appropriate, the analysis of peaks between symmetry-related atoms found in the Patterson calculation might be used as discussed below.
An initial electron density map would then be calculated. The bond distances and bond angles that are associated with these atom position possibilities and the residual index often indicate whether a valid solution has been found. If so, the inclusion of additional atoms found on electron density or difference electron density maps and subsequent refinement usually leads to a final structural model. Final bond distances, angles, and their standard deviations can be produced along with a plot of the atom positions (e.g., an ORTEP drawing); see Data Analysis and Initial Interpretations, Derived Results. If a valid initial model is not obtained, the number of phase set trials might be changed if using a direct method program. Different Patterson peaks could be tried. Other space group or crystal system possibilities could be investigated. Other structure solution methods beyond the scope of this chapter might have to be employed to deduce an appropriate trial model. Fortunately, if the data are of good quality, most structures can be solved using the standard computer software available with commercial systems.
Data Collection Techniques

Film has long been used to record X-ray diffraction intensities. However, film has a number of disadvantages; it is relatively insensitive, especially for the shorter X-ray wavelengths, and does not have a large dynamic range. Gas-filled proportional counters can be used to detect X rays. These detectors can attain high efficiencies for longer wavelength radiations; with the use of a center wire of high resistance, they can be used to obtain positional information as well. The curved one-dimensional versions are used for powder diffraction studies, and the two-dimensional counterparts form the basis for some of the area detectors used in biochemical applications. They operate at reasonably high voltages (800 to 1700 V), and the passage of an X-ray photon causes ionization and an avalanche of electrons that migrate to the anode. The number of electrons is proportional to the number of ion pairs, which is approximately proportional to the energy of the incident X-ray photon.

The scintillation counter is composed of a NaI crystal that has been activated by the addition of some Tl. An absorbed X-ray produces a photoelectron and one or more Auger electrons; the latter energize the Tl sites, which in turn emit visible photons. The crystal is placed against a photomultiplier tube that converts the light into a pulse of electrons. The number of color centers energized is approximately dependent on the initial X-ray energy. Scintillation detectors are typically used on four-circle diffractometers.

The four-circle diffractometer is one of the most popular of the counter diffractometers. As the name suggests, this diffractometer has four shafts that can usually be moved in an independent fashion. Such a diffractometer is shown schematically in Figure 6. The 2-theta axis is used to move the detector parallel to the basal plane, while the remaining three axes position the crystal (Hamilton, 1974). The omega axis rotates around an axis that is perpendicular to the basal plane, the chi axis rotates around an axis that is perpendicular to the face of the circle, and the phi axis rotates around the goniometer head mount. To collect data with such an instrument, these angles must be positioned such that the Laue equation, or its scalar equivalent, Bragg's law, is satisfied (see DYNAMICAL DIFFRACTION).

Figure 6. Four-circle diffractometer.

Electronic area detectors have become more routinely available in the last few years and represent a major advance in detector technology. Two such detectors are the CCD detector and the imaging plate detector. In a typical CCD system, the X-ray photons impinge on a phosphor such as gadolinium oxysulfide. The light produced is conducted to a CCD chip via fiber optics, as shown in Figure 7, yielding about eight electrons per X-ray photon. The typical system is maintained at -50 degrees C and has a dynamic range of 1 to >10^5 photons per pixel. Readout times are typically 2 to 9 s, depending on the noise suppression desired.

Figure 7. A CCD area detector. An X-ray photon arriving from the left is converted to light photons at the phosphor. Fiber optics conduct these to the CCD chip on the right.

In a typical imaging plate system, the X-ray photons impinge on a phosphor (e.g., europium-doped barium halide) and produce color centers in the imaging plate. Following the exposure, these color centers are read by a laser and the light emitted is converted to an electronic signal using a photomultiplier. The laser action also erases the image so that the plate can be reused. In practice, usually one plate is being exposed while another is being read. The readout time is somewhat longer than with a CCD, typically 4 min or so per frame. However, the imaging plate has excellent dynamic range, 1 to >10^6 photons per pixel, and a considerably larger active area. It operates at room temperature.

As noted above, imaging plate systems usually provide a larger detector surface and slightly greater dynamic range than does the CCD detector, but frames from the latter system can be read much more rapidly than with imaging plate systems. Crystallographers doing small-molecule studies often tend to prefer CCD detectors, while imaging plate systems tend to be found in those laboratories doing structural studies of large biological molecules.

Both types of area detectors permit the crystallographer to collect data in a manner almost independent of cell size. With the older scintillation counter system, a doubling of the cell size translates into a doubling of the time needed to collect the data set (since the number of reflections would be doubled). On the other hand, with an area detector, as long as the detector does not have to be moved farther away from the source to increase the separation between reflections, the time for a data collection is essentially independent of cell size. Furthermore, if the crystal is rotated around one axis (e.g., the vertical axis) via a series of small oscillations and the associated frame data are stored to memory, most of the accessible data within a theta sphere would be collected. The remaining "blind region" could be collected by offsetting the crystal and performing a second limited rotation. Many commercial instruments use a four-circle diffractometer to provide maximum rotational capability. Due to the high sensitivity of the detector, a full data collection can be readily carried out with one of these systems in a fraction of a day for moderate-size unit cells.

METHOD AUTOMATION

Single-crystal X-ray methods involve the collection and processing of thousands to tens of thousands of X-ray intensities. Therefore, modern instrumentation comes with appropriate automation and related computer software to carry out this process in an efficient manner. Often, user input is confined to the selection of one of a few options at various stages in the data-collection process. The same is true of subsequent data processing, structure solution, and refinement.

DATA ANALYSIS AND INITIAL INTERPRETATION

Obtaining the Initial Model

There are a number of approaches that can be utilized to obtain an initial model of the structure appropriate for input into the refinement process. Many are specialized and tend to be used to "solve" particular kinds of structures, e.g., the use of isomorphous replacement methods for protein structure determination. Here we will primarily confine ourselves to two approaches: direct methods and the heavy-atom method. Direct methods employ probability relations relying on the magnitudes of the structure factors and a few known signs or phases to determine enough other phases to reveal the structure from an electron density map calculation. The possibility that phase information could be extracted from the structure factor magnitudes has intrigued investigators throughout the history of crystallography. Karle and Hauptman (1956), building on the earlier work of Harker and Kasper (1948) and Sayre (1952), developed equations that have made these methods practical. A much more complete discussion of direct methods can be found in the Appendix. One of the most important of the equations developed for the direct methods approach is the sigma-two equation:

E_h \approx \langle E_k E_{h-k} \rangle_k \qquad (39)
where E_h is derived from F_h and the angle brackets with subscript k represent the average over the k contributors. In practice, only those E's whose sign or phase has been determined are used in the average on the right. Since |E_h| is known, the equation is used to predict the most probable sign or phase of the reflection. The probability of E_h being positive is given by

P_+(h) = \frac{1}{2} + \frac{1}{2} \tanh\left( \sigma_3 \sigma_2^{-3/2} |E_h| \sum_k E_k E_{h-k} \right) \qquad (40)
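A sketch of eq. (40) follows; the E magnitudes and the sigma_2, sigma_3 values (which in practice are computed from the atomic numbers, sigma_n = sum of Z_j^n over the atoms) are invented here for illustration.

```python
import numpy as np

# Sketch of the sign probability of eq. (40) for a centrosymmetric structure.
def p_plus(E_h, Ek_Ehk_products, sigma2=1.0, sigma3=0.3):
    """P+(h) = 1/2 + 1/2 tanh[sigma3 sigma2^(-3/2) |E_h| sum_k E_k E_{h-k}]."""
    s = np.sum(Ek_Ehk_products)
    return 0.5 + 0.5 * np.tanh(sigma3 * sigma2 ** -1.5 * abs(E_h) * s)

# Three contributors whose signs are already known, all products positive:
p = p_plus(E_h=2.1, Ek_Ehk_products=[1.8, 2.4, 1.1])
```

With several large, consistently signed contributors the probability approaches unity, which is why phase determination starts from the strongest E values; with no contributors the probability is an uninformative 1/2.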
For the non-centrosymmetric case,

\tan \alpha_h = \frac{\sum_k |E_k E_{h-k}| \sin(\alpha_k + \alpha_{h-k})}{\sum_k |E_k E_{h-k}| \cos(\alpha_k + \alpha_{h-k})} \qquad (41)
Direct method computer programs use these equations and other related ones as described in the Appendix to attempt to obtain an approximate set of signs (or phases) directly from the magnitudes of the structure factors. These are then used to calculate an electron density map, which could reveal the approximate positions of most of the atoms in the structure. Further details on two of the more commonly used direct methods programs can be found in the Appendix. Direct methods can be expected to be the most reliable when applied to structures in which most of the atoms are approximately equal in atomic number. In structures where one or a few atoms are present that have appreciably larger atomic numbers than the rest, the heavy-atom method is typically the approach of choice. The basic assumption of the heavy-atom method is that these atoms alone provide good enough initial phasing of an electron density map that other atoms can be found and included, and the process repeated, until all atoms are located. See the Appendix for a more complete theoretical treatment of the heavy-atom method. The initial positions for the heavy atoms can come from direct methods or from an analysis of the Patterson function (see Appendix).

Derived Results

As a result of a crystal structure determination, one usually wishes to obtain, e.g., bond distances, bond angles, and least-squares planes, along with associated standard deviations. These can be determined from the least-squares results in a straightforward fashion. Consider the bond distance between two atoms A and B. In vector notation, using a_1, a_2, and a_3 to represent the unit cell vectors,

d_{AB} = |r_B - r_A| = [(r_B - r_A) \cdot (r_B - r_A)]^{1/2}
       = [(x_B - x_A)^2 a_1^2 + (y_B - y_A)^2 a_2^2 + (z_B - z_A)^2 a_3^2
          + 2(x_B - x_A)(y_B - y_A)\, a_1 \cdot a_2 + 2(y_B - y_A)(z_B - z_A)\, a_2 \cdot a_3
          + 2(x_B - x_A)(z_B - z_A)\, a_1 \cdot a_3]^{1/2} \qquad (42)

For atoms A, B, and C, the bond angle A-B-C can be obtained from

\text{Angle} = \cos^{-1}\left[ \frac{(r_A - r_B) \cdot (r_C - r_B)}{d_{AB}\, d_{BC}} \right] \qquad (43)

Furthermore, the standard deviation in the bond distance can be obtained from

\sigma^2(d) = \sum_j \left( \frac{\partial d}{\partial p_j} \right)^2 \sigma^2(p_j) \qquad (44)

if the parameters are uncorrelated. The parameters in this case are the fractional coordinates (x_A, y_A, z_A, x_B, y_B, z_B)
and their standard deviations (sigma(x_A), sigma(y_A), sigma(z_A), sigma(x_B), sigma(y_B), sigma(z_B)) and the cell dimensions and their standard deviations. [Usually the major contributors to sigma(d) are the errors in the fractional coordinates.] A similar expression can be written for the standard deviation in the bond angle. One can also extend these arguments to, e.g., torsion angles and least-squares planes. As noted above, anisotropic thermal parameters are usually obtained as part of a crystal structure refinement. One should keep in mind that R factors will tend to be lower on introduction of anisotropic temperature factors, but this could merely be an artifact due to the introduction of more parameters (five more for each atom). One of the most popular ways of viewing the structure, including the thermal ellipsoids, is via plots using ORTEP [a Fortran thermal ellipsoid plot program for crystal structure illustration (Johnson, 1965)]. ORTEP plots provide a visual way of helping the investigator decide if these parameters are physically meaningful. Atoms would be expected to show less thermal motion along bonds and more motion perpendicular to them. Atoms, especially those in similar environments, would be expected to exhibit similar thermal ellipsoids. Very long or very short principal moments in the thermal ellipsoids could be due to disordering effects, poor corrections for effects such as absorption, or an incorrect choice of space group, to mention just some of the more common causes.
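Equations (42) and (43) are conveniently evaluated with the metric tensor G_ij = a_i . a_j; the sketch below uses a hypothetical monoclinic cell and invented fractional coordinates.

```python
import numpy as np

# Sketch of eqs. (42)-(43) via the metric tensor of a monoclinic (b-unique) cell.
a, b, c, beta = 13.076, 10.309, 23.484, np.radians(92.93)
G = np.array([[a * a,                0.0,   a * c * np.cos(beta)],
              [0.0,                  b * b, 0.0                 ],
              [a * c * np.cos(beta), 0.0,   c * c               ]])  # G_ij = a_i . a_j

def distance(fA, fB):
    """Bond distance, eq. (42), from fractional coordinates."""
    d = np.asarray(fB) - np.asarray(fA)
    return np.sqrt(d @ G @ d)

def angle(fA, fB, fC):
    """Bond angle A-B-C in degrees, eq. (43)."""
    u = np.asarray(fA) - np.asarray(fB)
    v = np.asarray(fC) - np.asarray(fB)
    cosang = (u @ G @ v) / (np.sqrt(u @ G @ u) * np.sqrt(v @ G @ v))
    return np.degrees(np.arccos(cosang))

# Invented fractional coordinates for three bonded atoms:
A_pos, B_pos, C_pos = (0.10, 0.20, 0.30), (0.12, 0.25, 0.31), (0.18, 0.24, 0.33)
dAB = distance(A_pos, B_pos)
angABC = angle(A_pos, B_pos, C_pos)
```

Writing the quadratic form with G reproduces the six terms of eq. (42) exactly; for an orthorhombic or cubic cell the off-diagonal terms vanish and the expression reduces to the familiar Cartesian distance.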
Example

The compound (C5H5)(CO)2MnDBT, where DBT is dibenzothiophene, was synthesized in R. J. Angelici's group as part of their study of prototype hydrodesulfurization catalysts (Reynolds et al., 1999). It readily formed large, brown, parallelepiped-shaped crystals. A crystal of approximate dimensions 0.6 x 0.6 x 0.5 mm was selected and mounted on a glass fiber. (This crystal was larger than what would normally be used, but it did not cleave readily, and the absorption coefficient for this material is quite small, mu = 9.34 cm^-1. A larger than normal collimator was used.)
It was then placed on a Bruker P4 diffractometer, a four-circle diffractometer equipped with a scintillation counter. Molybdenum K-alpha radiation from a sealed-tube target with a graphite monochromator (lambda = 0.71069 A) was used. A set of 42 reflections was found by a random-search procedure in the range 10 <= 2-theta <= 25 degrees. These were carefully centered, indexed, and found to fit a monoclinic cell with dimensions a = 13.076(2), b = 10.309(1), c = 23.484(3) A, and beta = 92.93(1) degrees. Based on atomic volume estimates and a formula weight of 360.31 g/mol, a calculated density of 1.514 g/cm^3 was obtained with eight molecules per cell. Systematic absences (h0l with h + l not equal to 2n, and 0k0 with k not equal to 2n) indicated space group P21/n, with two molecules per asymmetric unit, which was later confirmed by successful refinement. Data were collected at room temperature (23 degrees C) using an omega-scan technique for reflections with 2-theta <= 50 degrees. A hemisphere of data was collected, yielding 12,361 reflections. Symmetry-related reflections were averaged (5556 reflections); 4141 had F > 3 sigma(F) and were considered observed. The data were corrected for Lorentz and polarization effects and were corrected for absorption using an empirical method based on psi scans. The transmission factors ranged from 0.44 to 0.57.
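The calculated density quoted above can be checked directly from the cell constants; this sketch uses only values given in the text.

```python
import numpy as np

# Check of the calculated density for the Mn-DBT example:
# Z = 8 molecules per cell, formula weight 360.31 g/mol.
a, b, c = 13.076, 10.309, 23.484        # cell edges, Angstroms
beta = np.radians(92.93)
V = a * b * c * np.sin(beta)            # monoclinic cell volume, A^3

N_A = 6.02214e23                        # Avogadro's number, mol^-1
Z, fw = 8, 360.31                       # molecules per cell; g/mol
density = Z * fw / (N_A * V * 1e-24)    # 1 A^3 = 1e-24 cm^3
print(round(density, 3))                # -> 1.514 g/cm^3, as reported
```

Agreement between this number and a plausible density for the compound class is a quick sanity check that the indexing and the assumed cell contents are mutually consistent.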
Table 3. Fractional Coordinates (Nonhydrogen Atoms) for One Molecule of Mn(SC12H9)(CO)2(C5H5)

Atom    X           Y           Z            Beq(a)
Mn1     .0356(1)    .0387(1)    .14887(6)    3.18(4)
S1      .0970(2)    .1032(2)    .2155(1)     3.38(6)
C1      .4723(8)    .2495(9)    .2839(4)     3.2(3)
C2      .4098(8)    .1431(9)    .2962(4)     3.5(3)
C3      .4538(10)   .0169(10)   .2953(4)     4.7(3)
C4      .5588(11)   .0055(12)   .2818(5)     5.0(4)
C5      .6168(10)   .1102(11)   .2692(5)     4.7(3)
C6      .5742(9)    .2367(11)   .2701(4)     4.1(3)
C7      .7902(8)    .1841(10)   .8059(4)     3.4(3)
C8      .3040(8)    .1813(9)    .3090(4)     3.4(3)
C9      .2202(9)    .1044(11)   .3220(5)     4.5(3)
C10     .1274(10)   .1599(13)   .3303(5)     5.0(3)
C11     .6169(10)   .2017(14)   .8285(5)     5.5(4)
C12     .6989(8)    .1259(11)   .8158(5)     4.4(3)
C13     .0334(10)   .1004(11)   .0620(4)     4.6(3)
C14     .9012(10)   .0103(13)   .9320(5)     5.1(4)
C15     .9462(10)   .1140(11)   .9141(4)     4.5(3)
C16     .0649(10)   .0685(11)   .9097(4)     4.3(3)
C17     .9348(10)   .0674(12)   .0753(4)     4.6(3)
C18     .9439(10)   .1038(11)   .1963(5)     4.4(3)
C19     .1292(10)   .1583(11)   .1670(4)     4.3(3)
O1      .8838(7)    .1452(9)    .2255(4)     6.6(3)
O2      .1901(7)    .2394(8)    .1754(4)     6.0(3)

(a) Beq = (8π²/3) Σᵢ₌₁³ Σⱼ₌₁³ Uᵢⱼ aᵢ*aⱼ*(aᵢ·aⱼ).
The SHELX direct method procedure was used and yielded positions for the heavier atoms and possible positions for many of the carbon atoms. An electron density map was calculated using the phases from the heavier atoms; this map yielded further information on the lighter atom positions, which were then added. Hydrogen atoms were placed in calculated positions, and all nonhydrogen atoms were refined anisotropically. The final cycle of full-matrix least-squares refinement converged with an unweighted residual R = Σ||F₀| − |F_c||/Σ|F₀| of 3.3% and a weighted residual R_w = [Σw(|F₀| − |F_c|)²/ΣwF₀²]^(1/2), where w = 1/σ²(F), of 3.9%. The maximum peak in the final electron density difference map was 0.26 e/Å³. Atomic scattering factors were taken from Cromer and Weber (1974), including corrections for anomalous dispersion. Fractional atomic coordinates for one of the molecules in the asymmetric unit are given in Table 3, and their anisotropic temperature factors in Table 4. The two independent molecules in the asymmetric unit have essentially identical geometries within standard deviations. A few selected bond distances and angles are given in Table 5 and Table 6, respectively. A thermal ellipsoid plot of one of the molecules is shown in Figure 8.
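The two residuals quoted for this refinement are simple functions of the observed and calculated structure factor magnitudes. A short illustration follows; the F₀/F_c values below are invented for demonstration, not taken from the actual data set:

```python
import numpy as np

def residuals(Fo, Fc, sigma):
    """Unweighted R and weighted Rw, with weights w = 1/sigma^2(F)."""
    Fo, Fc, sigma = map(np.asarray, (Fo, Fc, sigma))
    w = 1.0 / sigma**2
    # R  = sum ||Fo| - |Fc|| / sum |Fo|
    R = np.sum(np.abs(np.abs(Fo) - np.abs(Fc))) / np.sum(np.abs(Fo))
    # Rw = [ sum w (|Fo| - |Fc|)^2 / sum w Fo^2 ]^(1/2)
    Rw = np.sqrt(np.sum(w * (np.abs(Fo) - np.abs(Fc))**2) / np.sum(w * Fo**2))
    return R, Rw

# Illustrative (invented) magnitudes:
Fo = [10.0, 20.0, 30.0]
Fc = [9.0, 21.0, 30.0]
sig = [1.0, 1.0, 1.0]
R, Rw = residuals(Fo, Fc, sig)
print(f"R = {R:.4f}, Rw = {Rw:.4f}")
```

For a real data set, R of a few percent, as in the example above (3.3%), is the usual indication of a well-determined structure.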
SAMPLE PREPARATION The typical investigation starts obviously with the selection of a crystal. This is done by examining the material under a low- or moderate-power microscope. Samples that display sharp, well-defined faces are more likely to be single crystals. Crystals with average dimensions of a
SINGLE-CRYSTAL X-RAY STRUCTURE DETERMINATION
Table 4. Anisotropic Thermal Parameters (Nonhydrogen Atoms) for One Molecule of Mn(SC12H9)(CO)2(C5H5)

Atom    U11        U22       U33       U23       U13       U12
Mn1     5.11(10)   3.41(8)   3.56(7)   .07(6)    .38(6)    .33(7)
S1      5.5(2)     3.5(1)    3.8(1)    .1(1)     .1(1)     .4(1)
C1      5.6(7)     4.1(5)    2.6(4)    .0(4)     .3(4)     .2(5)
C2      6.1(7)     3.5(5)    3.6(5)    .6(4)     .6(5)     .1(5)
C3      8.8(9)     3.8(6)    5.1(6)    .6(5)     1.3(6)    .6(6)
C4      7.8(9)     5.8(7)    5.4(6)    1.5(6)    .0(6)     1.5(7)
C5      6.6(8)     5.7(7)    5.7(6)    1.8(6)    .0(6)     .7(6)
C6      5.2(7)     6.1(7)    4.2(5)    .7(5)     .4(5)     .3(6)
C7      4.6(6)     4.1(5)    4.1(5)    .1(4)     .6(5)     .6(5)
C8      5.6(6)     3.6(5)    3.7(5)    .1(4)     .5(5)     .6(5)
C9      6.5(8)     5.4(7)    5.2(6)    .1(5)     .5(6)     1.8(6)
C10     5.6(8)     6.5(8)    6.9(7)    .1(6)     .4(6)     .7(7)
C11     5.9(8)     7.9(9)    6.8(7)    .6(7)     .4(6)     .4(7)
C12     4.6(7)     5.9(7)    6.2(7)    .4(6)     .1(5)     .6(6)
C13     7.9(9)     6.1(7)    3.6(5)    .7(5)     .4(5)     .4(7)
C14     7.3(8)     8.0(9)    4.3(6)    .8(6)     .9(6)     .0(7)
C15     7.5(8)     6.2(7)    3.4(5)    1.1(5)    .3(5)     .5(7)
C16     7.3(8)     5.3(6)    3.6(5)    .5(5)     .0(5)     .2(6)
C17     7.3(8)     6.1(7)    3.9(5)    .3(5)     .4(5)     .2(6)
C18     6.4(8)     4.4(6)    5.7(6)    .2(5)     .9(6)     .2(6)
C19     6.7(8)     4.5(6)    5.2(6)    .4(5)     .8(6)     1.1(6)
O1      8.0(7)     9.1(7)    8.4(6)    2.1(6)    3.1(5)    3.0(6)
O2      8.1(7)     5.1(5)    9.6(7)    .6(5)     .2(5)     1.7(5)

U values are scaled by 100.
few tenths of a millimeter are often appropriate (somewhat larger for an organic material). (An approximately spherical crystal with diameter 2/μ, where μ is the absorption coefficient, would be optimum.) The longest dimension should not exceed the diameter of the X-ray beam; i.e., it is important that the crystal be completely bathed by the beam if observed intensities are to be compared to their calculated counterparts. If stable to the atmosphere, the crystal can be mounted on the end of a thin glass fiber or glass capillary using any of a variety of glues, epoxy, or even fingernail polish. If unstable, it can be sealed inside a thin-walled glass capillary. If the investigation is to be carried out at low temperature and the sample is unstable in air, it can be immersed in an oil drop and then quickly frozen; the oil should contain light elements and be selected so as to form a glassy solid when cooled.

Table 5. Selected Bond Distances (Å) for One Molecule of Mn(SC12H9)(CO)2(C5H5)

Mn1–S1    2.255(1)      C4–C5     1.381(5)
Mn1–C13   2.116(3)      C5–C6     1.387(4)
Mn1–C14   2.139(3)      C7–C8     1.397(4)
Mn1–C15   2.151(3)      C7–C12    1.376(4)
Mn1–C16   2.148(3)      C8–C9     1.391(4)
Mn1–C17   2.123(3)      C9–C10    1.376(5)
Mn1–C18   1.764(3)      C10–C11   1.405(5)
Mn1–C19   1.766(3)      C11–C12   1.377(5)
S1–C1     1.770(3)      C13–C14   1.424(5)
S1–C7     1.769(3)      C13–C17   1.381(5)
C1–C2     1.398(4)      C14–C15   1.412(5)
C1–C6     1.380(4)      C15–C16   1.393(4)
C2–C3     1.405(4)      C16–C17   1.422(4)
C2–C8     1.459(4)      C18–O1    1.166(4)
C3–C4     1.365(5)      C19–O2    1.164(4)
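The 2/μ rule of thumb discussed above, and the absorption behavior of the crystal in the earlier example, can be put in numbers. This is a sketch; the path lengths used are nominal values taken simply as the quoted crystal dimensions, not actual beam paths:

```python
import math

mu = 9.34                        # linear absorption coefficient, cm^-1 (from the example)

# Optimum diameter of an approximately spherical crystal: 2/mu
d_opt_mm = 2.0 / mu * 10.0       # convert cm to mm
print(f"optimum diameter ~ {d_opt_mm:.2f} mm")

# Transmission T = exp(-mu * t) for nominal path lengths through the crystal
for t_mm in (0.5, 0.6):
    T = math.exp(-mu * t_mm / 10.0)
    print(f"path {t_mm} mm: T = {T:.2f}")
```

The optimum diameter comes out near 2.1 mm, so the 0.6-mm crystal of the example is safely below it. Nominal 0.5–0.6-mm paths give T of roughly 0.57–0.63, and a longer diagonal path of roughly 0.9 mm gives T near 0.43, consistent in magnitude with the empirically measured transmission range (0.44 to 0.57) reported for that crystal.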
SPECIMEN MODIFICATION In most instances, little if any specimen modification occurs, especially when area detectors are employed for data collection, thereby reducing exposure times. In some cases, the X-ray beam can cause some decomposition of the crystal. Decomposition can often be reduced by collecting data at low temperatures.
PROBLEMS Single-crystal X-ray diffraction has an advantage over many other methods of structure determination in that there are many more observations (the X-ray intensities) than parameters (atomic coordinates and thermal parameters). The residual index (R-value) therefore usually serves as a good guide to the general reliability of the solution. Thermal ellipsoids from ORTEP should also be examined, as well as the agreement between distances involving similar atoms, taking into account their standard deviations. Ideally, such distance differences should be less than 3 sigma, but may be somewhat greater if packing or
Table 6. Selected Bond Angles (deg) for One Molecule of Mn(SC12H9)(CO)2(C5H5)

S1–Mn1–C18    92.5(1)      C1–C2–C8      112.3(2)
S1–Mn1–C19    93.3(1)      S1–C7–C8      112.2(2)
C18–Mn1–C19   92.8(1)      C2–C8–C7      112.3(2)
Mn1–S1–C1     113.5(1)     Mn1–C18–O1    177.6(3)
Mn1–S1–C7     112.7(1)     Mn1–C19–O2    175.0(3)
S1–C1–C2      112.1(2)
X-RAY TECHNIQUES
Figure 8. ORTEP drawing of Mn(SC12H9)(CO)2(C5H5).
systematic error effects are present. If thermal ellipsoids are abnormal, this can be an indication of a problem with some aspect of the structure determination. This might be due to a poor correction for absorption of X rays as they pass through the crystal, a disordering of atoms in the structure, extinction effects, or an incorrect choice of space group. Approximations are inherent in any absorption correction, and therefore it is best to limit such effects by choosing a smaller crystal or using a different wavelength for which the linear absorption coefficient would be less. The optimum thickness for a crystal is discussed above (see Sample Preparation). If an atom is disordered over two or more sites in the unit cell, an appropriate average of the electron density is observed, and an elongation of the thermal ellipsoid may result if these sites are only slightly displaced. Sometimes it is possible to include multiple occupancies in the refinement to reasonably model such behavior, with a constraint on the sum of the occupancies. If a crystal is very perfect, extinction effects may occur, i.e., a scattered X-ray photon is scattered a second time. If this occurs in the same "mosaic block," it is termed primary extinction; if it occurs in another "mosaic block," it is termed secondary extinction. The effect is most noticeable in the largest intensities and usually manifests itself in a situation in which ten or so of the largest intensities are found to have calculated values that exceed the observed ones. Mathematical techniques exist to approximately correct for such extinction effects, and most crystallographic programs include such options. Problems can be encountered if some dimension of the crystal exceeds the beam diameter. The crystal must be completely bathed in a uniform beam of X rays for all
orientations of the crystal, if accurate intensities are to be predicted for any model. The crystal should not decompose significantly (less than about 10%) in the X-ray beam. Some decomposition effects can be accounted for as long as they are limited.
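The extinction correction mentioned above can be illustrated with a one-parameter damping of the strongest calculated structure factors. The functional form below is a generic Zachariasen-style damping chosen purely for illustration; each crystallographic program documents its own exact expression:

```python
import numpy as np

def extinction_corrected(Fc, g):
    """Damp calculated structure factors; the strongest reflections are damped most."""
    Fc = np.asarray(Fc, dtype=float)
    return Fc / (1.0 + g * Fc**2) ** 0.25

Fc = np.array([5.0, 20.0, 80.0])                 # invented magnitudes
Fc_corr = extinction_corrected(Fc, g=1e-4)

# Fractional reduction grows with |Fc|, mimicking the observation that only
# the ten or so largest intensities show Fc noticeably exceeding Fo.
reduction = 1.0 - Fc_corr / Fc
print(reduction)
```

In a refinement, a parameter like g is adjusted along with the structural parameters until the strong reflections agree with observation.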
ACKNOWLEDGEMENTS This work was supported in part by the U.S. Department of Energy, Office of Basic Energy Sciences, under contract W-7405-Eng-82.
LITERATURE CITED

Berry, F. J. 1990. Mössbauer Spectroscopy. In Physical Methods of Chemistry: Determination of Structural Features of Crystalline and Amorphous Solids, Vol. V, 2nd ed. (B. W. Rossiter and J. F. Hamilton, eds.). pp. 273–343. John Wiley & Sons, New York.

Buerger, M. J. 1942. X-Ray Crystallography. John Wiley & Sons, New York.

Buerger, M. J. 1959. Vector Space. John Wiley & Sons, New York.

Cochran, W. and Woolfson, M. M. 1955. The theory of sign relations between structure factors. Acta Crystallogr. 8:1–12.

Cromer, D. T. and Weber, J. T. 1974. International Tables for X-Ray Crystallography, Vol. IV. Kynoch Press, Birmingham, England, Tables 2.2A, 2.3.1.

Germain, G., Main, P., and Woolfson, M. M. 1970. On the application of phase relationships to complex structures. II. Getting a good start. Acta Crystallogr. B26:274–285.
Hamilton, W. C. 1974. Angle settings for four-circle diffractometers. In International Tables for X-Ray Crystallography, Vol. IV (J. A. Ibers and W. C. Hamilton, eds.). pp. 273–284. Kynoch Press, Birmingham, England.

Harker, D. 1936. The application of the three-dimensional Patterson method and the crystal structures of proustite, Ag3AsS3, and pyrargyrite, Ag3SbS3. J. Chem. Phys. 4:381–390.

Harker, D. and Kasper, J. S. 1948. Phases of Fourier coefficients directly from crystal diffraction data. Acta Crystallogr. 1:70–75.

Hauptman, H. and Karle, J. 1953. Solution of the Phase Problem I. Centrosymmetric Crystal. American Crystallographic Association, Monograph No. 3. Polycrystal Book Service, Pittsburgh, PA.

Heald, S. M. and Tranquada, J. M. 1990. X-Ray Absorption Spectroscopy: EXAFS and XANES. In Physical Methods of Chemistry: Determination of Structural Features of Crystalline and Amorphous Solids, Vol. V, 2nd ed. (B. W. Rossiter and J. F. Hamilton, eds.). pp. 189–272. John Wiley & Sons, New York.

Hendrichs, P. M. and Hewitt, J. M. 1980. Solid-State Nuclear Magnetic Resonance. In Physical Methods of Chemistry: Determination of Structural Features of Crystalline and Amorphous Solids, Vol. V, 2nd ed. (B. W. Rossiter and J. F. Hamilton, eds.). pp. 345–432. John Wiley & Sons, New York.

Howells, E. R., Phillips, D. C., and Rogers, D. 1950. The probability distribution of x-ray intensities. II. Experimental investigation and the x-ray detection of centres of symmetry. Acta Crystallogr. 3:210–214.

International Tables for Crystallography. 1983. Vol. A: Space-Group Symmetry. D. Reidel Publishing Company, Dordrecht, Holland.

Jacobson, R. A. 1976. A single-crystal automatic indexing procedure. J. Appl. Crystallogr. 9:115–118.

Jacobson, R. A. 1997. A cosine transform approach to indexing. Z. Kristallogr. 212:99–102.

Johnson, C. K. 1965. ORTEP: A Fortran Thermal-Ellipsoid Plot Program for Crystal Structure Illustrations. Report ORNL-3794. Oak Ridge National Laboratory, Oak Ridge, TN.

Karle, J. and Hauptman, H. 1956. A theory of phase determination for the four types of non-centrosymmetric space groups 1P222, 2P22, 3P12, 3P22. Acta Crystallogr. 9:635–651.

Karle, J. and Karle, I. L. 1966. The symbolic addition procedure for phase determination for centrosymmetric and noncentrosymmetric crystals. Acta Crystallogr. 21:849–859.

Ladd, M. F. C. and Palmer, R. A. 1980. Theory and Practice of Direct Methods in Crystallography. Plenum, New York.

Ladd, M. F. C. and Palmer, R. A. 1985. Structure Determination by X-Ray Crystallography. Plenum Press, New York.

Lawton, S. L. and Jacobson, R. A. 1965. The Reduced Cell and Its Crystallographic Applications. USAEC Report IS-1141. Ames Laboratory, Ames, IA.

Lipscomb, W. N. and Jacobson, R. A. 1990. X-Ray Crystal Structure Analysis. In Physical Methods of Chemistry: Determination of Structural Features of Crystalline and Amorphous Solids, Vol. V, 2nd ed. (B. W. Rossiter and J. F. Hamilton, eds.). pp. 3–121. John Wiley & Sons, New York.

Majkrzak, C. F., Lehmann, M. S., and Cox, D. E. 1990. The Application of Neutron Diffraction Techniques to Structural Studies. In Physical Methods of Chemistry: Determination of Structural Features of Crystalline and Amorphous Solids, Vol. V, 2nd ed. (B. W. Rossiter and J. F. Hamilton, eds.). pp. 123–187. John Wiley & Sons, New York.

Niggli, P. 1928. Handbuch der Experimentalphysik, Vol. 7, Part 1. Akademische Verlagsgesellschaft, Leipzig.

Patterson, A. L. 1935. A direct method for the determination of components of interatomic distances in crystals. Z. Kristallogr. A90:517–542; Tabulated data for the seventeen phase groups. Z. Kristallogr. A90:543–554.

Reynolds, M. A., Logsdon, B. C., Thomas, L. M., Jacobson, R. A., and Angelici, R. J. 1999. Transition metal complexes of Cr, Mo, W and Mn containing η¹(S)-2,5-dimethylthiophene, benzothiophene, and dibenzothiophene ligands. Organometallics 18:4075–4081.

Richardson, J. W., Jr. and Jacobson, R. A. 1987. Computer-aided analysis of multi-solution Patterson superpositions. In Patterson and Pattersons (J. P. Glusker, B. K. Patterson, and M. Rossi, eds.). p. 311. Oxford University Press, Oxford.

Santoro, A. and Mighell, A. D. 1970. Determination of reduced cells. Acta Crystallogr. A26:124–127.

Sayre, D. 1952. The squaring method: A new method for phase determination. Acta Crystallogr. 5:60–65.

Sheldrick, G. M. 1990. Phase annealing in SHELX-90: Direct methods for larger structures. Acta Crystallogr. A46:467–473.

Stout, G. H. and Jensen, L. H. 1989. X-Ray Crystal Structure Determination. John Wiley & Sons, New York.

Wilson, A. J. C. 1944. The probability distribution of x-ray intensities. Acta Crystallogr. 2:318–321.

Woolfson, M. M. and Germain, G. 1970. On the application of phase relationships to complex structures. Acta Crystallogr. B24:91–96.

KEY REFERENCES

Lipscomb and Jacobson, 1990. See above.

Provides a much more detailed description of the kinematic theory of X-ray scattering and its application to single-crystal structure determination.

Ladd and Palmer, 1985. See above.

Another good text that introduces the methods of crystallography, yet does so without being too mathematical; also provides a number of problems following each chapter to facilitate subject comprehension.

Stout and Jensen, 1989. See above.

A good general text on X-ray diffraction and one that emphasizes many of the practical aspects of the technique.

APPENDIX

Direct Methods
There were a number of early attempts to use statistical approaches to develop relations that could yield the phases (or signs, for the centrosymmetric case) of some of the reflections. Harker and Kasper (1948), for example, developed a set of inequalities based on the use of Cauchy's inequality. A different approach was suggested by Sayre (1952). Although later superseded by the work of Karle and Hauptman (1956), this work is still of some interest since it affords some physical insight into these methods. Sayre (1952) noted that ρ² would be expected to look much like ρ, especially if the structure contains atoms that are all comparable in size. (Ideally, ρ should be nonnegative; it should contain peaks where atoms are and should equal zero elsewhere, and hence its square should
have the same characteristics with modified peak shape.) One can express ρ² as a Fourier series,

ρ²(xyz) = (1/V) Σ_h Σ_k Σ_l G_hkl exp[−2πi(hx + ky + lz)]   (45)

where the G's are the Fourier transform coefficients. Since ρ² would be made up of "squared" atoms, G_hkl can be expressed as

G_hkl = Σ_{j=1}^{n} g_j exp[2πi(hx_j + ky_j + lz_j)]   (46)

Now consider

G_hkl ≈ (g/f) F_hkl   (47)

if all atoms are assumed to be approximately equal. Therefore

ρ²(xyz) = (1/V) Σ_h Σ_k Σ_l (g/f) F_hkl exp[−2πi(hx + ky + lz)]   (48)

but also

ρ²(xyz) = (1/V²) Σ_h Σ_k Σ_l Σ_H Σ_K Σ_L F_hkl F_HKL exp{−2πi[(h + H)x + (k + K)y + (l + L)z]}   (49)

Comparing the two expressions for ρ² and letting h = k + H yields

(g/f) F_h = (1/V) Σ_k F_{h−k} F_k   (50)

Sayre's relation therefore implies that the structure factors are interrelated. The above relation would suggest that in a centrosymmetric structure, if F_k and F_{h−k} are both large in magnitude and of known sign, the sign of F_h would likely be given by the product F_k F_{h−k}, since this term will tend to predominate in the sum. Sayre's relation was primarily of academic interest, since in practice few F's with known signs have large magnitudes, and the possibilities for obtaining new phases or signs are limited.

Karle and Hauptman (1956), using a different statistical approach, developed a similar equation along with a number of other useful statistically based relations. They did so using E_hkl's that are related to the structure factor by defining

|E_hkl|² = |F_hkl|² / (ε Σ_j f_j²)   (51)

where f_j is the atomic scattering factor corrected for thermal effects and ε is a factor to account for degeneracy and is usually unity except for special projections. For the equal-atom case, a simple derivation can be given (Karle and Karle, 1966). Then

|E_h|² = |Σ_j f exp(2πi h·r_j)|² / (Nf²)   (52)

or

E_h = (1/N^(1/2)) Σ_{j=1}^{N} exp(2πi h·r_j)   (53)

Now consider

E_k E_{h−k} = [(1/N) Σ_j exp(2πi k·r_j)] [Σ_j exp(2πi(h − k)·r_j)]
            = (1/N) Σ_j exp(2πi h·r_j) + (1/N) Σ_i Σ_{j≠i} exp[2πi k·(r_j − r_i)] exp(2πi h·r_i)   (54)

If an average is now taken over a reasonably large number of such pairs, keeping h constant, then

⟨E_k E_{h−k}⟩ ≈ (1/N) Σ_j exp(2πi h·r_j) = (1/N^(1/2)) E_h   (55)

(The second term in Equation 54 will tend to average to zero.) Since the magnitudes are measured and our concern is with the phase, we write

E_h ∝ ⟨E_k E_{h−k}⟩_k   (56)

Karle and Hauptman showed, using a more sophisticated mathematical approach, that a similar relation holds for the unequal-atom case. Equation 56 is often referred to as the sigma-two (Σ₂) relation. For the noncentrosymmetric case, this is more conveniently written (Karle and Hauptman, 1956) as

tan α_h = [Σ_k |E_k E_{h−k}| sin(α_k + α_{h−k})] / [Σ_k |E_k E_{h−k}| cos(α_k + α_{h−k})]   (57)

Karle and Hauptman argued that the relation should hold for an average using a limited set of data if the |E|'s are large, and Cochran and Woolfson (1955) obtained an approximate expression for the probability of the sign of E_h being positive from the Σ₂ relation,

P₊(h) = 1/2 + (1/2) tanh[σ₃ σ₂^(−3/2) |E_h| Σ_k E_k E_{h−k}]   (58)
where, Z_j being the atomic number of atom j,

σ_n = Σ_j Z_j^n   (59)

The strategy then is to start with a few reflections whose phases are known and extend these using the Σ₂ relation, accepting only those indications that have a high probability (typically >95%) of being correct. The process is then repeated until no significant further changes in phases occur. However, to employ the above strategy, phases of a few of the largest |E|'s must be known. How does one obtain an appropriate starting set? If working in a crystal system of orthorhombic symmetry or lower, three reflections can be used to specify the origin. Consider a centrosymmetric space group. The unit cell has centers of symmetry not only at (0, 0, 0) but also halfway along the cell in any direction. Therefore the cell can be displaced by 1/2 in any direction and an equally valid structural solution obtained (same set of intensities) using the tabulated equivalent positions of the space group. If in the structure factor expression all x_j were replaced by 1/2 + x_j, then

E_hkl^new = (−1)^h E_hkl^old   (60)

Therefore the signs of the structure factors for h odd would change while the signs for h even would not. Thus specifying a sign for an h-odd reflection amounts to specifying an origin in the x direction. A similar argument can be made for the y and z directions as long as the reflection choices are independent. Such choices will be independent if they obey

h1 + h2 + h3 ≠ (e, e, e)   (61a)
h1 + h2 ≠ (e, e, e)   (61b)
h1 ≠ (e, e, e)   (61c)

where e indicates an even value of the index.

Another useful approach to the derivation of direct method equations is through the use of what are termed structure semivariants and structure invariants. Structure semivariants are those reflections or linear combinations of reflections that are independent of the choice of permissible origins, such as has just been described above. Structure invariants are certain linear combinations of the phases whose values are determined by the structure alone and are independent of the choice of origin. We will consider here two subgroups, the three-phase (triplet) structure invariant and the four-phase (quartet) structure invariant. A three-phase structure invariant is a set of three reciprocal vectors h1, h2, and h3 that satisfy the relation

h1 + h2 + h3 = 0   (62)

and, in terms of phases,

φ₃ = φ_h1 + φ_h2 + φ_h3   (63)

Let

A = σ₃ σ₂^(−3/2) |E_h1 E_h2 E_h3|   (64)

then φ₃ is found to be distributed about zero with a variance that is dependent on A: the larger its value, the narrower the distribution. It is obviously then another version of the Σ₂ relation. A four-phase structure invariant is similarly four reciprocal vectors h1, h2, h3, and h4 that satisfy the relation

h1 + h2 + h3 + h4 = 0   (65)

Let

B = σ₃ σ₂^(−3/2) |E_h1 E_h2 E_h3 E_h4|   (66)

and

φ₄ = φ_h1 + φ_h2 + φ_h3 + φ_h4   (67)

If furthermore three additional magnitudes are known, |E_{h1+h2}|, |E_{h2+h3}|, and |E_{h3+h1}|, then, in favorable cases, a more reliable estimate of φ₄ may be obtained, and furthermore, the estimate may lie anywhere in the interval 0 to π. If all seven of the above |E|'s are large, then it is likely that φ₄ = 0. However, it can also be shown that for the case where |E_h1|, |E_h2|, |E_h3|, and |E_h4| are large but |E_{h1+h2}|, |E_{h2+h3}|, and |E_{h3+h1}| are small, the most probable value of φ₄ is π. The latter is sometimes referred to as the negative-quartet relation. (For more details see Ladd and Palmer, 1980.)

Normally additional reflections will be needed to obtain a sufficient starting set. Various computer programs adopt their own specialized procedures to accomplish this end. Some of the more common approaches include random choices for signs or phases for a few hundred reflections followed by refinement and extension; alternately, phases for a smaller set of reflections, chosen to maximize their interaction with other reflections, are systematically varied, followed by refinement and extension (for an example of the latter, see Woolfson and Germain, 1970; Germain et al., 1970). Various figures of merit have also been devised by the programmers to test the validity of the phase sets so obtained. Direct method approaches readily lend themselves to the development of automatic techniques. Over the years, they have been extended and refined to make them generally applicable, and this has led to their widespread use for deducing an initial model. Computer programs such as the SHELX direct method programs (Sheldrick, 1990) use both the Σ₂ (three-phase structure invariant) and the negative four-phase quartet to determine phases. Starting phases are produced by random-number generation techniques and then refined using

new φ_h = phase of [α − Z]   (68)
where α is defined by

α = 2|E_h| E_k E_{h−k} / N^(1/2)   (69)

and Z is defined by

Z = g|E_h| E_k E_l E_{h−k−l} / N   (70)

N is the number of atoms (for an equal-atom case), and g is a constant to compensate for the smaller absolute value of Z compared to α. In the calculation of Z, only those quartets are used for which all three cross terms have been measured and are weak. An E map is then calculated for the refined phase set giving the best figure of merit.
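Sayre's relation (Equation 50) is exact for an idealized structure of identical point atoms, and this is easy to verify numerically. In the sketch below, the "crystal" is a one-dimensional cell sampled at V grid points containing unit point atoms, so that ρ² = ρ and the convolution theorem gives F_h = (1/V) Σ_k F_k F_{h−k} exactly; the atom positions are arbitrary choices made for the demonstration:

```python
import numpy as np

V = 32                                  # number of grid points in the 1D cell
rho = np.zeros(V)
rho[[3, 10, 17, 25]] = 1.0              # four identical unit point atoms
# For 0/1 point atoms, rho squared equals rho, so g/f = 1 in Equation 50.

F = np.fft.fft(rho)                     # structure factors F_h

# Right-hand side of Sayre's equation: (1/V) * sum over k of F_k * F_{h-k}
ks = np.arange(V)
rhs = np.array([np.sum(F[ks] * F[(h - ks) % V]) for h in range(V)]) / V

print(np.allclose(F, rhs))              # the self-convolution reproduces F
```

With real data the equality is only approximate (atoms are neither points nor identical, and only some structure factors are known), which is why the relation is used statistically, as in Equations 56 to 58.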
Heavy-Atom Method

Direct methods can be expected to be the most reliable when applied to structures in which most of the atoms are approximately equal in atomic number. In structures where one or a few atoms are present that have appreciably larger atomic numbers than the rest, the heavy-atom method is typically the approach of choice. Assume a structure contains one atom in the asymmetric unit that is "heavier" than the rest. Also assume for the moment that the position of this atom (r_H) is known. The structure factor can be written as

F_hkl = f_H exp(2πi h·r_H) + Σ_{j=2}^{N} f_j exp(2πi h·r_j)   (71)

Alternately the structure factor can be written as

F_hkl = F_hkl^heavy + F_hkl^other   (72)

where "heavy" denotes the contribution to the structure factor from the heavy atom(s) and "other" represents the contribution from the remaining atoms in the structure. The latter contains many small contributions, which can be expected to partially cancel one another in most reflections. Therefore, if the "other" contribution does not contain too large a number of these smaller contributors, one would expect the sign (or the phase) of the observed structure factor to be approximately that calculated from F_hkl^heavy, although the agreement in terms of magnitudes would likely be poor. Thus the approach is to calculate F_hkl^heavy and transfer its sign or phase to |F_hkl^obs|, unless |F_hkl^heavy| is quite small. These phased |F_hkl^obs| are used to calculate an electron density map, which is examined to find additional atom positions. (As a general rule of thumb, atoms often appear with about 1/3 of their true height if not included in the structure factor calculation.) Those atoms that appear, especially if they are in chemically reasonable positions, are then added to the structure factor calculation, giving improved phases, and the process is repeated until all the atoms have been found. For this approach to be successful, it is usually necessary that Σ Z_H² be comparable to, or greater than, Σ Z_L².

How does one go about finding the position of the heavy atom or atoms, assuming the statistics indicate that the heavy-atom approach could be successful? One possibility is to use direct methods as discussed earlier. Although direct methods may not provide the full structure, the largest peaks on the calculated map may well correspond to the positions of the heavier atoms. An alternate approach is to deduce the position of the heavy atom(s) through the use of a Patterson function (Patterson, 1935). The Patterson function is an autocorrelation function of the electron density:

P(u) = ∫ ρ(r) ρ(r + u) dτ   (73)

If the Fourier series expressions for ρ are substituted into this equation, it can be readily shown that

P(u) = (1/V) Σ_h Σ_k Σ_l |F_hkl|² cos(2π h·u)   (74)

The Patterson function therefore can be directly calculated from the intensities. From a physical point of view, a peak would be expected in the Patterson function any time two atoms are separated by a vector displacement u. Since a vector could be drawn from A to B or from B to A, a centrosymmetric function would be expected, consistent with a cosine function. Moreover, the heights of the peaks in the Patterson function should be proportional to the products of the atomic electron densities, i.e., Z_i Z_j. If a structure contained nickel and oxygen atoms, the heights of the Ni–Ni, Ni–O, and O–O vectors would be expected to be in the ratio 784:224:64 for peaks representing single interactions.

Most materials crystallize in unit cells that have symmetry higher than triclinic. This symmetry can provide an additional very useful tool in the analysis of the Patterson function. All Patterson functions have the same symmetry as the Laue group associated with the related crystal system. Often peaks corresponding to vectors between symmetry-related atoms occur in special planes or lines. This is probably easiest to see with an example. Consider the monoclinic space group P2₁/c. The Patterson function would have 2/m symmetry, the symmetry of the monoclinic Laue group. Moreover, because of the general equivalent positions in this space group, namely (x, y, z), (x, 1/2 − y, 1/2 + z), (−x, 1/2 + y, 1/2 − z), and (−x, −y, −z), peaks between symmetry-related atoms would be found at (0, 1/2 − 2y, 1/2), (0, 1/2 + 2y, 1/2), (−2x, 1/2, 1/2 − 2z), (2x, 1/2, 1/2 + 2z), (−2x, 2y, −2z), (2x, −2y, 2z), (−2x, −2y, −2z), and (2x, 2y, 2z), as deduced by obtaining all the differences between these equivalent positions. Such vectors are often termed Harker vectors (Harker, 1936; Buerger, 1959). In the case of the first four peaks, two differences give the same values and yield a double peak on the Patterson function. This set of Patterson peaks can be used to determine the position of a heavy atom.
First we should again note that the peaks corresponding to vectors between heavy atoms should be larger than the other types. For this space group, one can select those peaks with u = 0 and w = 1/2 and, from their v coordinate, determine y possibilities. In a similar fashion, by selecting those large peaks with v = 1/2 (remember peaks in both of these categories
must be double), an x and z pair can be determined. These results can be combined to predict the 2x, 2y, 2z type peak position, which can then be tested to see whether the combination is a valid one. The same process can be carried out if more than one heavy atom is present per asymmetric unit; the analysis just becomes somewhat more complicated. [There is another category of methods, termed Patterson superposition methods, which are designed to systematically break down the multiple images of the structure present in the Patterson function to obtain essentially a single image of the structure. Because of space limitations, they will not be discussed here. An interested reader should consult Richardson and Jacobson (1987) and Lipscomb and Jacobson (1990) and references therein for further details.] It may also be appropriate here to remind the reader that the intensities are invariant if the structure is acted upon by any symmetry element in the Laue group or is displaced by half-cell displacements. Two solutions differing only in this fashion are equivalent.

ROBERT A. JACOBSON
Iowa State University
Ames, Iowa
XAFS SPECTROSCOPY

INTRODUCTION

X rays, like other forms of electromagnetic radiation, are both absorbed and scattered when they encounter matter. X-ray scattering and diffraction are widely utilized structural methods employed in thousands of laboratories around the world. X-ray diffraction techniques are undeniably among the most important analysis tools in nearly every physical and biological science. As this unit will show, x-ray absorption measurements are achieving a similar range of application and utility. Absorption methods are often complementary to diffraction methods in terms of the area of application and the information obtained. The basic utility of x-ray absorption arises from the fact that each element has characteristic absorption energies, usually referred to as absorption edges. These occur when the x rays exceed the energy necessary to ionize a particular atomic level. Since this opens a new channel for absorption, the absorption coefficient shows a sharp rise. Some examples are shown in Figure 1A for elements in a high-temperature superconductor. Note that the spectra for each element can be separately obtained and have distinctly different structure. Often the spectra are divided into two regions. The region right at the edge is often called the XANES (x-ray absorption near-edge structure), and the region starting 20 or 30 eV past the edge is referred to as the EXAFS (extended x-ray absorption fine structure). The isolated EXAFS structure is shown in Figure 1B. Recent theories have begun to treat these in a more unified manner, and the trend is to refer to the entire spectrum as the XAFS (x-ray absorption fine structure) and the technique in general as XAS (x-ray absorption spectroscopy). Historically the principal quantum number of the initial-state atomic level is labeled by the letters K, L, M, N, . . . for n = 1, 2, 3, 4, . . . , and the angular momentum state of the level is denoted by subscripts 1, 2, 3, 4, . . . for the s, p1/2, p3/2, d3/2, . . . levels. The most common edges used for XAFS are the K edge (1s initial state; the subscript 1 is omitted since it is the only possibility) and the L3 edge (2p3/2 initial state). The utility of XAS is demonstrated in Figure 1. The different elements have different fine structure because they have different local atomic environments. In this unit it will be shown how these spectra can be analyzed to obtain detailed information about each atom's environment. This includes the types of neighboring atoms, their distances, the disorder in these distances, and the type of bonding. The near-edge region is more sensitive to chemical effects and can often be used to determine the formal valence of the absorbing atom as well as its site symmetry. In the simplest picture, the spectra are a result of quantum interference of the photoelectron generated by the absorption process as it is scattered from the neighboring atoms. This interference pattern is, of course, related to the local arrangement of atoms causing the scattering. As the incoming x-ray energy is changed, the energy of the photoelectron also varies along with its corresponding
Figure 1. Examples of x-ray absorption data from the high-temperature superconductor material YBa2Cu3O7. (A) X-ray absorption. (B) Normalized extended fine structure vs. wave vector extracted from the spectra in (A). The top spectra in both plots are for the Y K edge at 17,038 eV, and the bottom spectra are for the Cu K edge at 8,980 eV. Both edges were obtained at liquid nitrogen temperature.
X-RAY TECHNIQUES
wavelength. Therefore, the interference from the different surrounding atoms goes in and out of phase, giving the oscillatory behavior seen in Figure 1B. Each type of atom also has a characteristic backscattering amplitude and phase shift variation with energy. This allows different atom types to be distinguished by the energy dependence of the phase and amplitude of the different oscillatory components of the spectrum. This simple picture will be expanded below (see Principles of the Method) and forms the basis for the detailed analysis procedures used to extract the structural parameters. It is important to emphasize that the oscillations originate from a local process that does not depend on long-range order. XAFS will be observed any time an atom has at least one well-defined neighbor; in addition to well-ordered crystalline or polycrystalline materials, it has been observed in molecular gases, liquids, and amorphous materials. The widespread use of x-ray absorption methods is intimately connected to the development of synchrotron radiation sources. The measurements require a degree of tunability and intensity that is difficult to obtain with conventional laboratory sources. Because of this need, and because advances in theoretical understanding came at about the same time as the first major synchrotron sources were developed, the modern application of absorption methods to materials analysis is only about 25 years old. However, as a physical phenomenon to be understood it has a long history, with extensive work beginning in the 1930s. For a review of the early work, see Azaroff and Pease (1974) and Stumm von Bordwehr (1989). This unit will concentrate on the modern application of XAS at synchrotron sources. Conducting experiments at remote facilities is a process with its own style that may not be familiar to new practitioners. Some issues related to working at synchrotrons will be discussed (see Practical Aspects of the Method).
Comparison to Other Techniques

There are a wide variety of structural techniques; they can be broadly classified as direct and indirect methods. Direct methods give signals that directly reflect the structure of the material. Diffraction measurements can, in principle, be directly inverted to give the atomic positions. Of course, the difficulty in measuring both the phase and amplitude of the diffraction signals usually precludes such an inversion. However, the diffraction pattern still directly reflects the underlying symmetry of the crystal lattice, and unit cell symmetry and size can be simply determined. More detailed modeling or determination of phase information is required to place the atoms accurately within the unit cell. Because there is such a direct relation between structure and the diffraction pattern, diffraction techniques can often provide unsurpassed precision in the atomic distances. Indirect methods are sensitive to structure but typically require modeling or comparison to standards to extract the structural parameters. This does not mean they are not extremely useful structural methods, only that the signal does not have a direct and obvious relation to the structure. Examples of such methods include Mössbauer spectroscopy (MOSSBAUER SPECTROMETRY), nuclear magnetic resonance (NMR; NUCLEAR MAGNETIC RESONANCE IMAGING), electron paramagnetic resonance (EPR; ELECTRON PARAMAGNETIC RESONANCE SPECTROSCOPY), and Raman spectroscopy (RAMAN SPECTROSCOPY OF SOLIDS). The XANES part of the x-ray absorption signal fits into this category. As will be shown, strong multiple-scattering and chemical effects make direct extraction of the structural parameters difficult, and analysis usually proceeds by comparison of the spectra with calculated or measured models. On the other hand, as is the case with many of the other indirect methods, the complicating factors can often be used to advantage to learn more about the chemical bonding. The indirect methods usually share the characteristic that they are atomic probes. That is, they are foremost measures of the state of an individual atom. The structure affects the signal since it affects the atomic environment, but long-range order is usually not required. Also, atomic concentration is not important in determining the form of the signal, only its detectability. Therefore, exquisite sensitivity to low concentrations is often possible. This is also true for x-ray absorption. Methods that allow detection of structural information at concentrations for which diffraction techniques would be useless will be described under Detection Methods, below. In some respects, the EXAFS part of the absorption signal shares the characteristics of both direct and indirect structural methods. The absorbing atom acts as both the source and detector of the propagating photoelectrons. The measured signal contains both phase and amplitude information and, as in direct methods, can be inverted to obtain structural information. However, the photoelectrons interact strongly with the surrounding atoms, which modifies both their amplitude and phase. Therefore, a direct inversion (Fourier transform) of the spectrum gives only a qualitative view of the local structure.
This qualitative view can be informative, but modeling is usually required to extract detailed structural parameters.

PRINCIPLES OF THE METHOD

Single-Scattering Picture

When an x ray is absorbed, most of the time an electron is knocked out of the atom, which results in a photoelectron with energy E = Ex − Eb, where Ex is the x-ray energy and Eb is the binding energy of the electron. The x-ray edge occurs when Ex = Eb. The photoelectron propagates as a spherical wave with a wave vector given by

k = √(2meE)/ℏ   (1)

As shown in Figure 2, this wave can be scattered back to the central atom and interfere with the original absorption. In a classical picture this can seem like a strange concept, but the process must be treated quantum mechanically, and the absorption cross-section is governed by Fermi's golden rule:

σK = 4π²αℏω Σf |⟨f|ε·r|K⟩|² δ(Ef + EK − ℏω)   (2)

Figure 2. Two-dimensional schematic of the EXAFS scattering process. The photoelectron originates from the central (black) atom and is scattered by the four neighbors. The rings represent the maxima of the photoelectron direct and scattered waves.

where ε is the electric field vector, r is the radial distance vector, α is the fine structure constant, ω is the angular frequency of the photon, and K and f are the initial state (e.g., the 1s state in the K shell) and the final state, respectively. The sum is over all possible final states of the photoelectron. It is necessary to consider the overlap of the initial core state with a final state consisting of the excited atom and the outgoing and backscattered photoelectron. Since the photoelectron has a well-defined wavelength, the backscattered wave will have a well-defined phase relative to the origin (deep core electrons characteristic of x-ray edges are highly localized at the center of the absorbing atom). This phase will vary with energy, resulting in a characteristic oscillation of the absorption probability as the two waves go in and out of phase. Using this simple idea and considering only single-scattering events gives the following simple expression (Sayers et al., 1971; Ashley and Doniach, 1975; Lee and Pendry, 1975):

χ(k) = [μ(k) − μ0(k)]/μ0(k) = Σj [Nj/(kRj²)] Aj(k) pj(ε) sin[2kRj + Φj(k)]   (3)

where χ(k) is the normalized oscillatory part of the absorption, determined by subtracting the smooth part of the absorption, μ0, from the total absorption, μ, and normalizing by μ0. The sum is over the neighboring shells of atoms, each of which consists of Nj atoms at a similar distance Rj. The Rj² factor accounts for the fall-off of the spherical photoelectron wave with distance. The sine term accounts for the interference: the total path length of the photoelectron as it travels to a shell and back is 2Rj, which gives the phase 2kRj. The factor pj(ε) accounts for the polarization of the incoming x rays. The remaining factors, Aj(k) and Φj(k), are overall amplitude and phase factors. It is these two factors that must be calibrated by theory or experiment. The amplitude factor can be further broken down to give

Aj(k) = S0² Fj(k) Qj(k) e^(−2Rj/λ)   (4)

The amplitude reduction factor S0² results from multielectron excitations in the central atom (the simple picture assumes only a single photoelectron is excited). This factor depends on the central atom type and typically has nearly k-independent values from 0.7 to 0.9. The magnitude Fj(k) of the complex backscattering amplitude factor fj(k, π) depends on the type of scattering atom. Often its k-dependence can be used to determine the type of backscattering atom. The factor Qj(k) accounts for any thermal or structural disorder in the jth shell of atoms; it will be discussed in more detail later. The final exponential factor is a mean free path factor that accounts for inelastic scattering and core hole lifetime effects. The phase term can be broken down to give

Φj(k) = 2φc + θj(k) + φj(k)   (5)

where θj(k) is the phase of the backscattering factor fj(k, π), φc is the p-wave phase shift caused by the potential of the central atom, and φj(k) is a phase factor related to the disorder of the jth shell. In the past, the phase and amplitude terms were generally calibrated using model compounds that had known structures and chemical environments similar to those of the substance being investigated. The problem with this approach is that suitable models were often difficult to come by, since a major justification for applying XAFS is the lack of other structural information. In recent years the theory has advanced to the state where it is often more accurate to make a theoretical determination of the phase and amplitude factors. Using theory also allows for proper consideration of multiple-scattering complications to the simple single-scattering picture presented so far. This will be discussed below (see Data Analysis and Initial Interpretation). Figures 3, 4, and 5 show some calculated
Figure 3. Theoretical backscattering amplitude versus photoelectron wave vector k for some representative backscattering atoms. Calculated using FEFF 5.0 (Zabinsky et al., 1995).
Figure 4. Theoretical backscattering phase shift versus photoelectron wave vector k for some representative backscattering atoms. Calculated using FEFF 5.0 (Zabinsky et al., 1995).
amplitudes and phases, illustrating two main points. First, there are significant differences in the backscattering from different elements. Therefore, it is possible to distinguish between different types of atoms in a shell. For neighboring atoms such as Fe and Co, the differences are generally too small to distinguish them reliably, but if the atomic numbers of atoms in a shell differ by more than 10%, their contributions can be separated. The figures illustrate this nicely. Platinum is quite different from the other elements and is easily distinguished. Silicon and carbon have similar backscattering amplitudes that could be difficult to distinguish if the additional amplitude variations from disorder terms are not known. However, their backscattering phases differ by about a factor of π, which means their oscillations will be out of phase and easily distinguished. Similarly, copper and carbon have almost the same phase but distinctly different amplitudes. The second point is that the phase shifts are fairly linear with k. This means that the sinusoidal oscillations in Equation 3 will maintain their periodic character; the additional phase term will primarily result in a constant frequency shift. Each shell of atoms will still have a characteristic frequency, and a Fourier transform of the spectrum can be used to separate them. This is shown in Figure 6, which is a Fourier transform of the Cu data in Figure 1B.

Figure 6. Magnitude of the k-weighted Fourier transform of the Cu spectrum in Figure 1B. The k range from 2.5 to 16 Å⁻¹ was used.

The magnitude of the Fourier transform is related to the partial radial distribution function of the absorbing atoms. The peaks are shifted to lower R by the phase shift terms, and the amplitude of the peaks depends on the atomic type, the number of atoms in a shell, and the disorder of the shell. This is a complex structure, and there are no well-defined peaks that are well separated from the others. Methods that can be used to extract the structural information are described below (see Data Analysis and Initial Interpretation). This is high-quality data, and all the structure shown is reproducible and real. However, above 4 Å the structure is a composite of such a large number of single- and multiple-scattering contributions that it is not possible to analyze. This is typical of most structures, which become too complex to analyze at distances above 4 to 5 Å. It is also important to note that, because of the additional phase and amplitude terms, the transform does not give the true radial distribution. It should only be used as an intermediate step in the analysis and as a method for making a simple qualitative examination of the data. It is very useful for assessing the noise in the data and for pointing out artifacts that can result in nonphysical peaks.

EXTENSIONS TO THE SIMPLE PICTURE

Disorder
Figure 5. Theoretical central atom phase shift versus photoelectron wave vector k for some representative absorbing atoms. Calculated using FEFF 5.0 (Zabinsky et al., 1995).
The x-ray absorption process is energetic and takes place on a typical time scale of 10⁻¹⁵ s. This is much faster than typical vibrational periods of atoms, which means that x-ray absorption takes a snapshot of the atomic configuration. Even for an ideal shell of atoms all at a single distance, thermal vibrations will result in a distribution about the average Rj. In addition, for complex or disordered structures there can be a structural contribution to the distribution in Rj. This results in the disorder factors Qj(k) and φj(k) given above. To a good approximation these factors are given by the Fourier transform of the real-space distribution Pj(r):

Qj(k) exp[iφj(k)] = ∫ dr Pj(r) exp[i2k(r − Rj)]   (6)
The simplest case is a well-ordered shell at low temperature, where the harmonic approximation for the vibrations applies. Then the distribution is Gaussian, and its transform is also Gaussian. This results in φj(k) = 0 and a Debye-Waller-like term for Q: Qj(k) = exp(−2k²σj²), where σj² is the mean-square width of the distribution. This simple approximation is often valid and provides a good starting point for analysis. However, it is a fairly extreme approximation and can break down. If σ² > 0.01 Å², more exact methods should be used that include the contribution to the phase (Boyce et al., 1981; Crozier et al., 1988; Stern et al., 1992). One commonly used method, included in many analysis packages, is the cumulant expansion (Bunker, 1983). The changes in the Debye-Waller term with temperature can often be determined relatively accurately. It can be quite sensitive to temperature, and low-temperature measurements will often produce larger signals and higher data quality. Although the EXAFS disorder term is often called a Debye-Waller factor, this is not strictly correct. The EXAFS Debye-Waller factor is different from that determined by x-ray diffraction or Mössbauer spectroscopy (Beni and Platzman, 1976). Those techniques measure the mean-square deviation from the ideal lattice position. The EXAFS term is sensitive to the mean-square deviation in the bond length. Long-wavelength vibrational modes, in which neighboring atoms move together, make relatively little contribution to the EXAFS Debye-Waller term but can dominate the conventional Debye-Waller factor. Thus, the temperature dependence of the EXAFS Debye-Waller term is a sensitive measure of the bond strength, since it directly measures the relative vibrational motion of the bonded atoms.
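The claim that a Gaussian shell distribution yields φj(k) = 0 and Qj(k) = exp(−2k²σj²) can be checked directly against Equation 6. The sketch below integrates the transform numerically; the values R = 2.1 Å and σ² = 0.005 Å² are illustrative, not taken from the text.

```python
import math

# Numerically evaluate Eq. 6 for a Gaussian shell distribution P_j(r)
# and compare with the analytic Debye-Waller form exp(-2 k^2 sigma^2).
# R and s2 are illustrative values only.
R = 2.1           # mean shell radius, Angstrom
s2 = 0.005        # mean-square width sigma^2, Angstrom^2
s = math.sqrt(s2)
k = 10.0          # photoelectron wave vector, 1/Angstrom

n, span = 4001, 6 * s                # integration grid over R +/- 6 sigma
dr = 2 * span / (n - 1)
re = im = 0.0
for i in range(n):
    r = R - span + i * dr
    P = math.exp(-(r - R) ** 2 / (2 * s2)) / (s * math.sqrt(2 * math.pi))
    re += P * math.cos(2 * k * (r - R)) * dr
    im += P * math.sin(2 * k * (r - R)) * dr

Q = math.hypot(re, im)                 # amplitude factor Q_j(k)
phi = math.atan2(im, re)               # disorder phase phi_j(k)
analytic = math.exp(-2 * k ** 2 * s2)  # = exp(-1) for these values
```

For these values Q agrees with exp(−1) ≈ 0.368 and the phase vanishes by symmetry; an asymmetric Pj(r) would instead produce a nonzero φj(k), which is what the cumulant-expansion methods cited above are designed to handle.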
L Edges

The preceding discussion is limited to the K edge, since that has a simple 1s initial state. For an L or higher edge the situation can be more complicated. For the L2 or L3 edge, the initial state has 2p symmetry, and the final state can have either s or d symmetry. It is also possible to have mixed final states, where an outgoing d wave is scattered back as an s state or vice versa. The result is three separate contributions to the function χ(k), which can be denoted χ00, χ22, and χ20, where the subscripts refer to the angular momentum states of the outgoing and backscattered photoelectron wave. Each of these can have different phase and amplitude functions. The total EXAFS can then be expressed as follows (Lee, 1976; Heald and Stern, 1977):

χ = [M21² χ22 + M01² χ00 + 2 M01 M21 χ20] / [M21² + M01²/2]   (7)

The matrix element terms (e.g., M01) refer to radial dipole matrix elements between the core wave function with l = 1 and the final states with l = 0 and 2. For the K edge there is only one possibility for M, which cancels out. The L1 edge, which has a 2s initial state, has a similar cancellation and can be treated the same as the K edge.
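The relative importance of the three terms in Equation 7 follows directly from the matrix-element ratio quoted in the text (M01 ≈ 0.2 M21); a quick numerical check:

```python
# Relative weights of the chi_22, chi_00, and chi_20 terms in Eq. 7,
# using the rule-of-thumb ratio M01 ~ 0.2 M21 quoted in the text.
M21 = 1.0
M01 = 0.2 * M21

w22 = M21 ** 2           # main term
w00 = M01 ** 2           # 4% of the main term: safely neglected
w20 = 2 * M01 * M21      # cross-term: 40% of the main term
norm = M21 ** 2 + M01 ** 2 / 2
```

The cross-term is thus far from negligible in general; it drops out only because its angular average vanishes for unoriented or cubic samples, as discussed next.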
This would seriously complicate the analysis of L2,3 edges, but for two fortuitous simplifications. The most important is that M01 is only about 0.2 of M21. Therefore, the M01² terms can be ignored. The cross-term can still be 40% of the main term. Fortunately, for an unoriented polycrystalline sample or a material with at least cubic symmetry, the angular average of this term is zero. The cross-term must be accounted for when the sample has an orientation. This most commonly occurs for surface XAFS experiments, which often employ L edges and which have an intrinsic asymmetry.

Polarization

At synchrotron sources the radiation is highly polarized. The most common form is linear polarization, which will be considered here (it is also possible to have circular polarization). For linear polarization, the photoelectron is preferentially emitted along the electric field direction. At synchrotrons the electric field is normally oriented horizontally, in a direction perpendicular to the beam direction. For K and L1 shells the polarization factor in Equation 3 is pj(ε) = 3⟨cos²θ⟩, where

⟨cos²θ⟩ = (1/Nj) Σi cos²θi   (8)
The sum is over the i individual atoms in shell j. If the material has cubic or higher symmetry or is randomly oriented, then ⟨cos²θ⟩ = 1/3 and pj(ε) = 1. A common situation is a sample with uniaxial symmetry, such as a layered material or a surface experiment. If the symmetry in the plane is 3-fold, the signal does not depend on the orientation of ε within the plane; it depends only on the angle θ of the electric field vector with respect to the plane, with χ(θ) = χ0 cos²θ + χ90 sin²θ, where χ0 and χ90 are the signals with the field parallel and perpendicular to the plane, respectively. For L2,3 edges the situation is more complicated (Heald and Stern, 1977). There are separate polarization factors for the three contributions:

pj22 = (1/2)(1 + 3⟨cos²θ⟩j)
pj00 = 1/2
pj02 = (1/2)(1 − 3⟨cos²θ⟩j)   (9)
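Equations 8 and 9 are simple enough to exercise directly; in the sketch below the bond-angle list passed to the averaging function is hypothetical, chosen only to show the mechanics.

```python
import math

# Polarization factors for K/L1 edges (Eq. 8) and L2,3 edges (Eq. 9).
# The bond angles used in any example call are hypothetical.
def mean_cos2(thetas_deg):
    """<cos^2 theta> averaged over the atoms of a shell (Eq. 8)."""
    return sum(math.cos(math.radians(t)) ** 2 for t in thetas_deg) / len(thetas_deg)

def p_K(c2):   return 3 * c2              # K and L1 edges: p_j = 3<cos^2 theta>
def p_22(c2):  return 0.5 * (1 + 3 * c2)  # main L2,3 term
def p_02(c2):  return 0.5 * (1 - 3 * c2)  # cross-term

c2_random = 1.0 / 3.0                     # unoriented or cubic sample
```

Evaluating at ⟨cos²θ⟩ = 1/3 reproduces the unoriented limit p22 = 1 and p02 = 0 discussed in the text.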
As mentioned previously, the 00 case can be ignored. For the unoriented case ⟨cos²θ⟩j = 1/3, giving p22 = 1 and p02 = 0. This is the reason the cross-term can often be ignored.

Multiple Scattering

The most important extension to the simple single-scattering picture discussed so far is the inclusion of multiple-scattering paths. For example, considering the first shell of atoms, the photoelectron can scatter from one first-shell atom to a second first-shell atom before being scattered back to the origin. Since in an ordered structure there
are typically 4 to 12 first-shell atoms, there are a large number of equivalent paths of this type, and the contribution can potentially be comparable to the single-scattering signal from the first shell. Two facts rescue the single-scattering picture. First, it is obvious that the multiple-scattering path is longer than the single-scattering path. Therefore, the multiple-scattering contribution would not contaminate the first-shell signal. It can, however, complicate the analysis of the higher shells. The second important point is that the scattering of the photoelectron is maximum when the scattering angle is either 0° (forward scattering) or 180° (backscattering). For the example given, there are two scattering events at intermediate angles, and the contribution to the signal is reduced. The single-scattering picture is still valid for the first shell and provides a reasonably accurate representation of the next one or two shells. For best results, however, modern analysis practice dictates that multiple scattering should be taken into account whenever shells past the first are analyzed; this will be discussed further in the analysis section. The enhancement of the photoelectron wave in the forward direction is often referred to as the focusing effect. It results in strong enhancement of the signal from shells of atoms that are directly in line with inner shells. An example is a face-centered-cubic (fcc) material such as copper, where the fourth-shell atoms are directly in line with the first. This gives a strong enhancement of the fourth-shell signal and significantly changes the phase of the signal relative to a single-scattering calculation. Figure 7 compares the χ(k) and the Fourier transforms of calculated spectra for copper in the single-scattering and multiple-scattering cases. Sometimes this focusing effect can be used to advantage in the study of lattice distortions. Because the focusing effect is strongly peaked in the forward direction, small deviations from perfect collinearity are amplified in the signal, allowing such distortions to be resolved even when the change in atomic position is too small to resolve directly (Rechav et al., 1994). The focusing effect has even allowed the detection of hydrogen atoms, which normally have negligible backscattering (Lengeler, 1986).

Figure 7. Comparison of the theoretical multiple- and single-scattering spectra for Cu metal at 80 K. The calculation only included the first four shells of atoms, which show up as four distinct peaks in the transform.

XANES
The primary difference in the near-edge region is the dominance of multiple-scattering contributions (Bianconi, 1988). At low energies the scattering is more isotropic, and many more multiple-scattering paths must be accounted for, including paths that have negligible amplitude in the extended region. This complicates both the quantitative analysis and the calculation of the near-edge region. However, the same multiple scattering makes the near-edge region much more sensitive to the symmetry of the neighboring atoms. Often near-edge features can be used as indicators of such things as tetrahedral versus octahedral bonding. Since the bond symmetry can be intimately related to formal valence, the near edge can be a sensitive indicator of valence in certain systems. A classic example is the near edge of Cr where, as shown in Figure 8, the chromate ion has a very strong pre-edge feature that makes it easy to distinguish from other Cr species. This near-edge feature is much larger than any EXAFS signal and can be used to estimate the chromate-to-total-Cr ratio in many cases for which the EXAFS would be too noisy for reliable analysis. While great strides have been made in the calculation of near-edge spectra, the level of accuracy is often much less than for the extended fine structure. In addition to the multiple-scattering approach discussed, attempts have been made to relate the near-edge structure to the projected density of final states obtained from a band-structure-type calculation. In this approach, for the K edge,
Figure 8. Near-edge region for the Cr K edge in a 0.5 M solution of KCrO4.
the p density of states would be calculated. Often there is quite good qualitative agreement with near-edge features, which allows them to be correlated with certain electronic states. This approach has some merit, since one approach to band structure calculation is to consider all possible scattering paths for an electron in the solid, a calculation quite similar to that for the multiple scattering of the photoelectron. However, band structure calculations do not account for the ionized core hole created when the photoelectron is excited. If local screening is not strong, the core hole can significantly affect the local potential and the resulting local density of states. While the band structure approach can often be informative, it cannot be taken as a definitive identification of near-edge features without supporting evidence. It is important to point out that the near edge has the same polarization dependence as the rest of the XAFS. Thus, the polarization dependence of the XANES features can be used to associate them with different bonds. For example, in the K edge of a planar system, a feature associated with the in-plane bonding will have a cos²θ dependence as the polarization vector is rotated out of the plane by θ. For surface studies this can be a powerful method of determining the orientation of molecules on a surface (Stöhr, 1988).

PRACTICAL ASPECTS OF THE METHOD

Detection Methods

The simplest method for determining the absorption coefficient μ is to measure the attenuation of the beam as it passes through the sample:

It = I0 e^(−μx),  or  μx = ln(I0/It)   (10)
Here, I0 is measured with a partially transmitting detector, typically a gas ionization chamber with the gas chosen to absorb a fraction of the beam. It can be shown statistically that the optimum fraction of absorption for the I0 detector is f = 1/(1 + e^(μx/2)), or 10% to 30% for typical samples (Rose and Shapiro, 1948). The choice of the optimum sample thickness depends on several factors and will be discussed in more detail later. There are two methods for measuring the transmission: the standard scanning method and the dispersive method. The scanning method uses a monochromatic beam that is scanned in energy as the transmission is monitored. The energy scanning can be either in a step-by-step mode, where the monochromator is halted at each point for a fixed time, or the so-called quick XAFS method, where the monochromator is continuously scanned and data are collected "on the fly" (Frahm, 1988). The step-scanning mode has some overhead in starting and stopping the monochromator but does not require as high a stability of the monochromator during scanning. The dispersive method uses a bent crystal monochromator to focus a range of energies onto the sample (Matsushita and Phizackerley, 1981; Flank et al., 1982). After the sample, the focused beam is allowed to diverge. The constituent energies have different angles of divergence
and eventually separate enough to allow their individual detection on a linear detector. The method has no moving parts, and spectra can be collected on a millisecond time frame. The lack of moving apparatus also means that a very stable energy calibration can be achieved, and the method is very good for looking at the time response of small near-edge changes. With existing dispersive setups, the total energy range is limited, often to a smaller range than would be desired for a full EXAFS measurement. Also, it should be noted that, in principle, the dispersive and quick XAFS methods have the same statistical noise for a given acquisition time, assuming they both use the same beam divergence from the source. For N data points, the dispersive method uses 1/N of the flux in each data bin for the full acquisition time, while the quick XAFS method uses the full flux for 1/N of the time. The transmission method is the preferred choice for concentrated samples that can be prepared in a uniform layer of the appropriate thickness. As the element of interest becomes more dilute, the fluorescence technique begins to be preferred. In transmission, the signal-to-noise ratio applies to the total signal, and for the XAFS of a dilute component the signal-to-noise ratio will be degraded by a factor related to the diluteness. With the large number of x rays (10¹⁰ to 10¹³ photons/s) available at synchrotron sources, if the signal-to-noise ratio were dominated by statistical errors, the transmission method would be possible for very dilute samples. However, systematic errors, beam fluctuation, nonuniform samples, and amplifier noise generally limit the total signal-to-noise ratio to about 10⁴. Therefore, if the contribution of an element to the total absorption is only a few percent, the XAFS signal, which is only a few percent of that element's absorption, will be difficult to extract from the transmission signal.
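Equation 10 and the Rose-Shapiro optimum quoted above can be sketched numerically; the μx values used in the comment are illustrative of "typical" samples, not prescribed by the text.

```python
import math

def mu_x(I0, It):
    """Absorbance from incident and transmitted intensities (Eq. 10)."""
    return math.log(I0 / It)

def f_optimum(mux):
    """Statistically optimal absorption fraction for the I0 detector
    (Rose and Shapiro, 1948): f = 1 / (1 + exp(mu*x / 2))."""
    return 1.0 / (1.0 + math.exp(mux / 2))

# For an illustrative sample with mu*x = 2.5 the optimum I0 absorption
# works out to roughly 22%, inside the 10%-30% range quoted in the text.
```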
In the fluorescence method, the fluorescence photons emitted by the element of interest are detected (Jaklevic et al., 1977). The probability of fluorescence is essentially constant with energy, and the fluorescence intensity is therefore directly proportional to the absorption of a specific atomic species. The advantage is that each element has a characteristic fluorescence energy, and in principle the fluorescence from the element under study can be measured separately from the total absorption. For an ideal detector, the signal-to-noise ratio would then be independent of the concentration and would depend only on the number of photons collected. Of course, the number of fluorescence photons that can be collected does depend on concentration, and there would be practical limits. In actuality, the real limits are determined by the efficiency of the detectors. It is difficult to achieve a high degree of background discrimination along with a high collection efficiency. Figure 9 shows a schematic of the energy spectrum from a fluorescence sample. The main fluorescence line is about 10% lower than the edge energy. The main background peak is from the elastically scattered incident photons. There are also Compton-scattered photons, which are shifted in energy slightly below the elastic peak. The width and shape of the Compton scattering peak are determined by the acceptance angle of the detector. In multicomponent samples there can also be fluorescence lines from other sample components.
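The counting-statistics consequences of such unseparated backgrounds can be sketched as follows (Poisson statistics assumed; A denotes the ratio of background to fluorescence counts, as used in the discussion of detection below):

```python
# With a background-to-signal ratio A = Nb/Nf that the detector cannot
# separate, the statistics are those of an effective signal Nf/(1 + A).
# Poisson counting statistics are assumed.
def effective_signal(Nf, A):
    return Nf / (1.0 + A)

def fluorescence_counts_needed(frac_err, A):
    """Fluorescence photons per energy channel for a given fractional
    statistical error (noise = square root of the total counts)."""
    return (1.0 + A) / frac_err ** 2
```

With no background (A = 0), a 10⁻³ fractional error requires about 10⁶ photons per channel; a detector that passes A = 9 background counts per fluorescence count multiplies that requirement tenfold.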
Figure 9. Schematic of the fluorescence spectrum from an Fe-containing sample with the incident energy set at the Fe K edge. The details of the Compton peak depend on the solid angle and location of the detector. This does not include complications such as fluorescence lines from other components in the sample. Also shown is the Mn K-edge absorption, as an example of how filters can be used to reduce the Compton and elastic background.
To obtain a reasonable fluorescence spectrum, it is typically necessary to collect at least 10⁶ photons in each energy channel. This would give a fractional statistical error of 10⁻³ in the total signal and 1% to 10% in the EXAFS. If the fluorescence signal Nf is accompanied by a background Nb that cannot be separated by the detector, then the signal-to-noise ratio is equivalent to that of a signal Nf/(1 + A), where A = Nb/Nf. Therefore, a detector that is inefficient at separating the background can substantially increase the counts required. There are three basic approaches to fluorescence detection: solid-state detectors, filters and slits (Stern and Heald, 1979), and crystal spectrometers. High-resolution Ge or Si detectors can provide energy resolution sufficient to separate the fluorescence peak. The problem is that their total counting rate (signal plus all sources of background) is limited to somewhere in the range of 2 × 10⁴ to 5 × 10⁵ counts per second, depending on the type of detector and electronics used. Since the fluorescence from a dilute sample can be a small fraction of the total, collecting enough photons for good statistics can be time consuming. One solution has been the development of multichannel detectors with up to 100 individual detector elements (Pullia et al., 1997). The other two approaches try to get around the count rate limitations of solid-state detectors by using other means of energy resolution. Then a detector such as an ion chamber can be used, which has essentially no count rate limitations (saturation occurs at signal levels much greater than those from a typical sample). The principle behind filters is shown in Figure 9. It is often possible to find a material whose absorption edge lies between the fluorescence line and the elastic background peak. This will selectively attenuate the background. There is one problem
with this approach: the filter will reemit the absorbed photons as its own fluorescence. Many of these can be prevented from reaching the detector by using appropriate slits, but no filter/slit system can be perfect. The selectivity of a crystal spectrometer is nearly ideal, since Bragg reflection is used to select only the fluorescence line. However, the extreme angular selectivity of Bragg reflection means that it is difficult to collect a large solid angle, and many fluorescent photons are wasted. Thus, the filter-and-slit detector trades off energy selectivity to collect a large solid angle, while the crystal spectrometer achieves excellent resolution at the expense of solid angle. Which approach is best will depend on the experimental conditions. For samples with moderate diluteness (cases where the background ratio A 2 keV. These crystals are nearly always perfect and, if thermal or mounting strains are avoided, have a very well defined energy resolution. For Si(111), the resolution is ΔE/E = 1.3 × 10⁻⁴. Higher-order reflections such as Si(220) and Si(311) can have even better resolution (5.6 × 10⁻⁵ and 2.7 × 10⁻⁵, respectively). This is the resolution for an ideally collimated beam. All beamlines have some beam divergence, Δθ. Taking the derivative of the Bragg condition, this adds an additional contribution ΔE/E = Δθ cot(θB), where θB is the monochromator crystal Bragg angle. Typically, these two contributions can be added in quadrature to obtain the total energy spread in the beam. The beam divergence is determined by the source and slit sizes and can also be affected by any focusing optics that precede the monochromator. At lower energies, other types of monochromators will likely be employed. These are commonly based on gratings but can also use more exotic crystals such as quartz, InSb, beryl, or YB66. In these cases, it is generally necessary to obtain the resolution information from the beamline operator.
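The quadrature sum described above can be sketched as follows; the 9-keV working energy, 20-µrad divergence, and Bragg angle are illustrative values, not properties of any particular beamline.

```python
import math

def total_energy_spread(E, intrinsic_dE_over_E, d_theta, theta_b):
    """Add the intrinsic crystal resolution and the divergence term
    dE/E = d_theta * cot(theta_b) in quadrature (angles in radians)."""
    dE_crystal = E * intrinsic_dE_over_E
    dE_divergence = E * d_theta / math.tan(theta_b)
    return math.hypot(dE_crystal, dE_divergence)

# Si(111) (dE/E = 1.3e-4) at 9 keV with 20 microradians of divergence;
# the Si(111) Bragg angle at 9 keV is about 12.7 degrees
dE = total_energy_spread(9000.0, 1.3e-4, 20e-6, math.radians(12.7))
print(round(dE, 1))   # about 1.4 eV total spread
```

Here the intrinsic term (1.17 eV) and the divergence term (about 0.8 eV) are comparable, which is why neither can be ignored on a real beamline.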
Harmonic Content

As the later discussion on thickness effects will make clear, it is extremely important to understand the effects of harmonics on the planned experiment. These are most serious for transmission experiments but can be a problem for the other methods as well. Both crystals and gratings can diffract energies that are multiples of the fundamental desired energy. These generally need to be kept to a low level for accurate measurements. Often the beamline mirrors can be used to reduce harmonics if their high-energy cutoff is between the fundamental and harmonic energies. When this is not possible (or for unfocused beamlines), detuning of crystal monochromators is another common method. Nearly all scanning monochromators use two or more crystal reflections to allow energy selection while avoiding wide swings in the direction of the output beam.
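The mirror-cutoff check described above can be sketched directly; the working energy and cutoff below are hypothetical. (For Si(111) the second order is structure-factor forbidden, so the third harmonic is usually the one that matters.)

```python
def harmonics(fundamental_eV, orders):
    """Energies a crystal or grating can pass along with the fundamental."""
    return [n * fundamental_eV for n in orders]

def mirror_suppresses(harmonic_eV, cutoff_eV, fundamental_eV):
    """A mirror helps when its high-energy cutoff lies between the
    fundamental and the harmonic."""
    return fundamental_eV < cutoff_eV < harmonic_eV

E0 = 8000.0                      # hypothetical working energy, eV
print(harmonics(E0, (2, 3)))     # [16000.0, 24000.0]
print(mirror_suppresses(24000.0, 15000.0, E0))   # True
print(mirror_suppresses(24000.0, 30000.0, E0))   # False: cutoff too high
```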
Energy Calibration

All mechanical systems can have some error, and it is important to verify the energy calibration of the beamline. For a single point this can be done using a standard foil containing the element planned for study. It is also important that the energy scale throughout the scan is correct. Most monochromators rely on gear-driven rotation stages, which can have some nonlinearity. This can result in nonlinearity in the energy scale. To deal with this, many modern monochromators employ accurate angle encoders. If an energy scale error is suspected, a good first check is to measure the energy difference between the edges of two adjacent materials and compare with published values. Unfortunately, it is sometimes difficult to determine exactly which feature on the edge corresponds to the published value. Another approach is the use of Bragg reflection calibrators, available at some facilities.

Monochromator Glitches

An annoying feature of perfect-crystal monochromators is the presence of sharp features in the output intensities, often referred to as "glitches." These are due to the excitation of multiple-diffraction conditions at certain energies and angles. Generally, these are sharp dips in the output intensity due to a secondary reflection removing energy from the primary. For ideal detectors and samples these should cancel out, but for XAFS measurements cancellation at the 10⁻⁴ level is needed to make them undetectable. This is difficult to achieve. It is especially difficult for unfocused beamlines, since the intensity reduction may affect only a part of the output beam, and then any sample nonuniformity will result in noncancellation. A sharp glitch that affects only one or two points in the spectrum can generally be handled in the data analysis. Problematic glitches extend over a broader energy range. These can be minimized only by operating the detectors at optimum linearity and making samples as uniform as possible.
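Returning to energy calibration: a single-point check with a standard foil amounts to evaluating the Bragg condition. The sketch below uses the Si(111) 2d spacing; the angle is illustrative of what a beamline encoder might report.

```python
import math

HC = 12398.4        # hc in eV * Angstrom
TWO_D = 6.2712      # Si(111) 2d spacing, Angstrom

def energy_from_angle(theta_deg):
    """Bragg condition: E = hc / (2d * sin(theta))."""
    return HC / (TWO_D * math.sin(math.radians(theta_deg)))

# The Cu K edge (8979 eV) should appear near 12.72 degrees with Si(111);
# a systematic offset between measured and expected angle indicates a
# calibration error
print(round(energy_from_angle(12.72), 0))   # 8979.0
```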
Usually a particular set of crystals will have only a few problematic glitches, which should be known to the beamline operator. The presence of glitches makes it essential
that the I0 spectra be recorded separately. Then it can be assured that weak features in the spectra are not spurious by checking for correlation with glitches.
METHOD AUTOMATION

Most XAFS facilities are highly automated. At synchrotron facilities the experiments are typically located inside radiation enclosures to which access is forbidden when the beam is on. Therefore, it is important to automate sample alignment. Usually, motorized sample stages are provided by the facility, but for large or unusual sample cells these may not be satisfactory. It is important to work with the facility operators to make sure that sample alignment needs are provided for. Otherwise tedious cycles of entering the enclosure, changing the sample alignment, interlocking the radiation enclosure, and measuring the change in the signal may be necessary. Since beamlines are complex and expensive, it is usually necessary that the installed control system be used. This may or may not be flexible enough to accommodate special needs. While standard experiments will be supported, it is difficult to anticipate all the possible experimental permutations. Again, it is incumbent upon the users to communicate early with the facility operators regarding special needs. This point has already been made several times but, to avoid disappointment, cannot be emphasized enough.

DATA ANALYSIS AND INITIAL INTERPRETATION

XAFS data analysis is a complex topic that cannot be completely covered here. There are several useful reviews in the literature (Teo, 1986; Sayers and Bunker, 1988; Heald and Tranquada, 1990). There are also a number of software packages available (see Internet Resources listing below), which include their own documentation. The discussion here will concentrate on general principles and on the types of preliminary analysis that can be conducted at the beamline facility to assess the quality of the data. There are four main steps in the analysis: normalization, background removal, Fourier transformation and filtering, and data fitting. The proper normalization is given in Equation 3.
In practice, it is simpler to normalize the measured absorption to the edge step. For sample thickness x,

χ_exp(k) = [μ(k)x − μ₀(k)x] / Δμ(0)x      (11)
where Δμ(0)x is the measured edge step. This can be determined by fitting smooth functions (often linear) to regions above and below the edge. This step normalization is convenient and accurate when it is consistently applied to experimental data being compared. To compare with theory, the energy dependence of μ₀(k) in the denominator must be included. This can be determined from tabulated absorption coefficients (McMaster et al., 1969; see Internet Resources). For data taken in fluorescence or electron
yield, it is also necessary to compensate for the energy dependence of the detectors. For a gas-filled ionization chamber used to monitor I0, the absorption coefficient of the gas decreases with energy. This means the I0 signal would decrease with energy even if the flux were constant. Again, tabulated coefficients can be used. The fluorescence detector energy dependence is not important, since it is detecting the constant-energy fluorescence signal. This correction is not needed for transmission, since the log of the ratio is analyzed. For this case, the detector energy dependence becomes an additive factor, which is removed in the normalization and background removal process. Background subtraction is probably the most important step. The usual method is to fit a smooth function such as a cubic spline through the data. The fitting parameters must be chosen to fit only the background while leaving the oscillations untouched. This procedure has been automated to a degree in some software packages (Cook and Sayers, 1981; Newville et al., 1993), but care is needed. Low-Z backscatterers are often problematic, since their oscillation amplitude is very large at low k and damps out rapidly. It is then difficult to define a smooth background function that works over the entire k region in all cases. Often careful attention is needed in choosing the low-Z termination point. The background subtraction stage is also useful in pointing out artifacts in the data. The background curve should be reasonably smooth. Any unusual oscillation or structure should be investigated. One natural cause of background structure is multielectron excitations. These can be ignored in many cases but may be important for heavier atoms. Once the background-subtracted χ(k) is obtained, its Fourier transform can be taken. As shown in Figure 6, this makes visible the different frequency components in the spectra. It also allows a judgment of the success of the background subtraction.
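The step normalization of Equation 11 and the background removal described above can be sketched on synthetic data. The edge position and oscillation parameters below are invented for illustration, and a low-order polynomial stands in for the cubic spline used by real analysis packages.

```python
import numpy as np

# Synthetic mu(E)*x: an absorption edge at E0 plus fake EXAFS wiggles
E = np.linspace(8800.0, 9800.0, 500)
E0 = 9000.0
mu_x = (np.where(E < E0, 0.2, 1.2)                     # unit edge step
        + 0.0001 * (E - E.min())                        # slowly varying baseline
        + 0.05 * np.sin(0.05 * (E - E0)) * (E >= E0))   # fake oscillations

# Edge step: linear fits below and above the edge, both evaluated at E0
pre = np.polyfit(E[E < E0 - 50.0], mu_x[E < E0 - 50.0], 1)
post = np.polyfit(E[E > E0 + 150.0], mu_x[E > E0 + 150.0], 1)
step = np.polyval(post, E0) - np.polyval(pre, E0)
print(round(float(step), 1))   # recovers the unit edge step

# Background: a smooth fit through the post-edge region; the normalized
# oscillations are what remain after dividing by the step
above = E > E0
background = np.polyval(np.polyfit(E[above], mu_x[above], 3), E[above])
chi = (mu_x[above] - background) / step
```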
Poor background removal will result in strong structure at low r. However, not all structure at low r is spurious. For example, the rapid decay of low-Z backscatterers can result in a low-r tail on the transform peak. To properly judge the source of the low-r structure, it is useful to transform theoretical calculations of similar cases for comparison. Before Fourier transforming, the data are usually multiplied by a factor k^W, where W = 1, 2, or 3. This is done to sharpen the transform peaks. Generally the transform peaks will be sharpest if the oscillation amplitude is uniform over the data window being transformed. Often, W = 1 is used to compensate for the 1/k factor seen in Equation 3. For low-Z atoms and systems with large disorder, higher values of W can be used to compensate for the additional k-dependent fall-off. High values of W also emphasize the higher-k part of the spectrum where the heavier backscatterers have the largest contribution. Thus, comparing transforms with different k weighting is a simple qualitative method of determining which peaks are dominated by low- or high-Z atoms. In addition to k weighting, other weighting can be applied prior to transforming. The data should be transformed only over the range for which there is significant signal. This can be accomplished by truncating χ(k) with a rectangular window prior to transforming. A sharp rectangular window can induce truncation ripples
Figure 10. Inverse Fourier transform of the region 0.8 to 2 Å for the Cu data in Figure 6. This region contains contributions from three different Cu-O distances in the first coordination sphere.
in the transform. To deal with this, most analysis packages offer the option of various window functions that give a more gradual truncation. There are many types of windows, all of which work more or less equivalently. The final stage of analysis is to fit the data either in k or r space using theoretical or experimentally derived functions. If k-space fitting is done, the data are usually filtered to extract only the shells being fit. For example, if the first and second shells are being fit in k space, an r-space window that includes only the first- and second-shell contributions is applied to the r-space data, and then the data are transformed back to k space. This filtered spectrum includes only the first- and second-shell contributions, and it can be fit without including all the other shells. An example of filtered data is shown in Figure 10. An important concept in fitting filtered data is the number of independent data points (Stern, 1993). This is determined by information theory to be N_I = 2ΔkΔr/π + 2, where Δk and Δr are the k- and r-space ranges of the data used for analysis. If fitting is done in r space, Δr would be the range over which the data are fit, and Δk is the k range used for the transform. The number of independent points N_I is the number of parameters needed to completely describe the filtered data. Therefore, any fitting procedure should use fewer than N_I parameters. The relative merits of different types of fitting are beyond the scope of this unit, and fitting is generally not carried out at the beamline while data are being taken. It is important, however, to carry the analysis through the Fourier transform stage to assess the data quality. As mentioned, systematic errors often show up as strange backgrounds or unphysical transform peaks, and random statistical errors will show up as an overall noise level in the transform. The most widely used theoretical program is FEFF (Zabinsky et al., 1995).
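The transform and independent-points ideas above can be sketched with a single synthetic shell; the distance, amplitude, and k range below are arbitrary illustrations.

```python
import numpy as np

def n_independent(dk, dr):
    """Independent points in a filtered spectrum: N_I = 2*dk*dr/pi + 2."""
    return 2.0 * dk * dr / np.pi + 2.0

# Fitting over dk = 10 A^-1 and dr = 2 A supports at most ~14 parameters
print(round(n_independent(10.0, 2.0), 1))   # 14.7

# One fake shell at R = 2.5 A; apply a k-weight of W = 2 and a gradual
# (Hanning) window before transforming to suppress truncation ripples
k = np.linspace(2.0, 12.0, 512)
chi = 0.1 * np.sin(2.0 * 2.5 * k) / k
integrand = chi * k**2 * np.hanning(k.size)

# Discrete Fourier transform to r space; the peak lands near the input
# distance of 2.5 A
r = np.linspace(0.0, 6.0, 601)
ft = np.abs(np.exp(2j * np.outer(r, k)) @ integrand) * (k[1] - k[0])
print(round(float(r[ft.argmax()]), 1))
```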
It is designed to be relatively portable and user friendly and includes multiple scattering. It is highly desirable for new users to obtain FEFF or an equivalent package (Westre et al., 1995) and to calculate some example spectra for the expected structures.
The most common causes of incorrect data can all be classified under the general heading of thickness effects, or thickness-dependent distortions. A good experimental check is to measure two different thicknesses of the same sample: if thickness effects are present, the normalized XAFS amplitude for the thicker sample will be reduced. Thickness effects are caused by leakage radiation: radiation passing through pinholes in the sample, radiation that leaks around the sample, or harmonic content in the beam. All of these are essentially unaffected by the sample absorption. As the x-ray energy passes through the edge, the transmission of the primary radiation is greatly reduced while the leakage is unchanged. Thus, the percentage of leakage radiation in the transmitted radiation increases in passing through the edge, and the edge step is apparently reduced. The greater the absorption, the greater is this effect. This means the peaks in the absorption are reduced in amplitude even more than the edge if leakage is present. In the normalized spectrum such peaks will be reduced. The distortion is larger for thick samples, which have the greatest primary beam attenuation. It can be shown statistically that the optimum thickness for transmission samples in the absence of leakage is about μx = 2.5. In practice, μx ≈ 1.5 is generally a good compromise between obtaining a good signal and avoiding thickness effects. An analogous effect occurs in fluorescence detection. It is generally referred to as self-absorption. As the absorption increases at the edge, the penetration of the x rays into the sample decreases, and fewer atoms are available for fluorescing. Again, peaks in the absorption are reduced. Obviously this is a problem only if the absorption change is significant. For very dilute samples the absorption change is too small for a noticeable effect. Self-absorption is the reason that fluorescence detection should not be applied to concentrated samples.
In this context, "concentrated" refers to samples for which the absorption step is >10% of the total absorption. When sample conditions preclude transmission measurements on a concentrated sample, electron yield detection is a better choice than fluorescence detection. For electron yield detection, the signal originates within the near-surface region, where the x-ray penetration is essentially unaffected by the absorption changes in the sample. Thus, in electron yield detection, self-absorption can almost always be ignored.
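The thickness distortion can be illustrated with a toy leakage model; the 1% leak is an arbitrary number chosen to show the trend.

```python
import math

def apparent_mu_x(true_mu_x, leak_fraction):
    """Measured absorbance when a fraction f of the beam bypasses the
    sample (pinholes, leakage around the edges, or harmonics):
    I/I0 = (1 - f) * exp(-mu*x) + f."""
    f = leak_fraction
    return -math.log((1.0 - f) * math.exp(-true_mu_x) + f)

# A 1% leak barely distorts mu*x = 1.5 but badly compresses mu*x = 4,
# which is why moderate thicknesses are the safer choice
for mux in (1.5, 4.0):
    print(mux, round(apparent_mu_x(mux, 0.01), 2))   # 1.47 and 3.57
```

Because peaks in the absorption sit at even higher μx than the edge, they are compressed more strongly, which is the amplitude reduction described in the text.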
SAMPLE PREPARATION

XAFS samples should be as uniform as possible. This can be a significant challenge. Typical absorption lengths for concentrated samples are from 3 to 30 μm. Foils or liquid solutions are a simple approach but cannot be universally applied. It can be difficult to grind and disperse solid particles into uniform layers of appropriate thickness. For example, to achieve a 10-μm layer of a powdered sample, the particle size should be of order 2 to 3 μm. This is difficult to achieve for many materials. Some common sample preparation methods include rubbing the powder into low-absorption adhesive tape or combining the powder with a low-absorption material such as graphite or BN prior
to pressing into pellets. The tape method has been especially successful, probably because the rubbing process removes some of the larger particles, leaving behind a smaller average particle size adhered to the tape. Nevertheless, if fewer than about four layers of tape are required to produce a reasonable sample thickness, then the samples are likely to be unacceptably nonuniform. XAFS signals can be significantly enhanced by cooling samples to liquid nitrogen temperatures. This can also be a constraint on sample preparation, since tape-prepared samples can crack upon cooling, exposing leakage paths. This can be avoided by using Kapton-based tape, although the adhesive on such tapes can have significant absorption. Materials with layered or nonsymmetric structures can result in anisotropic particles. These will tend to be oriented when using the tape or pellet method and result in an aligned sample. Since all synchrotron radiation sources produce a highly polarized beam, the resulting data will show an orientation dependence. This can be accounted for in the analysis if the degree of orientation is known, but this can be difficult to determine. For layered materials, one solution is to orient the sample at the "magic" angle (54.7°) relative to the polarization vector. This will result in a signal equivalent to an unoriented sample for partially oriented samples. Sample thickness is less of a concern for fluorescence or electron yield samples, but it is still important that the samples are uniform. For electron yield, the sample surface should be clean and free of thick oxide layers or other impurities. Sample charging can also be a factor. This can be a major problem for insulating samples in a vacuum. It is less important when a He-filled detector is used. Also, the x-ray beam charges the sample less than typical electron beam techniques do, and fairly low conductivity samples can still be successfully measured.
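The magic angle quoted above is where the second Legendre polynomial of cos θ, which carries the orientation dependence for a uniaxially oriented sample, vanishes. This is easy to verify:

```python
import math

# P2(cos(theta)) = (3*cos(theta)**2 - 1)/2 vanishes at the magic angle,
# where cos(theta) = 1/sqrt(3)
theta = math.degrees(math.acos(1.0 / math.sqrt(3.0)))
print(round(theta, 1))   # 54.7

p2 = (3.0 * math.cos(math.radians(theta)) ** 2 - 1.0) / 2.0
print(abs(p2) < 1e-9)    # True: the orientation-dependent term drops out
```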
SPECIMEN MODIFICATION

In general, x rays are less damaging than particle probes such as the electrons used in electron microscopy. However, samples can be modified by radiation damage. This can be physical damage resulting in actual structural changes (usually disordering) or chemical changes. This is especially true for biological materials such as metalloproteins. Metals and semiconductors are generally radiation resistant. Many insulating materials will eventually suffer some damage, but often on timescales that are long compared to the measurement. Since it is difficult to predict which materials will be damaged, for potentially sensitive materials it is wise to use other characterization techniques to verify sample integrity after exposure or to look for an exposure dependence in the x-ray measurements.

PROBLEMS

The two most common causes of incorrect data are sample nonuniformity and thickness effects, which have already been discussed. Other problems can occur that are exacerbated by the need to run measurements at remote facilities under the time constraints of a fixed schedule. The critical role played by beam harmonics has already been discussed. Eliminating harmonics requires the correct setting of beamline crystals or mirrors. It is easy to do this incorrectly at a complex beamline. Another area of concern is the correct operation of detectors. The high fluxes at synchrotrons mean that saturation or deadtime effects are often important. For ion-chamber detectors it is important to run the instrument at the proper voltage and to use the appropriate gases for linear response. Similarly, saturation must be avoided in the subsequent amplifier chain. Single-photon detectors, if used, are often run at count rates for which deadtime corrections are necessary. For fluorescence experiments this means that the incoming or total count rate should be monitored as well as the fluorescence line(s) of interest. Guidance on many of these issues can be obtained from the beamline operators for typical running conditions. However, few experiments are truly typical in all respects, and separate tests may need to be made. In contrast to laboratory experiments, the time constraints of synchrotron experiments make it tempting to skip some simple tests needed to verify that undistorted data are being collected. This should be avoided if at all possible.

LITERATURE CITED

Ashley, C. A. and Doniach, S. 1975. Theory of extended x-ray absorption edge fine structure (EXAFS) in crystalline solids. Phys. Rev. B 11:1279–1288.
Azaroff, L. V. and Pease, D. M. 1974. X-ray absorption spectra. In X-ray Spectroscopy (L. V. Azaroff, ed.). McGraw-Hill, New York.
Beni, G. and Platzman, P. M. 1976. Temperature and polarization dependence of extended x-ray absorption fine-structure spectra. Phys. Rev. B 14:1514–1518.
Bianconi, A. 1988. XANES spectroscopy. In X-ray Absorption (D. C. Koningsberger and R. Prins, eds.). pp. 573–662. John Wiley & Sons, New York.
Boyce, J. B., Hayes, T. M., and Mikkelsen, J. C. 1981. Extended-x-ray-absorption-fine-structure of mobile-ion density in superionic AgI, CuI, CuBr, and CuCl. Phys. Rev. B 23:2876–2896.
Bunker, G. 1983. Application of the ratio method of EXAFS analysis to disordered systems. Nucl. Instrum. Methods 207:437–444.
Cook, J. W. and Sayers, D. E. 1981. Criteria for automatic x-ray absorption fine structure background removal. J. Appl. Phys. 52:5024–5031.
Crozier, E. D., Rehr, J. J., and Ingalls, R. 1988. Amorphous and liquid systems. In X-ray Absorption (D. C. Koningsberger and R. Prins, eds.). pp. 373–442. John Wiley & Sons, New York.
Flank, A. M., Fontaine, A., Jucha, A., Lemonnier, M., and Williams, C. 1982. Extended x-ray absorption fine structure in dispersive mode. J. Physique 43:L-315–L-319.
Frahm, R. 1988. Quick scanning EXAFS: First experiments. Nucl. Instrum. Methods A270:578–581.
Gauthier, Ch., Goulon, J., Moguiline, E., Rogalev, A., Lechner, P., Struder, L., Fiorini, C., Longoni, A., Sampietro, M., Besch, H.,
Pfitzner, R., Schenk, H., Tafelmeier, U., Walenta, A., Misiakos, K., Kavadias, S., and Loukas, D. 1996. A high resolution, 6 channels drift detector array with integrated JFET's designed for XAFS spectroscopy: First x-ray fluorescence excitation recorded at the ESRF. Nucl. Instrum. Methods A382:524–532.
Heald, S. M. and Stern, E. A. 1977. Anisotropic x-ray absorption in layered compounds. Phys. Rev. B 16:5549–5557.
Heald, S. M. and Tranquada, J. M. 1990. X-ray absorption spectroscopy: EXAFS and XANES. In Physical Methods of Chemistry, Vol. V (Determination of Structural Features of Crystalline and Amorphous Solids) (B. W. Rossiter and J. F. Hamilton, eds.). pp. 189–272. Wiley-Interscience, New York.
Jacklevic, J., Kirby, J. A., Klein, M. P., Robinson, A. S., Brown, G., and Eisenberger, P. 1977. Fluorescence detection of EXAFS: Sensitivity enhancement for dilute species and thin films. Sol. St. Commun. 23:679–682.
Kordesch, M. E. and Hoffman, R. W. 1984. Electron yield extended x-ray absorption fine structure with the use of a gas-flow detector. Phys. Rev. B 29:491–492.
Lee, P. A. 1976. Possibility of adsorbate position determination using final-state interference effects. Phys. Rev. B 13:5261–5270.
Lee, P. A. and Pendry, J. B. 1975. Theory of the extended x-ray absorption fine structure. Phys. Rev. B 11:2795–2811.
Lengeler, B. 1986. Interaction of hydrogen with impurities in dilute palladium alloys. J. Physique C8:1015–1018.
Matsushita, T. and Phizackerley, R. P. 1981. A fast x-ray spectrometer for use with synchrotron radiation. Jpn. J. Appl. Phys. 20:2223–2228.
McMaster, W. H., Kerr Del Grande, N., Mallett, J. H., and Hubbell, J. H. 1969. Compilation of x-ray cross sections, LLNL report, UCRL-50174 Section II Rev. 1. National Technical Information Services L-3, U.S. Department of Commerce.
Newville, M., Livins, P., Yacoby, Y., Stern, E. A., and Rehr, J. J. 1993. Near-edge x-ray-absorption fine structure of Pb: A comparison of theory and experiment. Phys. Rev. B 47:14126–14131.
Pullia, A., Kraner, H. W., and Furenlid, L. 1997. New results with silicon pad detectors and low-noise electronics for absorption spectrometry. Nucl. Instrum. Methods 395:452–456.
Rechav, B., Yacoby, Y., Stern, E. A., Rehr, J. J., and Newville, M. 1994. Local structural distortions below and above the antiferrodistortive phase transition. Phys. Rev. Lett. 69:3397–3400.
Rose, M. E. and Shapiro, M. M. 1948. Statistical error in absorption experiments. Phys. Rev. 74:1853–1864.
Sayers, D. E. and Bunker, B. A. 1988. Data analysis. In X-ray Absorption (D. C. Koningsberger and R. Prins, eds.). pp. 211–253. John Wiley & Sons, New York.
Sayers, D. E., Stern, E. A., and Lytle, F. W. 1971. New technique for investigating non-crystalline structures: Fourier analysis of the extended x-ray absorption fine structure. Phys. Rev. Lett. 27:1204–1207.
Stern, E. A. 1993. Number of relevant independent points in x-ray-absorption fine-structure spectra. Phys. Rev. B 48:9825–9827.
Stern, E. A. and Heald, S. M. 1979. X-ray filter assembly for fluorescence measurements of x-ray absorption fine structure. Rev. Sci. Instrum. 50:1579–1582.
Stern, E. A., Ma, Y., Hanske-Pettipierre, O., and Bouldin, C. E. 1992. Radial distribution function in x-ray-absorption fine structure. Phys. Rev. B 46:687–694.
Stohr, J. 1988. SEXAFS: Everything you always wanted to know. In X-ray Absorption (D. C. Koningsberger and R. Prins, eds.). pp. 443–571. John Wiley & Sons, New York.
Stumm von Bordwehr, R. 1989. A history of x-ray absorption fine structure. Ann. Phys. Fr. 14:377–466.
Teo, B. K. 1986. EXAFS: Basic Principles and Data Analysis. Springer-Verlag, Berlin.
Westre, T. E., DiCicco, A., Filipponi, A., Natoli, C. R., Hedman, B., Solomon, E. I., and Hodgson, K. O. 1995. GNXAS, a multiple-scattering approach to EXAFS analysis—methodology and applications to iron complexes. J. Am. Chem. Soc. 117:1566–1583.
Zabinsky, S. I., Rehr, J. J., Ankudinov, A., Albers, R. C., and Eller, M. J. 1995. Multiple scattering calculations of x-ray absorption spectra. Phys. Rev. B 52:2995–3009.

KEY REFERENCES

Goulon, J., Goulon-Ginet, C., and Brookes, N. B. 1997. Proceedings of the Ninth International Conference on X-ray Absorption Fine Structure. J. Physique 7 Colloque 2.

A good snapshot of the current status of the applications and theory of XAFS. See also the earlier proceedings of the same conference.

Koningsberger, D. C. and Prins, R. 1988. X-ray Absorption: Principles, Applications, Techniques of EXAFS, SEXAFS and XANES. John Wiley & Sons, New York.

A comprehensive survey of all aspects of x-ray absorption spectroscopy. Slightly dated on some aspects of the calculation and analysis of multiple-scattering contributions, but a very useful reference for serious XAFS practitioners.

Stohr, J. 1992. NEXAFS Spectroscopy. Springer-Verlag, New York.

More details on the use and acquisition of near-edge spectra, especially as they apply to surface experiments.

INTERNET RESOURCES

http://ixs.csrri.iit.edu/index.html
International XAFS Society homepage, containing much useful information including upcoming meeting information, an XAFS database, and links to many other XAFS-related resources.

http://www.aps.anl.gov/offsite.html
A list of synchrotron facility homepages maintained by the Advanced Photon Source at Argonne National Laboratory, one of several similar Web sites.

http://www-cxro.lbl.gov/optical_constants/
Tabulation of absorption coefficients and other x-ray optical constants maintained by the Center for X-Ray Optics at the Lawrence Berkeley National Laboratory.

http://www.esrf.fr/computing/expg/subgroups/theory/xafs/xafs_software.html
A compilation of available XAFS analysis software maintained by the European Synchrotron Radiation Facility (ESRF).
STEVE HEALD Pacific Northwest National Laboratory Richland, Washington
X-RAY TECHNIQUES
X-RAY AND NEUTRON DIFFUSE SCATTERING MEASUREMENTS

INTRODUCTION

Diffuse scattering from crystalline solid solutions is used to measure local compositional order among the atoms, dynamic displacements (phonons), and mean species-dependent static displacements. In locally ordered alloys, fluctuations of composition and interatomic distances break the long-range symmetry of the crystal within local regions and contribute to the total energy of the alloy (Zunger, 1994). Local ordering can be a precursor to a lower temperature equilibrium structure that may be unattainable because of slow atomic diffusion. Discussions of the usefulness of local chemical and displacive correlations within alloy theory are given in Chapter 2 (see PREDICTION OF PHASE DIAGRAMS and COMPUTATION OF DIFFUSE INTENSITIES IN ALLOYS). In addition to local atomic correlations, neutron diffuse scattering methods can be used to study the local short-range correlations of the magnetic moments. Interstitial defects, as opposed to the substitutional disorder defects described above, also disrupt the long-range periodicity of a crystalline material and give rise to diffusely scattered x rays, neutrons, and electrons (electron scattering is not covered in this unit; Schweika, 1998). Use of tunable synchrotron radiation to change the x-ray scattering contrast between elements has greatly improved the measurement of bond distances between the three types of atom pairs found in crystalline binary alloys (Ice et al., 1992). The estimated standard deviation of the first-order (first moment) mean static displacements from this technique approaches 0.001 Å (0.0001 nm), which is an order of magnitude more precise than results obtained with extended x-ray absorption fine structure (EXAFS; XAFS SPECTROSCOPY) measurements. In addition, both the radial and tangential displacements can be reliably determined to five or more near-neighbor shells (Jiang et al., 1996).
In a binary A-B alloy, the number of A or B near neighbors to, for example, an A atom can be determined to better than 1 atom in 100. The second moment of the static displacements, which gives rise to Huang scattering, is also measurable (Schweika, 1998). Measurements of diffuse scattering can also reveal the tensorial displacements associated with substitutional and interstitial defects. This information provides models of the average arrangements of the atoms on a local scale. An example of chemical local ordering is given in Figure 1, where the probability P^AB_lmn of finding a B atom out to the sixth lmn shell around an A atom goes from a preference for A atoms (clustering) to a preference for B atoms (short-range order) for a body-centered cubic (bcc) A50B50 alloy. The real-space representation of the atom positions is derived from a Monte Carlo simulation of the P^AB_lmn values (Gehlen and Cohen, 1965) from the measurement of the intensity distribution in reciprocal space (Robertson et al., 1998). In the upper panel, the probability of finding a B atom as the first neighbor to an A atom is 40% (10% clustering; P^AB_111 = 0.4). Atoms are located on a (110) plane so that first-neighbor pairs are shown. The middle panel depicts the random alloy, where P^AB_lmn = 0.5 for the
Figure 1. Direct and reciprocal space representations for a clustering, a random, and an ordering A50B50 bcc alloy. Courtesy of Robertson et al. (1998).
first six shells (lmn). The lower panel shows the case where P^AB_lmn = 0.6 (a preference for unlike atom pairs). The intensity distribution in the (100) plane of reciprocal space (with the fundamental Bragg maxima removed) is shown in the right column of Figure 1. Note that a preference for like nearest neighbors causes the scattering to be centered near the fundamental Bragg maxima, such as at the origin, 0,0. A preference for unlike first-neighbor pairs causes the diffuse scattering to peak at the superlattice reflections for an ordered structure. Models such as those shown in Figure 1 are used to understand materials properties and their response to heat treatment, mechanical deformation, and magnetic fields. These local configurations are useful to test advances in theoretical models of crystalline alloys, as discussed in COMPUTATION OF DIFFUSE INTENSITIES IN ALLOYS. The diffraction pattern from a crystalline material with perfect periodicity, such as nearly perfect single-crystal Si, consists of sharp Bragg maxima associated with long-range periodicity. With Bragg's law, we can determine the size of the unit cell. Because of thermal motion, atom positions are smeared and Bragg maxima are reduced. In alloys with different atomic sizes, static displacements will also contribute to this reduction. This intensity, which is lost from the Bragg reflections, is diffusely distributed.
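The clustering-versus-ordering convention illustrated in Figure 1 is captured by the Warren-Cowley short-range-order parameter, defined later in this unit as α_lmn = 1 − P^AB_lmn/C_B. A minimal sketch (the function name is ours; the probabilities are those of the three panels):

```python
def warren_cowley_alpha(p_ab, c_b):
    """Warren-Cowley short-range-order parameter: alpha = 1 - P(AB)/C_B.

    alpha > 0: clustering (like-atom preference);
    alpha = 0: random alloy;
    alpha < 0: short-range order (unlike-atom preference).
    """
    return 1.0 - p_ab / c_b

# The three panels of Figure 1 for an A50B50 bcc alloy (C_B = 0.5):
alpha_clustering = warren_cowley_alpha(0.4, 0.5)  # positive: like-atom preference
alpha_random = warren_cowley_alpha(0.5, 0.5)      # zero: random alloy
alpha_ordering = warren_cowley_alpha(0.6, 0.5)    # negative: unlike-atom preference
```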
Figure 2. Displacements about the average lattice preserve the regular spacing between atomic planes such that d = d1 = d2 = d3 = .... The average lattice is obtained from the positions of the sharp Bragg reflections (B). Information about short-range correlations among the atoms is contained in the diffusely distributed intensity between the Bragg peaks. Courtesy of Ice et al. (1998).

Shown schematically in Figure 2A is a solid solution of two kinds of atoms displaced from the sites of the average lattice in such a way that the average plane of atoms is regularly spaced, with a constant "d" spacing existing over hundreds of planes. As shown schematically in Figure 2B, there is weak diffuse scattering but no broadening of the fundamental Bragg reflections, as would be the case for more extended defects such as stacking faults, high dislocation densities, displacive transformations, and incoherent precipitates, among others (Warren, 1969). In cases where the fundamental Bragg reflections are broadened, our uncertainty in the size of the average lattice increases and the precision of the measured pair separation is reduced. This unit will concentrate on the use of diffuse x-ray and neutron scattering from single crystals to measure local chemical correlations and chemically specific static displacements. Particular emphasis will be placed on the use of resonant (anomalous) x-ray techniques to extract information on atomic size from binary solid solutions with short-range order. Here the alloys have a well-defined average lattice but have local fluctuations in composition and displacements from the average lattice. In stoichiometric crystals with long-range periodicity, sharp superlattice Bragg reflections appear. If the compositional order is correlated only over short distances, the superlattice reflections are so broadened that measurements throughout a symmetry-related volume in reciprocal space are required to determine their distribution. In addition, the displacement of the atom pairs (e.g., the A-A, A-B, and B-B pairs in a binary alloy) from the sites of the average lattice because of different atom sizes also contributes to the distribution of this diffuse scattering. By separating this diffuse intensity into its component parts—that associated with the chemical preference for A-A, A-B, and B-B pairs for the various near-neighbor shells and that associated with the static and dynamic displacements of the atoms from the sites of the average lattice—we are able to recover pair correlation probabilities for the three kinds of pairs in a binary alloy. The interpretation of diffuse scattering associated with dynamic displacements of atoms from their average crystal sites will be discussed only briefly in this unit.

Competitive and Related Techniques
Other techniques that measure local chemical order and bond distances exist. In EXAFS, outgoing photoejected electrons are scattered by the surrounding near neighbors (mostly the first and, to a lesser extent, the second nearest neighbors). This creates modulations of the x-ray absorption cross-section, typically extending for 1000 eV above the edge, and gives information about both local chemical order and bond distances (see XAFS SPECTROSCOPY for details). Usually, the phases and amplitudes for the interference of the photoejected electrons must be extracted from model systems, which necessitates measurements on intermetallic (ordered) compounds of known bond distances and neighboring atoms. The choice of an incident x-ray energy specific to an elemental absorption edge makes EXAFS information specific to that elemental constituent. For an alloy of A and B atoms, the EXAFS for an absorption edge of an A atom would be sensitive to the A and B atoms neighboring the A atoms. Separation of the signal into A-A and B-B pairs is typically done by using dilute alloys containing 2 at.% or less of the constituent of interest (e.g., the A atoms). The EXAFS signal is then interpreted as arising from the predominately B neighborhood of an A atom, and analyzed in terms of the number of B first and second neighbors and their bond distances from the A atoms. Claims for the accuracy of the EXAFS method vary between 0.01 and 0.02 Å for bond distance and 10% for the first-shell coordination number (number of atoms in the first shell; Scheuer and Lengeler, 1991). For crystalline solid-solution alloys, the crystal structure precisely determines the number of neighbors in each shell, but not the kinds for nondilute alloys. For most alloys, the precision achieved with EXAFS is inadequate to determine the deviations of the interatomic spacings from the average lattice.
Whenever EXAFS measurements are applicable and of sufficient precision to determine the information of interest, the ease and simplicity of this experiment compared with three-dimensional diffuse scattering measurements makes it an attractive tool. An EXAFS study of concentrated Au-Ni alloys revealed the kind of information available (Renaud et al., 1988). Mössbauer spectroscopy is another method for obtaining near-neighbor information (MOSSBAUER SPECTROMETRY). Measurements of hyperfine field splitting caused by changes in the electric field gradient or magnetic field because of the different charge or magnetic states of the nuclear environments give information about the near-neighbor environments. Different chemical and magnetic environments of the nucleus produce different hyperfine structure, which is interpreted as a measure of the different chemical environments typical for first and second neighbors. The quantitative interpretation of Mössbauer spectra in terms of local order and bond distances is often ambiguous. The use of Mössbauer spectroscopy is limited
to those few alloys where at least one of the constituents is a Mössbauer-active isotope. This spectroscopy is complementary to, but does not compete with, diffuse scattering measurements as a direct method for obtaining detailed information about near-neighbor chemical environments and bond distances (Drijver et al., 1977; Pierron-Bohnes et al., 1983). Imaging techniques with electrons, such as Z-contrast microscopy and other high-resolution electron microscopy techniques (see Chapter 11), do not have the resolution for measuring the small displacements associated with crystalline solid solutions. Imaging for high-resolution microscopy requires a thin sample about a dozen or more unit cells thick with identical atom occupations, which precludes obtaining information about short-range order and bond distances. Electron diffuse scattering measurements are difficult to record in absolute units and to separate from contributions to the diffuse background caused by straggling energy-loss processes. Electron techniques provide extremely useful information on more extended defects, as discussed in Chapter 11. Field ion microscopy uses He or Ne gas atoms to image the small-radius tip of the sample. Atom probes provide direct imaging of atom positions. Atoms are pulled from a small-radius tip of the sample by an applied voltage and mass analyzed through a small opening. The position of the atom can be localized to 5 Å. Information on the species of an atom and its neighbors can be recovered. Successful analyses of concentration waves and clusters in phase-separating alloys have been reported where strongly enriched clusters of like atoms are as small as 5 Å in diameter (Miller et al., 1996). However, information on small displacements cannot be obtained with atom probes. Scanning tunneling (see SCANNING TUNNELING MICROSCOPY) and atomic force microscopy can distinguish between the kinds of atoms on a surface and reveal their relative positions.
PRINCIPLES OF THE METHOD

In this section, we formulate the diffraction theory of diffuse scattering in a way that minimizes assumptions and maximizes the information obtained from a diffraction pattern without recourse to models. This approach can be extended with various theories and models for interpretation of the recovered information. Measurements of diffusely scattered radiation can reveal the kinds and numbers of defects. Since different defects give different signatures in diffuse scattering, separation of these signatures can simplify recovery of the phenomenological parameters describing the defect. The availability of intense and tunable synchrotron x-ray sources, which allow the selection of x-ray energies near absorption edges, permits the use of resonant scattering techniques to separate the contributions to the diffuse scattering from different kinds of pairs (e.g., the A-A, A-B, and B-B pairs of a binary alloy). Near an x-ray K absorption edge, the x-ray scattering factor of an atom can change by 8 electron units (eu), allowing scattering-contrast control between atoms in an alloy. Adjustable contrast,
Figure 3. The atom positions for a face-centered cubic (fcc) structure are used to illustrate the notation for the real-space lattice. The unit cell has dimensions a = b = c. The corresponding reciprocal-space lattice is a*, b*, c*. A position in reciprocal space at which the scattered intensity, I(H), is measured for an incoming x ray in the direction S0, of wavelength λ, and detected in the outgoing direction S is H = (S − S0)/λ = h1a* + h2b* + h3c*. At Bragg reflections, h1h2h3 are integers and are usually designated hkl, the Miller indices of the reflection. This notation follows that used in the International Tables for Crystallography. Courtesy of Sparks and Robertson (1994).
either through resonant (anomalous) x-ray scattering or through isotopic substitution for neutrons, allows for precision measurement of chemically specific local interatomic distances within the alloy. Figure 3 gives the real-space notation used in describing the atom positions and the reciprocal-space notation used in describing the intensity distribution.

X Rays Versus Neutrons

The choice of x rays or neutrons for a given experiment depends on instrumentation and source availability, the constituent elements of the sample, the information sought, the temperature of the experiment, the size of the sample, and isotopic availability and cost, among other considerations. Neutron scattering is particularly useful for measurements of low-Z materials, for high-temperature measurements, and for measurements of magnetic ordering. X-ray scattering is preferred for measurements of small samples, for measurement of static displacements, and for high H resolution.

Chemical Order

Recovery of the local chemical preference for atom neighbors has been predominantly an x-ray diffuse scattering measurement, although x-ray and neutron measurements are complementary. More than 50 systems have been studied with x rays and around 10 systems with neutrons.
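The reciprocal-space bookkeeping introduced in Figure 3 is easy to make concrete. In the sketch below, the cubic cell dimension and the reflections are arbitrary illustrative choices; the crystallographic convention (no 2π factor) is used, so that a_i · a*_j = δ_ij and H = h1a* + h2b* + h3c*:

```python
import numpy as np

# Real-space cell vectors a, b, c as the rows of A (cubic cell, a = 3.6 Å;
# the value is chosen only for illustration).
A = 3.6 * np.eye(3)

# Reciprocal cell: rows of B are a*, b*, c*, satisfying a_i . b_j = delta_ij.
B = np.linalg.inv(A).T

def H_vector(h1, h2, h3):
    """H = h1*a* + h2*b* + h3*c*; at Bragg reflections h1,h2,h3 are the hkl."""
    return h1 * B[0] + h2 * B[1] + h3 * B[2]

# |H| = 2 sin(theta)/lambda; for a cubic (200) reflection |H| = 2/a.
assert np.isclose(np.linalg.norm(H_vector(2, 0, 0)), 2 / 3.6)
```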
The choice between x-ray and neutron methods often depends upon which one gives the best contrast between the constituent elements and which one allows the greatest control over contrast: isotopic substitution for contrast change with neutrons (Cenedese et al., 1984) or resonant (synonymous with dispersion and anomalous) x-ray techniques with x-ray energies near absorption edge energies (Ice et al., 1994). In general, x-ray scattering is favored by virtue of its better intensity and collimation, which allow for smaller samples and better momentum-transfer resolution. Neutron diffuse scattering has the advantage of discriminating against thermal diffuse scattering (TDS); there is a significant change in energy (wavelength) when neutrons are scattered by phonons (see PHONON STUDIES). For example, phonons in the few-tens-of-millielectron-volt energy range make an insignificant change in the energy of kiloelectron volt x rays but make a significant change in the 35-meV energy of thermal neutrons, except near Bragg reflections. Thus, TDS of neutrons is easily rejected with crystal diffraction or time-of-flight techniques, even at temperatures approaching and exceeding the Debye temperature. With x rays, however, TDS can be a major contribution and can obscure the Laue scattering. Magnetic short-range order can also be studied with neutrons in much the same way as the chemical short-range order. However, when the alloy is magnetic, extra effort is needed to separate the magnetic scattering from the nuclear scattering that gives the information on chemical pair correlations. Neutrons can be used in combination with x rays to obtain additional scattering contrast. The x-ray scattering factors increase with the atomic number of the elements (since they increase with the number of electrons), but neutron scattering cross-sections are not correlated with the atomic number, as neutrons are scattered from the nucleus.
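The energy-discrimination argument above is simple arithmetic; a short sketch using the numbers quoted in the text (a ~25-meV phonon, kiloelectron-volt x rays, 35-meV thermal neutrons):

```python
phonon_ev = 0.025          # ~25-meV phonon
xray_ev = 10_000.0         # 10-keV x ray
neutron_ev = 0.035         # 35-meV thermal neutron

# Fractional energy change of the probe on exchanging one phonon:
xray_fraction = phonon_ev / xray_ev        # ~2.5e-6: unresolvably small
neutron_fraction = phonon_ev / neutron_ev  # ~0.7: easily resolved

# This is why TDS can be filtered from neutron data (crystal analyzers or
# time-of-flight) but rides on top of the x-ray diffuse scattering.
assert xray_fraction < 1e-5 < neutron_fraction
```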
When an absorption edge of one of the elements is either too high for available x-ray energies or too low in energy for the reciprocal space of interest, or if enough different isotopes of the atomic species making up the alloy are not available or are too expensive, then a combination of both x-ray and neutron diffuse scattering measurements may be a way to obtain the needed contrast. A more in-depth discussion of the properties of neutrons is given in Chapter 13. Information on chemical short-range order obtained with both x rays and neutrons, and a general discussion of their merits, is given by Kostorz (1996). Local chemical order among the atoms, including vacancies, has been measured for 70 metallic binary alloys and a few oxides. Only two ternary metallic systems have been measured in which the three independent pair probabilities between the three kinds of atoms have been recovered: an alloy of Cr21Fe56Ni23 with three different isotopic contents studied with neutron diffuse scattering measurements (Cenedese et al., 1984) and an alloy Cu47Ni29Zn34 studied with three x-ray energies (Hashimoto et al., 1985).

Bond Distances

In binary alloys, recovery of the bond distances between A-A, A-B, and B-B pairs requires measurement of the
diffuse intensity at two different scattering contrasts to separate the A-A and B-B bond distances (as shown later, the A-B bond distance can be expressed as a linear combination of the other two distances to conserve volume). Such measurements require two matched samples of differing isotopic content to develop neutron contrast, whereas for x rays, the x-ray energy need only be changed a fraction of a kiloelectron volt near an absorption edge to produce a significant contrast change (Ice et al., 1994). Of course, one of the elemental constituents of the sample needs to have an absorption edge energy above 5 keV to obtain sufficient momentum transfer for a full separation of the thermal contribution. However, more limited momentum-transfer data can be valuable for certain applications. Thus, binary solid solutions where both elements have atomic numbers lower than that of Cr (Z = 24) may require both an x-ray and a neutron measurement of the diffuse intensity to obtain data of sufficient contrast and momentum transfer. To date, there have been about 10 publications on the recovery of bond distances. Most have employed diffuse x-ray scattering measurements, but some employed neutrons and isotopically substituted samples (Müller et al., 1989).

Diffuse X-ray (Neutron) Scattering Theory for Crystalline Solid Solutions

In the kinematic approximation, the elastically scattered x-ray (neutron) intensity in electron units per atom from an ensemble of atoms is given by
I(H)_{Total} = \left| \sum_p f_p e^{2\pi i \mathbf{H}\cdot\mathbf{r}_p} \right|^2 = \sum_p \sum_q f_p f_q^{*} e^{2\pi i \mathbf{H}\cdot(\mathbf{r}_p - \mathbf{r}_q)}    (1)

Here f_p and f_q^{*} denote the complex and complex-conjugate x-ray atomic scattering factors (or neutron scattering lengths), p and q designate the lattice sites from 0 to N − 1, r_p and r_q are the atomic position vectors for those sites, and H is the momentum transfer or reciprocal-lattice vector, |H| = (2 sin θ)/λ (Fig. 3). For crystalline solid solutions with a well-defined average lattice (sharp Bragg reflections), the atom positions can be represented by r = R + δ, where R is determined from the lattice constants and δ is both the thermal and static displacement of the atom from that average lattice. Equation 1 can be separated into terms of the average lattice R and the local fluctuations δ:

I(H)_{Total} = \sum_p \sum_q f_p f_q^{*} e^{2\pi i \mathbf{H}\cdot(\boldsymbol{\delta}_p - \boldsymbol{\delta}_q)} e^{2\pi i \mathbf{H}\cdot(\mathbf{R}_p - \mathbf{R}_q)}    (2)
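The two forms of Equation 1 are numerically identical, which can be checked by brute force on a toy one-dimensional "crystal" (all numbers below are invented for illustration; f is real here, so f* = f):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1D crystal: N sites, random A/B occupation, small static displacements.
# N, the scattering factors, and the displacement scale are all invented.
N = 64
a = 1.0                                               # lattice parameter
f = np.where(rng.random(N) < 0.5, 29.0, 79.0)         # Cu-like / Au-like f's
r = np.arange(N) * a + 0.02 * rng.standard_normal(N)  # r = R + delta

def intensity_amplitude(h):
    """Left form of Equation 1: I(H) = |sum_p f_p exp(2*pi*i*H*r_p)|^2."""
    amp = np.sum(f * np.exp(2j * np.pi * h * r))
    return abs(amp) ** 2

def intensity_double_sum(h):
    """Right form: I(H) = sum_p sum_q f_p f_q* exp(2*pi*i*H*(r_p - r_q))."""
    phase = np.exp(2j * np.pi * h * (r[:, None] - r[None, :]))
    return np.sum(f[:, None] * f[None, :] * phase).real

# The two forms agree at any momentum transfer H:
for h in (0.13, 0.5, 1.0):
    assert np.isclose(intensity_amplitude(h), intensity_double_sum(h))
```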
We limit our discussion of the diffraction theory to crystalline binary alloys of A and B atoms with atomic concentrations C_A and C_B, respectively, and with complex x-ray atomic scattering factors f_A and f_B (neutron scattering lengths b_A and b_B). Since an x-ray or neutron beam of even a millimeter diameter intercepts >10^20 atoms, the double sum in Equation 2 involves >10^21 first-neighbor atom pairs (one at p, the other at q); the sum over all the atoms is a statistical description, which includes all possible atom pairs that can be formed (i.e., A-A, A-B, B-A, B-B;
Warren, 1969). A preference for like or unlike neighboring pairs is introduced by the conditional-probability term P^AB_pq. This term is defined as the probability for finding a B atom at site q after having found an A atom at site p (Cowley, 1950). The probability for A-B pairs is C_A P^AB_pq, which must equal C_B P^BA_pq, the number of B-A pairs. Also, P^BB_pq = 1 − P^BA_pq, P^AA_pq = 1 − P^AB_pq, and C_A + C_B = 1. The Warren-Cowley definition of the short-range-order (SRO) parameter (Cowley, 1950) is α_pq ≡ 1 − P^AB_pq/C_B. Spatial and time averages taken over the chemically distinct A-A, A-B, or B-B pairs with relative atom positions p − q produce the total elastically and quasielastically (thermally) scattered intensity in electron units for a crystalline solid solution of two components as

I(H)_{Total} = \sum_p \sum_q \Big[ (C_A^2 + C_A C_B \alpha_{pq}) |f_A|^2 \langle e^{2\pi i\mathbf{H}\cdot(\boldsymbol{\delta}_p-\boldsymbol{\delta}_q)}\rangle_{AA} + C_A C_B (1-\alpha_{pq}) f_A f_B^{*} \big( \langle e^{2\pi i\mathbf{H}\cdot(\boldsymbol{\delta}_p-\boldsymbol{\delta}_q)}\rangle_{AB} + \langle e^{2\pi i\mathbf{H}\cdot(\boldsymbol{\delta}_p-\boldsymbol{\delta}_q)}\rangle_{BA} \big) + (C_B^2 + C_A C_B \alpha_{pq}) |f_B|^2 \langle e^{2\pi i\mathbf{H}\cdot(\boldsymbol{\delta}_p-\boldsymbol{\delta}_q)}\rangle_{BB} \Big] e^{2\pi i\mathbf{H}\cdot(\mathbf{R}_p-\mathbf{R}_q)}    (3)
where |f_A| and |f_B| denote the absolute values, or moduli, of the complex amplitudes. From the theoretical development given in Appendix A, a complete description of the diffusely distributed intensity through the second moment of the displacements is given as

I(H)_{Diffuse}/N = I(H)_{SRO}/N + I(H)_{j=1}/N + I(H)_{j=2}/N    (4)

where

I(H)_{SRO}/N = C_A C_B |f_A - f_B|^2 \sum_{lmn} \alpha_{lmn} \cos \pi (h_1 l + h_2 m + h_3 n)    (5)

I(H)_{j=1}/N = \mathrm{Re}\{f_A^{*}(f_A - f_B)\} [h_1 Q_x^{AA} + h_2 Q_y^{AA} + h_3 Q_z^{AA}] + \mathrm{Re}\{f_B^{*}(f_A - f_B)\} [h_1 Q_x^{BB} + h_2 Q_y^{BB} + h_3 Q_z^{BB}]    (6)

and

I(H)_{j=2}/N = |f_A|^2 [h_1^2 R_X^{AA} + h_2^2 R_Y^{AA} + h_3^2 R_Z^{AA}] + f_A f_B^{*} [h_1^2 R_X^{AB} + h_2^2 R_Y^{AB} + h_3^2 R_Z^{AB}] + |f_B|^2 [h_1^2 R_X^{BB} + h_2^2 R_Y^{BB} + h_3^2 R_Z^{BB}] + |f_A|^2 [h_1 h_2 S_{XY}^{AA} + h_1 h_3 S_{XZ}^{AA} + h_2 h_3 S_{YZ}^{AA}] + f_A f_B^{*} [h_1 h_2 S_{XY}^{AB} + h_1 h_3 S_{XZ}^{AB} + h_2 h_3 S_{YZ}^{AB}] + |f_B|^2 [h_1 h_2 S_{XY}^{BB} + h_1 h_3 S_{XZ}^{BB} + h_2 h_3 S_{YZ}^{BB}]    (7)
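Equation 5 is a plain cosine series over coordination shells. The sketch below evaluates it for a bcc A50B50 alloy with invented α values (shell coordinates lmn in units of a/2, summed over the ± equivalents of each shell); the sign of the first-shell α moves the diffuse maximum between the fundamental and superlattice positions, as in Figure 1:

```python
from math import cos, pi

def sro_intensity(h, alphas, c_a=0.5, df2=1.0):
    """I_SRO(H)/N of Equation 5: C_A*C_B*|f_A - f_B|^2 *
    sum_lmn alpha_lmn * cos(pi*(h1*l + h2*m + h3*n)),
    summed over the +/- equivalents of each shell (bcc, lmn in units of a/2).
    df2 stands in for |f_A - f_B|^2."""
    h1, h2, h3 = h
    total = 0.0
    for (l, m, n), alpha in alphas.items():
        for ls in {l, -l}:
            for ms in {m, -m}:
                for ns in {n, -n}:
                    total += alpha * cos(pi * (h1 * ls + h2 * ms + h3 * ns))
    return c_a * (1.0 - c_a) * df2 * total

# Invented alphas: alpha_000 = 1 always; alpha_111 < 0 means an unlike-pair
# preference (short-range order), alpha_111 > 0 means clustering.
ordering = {(0, 0, 0): 1.0, (1, 1, 1): -0.2}
clustering = {(0, 0, 0): 1.0, (1, 1, 1): +0.2}

# Ordering peaks at the (100) superlattice position ...
assert sro_intensity((1, 0, 0), ordering) > sro_intensity((0, 0, 0), ordering)
# ... clustering peaks at the fundamental positions (the origin).
assert sro_intensity((0, 0, 0), clustering) > sro_intensity((1, 0, 0), clustering)
```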
Here the individual terms are defined in Appendix A. As illustrated in Equations 4, 5, 6, and 7, local chemical order (the Warren-Cowley α's) can be recovered from a crystalline binary alloy with a single contrast measurement of the diffuse scattering distribution, provided the displacement contributions are negligible. This was the early practice until a series of papers used symmetry relationships among the various terms to remove the I(H)_{j=1} term in two dimensions (Borie and Sparks, 1964), in three dimensions (Sparks and Borie, 1965), and to second moment in all three dimensions: I(H)_{SRO}, I(H)_{j=1}, I(H)_{j=2} (Borie and Sparks, 1971), henceforth referred to as BS. The major assumption of the BS method is that the x-ray atomic scattering factor terms |f_A − f_B|^2, Re[f_A^*(f_A − f_B)], Re[f_B^*(f_A − f_B)], |f_A|^2, |f_B|^2, and f_A f_B^* of Equations 4, 5, 6, and 7 have a similar H dependence, so that a single divisor renders them independent of H. With this assumption, the diffuse intensity can be written as a sum of periodic functions given by Equation 34. For neutron nuclear scattering, this assumption is excellent; neutron nuclear scattering cross-sections are independent of H, and in addition, the TDS terms C and D can be filtered out. Even with x rays, the BS assumption is generally a good approximation. For example, as shown in Figure 4 for Mo Kα x rays, and even with widely separated elements such as Au-Cu, a judicious choice of the divisor allows the BS method to be applied as a first approximation over a large range in momentum transfer. In this case, division by f_Au^*(f_Au − f_Cu) = f_Au^* Δf would be a better choice, since the Au atom is the major scatterer. Iterative techniques to further extend the BS method have not been fully explored. This variation in the scattering factor terms with H has been proposed as a means to recover the individual pair displacements (Georgopoulos and Cohen, 1977).

Figure 4. Variation in the ratio of the x-ray atomic scattering factor terms as a function of H. The divisor ⟨f⟩² = |C_Cu f_Cu + C_Au f_Au|² was chosen to reduce the H dependence of all the terms for an incident energy of Mo Kα = 17.48 keV. The relatively larger x-ray atomic scattering factor of Au (f_Au = 79 versus f_Cu = 29 at H = 0) would require a divisor more heavily weighted with f_Au, such as |f_Au|², to reduce the H dependence of those terms.

Equations 5, 6, and 7 are derived from the terms first given by BS, but with notation similar to that used by Georgopoulos and Cohen (1977). There are 25 Fourier series in Equations 5, 6, and 7. For a cubic system with centrosymmetric sites, if we know Q_X^AA, then we know Q_Y^AA and
Q_Z^AA. Similarly, if we know Q_X^BB, R_X^AA, R_X^BB, R_X^AB, S_XY^AA, S_XY^BB, and S_XY^AB, then we know all the Q, R, and S parameters. Thus, with the addition of the α series, there are nine separate Fourier series for cubic scattering to second order. As described in Appendix A (Derivation of the Diffuse Intensity), the nine distinct correlation terms from the α, Q, R, and S series can be grouped into four unique H-dependent functions, A, B, C, and D, within the BS approximation. By following the operations given by BS, we are able to recover these unique H-dependent functions and, from these, the nine distinct correlation terms. For a binary cubic alloy, one x-ray map is sufficient to recover A, B, C, and D and, from A(h1 h2 h3), the Warren-Cowley α's. Measurements at two x-ray energies with sufficient contrast are required to separate the A-A and B-B pair contributions to the B(h1 h2 h3) terms, and three x-ray energies for the A-A, A-B, and B-B displacements given in Equation 29 and contained in the terms C and D of Equation 34.

In an effort to overcome the assumption of H independence for the x-ray atomic scattering factor terms and to use that information to separate the different pair contributions, Georgopoulos and Cohen (1977), henceforth GC, included the variation of the x-ray scattering factors in a large least-squares program. Based on a suggestion by Tibballs (1975), GC used the H dependence of the three different x-ray scattering factor terms to separate the first moment of the displacements for the A-A, A-B, and B-B pairs. Results from GC's error analysis (which included statistical, roundoff, x-ray scattering factor, sample roughness, and extraneous background errors) showed that errors in the x-ray atomic scattering factors had the largest effect, particularly on the Q terms. They concluded, based on an error analysis of the BS method by Gragg et al. (1973), that the errors in the GC method were no worse than for the BS method and provided correct directions for the first-moment displacements. Improvements in the GC method, with the use of Mo Kα x rays to obtain more data and the use of a Householder transformation to avoid matrix inversion and stabilization with ridge-regression techniques, still resulted in unacceptably large errors in the values of the R and S parameters (Wu et al., 1983). To date, there have been no reported values for the terms R and S that are deemed reliable. However, the Warren-Cowley α's are found to have typical errors of 10% or less for binary alloys with a preference for unlike first-neighbor pairs with either the BS or GC analysis. For clustering systems, the BS method was reported to give large errors of 20% to 50% of the recovered α's (Gragg et al., 1973). Smaller errors were reported on the α's for clustering systems with the GC method (Wu et al., 1983). With increasing experience and better data from intense synchrotron sources, errors will be reduced for both the BS and GC methods.

Another methodology to recover the pair correlation parameters uses selectable x-ray energies (Ice et al., 1992). Of most practical interest are the α's and the first moment of the static displacements as given in Equations 27 and 28. When alloys contain elements that are near one another in the periodic table, the scattering factor term f_A − f_B can be made nearly zero by proper choice of an x-ray energy near an x-ray absorption edge. In this way,
Figure 5. For elements nearby in the periodic table, x-ray energies can be chosen to obtain near null Laue scattering to separate intensity arising from quadratic and higher moments in atomic displacements. Courtesy of Reinhard et al. (1992).
the intensities expressed in Equations 27 and 28 are made nearly zero, and only the intensity associated with Equation 29 remains. Then the term I(H)_{j=2} can be measured, scaled to diffuse scattering measurements made at other x-ray energies (which emphasize the contrast between the A and B atoms), and subtracted off. This leaves only the I(H)_SRO term, Equation 5, and the first moment of the static displacements, I(H)_{j=1}, Equation 6. Shown in Figure 5 are the values of |f_Fe − f_Cr|² for x-ray energies selected for maximum contrast at 20 eV below the Fe K and Cr K edges. The near-null Laue energy, or energy of minimum contrast, was 7.6 keV. The major assumption in this null Laue or 3λ method is that the I(H)_{j=2} and higher-moment terms scale with x-ray energy as |C_A f_A + C_B f_B|², which implies that the A and B atoms have the same second- and higher-moment displacements or that the different elements have the same x-ray atomic scattering factors. This assumption is most valid for alloys of elements with similar atomic numbers, which have similar masses (similar thermal motion), atom sizes (small static displacements), and numbers of electrons (similar x-ray scattering factors). This 3λ method has been used to analyze four different alloys: Fe22.5Ni77.5 (Ice et al., 1992), Cr47Fe53 (Reinhard et al., 1992), Cr20Ni80 (Schönfeld et al., 1994), and Fe46.5Ni53.5 and recalculated Fe22.5Ni77.5 (Jiang et al., 1996). An improvement in the null Laue method by Jiang et al. (1996) removed an iteration procedure needed to account for the residuals left because f_A − f_B was not strictly zero over the measured volume. The same techniques used for x-ray diffuse scattering analysis can also be applied to neutron scattering measurements. Neutrons have the advantage (and complication) of being sensitive to magnetic order, as described in Appendix B.
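The contrast-tuning step of the 3λ method amounts to searching for the x-ray energy that minimizes the Laue contrast |f_A − f_B|². In the sketch below, the f′ values are invented placeholders, not tabulated resonant corrections; only the search logic is the point:

```python
import numpy as np

# Candidate energies (keV) and hypothetical anomalous corrections f'(E)
# for an Fe-Cr alloy; all numbers are illustrative only.
energies = np.array([6.0, 6.5, 7.0, 7.6, 8.0])
fp_cr = np.array([-7.0, -3.0, -2.3, -1.9, -1.7])   # f'_Cr(E), made up
fp_fe = np.array([-1.6, -1.9, -2.8, -3.9, -2.5])   # f'_Fe(E), made up

Z_cr, Z_fe = 24.0, 26.0   # f0 at H = 0 approximated by Z

# Laue contrast |f_Fe - f_Cr|^2 at each candidate energy:
contrast = ((Z_fe + fp_fe) - (Z_cr + fp_cr)) ** 2

# The null-Laue energy is the energy of minimum contrast; with these
# made-up f' curves it falls at 7.6 keV, the Fe-Cr value quoted in the text.
e_null = energies[np.argmin(contrast)]
```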
This sensitivity to magnetic order allows neutron measurements to detect and quantify local magnetic ordering but complicates analysis of chemical ordering. Error analysis of the null Laue method has been given by Jiang et al. (1995) and by Ice et al. (1998). The statistical uncertainties of the recovered parameters can be estimated by propagating the standard deviation √n of the total number of counts n for each data point through the
Table 1. Contributions to the Uncertainties in the Short-Range-Order Parameter α_lmn of Fe46.5Ni53.5 (1σ)^a

lmn   α_lmn (σ_Total)   σ(√n)    σ(f′) 0.2 eu   σ(P0) 1%   σ(RRS) 1 eu   σ_Compton   σ(C_A) 0.3 at.%
000   1.0000 (100)      0.0024   0              0          0             0           0
110   0.0766 (54)       0.0018   0.0010         0.0048     0             0.0006      0.0011
200   0.0646 (28)       0.0017   0.0003         0.0016     0.0008        0.0013      0.0003
211   0.0022 (15)       0.0014   0              0.0004     0.0001        0.0002      0.0001
220   0.0037 (14)       0.0013   0.0002         0.0003     0.0003        0.0003      0.0001
310   0.0100 (11)       0.0011   0.0001         0.0002     0.0001        0.0001      0.0001
222   0.0037 (12)       0.0011   0              0.0002     0.0002        0.0003      0
321   0.0032 (19)       0.0009   0              0.0001     0.0001        0.0001      0.0001
400   0.0071 (12)       0.0011   0.0002         0.0001     0.0003        0.0004      0
330   0.0021 (9)        0.0008   0.0001         0          0.0003        0.0001      0
411   0.0007 (7)        0.0007   0              0          0             0.0002      0
420   0.0012 (8)        0.0007   0.0002         0          0.0004        0.0001      0
332   0.0007 (7)        0.0007   0              0          0             0.0001      0

^a For statistical and possible systematic errors associated with counting statistics n, the real part of the resonant x-ray scattering factor f′, the scaling parameter P0 to absolute intensities, inelastic resonant Raman scattering (RRS) and Compton contributions, and concentration C_A. Total error is shown in parentheses, and 0 indicates uncertainties < 0.00005 Å.
nonlinear least-squares processing of the data. Systematic errors can be determined by changing the values of input variables such as the x-ray atomic scattering factors, backgrounds, and composition; the data are then reprocessed and the recovered parameters compared. Because the measured pair correlation coefficients are very sensitive to the relative, and to a lesser degree the absolute, intensity calibration of data sets collected with varying scattering contrast, the addition of constraints greatly increases reliability and reduces uncertainties. For example, the uncertainty in recovered parameters due to scaling of the measured scattering intensities is determined as input parameters are varied. Each time, the intensities are rescaled so that the I_SRO values are everywhere positive and match values at the origin of reciprocal space measured by small-angle scattering. The integrated Laue scattering over a repeat volume in reciprocal space is also constrained to have an average value of C_A C_B |f_A − f_B|² (i.e., α_000 = 1). These two constraints eliminate most of the systematic errors associated with converting the raw intensities into absolute units (Sparks et al., 1994). The intensities measured at three different energies are adjusted to within 1% on a relative scale, and the intensity at the origin is matched to measured values. For these reasons, the systematic errors for α_000 are estimated at 1%. For the null Laue method, errors on the recovered α's and X's arising from statistical and various possible systematic errors in the measurement and analysis of diffuse scattering data are given in Tables 1 and 2 for the Fe46.5Ni53.5 alloy (Jiang et al., 1995; Ice et al., 1998). Details of the conversion to absolute intensity units are given
Table 2. Standard Deviation (1σ) of the x, y, and z Components of the Pair Fe-Fe Displacements δ^Fe-Fe a

lmn   X (σ_Total) (Å)   σ(√n)    σ(f′) 0.2 eu   σ(P0) 1%   σ(RRS) 1 eu   σ_Compton   σ(C_A) 0.3 at.%
110   0.0211 (25)       0.0002   0.0023         0.0007     0.0002        0.0004      0.0004
200   0.0228 (14)       0.0004   0.0010         0.0007     0.0002        0.0004      0.0002
211   0.0005 (2)        0.0002   0              0.0001     0.0001        0           0
121   0.0014 (4)        0.0001   0.0003         0.0001     0.0002        0           0
220   0.0030 (7)        0.0002   0.0006         0.0001     0.0003        0.0001      0
310   0.0022 (3)        0.0002   0.0001         0.0001     0.0002        0.0001      0
130   0.0009 (2)        0.0002   0.0001         0          0.0001        0           0
222   0.0003 (3)        0.0002   0.0002         0          0.0001        0           0
321   0.0011 (2)        0.0001   0.0001         0          0.0002        0           0
231   0.0001 (1)        0.0001   0              0          0.0001        0           0
123   0.0008 (4)        0.0001   0.0001         0          0             0           0
400   0.0019 (6)        0.0004   0.0002         0.0001     0.0003        0.0001      0
330   0.0011 (4)        0.0002   0.0001         0          0.0003        0           0
411   0.0008 (3)        0.0002   0.0002         0          0.0002        0           0
141   0.0001 (2)        0.0001   0.0001         0          0.0001        0           0
a For the various atom pairs of Fe46.5Ni53.5 for statistical and possible systematic errors described in the text. Total error (Å) is shown in parentheses; 0 indicates an uncertainty below the last digit shown.

This quantity represents the average increase in the moment of the atom at the origin (000) due to the atomic species of the atom located at site lmn. While the terms I_M(H) and I_NM(H) allow one to study the magnetic short-range order in the alloy, they also complicate the data analysis by making it difficult to separate these two terms from the chemical SRO. One experimental method for resolving the magnetic components is to use polarization analysis, where the moment of the incident neutron beam is polarized to be either parallel (ε = +1) or antiparallel (ε = −1) to the magnetization. The total scattering for each case can now be written as

I_ε(H) = I_N(H) + ε I_NM(H) + I_M(H)   (40)

The intensity difference between the two polarization states gives

I_+1(H) − I_−1(H) = 2 I_NM(H)   (41)

and the sum gives

Σ_ε I_ε(H) = 2 I_N(H) + 2 I_M(H)   (42)
If I_N(H) is known from a separate measurement with x rays, all three components of the scattering can be separated from one another. One of the greatest difficulties in studying magnetic short-range order comes when the moments of the atoms cannot be aligned in the same direction with, for example, an external magnetic field. In the above analysis, it was assumed that the moments are perpendicular to the scattering vector, H. The magnetic scattering cross-section is reduced by the sine of the angle between the magnetic moment and the scattering vector. Thus, if the magnetization is not perpendicular to the scattering vector, the moments on the atoms must be reduced by the appropriate amount. When the spins are not aligned, the sine of the angle between the moment and the scattering vector for each individual atom must be considered. In this case, it becomes necessary to construct computer models of the spin structure to extract M(H) and T(H). More in-depth discussion is given in MAGNETIC NEUTRON SCATTERING.

GENE E. ICE
JAMES L. ROBERTSON
CULLIE J. SPARKS
Oak Ridge National Laboratory
Oak Ridge, Tennessee
RESONANT SCATTERING TECHNIQUES

INTRODUCTION

This unit will describe the principles and methods of resonant (anomalous) x-ray diffraction as it is used to obtain information about the roles of symmetry and bonding on the electronic structure of selected atoms in a crystal. These effects manifest themselves as crystal orientation-dependent changes in the diffracted signal when the x-ray energy is tuned through the absorption edges for those atoms. Applications of the method have demonstrated it is useful in: (1) providing site-specific electronic structure information in a solid containing the same atom in several different environments; (2) determining the positions of specific atoms in a crystal structure; (3) distinguishing electronic from magnetic contributions to diffraction; (4) isolating and characterizing multipole contributions to the x-ray scattering that are sensitive to the local environment of atoms in the solid; and (5) studying the ordering of 3d electron orbitals in the transition metal oxides.

These effects share a common origin with x-ray resonant magnetic scattering. Both are examples of anomalous dispersion, both are strongest near the absorption edge of the atoms involved, and both display interesting dependence on x-ray polarization. Resonant electronic scattering is a microscopic analog of the optical activity familiar at longer wavelengths. Both manifest as a polarization-dependent response of x-ray or visible light propagation through anisotropic media. The former depends on the local environment (molecular point symmetry) of individual atoms in the crystalline unit cell, while the latter is determined by the point
symmetry of the crystal as a whole (Belyakov and Dmitrienko, 1989, Section 2). This analogy led to a description of ''polarized anomalous scattering'' with optical tensors assigned to individual atoms (Templeton and Templeton, 1982). These tensors represent the anisotropic response when scattering involves electronic excitations between atomic core and valence states that are influenced by the point symmetry at the atomic site. In the simplest cases (dipole-dipole), these tensors may be visualized as ellipsoidal distortions of the otherwise isotropic x-ray form factor. Symmetry operations of the crystal space group that affect ellipse orientation can modify the structure factors, leading to new reflections and angle dependencies.

Materials Properties that Can Be Measured

Resonant x-ray diffraction is used to measure the electronic structure of selected atoms in a crystal and for the direct determination of the phases of structure factors. Results to date have been obtained in crystalline materials possessing atoms with x-ray absorption edges in an energy range compatible with Bragg diffraction. As shown by examples below (see Principles of the Method), resonant diffraction is a uniquely selective spectroscopy, combining element specificity (by selecting the absorption edge) with site selectivity (through the choice of reflection). The method is sensitive to the angular distribution of empty states near the Fermi energy in a solid. Beyond distinguishing common species at sites where oxidation and/or coordination may differ, it is sensitive to the orientation of molecules, to deviations from high symmetry, to the influence of bonding, and to the orientation (spatial distribution) of d electron orbital moments when these valence electrons are ordered in the lattice (Elfimov et al., 1999; Zimmermann et al., 1999). The second category of measurement takes advantage of differences in the scattering from crystallographically equivalent atoms.
Because of its sensitivity to molecular orientation, resonant scattering can lead to diffraction at reflections ''forbidden'' by screw-axis and glide-plane crystal symmetries (SYMMETRY IN CRYSTALLOGRAPHY). Such reflections can display a varying intensity as the crystal is rotated in azimuth about the reflection vector. This results from a phase variation in the scattering amplitude that adds to the position-dependent phase associated with each atom. This effect can lead to the determination of the position-dependent phase in a manner analogous to the multiple-wavelength anomalous diffraction (MAD) method of phase determination. Secondary applications involve the formulation of corrections used to analyze anomalous scattering data. This includes accounting for orientation dependence in MAD measurements (Fanchon and Hendrickson, 1990) and for the effects of birefringence in measurements of radial distribution functions of amorphous materials (Templeton and Templeton, 1989).

Comparison with Other Methods

The method is complementary to standard x-ray diffraction because it is site selective and can provide direct information on the phase of structure factors (Templeton and Templeton, 1992). A synchrotron radiation source is needed to provide the required intense, tunable, high-energy x-rays. Highly polarized beams are useful but not required. Perhaps the technique most closely related to resonant diffraction is angle-dependent x-ray absorption. However, as pointed out by Brouder (1990), except in special cases the absorption cross-section reflects the full crystal (point) symmetry instead of the local symmetry at individual sites in the crystal. In diffraction, the position-dependent phase associated with each atom permits a higher level of selectivity. This added sensitivity is dramatically illustrated by (but not limited to) reflections that violate the usual extinction rules. Resonant diffraction thus offers a distinct advantage over spectroscopy methods that provide an average picture of the material. It offers site specificity, even for atoms of a common species, through selection of the diffraction and/or polarization conditions. When compared to electron probe spectroscopies, the method has sensitivity extending many microns into the material, and does not require special sample preparation or vacuum chambers. Drawbacks to the method include the fact that signal sizes are small when compared to standard x-ray diffraction, and that the experimental setup requires a good deal of care, including the use of a tunable monochromator and in some cases a polarization analyzer.

General Scope

This unit describes resonant x-ray scattering by presenting its principles, illustrating them with several examples, providing background on experimental methods, and discussing the analysis of data. The Principles of the Method section presents the classical picture of x-ray scattering, including anomalous dispersion, followed by a quantum-mechanical description of the scattering amplitudes, and ends with a discussion of techniques of symmetry analysis useful in designing experiments.
Two examples from the literature are used to illustrate the technique. This is followed by a discussion of experimental aspects of the method, including the principles of x-ray polarization analysis. A list of criteria is given for selecting systems amenable to the technique, along with an outline for experimental design. A general background on data collection and analysis includes references to sources of computational tools and expertise in the field.
PRINCIPLES OF THE METHOD

Resonant x-ray diffraction is used to refine information on the location of atoms and the influence of symmetry and bonding for specific atoms in a known crystal structure. Measurements are performed in a narrow range of energy around the absorption edges for the atoms of interest. The energy and crystal structure parameters determine the range of reflections (regions in reciprocal space) accessible to measurement. The point symmetry at the atom location selects the nature (i.e., the multipole character) of the resonance-scattering amplitude for that site. This amplitude is described by a symmetric tensor, and the structure factor is constructed by summing these tensors, along with the usual position-dependent phase factor, for each site in the unit cell. The tensor structure factor is sensitive to the nonisotropic nature of the scattering at each site, and this leads to several important phenomena: (1) the usual extinction rules that limit reflections (based on an isotropic response) can be violated; (2) the x-ray polarization may be modified by the scattering process; (3) the intensity of reflections may vary in a characteristic way as the crystal is rotated in azimuth (maintaining the Bragg condition) about the reflection vector; and (4) these behaviors may have a strong dependence on x-ray energy near the absorption edge.

Theory

The classical theory of x-ray scattering, carefully developed in James' book on diffraction (James, 1965), makes clear connections between the classical and quantum-mechanical descriptions of anomalous scattering. The full description of resonant (anomalous) scattering requires a quantum-mechanical derivation beginning with the interaction between electrons and the quantized electromagnetic field, and yielding the Thomson and resonant electronic scattering cross-sections. An excellent modern exposition is given by Blume (1985). Several good discussions of the influence of symmetry in anomalous scattering, including an optical model of the anisotropy of x-ray susceptibility, are given by Templeton (1991), Belyakov and Dmitrienko (1989), and Kirfel (1994). The quantum-mechanical description of symmetry-related effects in anisotropic scattering is included in the work by Blume (1994) on magnetic effects in anomalous dispersion. This has been extended in a sophisticated, general group-theoretical approach by Carra and Thole (1994). We describe x-ray diffraction by first considering the atomic scattering factor in classical terms and the corrections associated with anomalous scattering. This is followed by the quantum description, and finally the structure factor for resonant diffraction is calculated.

Classical Picture

The scattering of x rays from a single free electron is given by the Thomson scattering cross-section; the geometry is illustrated in Figure 1. The oscillating electric field (E) of the incident x ray, with wave vector k and frequency ω, causes a sinusoidal acceleration of the electron (with charge e) along the field direction. This produces a time-dependent dipole moment of magnitude e²E/(mω²), which radiates an electromagnetic field polarized so that no radiation is emitted along the direction E. This polarization effect (succinctly given in terms of the incident and scattered polarization directions as ε·ε′) yields a sin φ dependence, where φ is the angle between the incident polarization and the scattered wave vector k′. The Thomson scattered radiation is 180° out of phase with the incident electric field, and has an intensity

I_Thomson = I_0 [(e²/mc²) sin φ/R]²   (1)

where the incident intensity I_0 = |E|², e and m are the electron charge and mass, c the speed of light, and R is the distance between the electron and the field point where scattered radiation is measured.

Figure 1. Geometry of x-ray scattering from a single electron. The notation is explained in the text.

When electrons are bound to an atom, the x-ray scattering is changed in several important ways. First, consider an electron cloud around the nucleus as shown in Figure 2. James treats the cloud as a continuous, spherically symmetric distribution with charge density ρ(r) that depends only on distance from the center. The amplitude for x rays scattered from an extended charge distribution into the k′ direction is calculated by integrating over the distribution, weighting each point with a phase factor that accounts for the path-length differences between points in the cloud. This is given by

f_0 = ∫ ρ(r) e^{iQ·r} dV   (2)

with Q the reflection vector of magnitude |Q| = 4π sin θ/λ, and 2θ and λ the scattering angle and wavelength, respectively. The atomic scattering form factor, f_0, defined by this integral, is equal to the total charge in the cloud when the phase factor is unity (in the forward scattering direction), and approaches zero for large 2θ. Values of f_0 versus sin θ/λ for most elements are provided in tables and by algebraic expressions (MacGillavry and Rieck, 1968; Su and Coppens, 1997). The form factor is given in electron units, where one electron is the scattering from a single electron.

Figure 2. The geometry used in the calculation of the atomic scattering form factor for x rays. The diffraction vector is also defined and illustrated.

An important assumption in this chapter is that the x-ray energy is unchanged by the scattering (i.e., elastic or coherent). Warren (1990) describes the relation between the total intensity scattered from an atom and the coherent scattering. He defines the ''modified'' (i.e., incoherent or Compton) intensity as the difference between the atomic number and the sum of squares of form factors, one for each electron in the atom.

A second effect of binding to the nucleus must be considered in describing x-ray scattering from atomic electrons. This force influences the response to the incident electric field. James gives the equation of motion for a bound electron as

d²x/dt² + κ dx/dt + ω_s² x = −(e/m)E e^{iωt}
(3)
where x, κ, and ω_s are, respectively, the electron position, a damping factor, and the natural oscillation frequency for the bound electron. Using the oscillating dipole model, he gives the scattered electric field at unit distance in the scattering plane (i.e., normal to the oscillator) as

A = −(e²/mc²)[ω²E/(ω_s² − ω² + iκω)]   (4)
The x-ray scattering factor, f, is obtained by dividing A by (e²/mc²)E, the scattering from a single electron. This factor is usually expressed in terms of real and imaginary components as

f = ω²/(ω² − ω_s² − iκω) ≡ f′ + if″   (5)
This expression, and its extension to the many-electron atom discussed by James, are the basis for a classical description of resonant x-ray scattering near absorption edges. The dispersion of x rays is the behavior of f as a function of ω (where the energy is ħω), and ''anomalous dispersion'' refers to the narrow region near resonance (ω ≈ ω_s) where the slope df/dω is positive. Templeton's review on anomalous scattering describes how f′ and f″ are related to refraction and absorption, and gives insight into the analytic connection between them through the Kramers-Kronig relation.

Returning to the description of scattering from all the electrons in an atom, it is important to recognize that the x-ray energies required in most diffraction measurements correspond to ω ≫ ω_s for all but the inner-shell electrons of most elements. For all but these few electrons, f is unity and their contribution to the scattering is well described by the atomic form factor. The scattering from a single atom is then given by

f = f_0 + f′ + if″   (6)
To this approximation, the dependence of scattering amplitude on scattering angle comes from two factors. The first is the polarization dependence of Thomson scattering, and the second is the result of interference because of the finite size of the atomic charge density, which affects f_0. This is the classical description, assuming the x-ray wavelength is larger than the spatial extent of the resonating electrons. We will find a very different situation with the quantum-mechanical description.
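The single-oscillator form of Equation 5 is easy to explore numerically. The sketch below is a minimal illustration (not from the original text; the values of ω_s and κ are arbitrary): it evaluates f = ω²/(ω² − ω_s² − iκω) and checks the limiting behaviors discussed above, namely that f → 1 far above resonance and that the absorptive part f″ peaks near ω = ω_s.

```python
# Classical bound-electron (single oscillator) scattering factor, Eq. (5):
#   f(w) = w^2 / (w^2 - ws^2 - i*kappa*w) = f' + i*f''
# ws (resonance frequency) and kappa (damping) are arbitrary illustrative values.

def f_oscillator(w, ws=1.0, kappa=0.01):
    return w**2 / (w**2 - ws**2 - 1j * kappa * w)

# Far above resonance the electron scatters as if free: f -> 1.
print(f_oscillator(100.0))

# The absorptive part f'' is largest near the resonance frequency ws.
grid = [0.5 + 0.001 * i for i in range(1001)]      # w from 0.5 to 1.5
w_peak = max(grid, key=lambda w: f_oscillator(w).imag)
print(w_peak)
```

Near ω_s the real part swings through the ''anomalous'' region while f″ traces a resonance peak whose width is set by κ; summing such oscillators, weighted by their strengths, gives the classical many-electron model of f′ and f″.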
Quantum Picture

The quantum-mechanical description of x-ray scattering given by Blume may be supplemented with the discussion by Sakurai (1967). Both authors start with the Hamiltonian describing the interaction between atomic electrons and a transversely polarized radiation field

H = Σ_j [p_j − (e/c)A(r_j)]²/(2m) + Σ_ij V(r_ij)   (7)

where p_j is the momentum operator and r_j is the position of the jth electron, A(r_j) is the electromagnetic vector potential at r_j, and V(r_ij) is the Coulomb potential for the electron at r_j due to all other charges at r_i. We have suppressed terms related to magnetic scattering and to the energy of the radiation field. Expanding the first term gives

H′ = Σ_j {[e²/(2mc²)]A(r_j)² − [e/(mc)]A(r_j)·p_j}   (8)
The interaction responsible for the scattering is illustrated in Figure 3, and described as a transition from an initial state |I⟩ = |a; k, ε⟩, with the atom in state a (with energy E_a) and a photon of wave vector k, polarization ε, and energy ħω_k, to a final state |F⟩, with the atom in state b (of energy E_b) and a photon of wave vector k′, polarization ε′, and energy ħω_k′. The transition probability per unit time is calculated using time-dependent perturbation theory (Fermi's golden rule) to second order in A (Sakurai, 1967, Sections 2-4, 2-5)

W = (2π/ħ) | [e²/(2mc²)] Σ_j ⟨F|A(r_j)·A(r_j)|I⟩ + [e/(mc)]² Σ_n ⟨F|Σ_j A(r_j)·p_j|n⟩⟨n|Σ_l A(r_l)·p_l|I⟩/(E_I − E_n) |² δ(E_I − E_F)   (9)

where the delta function insures that energy is conserved between the initial state of energy E_I = E_a + ħω_k and the final one with E_F = E_b + ħω_k′; |n⟩ and E_n are the wave functions and energies of intermediate (or virtual) states of the system during the process. To get the scattered intensity (or cross-section), W is multiplied by the density of allowed final states and divided by the incident intensity. Blume uses a standard expansion for the vector potential, and modifies the energy resonant denominator for processes that limit the scattering amplitude. Sakurai introduces the energy width parameter, Γ, to account for ''resonant fluorescence'' (Sakurai, 1967). In the x-ray case, the processes that limit scattering are both radiative (such as ordinary fluorescence) and nonradiative (like Auger emission). They are discussed by many authors (Almbladh and Hedin, 1983). When the solid is returned to the initial state (|a⟩), the differential scattering cross-section, near resonance, is

d²σ/(dΩ dE) = (e²/mc²)² | ⟨a|Σ_j e^{iQ·r_j}|a⟩ ε′·ε − (1/m) Σ_n ⟨a|Σ_j (ε′·p_j) e^{−ik′·r_j}|n⟩⟨n|Σ_l (ε·p_l) e^{ik·r_l}|a⟩ / (E_a − E_n + ħω_k − iΓ/2) |²   (10)

where Q = k − k′. It is useful to express the momentum operator in terms of a position operator using a commutation relation (Baym, 1973, Equations 13-96). The final expression, including an expansion of the complex exponentials, is given by Blume (1994) as

d²σ/(dΩ dE) = (e²/mc²)² | ⟨a|Σ_j e^{iQ·r_j}|a⟩ ε′·ε − (m/ħ²) Σ_n [(E_a − E_n)³/ħω] ⟨a|Σ_j (ε′·r_j)(1 − ik′·r_j/2)|n⟩⟨n|Σ_l (ε·r_l)(1 + ik·r_l/2)|a⟩ / (E_a − E_n + ħω_k − iΓ/2) |²   (11)
The amplitude, inside the squared brackets, should be compared to the classical scattering amplitude given in Equation 6. The first term is equivalent to the integral that gave the atomic form factor; the polarization dependence is that of Thomson scattering. The second term, the resonant amplitude, has the same form as Equation 5,

ω_k²/(ω_k² − ω_ca² − i(Γ/ħ)ω_k)   (12)

Figure 3. Schematic diagram illustrating the electronic transitions responsible for resonant scattering. An x-ray photon of wave vector k excites a core electron to an empty state near the Fermi level. The deexcitation releases a photon of wave vector k′.
Now, however, the ground-state expectation value, ⟨a| (transition operator) |a⟩, can have a complicated polarization and angular dependence. The transition operator is a tensor that increases in complexity as more terms in the expansion of the exponentials (Equation 10) are kept. Blume (1994) and Templeton (1998) identify the lowest-order term (keeping only the 1 yields a second-rank tensor) as a dipole excitation and dipole deexcitation. Next in complexity are the dipole-quadrupole terms (third-rank tensors); this effect has
been observed by Templeton and Templeton (1994). The next order (fourth-rank tensor) in scattering can be dipole-octupole or quadrupole-quadrupole; the last term has been measured by Finkelstein (Finkelstein et al., 1992). An important point is that different multipole contributions make the scattering experiment sensitive to different features of the resonating atoms involved. The atom site symmetry, the electronic structure of the outer orbitals, and the orientational relationship between symmetry-equivalent atoms combine to influence the scattering.
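The tensor ranks quoted above follow from angular momentum coupling: the expansion of the exponentials supplies dipole (l = 1), quadrupole (l = 2), and octupole (l = 3) pieces, and coupling an excitation multipole l with a deexcitation multipole l′ yields tensors of rank |l − l′| through l + l′. A short sketch of this triangle-rule bookkeeping (illustrative only; not from the original text):

```python
# Tensor ranks L produced by coupling an excitation multipole l with a
# deexcitation multipole lp (triangle rule of angular momentum addition).
def coupled_ranks(l, lp):
    return list(range(abs(l - lp), l + lp + 1))

print(coupled_ranks(1, 1))  # dipole-dipole: ranks up to 2 (second-rank tensor)
print(coupled_ranks(1, 2))  # dipole-quadrupole: ranks up to 3
print(coupled_ranks(1, 3))  # dipole-octupole: ranks up to 4
print(coupled_ranks(2, 2))  # quadrupole-quadrupole: ranks up to 4
```

The maximum ranks (2, 3, 4) reproduce the hierarchy of terms identified in the text.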
A METHOD FOR CALCULATION

Two central facts pointed out by Carra and Thole (1994) guide our evaluation of the resonant scattering. The first is a selection rule: the scattering may exist only if the transition operator is totally symmetric (i.e., if it is invariant under all operations that define the symmetry of the atomic site). This point is explained in Landau and Lifshitz (1977) and many books on group theory (Cotton, 1990). The second point is that the Wigner-Eckart theorem permits separation of the scattering amplitude into products of angle- and energy-dependent terms. The method outlined below involves an evaluation of this angular dependence. We cannot predict the strength of the scattering (the energy-dependent coefficients); this requires detailed knowledge of the electronic states that contribute to the scattering operator.

The first task is to determine the multipole contributions to resonant scattering that may be observable at sites of interest in the crystal; the selection rule provides this information. The rule may be cast as an orthogonality relation between two representations of the point symmetry. Formula 4.3-11 in Cotton (1990) gives the number of times the totally symmetric representation (A1) exists in a representation of this point symmetry. It is convenient to choose spherical harmonics as basis functions for the representation because harmonics with total angular momentum L represent tensors of rank L. First calculate the trace χ(R_i) of each symmetry operation, R_i, of the point group. Because the trace of any symmetry operation of the totally symmetric representation equals 1, Cotton's formula reduces to

N(A_1) = (1/h) Σ_i χ(R_i)   (13)
where N(A1) is the number of times the totally symmetric representation is found in the candidate representation, h is the number of symmetry operations in the point group, and the sum is over all group operations. If the result is nonzero, this number equals the number of independent tensors (of rank L) needed to describe the scattering. The second task is to construct the angle-dependent tensors that describe the resonant scattering. There are a number of approaches (Belyakov and Dmitrienko, 1989; Kirfel, 1994; Blume, 1994; Carra and Thole, 1994), but in essence this problem is familiar to physicists, being analogous to finding the normal modes of a mechanical system, and to chemists as finding symmetry-adapted
linear combinations of atomic orbitals that describe states of a molecule. To calculate where in reciprocal space this scattering may appear, and to predict the polarization and angular dependence of each reflection, we use spherical harmonics to construct the angle-dependent tensors. This choice is essential for the separation discussed above, and simplifies the method when higher-order (than dipole-dipole) scattering is considered. Our method uses the projection operator familiar in group theory (see Chapter 5 in Cotton, 1990; also see Section 94 in Landau and Lifshitz, 1977). We seek a tensor of rank L that is totally symmetric for the point group of interest. The projection operator applied to find a totally symmetric tensor is straightforward; we average the result of applying each symmetry operation of the group to a linear combination of the 2L + 1 spherical harmonics with quantum number L, i.e.,

f(L) = (1/h) Σ_n R(n)[Σ_m a_m Y(L; m)]   (14)
where a_m is a constant that may be set equal to 1 for all m, and Y(L; m) is the spherical harmonic with quantum numbers L and m. If our ''first task'' indicated more than one totally symmetric contribution to the scattering amplitude, the other tensors must be orthogonal to f(L) and again totally symmetric.

The third task is to calculate the contribution this atom makes to the structure factor. In calculating f(L), we assumed an oriented coordinate system to evaluate the symmetry operations. If this orientation is consistent with the oriented site symmetry for this atom and space group in the International Tables (Hahn, 1989), then the listed symmetry operations can be used to transform the angle-dependent tensor from site to site in the crystal. The structure factor of the atoms of site i, for reflection Q, is written

F(i; Q) = Σ_j [f_0j + Σ_p f_pj(L)] e^{iQ·r_j}   (15)
where f_0j is the nonresonant form factor for this atom, and f_pj(L) is the pth resonant scattering tensor for the atom at r_j; both terms are multiplied by the usual position-dependent phase factor. The tensor transformations are familiar in Cartesian coordinates as a multiplication of matrices (Kaspar and Lonsdale, 1985, Section 2.4). Here, having used spherical harmonics, it is easiest to use standard formulas (Sakurai, 1985) for rotations and simple rules for reflection and inversion given in APPENDIX A of this unit. When more than one atom contributes to the scattering, these structure factors are added together to give a total F(Q).

The final task is to calculate the scattering as a function of polarization and crystal orientation. For a given L, our resonant F_L(Q) is a compact expression

F_L(Q) = Σ_m b_m Y(L; m)   (16)
with b_m the sum of contributions to the scattering with angle dependence Y(L; m). To evaluate the scattering, we express each Y(L; m) as a sum of products of the vectors in the problem. The products are obtained by standard methods using Clebsch-Gordan coefficients, as described in Sakurai (1985, Section 3.10). For L = 2 (dipole-dipole) scattering, the two vectors are the incident and scattered polarizations; for L = 3 a third vector (the incident or outgoing wave vector) is included; and for L = 4 both wave vectors are used to characterize the scattering. The crystal orientation (relative to these defining vectors) is included by applying the rotation formula (see APPENDIX C in this article) to F_L(Q).
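The counting rule of Equation 13 is simple to implement. The sketch below is illustrative only (the point-group class data are entered by hand and are not from the original text): it evaluates the character χ(R) of the rank-L spherical-harmonic representation for each class of D4h, using χ_rot(α) = sin[(2L + 1)α/2]/sin(α/2) for a proper rotation by α and the parity (−1)^L of Y(L; m) for improper operations, and then applies Equation 13. For D4h it finds no invariant L = 1 tensor (a centrosymmetric site forbids dipole terms) and exactly one L = 2 tensor.

```python
import math

def chi_rot(L, alpha):
    """Character of a proper rotation by alpha in the (2L+1)-dim representation."""
    s = math.sin(alpha / 2.0)
    if abs(s) < 1e-9:                  # alpha = 0 or 2*pi: character is 2L + 1
        return 2 * L + 1
    return math.sin((2 * L + 1) * alpha / 2.0) / s

def chi(L, alpha, improper):
    # An improper operation S(alpha) = sigma_h C(alpha) = inversion * C(alpha + pi);
    # inversion multiplies Y(L;m) by the parity (-1)**L.
    if improper:
        return (-1) ** L * chi_rot(L, alpha + math.pi)
    return chi_rot(L, alpha)

# Classes of D4h as (rotation angle, improper?, operations in class):
# E, 2C4, C2, 2C2', 2C2'', then i, 2S4, sigma_h, 2sigma_v, 2sigma_d
D4H = [(0, False, 1), (math.pi / 2, False, 2), (math.pi, False, 1),
       (math.pi, False, 2), (math.pi, False, 2),
       (math.pi, True, 1), (math.pi / 2, True, 2), (0, True, 1),
       (0, True, 2), (0, True, 2)]

def n_A1(L, classes):
    h = sum(n for _, _, n in classes)          # order of the group (16 for D4h)
    return round(sum(n * chi(L, a, imp) for a, imp, n in classes) / h)

print(n_A1(1, D4H))  # 0: no invariant dipole tensor at a centrosymmetric site
print(n_A1(2, D4H))  # 1: a single totally symmetric L = 2 tensor
print(n_A1(4, D4H))  # 2: two independent rank-4 tensors
```

The result N(A1) = 1 for L = 2 at a D4h site is the count used in the gold example that follows.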
Illustrations of Resonant Scattering

The following two examples illustrate the type of information available from these measurements; the first describes an experiment sensitive to L = 2 scattering, while the second presents results from L = 4 work.

L = 2 Measurements. Wilkinson et al. (1995) illustrate how resonant scattering can be used to separately measure parameters connected to the electronic state of gold at distinct sites. The diffraction measurements were performed at the gold LIII edge in Wells' salt (Cs2[AuCl2][AuCl4]). Both sites, I and III in Figure 4, have D4h symmetry, but the first has a linear coordination of neighbors (along the tetragonal c axis), while the coordination is square planar (parallel to the crystalline a-b plane) at the second site. Application of Equation 13 shows that one L = 2 tensor is required (for each of the two sites) to describe the anisotropic scattering. The construction formula, Equation 14, gives the angular dependence, which is proportional to Y(2,0). This agrees with the anisotropic (the traceless symmetric) part of the matrix given (in Cartesian form) by Wilkinson et al. (1995) in their Equation 1. Subtracting one-third of the trace of this matrix from each component gives f_xx = f_yy = −(1/2)f_zz, which is equivalent to Y(2,0) ∝ 2z² − (x² + y²). Further discussion of the angular dependence for dipole scattering is found in APPENDIX B of this article. The angular function at each gold site is represented by the lobe-shaped symbols in Figure 4. This angular dependence is then used to calculate the structure factor, which is a sum of three terms: one for each of the two gold atomic sites and a third for the remaining atoms in the unit cell. The 16 symmetry operations of this space group (#139 in Hahn, 1989) leave the angular function unchanged, so the structure factor for the two gold sites may be written
F_gold[Q(h,k,l)] ∝ cos[(π/2)(h + k + l)] {2f_0 I cos(πl/2) + F_I e^{−iπl/2} + F_III e^{+iπl/2}}   (17)
where F_gold, F_I, F_III, and I are second-rank tensors and the reciprocal lattice vector Q(h,k,l) = ha* + kb* + lc*. I, the diagonal unit tensor, represents the isotropic nature of the gold atom form factor. Wilkinson et al. (1995) point out that the resonant scattering is proportional to the sum (difference) of F_I and F_III for reflections with l = even (odd), so that separation of the two quantities requires data for both types of reflections. Figure 4 shows that F_I and F_III share a common orientation in the unit cell. The tensor structure factor is thus easily described in a coordinate system that coincides with the reciprocal lattice. Wilkinson et al. (1995) use σ for the direction along c* and π for directions in the basal plane. To avoid confusion with notation for polarization, we will use s and p instead of σ and π. With the crystal oriented for diffraction, the resonant scattering amplitude depends on the orientation of the tensors with respect to the polarization vectors. F_gold(Q) should be re-expressed in the coordinate system where these vectors are defined (see APPENDIX C in this article). The amplitude is then written

A = ε′ · F_lab(Q) · ε   (18)
where ε′ and ε are the scattered and incident polarization vectors and matrix multiplication is indicated. The amplitude expressed in terms of the four (2 incoming, 2 outgoing) polarization states is shown in Table 1, where the F_ij are Cartesian tensor components in the new (laboratory) frame, and 2θ is the scattering angle. This scheme for representing the scattering amplitude makes it easy to predict basic behaviors. Consider a vertical scattering plane with Q pointing up, and the incident
Figure 4. The crystal structure of Wells' salt contains two sites for gold; the labeling indicates the formal valence of the two atoms. At each of these two sites, the lobe-shaped symbol indicates the angular dependence for resonant scattering of rank L = 2.

Table 1. Amplitude Expressed in Terms of the Four (2-Incoming, 2-Outgoing) Polarization States^a

            e′ = σ′    e′ = π′
e = σ       F11        F31 cos(2θ) − F21 sin(2θ)
e = π       F13        F33 cos(2θ) − F23 sin(2θ)

^a σ (π) points normal (parallel) to the scattering plane.
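The entries of Table 1 follow from evaluating A = e′·F·e (Equation 18) in an explicit laboratory frame. The sketch below assumes one common convention (axis 1 normal to the scattering plane, axis 3 along the incident π polarization, axis 2 along the incident beam); the frame choice and function name are illustrative, not taken from the original text.

```python
from math import cos, sin

def pol_amplitudes(F, two_theta):
    """Polarization-resolved resonant amplitude A = e' . F . e (cf. Eq. 18).

    Hypothetical lab-frame convention: axis 1 normal to the scattering
    plane (sigma), axis 3 along the incident pi polarization, axis 2 along
    the incident beam; the scattered pi' vector is rotated by 2*theta.
    F is a 3x3 Cartesian tensor given as a list of lists."""
    def dot(u, M, v):
        # u . M . v for 3-vectors and a 3x3 matrix
        return sum(u[i] * M[i][j] * v[j] for i in range(3) for j in range(3))

    c, s = cos(two_theta), sin(two_theta)
    sig_in, pi_in = [1, 0, 0], [0, 0, 1]      # incident polarization vectors
    sig_out, pi_out = [1, 0, 0], [0, -s, c]   # scattered polarization vectors
    return {
        ("sig'", "sig"): dot(sig_out, F, sig_in),  # F11
        ("sig'", "pi"):  dot(sig_out, F, pi_in),   # F13
        ("pi'", "sig"):  dot(pi_out, F, sig_in),   # F31 cos(2t) - F21 sin(2t)
        ("pi'", "pi"):   dot(pi_out, F, pi_in),    # F33 cos(2t) - F23 sin(2t)
    }
```

With a symmetric tensor F this reproduces the four channels of Table 1 for any scattering angle.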
RESONANT SCATTERING TECHNIQUES
Their results, plotted versus energy in Figure 5, show a clear distinction between the two sites. The gold(III) site exhibits a strongly anisotropic near-edge behavior (the difference |f_p − f_s| is >15% of the total resonant scattering), which the authors suggest may result from an electronic transition between the 2p3/2 level and the empty 5d_{x2−y2} orbital. The gold(I) site, on the other hand, would (formally) have a filled d shell, and in fact shows a much smaller anisotropic difference.

Measurements in the hard x-ray regime (>2000 eV) have focused on the TM K edge and the RE L2,3 edges, which probe much more diffuse bands with small magnetic moments, leading to smaller dichroic effects. Although the edges probed and thus the final states are different for these regimes, the physics and
X-RAY TECHNIQUES

Table 1. Representative Sample of the Elements and Edges Measured Using XMCD

Elements      Edges   Orbital Character   Energy Range   Reference(s) Showing
                      of Transitions      (keV)          Typical Spectra
3d (Mn-Cu)    K       s -> p              6.5-9.0        Stahler et al. (1993)
3d (V-Cu)     L2,3    p -> d              0.5-1.0        Rudolf et al. (1992); Wu et al. (1992)
4d (Rh, Pd)   L2,3    p -> d              3.0-3.5        Kobayashi et al. (1996)
4f (Ce-Ho)    M4,5    d -> f              0.9-1.5        Rudolf et al. (1992); Schillé et al. (1993)
4f (Ce-Lu)    L2,3    p -> d              5.7-10.3       Fischer et al. (1992); Lang et al. (1992)
5d (Hf-Au)    L2,3    p -> d              9.5-13.7       Schütz et al. (1989a)
5f (U)        M4,5    d -> f              3.5-3.7        Finazzi et al. (1997)

analysis of the spectra is essentially the same, and they differ only in the techniques involved in spectrum collection. This unit will concentrate on the techniques employed in the hard x-ray regime, but considerable mention will be made of soft x-ray results and techniques because of the fundamental importance in magnetism of the final states probed by edges in this energy regime. Emphasis will be given to the methods used for the collection of XMCD spectra rather than to detailed analysis of specific features in the XMCD spectra of a particular edge. More detailed information on the features of XMCD spectra at specific edges can be found in other review articles (Krill et al., 1993; Stahler et al., 1993; Schütz et al., 1994; Stöhr and Wu, 1994; Pizzini et al., 1995).

Competitive and Related Techniques

Many techniques provide information similar to XMCD, but they typically give information on the bulk macroscopic magnetic properties of a material rather than microscopic, element-specific information. The basic criteria for choosing XMCD measurements over other techniques are thus the need for this element-specific information or for a separate determination of the orbital and spin moments. The need to obtain this information must be balanced against the requirements that XMCD measurements must be performed at a synchrotron radiation facility and that collection of spectra can take several hours. Therefore, the amount of parameter space that can be explored via XMCD experiments is restricted by the amount of time available at the beamline. The methods most obviously comparable to XMCD are the directly analogous laser-based methods such as measurement of Faraday rotation and the magneto-optical Kerr effect. These techniques, while qualitatively similar to XMCD, are considerably more difficult to interpret. The transitions involved in these effects occur between occupied and unoccupied states near the Fermi level, and therefore the initial states are extended bands that are much more difficult to model than the atomic-like core states in XMCD measurements. In fact, recent efforts have focused on using XMCD spectra as a basis for understanding the features observed in laser-based magnetic
measurements. Another disadvantage of these measurements is that they lack the element specificity provided by XMCD spectra, because they involve transitions between all the conduction electron states. A final potential drawback is the surface sensitivity of the measurements. Because the size of the dichroic signal scales with sample magnetization, XMCD spectra can also be used to measure relative changes in the magnetization as a function of temperature or applied field. This information is similar to that provided by vibrating sample or SQUID magnetometer measurements. Unlike a magnetometer, which measures the bulk magnetization for all constituents, dichroism measures the magnetization of a particular orbital of a specific element. Therefore, XMCD measurements can be used to analyze complicated hysteresis behavior, such as that encountered in magnetic multilayers, or to measure the different temperature dependencies of each component in a complicated magnetic compound. That these measurements are generally complementary to those obtained with a magnetometer should nevertheless be stressed, since it is frequently necessary to use magnetization measurements to correctly normalize the XMCD spectra taken at fixed field. The size of the moments and their temperature behavior can also be ascertained from magnetic neutron or nonresonant magnetic x-ray diffraction. These two techniques are the only other methods that can deconvolve the orbital and spin contributions to the magnetic moment. For neutron diffraction, separation of the moments can be accomplished by measuring the magnetic form factor for several different momentum transfer values and fitting with an appropriate model, while for non-resonant magnetic x-ray scattering the polarization dependence of several magnetic reflections must be measured (Blume and Gibbs, 1988; Gibbs et al., 1988). 
The sensitivity of both these techniques, however, is very limited, and thus measurements have been restricted to demonstration experiments involving the heavy rare earth metals with large orbital moments (m_orb ≈ 6 μB). XMCD, on the other hand, provides orbital moment sensitivity down to 0.005 μB (Samant et al., 1994). Furthermore, for compound materials these measurements will not be element specific unless a reflection is used for which only one atomic species produces constructive interference. This is usually only possible
X-RAY MAGNETIC CIRCULAR DICHROISM
for simple compounds and generally not possible for multilayered materials. As diffractive measurements, these techniques have an advantage over XMCD in that they are able to examine antiferromagnetic as well as ferromagnetic materials. But the weakness of magnetic scattering relative to charge scattering for nonresonant x-ray scattering makes the observation of difference signals in ferromagnetic compounds difficult. A diffractive technique that does provide the element specificity of XMCD is x-ray resonant exchange scattering (XRES; Gibbs et al., 1988). XRES measures the intensity enhancement of a magnetic x-ray diffraction peak as the energy is scanned through a core-hole resonance. This technique is the diffraction analog of dichroism phenomena and involves the same matrix elements as XMCD. In fact, theoretical XRES calculations reduce to XMCD in the limit of zero momentum transfer. At first glance, this would seem to put XMCD measurements at a disadvantage to XRES measurements, because the former is just a simplification of the latter. It is this very simplification, however, that makes possible the derivation of the sum rules relating the XMCD spectrum to the orbital and spin moments. Most XRES measurements in the hard x-ray regime have been limited to incommensurate magnetic Bragg reflections, which do not require circularly polarized photons. In these cases, XRES measurements are more analogous to linear dichroism, for which a simple correlation between the size of the moments and the dichroic signal has not been demonstrated. Lastly, even though the XRES signal is enhanced over that obtained by nonresonant magnetic scattering, it is still typically lower by a factor of 10^4 or more than the charge scattering contributions. Separating signals this small from the incoherent background is difficult for powder diffraction methods, and therefore XRES, and magnetic x-ray scattering in general, have been restricted to single-crystal samples.
Spin-polarized photoemission (SPPE) measures the difference between the emitted spin-up and spin-down electrons upon the absorption of light (McIlroy et al., 1996). SPPE spectra can be related to the spin polarization of the occupied electron states, thereby providing complementary information to XMCD measurements, which probe the unoccupied states. The particular occupied states measured depend upon the incident beam energy, UV or soft x-ray, and the energy of the photoelectron. Soft x-ray measurements probe the core levels, thereby retaining the element specificity of XMCD but only indirectly measuring the magnetic properties, through the effect of the exchange interaction between the valence-band and core-level electrons. UV measurements, on the other hand, directly probe the partially occupied conduction electron bands responsible for the magnetism, but lose element specificity, which makes a clear interpretation of the spectra much more difficult. Spin-resolving electron detectors, however, lose 3 to 4 orders of magnitude in efficiency in the analysis of the electron spin. This inefficiency has prevented routine implementation of this technique for magnetic measurements. A further disadvantage (or advantage) of photoemission techniques is their surface sensitivity. The emitted photoelectrons typically come only from the topmost 1 to 10 monolayers of the sample. XMCD, on the other hand, provides a true bulk characterization because of the penetrating power of the x-rays. Magnetic sensitivity in photoemission spectra has also been demonstrated using circularly polarized light (Baumgarten et al., 1990). This effect, generally referred to as photoemission MCD, should not be confused with XMCD, because the states probed and the physics involved are quite different. While this technique eliminates the need for a spin-resolving electron detector, it does not provide a clear relationship between measured spectra and the size of magnetic moments.

PRINCIPLES OF THE METHOD

XMCD measures the difference in the absorption coefficient over an absorption edge upon reversing the sample magnetization or the helicity of the incident circularly polarized photons. This provides magnetic information that is specific to a particular element and orbital in the sample. Using a simple one-electron theory, it is easy to demonstrate that XMCD measurements are proportional to the net moment along the incident beam direction and thus can be used to measure the variations in the magnetization of a particular orbital upon application of external fields or change in the sample temperature. Of particular interest has been the recent development of sum rules, which can be used to separate the orbital and spin contributions to the magnetic moment. XMCD measurements are performed at synchrotron radiation facilities, with the specific beamline optics determined by the energy regime of the edge of interest.

Basic Theory of X-Ray Circular Dichroism

Although a number of efforts had been made over the years to characterize an interplay between x-ray absorption and magnetism, it was only in 1986 that the first unambiguous evidence of magnetization-sensitive absorption was observed, in a linear dichroism measurement at the M4,5 edges of Tb metal (Laan et al., 1986).
This measurement was quickly followed by the first observation of XMCD at the K edge of Fe and, a year later, of an order-of-magnitude-larger effect at the L edges of Gd and Tb (Schütz et al., 1987, 1988). Large XMCD signals directly probing the magnetic 3d electrons were then discovered in the soft x-ray regime at the L2,3 edges of Ni (Chen et al., 1990). The primary reason for the lack of previous success in searching for magnetic contributions to the absorption signal is that the count rates, energy tunability, and polarization properties required by these experiments make the use of synchrotron radiation sources absolutely essential. Facilities for the production of synchrotron radiation, however, were not routinely available until the late 1970s. The mere availability of these sources was clearly not sufficient, because attempts to observe XMCD had been made prior to 1988 without success (Keller, 1985). It was only as the stability of these sources increased that the systematic errors which can easily creep into difference measurements were reduced sufficiently to allow the XMCD effect to be observed.
The basic cause of the enhancement of the dichroic signal that occurs near an absorption edge can be easily understood using a simple one-electron model for x-ray absorption (see Appendix A). While this model fails to explain all of the features in most spectra, it does describe the fundamental interaction that leads to the dichroic signal, and more complicated analyses of XMCD spectra are typically just perturbations of this basic model. In this crude treatment of x-ray absorption, the transition probabilities between spin-orbit-split core states and the spin-polarized final states are calculated starting from Fermi's golden rule. The most significant concept in this analysis is that the fundamental cause of the XMCD signal is the spin-orbit splitting of the core state. This splitting causes the difference between the matrix elements for left and right circularly polarized photons. In fact, the presence of a spin-orbit term is a fundamental requirement for the observation of any magneto-optical phenomena, regardless of the energy range; without it no dichroic effects would be seen. Specifically, the example illustrated in Appendix A demonstrates that, for an initial 2p1/2 (L2) core state, left (right) circularly polarized photons will make preferential transitions to spin-down (-up) states. The magnitude of this preference is predicted to be proportional to the difference between the spin-up and spin-down occupation of the probed shell, and as such to the net moment on that orbital. Therefore the measured signal scales with the local moment and can be used to determine relative changes in the magnetization upon applying a magnetic field or increasing the sample temperature. Furthermore, the XMCD signal for an L3 edge is also predicted to be equal and opposite to that observed at an L2 edge. This −1:1 dichroic ratio is a general rule predicted for any pair of spin-orbit-split core states [i.e., 2p1/2,3/2 (L2,3), 3d3/2,5/2 (M4,5), etc.].
Although this simple model describes the basic underlying cause of the XMCD effect, it proves far too simplistic to completely explain most dichroic spectra. In particular, this model predicts no dichroic signal from K-edge measurements, because the initial state at this edge is an s level with no spin-orbit coupling. Yet it is at this edge that the first (admittedly weak) XMCD effects were observed. Also, while experimental spectra for spin-orbit-split pairs have been found to be roughly in the predicted −1:1 ratio, for most materials they deviate considerably. To account for these discrepancies, more sophisticated models need to incorporate the following factors, which are neglected in the simple one-electron picture presented:

1. The effects of the spin-orbit interaction in the final states.
2. Changes in the band configuration due to the presence of the final-state core hole.
3. Spin-dependent factors in the radial part of the matrix elements.
4. Contributions from higher-order electric quadrupole terms, which can contribute significantly to the XMCD signal for certain spectra.
The first two of these factors are significant for all dichroic spectra, while the latter two are particularly relevant to RE L-edge spectra. First consider the effects of the spin-orbit interaction in the final states. The inclusion of this factor can quickly explain the observed dichroic signal in TM K-edge spectra (i.e., repeat the example in Appendix A with initial and final states inverted). This observation of a K-edge dichroic signal illustrates the extreme sensitivity of the XMCD technique, because not only do the 4p states probed by this edge possess a relatively small spin moment (~0.2 μB), but the substantial overlap with neighboring states results in a nearly complete quenching of the orbital moment. This orbital contribution (~0.01 μB), however, is nonzero, and thus a nonzero dichroic signal is observed. The influence of the final-state spin-orbit interaction also explains the deviations from the predicted −1:1 ratio observed at L and M edges. The spin-orbit term tends to enhance the L3 (M5) edge dichroic signal at the expense of the L2 (M4) edge. In terms of the simple one-electron model presented, the spin-orbit interaction effectively breaks the degeneracy of the ml states, resulting in an enhancement of the XMCD signal at the L3 edge. An example of this enhancement is shown in Figure 2, which plots XMCD spectra obtained at the Co L edges for metallic Co and for a Co/Pd multilayer (Wu et al., 1992). The large enhancement of the L3-edge dichroic signal indicates that the multilayer sample possesses an enhanced Co 3d orbital moment compared to that of bulk Co. A quantitative relationship between the degree of this enhancement and the strength of the spin-orbit coupling is expressed in terms of sum rules that relate the integrated dichroic (μc) and total (μo) absorptions over a
Figure 2. Normal and dichroic absorption at the Co L2,3 edges for Co metal and Co/Pd multilayers. Courtesy of Wu et al. (1992).
spin-orbit-split pair of core states to the orbital and spin moments of that particular final-state orbital (Thole et al., 1992; Carra et al., 1993). The integrated signal rather than the value at a specific energy is used in order to include all the states of a particular band. This has the added benefit of eliminating the final-state core-hole effects, thereby yielding information on the ground state of the system. Although the general expressions for the sum rules can appear complicated, at any particular edge they reduce to extremely simple expressions. For instance, the sum rules for the L2,3 edges simplify to the following expressions:

\frac{\int_{L_3+L_2} \mu_c \, dE}{\int_{L_3+L_2} 3\mu_o \, dE} = \frac{\langle L_z \rangle / 2}{10 - n_{occ}}   (1)

\frac{\int_{L_3} \mu_c \, dE - 2\int_{L_2} \mu_c \, dE}{\int_{L_3+L_2} 3\mu_o \, dE} = \frac{2}{3}\,\frac{\langle S_z \rangle + \frac{7}{2}\langle T_z \rangle}{10 - n_{occ}}   (2)

Here ⟨Lz⟩, ⟨Sz⟩, and ⟨Tz⟩ are the expectation values of the orbital, spin, and spin magnetic dipole operators in Bohr magnetons [T = ½(S − 3r̂(r̂·S))], and nocc is the occupancy of the final-state band (i.e., 10 − nocc corresponds to the number of final states available for the transition). The value of nocc must usually be obtained via other experimental methods or from theoretical models of the band occupancy, which can lead to uncertainties in the values of the moments obtained via the sum rules. To circumvent this, it is sometimes useful to express the two equations above as a ratio:

\frac{\int_{L_3+L_2} \mu_c \, dE}{\int_{L_3} \mu_c \, dE - 2\int_{L_2} \mu_c \, dE} = \frac{3\langle L_z \rangle}{4\langle S_z \rangle + 14\langle T_z \rangle}   (3)
This yields an expression that is independent of the shell occupancy and also eliminates the need to integrate over the total absorption, which can itself be a source of systematic error in the measurement. Comparison with magnetization measurements can then be used to obtain the absolute value of the orbital moment. Applying the sum rules to the spectra shown in Figure 2 yields values of 0.17 and 0.24 μB for the orbital moments of the bulk Co and Co/Pd samples, respectively. This example again illustrates the sensitivity of the XMCD technique, because the clearly resolvable differences in the spectra correspond to relatively small changes in the size of the orbital component. Several assumptions go into the derivation of the sum rules, which can restrict their applicability to certain spectra; the most important of these is that the two spin-orbit-split states must be well resolved in energy. This criterion means that sum-rule analysis is restricted to measurements involving deep core states and is generally not applicable to spectra taken at energies below 250 eV. Moreover, the radial part of the matrix elements is assumed to be independent of the electron spin and identical at both edges of a spin-orbit-split pair. This is typically not the case for the RE L edges, where the presence of the open 4f shell can introduce some spin dependence (König et al., 1994).
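In practice, the sum rules are applied to numerically integrated dichroic and total absorption spectra. A minimal sketch of this bookkeeping follows; the function name is hypothetical, and ⟨Tz⟩ is deliberately neglected, as the text notes is reasonable for TM 3d states in near-cubic symmetry.

```python
def sum_rule_moments(int_L3, int_L2, int_total, n_occ):
    """Apply the L2,3 sum rules (cf. Eqs. 1 and 2), neglecting <Tz>.

    int_L3, int_L2 : integrals of the dichroic absorption mu_c over the
                     L3 and L2 edges, respectively
    int_total      : integral of 3*mu_o over both edges
    n_occ          : occupancy of the final-state d band (must be supplied
                     from other experiments or band theory)
    Returns (<Lz>, <Sz>_eff); converting to moments in Bohr magnetons is
    convention dependent and omitted here."""
    n_holes = 10 - n_occ
    # Eq. 1: orbital moment from the summed dichroic integral
    Lz = 2 * n_holes * (int_L3 + int_L2) / int_total
    # Eq. 2 with <Tz> -> 0: effective spin moment from the weighted difference
    Sz_eff = 1.5 * n_holes * (int_L3 - 2 * int_L2) / int_total
    return Lz, Sz_eff
```

A round trip (choose ⟨Lz⟩ and ⟨Sz⟩, synthesize the integrals, recover the moments) is a useful self-check before touching real spectra.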
Also, the presence of the magnetic dipole term ⟨Tz⟩ in these expressions poses problems for determining the exact size of the spin moment or the orbital-to-spin ratio. The importance of the size of this term has been a matter of considerable debate (Wu et al., 1993; Wu and Freeman, 1994). Specifically, it has been found that, for TM 3d states in a cubic or near-cubic symmetry, the ⟨Tz⟩ term is small and can be safely neglected. For highly anisotropic conditions, however, such as those encountered at interfaces or in thin films, ⟨Tz⟩ can become appreciable and therefore would distort measurements of the spin moment obtained from the XMCD sum rules. For RE materials, ⟨Tz⟩ can be calculated analytically for the 4f states, but an exact measure of the relative size of this term for the 5d states is difficult to obtain.
PRACTICAL ASPECTS OF THE METHOD

The instrumentation required to collect dichroic spectra naturally divides XMCD measurements into two general categories based on the energy ranges involved: those in the soft x-ray regime, which require a windowless, UHV-compatible beamline, and those in the hard x-ray regime, which can be performed in nonevacuated environments. The basic elements required for an XMCD experiment, however, are similar for both regimes: a source of circularly polarized x-rays (CPX), an optical arrangement for monochromatizing the x-ray beam, a magnetized sample, and a method for detecting the absorption signal. The main differences between the two regimes, other than the sample environment, lie in the methods used to detect the XMCD effect.

Circularly Polarized X-Ray Sources

In evaluating the possible sources of CPX, the following quantities are desirable: a high circular polarization rate (Pc), a high flux (I), and the ability to reverse the helicity. For a given source, though, the experimenter must sometimes sacrifice flux in order to obtain a high rate of polarization, or vice versa. Under these circumstances, it should be kept in mind that the figure of merit for circularly polarized sources is Pc^2 I, because it determines the amount of time required to obtain measurements of comparable statistical accuracy on different sources (see Appendix B). Laboratory x-ray sources have proved to be impractical for XMCD measurements because they offer limited flux and emit unpolarized x-rays. The use of a synchrotron radiation source is therefore essential for performing XMCD experiments. Three different approaches can be used to obtain circularly polarized x-rays from a synchrotron source: (1) off-axis bending magnet radiation, (2) a specialized insertion device, such as an elliptical multipole wiggler, or (3) phase retarders based on perfect crystal optics.
In a bending magnet source, a uniform circular motion by the relativistic electrons or positrons in the storage ring is used to generate synchrotron radiation. This radiation, when observed on the axis of the particle orbit, is purely linearly polarized in the orbital plane (s polarization).
Figure 3. Horizontally (short dashes) and vertically (long dashes) polarized flux along with circular polarization rate (solid) from an APS bending magnet at 8.0 keV.
Appreciable amounts of photons polarized out of the orbital plane (p polarization) are only observed off-axis (Jackson, 1975). These combine to produce a CPX source because the s- and p-polarized photons are exactly δ = π/2 radians out of phase. The sign of this phase difference depends on whether the radiation is viewed from above or below the synchrotron orbital plane. Therefore, the off-axis synchrotron radiation will be elliptically polarized, with a helicity dependent on the viewing angle and a degree of circular polarization given by

P_c = \frac{2 E_s E_p}{|E_s|^2 + |E_p|^2} \sin \delta   (4)
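Equation 4 is simple enough to evaluate directly; a sketch (hypothetical function name, amplitudes taken as real):

```python
from math import sin

def circular_polarization(E_s, E_p, delta):
    """Degree of circular polarization Pc of off-axis bending-magnet
    radiation (Eq. 4): Pc = 2*Es*Ep*sin(delta) / (Es**2 + Es... see below).

    E_s, E_p : electric field amplitudes of the s- and p-polarized components
    delta    : phase difference between them (radians); its sign flips when
               viewing from above vs. below the orbital plane, flipping Pc."""
    return 2 * E_s * E_p * sin(delta) / (E_s ** 2 + E_p ** 2)
```

On axis, E_p = 0 gives Pc = 0 (purely linear polarization); equal amplitudes with delta = ±π/2 give Pc = ±1, the two fully circular helicities.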
where Es and Ep are the electric field amplitudes of the radiation. An example of this is shown in Figure 3, which plots the s and p intensities along with the circular polarization rate as a function of viewing angle for 8.0-keV x-rays at an Advanced Photon Source bending magnet (Shenoy et al., 1988). Although simple in concept, obtaining CPX in this fashion does have some drawbacks. The primary one is that the off-axis position required to get appreciable circular polarization reduces the incident flux by a factor of 5 to 10. Furthermore, measurements taken at the sides of the synchrotron beam, where the intensity changes more rapidly, are particularly sensitive to any motions of the particle beam. Moreover, the photon helicity cannot be changed easily, because this requires moving the whole experimental setup vertically. This movement can result in slightly different Bragg angles incident on the monochromator, thereby causing energy shifts that would have to be compensated for. Attempts have been made to overcome this by using slits that define beams both above and below the orbital plane simultaneously in order to make XMCD measurements. These efforts have had limited success, however, since they are particularly sensitive to sample inhomogeneity (Schütz et al., 1989b). Standard planar insertion devices are magnetic arrays placed in the straight sections of synchrotron storage rings to make the particle beam oscillate in the orbital plane. Because each oscillation produces synchrotron radiation, a series of oscillations greatly enhances the emitted flux
over that produced by a single bending magnet source. These devices produce linearly polarized light on axis, but the off-axis radiation of a planar device, unlike that of a bending magnet, is not circularly polarized, because equal numbers of right- and left-handed bends in the particle orbit produce equal amounts of left- and right-handed circular polarization, yielding a net helicity of zero. Specialized insertion devices for producing circularly polarized x-rays give the orbit of the particle beam an additional oscillation component out of the orbital plane (Yamamoto et al., 1988). In this manner, the particle orbit is made to traverse a helical or pseudo-helical path to yield a net circular polarization on the axis of these devices. When coupled with a low-emittance synchrotron storage ring, these devices can provide a high flux with a high degree of polarization. The main disadvantage of these devices is that their high cost has made the beamlines dedicated to them rather rare. Also, to preserve the circular polarization through monochromatization, specialized optics are required, particularly for lower energies (Malgrange et al., 1991). Another alternative for producing CPX is the use of x-ray phase retarders. Phase retarders employ perfect crystal optics to transform linear to circular polarization by inducing a π/2 radian phase shift between equal amounts of incoming s- and p-polarized radiation. (Here s and p refer to the intensities of the radiation polarized in and out of the scattering plane of the phase retarder, and should not be confused with the radiation emitted from a synchrotron source. Unfortunately, papers on both subjects tend to use the same notation; it is kept the same for this paper in order to conform with standard practice.) Equal s and p intensities incident on the phase retarder are obtained by orienting its plane of diffraction at a 45° angle with respect to the synchrotron orbital plane.
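The essential control knob of such a retarder is its angular offset from the Bragg condition. A toy model is sketched below: the 1/Δθ falloff of the induced phase shift is the standard dynamical-diffraction behavior of transmission phase plates, but the constant k, which in reality lumps together thickness, wavelength, and structure-factor terms, is purely hypothetical here.

```python
def retarder_phase_shift(delta_theta, k=1.0):
    """Toy model of a transmission phase retarder: phase shift between the
    s and p components as a function of the angular offset delta_theta from
    the exact Bragg condition. The shift falls off as 1/delta_theta, and
    reversing the sign of the offset reverses the sign of the shift, i.e.,
    the helicity of the transmitted beam. k is a hypothetical constant,
    NOT taken from the source text."""
    if delta_theta == 0:
        raise ValueError("model is invalid at the exact Bragg condition")
    return k / delta_theta
```

The sign flip under Δθ → −Δθ is what makes helicity reversal with a phase retarder a simple crystal rotation rather than a realignment of the whole experiment.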
As the final optical element before the experiment, it offers the greatest degree of circular polarization incident on the sample. Thus far phase-retarding optics have only been developed for harder x-rays, with soft x-ray measurements restricted to bending-magnet and specialized-insertion-device CPX sources. Typically materials in the optical regime exhibit birefringence over a wide angular range; in the x-ray regime, however, materials are typically birefringent only at or near a Bragg reflection. For the hard x-ray energies of interest in XMCD measurements, a transmission-phase retarder, shown in Figure 4, has proved to be the most suitable type of phase retarder (Hirano et al., 1993). In this
Figure 4. Schematic of a transmission phase retarder.
Figure 5. Calculated degree of circular polarization of the transmitted beam for a 375-μm-thick diamond (111) Bragg reflection at 8.0 keV.
phase retarder, a thin crystal, preferably of a low-Z material to minimize absorption, is deviated (Δθ ≈ 10 to 100 arcsec) from the exact Bragg condition, and the transmitted beam is used as the circularly polarized x-ray source. Figure 5 plots the predicted degree of circular polarization in the transmitted beam as a function of the off-Bragg position for a 375-μm-thick diamond (111) crystal and an 8.0-keV incoming x-ray beam. Note that, for a particular crystal thickness and photon energy, either helicity can be obtained by simply reversing the sign of Δθ. Further, when used with a low-emittance source, a high degree of polarization (Pc > 0.95) can be achieved with a transmitted beam intensity of 10% to 20% of the incident flux. This flux loss is comparable to that encountered using the off-axis bending magnet radiation. The phase retarder, however, provides easy helicity reversal and can be used with a standard insertion device to obtain fluxes comparable to or greater than those obtained from a specialized device (Lang et al., 1996). The main drawback of using a phase retarder as a CPX source is that the increased complexity of the optics can introduce a source of systematic errors. For instance, if the phase retarder is misaligned and does not precisely track with the energy of the monochromator, the Δθ movements of the phase retarder will not be symmetric about the Bragg reflection. This leads to measurements with different Pc values and can introduce small energy shifts in the transmitted beam.

Detection of XMCD

The most common method used to obtain XMCD spectra is to measure the absorption by monitoring the incoming (Io) and transmitted (I) fluxes. The absorption in the sample is then given by

\mu_o t = \ln\left(\frac{I_o}{I}\right)   (5)

The dichroism measurement is obtained by reversing the helicity or magnetization of the sample and taking the difference of the two measurements (μc = μ+ − μ−).
Typically the thickness of the sample, t, is left in as a proportionality factor, because its removal requires an absolute determination of the transmitted flux at a particular energy. Some experimenters choose to remove this thickness factor by expressing the dichroism as a ratio between the dichroic and normal absorption, μc/μo. Absorption not due to resonant excitation of the measured edge, however, is usually removed by normalizing the pre-edge of the μo spectra to zero; therefore, expressing the XMCD signal in terms of this ratio tends to accentuate features closer to the absorption onset. Rather than taking the difference of Eq. 5 for two magnetizations, the dichroic signal is also frequently expressed in terms of the asymmetry ratio of the transmitted fluxes only. This asymmetry ratio is related to the dichroic signal by

\frac{I^+ - I^-}{I^+ + I^-} = \frac{e^{-\mu^+ t} - e^{-\mu^- t}}{e^{-\mu^+ t} + e^{-\mu^- t}} = \frac{e^{0.5\mu_c t} - e^{-0.5\mu_c t}}{e^{0.5\mu_c t} + e^{-0.5\mu_c t}} = \tanh(0.5\mu_c t) \simeq f\,0.5\mu_c t   (6)
The last approximation can be made because reasonable attenuations require μ0t ≈ 1 to 2 and μc < 0.05 μ0t. The factor f is introduced in an ad hoc fashion to account for the incomplete circular polarization of the incident beam and the finite angle (θ) between the magnetization and beam directions, f = Pc cos θ. In some cases, the factor f also includes the incomplete sample magnetization relative to the saturated moment at T = 0 K. If the experimenter is interested in obtaining element-specific magnetizations as a function of field or temperature, however, the inclusion of this factor would defeat the purpose of the measurement. This type of measurement is illustrated in Figure 6, which shows a plot of a hysteresis measurement taken using the dichroic signal at the Co and Fe L3 edges of a Co/Cu/Fe multilayer (Chen et al., 1993; Lin et al., 1993). In this compound the magnetic anisotropy of the Co layer is much greater than that of the Fe layer. Thus, in an applied magnetic field, the directions of the Fe moments reverse before those of the Co moments. By monitoring the strength of the dichroic signal as a function of applied field for the Fe and Co L3 edges, the hysteresis curves of each constituent were traced out. Similarly, XMCD can also be used to measure the temperature variation of the magnetization of a specific constituent in a sample. This is demonstrated in Figure 7 (Rueff et al., 1997), which shows the variation of the dichroic signal at the Gd L edges as a function of temperature for a series of amorphous GdCo alloys. These XMCD measurements provide orbital- and element-specific information complementary to that obtained from magnetometer measurements.
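As a check on the small-signal expansion in Eq. 6, the asymmetry ratio can be evaluated numerically. The sketch below uses illustrative values consistent with the stated bounds (μ0t ≈ 1 to 2, μc < 0.05 μ0t); none of the numbers are taken from the text.

```python
import numpy as np

mu0_t = 1.5           # normal absorption times thickness (assumed value)
muc_t = 0.05 * mu0_t  # dichroic absorption times thickness (upper bound in text)

I_plus = np.exp(-(mu0_t + 0.5 * muc_t))   # transmitted flux for mu_plus
I_minus = np.exp(-(mu0_t - 0.5 * muc_t))  # transmitted flux for mu_minus

asym = (I_minus - I_plus) / (I_minus + I_plus)
# asym equals tanh(0.5*muc_t) exactly, and is close to the linearized 0.5*muc_t
```

For these attenuations the tanh and its linearization differ by well under a percent, which is why the linear form is normally adequate.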
Measurement Optics

Two basic optical setups, shown in Figure 8, are used for the measurement of x-ray attenuations. The scanning monochromator technique (Fig. 8a) selects a small energy range (ΔE ≈ 1 eV) to impinge on the sample, and the full
X-RAY TECHNIQUES
Figure 6. Hysteresis measurements of the individual magnetizations of Fe and Co in a Fe/Cu/Co multilayer obtained by measuring the L3 edge XMCD signal (top), along with data obtained from a vibrating sample magnetometer (VSM) and a least-squares fit to the measured Fe and Co hysteresis curves (bottom). Courtesy of Chen et al. (1993).
spectrum is acquired by sequentially stepping the monochromator through the required energies. In the dispersive monochromator technique (Fig. 8b), a broad range of energies (ΔE ≈ 200 to 1000 eV) is incident on the sample simultaneously and the full spectrum is collected by a position-sensitive detector. Both of these setups are common at synchrotron radiation facilities, where they have been widely used for EXAFS measurements. Their conversion to perform XMCD experiments usually requires minimal effort: the added requirements are simply a CPX source, obtained by the methods discussed above, and a magnetized sample. Although the scanning monochromator technique can require substantial time to step through the required beam energies, the simplicity of data interpretation and compatibility with most synchrotron beamline optics have made it by far the most common method utilized for XMCD measurements. Typically, these measurements are taken by reversing the magnetization (helicity) at each energy step, rather than taking a whole spectrum, then changing the magnetization (helicity) and repeating the measurement. The primary reason for this is to minimize
Figure 7. Variation of the Gd L2-edge XMCD signal as a function of temperature for various GdNiCo amorphous alloys. Courtesy of Rueff et al. (1997).
systematic errors that can be introduced as a result of the decay or movements of the particle beam. In the hard x-ray regime, because of the high intensity of the synchrotron radiation, the incident (I0) and
Figure 8. Experimental setups for the measurement of the absorption. (A) Scanning monochromator (double crystal); (B) dispersive monochromator (curved crystal).
X-RAY MAGNETIC CIRCULAR DICHROISM
attenuated (I) beam intensities are usually measured using gas ionization chambers. For these detectors, a bias (500 to 2000 V) is applied between two metallic plates and the x-ray beam passes through a gas (generally nitrogen) flowing between them. Absorption of photons by the gas creates ion pairs that are collected at the plates, and the induced current can then be used to determine the x-ray beam intensity. Typical signal currents encountered in these measurements range from 10^-9 A for bending magnet sources to 10^-6 A for insertion-device sources. While these signal levels are well above the electronic noise level, ≈10^-14 A, produced by a state-of-the-art current amplifier, other noise mechanisms contribute significantly to the error. The most common source of noise in these detectors arises from vibrations in the collection plates and the signal cables. Vibrations change the capacitance of the detection system, inducing currents that can obscure the signal of interest, particularly for measurements made on a bending magnet source, where the signal strength is smaller. This error can be minimized by rigidly mounting the ion chamber and minimizing the length of cable between the ion chamber and the current amplifier. Even when these effects are minimized, however, they still tend to be the dominant noise mechanism in XMCD measurements. This makes it difficult to get an a priori estimate of the error based on the signal strength. Therefore, the absolute error in an XMCD measurement is frequently obtained by taking multiple spectra and computing the average and variance in the data. Using a scanning monochromator, one can also indirectly measure the effective absorption of a sample by monitoring the fluorescence signal. Fluorescence radiation results from the filling of the core hole generated by the absorption of an x-ray photon.
The strength of the fluorescence will thus be proportional to the number of vacancies in the initial core state and therefore to the absorption. An advantage of this technique over transmittance measurements is that fluorescence arises only from the element at resonance, thereby effectively isolating the signal of interest. In a transmittance measurement, on the other hand, the resonance signal from the absorption edge sits atop the background absorption due to the rest of the constituents in the sample. Fluorescence measurements require the use of an energy-sensitive detector to isolate the characteristic lines from the background of elastically and Compton-scattered radiation. Generally Si or Ge solid-state detectors have been used for fluorescence detection, since they offer acceptable energy resolution (ΔE ≈ 200 eV) and high efficiencies. The count rate offered by these solid-state detectors, however, is typically restricted.

Unless large applied fields (>1 T) are used, the direction of the local magnetization will generally lie along the sample plane rather than along the field direction. Varying the angle that the beam makes with the sample magnetization can also be used as a means of identifying features in the dichroic spectra that do not arise from dipolar transitions, which scale with the cosine. Figure 10 shows the XMCD signal taken at two different angles at the Dy L3 edge in a Dy0.4Tb0.6 alloy (Lang et al., 1995). These spectra are normalized to the cosine of the angle between the beam and the magnetization directions; therefore, the deviation in the lower feature indicates nondipolar behavior. This feature has been shown to arise from quadrupolar transitions to the 4f states and is a common feature in all RE L2,3-edge XMCD spectra (Wang et al., 1993).
Although quadrupolar effects in the absorption spectrum are smaller than dipolar effects by a factor of ≈100, their size becomes comparable in the difference spectrum due to the strong spin polarization of the RE 4f states. The presence of quadrupolar transitions can complicate the application of the sum rules to the analysis of these spectra, but it also opens up the possibility of obtaining magnetic information on two orbitals simultaneously from measurements at only one edge.
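The empirical error estimate mentioned earlier, taking several nominally identical spectra and computing their average and variance, can be sketched as follows. The scan count, mock signal shape, and noise level are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_scans, n_points = 8, 200
# Mock dichroic peak plus white noise standing in for vibration pickup
true_signal = 0.01 * np.exp(-np.linspace(-3.0, 3.0, n_points) ** 2)
scans = true_signal + 2e-3 * rng.standard_normal((n_scans, n_points))

mean_spectrum = scans.mean(axis=0)
# Standard error of the mean at each energy point
stderr = scans.std(axis=0, ddof=1) / np.sqrt(n_scans)
```

Because the dominant noise is not counting statistics, this purely empirical error bar is usually more trustworthy than one inferred from the signal level.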
METHOD AUTOMATION

Acquisition of XMCD spectra requires a rather high degree of automation, because the high intensities of x-ray radiation encountered at synchrotron sources prevent any manual sample manipulation during data acquisition. The experiment is normally set up inside a shielded room and all experimental equipment operated remotely through a computer interface. These interfaces are typically already installed on existing beamlines, and the experimenter simply must ensure that the equipment being considered for the experiment is compatible with the existing hardware. Note that the time required to enter and exit the experimental enclosure is not negligible. Therefore, computerized motion control of as many of the experimental components as possible is desirable.
DATA ANALYSIS AND INITIAL INTERPRETATION

The first thing to realize when taking an initial qualitative look at an XMCD spectrum is that these measurements probe the unoccupied states, whose net polarization is opposite that of the occupied states. Thus, the net difference in the XMCD signal reflects the direction of the so-called minority spin band, because it contains the greater number of available states. It is therefore easy to make a sign error for the dichroic spectra, and care should be taken when determining the actual directions of the moments in a complicated compound. To prevent confusion, it is essential to measure the direction of the magnetic field and the helicity of the photons from the outset of the experiment.

If XMCD is being used to measure changes in the magnetization of the sample, the integrated dichroic signal should be used rather than the size of the signal at one particular energy, for two reasons in particular. First, the integrated signal is less sensitive to long-term drifts in the experimental apparatus than are measurements at one nominal energy. Second, systematic errors are much more apparent in the entire spectrum than at any particular energy. For instance, the failure of the spectrum to go to zero below or far above the absorption edge would immediately indicate a problem with the data collection, but would not be detected by measurement at a single energy.

When analyzing spectra taken from different compounds, one should note that a change in the magnitude of the dichroic signal at just one edge does not necessarily reflect a change in the size or direction of the total magnetic moment. Several factors complicate this, some of which have already been mentioned. For example, the development of an orbital moment for the spectra shown in Figure 2 greatly enhanced the signal at the L3 edge but diminished the signal at the L2 edge.
Therefore, although the size of the total moment remains relatively constant between the two samples, the dichroic signal at just one edge shows substantial changes that do not necessarily reflect this. The total moment in this case must be obtained through the careful application of both sum rules or comparison with magnetometer measurements. The strength of the TM K-edge dichroism also tends not to
Figure 11. Steps involved in the sum rule analysis of the XMCD spectra demonstrated for a thin layer of metallic Fe. (A) Transmitted intensity of parylene substrate and substrate with deposited sample. (B) Calculated absorption with background subtracted. (C) Dichroic spectra and integrated dichroic signal. (D) Absorption spectra and edge-jump model curve along with integrated signal. Courtesy of Chen et al. (1995).
reflect the size of the moment for different materials. This is particularly true for the Fe K-edge spectra which demonstrate a complete lack of correlation between the size of the total Fe moment and the strength of the dichroic signal, although better agreement has been found for Co and Ni (Stahler et al., 1993). In addition, RE L2,3-edge spectra show a strong spin dependence in the radial matrix elements that can in some cases invert the relationship between the dichroic signal and the size of the moment (Lang et al., 1994). These complications notwithstanding, the dichroic spectra at one edge and within one compound should still scale with changes in the magnetization due to applied external fields or temperature changes. If the XMCD signal is not complicated by the above effects and both edges of a spin-orbit-split pair are measured, the sum rules may be applied. The steps involved in the implementation of the sum rules are illustrated in Figure 11 for a thin Fe metal film deposited on a parylene substrate (Chen et al., 1995). The first step in sum rules analysis is removal from the transmission signal of all
absorption not due to the particular element of interest. In this example, this consists of separately measuring and subtracting the substrate absorption (Fig. 11a), leaving only the absorption from the element of interest (Fig. 11b). In compound materials, however, there are generally more sources of absorption than just the substrate material, and another method of background removal is necessary. The common method has been subtraction from the signal of a λ³ function fit to the pre-edge region of the absorption spectrum. The next step in the application of the sum rules is determining the integrated intensities of the dichroic signal. The easiest method is to plot the integral of the dichroic spectrum as a function of the scanning energy (Fig. 11c). The integrated values over each edge can then be directly read at the positions where the integrated signal becomes constant. In the figure, p corresponds to the signal from the L3 edge alone and q corresponds to the integrated value over both edges. Although there is some uncertainty in the appropriate value of the p integral, since the dichroic signal of the L3 edge extends well past the edge, choosing the position closest to the onset of the L2 edge signal minimizes possible errors. This uncertainty diminishes for edges at higher energies, where the spin-orbit splitting of the core levels becomes more pronounced. The final step in the sum rule analysis is obtaining the integral of the absorption spectrum (Fig. 11d). At first glance, it seems that this integral would be nonfinite because the absorption edges do not return to zero at the high-energy end of the spectrum. The sum rule derivation, however, implicitly assumes that the integral includes only absorption contributions from the final states of interest (i.e., the 3d states in this example).
It is well known that the large peaks in the absorption spectrum just above the edge, sometimes referred to as the "white line," are due to the strong density of states of d bands near the Fermi energy, while the residual step-like behavior is due to transitions to unbound free-electron states. To deconvolve these two contributions, a two-step function is introduced to model the absorption arising from the transitions to the free-electron states. The absorption-edge jumps for this function are put in a 2:1 ratio, as expected from the 2:1 ratio in the number of initial-state electrons. The steps are also broadened to account for the lifetime of the state and the experimental resolution. The integral r shown in Fig. 11d is then obtained by determining the area between the absorption curve and the two-step curve. Once the integrals r, p, and q are determined, they can easily be put into Eqs. 1, 2, and 3 to obtain the values of the magnetic moments.
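The bookkeeping of Figure 11 might be sketched numerically as below. All energies, widths, and amplitudes are synthetic placeholders (deliberately not Fe values); the point is only the order of operations: a two-step background in the 2:1 ratio, the integrated dichroism read off as p and q, and the white-line area r.

```python
import numpy as np

E = np.linspace(700.0, 760.0, 3000)   # energy grid (eV); values are synthetic
dE = E[1] - E[0]

def step(E0, w):
    """Broadened absorption-edge jump centered at E0 with width w."""
    return 0.5 * (1.0 + np.tanh((E - E0) / w))

# Two-step "free electron" background with the 2:1 edge-jump ratio
two_step = (2.0 / 3.0) * step(710.0, 0.5) + (1.0 / 3.0) * step(723.0, 0.5)

# Mock white lines riding on the background, and a mock dichroic signal
mu0 = two_step + 1.5 * np.exp(-(E - 710.0) ** 2) + 0.8 * np.exp(-(E - 723.0) ** 2)
muc = -0.10 * np.exp(-(E - 710.0) ** 2) + 0.05 * np.exp(-(E - 723.0) ** 2)

# Integrated dichroism: p over the first edge only, q over both edges
cum = np.cumsum(muc) * dE
p = cum[np.searchsorted(E, 718.0)]   # read off just before the second edge onset
q = cum[-1]

# r: area between the absorption curve and the two-step background
r = np.sum(mu0 - two_step) * dE
```

With p, q, and r in hand, the moments follow from Eqs. 1 to 3 as described in the text.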
SAMPLE PREPARATION

XMCD measurements in the hard x-ray regime require little sample preparation other than obtaining a sample thin enough for transmission measurements. Measurements in the soft x-ray regime, on the other hand, tend to require more complicated sample preparation to account for UHV conditions and small attenuation lengths. Hard x-ray XMCD measurements therefore tend to be favored if
the same type of information (i.e., relative magnetization) can be obtained using edges at the higher energies. The incident photon fluxes typically do not affect the sample in any way, so repeated measurements can be made on the same sample without concern for property changes or degradation. Attenuation lengths (i.e., the thickness at which I/I0 ≈ 1/e) for hard x-rays are generally on the order of 2 to 10 μm. While it is sometimes possible to roll pure metals to these thicknesses, the rare earth and transition metal compounds generally studied in XMCD are typically very brittle, so that rolling is impractical. The sample is therefore often ground to a fine powder in which the grain size is significantly less than the x-ray attenuation length (≈1 μm). This powder is then spread on an adhesive tape, and several layers of tape are combined to make a sample of the proper thickness. Most of the absorbance in these tapes is usually due to the adhesive rather than the tape itself; thus, a commercial tape brand is as effective as a specialized tape such as Kapton that minimizes x-ray absorbance. An alternative to tape is to mix the powder with powder of a low-Z material such as graphite and make a disk of appropriate thickness using a press. A note of warning: the fine metallic powders used for these measurements, especially those containing RE elements, can spontaneously ignite. Contact with air should therefore be kept to a minimum by using an oxygen-free atmosphere when applying the powder and by capping the sample with a tape overlayer during the measurements. The techniques used to detect the XMCD signal in the soft x-ray energy regime are generally much more surface sensitive. Therefore, the surfaces must be kept as clean as possible and arrangements made to clean them in situ. Alternatively, a capping layer of a nonmagnetic material can be applied to the sample to prevent degradation.
Surface cleanliness is also a requirement for total electron-yield measurements; that is, the contacts used to measure the current should make the best possible contact to the material without affecting its properties. For transmittance measurements in this regime, the thinness of the sample attenuation lengths (≈100 Å) makes supporting the sample impractical. Thus, samples are typically deposited on a semitransparent substrate such as parylene.
PROBLEMS

There are many factors that can lead to erroneous XMCD measurements. These generally fall into two categories: those encountered in the data collection and those that arise in the data analysis. Problems during data collection tend to introduce distortions in the shape or magnitude of the spectra that can be difficult to correct in the data analysis; thus, great care must be taken to minimize their causes and be aware of their effects. Many amplitude-distorting effects encountered in XMCD measurements, and in x-ray absorption spectroscopy in general, come under the common heading of "thickness effects." Basically these are systematic errors that diminish the relative size of the observed signal due to leakage of noninteracting x-rays into the measurement.
Leakage x-rays can come from pinholes in the sample, higher-energy harmonics in the incoming beam, or the monochromator resolution function containing significant intensity below the absorption-edge energy. While these effects are significant for all x-ray spectroscopies, they are particularly important in the near-edge region probed by XMCD. First, to minimize pinhole effects, the sample should be made as uniform as possible. If the sample is a powder, it should be finely ground and spread uniformly on tape, with several layers combined to obtain a more uniform thickness. All samples should be viewed against a bright light and checked for any noticeable leakages. The sample uniformity should also be tested by taking absorption spectra at several positions on the sample before starting the XMCD measurements. Besides changes in the transmitted intensity, the strength and definition of sharp features just above the absorption edge in each spectrum provide a good qualitative indication of the degree of sample nonuniformity. Another source of leakage x-rays is the harmonic content of the incident beam. The monochromator used for the measurement will pass not only the fundamental wavelength but multiples of it as well. These higher-energy photons are unaffected by the absorption edges and will cause distorting effects similar to those caused by pinholes. These harmonics are an especially bothersome problem with third-generation synchrotron sources because the higher particle-beam energies found at these sources result in substantially larger high-energy photon fluxes. Harmonics in the beam are generally reduced by use of an x-ray mirror. By adjusting the reflection angle of the mirror, near-unit reflectivity can be achieved at the fundamental energy, while the harmonic will be reduced by several orders of magnitude. Another common way of eliminating harmonics has been to slightly deviate the angle of the second crystal of a scanning monochromator (Fig.
8a) so that it is not exactly parallel to the first. The monochromator (said to be "detuned") then reflects only a portion of the beam reflected by the first crystal, typically ≈50%. The narrower reflection width of the higher-energy photons makes harmonics much more sensitive to this detuning. Thus, while the fundamental flux is reduced by a factor of 2, the harmonic content of the beam will be down by several orders of magnitude. Eliminating harmonics in this manner has a distinct disadvantage for XMCD measurements, however, because it also reduces the degree of circular polarization of the monochromatized beam. Therefore, monochromator detuning should not be used with off-axis bending-magnet or specialized-insertion-device CPX sources. A third important consideration in minimizing thickness effects is the resolution of the monochromator. After passing through the monochromator, the beam tends to have a long-tailed Lorentzian energy distribution, meaning that a beam nominally centered around E0 with a resolution of ΔE will contain a substantial number of photons well outside this range. When E0 is above the edge, the bulk of the photons are strongly attenuated, and thus those on the lower end of the energy distribution will constitute a greater percentage of the transmitted intensity.
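The resolution-tail leakage just described can be illustrated with a toy model. The edge position, edge jump, and Lorentzian widths below are assumed values chosen only to show the trend: the broader the tails, the more the apparent above-edge absorption falls short of the true value.

```python
import numpy as np

E = np.linspace(-20.0, 20.0, 4001)    # energy relative to the edge (eV)
mu_t = np.where(E < 0.0, 0.5, 2.5)    # idealized sharp edge: mu*t jumps 0.5 -> 2.5

def apparent_mu_t(E0, gamma):
    """Apparent mu*t for a beam centered at E0 with a Lorentzian
    resolution function of half-width gamma (normalized on the grid)."""
    L = 1.0 / (1.0 + ((E - E0) / gamma) ** 2)
    L /= L.sum()
    return -np.log(np.sum(L * np.exp(-mu_t)))

# Above the edge, photons leaking in from below the edge lower the apparent
# absorption; the distortion grows with the resolution width:
mu_sharp = apparent_mu_t(5.0, 0.5)
mu_broad = apparent_mu_t(5.0, 2.0)
```

Both apparent values fall below the true μt = 2.5, and the broader resolution function gives the larger deficit, which is the distortion the text warns about.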
Typically a monochromator with a resolution well below the natural width of the edge is used to minimize this effect. This natural width is determined by the lifetime of the corresponding core state under study. For instance, a typical RE L-edge spectrum with a core-hole broadening of ≈4 eV is normally scanned by a monochromator with ≈1 eV resolution. Increasing the resolution, however, typically increases the Bragg angle of the monochromator, which reduces the circular polarization of the transmitted beam. While all the methods for reducing the thickness effects mentioned above should be used, the primary and most efficient way to minimize these effects is to adjust the sample thickness so as to minimize the percentage of leakage photons. This is accomplished by optimizing the change in the transmitted intensity across the edge: it should be small enough to reduce the above thickness-effect errors yet large enough to permit observation of the signal of interest. It has been found in practice that sample thicknesses (t) of about one attenuation length (μt ≈ 1) below the absorption edge, with a comparable absorption increase across the edge, work well.

… k_c    (19)
and the conservation of photons is fulfilled in the scattering process

T(k_{z,s}) + R(k_{z,s}) = 1    (20)
In Figure 2B the transmission amplitude |t(k_{z,s})| for external (solid line) and for internal (dashed line) reflection is shown. This amplitude modulates non-specular scattering processes at the interface, as will be discussed later in this unit.
X-RAY DIFFRACTION TECHNIQUES FOR LIQUID SURFACES AND MONOMOLECULAR LAYERS
Figure 4. An illustration of a continuous scattering length density, sliced into a histogram.

Figure 3. Calculated reflectivities from H2O and liquid mercury (Hg) showing the effects of absorption and surface roughness. The absorption modifies the reflectivity near the critical momentum transfer for mercury with insignificant effect on the reflectivity from H2O. The dashed line shows the calculated reflectivity from the same interfaces with root mean square surface roughness, σ = 3 Å.
The effect of absorption on the reflectivity can be incorporated by introducing b into the generalized potential in Equation 11, so that

k_{z,s} = √(k²_{z,0} − k²_c + 2ib)    (21)

is used in the Fresnel equations (Equation 15). Calculated reflectivities from water and liquid mercury, demonstrating that the effect of absorption is practically insignificant for the former, yet has the strongest influence near the critical angle for the latter, are shown in Figure 3.

Multiple Stepwise and Continuous Interfaces. On average, the electron density across a liquid interface is a continuously varying function, and is a constant far away on both sides of the interface, as shown in Figure 4. The reflectivity for a general function ρ(z) can then be calculated by one of several methods, classified into two major categories: dynamical and kinematical solutions. The dynamical solutions (see DYNAMICAL DIFFRACTION) are in general more exact, and include all the features of the scattering, in particular the low-angle regime close to the critical angle, where multiple scattering processes occur. For a finite number of discrete interfaces, exact solutions can be obtained by use of standard recursive (Parratt, 1954) or matrix (Born and Wolf, 1959) methods. These methods can be extended to compute, with very high accuracy, the scattering from any continuous potential by slicing it into a set of finite layers with a sufficient number of interfaces. On the other hand, the kinematical approach (see KINEMATIC DIFFRACTION OF X-RAYS) neglects multiple scattering effects and fails in describing the scattering at small angles.

1. The matrix method. In this approach the scattering length density with variation over a characteristic length dt is sliced into a histogram with N interfaces.
The matrix method is practically equivalent to the Parratt formalism (Born and Wolf, 1959; Lekner, 1987). For each interface, the procedure described previously for the one interface is applied. Consider an arbitrary interface, n, separating two regions of a sliced SLD (as in Fig. 4), with ρ_{n−1} and ρ_n, at position z = z_n, with the following wavefunctions

in ρ_{n−1}:  T_{n−1,n} e^{ik_{n−1}z} + R_{n−1,n} e^{−ik_{n−1}z}
in ρ_n:      T_{n,n+1} e^{ik_n z} + R_{n,n+1} e^{−ik_n z}
(matched at z = z_n)    (22)
where

k_n = √(k²_{z,0} − 4πρ_n)    (23)
The effect of absorption can be taken into account as described earlier. For simplicity, the subscript z is omitted from the component of the wavevector, so that k_{z,n} = k_n. The solution at each interface in terms of a transfer matrix, M_n, is given by

(T_{n−1,n}, R_{n−1,n})^T = M_n (T_{n,n+1}, R_{n,n+1})^T,
M_n = [ e^{−i(k_{n−1}−k_n)z_n}      r_n e^{−i(k_{n−1}+k_n)z_n} ]
      [ r_n e^{i(k_{n−1}+k_n)z_n}   e^{i(k_{n−1}−k_n)z_n}      ]    (24)

where

r_n = (k_{n−1} − k_n)/(k_{n−1} + k_n)    (25)
is the Fresnel reflection function through the z_n interface separating the ρ_{n−1} and ρ_n SLDs. The solution to the scattering problem is given by noting that beyond the last interface (i.e., in the bulk) there is only a transmitted wave, for which an arbitrary amplitude of the form (1, 0) can be assumed (the reflectivity is normalized to the incident beam in any case). The effect of all interfaces is calculated as follows

(T_{0,1}, R_{0,1})^T = (M_1)(M_2)…(M_n)…(M_{N+1}) (1, 0)^T    (26)
with

r_{N+1} = (k_N − k_s)/(k_N + k_s)    (27)
in the M_{N+1} matrix, given in terms of the substrate k_s. The reflectivity is then given by the ratio

R(Q_z) = |R_{0,1}/T_{0,1}|²    (28)
Applying this procedure to the one-box model of thickness d with two interfaces yields

R(Q_z ≈ 2k_s) = |(r_1 + r_2 e^{i2k_s d})/(1 + r_1 r_2 e^{i2k_s d})|²    (29)
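The transfer-matrix recursion of Equations 24 to 28 lends itself to a short numerical sketch. The function below is a minimal, unoptimized transcription in units where k_n = √(k²_{z,0} − 4πρ_n); the water-like SLD used in the example calls is an assumed illustrative value, not taken from the text.

```python
import numpy as np

def reflectivity(kz0, rho, z):
    """Reflectivity of a sliced SLD profile via the transfer-matrix
    recursion (Equations 24 to 28).  rho[0] is the upper phase (vapor),
    rho[-1] the substrate; z[n] is the position of interface n."""
    k = np.sqrt(np.asarray(kz0, dtype=complex) ** 2
                - 4.0 * np.pi * np.asarray(rho, dtype=complex))
    M = np.eye(2, dtype=complex)
    for n in range(len(z)):
        r_n = (k[n] - k[n + 1]) / (k[n] + k[n + 1])          # Eq. 25
        M = M @ np.array(
            [[np.exp(-1j * (k[n] - k[n + 1]) * z[n]),
              r_n * np.exp(-1j * (k[n] + k[n + 1]) * z[n])],
             [r_n * np.exp(1j * (k[n] + k[n + 1]) * z[n]),
              np.exp(1j * (k[n] - k[n + 1]) * z[n])]])       # Eq. 24
    T, R = M @ np.array([1.0, 0.0])                          # Eq. 26
    return abs(R / T) ** 2                                   # Eq. 28

# Single water-like interface (SLD ~ 9.4e-6 per square angstrom, assumed):
R_below = reflectivity(0.005, [0.0, 9.4e-6], [0.0])  # k_z0 < k_c: total reflection
R_above = reflectivity(0.05, [0.0, 9.4e-6], [0.0])   # k_z0 >> k_c: small R
```

Slicing a continuous ρ(z) into more and more boxes and checking that the computed R stops changing implements the convergence criterion described in the text.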
Figure 5 shows the calculated reflectivities from a flat liquid interface with two kinds of films (one box) of the same thickness d but with different scattering length densities, ρ_1 and ρ_2. The reflectivities are almost indistinguishable when the normalized SLDs (ρ_i/ρ_s) of the films are complementary to one (ρ_1/ρ_s + ρ_2/ρ_s = 1), except for a very minute difference near the first minimum. In the kinematical method described below, the two potentials shown in Figure 5 yield identical reflectivities. The matrix method can be used to calculate the exact solution from a finite number of interfaces, and it is most powerful when used with computers by slicing any continuous scattering length density into a histogram. The criterion for determining the optimum number of slices is convergence of the calculated reflectivity: at some point, slicing the SLD into more boxes does not change the calculated reflectivity significantly.

2. The kinematical approach. The kinematical approach for calculating the reflectivity is only applicable under certain conditions where multiple scattering is not important. It usually fails in calculating the reflectivity at very small angles (or small momentum transfers) near the critical angle. The kinematical approach, also known as the Born approximation, gives physical insight into the formulation of R(Q_z) by relating the Fresnel-normalized reflectivity, R/R_F, to the Fourier transform of spatial changes in ρ(z) across the interface (Als-Nielsen and Kjaer, 1989), as discussed below. As in the dynamical approach, ρ(z) is sliced so that

k(z) = √(k²_{z,0} − 4πρ(z)),   dk/dz = −(2π/k(z)) dρ/dz    (30)
and the reflectance across an arbitrary point z is given by

r(z) = [k(z + dz) − k(z)]/[k(z + dz) + k(z)] ≈ −[4π/4k(z)²](dρ/dz)dz ≈ −(Q_c/2Q_z)² (1/ρ_s)(dρ/dz)dz    (31)
In the last step of the derivation, r(z) was multiplied and divided by ρ_s, the SLD of the subphase, and the identity Q²_c = 16πρ_s was used. Assuming no multiple scattering, the reflectivity is calculated by integrating over all reflectances at each point, z, with a phase factor e^{iQ_z z} as follows

R(Q_z) = R_F(Q_z) |∫ (1/ρ_s)(dρ/dz) e^{iQ_z z} dz|² = R_F(Q_z)|Φ(Q_z)|²    (32)

Figure 5. Calculated reflectivities for two films with identical thicknesses but with two distinct normalized electron densities, ρ_1 (solid line) and 1 − ρ_1 (dashed line), and corresponding calculated reflectivities using the dynamical approach (σ = 0). The two reflectivities are almost identical except for a minute difference near the first minimum (see arrow in figure). The Born approximation (dotted line) for the two models yields identical reflectivities. The inset shows the normalized reflectivities near the first minimum. As Q_z is increased, the three curves converge. This is the simplest demonstration of the phase problem, i.e., the nonuniqueness of models where two different potentials give the same reflectivities.
where Φ(Q_z) can be regarded as the generalized structure factor of the interface, analogous to the structure factor of a unit cell in 3-D crystals. This formula can also be derived by using the Born approximation, as is shown in the following section. As an example of the use of Equation 32, we assume that the SLD at a liquid interface can be approximated by a sum of error functions as follows

ρ(z) = ρ_0 + Σ_{j=1}^{N} [(ρ_j − ρ_{j−1})/2][1 + erf((z − z_j)/(√2 σ_j))]    (33)
where ρ_0 is the SLD of the vapor phase and ρ_N = ρ_s. Using Equation 32 the reflectivity is given by

R(Q_z) = R_F(Q_z) |Σ_j [(ρ_j − ρ_{j−1})/ρ_s] e^{−(Q_z σ_j)²/2} e^{iQ_z z_j}|²    (34)
Assuming one interface at z_1 = 0 with surface roughness σ_1 = σ, the Fresnel reflectivity, R_F(Q_z), is simply modified by a Debye-Waller-like factor

R(Q_z) = R_F(Q_z) e^{−(Q_z σ)²}    (35)
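Equation 34 is easy to evaluate directly. The sketch below (with illustrative units and values of my choosing) reproduces the Debye-Waller-like limit of Equation 35 for a single interface, and also shows the complementary-model degeneracy of the one-box model discussed around Figure 5.

```python
import numpy as np

def born_ratio(Qz, rhos, zs, sigmas, rho_sub):
    """R/R_F = |Phi(Qz)|^2 for an erf-broadened step profile (Eq. 34).
    rhos[j] is the SLD just below interface j (rhos[-1] = substrate),
    zs[j] its position, sigmas[j] its Gaussian roughness."""
    drho = np.diff(np.concatenate(([0.0], rhos)))   # steps rho_j - rho_{j-1}
    phi = np.sum((drho / rho_sub)
                 * np.exp(-0.5 * (Qz * np.asarray(sigmas)) ** 2)
                 * np.exp(1j * Qz * np.asarray(zs)))
    return abs(phi) ** 2

Qz = 0.3
# Single rough interface: reduces to the factor exp(-(Qz*sigma)^2) of Eq. 35
ratio = born_ratio(Qz, [1.0], [0.0], [3.0], 1.0)

# One-box model and its complement (rho_1 vs 1 - rho_1): identical in the BA
r_box = born_ratio(Qz, [0.3, 1.0], [0.0, 10.0], [1.0, 1.0], 1.0)
r_comp = born_ratio(Qz, [0.7, 1.0], [0.0, 10.0], [1.0, 1.0], 1.0)
```

That r_box and r_comp agree exactly is the kinematical phase-problem degeneracy the text describes; the dynamical (matrix) calculation lifts it only marginally, near the first minimum.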
The effect of surface roughness on the reflectivities from water and from liquid mercury surfaces, assuming Gaussian smearing of the interfaces, is shown by the dashed lines in Figure 3. Braslau et al. (1988) demonstrated that Gaussian smearing of the interface due to capillary waves in simple liquids is sufficient to model the data, and that more complicated models cannot be supported by the x-ray data. Applying Equation 34 to the one-box model discussed above (see Fig. 5), and assuming conformal roughness, σ_j = σ, the calculated reflectivity in terms of the SLD normalized to ρ_s is

$$R(Q_z) = R_F(Q_z)\left[(1-\rho_1)^2 + \rho_1^2 + 2\rho_1(1-\rho_1)\cos(Q_z d)\right] e^{-(Q_z\sigma)^2} \qquad (36)$$

In this approximation, the roles of the normalized SLD of the one box, ρ₁, and of the complementary model, ρ₂ = 1 − ρ₁, are equivalent, so the reflectivities of the two models are mathematically identical. This is the simplest of many examples in which two or more distinct SLD models yield identical reflectivities in the Born approximation. When the kinematical approximation is used to invert a reflectivity to an SLD, a nonunique result is therefore always a possibility. For ways to distinguish between such models, see Data Analysis and Initial Interpretation. In some instances, the scattering length density can be described by several step functions smeared with a single Gaussian (conformal roughness, σ_j = σ), representing different moieties of the molecules on the surface. The reflectivity can then be calculated by combining the dynamical and kinematical approaches (Als-Nielsen and Kjaer, 1989). First, the exact reflectivity from the step-like profile (σ = 0) is calculated using the matrix method, R_dyn(Q_z), and the effect of surface roughness is incorporated by multiplying the calculated reflectivity by a Debye-Waller-like factor as follows (Als-Nielsen and Kjaer, 1989)

$$R(Q_z) = R_{dyn}(Q_z)\, e^{-(Q_z\sigma)^2} \qquad (37)$$
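The equivalence of the one-box model and its complement under Equation 36 can be demonstrated in a few lines; the thickness, roughness, and density values here are arbitrary test parameters.

```python
import numpy as np

# Equation 36 is symmetric under rho_1 -> 1 - rho_1, so the one-box model
# and its complement give mathematically identical Born-approximation
# reflectivities: the phase problem illustrated in Figure 5.
def r_box_over_rf(qz, rho1, d, sigma):
    return ((1 - rho1) ** 2 + rho1 ** 2
            + 2 * rho1 * (1 - rho1) * np.cos(qz * d)) * np.exp(-(qz * sigma) ** 2)

qz = np.linspace(0.02, 0.6, 300)
model = r_box_over_rf(qz, 0.3, 25.0, 3.0)
complement = r_box_over_rf(qz, 0.7, 25.0, 3.0)   # rho_2 = 1 - rho_1
```

The two curves coincide at every Q_z, which is exactly why the kinematical inversion is nonunique.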
Non-Specular Scattering

The geometry for non-specular scattering is shown in Figure 1B. The scattering from a 2-D system is very weak, and enhancements due to multiple scattering processes at the interface are taken advantage of. As shown in Figure 1B, the momentum transfer Q has a finite component parallel to the liquid surface (Q⊥ = k⊥^i − k⊥^f), enabling determination of lateral correlations in the 2-D plane. Exact calculation of the scattering from surfaces is practically impossible except for special cases, and the Born approximation (BA; Schiff, 1968) is usually applied. When the incident or the scattered beam is at a grazing angle (i.e., near the critical angle), multiple scattering effects modify the scattering, and these can be accounted for by a higher-order approximation known as the distorted-wave Born approximation (DWBA). The features due to multiple scattering at grazing angles provide evidence that the scattering processes indeed occur at the interface.

The Born Approximation. In the BA for a general potential V(r), the scattering length amplitude is calculated as follows (Schiff, 1968)

$$F(\mathbf{Q}) = -\frac{1}{4\pi}\int V(\mathbf{r})\, e^{i\mathbf{Q}\cdot\mathbf{r}}\, d^3r \qquad (38)$$
where, in the present case, V(r) = 4πρ(r). From the scattering length amplitude, the differential cross-section is calculated as follows (Schiff, 1968)

$$\frac{d\sigma}{d\Omega} = |F(\mathbf{Q})|^2 = \int \langle\rho(0)\rho(\mathbf{r})\rangle\, e^{i\mathbf{Q}\cdot\mathbf{r}}\, d^3r \qquad (39)$$
where ⟨ρ(0)ρ(r)⟩ ≡ ∫ρ(r′ − r)ρ(r′) d³r′ is the density-density correlation function. The measured reflectivity is a convolution of the differential cross-section with the instrumental resolution, as discussed below and in the literature (Schiff, 1968; Braslau et al., 1988; Sinha et al., 1988). The scattering length density, ρ, for a liquid-gas interface can be described as a function of the actual height of the surface, z(x, y), as follows

$$\rho(\mathbf{l}, z) = \begin{cases} \rho_s & \text{for } z < z(\mathbf{l}) \\ 0 & \text{for } z > z(\mathbf{l}) \end{cases} \qquad (40)$$
where l = (x, y, 0) is a 2-D in-plane vector. The height of the interface, z, is also time and temperature dependent due to capillary waves, and therefore thermal averages of z are used (Buff et al., 1965; Evans, 1979). Inserting the SLD (Equation 40) into Equation 39 and performing the integration over the z coordinate yields

$$F(\mathbf{Q}_\perp, Q_z) = \frac{\rho_s}{iQ_z}\int e^{i[\mathbf{Q}_\perp\cdot\mathbf{l} + Q_z z(\mathbf{l})]}\, d^2l \qquad (41)$$
where Q⊥ = (Q_x, Q_y, 0) is the in-plane scattering vector. This formula properly predicts the reflectivity from an ideally flat surface, z(x, y) = 0, within the kinematical approximation,
$$F(\mathbf{Q}_\perp, Q_z) = \frac{4\pi^2\rho_s}{iQ_z}\,\delta^{(2)}(\mathbf{Q}_\perp) \qquad (42)$$
1034
X-RAY TECHNIQUES
with a 2-D delta function (δ⁽²⁾) that guarantees specular reflectivity only. The differential cross-section is then given by

$$\frac{d\sigma}{d\Omega} = \pi^2\left(\frac{Q_c^2}{4Q_z}\right)^2 \delta^{(2)}(\mathbf{Q}_\perp) \qquad (43)$$
where Q_c² ≈ 16πρ_s. This is the general form of the Fresnel reflectivity in terms of the differential cross-section dσ/dΩ, which is defined with respect to the flux of the incident beam on the surface. In reflectivity measurements, however, the scattered intensity is normalized to the intensity of the incident beam; the flux on the sample is therefore angle dependent, proportional to sin α_i. In addition, the scattered intensity is integrated over the polar angles α_f and 2θ, with k₀² sin α_f dα_f d(2θ) = dQ_x dQ_y. Correcting for the flux and integrating,

$$R_F(Q_z) \approx \sigma_{tot}(Q_z) = \iint \frac{d\sigma}{d\Omega}\,\frac{dQ_x\, dQ_y}{4\pi^2 k_0^2 \sin\alpha_i \sin\alpha_f} = \left(\frac{Q_c}{2Q_z}\right)^4 \qquad (44)$$

as approximated from the exact solution given in Equation 17. Taking advantage of the geometrical considerations above, the differential cross-section for the reflectivity measurement can readily be derived in the more general case of a scattering length density that varies along z only, i.e., ρ(z). In this case, Equation 38 can be written as

$$\frac{d\sigma}{d\Omega} = 4\pi^2\,\delta^{(2)}(\mathbf{Q}_\perp)\left|\int \rho(z)\, e^{iQ_z z}\, dz\right|^2 \qquad (45)$$
If we normalize ρ(z) to the scattering length density of the substrate, ρ_s, and use the standard identity between the Fourier transform of a function and that of its derivative, we obtain

$$\frac{d\sigma}{d\Omega} = \pi^2\left(\frac{Q_c^2}{4Q_z}\right)^2 \delta^{(2)}(\mathbf{Q}_\perp)\left|\frac{1}{\rho_s}\int\frac{d\rho(z)}{dz}\, e^{iQ_z z}\, dz\right|^2 \qquad (46)$$
which, with the geometrical corrections, yields Equation 32. Thermal averaging of the scattering length density under the influence of capillary waves, with the assumption that the SLD of the gas phase is zero, can be approximated as follows (Buff et al., 1965; Evans, 1979)

$$\rho(\mathbf{r}) \approx \frac{\rho_s}{2}\left[1 - \mathrm{erf}\!\left(\frac{z}{\sqrt{2}\,\sigma(\mathbf{l})}\right)\right] \qquad (47)$$
where σ(l) is the height-height correlation function. Inserting Equation 47 into Equation 39 and integrating over z results in the differential cross-section

$$\frac{d\sigma}{d\Omega} \approx \frac{Q_c^4}{Q_z^2}\int e^{i\mathbf{Q}_\perp\cdot\mathbf{l}}\, e^{-Q_z^2\sigma^2(\mathbf{l})}\, d^2l \qquad (48)$$
and assuming an isotropic correlation function in the plane yields (Sinha et al., 1988)

$$\frac{d\sigma}{d\Omega} \approx \frac{Q_c^4}{Q_z^2}\int l\, J_0(Q_\perp l)\, e^{-Q_z^2\sigma^2(l)}\, dl \qquad (49)$$
where J₀ is a Bessel function of the first kind. This expression was used by Sinha et al. (1988) to calculate the diffuse scattering from rough liquid surfaces with a height-height correlation function that diverges logarithmically due to capillary waves (Sinha et al., 1988; Sanyal et al., 1991).

Distorted-Wave Born Approximation (DWBA). Due to the weak interaction of the electromagnetic field (x-rays) with matter (electrons), the BA is a sufficient approach for the scattering from most surfaces. However, as already encountered with the reflectivity, the BA fails (or is invalid) when either the incident or the scattered beam is near the critical angle, where multiple scattering processes take place. The effect of the bulk on the scattering from the surface can be accounted for by writing the scattering length density as a superposition of two parts, as follows

$$\rho(\mathbf{r}) = \rho_1(z) + \rho_2(\mathbf{l}, z) \qquad (50)$$
Here ρ₁(z) is a step function that defines an ideally sharp interface separating the liquid and gas phases at z = 0, whereas the second term, ρ₂(l, z), is a quasi-two-dimensional function in the sense that it has a characteristic average thickness, d_c, with ρ₂(l, z) → 0 for |z| ≥ d_c/2. It can be thought of as film-like, and it is a detailed function with features that relate to molecular or atomic distributions at the interface. Although the definition of ρ₂ may depend on the location of the interface (z = 0) in ρ₁, the resulting calculated scattering must be invariant for equivalent descriptions of ρ(r). In some cases ρ₂ can be defined as either a totally external or a totally internal function with respect to the liquid bulk, ρ₁. In other cases, especially when dealing with liquid surfaces, it is more convenient to locate the interface at some intermediate point coinciding with the center of mass of ρ₂ with respect to z. The effect of the substrate term ρ₁(z) on the scattering from ρ₂ can be treated within the distorted-wave Born approximation (DWBA) by using the exact solution for the ideally flat interface (see Principles of the Method) to generate the Green function for a higher-order Born approximation (Rodberg and Thaler, 1967; Schiff, 1968; Vineyard, 1982; Sinha et al., 1988). The Green function in the presence of an ideally flat interface, ρ₁, replaces the free-particle Green function that is commonly used in the Born approximation. The scattering amplitude in this case is given by (Rodberg and Thaler, 1967; Schiff, 1968)

$$F_{DWBA}(\mathbf{Q}) = F_F(Q_z) + F_2(\mathbf{Q}) = i\pi Q_z\, r_F(Q_z) + \int \tilde{\psi}_{k'}(\mathbf{r})\,\rho_2(\mathbf{r})\,\psi_k(\mathbf{r})\, d^3r \qquad (51)$$
where the exact Fresnel amplitude F_F(Q_z) is written in the form of a scattering amplitude so that Equation 51 reproduces the Fresnel reflectivity in the absence of ρ₂. The exact solution for the step function ρ₁ is

$$\psi_k(\mathbf{r}) = e^{i\mathbf{k}^i_\perp\cdot\mathbf{l}} \begin{cases} e^{-ik^i_{z,0}z} + r^i(k^i_{z,s})\, e^{ik^i_{z,0}z} & \text{for } z > 0 \\ t^i(k^i_{z,s})\, e^{-ik^i_{z,s}z} & \text{for } z < 0 \end{cases} \qquad (52)$$

and ψ̃_k(r) is the time-reversed and complex-conjugated solution for an incident beam with k^f,

$$\tilde{\psi}_k(\mathbf{r}) = e^{-i\mathbf{k}^f_\perp\cdot\mathbf{l}} \begin{cases} e^{-ik^f_{z,0}z} + r^f(k^f_{z,0})\, e^{ik^f_{z,0}z} & \text{for } z > 0 \\ t^f(k^f_{z,0})\, e^{-ik^f_{z,s}z} & \text{for } z < 0 \end{cases} \qquad (53)$$
In Equation 53 it is assumed that the scattered beam is detected only for z > 0, i.e., above the liquid surface. The notation for the transmission and reflection functions indicates scattering of the wave from the air onto the subphase and vice versa, according to

$$k^i_{z,s} = \sqrt{(k^i_{z,0})^2 - k_c^2} \qquad (54)$$

and

$$k^f_{z,0} = \sqrt{(k^f_{z,s})^2 + k_c^2} \qquad (55)$$
respectively. In the latter case, total reflection does not occur except in the trivial case k^f_{z,s} = 0, and no enhancement due to the evanescent wave is expected. In this approximation, the final momentum transfer in the Q_z direction is a superposition of momentum transfers from ρ₂ (the film) and from the liquid interface, ρ₁. For instance, a wave can be scattered with Q_z = 0 with respect to ρ₂ but reflected from the surface with a finite Q_z. This is due to multiple scattering processes between ρ₂ and ρ₁, and therefore, in principle, the amplitude of the detected beam contains a superposition of different components of the Fourier transform ρ̃₂(q_z) and interference terms between them. Detailed analysis of Equation 51 can be very cumbersome in the most general case, and it is usually carried out for specific cases of ρ₂. Assuming that the scattering from the film is as strong for −Q_z as for Q_z (as is the case for an ideal 2-D system with equal scattering along the rod, i.e., ρ₂ symmetric under inversion of z), we can write ρ̃₂(Q⊥, −Q_z) ≈ ρ̃₂(Q⊥, Q_z). In this simple case, the cross-section can be approximated as follows (Vineyard, 1982; Sinha et al., 1988; Feidenhans'l, 1989)

$$\frac{d\sigma}{d\Omega} = \left|t^i(k^i_{z,s})\,\tilde{\rho}_2(\mathbf{Q}_\perp, Q'_z)\, t^f(k^f_{z,s})\right|^2 + \left|t^i(k^i_{z,s})\,\tilde{\rho}_2(\mathbf{Q}_\perp, Q''_z)\, t^f(k^f_{z,0})\right|^2 \qquad (56)$$

where Q⊥ = k^i_⊥ − k^f_⊥, Q′_z = k^i_{z,0} − k^f_{z,s}, and Q″_z = k^i_{z,s} − k^f_{z,0}. Notice that the transmission functions modulate the scattering from the film (ρ₂); in particular, they give rise to enhancements as k^i_{z,s} and k^f_{z,s} are scanned around the critical angle, as depicted in Figure 2B. Also,
Figure 6. Illustration of wave paths for an exterior (A) and an interior (B) scatterer near a step-like interface. In both cases the scattering is enhanced by the transmission function when the angle of incidence is varied around the critical angle. However, due to the asymmetry between external and internal reflectivity, the rod scan of the final beam modulates the scattering differently, as shown on the right-hand side of each case.
it is only by virtue of the z symmetry of the scatterer that such enhancements occur for an exterior film. From this analysis we also notice that there will be no enhancement due to the transmission function of the final wave for an interior film. To examine the results of the DWBA method, we consider scattering from a single scatterer near the surface. The discussion is restricted to the case where the scattered beam is detected in the vapor phase only. The scatterer can be placed either in the vacuum (z > 0) or in the liquid (see Fig. 6). When the particle is placed in the vacuum, there are two major relevant incident waves: a direct one from the source, labeled 1_i in Figure 6A, and a second one reflected from the surface before scattering from ρ₂, labeled 2_i. Assuming inversion symmetry along z, both waves scatter into a finite Q⊥ with similar strengths at Q_z and −Q_z, giving rise to an enhancement when the incident beam is near the critical angle. Another multiple scattering process that produces an enhancement at the critical angle is one in which beam 1_i does not undergo a change in momentum transfer along z (Q_z^film ≈ 0) before scattering from the liquid interface. These processes give rise to enhancements when either the incident or the reflected beam is scanned along the z direction. Slight modifications of the momentum transfer along the z direction, such as

$$Q'_z = k^i_z + \sqrt{(k^f_z)^2 - k_c^2} \qquad (57)$$
are neglected in the discussion above. The effective amplitude from a scatterer outside the medium is given by

$$\left|e^{iQ_z z} + r(k^f_{z,s})\, e^{iQ'_z z}\right| \approx \left|t(k^i_{z,s})\right| \qquad (58)$$
where the approximation is valid because the phase factors can be neglected and Q_z ≈ Q′_z. At large angles the reflectivity is negligible and the transmission function approaches t(k_{z,s}) ≈ 1. Similar arguments hold for the outgoing wave. Neglecting the small changes in the momentum transfer due to dynamical effects, the transmission function modulates the scattering as shown by the solid line in Figure 2B. The wave scattered from a particle embedded in the medium behaves differently because of the asymmetry between external and internal reflection at the liquid subphase. The wave scattered from the particle rescatters from the liquid-gas interface. Upon traversing the liquid interface, the index of refraction increases from n = 1 − (2π/k₀²)ρ to 1, and no total internal reflection occurs, as discussed earlier; thus there is no evanescent wave in the medium. The transmission function for this wave is given by t(−k^i_{z,s}), like that of a wave emanating from the liquid interface into the vapor phase. In this case the transmission function is real for all k^i_{z,s} and does not show the enhancements around the critical angle (see Fig. 6B), with zero intensity at the horizon.

Grazing Incidence Diffraction (GID) and Rod Scans. In some instances, ordering of molecules occurs at liquid interfaces. Langmuir monolayers spread at the gas-water interface usually order homogeneously at sufficiently high lateral pressures (Dutta et al., 1987; Kjaer et al., 1987, 1994). Surface crystallization of n-alkane molecules on molten alkane has also been observed (Wu et al., 1993b; Ocko et al., 1997). In these cases, ρ₂ is a periodic function of x and y and can be expanded as a Fourier series in terms of the 2-D reciprocal lattice vectors τ⊥ as follows

$$\rho_2(\mathbf{l}, z) = \sum_{\tau_\perp} F(\tau_\perp, z)\, e^{i\tau_\perp\cdot\mathbf{l}} \qquad (59)$$
Inserting Equation 59 into Equation 56 and integrating yields the cross-section for quasi-2-D Bragg reflections at Q⊥ = τ⊥,

$$\frac{d\sigma}{d\Omega} \propto P(\mathbf{Q})\,|t^i(k^i_{z,s})|^2\,\langle|F(\tau_\perp, Q_z)|^2\rangle\, DW(Q_\perp, Q_z)\,|t^f(k^f_{z,s})|^2\,\delta(\mathbf{Q}_\perp - \tau_\perp) \qquad (60)$$

where P(Q) is a polarization correction and the 2-D unit-cell structure factor is given as a sum over the atomic form factors f_j(Q) with the appropriate phases,

$$F(\tau_\perp, Q_z) = \sum_j f_j(\mathbf{Q})\, e^{i(\tau_\perp\cdot\mathbf{r}_j + Q_z z_j)} \qquad (61)$$
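Equation 61 can be evaluated directly for a small unit cell. The sketch below uses a hypothetical two-atom 2-D cell with constant form factors; the lattice constant, positions, and f_j values are assumptions for illustration only.

```python
import numpy as np

# Equation 61: 2-D unit-cell structure factor, here for a hypothetical
# two-atom cell with constant (Q-independent) form factors.
def structure_factor(tau, qz, xy, z, f):
    """F(tau_perp, Q_z) = sum_j f_j exp(i(tau . r_j + Q_z z_j))"""
    return sum(fj * np.exp(1j * (np.dot(tau, rj) + qz * zj))
               for rj, zj, fj in zip(xy, z, f))

a = 5.0                                      # hypothetical 2-D lattice constant, A
tau10 = np.array([2 * np.pi / a, 0.0])       # (1,0) reciprocal lattice vector
xy = [np.array([0.0, 0.0]), np.array([a / 2, 0.0])]
F = structure_factor(tau10, 0.1, xy, z=[0.0, 1.5], f=[1.0, 1.0])
intensity = abs(F) ** 2
```

For two identical atoms separated by a/2 at equal height, the (1,0) reflection is extinguished at Q_z = 0, since the two contributions differ in phase by π.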
The structure factor squared is averaged for multiplicity due to domains and weighted for orientation relative to the surface normal. The ordering of monolayers at the
air-water interface is usually in the form of a 2-D powder consisting of crystals with random orientation in the plane. From Equation 60 we notice that the conservation of momentum expressed by the delta function allows observation of the Bragg reflection at any Q_z. A rod scan can be performed by varying either the incident or the reflected beam, or both. The variation of each produces some modulation, due both to the transmission functions and to the average molecular structure factor along the z axis. The Debye-Waller factor (see KINEMATIC DIFFRACTION OF X-RAYS), DW(Q⊥, Q_z), which is due to the vibration of molecules about their equilibrium positions with time-dependent molecular displacement u(t), is given by

$$DW(Q_\perp, Q_z) \approx e^{-(C_\perp Q_\perp^2\langle u_\perp^2\rangle + Q_z^2\sigma^2)} \qquad (62)$$
The term due to capillary waves on the liquid surface dominates over the contribution from the in-plane intrinsic fluctuations. The Debye-Waller factor in this case is an average over a crystallite and may not reflect the surface roughness extracted from reflectivity measurements, where the average is over the whole sample.
PRACTICAL ASPECTS OF THE METHOD

The minute size of interfacial samples, at the sub-microgram level, combined with the weak interaction of x-rays with matter, results in very weak GID and reflectivity (at large Q_z) signals that require highly intense incident beams, which are available at x-ray synchrotron sources (see SURFACE X-RAY DIFFRACTION). A well-prepared incident beam for reflectivity experiments at a synchrotron (for example, the X22B beamline at the National Synchrotron Light Source at Brookhaven National Laboratory; Schwartz, 1992) has an intensity of 10⁸ to 10⁹ photons/s, whereas, for a similar resolution, an 18-kW rotating-anode generator produces 10⁴ to 10⁵ photons/s. Although reflectivity measurements can be carried out with standard x-ray generators, the measurements are limited to almost half the angular range accessible at synchrotron sources, and they take hours to complete compared to minutes at the synchrotron. GID experiments are practically impossible with x-ray generators, since the expected signals (2-D Bragg reflections, for example), normalized to the incident beam, are on the order of 10⁻⁸ to 10⁻¹⁰.

Reflectivity

X-ray reflectivity and GID measurements of liquid surfaces are carried out on special reflectometers that enable manipulation of the incident as well as the outgoing beam. A prototype liquid-surface reflectometer was first introduced by Als-Nielsen and Pershan (1983). In order to bring the beam to an angle of incidence α_i with respect to the liquid surface, the monochromator is tilted by an angle χ, either about the axis of the incident beam (indicated by χ₁ in Fig. 7) or about the axis normal to the reciprocal lattice vector of the monochromator, τ₀ (χ₂). Figure 7 shows the geometry that is used to deflect
the beam from the horizontal onto the liquid surface at an angle α_i by tilting the monochromator. At the Bragg condition, the surface of the monochromator crystal is at an angle ψ with respect to the incoming beam. Tilting about the incident-beam axis is like tracing the Bragg reflection on the Debye-Scherrer cone, so that the ψ axis remains fixed, with a constant wavelength at different tilting angles. The rotated reciprocal lattice vector and the final wave vector in this frame are given by

$$\boldsymbol{\tau}_0 = \tau_0(-\sin\psi,\ \cos\psi\cos\chi_1,\ \cos\psi\sin\chi_1)$$
$$\mathbf{k}_f = k_0(\cos\alpha_i\cos\phi,\ \cos\alpha_i\sin\phi,\ \sin\alpha_i) \qquad (63)$$

where φ is the horizontal scattering angle. The Bragg conditions for scattering are given by

$$\mathbf{k}_i + \boldsymbol{\tau}_0 = \mathbf{k}_f, \qquad |\mathbf{k}_f| = k_0 \qquad (64)$$

Using Equations 63 and 64, the following relations for the monochromator axes are obtained

$$\sin\psi = \frac{\tau_0}{2k_0}, \qquad k_0\sin\alpha_i = \tau_0\cos\psi\sin\chi_1, \qquad \cos\alpha_i\cos\phi = 1 - \frac{\tau_0^2}{2k_0^2} \qquad (65)$$

and we notice that the monochromator angle ψ is independent of α_i. However, the scattering angle φ has to be modified as α_i is varied; this means that the whole reflectometer arm has to be rotated. Similarly, for the configuration in which the monochromator is tilted about the axis normal to τ₀, we get

$$\sin\psi = \frac{\tau_0}{2k_0\cos\chi_2}, \qquad k_0\sin\alpha_i = \tau_0\sin\chi_2, \qquad \cos\alpha_i\cos\phi = 1 - \frac{\tau_0^2}{2k_0^2} \qquad (66)$$

Figure 7. (A) Monochromator geometry used to tilt a Bragg-reflected beam from the horizon onto a liquid surface. Two possible tilting configurations, about the primary-beam axis and about an axis along the surface of the reflecting planes, are shown. (B) Side-view diagram of the Ames Laboratory Liquid Surfaces Diffractometer at the 6-ID beamline at the Advanced Photon Source at Argonne National Laboratory.
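The first-mode relations can be inverted for the motor angles. The sketch below solves Equation 65 for ψ, χ₁, and φ at a requested α_i; the 8-keV wavelength and Ge(111) d-spacing are assumed example values, not specifications of any particular instrument.

```python
import numpy as np

# Equation 65 solved for the Bragg angle psi, the tilt chi_1, and the
# horizontal scattering angle phi that bring the beam to incidence alpha_i
# (tilting about the incident-beam axis; constant wavelength).
def tilt_mode1(alpha_i, wavelength=1.55, d_spacing=3.266):
    k0 = 2.0 * np.pi / wavelength               # 1/A, assumed 8-keV beam
    tau0 = 2.0 * np.pi / d_spacing              # |tau_0|, assumed Ge(111)
    psi = np.arcsin(tau0 / (2.0 * k0))          # independent of alpha_i
    chi1 = np.arcsin(k0 * np.sin(alpha_i) / (tau0 * np.cos(psi)))
    phi = np.arccos((1.0 - tau0 ** 2 / (2.0 * k0 ** 2)) / np.cos(alpha_i))
    return psi, chi1, phi

psi, chi1, phi = tilt_mode1(np.radians(0.15))   # a typical incidence angle
```

At α_i = 0 the tilt vanishes and φ reduces to 2ψ, the ordinary horizontal Bragg scattering angle, which is a useful sanity check on the geometry.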
From these relations, the conditions for constant-wavelength operation at any angle of incidence α_i can be calculated and applied to the reflectometer. Here, unlike in the previous mode, deflection of the beam to different angles of incidence requires adjustment of both ψ and φ in order to maintain a constant wavelength. If ψ is not corrected in this mode of operation, the wavelength varies as χ₂ is varied. This mode is sometimes desirable, especially when the beam hitting the monochromator consists of a continuous distribution of wavelengths around the wavelength at horizontal scattering, χ₂ = 0. Such a continuous wavelength distribution exists when operating with x-ray tubes or when the tilting monochromator faces the white beam of a synchrotron. Although the variation in wavelength is negligible as χ₂ is varied without correction of ψ, the exact wavelength and momentum transfer can be computed using the relations in Equations 65 and 66. In both modes of monochromator tilting, the surface height as well as the heights of the slits are adjusted with vertical translations. Figure 7, panel B, shows a side-view diagram of the Ames Laboratory Liquid Surfaces Diffractometer at the 6-ID beamline at the Advanced Photon Source at Argonne National Laboratory. A downstream Si double-crystal monochromator selects a very narrow energy bandwidth (∼1 eV, in the range 4 to 40 keV) from an undulator-generated beam. The highly monochromatic beam is deflected onto the liquid surface at any desired angle of incidence α_i by the beam-tilting monochromator; typically Ge(111) or Ge(220) crystals are used. To simplify its description, the diffractometer can be divided into two main stages, with components that may vary slightly depending on the details of the experiment. In the first stage, the incident beam on the liquid surface is optimized. This part consists of the axes that adjust the beam-tilting
monochromator (c; f; w), incident beam slits (Si), beam monitor, and attenuator. The o axis, below the monochromator crystal, is adjusted in the initial alignment process to ensure that the tilting axis w is well defined. For each angle of incidence ai , the c, f, w and the height of the incident beam arm (IH) carrying the S1, S2 slits, are adjusted. In addition, the liquid surface height (SH) is brought to the intersecting point between the incident beam and the detector arm axis of rotation 2ys . In the second stage of the diffractometer, the intensity of the scattered beam from the surface is mapped out. In this section, the angles af and 2ys and the detector height (DH) are adjusted to select and detect the outgoing beam. The two stages are coupled through the f-arm of the diffractometer. In general, ys is kept idle because of the polycrystalline nature of monolayers at liquid surfaces. Surface crystallization of alkanes proceeds with the formation of a few large single crystals, and the use of ys is essential to orienting one of these crystals with respect to the incoming beam. The scattering from liquid metals is complicated by the fact that their surfaces are not uniformly flat. For details on how to scatter from liquids with curved surfaces see Regan et al. (1997). For aqueous surfaces, the trough (approximate dimensions 120 270 5 mm3) is placed in an airtight aluminum enclosure that allows for the exchange of the gaseous environment around the liquid surface. To get total reflectivity from water below the critical angle (e.g., ac ¼ 0:1538 and 0.07688 at 8 keV and 16 keV respectively), the footprint of the beam has to be smaller than the specimen surface. A typical cross-section of the incident beam is 0:5 0:1 mm2 with approximately 1010 to 1011 photons per second (6-ID beamline). 
At about 0.8α_c, the footprint of a beam with a vertical size (slit S2) of about 0.1 mm is 47 and 94 mm at 8 keV and 16 keV, respectively, compared to the 120-mm liquid-surface dimension in the direction of the primary beam. Whereas a 0.1-mm beam size is adequate for obtaining total reflectivity at 8 keV (exploiting about half of the liquid surface in the beam direction), a 0.05-mm beam size is more appropriate at 16 keV. This vertical beam size (slit S2) can be increased at larger incident angles to maintain a relatively constant beam footprint on the surface. The alignment of the diffractometer encompasses two interconnected, iterated processes. First, the angles of the first stage (α_i, ψ, φ, χ, and ω) are optimized and set so that the x-ray flux at the monitor position is preserved upon deflection of the beam (the tracking procedure). Second, the beam is steered so that it is parallel to the liquid surface. It should be emphasized that, after the tracking procedure, the beam is not necessarily parallel to the liquid surface. In this process, reflectivities from the liquid surface at various incident angles are employed to define the parallel beam, by adjusting the angles and heights of the diffractometer. The first and second processes are then iterated repeatedly until convergence is achieved (i.e., until corrections to the initial motor positions are smaller than the mechanical accuracies). The divergence of the monochromatic incident beam on the surface is determined by at least two horizontal slits located between the sample and the source. One of these
Figure 8. Superposition of the reflected beam (circles) below the critical angle and the direct beam (triangles), demonstrating total reflectivity of x-rays from the surface of water. Severe surface roughness reduces the intensity and widens the reflected signal. A reduction from total reflectivity can also occur if the slits of the incident beam are too wide, so that the beam footprint is larger than the sample surface.
slits is usually located as close as possible to the sample, and the other as close as possible to the source. These two slits determine the resolution of the incident beam. Deflecting the beam from the horizontal changes the shape of the beam, which may change the incident intensity passing through the slits; therefore a monitor directly after the slit in front of the sample is essential for absolute determination of the reflectivity. The sizes of the two slits defining the incident beam are chosen such that the footprint of the beam is much smaller than the width of the reflecting surface, so that total reflectivity occurs. Figure 8 shows the reflected beam and the direct beam from a flat water surface, demonstrating total reflectivity at Q_z = 0.85Q_c. In this experiment the detector slit is wide open, at 10 times the opening of the sample slit. As demonstrated, the effect of absorption is negligible for water, and roughness is significantly reduced by damping surface waves. The damping can be achieved by reducing the height of the water film to 0.3 mm and placing passive as well as active antivibration units underneath the liquid sample holder, suppressing mechanical vibrations (Kjaer et al., 1994).
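The footprint numbers quoted above follow from simple geometry: a slit of vertical size s illuminates s/sin α of the surface. A minimal check, using the critical angles for water given in the text:

```python
import numpy as np

# Beam footprint on the liquid surface: slit size / sin(alpha).
# With the quoted critical angles for water (0.153 deg at 8 keV,
# 0.0768 deg at 16 keV) and a 0.1-mm slit, this reproduces the
# ~47-mm and ~94-mm footprints at 0.8*alpha_c.
def footprint_mm(slit_mm, alpha_deg):
    return slit_mm / np.sin(np.radians(alpha_deg))

fp_8kev = footprint_mm(0.1, 0.8 * 0.153)     # about 47 mm
fp_16kev = footprint_mm(0.1, 0.8 * 0.0768)   # about 94 mm
```

This makes clear why the slit must be narrowed (or the sample lengthened) at higher energies, where α_c is smaller.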
ð67Þ
where

$$\frac{1}{\Lambda} = \sqrt{k_c^2 - k_{z,0}^2} \qquad (68)$$

For water at k_z ≈ 0.9k_c, Λ ≈ 100 Å. As illustrated in Figure 1, the components of the momentum transfer for GID are given by

$$Q_z = k_0(\sin\alpha_i + \sin\alpha_f), \qquad Q_x = k_0(\cos\alpha_i - \cos\alpha_f\cos 2\theta), \qquad Q_y = k_0\cos\alpha_f\sin 2\theta \qquad (69)$$
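Equation 69 maps the three instrument angles onto the momentum transfer. A sketch, with an assumed 8-keV wavelength (the angles themselves are arbitrary examples):

```python
import numpy as np

# Equation 69 for the GID geometry of Figure 1B; angles in radians,
# k0 = 2*pi/lambda. The wavelength is an assumed example value.
def gid_momentum_transfer(alpha_i, alpha_f, two_theta, k0):
    qz = k0 * (np.sin(alpha_i) + np.sin(alpha_f))
    qx = k0 * (np.cos(alpha_i) - np.cos(alpha_f) * np.cos(two_theta))
    qy = k0 * np.cos(alpha_f) * np.sin(two_theta)
    return qx, qy, qz

k0 = 2 * np.pi / 1.55                    # 1/A, assumed 8-keV beam
qx, qy, qz = gid_momentum_transfer(np.radians(0.1), np.radians(0.3),
                                   np.radians(20.0), k0)
q_perp = np.hypot(qx, qy)                # the in-plane modulus of Equation 70
```

In the specular limit (α_i = α_f, 2θ = 0) the in-plane components vanish and Q_z = 2k₀ sin α, as expected.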
In most cases, the 2-D order on liquid surfaces is powder-like, and the lateral scans are displayed in terms of Q⊥, which is given by

$$Q_\perp = k_0\sqrt{\cos^2\alpha_i + \cos^2\alpha_f - 2\cos\alpha_i\cos\alpha_f\cos 2\theta} \qquad (70)$$
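The surface enhancement that motivates this sub-critical GID geometry (Equations 67 and 68) can be illustrated numerically. In this sketch, k_c is an assumed water-like value (≈Q_c/2); note that the intensity |E|² decays twice as fast as the amplitude, i.e., over half the length computed below.

```python
import numpy as np

# Fresnel transmission |t|^2 = |2 k_z0 / (k_z0 + k_zs)|^2, which peaks
# (value 4) exactly at k_z0 = k_c, and the evanescent amplitude decay
# length of Equation 68. k_c is an assumed, water-like example value.
def transmission2(kz0, kc):
    kzs = np.sqrt(np.asarray(kz0, dtype=complex) ** 2 - kc ** 2)
    return np.abs(2 * kz0 / (kz0 + kzs)) ** 2

def decay_length(kz0, kc):
    return 1.0 / np.sqrt(kc ** 2 - kz0 ** 2)   # Equation 68, valid for kz0 < kc

kc = 0.0109                                    # 1/A
enhancement = transmission2(np.array([kc]), kc)[0]   # = 4 at the critical edge
lam = decay_length(0.9 * kc, kc)               # amplitude decay length, in A
```

The factor-of-4 maximum in |t|² at the critical angle is the origin of the GID enhancements discussed for the DWBA above.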
To determine the in-plane correlations, the horizontal resolution of the diffractometer can be adjusted with a Soller collimator, consisting of vertical absorbing foils stacked together between the surface and the detector. The area probed at each scattering angle 2θ is proportional to S₀/sin 2θ, where S₀ is the area probed at 2θ = π/2. The probed area must be taken into account in the analysis of a GID scan performed over a wide range of angles. Position-sensitive detectors (PSDs) are commonly used to measure the intensity along the 2-D rods. It should be kept in mind that the intensity along the PSD is not a true rod scan of a Bragg reflection at a nominal Q⊥, because of the variation of Q⊥ as α_f is varied, as seen in Equation 69.

DATA ANALYSIS AND INITIAL INTERPRETATION

The task of finding the SLD from a reflectivity curve is similar to that of finding an effective potential for Equation 7 from the modulus of the wave function. Direct inversion of the scattering amplitude to an SLD is not possible except in special cases where the BA is valid (Sacks, 1993, and references therein). If the modulus and the phase are both known, they can be converted to an SLD by the Gelfand-Levitan-Marchenko (GLM) method (Sacks, 1993, and references therein). However, in reflectivity experiments only the intensity of the scattered beam is measured, and the phase information is lost. Step-like potentials have been directly reconstructed by retrieving the phase from the modulus, i.e., the reflectivity, and then applying the GLM method (Clinton, 1993; also Sacks, 1993, and references therein). Model-independent methods, based on optimizing a model against the reflectivity without requiring any knowledge of the chemical composition of the SLD at the interface, have also been developed (Pedersen, 1992; Zhou and Chen, 1995). Such models incorporate a certain degree of objectivity.
These methods are based on the kinematical and dynamical approaches for calculating the reflectivity. One method (Pedersen, 1992) uses indirect Fourier transformation to calculate the correlation function of dρ/dz, which is subsequently used in a square-root deconvolution procedure to construct the SLD model. Zhou and Chen (1995), on the other hand, developed a groove-tracking method based on an optimization algorithm that reconstructs the SLD, using the dynamical approach to calculate the reflectivity at each step. The most common procedure for extracting structural information from reflectivity is standard nonlinear least-squares refinement of an initial SLD model. The initial model is defined in terms of a P-dimensional set of independent parameters, p, using all the information available in guessing ρ(z, p). The parameters are then refined by calculating the reflectivity, R(Q_z^i, p), with the tools described earlier, and minimizing the quantity

$$\chi^2(\mathbf{p}) = \frac{1}{N - P}\sum_{i=1}^{N}\left[\frac{R_{exp}(Q_z^i) - R(Q_z^i, \mathbf{p})}{e(Q_z^i)}\right]^2 \qquad (71)$$
where e(Q_z^i) is the uncertainty of the measured reflectivity, R_exp(Q_z^i), and N is the number of measured points. Criteria for a good fit can be found in Bevington (1968). The uncertainty of a given parameter can be obtained by fixing it at various values and, for each value, refining the remaining parameters until χ² increases by at least 1/(N − P). The direct methods and model-independent procedures for reconstructing the SLD do not guarantee uniqueness of the potential; there can be multiple SLD profiles that yield essentially the same reflectivity curve, as discussed with regard to Figure 5, for example. Uniqueness can be approached by introducing physical constraints into the parameters of the model; volume, the in-plane density of electrons, and the like are among the constraints that can be used. Applying such constraints is discussed briefly under Examples (also see Vaknin et al., 1991a,b; Gregory et al., 1997). These constraints reduce the uncertainties and make the relationship of the SLD to the actual molecular arrangement apparent. In the dynamical approach, no two potentials yield exactly the same reflectivity, although the differences between two models may be too small to detect in an experiment. An experimental method for solving such a problem was suggested by Sanyal et al. (1992), using anomalous x-ray reflectivity. Two reflectivity curves from the same sample are measured at two different x-ray energies, one below and one above an absorption edge of the substrate atoms, thereby varying the scattering length density of the substrate. The two reflectivity curves can then be used to perform a direct Fourier reconstruction (Sanyal et al., 1992), or refinement methods can be used to remove ambiguities. This method is not efficient for liquids consisting of light atoms, because their absorption edges lie at very low energies with respect to standard x-ray energies.
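Equation 71 is simple to implement; in a real refinement the model curve would come from Equation 32 or the matrix method, whereas the arrays below are placeholders for illustration.

```python
import numpy as np

# Equation 71: the reduced chi-square minimized in least-squares refinement
# of an SLD model. Measured and model reflectivities here are placeholder
# arrays; the 5% uncertainties are an assumption.
def chi_square(r_exp, r_model, err, n_params):
    resid = (r_exp - r_model) / err
    return np.sum(resid ** 2) / (len(r_exp) - n_params)

r_exp = np.array([1.0, 0.52, 0.11, 0.021])
err = 0.05 * r_exp                       # assumed 5% relative uncertainties
chi2_perfect = chi_square(r_exp, r_exp.copy(), err, n_params=2)
```

A model that reproduces the data exactly gives χ² = 0, and a uniform 5% offset gives residuals of −1 at every point, so χ² = N/(N − P).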
Another way to overcome the problem of uniqueness is to perform reflectivity experiments on similar samples with both x-rays and neutrons. In addition, the SLD ρ(z) across the
X-RAY TECHNIQUES
interface can be changed significantly in neutron scattering experiments by chemical exchange of isotopes, which changes ρ(z) but maintains the same structure (Vaknin et al., 1991b; Penfold and Thomas, 1990). The reflectivities (x-ray as well as neutron) can then be fitted to one structural model defined in terms of geometrical parameters only, calculating the SLDs from the scattering lengths of the constituents and the geometrical parameters (Vaknin et al., 1991b,c).

Examples

Since the pioneering work of Als-Nielsen and Pershan (1983), x-ray reflectivity and GID have become standard tools for the characterization of liquid surfaces at atomic length scales. The techniques have been exploited in studies of the physical properties of simple liquids (Braslau et al., 1988; Sanyal et al., 1991; Ocko et al., 1994), Langmuir monolayers (Dutta et al., 1987; Kjaer et al., 1987, 1989, 1994; Als-Nielsen and Kjaer, 1989; Vaknin et al., 1991b), liquid metals (Rice et al., 1986; Magnussen et al., 1995; Regan et al., 1995), surface crystallization (Wu et al., 1993a,b, 1995; Ocko et al., 1997), liquid crystals (Pershan et al., 1987), surface properties of quantum liquids (Lurio et al., 1992), protein recognition processes at liquid surfaces (Vaknin et al., 1991a, 1993; Lösche et al., 1993), and many other applications. Here, only a few examples are briefly described in order to demonstrate the strengths and the limitations of the techniques; in presenting them, there is no intention of giving a full theoretical background of the systems.

Simple Liquids. The term simple liquid is usually reserved for a monoatomic system governed by van der Waals–type interactions, such as liquid argon. Here, the term is extended to include all classical dielectric (nonmetallic) liquids, such as water, organic solvents (methanol, ethanol, chloroform, etc.), and others.
One of the main issues regarding dielectric liquids is the determination of the average density profile across the interface, N(z). This density is the result of folding the intrinsic density of the interface, N_I(z), determined by molecular size, viscosity, and compressibility of the fluid, with density fluctuations due to capillary waves, N_CW(z). The continuous nature of the density across the interface due to capillary waves was worked out by Buff, Lovett, and Stillinger (BLS; Buff et al., 1965) assuming that N_I(z) is an ideal step-like function. The probability for a displacement is taken to be proportional to the Boltzmann factor, e^{−βU(z)}, where U is the free energy necessary to disturb the surface from its equilibrium state, i.e., z(x, y) = 0, and β = 1/k_B T, where k_B is Boltzmann's constant. The free energy of an incompressible and nonviscous liquid surface consists of two terms: a surface tension (γ) term, proportional to the change in area from the ideally flat surface, and a gravitational term, as follows:

U = \int \left( \gamma\left[\sqrt{1+|\nabla z|^2} - 1\right] + \tfrac{1}{2} m_s g z^2 \right) d^2l \approx \int \tfrac{1}{2}\left( \gamma |\nabla z|^2 + m_s g z^2 \right) d^2l    (72)
where m_s is the mass density of the liquid substrate. By using standard Gaussian approximation methods, Buff et al. (1965) find that

U(z) \propto \frac{z^2}{2\sigma_0^2}    (73)
Convolution of the probability with a step-like function representing the intrinsic density of the liquid surface yields the following density function:

N(z) = \frac{N_s}{2}\,\mathrm{erfc}\!\left(\frac{z}{\sqrt{2}\,\sigma}\right)    (74)
with a form similar to the one given in Equation 47. The average surface roughness at temperature T is then given by

\sigma_{CW}^2 = \frac{k_B T}{2\pi\gamma}\,\ln\frac{L}{a_0}    (75)
where a_0 is a molecular diameter and L is the size of the surface. Notice the logarithmic divergence of the fluctuations as the size of the surface increases, as expected for a 2-D system (Landau and Lifshitz, 1980). This model was further refined by assuming that the intrinsic profile has a finite width (Evans, 1979). In particular, if the width due to the intrinsic profile is also expressed by a Gaussian, the effective surface roughness is given by

\sigma_{\mathrm{eff}}^2 = \sigma_I^2 + \sigma_{CW}^2    (76)
and the calculated reflectivity is similar to Equation 35 for an interface that is smeared like the error function:

R_{CW} = R_F(Q_z)\, e^{-\sigma_{\mathrm{eff}}^2 Q_z^2}    (77)
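To illustrate the magnitudes involved, Equations 74, 75, and 77 can be evaluated with textbook values for water; a minimal sketch, in which the cutoff length L, the molecular diameter a0, and the normalizations are assumed, order-of-magnitude inputs rather than fitted values:

```python
import math

# Illustrative inputs (assumed values for a water surface):
kB = 1.380649e-23      # J/K, Boltzmann's constant
T = 298.0              # K
gamma = 72e-3          # N/m, surface tension of water
L = 1e-3               # m, lateral cutoff set by surface size / resolution (assumed)
a0 = 3e-10             # m, molecular diameter (assumed)

# Eq. 75: capillary-wave roughness, converted to Angstrom
sigma_cw = math.sqrt(kB * T / (2.0 * math.pi * gamma) * math.log(L / a0)) * 1e10

# Eq. 74: error-function density profile (normalized; bulk is on the z < 0 side)
def density(z, sigma):
    return 0.5 * math.erfc(z / (math.sqrt(2.0) * sigma))

# Eq. 77: normalized reflectivity damped by the effective roughness
def r_over_rf(qz, sigma):
    # qz in 1/Angstrom, sigma in Angstrom
    return math.exp(-(sigma * qz) ** 2)
```

With these inputs the capillary roughness comes out at a few Angstroms, of the same order as the values measured for water in the text.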
Figure 9 shows the reflectivity from pure water measured at the synchrotron (D. Vaknin, unpub. observ.), where it is shown that using Equation 77 to fit the reflectivity data is satisfactory. This implies that the error-function type of density profile (BLS model) for the liquid interface is sufficient. Only the surface roughness parameter, σ, is varied to refine the fit to the data (σ = 2.54 Å). This small roughness value depends on the attenuation of capillary waves achieved by reducing the depth of the water to 0.3 mm by placing a flat glass under the water (Kjaer et al., 1994). The validity of the Gaussian approximation of N(z) (BLS model) was examined by various groups and for a variety of systems (Braslau et al., 1988; Sanyal et al., 1991; Ocko et al., 1994). Ocko et al. (1994) measured the reflectivity of liquid alkanes over a wide range of temperatures, verifying that the surface roughness is of the form given in Equations 75 and 76. Experimentally, the reflectivity signal at each Q_z from a rough interface is convoluted with the resolution of the spectrometer in different directions. The effect of the resolution in Q_z can be calculated analytically or convoluted numerically. For simplicity, we consider that the resolution function can be approximated as a Gaussian with a
X-RAY DIFFRACTION TECHNIQUES FOR LIQUID SURFACES AND MONOMOLECULAR LAYERS
Figure 9. Experimental reflectivity from the surface of water. The dashed line is the calculated Fresnel reflectivity from an ideally flat water interface, R_F. The normalized reflectivity versus Q_z² is fitted to the form R/R_F = e^{−(Q_z σ)²}, demonstrating the validity of the capillary-wave model (Buff et al., 1965).

width of ΔQ_z along Q_z, which can be taken as e^{−Q_z²/ΔQ_z²} with an appropriate normalization factor (Bouwman and Pedersen, 1996). The resolution ΔQ_z is Q_z-dependent, as the angles of incidence and scattering are varied (Ocko et al., 1994). However, if we assume that around a certain Q_z value the resolution is constant and we measure σ_exp, the convolution of the true reflectivity with the resolution function yields the following relation:

\frac{1}{\sigma_{\mathrm{exp}}^2} = \Delta Q_z^2 + \frac{1}{\sigma_{\mathrm{eff}}^2}    (78)

from which the effective roughness can be extracted as follows:

\sigma_{\mathrm{eff}} = \frac{\sigma_{\mathrm{exp}}}{\sqrt{1 - \sigma_{\mathrm{exp}}^2\,\Delta Q_z^2}}    (79)

Thus, if the resolution is infinitely good, i.e., ΔQ_z = 0, the measured and effective roughness are the same. However, as the resolution is relaxed, the measured roughness becomes smaller than the effective roughness. The effect of the resolution on the determination of the true surface roughness was discussed rigorously by Braslau et al. (1988). Diffuse scattering from liquid surfaces is practically inevitable, due to the presence of capillary waves. Calculation of the scattering from disordered interfaces of various characteristics is treated in Sinha et al. (1988). In the Born approximation, true specular scattering from liquid surfaces exists only by virtue of the finite cutoff length of the mean-square height fluctuations (Equation 75). In other words, the fluctuations due to capillary waves diverge logarithmically, and true specular reflectivity is observed only by virtue of the finite instrumental resolution. The theory for the diffuse scattering from fractal surfaces and other rough surfaces was also developed in Sinha et al. (1988).
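The resolution correction of Equation 79 is straightforward to apply; a minimal sketch (the numerical values in the usage note are assumed, illustrative inputs):

```python
import math

def effective_roughness(sigma_exp, dqz):
    """Eq. 79: recover the effective roughness from the measured one.

    sigma_exp -- measured roughness (Angstrom)
    dqz       -- Gaussian resolution width in Qz (1/Angstrom)
    """
    return sigma_exp / math.sqrt(1.0 - (sigma_exp * dqz) ** 2)
```

With perfect resolution (dqz = 0) the two roughnesses coincide, e.g. effective_roughness(2.5, 0.0) returns 2.5; relaxing the resolution makes the measured roughness an underestimate of the effective one.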
Langmuir Monolayers. A Langmuir monolayer (LM) is a monomolecular amphiphilic film spread at the air-water interface. Each amphiphilic molecule consists of a polar head group (hydrophilic moiety) and a nonpolar tail, typically a hydrocarbon (hydrophobic) chain (Gaines, 1966; Swalen et al., 1987; Möhwald, 1990). Typical examples are fatty acids, lipids, alcohols, and others. The length of the hydrocarbon chain can be varied chemically, affecting the hydrophobic character of the molecule. The head group, on the other hand, can be ionic or dipolar, or it may have a shape that attracts specific compounds present in the aqueous solution. One important motivation for studying LMs is their close relationship to biological systems. Membranes of all living cells, and of organelles within cells, consist of lipid bilayers interpenetrated with specific proteins, alcohols, and other organic compounds that combine to give functional macromolecules governing the transport of matter and energy through the membrane. It is well known that biological function is closely tied to structure, and structures can be determined by XR and GID. In addition, delicate surface chemistry can be carried out at the head-group interface with molecules in the aqueous solution. From the physics point of view, the LM belongs to an important class of quasi-2-D systems, by means of which statistical models that depend on the dimensionality of the system can be examined.

In this example, results from a simple lipid, dihexadecyl hydrogen phosphate (DHDP), consisting of a phosphate head group (PO4) and its two attached hydrocarbon chains, are presented. Figure 10A displays the normalized reflectivity of a DHDP monolayer at the air-water interface at a lateral pressure of 40 mN/m. The corresponding electron density profile is shown in the inset as a solid line; the profile in the absence of surface roughness (σ = 0) is displayed as a dashed line. The bulk water subphase corresponds to z < 0, the phosphate head-group region to 0 ≤ z ≤ 3.4 Å, and the hydrocarbon tails to the region 3.4 Å ≤ z ≤ 23.1 Å. As a first-stage analysis of the reflectivity, a model SLD with a minimum number of boxes, i = 1, 2, 3, ..., is constructed. Each box is characterized by a thickness d_i and an electron density N_{e,i}, with one surface roughness, σ, for all interfaces. Refinement of the reflectivity with Equation 37 shows that a two-box model is sufficient. To improve the analysis, we can take advantage of what is known about the monolayer, i.e., the constituents used and the molecular area determined from the lateral-pressure versus molecular-area isotherm. If the monolayer is homogeneous, though not necessarily ordered, we can assume an average area per
molecule at the interface, A, and calculate the electron density of the tail region as follows:

\rho_{\mathrm{tail}} = N_{e,\mathrm{tail}}\, r_0 / (A\, d_{\mathrm{tail}})    (80)

where N_{e,tail} is the number of electrons in the hydrocarbon tail and d_tail is the length of the tail in the monolayer. The advantage of this description is twofold: the number of independent parameters is reduced, and constraints on the total number of electrons can be introduced. However, in this case the simple relation ρ_head = N_{e,phosphate}/(A d_head) is not satisfactory, and in order to get a reasonable fit, additional electrons are necessary in the head-group region. These additional electrons can be associated with water molecules that interpenetrate the head-group region, which is not densely packed: the cross-section of the phosphate head group is smaller than the area occupied by the two hydrocarbon tails, allowing water molecules to penetrate the head-group region. We therefore introduce an extra parameter, N_{H2O}, the number of water molecules, with ten electrons each. The electron density of the head-group region is then given by

\rho_{\mathrm{head}} = (N_{e,\mathrm{phosphate}} + 10\, N_{\mathrm{H_2O}}) / (A\, d_{\mathrm{head}})    (81)

This approach gives physical insight into the chemical constituents at the interface. In modeling the reflectivity with the above assumptions, we can either apply volume constraints or, equivalently, examine the consistency of the model with literature values for closely packed moieties. In this case the following volume constraint can be applied:

V_{\mathrm{headgroup}} = A\, d_{\mathrm{head}} = N_{\mathrm{H_2O}} V_{\mathrm{H_2O}} + V_{\mathrm{phosphate}}    (82)

where V_{H2O} ≈ 30 Å³ is known from the density of water. The value of V_phosphate determined from the refinement should be consistent, within error, with known values extracted from crystal structures of phosphate salts (Gregory et al., 1997). Another parameter that can be deduced from the analysis is the average tilt angle, t, of the tails with respect to the surface normal. For this, the following relation is used:

d_{\mathrm{tail}} / l_{\mathrm{tail}} = \cos t    (83)

where l_tail is the full length of the extended alkyl chain evaluated from crystal data for alkanes (Gregory et al., 1997). Such a relation is valid under the condition that the electron density of the tilted tails is about the same as that of closely packed hydrocarbon chains in a crystal, ρ_tail ≈ 0.32 e/Å³ · r_0, as observed by Kjaer et al. (1989). Such a tilt of the hydrocarbon tails leads to an average increase in the molecular area relative to the cross-section of the hydrocarbon tails, A_0:

A_0 / A = \cos t    (84)

Figure 10. (A) Normalized x-ray reflectivity from a dihexadecyl phosphate (DHDP) monolayer at the air-water interface, with the best-fit electron density, N_e, shown as a solid line in the inset. The calculated reflectivity from the best model is shown as a solid line. The dashed line in the inset shows the box model with no roughness (σ = 0). (B) Diffraction from the same monolayer, showing a prominent 2-D Bragg reflection corresponding to the hexagonal ordering of individual hydrocarbon chains at Q_⊥^B = 1.516 Å⁻¹. The inset shows a rod scan from the quasi-2-D Bragg reflection at Q_⊥^B, with a calculated model for tilted chains denoted by a solid line (see text for more details).

Gregory et al. (1997) found that at lateral pressure π = 40 mN/m the average tilt angle is very close to zero (7° ± 7°), and extracted A_0 ≈ 40.7 Å², compared with a value of 39.8 Å² for closely packed crystalline hydrocarbon chains. The small discrepancy was attributed to defects at domain boundaries. The GID pattern for the same monolayer is shown in Figure 10B, where a lowest-order Bragg reflection at 1.516 Å⁻¹ is observed. This reflection corresponds to the hexagonal ordering of the individual hydrocarbon chains (Kjaer et al., 1987, 1989), with lattice constant d = 4.1144 Å and molecular area per chain A_chain = 19.83 Å². Note that in DHDP the phosphate group is anchored to a pair of hydrocarbon chains with molecular area A = 39.66 Å², and it is surprising that ordering of the head group on a larger unit cell (twice that of the hydrocarbon unit cell) is not observed, as is evident in Figure 10B. Also shown in
the inset of Figure 10B is a rod scan of the Bragg reflection. To model the rod scan in terms of tilted chains, the procedure developed in Kjaer et al. (1989) is followed. The structure factor of the chain can be expressed as

F_{\mathrm{chain}}(Q'_\perp, Q'_z) = F(Q'_\perp)\,\frac{\sin(l Q'_z/2)}{l Q'_z/2}    (85)
where F(Q′_⊥) is the in-plane Fourier transform of the cross-section of the electron density of the chain, weighted with the atomic form factors of the constituents. The second term accounts for the length of the chain and is basically the Fourier transform of a one-dimensional aperture of length l. If the chains are tilted with respect to the surface normal (in the y-z plane) by an angle t, then Q′ should be rotated as follows:

Q'_x = Q_x \cos t + Q_z \sin t
Q'_y = Q_y    (86)
Q'_z = -Q_x \sin t + Q_z \cos t

Applying this transformation to the molecular structure factor, Equation 85, and averaging over all six domains (see Kjaer et al., 1989, for more details) with the appropriate weight for each tilt direction, we find that at 40 mN/m the hydrocarbon chains are practically normal to the surface, consistent with the analysis of the reflectivity. In recent Brewster angle microscopy (BAM) and x-ray studies of C60-propylamine spread at the air-water interface (for more details on fullerene films, see Vaknin, 1996), a broad in-plane GID signal was observed (Fukuto et al., 1997). The GID signal was analyzed in terms of a 2-D radial distribution function, implying short-range positional correlations extending over only a few molecular distances. It was demonstrated that the local packing of molecules on water is hexagonal, forming a 2-D amorphous solid.

Surface Crystallization of Liquid Alkanes. Normal alkanes are linear hydrocarbon chains, (CH2)n, terminated by CH3 groups, similar to fatty acids and lipids; the latter compounds, in contrast to alkanes, possess a hydrophilic head group at one end. Recent extensive x-ray studies of pure and mixed liquid alkanes (Wu et al., 1993a,b, 1995; Ocko et al., 1997) reveal rich and remarkable properties near the melting temperature, Tf. In particular, a single crystalline monolayer is formed at the surface of the isotropic liquid bulk at up to 3°C above Tf for a range of hydrocarbon numbers n. The surface freezing phenomenon exists for a wide range of chain lengths, 16 ≤ n ≤ 50. The molecules in the ordered layer are hexagonally packed and show three distinct ordered phases: two rotator phases, one with the molecules oriented vertically (16 ≤ n ≤ 30) and the other with the molecules tilted toward nearest neighbors (30 ≤ n ≤ 44), and a third phase (44 ≤ n) in which the molecules order tilted towards next-nearest neighbors.
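The rod-scan model of Equations 85 and 86 above can be sketched numerically: a 1-D aperture factor for the chain, combined with a rotation of the momentum transfer into the frame of the tilted chains. The chain length and sampling values below are assumed, illustrative inputs; only the in-plane Bragg position is taken from the text.

```python
import math

def aperture_factor(qz, l):
    # Second factor of Eq. 85: Fourier transform of a 1-D aperture (chain) of length l
    x = 0.5 * l * qz
    return 1.0 if x == 0.0 else math.sin(x) / x

def tilt_q(qx, qy, qz, t_deg):
    # Eq. 86: rotate Q into the frame of chains tilted by angle t
    t = math.radians(t_deg)
    return (qx * math.cos(t) + qz * math.sin(t),
            qy,
            -qx * math.sin(t) + qz * math.cos(t))

# Rod-scan intensity along Qz at a fixed in-plane Bragg position:
l_chain = 20.0        # Angstrom, assumed illustrative chain length
qx_bragg = 1.516      # 1/Angstrom, in-plane Bragg position quoted in the text
rod = [aperture_factor(tilt_q(qx_bragg, 0.0, qz, 0.0)[2], l_chain) ** 2
       for qz in (0.0, 0.1, 0.2, 0.3)]
```

For untilted chains the rod-scan envelope peaks at Q_z = 0 and first vanishes at Q_z = 2π/l; a nonzero tilt shifts the maximum away from Q_z = 0, which is how the tilt angle is read off the data.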
In addition to the 2-D Bragg reflections observed in the GID studies, reflectivity curves from the same monolayers were found to be consistent with a one-box model of densely packed hydrocarbon chains, with a thickness that corresponds to slightly tilted
chains. This is an excellent demonstration of a case where no technique other than x-ray experiments carried out at a synchrotron could be applied to obtain the detailed structure of the monolayers. Neutron scattering from this system would in principle yield similar information; however, the intensities available today from reactors and spallation sources are smaller by at least a factor of 10^5 for similar resolutions, and would not allow observation of any GID signals above background levels.

Liquid Metals. Liquid metals, unlike dielectric liquids, consist of a classical ionic liquid and a quantum free-electron gas. Scattering of conduction electrons at a step-like potential (representing the metal-vacuum interface) gives rise to quantum interference effects and leads to oscillations of the electron density across the interface (Lang and Kohn, 1970). This effect is similar to the Friedel oscillations in the screened potential arising from the scattering of conduction electrons by an isolated charge in a metal. By virtue of their mobility, the ions in a liquid metal can in turn rearrange and conform to these oscillations to form layers at the interface, not necessarily commensurate with the conduction electron density (Rice et al., 1986). Such theoretical predictions of atomic layering at surfaces of liquid metals have been known for a long time, but were only recently confirmed by x-ray reflectivity studies of liquid gallium and liquid mercury (Magnussen et al., 1995; Regan et al., 1995). X-ray reflectivities of these liquids were extended to Q_z ≈ 3 Å⁻¹, showing a single peak that indicates layering with a spacing on the order of atomic diameters. The exponential decay length for layer penetration into the bulk of Ga (6.5 Å) was found to be larger than that of Hg (3 Å). Figure 11 shows a peak in the reflectivity of liquid Ga under in situ UHV oxygen-free surface cleaning (Regan et al., 1995, 1997).
The normalized reflectivity was fitted to a model scattering length density, shown in Figure 11B, of the following oscillating and exponentially decaying form (Regan et al., 1995):

\rho(z)/\rho_s = \mathrm{erf}[(z - z_0)/\sigma] + \theta(z)\, A \sin(2\pi z/d)\, e^{-z/\xi}    (87)
where θ(z) is a step function, d is the interlayer spacing, ξ is the exponential decay length, and A is an amplitude. Fits to this model are shown in Figure 11, with d = 2.56 Å and ξ = 5.8 Å. The layering phenomenon in Ga shows a strong temperature dependence. Although liquid Hg exhibits layering with a different decay length, its reflectivity at small momentum transfers Q_z is significantly different from that of liquid Ga, indicating fundamental differences in the surface structures of the two metals. The layering phenomena suggest in-plane correlations that might differ from those of the bulk, but these have not yet been observed in GID studies.
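Equation 87 is easy to evaluate numerically. In the sketch below, d and ξ are the fitted Ga values quoted in the text, while z0, σ, and the amplitude A are assumed illustrative values (not the published fit parameters); z > 0 is taken as the liquid side:

```python
import math

def sld_profile(z, z0=0.0, sigma=0.8, A=0.2, d=2.56, xi=5.8):
    """Oscillating, exponentially decaying profile of Eq. 87 (source's literal form)."""
    # theta(z): the layering oscillations exist only on the liquid side (z > 0)
    osc = A * math.sin(2.0 * math.pi * z / d) * math.exp(-z / xi) if z > 0.0 else 0.0
    return math.erf((z - z0) / sigma) + osc
```

Deep in the bulk the oscillations decay on the scale ξ and the profile approaches its bulk value, while near the surface the sine term produces the layering peaks with period d.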
SPECIMEN MODIFICATION AND PROBLEMS

In conducting experiments on liquid surfaces, the experimenter faces problems that are commonly encountered in other x-ray techniques as well. A common nuisance in
Figure 11. (A) Measured reflectivity for liquid gallium (Ga). Data marked with X were collected prior to sample cleaning, whereas the other symbols correspond to clean surfaces (for details see Regan et al., 1995). The calculated Fresnel reflectivity from a liquid Ga surface, convoluted with a surface roughness due to capillary waves (σ = 0.82 Å) and the atomic form factor for Ga, is denoted by a solid line. (B) The normalized reflectivity, with a solid line calculated from the best fit of the exponentially decaying sine model shown in the inset (courtesy of Regan et al., 1995).
an undulator beam line is the presence of higher-harmonic components that accompany the desired monochromatic beam. These higher harmonics can skew the data, as they affect the counting efficiency of the detector; they can damage the specimen or the detector; and they can increase undesirable background. The higher-harmonic components can be reduced significantly (but not totally eliminated) by inserting a mirror set below the critical angle for total reflection, or by detuning the second crystal of the double-crystal monochromator (see XAFS SPECTROSCOPY). However, there are also problems that are more specific to reflectivity and GID in general, and to liquid surfaces in particular. Such problems can arise at any stage of the study: initially during the alignment of the diffractometer, subsequently during data collection, and finally in the data analysis. Poor initial alignment of the diffractometer will eventually result in poor data. Accurate tracking, namely a small variation (5% to 15%) in the count rate of the monitor as the incident angle α_i is varied (the workable range of α_i is 0° to 10°), is key to obtaining reliable data. Although seemingly simple, one of the most important factors in getting a good alignment is the accurate determination of the relevant distances in the diffractometer. For the specific design in
Figure 7B, these include the distances from the center of the tilting monochromator to the center of the IH elevator and to the sample elevator SH, and the distances from the center of the liquid surface to the detector slit (S4) and to the elevator (DH). These distances are used to calculate the translations of the three elevators (IH, SH, and DH) for each angle. Although initial values measured directly with a ruler can be used, effective values based on the x-ray beam itself yield the best tracking of the diffractometer. Radiation damage to the specimen is a common nuisance when dealing with liquid surfaces. Many studies of liquid surfaces and monolayers involve organic or biomaterials that are susceptible to chemical transformations in general and, in particular, in the presence of an intense synchrotron beam. Radiation damage is of course not unique to monolayers on liquid surfaces; other x-ray techniques that involve organic materials (proteins, polymers, liquid crystals, and others) face similar problems. Radiation damage to a specimen proceeds in two steps. First, the specimen or a molecule in its surroundings is ionized (by the photoelectric, Auger, or Compton effects) or excited to higher energy levels (creating radicals). Subsequently, the ionized or excited product can react with a nearby site of the same molecule or with a neighboring molecule to form a new species, altering the chemistry as well as the structure at the surface. In principle, recovery without damage is also possible, but a recovery time is involved. The remedies proposed here are in part specific to liquid surfaces and cannot always be implemented, in view of the specific requirements of an experiment. To minimize the primary effects (photoelectric, Auger, and Compton scattering), one can employ one or all of the following remedies. First, the exposure time to the x-ray beam can be minimized.
For instance, the beam can be blocked while motors are still moving to their final positions. Reduced exposure can also be achieved by attenuating the flux on the sample to roughly match the expected signal, so that the full intense beam is used only for signals whose scattering cross-sections require it. That, of course, requires rough knowledge of the signal intensity, which is usually available in x-ray experiments: monolayers on liquid surfaces are inevitably disposable, and it takes a few of them to complete a study, so that at an advanced stage of a study many peak intensities are known. Another approach to reducing the primary-stage effects is to operate at high x-ray energies. It is well known that the cross-sections for all the primary effects decrease significantly with increasing x-ray energy. If the experiment does not require a specific energy, as in resonance (anomalous scattering) studies, it is advantageous to operate at high x-ray energies. However, operation at higher x-ray energies introduces technical difficulties: higher mechanical angular resolution and smaller slits are required in order to achieve reciprocal-space resolutions comparable to those at lower energies. As discussed above, slit S2 at 16 keV has to be set at a width of about 50 μm, which cannot be reliably achieved with variable slits. Finally, reducing the higher-harmonic component of the x-ray beam will also reduce the primary effects of the radiation.
Reducing the effects of secondary processes also depends on the requirements of the specific experiment. Air surrounding the sample has probably the worst effect on the integrity of an organic film at the liquid interface: the x-ray radiation readily creates potent radicals in air (such as monatomic oxygen) that are highly diffusive and penetrating and can interact with almost any site of an organic molecule. Working in a He environment or under vacuum can significantly reduce this source of radiation damage. Another approach to consider for reducing the secondary effects of radiation is scattering at low temperatures. It is well documented in protein crystallography that radiation damage is significantly reduced at liquid-nitrogen temperatures. However, such low temperatures are not an option when dealing with aqueous surfaces, and variations in temperature can also lead to dramatic structural transitions in the films, which may defeat the objectives of the study. The liquid substrate, even water, can create temporary radicals that can damage the monolayer, in particular the head-group region of lipids. Water under intense x-ray radiation can yield many reactive products, such as H2O2 or monatomic oxygen, that can readily interact with the monolayer. Thus some radiation damage, with an extent that may vary from sample to sample, is inevitable, and fresh samples are required to complete a study. Moving the sample underneath the beam footprint is a quick fix in that regard, assuming that the radiation damage is mostly localized around the illuminated area. To accomplish this, one can introduce a translation of the trough (perpendicular to the incident beam direction at α_i = 0) to probe different parts of the surface. It should be noted that for lipid monolayers, common observation suggests that radiation damage is much more severe at lower in-plane densities than in closely packed monolayers.
Another serious problem in scattering from surfaces is background radiation, which can give count rates comparable to those expected from GID or rod-scan signals. Background can be classified into two groups: room background, and background due to the sample and its immediate surroundings. Although it is very hard to locate the sources of room background, it is important to trace and block them, as they reduce the capability of the diffractometer. The specimen itself can give an unavoidable background signal due to diffuse or incoherent scattering from the liquid substrate, and this needs to be accounted for in the interpretation of the data. An important source of background is the immediate environment of the liquid surface that does not include the sample but is included in the scattering volume of the beam: worst of all is air. Working under vacuum is not an option with monolayers, and therefore such samples are kept under a He environment. Air scattering in the trough can give rise to background levels that are two or three orders of magnitude higher than the expected signal from a typical 2-D Bragg reflection in the GID. As discussed previously (see Data Analysis and Initial Interpretation), reflectivity data can give ambiguous SLD values, although that rarely happens. More often, however, the reflectivity is overinterpreted, with details in the SLD that cannot be supported by the data. The
reflectivity from aqueous surfaces can be measured at best up to a momentum transfer Q_z of about 1 Å⁻¹, which, roughly speaking, in an objective reconstruction of the SLD should give uncertainties on the order of 2π/Q_z ≈ 6 Å. It is only by virtue of complementary knowledge of the constituents of the monolayer that these uncertainties can be lowered to about 1 to 2 Å. Another potential pitfall for overinterpretation lies in the fact that in GID experiments only a few peaks are observed, and without knowledge of the 3-D packing of the constituents it is difficult to interpret the data unequivocally.
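The rule-of-thumb real-space resolution quoted above follows directly from the maximum measurable momentum transfer; a minimal sketch (the Q_z,max values are illustrative assumptions):

```python
import math

def real_space_resolution(qz_max):
    """Rule-of-thumb real-space uncertainty ~ 2*pi / Qz_max (Angstrom)."""
    return 2.0 * math.pi / qz_max

# Aqueous surface, Qz_max ~ 1 1/Angstrom -> uncertainty of roughly 6 Angstrom;
# doubling the accessible Qz range halves the real-space uncertainty.
```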
ACKNOWLEDGMENTS

The author would like to thank Prof. P. S. Pershan for providing a copy of Figure 11 for this publication. Ames Laboratory is operated by Iowa State University for the U.S. Department of Energy under Contract No. W-7405-Eng-82. The work at Ames was supported by the Director for Energy Research, Office of Basic Energy Sciences.
LITERATURE CITED

Als-Nielsen, J. and Kjaer, K. 1989. X-ray reflectivity and diffraction studies of liquid surfaces and surfactant monolayers. In Phase Transitions in Soft Condensed Matter (T. Riste and D. Sherrington, eds.). pp. 113–138. Plenum Press, New York.

Als-Nielsen, J. and Pershan, P. S. 1983. Synchrotron X-ray diffraction study of liquid surfaces. Nucl. Instrum. Methods 208:545–548.

Azzam, R. M. A. and Bashara, N. M. 1977. Ellipsometry and Polarized Light. North-Holland Publishing, New York.

Bevington, P. R. 1968. Data Reduction and Error Analysis. McGraw-Hill, New York.

Binnig, G. 1992. Force microscopy. Ultramicroscopy 42–44:7–15.

Binnig, G. and Rohrer, H. 1983. Scanning tunneling microscopy. Surf. Sci. 126:236–244.

Born, M. and Wolf, E. 1959. Principles of Optics. MacMillan, New York.

Bouwman, W. G. and Pedersen, J. S. 1996. Resolution function for two-axis specular neutron reflectivity. J. Appl. Crystallogr. 28:152–158.

Braslau, A., Pershan, P. S., Swislow, G., Ocko, B. M., and Als-Nielsen, J. 1988. Capillary waves on the surface of simple liquids measured by X-ray reflectivity. Phys. Rev. A 38:2457–2469.

Buff, F. P., Lovett, R. A., and Stillinger, Jr., F. H. 1965. Interfacial density profile for fluids at the critical region. Phys. Rev. Lett. 15:621–623.

Clinton, W. L. 1993. Phase determination in X-ray and neutron reflectivity using logarithmic dispersion relations. Phys. Rev. B 48:1–5.

Ducharme, D., Max, J.-J., Salesse, C., and Leblanc, R. M. 1990. Ellipsometric study of the physical states of phosphatidylcholines at the air-water interface. J. Phys. Chem. 94:1925–1932.

Dutta, P., Peng, J. B., Lin, B., Ketterson, J. B., Prakash, M. P., Georgopoulos, P., and Ehrlich, S. 1987. X-ray diffraction studies of organic monolayers on the surface of water. Phys. Rev. Lett. 58:2228–2231.
Evans, R. 1979. The nature of the liquid-vapor interface and other topics in the statistical mechanics of non-uniform, classical fluids. Adv. Phys. 28:143–200.
Feidenhans'l, R. 1989. Surface structure determination by X-ray diffraction. Surf. Sci. Rep. 10:105–188.
Fukuto, M., Penanen, K., Heilman, R. K., Pershan, P. S., and Vaknin, D. 1997. C60-propylamine adduct monolayers at the gas/water interface: Brewster angle microscopy and x-ray scattering study. J. Chem. Phys. 107:5531.
Gaines, G. 1966. Insoluble Monolayers at Liquid-Gas Interfaces. John Wiley & Sons, New York.
Gregory, B. W., Vaknin, D., Gray, J. D., Ocko, B. M., Stroeve, P., Cotton, T. M., and Struve, W. S. 1997. Two-dimensional pigment monolayer assemblies for light-harvesting applications: Structural characterization at the air/water interface with X-ray specular reflectivity and on solid substrates by optical absorption spectroscopy. J. Phys. Chem. B 101:2006–2019.
Henon, S. and Meunier, J. 1991. Microscope at the Brewster angle: Direct observation of first-order phase transitions in monolayers. Rev. Sci. Instrum. 62:936–939.
Hönig, D. and Möbius, D. 1991. Direct visualization of monolayers at the air-water interface by Brewster angle microscopy. J. Phys. Chem. 95:4590–4592.
Kjaer, K., Als-Nielsen, J., Helm, C. A., Laxhuber, L. A., and Möhwald, H. 1987. Ordering in lipid monolayers studied by synchrotron x-ray diffraction and fluorescence microscopy. Phys. Rev. Lett. 58:2224–2227.
Kjaer, K., Als-Nielsen, J., Helm, C. A., Tippman-Krayer, P., and Möhwald, H. 1989. Synchrotron X-ray diffraction and reflection studies of arachidic acid monolayers at the air-water interface. J. Phys. Chem. 93:3202–3206.
Kjaer, K., Als-Nielsen, J., Lahav, M., and Leiserowitz, L. 1994. Two-dimensional crystallography of amphiphilic molecules at the air-water interface. In Neutron and Synchrotron Radiation for Condensed Matter Studies, Vol. III (J. Baruchel, J.-L. Hodeau, M. S. Lehmann, J.-R. Regnard, and C. Schlenker, eds.). pp. 47–69. Springer-Verlag, Heidelberg, Germany.
Landau, L. D. and Lifshitz, E. M. 1980. Statistical Physics. p. 435. Pergamon Press, Elmsford, New York.
Lang, N. D. and Kohn, W. 1970. Theory of metal surfaces: Charge density and surface energy. Phys. Rev. B 1:4555–4568.
Lekner, J. 1987. Theory of Reflection of Electromagnetic Waves and Particles. Martinus Nijhoff, Zoetermeer, The Netherlands.
Lösche, M. and Möhwald, H. 1984. Fluorescence microscope to observe dynamical processes in monomolecular layers at the air/water interface. Rev. Sci. Instrum. 55:1968–1972.
Lösche, M., Piepenstock, M., Diederich, A., Grünewald, T., Kjaer, K., and Vaknin, D. 1993. Influence of surface chemistry on the structural organization of monomolecular protein layers adsorbed to functionalized aqueous interfaces. Biophys. J. 65:2160–2177.
Lurio, L. B., Rabedeau, T. A., Pershan, P. S., Silvera, I. S., Deutsch, M., Kosowsky, S. D., and Ocko, B. M. 1992. Liquid-vapor density profile of helium: An X-ray study. Phys. Rev. Lett. 68:2628–2631.
Magnussen, O. M., Ocko, B. M., Regan, M. J., Penanen, K., Pershan, P. S., and Deutsch, M. 1995. X-ray reflectivity measurements of surface layering in mercury. Phys. Rev. Lett. 74:4444–4447.
Möhwald, H. 1990. Phospholipid and phospholipid-protein monolayers at the air/water interface. Annu. Rev. Phys. Chem. 41:441–476.
Ocko, B. M., Wu, X. Z., Sirota, E. B., Sinha, S. K., and Deutsch, M. 1994. X-ray reflectivity study of thermal capillary waves on liquid surfaces. Phys. Rev. Lett. 72:242–245.
Ocko, B. M., Wu, X. Z., Sirota, E. B., Sinha, S. K., Gang, O., and Deutsch, M. 1997. Surface freezing in chain molecules: I. Normal alkanes. Phys. Rev. E 55:3166–3181.
Panofsky, W. K. H. and Phillips, M. 1962. Classical Electricity and Magnetism. Addison-Wesley, Reading, Mass.
Parratt, L. G. 1954. Surface studies of solids by total reflection of X-rays. Phys. Rev. 95:359–369.
Pedersen, J. S. 1992. Model-independent determination of the surface scattering-length-density profile from specular reflectivity data. J. Appl. Crystallogr. 25:129–145.
Penfold, J. and Thomas, R. K. 1990. The application of the specular reflection of neutrons to the study of surfaces and interfaces. J. Phys. Condens. Matter 2:1369–1412.
Pershan, P. S., Braslau, A., Weiss, A. H., and Als-Nielsen, J. 1987. Free surface of liquid crystals in the nematic phase. Phys. Rev. A 35:4800–4813.
Regan, M. J., Kawamoto, E. H., Lee, S., Pershan, P. S., Maskil, N., Deutsch, M., Magnussen, O. M., Ocko, B. M., and Berman, L. E. 1995. Surface layering in liquid gallium: An X-ray reflectivity study. Phys. Rev. Lett. 75:2498–2501.
Regan, M. J., Pershan, P. S., Magnussen, O. M., Ocko, B. M., Deutsch, M., and Berman, L. E. 1997. X-ray reflectivity studies of liquid metal and alloy surfaces. Phys. Rev. B 55:15874–15884.
Rice, S. A., Gryko, J., and Mohanty, U. 1986. Structure and properties of the liquid-vapor interface of a simple metal. In Fluid Interfacial Phenomena (C. A. Croxton, ed.). pp. 255–342. John Wiley & Sons, New York.
Rodberg, L. S. and Thaler, R. M. 1967. Introduction to the Quantum Theory of Scattering. Academic Press, New York.
Russell, T. P. 1990. X-ray and neutron reflectivity for the investigation of polymers. Mater. Sci. Rep. 5:171–271.
Sacks, P. 1993. Reconstruction of step-like potentials. Wave Motion 18:21–30.
Sanyal, M. K., Sinha, S. K., Huang, K. G., and Ocko, B. M. 1991. X-ray scattering study of capillary-wave fluctuations at a liquid surface. Phys. Rev. Lett. 66:628–631.
Sanyal, M. K., Sinha, S. K., Gibaud, A., Huang, K. G., Carvalho, B. L., Rafailovich, M., Sokolov, J., Zhao, X., and Zhao, W. 1992. Fourier reconstruction of the density profiles of thin films using anomalous X-ray reflectivity. Europhys. Lett. 21:691–695.
Schiff, L. I. 1968. Quantum Mechanics. McGraw-Hill, New York.
Schwartz, D. K., Schlossman, M. L., and Pershan, P. S. 1992. Re-entrant appearance of the phases in a relaxed Langmuir monolayer of tetracosanoic acid as determined by X-ray scattering. J. Chem. Phys. 96:2356–2370.
Sinha, S. K., Sirota, E. B., Garoff, S., and Stanley, H. B. 1988. X-ray and neutron scattering from rough surfaces. Phys. Rev. B 38:2297–2311.
Swalen, J. D., Allara, D. L., Andrade, J. D., Chandross, E. A., Garoff, S., Israelachvili, J., Murray, R., Pease, R. F., Wynne, K. J., and Yu, H. 1987. Molecular monolayers and films. Langmuir 3:932–950.
Vaknin, D., Als-Nielsen, J., Piepenstock, M., and Lösche, M. 1991a. Recognition processes at a functionalized lipid surface observed with molecular resolution. Biophys. J. 60:1545–1552.
Vaknin, D., Kjaer, K., Als-Nielsen, J., and Lösche, M. 1991b. A new liquid surface neutron reflectometer and its application to the study of DPPC in a monolayer at the air/water interface. Makromol. Chem. Macromol. Symp. 46:383–388.
Vaknin, D., Kjaer, K., Als-Nielsen, J., and Lösche, M. 1991c. Structural properties of phosphatidylcholine in a monolayer at the air/water interface: Neutron reflectivity study and reexamination of X-ray reflection experiments. Biophys. J. 59:1325–1332.
Vaknin, D., Kjaer, K., Ringsdorf, H., Blankenburg, R., Piepenstock, M., Diederich, A., and Lösche, M. 1993. X-ray and neutron reflectivity studies of a protein monolayer adsorbed to a functionalized aqueous surface. Langmuir 9:1171–1174.
Vaknin, D. 1996. C60-amine adducts at the air-water interface: A new class of Langmuir monolayers. Phys. B 221:152–158.
Vineyard, G. 1982. Grazing-incidence diffraction and the distorted-wave approximation for the study of surfaces. Phys. Rev. B 26:4146–4159.
Wilson, A. J. C. (ed.). 1992. International Tables for Crystallography, Vol. C. Kluwer Academic Publishers, Boston.
Wu, X. Z., Ocko, B. M., Sirota, E. B., Sinha, S. K., Deutsch, M., Cao, B. H., and Kim, M. W. 1993a. Surface tension measurements of surface freezing in liquid normal-alkanes. Science 261:1018–1021.
Wu, X. Z., Sirota, E. B., Sinha, S. K., Ocko, B. M., and Deutsch, M. 1993b. Surface crystallization of liquid normal-alkanes. Phys. Rev. Lett. 70:958–961.
Wu, X. Z., Ocko, B. M., Tang, H., Sirota, E. B., Sinha, S. K., and Deutsch, M. 1995. Surface freezing in binary mixtures of alkanes: New phases and phase transitions. Phys. Rev. Lett. 75:1332–1335.
Zhou, X. L. and Chen, S. H. 1995. Theoretical foundation of X-ray and neutron reflectometry. Phys. Rep. 257:223–348.

KEY REFERENCES

Als-Nielsen and Kjaer, 1989. See above.

An excellent, self-contained, and intuitive review of x-ray reflectivity and GID from liquid surfaces and Langmuir monolayers by pioneers in the field of liquid surfaces.

Braslau et al., 1988. See above.

The first comprehensive review examining the properties of the liquid-vapor interface of simple liquids (water, carbon tetrachloride, and methanol) employing the reflectivity technique. The paper provides many rigorous derivations, such as the Born approximation, the height-height correlation function of the surface, and surface roughness due to capillary waves. Discussion of practical aspects regarding the resolution function of the diffractometer and the convolution of the resolution with the reflectivity signal is also provided.

Sinha et al., 1988. See above.

This seminal paper deals with the diffuse scattering from a variety of rough surfaces. It also provides a detailed account of the diffuse scattering in terms of the distorted-wave Born approximation (DWBA). It is relevant to liquid as well as to solid surfaces.

APPENDIX: P-POLARIZED X-RAY BEAM

A p-polarized x-ray beam has a magnetic field component that is parallel to the stratified medium (along the x axis; see Fig. 1), and straightforward derivation of the wave equation (Equation 7) yields

$$\frac{d}{dz}\left(\frac{1}{\epsilon}\,\frac{dB}{dz}\right)+\left[k_z^2-V(z)\right]B=0 \qquad (88)$$

By introducing a dilation variable Z such that $dZ=\epsilon\,dz$, Equation 88 for B can be transformed to a form similar to Equation 12:

$$\frac{d^2B}{dZ^2}+\frac{k_z^2-V(z)}{\epsilon}\,B=0 \qquad (89)$$

The solution of Equation 89 for an ideally flat interface in terms of $r^p(k_{z,s})$ and $t^p(k_{z,s})$ is then simply given by

$$r^p(k_{z,s})=\frac{k_{z,0}-k_{z,s}/\epsilon}{k_{z,0}+k_{z,s}/\epsilon};\qquad t^p(k_{z,s})=\frac{2k_{z,0}}{k_{z,0}+k_{z,s}/\epsilon} \qquad (90)$$

The critical momentum transfer for total external reflectivity of the p-type x-ray beam is $Q_c=2k_c=2\sqrt{4\pi\rho_s}$, identical to the one derived for the s-type wave. Also, for $Q_c\ll Q_z\ll 2k_z^B$ ($k_z^B$ is defined below), $R_F(Q_z)$ can be approximated as

$$R_F(Q_z)\approx\left(\frac{Q_c}{2Q_z}\right)^4\left(\frac{2}{1+\epsilon}\right)^2 \qquad (91)$$

The second factor on the right-hand side is practically equal to 1 for all liquids, and thus the Born approximation is basically the same as for the s-polarized x-ray beam (Equation 17). The main difference between the s-type and p-type waves occurs at larger angles, near the Brewster angle, which is given by $\theta_B=\sin^{-1}(k_z^B/k_0)$. At this angle, total transmission of the p-type wave occurs ($r^p(k_{z,s})=0$). Using Equations 14 and 90, $k_z^B$ can be derived:

$$\frac{k_z^B}{k_0}=\frac{1}{\sqrt{2-4\pi\rho_s/k_0^2}} \qquad (92)$$

The Brewster angle for x-rays is then given by $\theta_B=\sin^{-1}(k_z^B/k_0)\approx\pi/4$. This derivation is valid for solid surfaces, including crystals, where the total transmission effect of the p-polarized wave at a Bragg reflection is used to produce polarized and monochromatic x-ray beams.

DAVID VAKNIN
Iowa State University
Ames, Iowa
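As a numerical illustration of Equations 90 and 92, the sketch below evaluates the Brewster condition and the p-polarized Fresnel coefficient for an air/water interface. The Cu Kα wavelength and the scattering-length density of water (ρ_s ≈ 9.4 × 10⁻⁶ Å⁻²) are illustrative assumptions, not values taken from this unit.

```python
import math

# Illustrative parameters (assumed): Cu K-alpha radiation on water
wavelength = 1.5418                    # Angstroms
k0 = 2 * math.pi / wavelength          # vacuum wavenumber, 1/Angstrom
rho_s = 9.4e-6                         # scattering-length density of water, 1/Angstrom^2
kc2 = 4 * math.pi * rho_s              # k_c^2; note Q_c = 2*sqrt(kc2) ~ 0.022 1/Angstrom
eps = 1 - kc2 / k0**2                  # dielectric constant of the subphase

# Equation 92: Brewster condition for the p-polarized wave
kzB = k0 / math.sqrt(2 - kc2 / k0**2)
theta_B = math.degrees(math.asin(kzB / k0))   # essentially 45 degrees for x rays

# Equation 90: r_p vanishes when kz0 = kzB (total transmission)
kz0 = kzB
kzs = math.sqrt(kz0**2 - kc2)          # normal wavevector component in the medium
rp = (kz0 - kzs / eps) / (kz0 + kzs / eps)
print(f"theta_B = {theta_B:.4f} deg, r_p(kzB) = {rp:.2e}")
```

Because the x-ray refractive index is so close to 1, the Brewster angle sits just above 45°, far from the grazing angles used in reflectivity measurements, which is why the s- and p-type results coincide in practice.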
ELECTRON TECHNIQUES

INTRODUCTION
This part describes how electrons are used to probe the microstructures of materials. These methods are arguably the most powerful and flexible set of tools available for materials characterization. For the characterization of structures internal to materials, electron beam methods provide capabilities for determining crystal structure, crystal shapes and orientations, defects within the crystals, and the distribution of atoms within these individual crystals. For characterizing surfaces, electron methods can determine structure and chemistry at the level of the atomic monolayer.

The methods in this part use electrons with kinetic energies spanning the range from 1 to 1,000,000 eV. The low energy range is the domain of scanning tunneling microscopy (STM). Since STM originates with quantum mechanical tunneling of electrons through potential barriers, this method differs from the others in this part, which employ electron beams incident on the material. Progressively higher energies are used for the electron beams of low-energy electron diffraction (LEED), Auger spectrometry, reflection high-energy electron diffraction (RHEED), scanning electron microscopy (SEM), and transmission electron microscopy (TEM). Electron penetration through the solid increases strongly over this broad energy range. STM, Auger, and LEED techniques are used for probing the monolayer of surface atoms (although variants of STM can probe sub-surface structures). On the other hand, the penetration capability of high-energy electrons makes TEM primarily a bulk technique. Nevertheless, even with TEM it is often unrealistic to study samples having thicknesses much greater than a fraction of a micron.

Since electron beams can be focused with high accuracy, the incident beam can take various forms. A tightly focused beam, rastered across the sample, is one useful form. The acquisition of various signals in synchronization with this raster pattern provides a spatial map of the signal. Signals can be locations of characteristic x-ray emission, secondary-electron emission, or electron energy losses, to name but three. A plane wave is another useful form of the incident electron beam. Plane-wave illumination is typically used for image formation with conventional microscopy and diffraction. It turns out that for complementary optical designs, the same diffraction effects in the specimen that produce visible features in images will occur with either a point- or plane-illumination mode. There are, however, advantages to one mode of operation versus another, and instruments of the scanning type and imaging type are typically used for different types of measurements. Atomic resolution imaging, for example, can be performed with a plane-wave illumination method in high-resolution electron microscopy (HREM), or with a probe mode in Z-contrast imaging. A fundamental difference between these techniques is that the diffracted waves interfere coherently in the case of HREM imaging, and the phase information of the scattering is used to make interference patterns in the image. On the other hand, the operation of the scanning transmission electron microscope for Z-contrast imaging makes use of incoherent imaging, where the intensities of the scatterings from the individual atoms add independently to the image.

Electron scattering from a material can be either elastic or inelastic. In general, elastic scattering is used for imaging measurements, whereas inelastic scattering provides for spectroscopy. (Electron beam techniques are enormously versatile, however, so there are many exceptions to this generality.) Chemical mapping, for example, makes images by use of inelastic scattering. For chemical mapping, the energy loss of the electron itself may be used to obtain the chemical information, as in electron energy loss spectrometry (EELS). Alternatively, the subsequent radiations from atoms ionized by the inelastic scattering may serve as signals containing the chemical information. A component of the characteristic atomic x-ray spectrum, measured with an energy-dispersive x-ray spectrometer (EDS), is a particularly useful signal for making chemical maps with a scanning electron microscope. The intensity of selected Auger electrons emitted from the excited atoms provides the chemical mapping capability of scanning Auger microscopy. Auger maps are highly surface-sensitive, since the Auger electrons lose energy as they traverse only a short distance through the solid.

The chapters in this part enumerate some requirements for sample preparation, but it is remarkable that any one of these techniques permits studies on a large variety of samples. All classes of materials (metals, ceramics, polymers, and composites) have been studied by all of these electron techniques. Except for some problems with the lightest (or heaviest) elements in the periodic table, the electron beam techniques have few compositional limitations. There are sometimes problems with how the sample may contaminate the vacuum system that is integral to all electron beam methods. This is especially true for the surface techniques that employ ultrahigh vacuums.

When approaching a new problem in materials characterization, electron beam methods (especially SEM) provide some of the best reward/effort ratios of any materials characterization technique. Many, if not all, of the techniques in this part should therefore be familiar to everyone interested in materials characterization. Some of the methods are available through commercial laboratories on a routine basis, such as SEM, EDS, and Auger spectroscopy. TEM services can sometimes be found at companies offering asbestos analysis. Manufacturers of electron beam instruments may also provide analysis services at reasonable rates. All the electron methods of this part are available at materials research laboratories at universities and national laboratories. Preliminary assessments of the applicability of an electron method to a materials problem can often be made by contacting someone at a local institution. These institutions usually have established policies for fees and service for outside users. For value in capital investment, perhaps a scanning electron microscope with an energy-dispersive spectrometer is a best buy. The surface techniques are more specialized, and the expense and maintenance of their ultrahigh vacuum systems can be formidable. TEM is also a technique that cannot be approached casually, and some contact with an expert in the field is usually the best way to begin.

BRENT FULTZ
SCANNING ELECTRON MICROSCOPY

INTRODUCTION

The scanning electron microscope (SEM) is one of the most widely used instruments in materials research laboratories and is common in various forms in fabrication plants. Scanning electron microscopy is central to microstructural analysis and therefore important to any investigation relating to the processing, properties, and behavior of materials that involves their microstructure. The SEM provides information relating to topographical features, morphology, phase distribution, compositional differences, crystal structure, crystal orientation, and the presence and location of electrical defects. The SEM is also capable of determining elemental composition of micro-volumes with the addition of an x-ray or electron spectrometer (see ENERGY-DISPERSIVE SPECTROMETRY and AUGER ELECTRON SPECTROSCOPY) and phase identification through analysis of electron diffraction patterns (see LOW-ENERGY ELECTRON DIFFRACTION).

The strength of the SEM lies in its inherent versatility due to the multiple signals generated, simple image formation process, wide magnification range, and excellent depth of field. Lenses in the SEM are not a part of the image formation system but are used to demagnify and focus the electron beam onto the sample surface. This gives rise to two of the major benefits of the SEM: range of magnification and depth of field in the image. Depth of field is that property of SEM images whereby surfaces at different distances from the lens appear in focus, giving the image three-dimensional information. The SEM has more than 300 times the depth of field of the light microscope. Another important advantage of the SEM over the optical microscope is its high resolution. Resolution of 1 nm is now achievable from an SEM with a field emission (FE) electron gun. Magnification is a function of the scanning system rather than the lenses, and therefore a surface in focus can be imaged at a wide range of magnifications, from 3× up to 150,000×.

The higher magnifications of the SEM are rivaled only by the transmission electron microscope (TEM) (see TRANSMISSION ELECTRON MICROSCOPY and SCANNING TRANSMISSION ELECTRON MICROSCOPY: Z-CONTRAST IMAGING), which requires the electrons to penetrate through the entire thickness of the sample. As a consequence, TEM sample preparation of bulk materials is tedious and time consuming compared to the ease of SEM sample preparation, and may damage the microstructure. The information content of the SEM and TEM images is different, with the TEM image showing the internal structure of the material. Due to these unique features, SEM images frequently appear not only in the scientific literature but also in the daily newspapers and popular magazines. The SEM is relatively easy to operate and affordable and allows for multiple operation modes, corresponding to the collection of different signals. The following sections review the SEM instrumentation and principles, its capabilities and applications, and recent trends and developments.

PRINCIPLES OF THE METHOD

Signal Generation
The SEM electron beam is a focused probe of electrons accelerated to moderately high energy and positioned onto the sample by electromagnetic fields. The SEM optical column is utilized to ensure that the incoming electrons are of similar energy and trajectory. These beam electrons interact with atoms in the specimen by a variety of mechanisms when they impinge on a point on the surface of the specimen. For inelastic interactions, energy is transferred to the sample from the beam, while elastic interactions are defined by a change in trajectory of the beam electrons without loss of energy. Since electrons normally undergo multiple interactions, the inelastic and elastic interactions result in the beam electrons spreading out into the material (changing trajectory from the original focused probe) and losing energy. This simultaneous energy loss and change in trajectory produces an interaction volume within the bulk (Fig. 1). The size of this interaction volume can be estimated by Monte Carlo simulations, which incorporate probabilities of the multiple possible elastic and inelastic interactions into a calculation of electron trajectories within the specimen. The signals resulting from these interactions (e.g., electrons and photons) will each have different depths within the sample from which they can escape due to their unique
Figure 1. A Monte Carlo simulation of electron beam interaction with a bulk copper target at 5 keV shows the interaction volume within the specimen. The electron trajectories are shown, and the volume is symmetric about the beam due to the normal incidence angle (derived from Goldstein et al., 1992; Monte Carlo simulation using Electron Flight Simulator, Small World, D. Chernoff).
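The depth of the interaction volume simulated in Figure 1 can also be estimated in closed form; a common choice is the Kanaya-Okayama electron range. The formula and its constants are standard in the SEM literature, but applying it to the copper target of the figure is an illustrative sketch, not part of the original simulation.

```python
def kanaya_okayama_range_um(a, z, rho, e_kev):
    """Kanaya-Okayama electron range in micrometers, for atomic weight a
    (g/mol), atomic number z, density rho (g/cm^3), and beam energy in keV."""
    return 0.0276 * a * e_kev**1.67 / (z**0.889 * rho)

# Copper at 5 keV, as in Figure 1: the estimated range is roughly 0.15 um,
# consistent in scale with the simulated interaction volume.
r_cu = kanaya_okayama_range_um(a=63.55, z=29, rho=8.96, e_kev=5.0)
print(f"Kanaya-Okayama range for Cu at 5 keV: {r_cu:.3f} um")
```

The strong (E^1.67) energy dependence is why lowering the accelerating voltage is an effective way to confine the interaction volume near the surface.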
Figure 2. Distribution of relative energies (E/E0; E0 = incident beam energy) of electrons ejected from a surface by an incident electron beam (not to scale). The peak (1) is the elastic peak, BSEs that have lost no energy in the specimen. Slightly lower in energy is the plasmon loss region (2). The BSEs (3) are spread over the entire energy range from 0 eV to the energy of the incident beam. The characteristic Auger electron peak (4) is usually a small peak superimposed on the backscattered curve. Secondary electrons emitted from specimen atoms are responsible for the large peak at low energy (5).
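The dependence of the backscattered-electron yield on atomic number, which underlies BSE compositional contrast, can be put in numbers with a widely used empirical fit to measured backscatter coefficients (due to Reuter); the fit is standard in the SEM literature but is not derived in this unit.

```python
def backscatter_coefficient(z):
    """Reuter's empirical fit for the BSE coefficient eta versus atomic
    number Z (normal beam incidence, typical SEM beam energies)."""
    return -0.0254 + 0.016 * z - 1.86e-4 * z**2 + 8.3e-7 * z**3

# eta rises monotonically with Z, so higher-Z phases appear brighter
# in a BSE compositional image:
for name, z in [("C", 6), ("Al", 13), ("Cu", 29), ("Au", 79)]:
    print(f"{name:>2} (Z={z:2d}): eta = {backscatter_coefficient(z):.2f}")
```

For copper the fit gives roughly eta = 0.3, i.e., nearly a third of the incident electrons leave the specimen as BSEs.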
physical properties and energies. For example, a secondary electron (SE) is a low-energy (2- to 5-eV) electron ejected from the outer shell of a sample atom after an inelastic interaction. These low-energy electrons can escape the surface only if generated near the surface. Thus we have an "interaction volume" in which beam electrons interact with the specimen and a "sampling volume" from which a given signal escapes the solid, which is some fraction of the interaction volume. It is this sampling volume and the signal distribution within it that determine the spatial resolution of the technique (Joy, 1984). We can thus expect different spatial resolutions for the various types of signals generated in the SEM. Backscattered electrons (BSEs) are electrons from the incident probe that undergo elastic interactions with the sample, change trajectory, and escape the sample. These make up the majority of electrons emitted from the specimen at high beam voltage, and their average energy is much higher than that of the SEs (Fig. 2). The depth from which BSEs escape the specimen is dependent upon the beam energy and the specimen composition, but >90% generally escape from less than one-fifth of the beam penetration depth. The intensity of the BSE signal is a function of the average atomic number (Z) of the specimen, with heavier elements (higher-Z samples) producing more BSEs. It is thus a useful signal for generating compositional images, in which higher-Z phases appear brighter than lower-Z phases. The BSE intensity and trajectory are also dependent upon the angle of incidence between the beam and the specimen surface. The topography, or physical features of the surface, is then imaged by using these properties of the BSE signal to generate BSE topographic images. Due to the relatively high energy of the
BSE signal, the sampling volume (sample depth) is greater than that of SEs. Secondary electrons are due to inelastic interactions, are low energy (typically 2 to 5 eV), and are influenced more by surface properties than by atomic number. The SE is emitted from an outer shell of a specimen atom upon impact of the incident electron beam. The term "secondary" thus refers to the fact that this signal is not a scattered portion of the probe, but a signal generated within the specimen due to the transfer of energy from the beam to the specimen. In practice, SEs are arbitrarily defined as those electrons emitted with energies below 50 eV.

Beam energies >200 kV can damage light metals such as aluminum, and the effect on heavier elements continually increases as the beam energy increases above this value. Knock-on damage usually manifests itself by the formation of small vacancy or interstitial clusters and dislocation loops in the specimen, which can be observed by diffraction contrast. The main way to avoid displacement damage is to use lower accelerating voltages. Knock-on damage can also occur in polymers and minerals, where there is generally a trade-off
between knock-on damage and radiolysis. Since knock-on damage creates vacancies and interstitials, it can be used to study the effects of irradiation in situ. Another important aspect of specimen modification in TEM is the ability to perform in situ studies to directly observe the behavior of materials under various conditions (Butler and Hale, 1981). This potentially useful effect for electron irradiation studies was mentioned above. In addition, specimen holders designed to heat, cool, and strain (and combinations of these) a TEM sample in the microscope are available commercially from several manufacturers. It is also possible to introduce gases into the transmission electron microscope to observe chemical reactions in situ by using apertures and differential pumping in the microscope. Once they have equilibrated, most commercial TEM holders are sufficiently stable that it is possible to perform EDS and EELS on foils to obtain compositional information in situ (Howe et al., 1998). To reproduce the behavior of bulk material in TEM, it is often desirable to perform in situ experiments in high-voltage TEM (e.g., 1 MeV) so that foils on the order of 1 μm thick can be used. Using thicker foils can be particularly important for studies such as straining experiments. As with most techniques, one needs to be careful when interpreting in situ data, and it is often advisable to compare the results of in situ experiments with parallel experiments performed in bulk material. For example, if we want to examine the growth behavior of precipitates during in situ heating in TEM, we might compare the size of the precipitates as a function of time and temperature obtained in situ in the transmission electron microscope with that obtained by bulk aging experiments. The temperature calibration of most commercial holders seems to be reliable to within 20°C, and calibration specimens need to be used to obtain greater temperature accuracy.
One problem that can arise during heating experiments, for example, is that solute may preferentially segregate to the surface. Oxidation of some samples during heating can also be a problem. In spite of all these potential difficulties, in situ TEM is a powerful technique for observing the mechanisms and kinetics of reactions in solids and the effects of electron irradiation on these phenomena.
PROBLEMS

A number of limitations should be considered in the TEM analysis of materials, including but not limited to (1) the sampling volume, (2) image interpretation, (3) radiation damage, (4) specimen preparation, and (5) microscope calibration. This section briefly discusses some of these factors. More thorough discussion of these topics can be found in Edington (1974), Williams and Carter (1996), and other textbooks on TEM (see Literature Cited). It is important to remember that in TEM only a small volume of material is observed at high magnification. If possible, it is important to examine the same material at lower levels of resolution to ensure that the microstructure observed in TEM is representative of the overall specimen. It is often useful to prepare the TEM specimen by more than one method to ensure that the microstructure was
not altered during sample preparation. As discussed above (see Specimen Modification), many materials damage under a high-energy electron beam, and it is important to look for signs of radiation damage in the image and diffraction pattern. In addition, as shown above (see Data Analysis and Initial Interpretation), image contrast in TEM varies sensitively with diffracting conditions, i.e., the exact value of the deviation parameter s. Therefore, one must carefully control and record the diffracting conditions during imaging in order to quantify defect contrast. It is important to remember that permanent calibrations of TEM features such as image magnification and camera length, which are typically performed during installation, are only accurate to within 5%. If one desires higher accuracy, then it is advisable to perform in situ calibrations using standards. A variety of specimens are available for magnification calibration. The most commonly used are diffraction grating replicas for low and intermediate magnifications (100× to 200,000×) and direct lattice images of crystals for high magnifications (>200,000×). Similarly, knowledge of the camera constant in Equation 35 greatly simplifies indexing of diffraction patterns and is essential for the identification of unknown phases. The accuracy of calibration depends on the experiment: an error of 1% can be achieved relatively easily, and this can be improved to 0.1% if care is used in operation of the microscope and measurement of the diffraction patterns. An example of this calibration was given above (see Data Analysis and Initial Interpretation), and it is common to evaporate a metal such as Au directly onto a sample or to use a thin film as a standard. Permanent calibrations are normally performed at standard lens currents that must subsequently be reproduced during operation of the microscope for the calibration to be valid.
This is accomplished by setting the lens currents and using the height adjustment (z control) on the goniometer to focus the specimen. Other geometric factors that can introduce errors into the magnification of diffraction patterns are discussed by Edington (1974). There are also a number of other calibrations that are useful in TEM, including calibration of (1) the accelerating voltage, (2) specimen drift rate, (3) specimen contamination rate, (4) sense of specimen tilt, (5) focal increments of the objective lens, and (6) beam current, but these are not detailed here.
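Camera-constant indexing of a ring pattern from an evaporated Au standard can be sketched as follows. The camera constant (λL ≈ 25.1 mm·Å, e.g., 200-kV electrons with L = 1000 mm) and the measured ring radius are illustrative assumptions; the relation Rd = λL is the camera equation referred to as Equation 35 in this unit.

```python
import math

A_AU = 4.078  # lattice parameter of the Au calibration standard, Angstroms

def d_spacing(h, k, l, a=A_AU):
    """Interplanar spacing of a cubic crystal for the (hkl) plane."""
    return a / math.sqrt(h * h + k * k + l * l)

camera_constant = 25.1  # lambda*L in mm*Angstrom (assumed: 0.0251 A x 1000 mm)

# Predicted ring radii for the first allowed fcc reflections of Au,
# from the camera equation R = lambda*L / d:
for hkl in [(1, 1, 1), (2, 0, 0), (2, 2, 0), (3, 1, 1)]:
    r_mm = camera_constant / d_spacing(*hkl)
    print(hkl, f"R = {r_mm:.2f} mm")

# Conversely, a measured ring radius yields an unknown d-spacing;
# a ring at 10.66 mm gives d ~ 2.35 A, which indexes as Au {111}:
d_meas = camera_constant / 10.66
```

Comparing measured radii against such predictions for a standard is how the camera constant is refined to the ~0.1% level mentioned above.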
ACKNOWLEDGMENTS The authors are grateful for support of this work by the National Science Foundation, JMH under Grant DMR-9630092 and BTF under Grant DMR-9415331.
LITERATURE CITED Basile, D. P., Boylan, R., Hayes, K., and Soza, D. 1992. FIBXTEM—Focussed ion beam milling for TEM sample preparation. In Materials Research Society Symposium Proceedings, Vol. 254 (R. Anderson, B. Tracy, and J. Bravman, eds.). pp. 23–41. Materials Research Society, Pittsburgh.
Borchardt-Ott, W. 1995. Crystallography, 2nd ed. Springer-Verlag, New York. Butler, E. P. and Hale, K. F. 1981. Dynamic Experiments in the Electron Microscope, Vol. 9. In Practical Methods in Electron Microscopy (A. M. Glauert, ed.). North-Holland, New York. Chang, Y.-C. 1992. Crystal structure and nucleation behavior of (111) precipitates in an Al-3.9Cu-0.5Mg-0.5Ag alloy. Ph.D. thesis, Carnegie Mellon University, Pittsburgh. Chescoe, D. and Goodhew, P. J. 1984. The Operation of the Transmission Electron Microscope. Oxford University Press, Oxford. Cockayne, D. J. H., Ray, I. L. F., and Whelan, M. J. 1969. Investigation of dislocation strain fields using weak beams. Philos. Mag. 20:1265–1270. Edington, J. W. 1974. Practical Electron Microscopy in Materials Science, Vols. 1–4. Macmillan Philips Technical Library, Eindhoven. Egerton, R. F. 1996. Electron Energy-Loss Spectroscopy in the Electron Microscope, 2nd ed. Plenum Press, New York. Fultz, B. and Howe, J. M. 2000. Transmission Electron Microscopy and Diffractometry of Materials. Springer-Verlag, Berlin. Goodhew, P. J. 1984. Specimen Preparation for Transmission Electron Microscopy of Materials. Oxford University Press, Oxford. Head, A. K., Humble, P., Clarebrough, L. M., Morton, A. J., and Forwood, C. T. 1973. Computed Electron Micrographs and Defect Identification. North-Holland, Amsterdam, The Netherlands. Hirsch, P. B., Howie, A., Nicholson, R. B., Pashley, D. W., and Whelan, M. J. 1977. Electron Microscopy of Thin Crystals, 2nd ed. Krieger, Malabar. Hobbs, L. W. 1979. Radiation effects in analysis of inorganic specimens by TEM. In Introduction to Analytical Electron Microscopy (J. J. Hren, J. I. Goldstein, and D. C. Joy, eds.) pp. 437–480. Plenum Press, New York. Howe, J. M., Murray, T. M., Csontos, A. A., Tsai, M. M., Garg, A., and Benson, W. E. 1998. Understanding interphase boundary dynamics by in situ high-resolution and energy-filtering transmission electron microscopy and real-time image simulation.
Microsc. Microanal. 4:235–247. Hull, D. and Bacon, D. J. 1984. Introduction to Dislocations, 3rd ed. (see pp. 17–21). Pergamon Press, Oxford. Keyse, R. J., Garratt-Reed, A. J., Goodhew, P. J., and Lorimer, G. W. 1998. Introduction to Scanning Transmission Electron Microscopy. Springer-Verlag, New York. Kikuchi, S. 1928. Diffraction of cathode rays by mica. Jpn. J. Phys. 5:83–96. Klepeis, S. J., Benedict, J. P., and Anderson, R. M. 1988. A grinding/polishing tool for TEM sample preparation. In Specimen Preparation for Transmission Electron Microscopy of Materials (J. C. Bravman, R. M. Anderson, and M. L. McDonald, eds.). pp. 179–184. Materials Research Society, Pittsburgh. Krivanek, O. L., Ahn, C. C., and Keeney, R. B. 1987. Parallel detection electron spectrometer using quadrupole lenses. Ultramicroscopy 22:103–116. McKie, D. and McKie, C. 1986. Essentials of Crystallography, p. 208. Blackwell Scientific Publications, Oxford. Miller, M. K. and Smith, G. D. W. 1989. Atom Probe Microanalysis: Principles and Applications to Materials Problems. Materials Research Society, Pittsburgh. Reimer, L. 1993. Transmission Electron Microscopy: Physics of Image Formation and Microanalysis, 3rd ed. Springer-Verlag, New York.
ELECTRON TECHNIQUES
Reimer, L. (ed.). 1995. Energy-Filtering Transmission Electron Microscopy. Springer-Verlag, Berlin. Rioja, R. J. and Laughlin, D. E. 1977. The early stages of GP zone formation in naturally aged Al-4 wt pct Cu alloys. Metall. Trans. 8A:1257–1261. Sawyer, L. C. and Grubb, D. T. 1987. Polymer Microscopy. Chapman and Hall, London. Schwartz, L. H. and Cohen, J. B. 1987. Diffraction from Materials, 2nd ed. Springer-Verlag, New York. Smith, F. G. and Thomson, J. H. 1988. Optics, 2nd ed. John Wiley & Sons, Chichester. Susnitzky, D. W. and Johnson, K. D. 1998. Focused ion beam (FIB) milling damage formed during TEM sample preparation of silicon. In Microscopy and Microanalysis 1998 (G. W. Bailey, K. B. Alexander, W. G. Jerome, M. G. Bond, and J. J. McCarthy, eds.) pp. 656–667. Springer-Verlag, New York.
Thomas, G. and Goringe, M. J. 1979. Transmission Electron Microscopy of Metals. John Wiley & Sons, New York.
Voelkl, E., Alexander, K. B., Mabon, J. C., O'Keefe, M. A., Postek, M. J., Wright, M. C., and Zaluzec, N. J. 1998. The DOE2000 Materials MicroCharacterization Collaboratory. In Electron Microscopy 1998, Proceedings of the 14th International Congress on Electron Microscopy (H. A. Calderon Benavides and M. Jose Yacaman, eds.) pp. 289–299. Institute of Physics Publishing, Bristol, U.K.
Weatherly, G. C. and Nicholson, R. B. 1968. An electron microscope investigation of the interfacial structure of semi-coherent precipitates. Philos. Mag. 17:801–831.
Williams, D. B. and Carter, C. B. 1996. Transmission Electron Microscopy: A Textbook for Materials Science. Plenum Press, New York.

KEY REFERENCES
Edington, 1974. See above. Reprinted edition available from Techbooks, Fairfax, VA. Filled with examples of diffraction and imaging analyses.
Fultz and Howe, 2000. See above. An integrated treatment of microscopy and diffraction, with emphasis on principles.
Hirsch et al., 1977. See above. For many years, the essential text on CTEM.
Reimer, 1993. See above. Excellent reference with emphasis on physics of electron scattering and TEM.
Shindo, D. and Hiraga, K. 1998. High Resolution Electron Microscopy for Materials Science. Springer-Verlag, Tokyo. Provides numerous high-resolution TEM images of materials.
Williams and Carter, 1996. See above. A current and most comprehensive text on modern TEM techniques.

INTERNET RESOURCES
http://www.amc.anl.gov An excellent source for TEM information on the Web in the United States. Provides access to the Microscopy ListServer and a Software Library as well as a connection to the Microscopy & Microanalysis FTP Site and Libraries plus connections to many other useful sites.
http://cimewww.epfl.ch/welcometext.html A similar site based at the Ecole Polytechnique Federale de Lausanne in Switzerland that contains software and a variety of electron microscopy information.
http://www.msa.microscopy.com Provides access to up-to-date information about the Microscopy Society of America, affiliated societies, and microscopy resources that are sponsored by the society.
http://rsb.info.nih.gov/nih-image Public domain software developed by the National Institutes of Health (U.S.) for general image processing and manipulation. Available from the Internet by anonymous ftp from zippy.nimh.nih.gov or on floppy disk from National Technical Information Service, 5285 Port Royal Rd., Springfield, VA 22161, part number PB93-504868.

JAMES M. HOWE
University of Virginia
Charlottesville, Virginia

BRENT FULTZ
California Institute of Technology
Pasadena, California

SCANNING TRANSMISSION ELECTRON MICROSCOPY: Z-CONTRAST IMAGING

INTRODUCTION
As its name suggests, the scanning transmission electron microscope is a combination of the scanning electron microscope and the transmission electron microscope. Thin specimens are viewed in transmission, while images are formed serially by the scanning of an electron probe. In recent years, electron probes have become available with atomic dimensions, and, as a result, atomic resolution images may now be achieved in this instrument. The nature of the images obtained in scanning transmission electron microscopy (STEM) can differ in significant ways from those formed by the more widespread conventional transmission electron microscopy (CTEM). The key difference lies in their modes of image formation; the STEM instrument can be configured for almost perfect incoherent imaging whereas CTEM provides almost perfect coherent imaging. The latter technique is generally referred to as high-resolution electron microscopy (HREM), though both methods now provide atomic resolution. The difference between coherent and incoherent imaging was first discussed over one hundred years ago in the context of light microscopy by Lord Rayleigh (1896). The difference depends on whether or not permanent phase relationships exist between rays emerging from different parts of the object. A self-luminous object results in perfect incoherent imaging, as every atom emits independently, whereas perfect coherent imaging occurs if the entire object is illuminated by a plane wave, e.g., a point source at infinity. Lord Rayleigh noted the factor of 2 improvement in resolution available with incoherent
Figure 1. Schematic showing Z-contrast imaging and atomic resolution electron energy loss spectroscopy with STEM. The image is of GaAs taken on the 300-kV STEM instrument at Oak Ridge National Laboratory, which directly resolves and distinguishes the sublattice, as shown in the line trace.
Figure 2. Image intensity for two point objects, P1 and P2, illuminated coherently in phase, 180° out of phase, and with incoherent illumination. (After Lord Rayleigh, 1896.)
imaging and also the lack of artifacts caused by interference phenomena that could be mistaken for real detail in the object. Of particular importance in the present context, he appreciated the role of the condenser lens (1896, p. 175): "It seems fair to conclude that the function of the condenser in microscopic practice is to cause the object to behave, at any rate in some degree, as if it were self-luminous, and thus to obviate the sharply-marked interference bands which arise when permanent and definite phase relationships are permitted to exist between the radiations which issue from various points of the object." A large condenser lens provides a close approximation to perfect incoherent imaging by ensuring a range of optical paths to neighboring points in the specimen. A hundred years later, STEM can provide these same advantages for the imaging of materials with electrons. A probe of atomic dimensions illuminates the sample (see Fig. 1), and a large annular detector is used to detect electrons scattered by the atomic nuclei. The large angular range of this detector performs the same function as Lord Rayleigh's condenser lens in averaging over many optical paths from each point inside the sample. This renders the sample effectively self-luminous; i.e., each atom in the specimen scatters the incident probe in proportion to its atomic scattering cross-section. With a large central hole in the annular detector, only high-angle Rutherford scattering is detected, for which the cross-section depends on the square of the atomic number (Z); hence this kind of microscopy is referred to as Z-contrast imaging. The concept of the annular detector was introduced by Crewe et al. (1970), and spectacular images of single heavy
atoms were obtained (see, e.g., Isaacson et al., 1979). In the field of materials, despite annular detector images showing improved resolution (Cowley, 1986a) and theoretical predictions of the lack of contrast reversals (Engel et al., 1974), it was generally thought impossible to achieve an incoherent image at atomic resolution (Cowley, 1976; Ade, 1977). Incoherent images of thick crystalline materials were first reported by Pennycook and Boatner (1988), and the reason for the preservation of incoherent characteristics despite the strong dynamical diffraction of the crystal was soon explained (Pennycook et al., 1990), as described below. An incoherent image provides the most direct representation of a material's structure and, at the same time, improved resolution. Figure 2 shows Lord Rayleigh's classic result comparing the observation of two point objects with coherent and incoherent illumination. Each object gives rise to an Airy disc intensity distribution in the image plane, with a spatial extent that depends on the aperture of the imaging system. The two point objects are separated so that the first zero in the Airy disc of one coincides with the central maximum of the other, a condition that has become known as the Rayleigh resolution criterion. With incoherent illumination, there are clearly two peaks in the intensity distribution and a distinct dip in between; the two objects are just resolved, and the peaks in the image intensity correspond closely with the positions of the two objects. With coherent illumination by a plane-wave source (identical phases at the two objects), there is no dip, and the objects are unresolved. Interestingly, however, if the two objects are illuminated 180° out of phase, then they are always resolved, with the intensity dropping to zero half-way between the two.
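Lord Rayleigh's three cases can be reproduced numerically. The sketch below assumes an ideal Airy amplitude 2J1(u)/u and places the two objects at the Rayleigh separation; it shows the dip for incoherent addition, the absence of a dip for in-phase coherent addition, and the exact zero for antiphase illumination.

```python
# Numerical version of Lord Rayleigh's comparison (Figure 2): two Airy
# patterns at the Rayleigh separation, combined with and without fixed
# phase relationships. Uses only numpy and scipy.
import numpy as np
from scipy.special import j1

U0 = 3.8317                  # first zero of J1: the Rayleigh separation
                             # in the reduced coordinate u

def airy_amp(u):
    """Airy amplitude 2*J1(u)/u, with the u -> 0 limit handled."""
    u = np.asarray(u, dtype=float)
    safe = np.where(u == 0, 1.0, u)
    return np.where(u == 0, 1.0, 2 * j1(safe) / safe)

u = np.linspace(-3, U0 + 3, 4001)
a1 = airy_amp(u)             # object P1 at u = 0
a2 = airy_amp(u - U0)        # object P2 at u = U0

I_incoh = a1**2 + a2**2      # incoherent: intensities add
I_coh = (a1 + a2)**2         # coherent, in phase: amplitudes add
I_anti = (a1 - a2)**2        # coherent, 180 deg out of phase

mid = int(np.argmin(np.abs(u - U0 / 2)))
# Incoherent: two peaks with a distinct dip (~74% of peak) -> resolved.
# In-phase coherent: the midpoint is brighter than either object -> not
# resolved. Antiphase: intensity falls essentially to zero at the midpoint.
print(I_incoh[mid], I_coh[mid], I_anti[mid])
```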
Unfortunately, this desirable result can only be achieved in practice for one particular image spacing (e.g., by illuminating from a particular angle), and other spatial frequencies will have different phase relationships and therefore show different contrast, a characteristic that is generic to coherent imaging. Note also that the two peaks are significantly displaced from their true positions. In an incoherent image, there are no fixed phase relationships, and the intensity is given by a straightforward
Figure 3. Schematic showing incoherent imaging of a thin specimen with STEM: (A) monolayer raft of Si⟨110⟩; (B) the Z-contrast object function represents the high-angle scattering power localized at the atomic sites; (C) illumination of the sites for a 1.26-Å probe located over one atom in the central dumbbell. As the probe scans, it maps out the object function, producing an incoherent image.
convolution of the electron probe intensity profile with a real and positive specimen object function, as shown schematically in Figure 3. With Rutherford scattering from the nuclei dominating the high-angle scattering, the object function is sharply peaked at the atomic positions and proportional to the square of the atomic number. In the figure, a monolayer raft of atoms is scanned by the probe, and each atom scatters according to the intensity in the vicinity of the nucleus and its high-angle cross-section. This gives a direct image with a resolution determined by the probe intensity profile. Later it will be shown how crystalline samples in a zone axis orientation can also be imaged incoherently. These atomic resolution images show similar characteristics to the incoherent images familiar from optical instruments such as the camera; in a Z-contrast image, atomic columns do not reverse contrast with focus or sample thickness. In Figure 1, columns of Ga can be distinguished directly from columns of As simply by inspecting the image intensity. Detailed image simulations are therefore not necessary. This unit focuses on Z-contrast imaging of materials at atomic resolution. Many reviews are available covering other aspects of STEM, e.g., microanalysis and microdiffraction (Brown, 1981; Pennycook, 1982; Colliex and Mory, 1984; Cowley, 1997). Unless otherwise stated, all images were obtained with a 300-kV scanning transmission electron microscope, a VG Microscopes HB 603U with a 1.26-Å probe size, and all spectroscopy was performed on a 100-kV VG Microscopes HB 501UX STEM.
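The convolution model of Figure 3 can be sketched in a few lines. The example below is a minimal one-dimensional toy, assuming a Gaussian stand-in for the probe intensity profile and hypothetical Ga/As column positions; the real probe shape and spacings differ.

```python
# Numerical sketch of incoherent image formation (Fig. 3): the image is
# the probe intensity profile convolved with a Z^2-weighted object
# function. Grid, spacings, and the Gaussian probe are illustrative
# assumptions, not instrument parameters.
import numpy as np

dx = 0.02                                   # grid step, Angstroms
x = np.arange(0, 20, dx)

# Object function: Ga (Z=31) and As (Z=33) columns, Rutherford-weighted
obj = np.zeros_like(x)
sites = {4.0: 31, 8.0: 33, 12.0: 31, 16.0: 33}   # position (A) -> Z
for pos, Z in sites.items():
    obj[int(round(pos / dx))] = Z**2

# Gaussian stand-in for a 1.26-A (FWHM) probe intensity profile
xp = np.arange(-5, 5 + dx, dx)              # symmetric, odd-length kernel
sigma = 1.26 / 2.355
probe = np.exp(-xp**2 / (2 * sigma**2))
probe /= probe.sum()

image = np.convolve(obj, probe, mode='same')

# Columns are distinguished directly by intensity: Ga/As ~ (31/33)^2
i_ga = image[int(round(4.0 / dx))]
i_as = image[int(round(8.0 / dx))]
print(round(i_ga / i_as, 3))                # ~0.88
```

Because the convolution kernel is real and positive, the image peaks stay on the atomic sites, which is the sense in which the image can be read off directly.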
Competitive and Related Techniques

The absence of phase information in an incoherent image is an important advance; it allows the direct inversion from the image to the object. In coherent imaging, the structural relationships between different parts of the object are encoded in the phase relationships between the various diffracted beams. These are lost when the intensity is recorded, giving rise to the well-known phase problem. Despite this, much effort has been expended in attempts to measure the phase relationships between different diffracted beams in order to reconstruct the object (e.g., Coene et al., 1992; Orchowski et al., 1995; Möbus, 1996; Möbus and Dehm, 1996). However, electrons interact with the specimen potential, which is a real quantity. There is no necessity to involve complex quantities; in an incoherent image, there are no phases and so none can be lost. Information about the object is encoded in the image intensities, and images may be directly inverted to recover the object. As seen in Figure 2, intensity maxima in an incoherent image are strongly correlated with atomic positions, so that often this inversion can be done simply by eye, exactly as we interpret what we see around us in everyday life. With the additional benefit of strong Z contrast, structures of defects and interfaces in complex materials may often be determined directly from the image. In a phase-contrast image, the contrast changes dramatically as the phases of the various diffracted beams change with specimen thickness or objective lens focus, which means that it is much more difficult to determine the object uniquely. It is often necessary to simulate many trial objects to find the best fit. The situation is especially difficult in regions such as interfaces or grain boundaries, where atomic spacings deviate from those in the perfect crystal, which can also cause atomic columns to reverse contrast. If one does not think of the correct structure to simulate, obviously the fit will be spurious. Unexpected phenomena such as the formation of new interfacial phases or unexpected atomic bonding are easily missed. Such phenomena are much more obvious in an incoherent image, which is essentially a direct image of atomic structure.

As an example, Figure 4 shows the detection of van der Waals bonding between an epitaxial film and its substrate from the measurement of film-substrate atomic separation (Wallis et al., 1997). At 3.2 Å, this is significantly larger than bond lengths in covalently bonded semiconductors, which are typically in the range of 2.3 to 2.6 Å.

Figure 4. Z-contrast image of an incommensurate CdTe(111)/Si(100) interface showing a 3.2-Å spacing between film and substrate indicating van der Waals bonding.

Closely related to STEM, scanning electron microscopy (SEM) gives an image of the surface (or near-surface) region of a bulk sample, whereas STEM gives a transmission image through a thin region of the bulk, which by careful preparation we hope is representative of the original bulk sample. The information from the two microscopes is therefore very different, as is the effort required for sample preparation. Whereas in STEM we detect the bright-field and dark-field transmitted electrons to reveal the atomic structure, in SEM we detect secondary electrons, i.e., low-energy electrons ejected from the specimen surface. The resolution is limited by the interaction volume of the beam inside the sample, and atomic resolution has not been achieved. Probe sizes are therefore much larger, 1 to 10 nm. Secondary electron images show topographic contrast with excellent depth of field. Backscattered primary electrons can also be detected to give a Z-contrast image analogous to that obtained with the STEM annular detector, except at lower resolution and contrast. Other signals such as cathodoluminescence or electron beam-induced conductivity can be used to map optical and electronic properties, and x-ray emission is often used for microanalysis. These signals can also be detected in STEM (Pennycook et al., 1980; Pennycook and Howie, 1980; Fathy and Pennycook, 1981), except that signals tend to be weaker because the specimen is thin. Atomic resolution images of surfaces are produced by scanning tunneling microscopy (STM) and atomic force microscopy (AFM). Here, the interaction is with the surface electronic structure, although this may be influenced by defects in underlying atomic layers to give some subsurface sensitivity. In scanning probe microscopy, it is the valence electrons that give rise to an image; it is not a direct image of the surface atoms, and interpretation must be in terms of the surface electronic structure. The STEM Z-contrast image is formed by scattering from the atomic nuclei and is therefore a direct structure image.
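The phase problem noted above (under Competitive and Related Techniques) can be made concrete with a toy calculation: a hypothetical one-dimensional object and its spatially reversed counterpart are different structures, yet once the phases are discarded their diffracted intensities are identical.

```python
# Toy illustration of the phase problem: |FT|^2 of a real object is
# unchanged by reversing the object, so recorded intensities alone
# cannot distinguish the two structures. The object is hypothetical.
import numpy as np

obj = np.zeros(64)
obj[[10, 13, 20]] = [3.0, 1.0, 2.0]   # arbitrary column "weights"

flipped = obj[::-1].copy()            # a genuinely different object

I_obj = np.abs(np.fft.fft(obj))**2    # diffracted intensities
I_flip = np.abs(np.fft.fft(flipped))**2

print(np.allclose(I_obj, I_flip))     # True: the difference lived in
                                      # the phases, and they are lost
```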
Valence electrons are studied in STEM through electron energy loss spectroscopy (EELS) and can be investigated at atomic resolution using the Z-contrast image as a structural reference, as discussed below. Scanning transmission electron microscopy can study a wider range of samples than STM or AFM as it has no problem with rough substrates and can tolerate substrates that are quite highly insulating. Data from EELS provide information similar to that from x-ray absorption spectroscopy. The position and fine structure on absorption edges give valuable information on the local electronic structure. The EELS data give such information from highly localized regions, indeed, from single atomic columns at an interface (Duscher et al., 1998a). Instead of providing details of bulk electronic structure, it provides information on how that structure is modified at defects, interfaces, and grain boundaries and insight into the changes in electronic, optical, and mechanical properties that often determine the overall bulk properties of a material or a device. The detailed atomic level characterization provided by STEM, with accurate atomic positions determined directly from the Z-contrast image and EELS data on impurities, their valence, and local band structure, represents an ideal starting point for theoretical studies. This is particularly
valuable for complex materials; it would be impractical to explore all the possible configurations of a complex extended defect with first-principles calculations, even with recent advances in computational capabilities. As the complexity increases, so the number of required trial structures grows enormously, and experiment is crucial to cut the possibilities down to a manageable number. Combining experiment and theory leads to a detailed and comprehensive picture of complex atomistic mechanisms, including equilibrium structures, impurity- or stress-induced structural transformations, and dynamical processes such as diffusion, segregation, and precipitation. Recent examples include the observation that As segregates at specific sites in a Si grain boundary in the form of dimers (Chisholm et al., 1998) and that Ca segregating to an MgO grain boundary induces a structural transformation (Yan et al., 1998).

PRINCIPLES OF THE METHOD

Comparison to Coherent Phase-Contrast Imaging

The conventional means of forming atomic resolution images of materials is through coherent phase-contrast imaging using plane-wave illumination with an (approximately) parallel incident beam (Fig. 5A). The objective aperture is behind the specimen and collects diffracted beams that are brought to a focus on the microscope screen, where they interfere to produce the image contrast. The electrons travel from top to bottom in the figure. Not shown are additional projector lenses to provide higher magnification. In Figure 5B, the optical path of the STEM is shown, with the electrons traveling from bottom to top. A point source is focused into a small probe by the objective lens, which is placed before the specimen. Not shown are the condenser lenses (equivalent to projector lenses in the CTEM) between the source and the objective lens to provide additional demagnification of the source. Transmitted electrons are then detected through a defined angular range.
For the small axial collector aperture shown, the two microscopes have identical optics, apart from the fact that the direction of electron propagation is reversed. Since image contrast in the electron microscope is dominated by elastic scattering, no energy loss is involved and time-reversal symmetry applies. With equivalent apertures, the image contrast is independent of the direction of electron propagation, and the two microscopes are optically equivalent: the STEM bright-field image will be the same image, and be described by the same imaging theory, as that of a conventional TEM with axial illumination. This is the principle of reciprocity, which historically was used to predict the formation of high-resolution lattice images in STEM (Cowley, 1969; Zeitler and Thomson, 1970). For phase-contrast imaging, the axial aperture (illumination aperture in TEM, collection aperture in STEM) must be much smaller than a typical Bragg diffraction angle. If β is the semiangle subtended by that aperture at the specimen, then the transverse coherence length in the plane of the specimen will be of order λ/β, where λ is the electron wavelength. Coherent illumination of
Figure 6. Contrast transfer functions for a 300-kV microscope with an objective lens with Cs = 1 mm: (A) coherent imaging conditions; (B) incoherent imaging conditions. Curves assume the Scherzer (1949) optimum conditions shown in Table 1: (A) defocus 505 Å; (B) defocus 438 Å, aperture cutoff 0.935 Å⁻¹.
Figure 5. Schematics of the electron-optical arrangement in (A) CTEM and (B) STEM with a small axial detector. Note the direction of the electron propagation is reversed in the two microscopes. (C) A large-angle bright-field detector or annular dark-field detector in STEM gives incoherent imaging.
neighboring atoms spaced a distance d apart requires d ≪ λ/β, or β ≪ λ/d = 2θB, where θB is the Bragg angle corresponding to the spacing d. Similarly, if the collection aperture is opened much wider than a typical Bragg angle (Fig. 5C), then the transverse coherence length at the specimen becomes much less than the atomic separation d, and the image becomes incoherent in nature, in precisely the same way as first described by Lord Rayleigh. Now the annular detector can be seen as the complementary dark-field
version of the Rayleigh case; conservation of energy requires the dark-field image I = 1 − IBF, where IBF is the bright-field image and the incident beam intensity is normalized to unity. In practice, it is more useful to work in the dark-field mode for weakly scattering objects. If I ≪ 1, the incoherent bright-field image shows weak contrast on a large background due to the unscattered incident beam. The dark-field image avoids this background, giving greater contrast and better signal-to-noise ratio. The difference between coherent and incoherent characteristics is summarized in Figure 6, where image contrast is plotted as a function of spatial frequency in a microscope operating at an accelerating voltage of 300 kV. In both modes some objective lens defocus is used to compensate for the very high aberrations of electron lenses, and the conditions shown are the optimum conditions for each mode as defined originally by Scherzer (1949). In the coherent imaging mode (Fig. 6A), the contrast transfer function is seen to oscillate rapidly with increasing spatial frequency (smaller spacings). Spatial frequencies where the transfer function is negative are imaged in reverse contrast; those where the transfer function crosses zero will be absent. For this reason, atomic positions in distorted regions such as grain boundaries may reverse contrast or be absent from a phase-contrast image. In the incoherent mode (Fig. 6B), a smoothly
Figure 7. Diffraction pattern in the detector plane from a simple cubic crystal of spacing such that the angle between diffracted beams is greater than the objective aperture radius. (A) An axial bright-field detector shows no contrast. (B) Regions of overlapping discs on the annular detector produce atomic resolution in the incoherent image.
decaying positive transfer is obtained (referred to as an object transfer function to distinguish from the coherent case), which highlights the lack of contrast reversals associated with incoherent imaging. It is also apparent from the two transfer functions that incoherent conditions give significantly improved resolution. To demonstrate the origin of the improved resolution available with incoherent imaging, Figure 7 shows the diffraction pattern formed on the detector plane in the STEM from a simple cubic crystal with spacing d. Because the illuminating beam in the STEM is a converging cone, each Bragg reflection appears as a disc with diameter determined by the objective aperture. Electrons, unlike light, are not absorbed by a thin specimen but are only diffracted or scattered. The terms are equivalent, diffraction being reserved for scattering from periodic assemblies of atoms when sharp peaks or discs are seen in the scattered intensity. The periodic object provides a good demonstration of Abbé theory (see, e.g., Lipson and Lipson, 1982)
that all image contrast arises through interference; as the probe is scanned across the atomic planes, only the regions of overlap between the discs change intensity as the phases of the various reflections change (Spence and Cowley, 1978). For the crystal spacing corresponding to Figure 7A, coherent imaging in CTEM or STEM using a small axial aperture will not detect any overlaps and will show a uniform intensity with no contrast. The limiting resolution for this mode of imaging requires diffraction discs to overlap on axis, which means they must be separated by less than the objective aperture radius. Any detector that is much larger than the objective aperture will detect many overlapping regions, giving an incoherent image. Clearly, the limiting resolution for incoherent imaging corresponds to the discs just overlapping, i.e., to a crystal spacing such that the diffracted beams are separated by the objective aperture diameter (as opposed to the radius), double the resolution of the coherent image formed by an axial aperture. The approximately triangular shape of the contrast transfer function in Figure 6B results from the simple fact that a larger crystal lattice produces closer spaced diffraction discs with more overlapping area, thus giving more image contrast. In practice, of course, one would tend to use larger objective apertures in the phase-contrast case to improve resolution. The objective aperture performs a rather different role in the two imaging modes; in the coherent case the objective aperture is used to cut off the oscillating portion of the contrast at high spatial frequencies, and the resolution is directly determined by the radius of the objective aperture, as mentioned before. In the incoherent case, again it is clear that the available resolution will be limited to the aperture diameter, but there is another consideration. 
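For scale, the Scherzer optimum conditions of Table 1 can be evaluated for the 300-kV, Cs = 1 mm lens of Figure 6. The sketch below uses the standard relativistically corrected wavelength formula; the rounding is approximate.

```python
# Evaluating the Scherzer optima of Table 1 for a 300-kV microscope
# with Cs = 1 mm, the conditions of Figure 6. Angstrom units throughout.
import math

def wavelength(V):
    """Relativistically corrected electron wavelength (Angstroms)."""
    h, m0, e, c = 6.62607e-34, 9.10938e-31, 1.60218e-19, 2.99792e8
    p = math.sqrt(2 * m0 * e * V * (1 + e * V / (2 * m0 * c**2)))
    return h / p * 1e10

lam = wavelength(300e3)                  # ~0.0197 A at 300 kV
Cs = 1e7                                 # 1 mm in Angstroms

res_coh   = 0.66 * Cs**0.25 * lam**0.75  # coherent resolution limit, A
res_incoh = 0.43 * Cs**0.25 * lam**0.75  # incoherent resolution limit, A
df_coh    = 1.155 * math.sqrt(Cs * lam)  # optimum defocus, A
df_incoh  = math.sqrt(Cs * lam)
ap_incoh  = 1.414 * (lam / Cs)**0.25     # optimum aperture semiangle, rad

# Incoherent optimum: ~1.3-A resolution (cf. the 1.26-A probe quoted in
# the text) with a ~9.4-mrad aperture semiangle; the coherent optimum
# is poorer by the fixed factor 0.66/0.43, i.e., roughly 50%.
print(round(res_incoh, 2), round(res_coh, 2), round(ap_incoh * 1e3, 1))
```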
To take most advantage of the incoherent mode requires that the image bear a direct relationship to the object, so that atomic positions correlate closely with peaks in the image intensity. This is a rather more stringent condition and necessitates some loss in resolution. The optimum conditions worked out long ago by Scherzer (1949) for both coherent and incoherent imaging are shown in Figure 6 and summarized in Table 1. The Scherzer resolution limit for coherent imaging is then only about 50% poorer than for the incoherent case. The role of dynamical diffraction is also very different in the two modes of imaging. As will be seen later, dynamical diffraction manifests itself in a Z-contrast image as a columnar channeling effect, not by altering the incoherent nature of the Z-contrast image, but simply by scaling the columnar scattering intensities. In coherent imaging, dynamical diffraction manifests itself through contrast
Table 1. Optimum Conditions for Coherent and Incoherent Imaging

Parameter           Coherent Imaging         Incoherent Imaging
Resolution limit    0.66 CS^(1/4) λ^(3/4)    0.43 CS^(1/4) λ^(3/4)
Optimum defocus     1.155 (CS λ)^(1/2)       (CS λ)^(1/2)
Optimum aperture    1.515 (λ/CS)^(1/4)       1.414 (λ/CS)^(1/4)
reversals with specimen thickness, as the phases between various interfering Bragg reflections change. These effects are also nonlocal, so that atom columns adjacent to interfaces will show different intensity from columns of the same composition far from the interface. Along with the fact that atom columns can also change from black to white by changing the objective lens focus, these effects make intuitive interpretation of coherent images rather difficult. Of course, one usually has a known material next to an interface that can help to establish the microscope conditions and local sample thickness, but these characteristics of coherent imaging make it very difficult to invert an image of an interface without significant prior knowledge of its likely structure. Typically, one must severely limit the number of possible interface structures that are considered, e.g., by assuming an interface that is atomically abrupt, with the last monolayer on each side having the same structure as the bulk. Although some interfaces do have these characteristics, it is also true that many, and perhaps most, do not have such simple structures. In the case of the CoSi2/Si(111) interface, this procedure gives six possible interface models. At interfaces made by ion implantation, however, different interface structures were seen by Z-contrast imaging (Chisholm et al., 1994). There are now a large number of examples where interface structures have proved to be different from previous models: in semiconductors (Jesson et al., 1991; McGibbon et al., 1995; Chisholm and Pennycook, 1997), superconductors (Pennycook et al., 1991; Browning et al., 1993a), and ceramics (McGibbon et al., 1994, 1996; Browning et al., 1995). If we apply reciprocity to the STEM annular detector, it is clear that an equivalent mode exists in CTEM using high-angle hollow-cone illumination. In practice, it has not proved easy to achieve such optics. 
In any case, a much higher incident beam current will pass through the specimen than in the equivalent STEM geometry, because the illumination must cover an angular range much greater than a typical Bragg angle, whereas the STEM objective aperture is of comparable size to a Bragg angle. Beam damage to the sample would therefore become a serious concern. Another advantage of the STEM geometry that should be emphasized is that the detection angle can be increased without limit, up to the maximum of 180° if so desired. Lord Rayleigh’s illumination was limited to the maximum aperture of the condenser lens, so that at the limit of resolution (with condenser and objective apertures equal), significant partial coherence exists in an optical microscope. In STEM, by increasing the inner angle of the annular detector, the residual coherence can be reduced well below the limiting resolution imposed by the objective aperture. It is therefore a more perfect form of incoherent imaging than fixed-beam light microscopy. However, increasing detector angles will lead to reduced image intensity due to the rapid fall-off in atomic scattering factor, and a backscattered electron detector is really not practical. With high-energy electrons, the full unscreened Rutherford scattering cross-section is seen above a few degrees scattering angle, so higher scattering angles bring no advantage. A useful criterion for the minimum detector
Figure 8. (A) Bright-field phase-contrast STEM image of an iodine intercalated bundle of carbon nanotubes showing lattice fringes. (B) Z-contrast image taken simultaneously showing intercalated iodine.
aperture θ_i to achieve incoherent imaging of two objects separated by R is

θ_i = 1.22 λ / R    (1)
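Equation 1 is a one-line computation. In the sketch below, both the wavelength (roughly that of a 300-kV beam) and the 0.2-nm object separation are assumed illustrative values, not taken from the text.

```python
# Minimum annular-detector inner angle for incoherent imaging (Eq. 1):
# theta_i = 1.22 * lambda / R.
lam = 1.97e-12          # m, electron wavelength (~300-kV beam, assumed)
R = 2.0e-10             # m, 0.2-nm object separation (assumed)
theta_i = 1.22 * lam / R    # rad; about 12 mrad for these values
```

This lands in the tens-of-milliradians range typical of annular dark-field inner angles, well outside the objective aperture.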
[The equation array of Figure 23 does not survive text extraction. The figure tabulates, for a scattering event occurring in slab n of the target, the detected yield Y^{(n)}(E_d), the kinematic relation E_3^{(n)} = K E_0^{(n)}, the surface-energy relation for the foil, and the stopping-power factors [S_p,b]^{(n)} and [S_p,r]^{(n)} for the backscattering (RBS) and recoil (ERD) geometries. The two Rutherford cross-sections, recoverable from the fragments, are

dσ_R(E_0^{(n)}, θ)/dΩ = (Z_1 Z_2 e²/4E_0^{(n)})² (4/sin⁴θ) {[1 − ((M_1/M_2) sin θ)²]^{1/2} + cos θ}² / [1 − ((M_1/M_2) sin θ)²]^{1/2}    (RBS)

dσ_R(E_0^{(n)}, φ)/dΩ = (Z_1 Z_2 e²/4E_0^{(n)})² 4(M_1 + M_2)²/(M_2² cos³ φ)    (ERD)]

Figure 23. Equations specific to either RBS or ERD.
J. C. BARBOUR
Sandia National Laboratories
Albuquerque, New Mexico
NUCLEAR REACTION ANALYSIS AND PROTON-INDUCED GAMMA RAY EMISSION

INTRODUCTION

Nuclear reaction analysis (NRA) and proton- (particle-) induced gamma ray emission (PIGE) are based on the interaction of energetic (from a few hundred kilo-electron-volts to several mega-electron-volts) ions with light nuclei. Every nuclear analytical technique that uses nuclear reactions has a unique feature: isotope sensitivity. These techniques are therefore insensitive to matrix effects, and there is much less interference than in methods where signals from different elements overlap. In NRA the nuclear
reaction produces charged particles, while in PIGE the excited nucleus emits gamma rays. Sometimes the charged-particle emission and the gamma ray emission occur simultaneously, as in the 19F(p, αγ)16O reaction. Both NRA and PIGE measure the concentration and depth distribution of elements in the surface layer (a few micrometers) of the sample. Both techniques are limited by the available nuclear reactions. Generally, they can be used only for light elements, up to calcium. This is the basis for an important property of these methods: the nuclear reaction technique is one of the few analytical techniques that can quantitatively measure hydrogen profiles in solids. Since these techniques are sensitive to the nuclei in the sample, they are unable to provide information
about the chemical states and bonds of the elements in the sample. For the same reason they cannot provide information about the microscopic structure of the sample. Combined with channeling, NRA or PIGE can provide information about the location of the measured element in a crystalline lattice. The sensitivity and depth resolution of these methods depend on the specific nuclear reaction. The sensitivity typically varies between 10 and 100 ppm, while the depth resolution can be as good as a few nanometers or as large as hundreds of nanometers. The lateral resolution of the method depends on the size of the bombarding ion beam; good nuclear microprobes are currently in the few-micrometer range. Also, using nuclear microprobes, an elemental/isotopic image of the sample can be recorded. Both NRA and PIGE are nondestructive, although some materials can suffer lattice damage after long, high-current bombardment. Although NRA and PIGE are quantitative, in most cases standards have to be used. Depending on the shape of the particular nuclear cross-section, nonresonant or resonant depth profiling can be used. Resonant profiling (in most cases PIGE uses resonances, but there are a few charged-particle reactions that have usable resonances) typically gives very good depth resolution, but the measurement takes much longer than nonresonant profiling; therefore, the probability of inducing changes in the sample by the ion beam is higher. These techniques require a particle accelerator capable of accelerating ions up to several mega-electron-volts. This limits their availability to laboratories dedicated to these methods or that have engaged in low-energy nuclear physics in the past (i.e., they have a particle accelerator that is no longer adequate for modern nuclear physics experiments because of its low energy but is quite suitable for nuclear analytical techniques).
This unit will concentrate on the specific aspects of these two nuclear analytical techniques and will not cover the details of the ion-solid interaction or the general aspects of ion beam techniques (e.g., stopping power, detection of ions). These are described in detail elsewhere in this part (see ELASTIC ION SCATTERING FOR COMPOSITION ANALYSIS and MEDIUM-ENERGY BACKSCATTERING AND FORWARD-RECOIL SPECTROMETRY).

Competitive and Complementary Techniques

Practically every analytical technique that measures the concentration and depth profile of elements in the top few micrometers of a solid competes with NRA and PIGE. However, either most of these techniques are not isotope sensitive or their isotope resolution is not very good. The competing techniques can be divided into two groups: ion beam techniques and other techniques. The ion beam techniques that compete with NRA and PIGE are Rutherford backscattering spectrometry (RBS; see ELASTIC ION SCATTERING FOR COMPOSITION ANALYSIS), elastic recoil detection (ERD or ERDA; see ELASTIC ION SCATTERING FOR COMPOSITION ANALYSIS), proton-induced x-ray emission (PIXE; see PARTICLE-INDUCED X-RAY EMISSION), secondary ion mass spectroscopy (SIMS), medium-energy ion scattering
(MEIS; see MEDIUM-ENERGY BACKSCATTERING AND FORWARD-RECOIL SPECTROMETRY), and ion scattering spectroscopy (ISS; see HEAVY-ION BACKSCATTERING SPECTROMETRY). Rutherford backscattering spectrometry and ERD can be considered special cases of NRA in which the nuclear reaction is just an elastic scattering. The main advantage of RBS and ERD is that they are able to see almost all elements in the periodic table (with the obvious exception in RBS when the element to be detected is lighter than the bombarding ions). This can be a disadvantage when a light element has to be measured in a heavy matrix (e.g., the measurement of oxygen in tantalum by RBS). This is the typical case in which NRA can solve the problem but RBS cannot. In ERD, since the forward-recoiled atoms have to be mass analyzed, the measurement becomes more complicated than in NRA, requiring more sophisticated (and more expensive) instruments. The resolution and sensitivity achieved using RBS, ERD, and NRA are of the same order, although when very heavy ions are used, the sensitivity and depth resolution can be an order of magnitude better in ERD than in NRA. Using heavy ions presents other problems, but these are beyond the scope of this unit (see HEAVY-ION BACKSCATTERING SPECTROMETRY). For more discussion see Green and Doyle (1986) and Davies et al. (1995). Proton-induced x-ray emission (see PARTICLE-INDUCED X-RAY EMISSION) can detect most elements, except H and He. Also, the detection of x rays from low-Z elements (below Na) requires a special, windowless detector. Another drawback of PIXE is that it generally cannot provide depth information. Secondary ion mass spectroscopy can be used for most of the analyses performed in NRA, including hydrogen profiling. Its sensitivity and depth resolution are superior to those of NRA. The main advantage of NRA over SIMS is that while NRA is a nondestructive technique, SIMS depth profiling destroys the sample.
Also, the depth scale with SIMS depends on accurate knowledge of sputtering rates for the specific sample. Although MEIS and ISS can be considered competing techniques, their probing depth is much smaller (which also means much better depth resolution) than that of NRA. Among the non-ion-beam techniques, Auger electron spectroscopy (AES, see AUGER ELECTRON SPECTROSCOPY), x-ray photoelectron spectroscopy (XPS), x-ray fluorescence (XRF, see X-RAY MICROPROBE FOR FLUORESCENCE AND DIFFRACTION ANALYSIS), and neutron activation analysis (NAA) should be mentioned. Auger electron spectroscopy and XPS not only provide concentration information but also are sensitive to the chemical states of the elements; therefore, they give information about chemical bonds. Since both methods get their information from the first few nanometers of the sample, to measure a depth profile, layers of the sample have to be removed (usually by sputtering); therefore, the sample is destroyed. X-ray fluorescence can provide concentration information about elements heavier than Na (again, depending on the x-ray detector) but cannot measure the depth profile. The only technique listed that has the same isotope sensitivity as NRA is NAA. However, NAA is a bulk technique and does not provide any depth information.
ION-BEAM TECHNIQUES
Most of the above-mentioned techniques are complementary to NRA and in many cases are used concurrently with NRA. [It is quite common in ion beam analysis (IBA) laboratories for an ion-scattering chamber to have an AES or XPS spectrometer mounted on it.] The most frequent combination is RBS and NRA, since they are closely related and use the same equipment. Whereas NRA can measure light elements in a heavy matrix but cannot see the matrix itself (at least not directly), RBS is very sensitive and has good mass resolution for heavy elements but cannot see a small amount of a light element in a heavy matrix.

PRINCIPLES OF THE METHOD

Although NRA and PIGE are similar to other high-energy ion beam techniques (see ELASTIC ION SCATTERING FOR COMPOSITION ANALYSIS and PARTICLE-INDUCED X-RAY EMISSION), here we will discuss only the principles specific to NRA and PIGE. Both NRA and PIGE measure the prompt reaction products of nuclear reactions. The yield (number of detected particles or γ rays) provides information about the concentration of elements, and the energy of the detected charged particles provides information about the depth distribution (depth profile) of the elements. Depending on whether the reaction cross-section has a sharp resonance, the methods are distinguished as either resonant or nonresonant (e.g., PIGE uses only the resonant method). When a high-energy ion beam is incident on a target, the ions slow down in the material and either undergo elastic scattering (RBS, ERDA) or induce a nuclear reaction at various depths. In a nuclear reaction, a compound nucleus is formed and almost immediately a charged particle (p, d, 3He, or 4He) and/or a γ photon is emitted. Many (p, α) reactions have an associated γ photon. The emitted charged particle/photon then leaves the target and is detected by an appropriate detector.
The energy of the charged particle depends on the angle between the incident ion and the emitted particle, the kinetic energy of the incident particle, and the Q value of the reaction, where Q is the energy difference between the initial and final nuclear states. The energy of the emitted γ photon is determined by the energy structure of the compound nucleus (for the formulas, see the Appendix). The number of detected emitted particles is proportional to the number of incident particles, the solid angle of the detector, the concentration of the atoms participating in the nuclear reaction, and the cross-section of the reaction. The charged reaction products lose energy on their way out of the target; therefore, their energy carries information about the depth at which the nuclear reaction occurred. Since the energy of the γ photons does not change while they travel through the target, they do not provide depth information.
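The two-body kinematics referred to here (Equation 7 of the Appendix, not reproduced in this excerpt) can be sketched with the standard nonrelativistic reaction-kinematics formula. All numbers below are illustrative assumptions: integer masses in amu, a Q value of ~1.05 MeV for the p1 branch of 16O(d, p1)17O (ground-state Q value minus the ~0.87-MeV first excited level of 17O), an 850-keV deuteron beam, and a 150° detection angle.

```python
import math

def reaction_product_energy(E1, Q, M1, M3, M4, psi_deg):
    """Nonrelativistic two-body kinematics: lab energy E3 of the light
    product M3 emitted at lab angle psi for the reaction M2(M1, M3)M4.
    Standard textbook formula; '+' root, valid here since M4 > M1.
    Energies in MeV, masses in amu."""
    c = math.cos(math.radians(psi_deg))
    a = math.sqrt(M1 * M3 * E1) * c
    b = math.sqrt(M1 * M3 * E1 * c * c + (M3 + M4) * (M4 * Q + (M4 - M1) * E1))
    return ((a + b) / (M3 + M4)) ** 2

# 16O(d, p1)17O with an 850-keV deuteron beam, proton detected at 150 degrees.
E3 = reaction_product_energy(E1=0.85, Q=1.05, M1=2.0, M3=1.0, M4=17.0,
                             psi_deg=150.0)   # MeV
```

For these assumed inputs the proton emerges with roughly 1.5 MeV, i.e., well above the backscattered-deuteron energies, which is the basis of the filtering discussion later in the unit.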
Figure 1. Cross-section of the 16O(d, p1)17O reaction; θ is the scattering angle (Jarjis, 1979).
Nonresonant Methods

Overall Near-Surface Content. If the cross-section changes slowly in the vicinity of the bombarding energy E0, the absolute number of nuclei per square centimeter can be determined in thin layers independent of the concentration profile and the other components of the target. Assuming that the incident ions lose an energy ΔE in the thin layer and that the reaction cross-section σ(E) ≈ σ(E0) for E0 > E > E0 − ΔE, the number of particles detected is

Y = Q_C Ω σ(E0) Nt / cos α_in    (1)

where Q_C is the collected incident ion charge, Ω is the solid angle of the detector, α_in is the angle between the direction of the incident beam and the surface normal of the target, and Nt is the number of nuclei per square centimeter. The spectrum of emitted particles would contain a peak with an area Y. An example of such a cross-section is shown in Figure 1. The 16O(d, p1)17O reaction has a plateau between 800 and 900 keV, as indicated by arrows in the figure. This reaction is frequently used to determine the oxygen content of thin layers or the thickness of surface oxide layers up to several hundreds of nanometers.

Depth Profiling. When the thickness of the sample becomes larger and the cross-section cannot be considered constant, the spectrum becomes the convolution of the concentration profile, the energy resolution of the incident and the detected beam, and the cross-section. A typical scattering geometry is shown in Figure 2. Although the figure shows only backward geometry, forward geometry is used in certain cases as well. The depth scale can be calculated by determining the energy loss of the incident ions

Figure 2. Typical scattering geometry used in NRA experiments.
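Equation 1 can be inverted to estimate an areal density Nt from a measured peak area. Everything numeric in the sketch below is an assumed illustration (collected charge, solid angle, cross-section, counts), and σ is treated as a differential cross-section in cm²/sr; the counts were chosen so the result lands near the areal density of a ~100-nm SiO2 film.

```python
import math

# Invert Eq. 1: Nt = Y * cos(alpha_in) / (N_inc * Omega * sigma).
e = 1.602176634e-19          # elementary charge, C
Q_C = 10e-6                  # collected charge, C (10 uC of d+, assumed)
N_inc = Q_C / e              # number of singly charged incident ions
Omega = 1.0e-3               # detector solid angle, sr (assumed)
sigma = 5.0e-27              # differential cross-section, cm^2/sr (~5 mb/sr, assumed)
alpha_in = 0.0               # normal incidence
Y = 140                      # counts in the p1 peak (assumed)

Nt = Y * math.cos(alpha_in) / (N_inc * Omega * sigma)   # nuclei per cm^2
```

In practice, as noted below, the absolute Ω, σ, and charge are rarely known this well, which is why reference standards are normally used instead.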
Figure 3. Principle of resonant depth profiling.
Figure 4. Cross-section of the 18O(p, α)15N reaction around the 629-keV resonance; θ is the scattering angle (Amsel and Samuel, 1967).
before they induce the nuclear reaction and the energy loss of the reaction product particles:

E_out(x) = E(E_in(x), Q, ψ) − ∫_0^{x/cos α_out} S_out(E) dx    (2)

E_in(x) = E0 − ∫_0^{x/cos α_in} S_in(E) dx    (3)

where x is the depth at which the nuclear reaction occurs, E0 is the energy of the incident ions, Q is the Q value of the reaction, ψ is the angle between the incident and detected beams, α_in and α_out are the angles of the incident and detected beams to the surface normal of the sample, and S_in and S_out are the stopping powers of the incident ions and the reaction products. The term E(E_in(x), Q, ψ) is the energy of the reaction product calculated from Equation 7 (also see Fig. 7). In most cases, to get quantitative results, a nonlinear fit of the spectrum to a simulated spectrum is necessary (Simpson and Earwaker, 1984; Vizkelethy, 1990; Johnston, 1993). For a discussion of energy broadening, see the Appendix.

Resonant Depth Profiling

In the case of sharp resonances in the cross-section, most particles/γ photons come from a very narrow region in the target whose thickness corresponds, through the energy loss, to the width of the resonance. As the energy of the incident beam increases, the incident ions slow to the resonance energy at greater and greater depths; therefore, the thin layer from which the reaction products come lies deeper and deeper. In this way a larger depth can be probed and a depth profile can be measured. The principle of the method is illustrated in Figure 3. The depth x and the energy E0 of the incident ions are related through

E0(x) = ER + ∫_0^{x/cos α_in} S_in(E) dx    (4)

where ER is the resonance energy. In the figure, C(x) is the concentration of the element to be detected. Figure 4 shows the cross-section of the 18O(p, α)15N reaction. The resonance at 629 keV is widely used in 18O tracing experiments in which oxygen movement is studied. This is an excellent example of how useful NRA and PIGE are, since these methods are sensitive only to 18O and not to the other oxygen isotopes. The measured spectrum ("excitation curve") is the convolution of the concentration profile and the energy spread of the detection system. To extract the depth profile from the excitation curve, we should either deconvolve it with the depth-dependent energy spread or fit it to a simulation. The theory and simulation of narrow resonances are discussed in detail in Maurel (1980), Maurel et al. (1982), and Vickridge (1990).
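When the stopping power can be taken as constant over the probed range, Equation 4 reduces to a linear energy-to-depth scale, x = (E0 − ER) cos α_in / S_in. The sketch below uses the 629-keV 18O(p, α)15N resonance from the text, but the stopping-power value is an assumed round number, not a tabulated one.

```python
import math

ER = 629.0          # keV, 18O(p, alpha)15N resonance energy (from the text)
S_in = 3.0          # keV per 100 nm, assumed constant stopping power (illustrative)
alpha_in = 0.0      # normal incidence

def depth_nm(E0_keV):
    """Depth (nm) probed when the beam energy is E0_keV (Eq. 4, constant S_in)."""
    return (E0_keV - ER) * math.cos(alpha_in) / S_in * 100.0

# Stepping the beam energy maps out the depth scale of the excitation curve.
depths = [depth_nm(E) for E in (629.0, 632.0, 635.0)]
```

Each energy step of the accelerator thus corresponds to a depth step; the real depth scale requires integrating the energy-dependent stopping power as in Equation 4.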
PRACTICAL ASPECTS OF THE METHOD

Equipment

The primary equipment used in NRA and PIGE is an electrostatic particle accelerator, which is the source of the high-energy ion beam. The accelerators used for IBA are either Cockcroft-Walton or Van de Graaff type (or some slight variation of them, such as Pelletron accelerators) with terminal voltages in the range of a few million volts. When higher energies are needed or heavy ions have to be accelerated, tandem accelerators are used. In general, the requirements for the accelerators used in NRA and PIGE are similar to those of any other IBA technique. In addition, to perform resonance profiling, the accelerators must have better energy resolution and stability than are necessary for RBS or ERDA and must be capable of changing the beam energy easily. Usually, accelerators with slit-feedback energy stabilization systems are satisfactory. Another NRA-specific requirement arises from the deuteron-induced nuclear reactions. When deuterons are accelerated, neutrons are generated, mainly from the D(d, n)3He reaction. Therefore, if deuteron-induced nuclear reactions are used, additional shielding is necessary to protect personnel against neutrons. Many laboratories using ion beam techniques do not have this shielding. Most NRA and PIGE measurements take place in vacuum, with the exception of a few extracted-beam experiments. The required vacuum, i.e., better than 10^-3 Pa,
depends on the particular application; it usually ranges from 10^-3 to 10^-7 Pa. The main problem caused by bad vacuum is hydrocarbon layer formation on the sample under irradiation, which can interfere with the measurement. Having a load-lock system attached to the scattering chamber makes sample changing convenient and fast. In most cases, standard surface barrier or ion-implanted silicon particle detectors are used to detect charged particles. These detectors usually have an energy resolution of around 10 to 12 keV, which is adequate for NRA. When better energy resolution is required, electrostatic or magnetic spectrometers or time-of-flight (TOF) techniques can be used. In PIGE, NaI, BGO (bismuth germanate), Ge(Li), or HPGe (high-purity germanium) detectors are used to detect γ rays. The NaI and BGO detectors are used with well-isolated resonances when high efficiency is needed (when low concentrations are measured or the cross-section is small). The NaI detectors were used mainly before BGO detectors became available, but currently BGO detectors are preferred since their efficiency is higher for a given size (Kuhn et al., 1990) and they have a better signal-to-background ratio. The Ge(Li) or HPGe detectors are used when there are interfering peaks in the spectrum that cannot be resolved using other detectors. To process the signals from the detectors, standard NIM (nuclear instrument module) or CAMAC (computer-automated measurement and control) electronics modules, such as amplifiers, single-channel analyzers (SCAs), and analog-to-digital converters (ADCs), are used. In most cases, the ADC is on a card in a computer or connected to a computer through some interface (e.g., GPIB, Ethernet). Therefore, the spectrum can be examined while it is being collected and the data can be evaluated immediately after the measurement.
Filtering of Unwanted Particles (NRA)

From Figure 2 it is obvious that not only the reaction products but also all the backscattered particles will reach the detector. Since the Q value of the most frequently used reactions is positive, the backscattered particles will have less energy than the reaction products; therefore, the signal from the backscattered particles will not overlap the reaction products in the spectrum. [The exceptions are (p, α) reactions with low Q values. Due to the higher stopping power of α particles compared with protons, the energy of an α particle coming from a deeper layer can be the same as the energy of a proton coming from another layer.] However, the cross-section of Rutherford scattering is usually much larger than the cross-section of the nuclear reaction we want to use. Therefore, many fewer reaction products than backscattered particles would be detected per unit time. Since every detection system has finite dead time, the number of backscattered particles would limit the maximum incident ion current, and the NRA measurement would take a very long time. To make the measurement time reasonable, the backscattered particles should be filtered out. The simplest method to get rid of the backscattered particles is to use an absorber foil, since the energy of
the reaction products is higher than the energy of the backscattered particles. The energy of the reaction products after the absorber foil is

E_abs(x) = E_out(x) − ∫_0^{x_abs} S_abs(E) dx    (5)
where x_abs and S_abs are the thickness and stopping power of the absorber foil and E_out(x) is from Equation 2. Choosing an absorber foil thickness such that E_abs is larger than the energy range of the backscattered particles will eliminate the unwanted particles. Usually Mylar or aluminum foils are used as absorbers. The main disadvantage of the method is the poor depth resolution due to the large energy spread (straggling) in the absorber foil. There are more sophisticated methods to filter out the backscattered particles. Here we briefly list them without detailed discussion:

1. Electrostatic or magnetic deflection (or the use of electrostatic or magnetic spectrometers) gives much better resolution than the absorber method but is complicated, time consuming, and expensive (spectrometers). For applications see Möller (1978), Möller et al. (1977), and Chaturvedi et al. (1990).

2. The TOF technique, a standard technique used in nuclear physics to distinguish particles, can be used to select only the reaction products. This technique requires sophisticated electronics and a two-dimensional multichannel analyzer.

3. The thin-detector technique is used when proton and α peaks overlap in a spectrum. Since the stopping power of protons is much smaller than that of α particles, the protons lose only a fraction of their energy in a thin detector while the α particles stop completely in it. Thus the α particles are separated from the mix. This technique uses either "dE/dx detectors," which are quite expensive, or low-resistivity detectors with low bias voltage. Recently the use of low-resistivity detectors was studied in detail by Amsel et al. (1992).

The TOF and thin-detector techniques have the disadvantage that although they select the reaction products, the large flux of backscattered particles still reaches and can damage the detector.
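The absorber-foil trade-off of Equation 5 can be sketched with constant (energy-independent) stopping powers, a crude approximation since stopping actually rises as a particle slows; all numbers below are assumed for illustration only.

```python
# Absorber-foil sizing (Eq. 5): the foil must stop the backscattered
# deuterons while passing the higher-energy reaction protons.
E_bs_max = 0.85      # MeV, maximum backscattered deuteron energy (assumed)
E_p = 1.6            # MeV, 16O(d, p1) proton energy (illustrative)
S_d = 0.12           # MeV per um for the deuterons in the foil (assumed, constant)
S_p = 0.04           # MeV per um for the protons in the foil (assumed, constant)

x_min = E_bs_max / S_d            # um, thinnest foil that stops all deuterons
E_p_after = E_p - S_p * x_min     # MeV, proton energy after the foil (Eq. 5)
```

Here a foil of roughly 7 µm stops the deuterons while the protons still reach the detector with well over 1 MeV, at the cost of the straggling-induced loss of depth resolution noted above.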
Energy Scanning (for Resonance Depth Profiling)

To do resonance depth profiling, the energy of the incident ion beam has to be changed in small steps. After acceleration, the ion beam goes through an analyzing magnet that selects the appropriate energy. Generally, the energy is changed by adjusting the terminal voltage and the magnetic field. Automatic energy-scanning systems using electrostatic deflection plates after the analyzing magnet have been developed by Amsel et al. (1983) and Meier and Richter (1990). As the voltage on these plates is varied, the magnetic field is kept constant and the terminal voltage is adjusted by the slit-feedback system. This
system allows energy scanning up to several hundred kilo-electron-volts.
Background, Interference, and Sensitivity

Since NRA and PIGE are both sensitive to specific isotopes, the background, the interference from the sample matrix, and the sensitivity all depend on which nuclear reaction is used. When the nonresonant technique is used, the spectrum is generally background free. Interference from elements in the matrix is possible, especially in the deuteron-induced reaction case. An element-by-element discussion of the possible interferences can be found in Vizkelethy (1995) for NRA and in Hirvonen (1995) for PIGE. In resonant depth profiling, there are two sources of background. One is natural background radiation (present only for PIGE). The effect of this background can be taken into account by careful background measurements or can be minimized by using appropriate active and passive shielding of the detector (Damjantschitsch et al., 1983; Kuhn et al., 1990; Horn and Lanford, 1990). The second background source is present if the resonance is not isolated but sits on top of a nonresonant cross-section. In this case, the background is not constant and depends on the concentration profile of the element we want to measure. To extract a reliable depth profile from the excitation curve, a nonlinear fit to a simulation is necessary. As mentioned above, the sensitivity depends on the cross-section used and the composition of the sample. Generally the sensitivities of NRA and PIGE are on the order of 10 to 100 ppm. Lists of sensitivities for special applications can be found in Bird et al. (1974) and Hirvonen (1995).

Cross-Sections and Q Values

The most important data in NRA and PIGE are the cross-sections, the Q values of the reactions, and the energy level diagrams of nuclei. The most recent compilations of Q values and energy level diagrams can be found for A = 3 in Tilley et al. (1987); A = 5–10 in Ajzenberg-Selove (1988); A = 11, 12 in Ajzenberg-Selove (1990); A = 13–15 in Ajzenberg-Selove (1991); A = 16, 17 in Tilley et al. (1993); A = 18, 19 in Tilley et al. (1995); A = 18–20 in Ajzenberg-Selove (1987); and A = 21–44 in Endt and van der Leun (1978). Data for frequently used cross-sections can be found in Foster et al. (1995), and an extensive list of (p, γ) resonances can be found in Hirvonen and Lappalainen (1995). The data are also available on the Internet (see Internet Resources). The energy levels are available from the Lund Nuclear Data Service (LNDS) and from the Triangle Universities Nuclear Laboratory (TUNL). An extensive database of nuclear cross-sections is available from the National Nuclear Data Center (NNDC) and from the T-2 Nuclear Information Services, although these serve more the needs of nuclear physicists than of people working in the field of IBA. Nuclear reaction cross-sections used in IBA are available from the Sigmabase, together with many (p, p) and (α, α) cross-sections.

Usage of Standards
There are two basic reasons to use standards with NRA and PIGE. First, the energy of the bombarding beam is determined by reading the generating voltmeter (GVM) of the accelerator or by reading the magnetic field of the analyzing magnet. Usually these readings are not absolute and might shift with time. Therefore, in depth profiling it is necessary to use standards to determine the energy of the resonance (with respect to the accelerator readings) and the energy spread of the beam. Second, since the yield is proportional to the collected charge, the cross-section, and the solid angle of the detector, the absolute values of these three quantities must be known precisely to make an absolute measurement. In most cases, these values are not available. Ion bombardment always causes secondary electron emission. Imperfect suppression of the secondary electrons falsifies the current measurement (and therefore the collected charge). It is not easy to design a measurement setup that can measure current with high absolute precision, but high relative precision and reproducibility are easy to achieve. Also, the solid angle of the detector and the absolute cross-section are usually not known precisely. Comparing the yield from a reference target to the yield from the unknown sample can eliminate most of these uncertainties. A good reference target must satisfy the following requirements: (1) have high lateral uniformity; (2) be thick enough to provide sufficient yield in a reasonable time but thin enough not to cause a significant change in the cross-section; (3) be amorphous (to avoid accidental channeling); (4) be stable in air, in vacuum, and under ion bombardment; and (5) be highly reproducible. A detailed discussion of reference targets used in nuclear microanalysis can be found in Amsel and Davies (1983).
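When sample and reference are measured in the same geometry with the same reaction, the comparison reduces to a ratio of charge-normalized yields, which cancels the poorly known solid angle and absolute cross-section. The sketch below illustrates this; all names and numbers are hypothetical.

```python
# Reference-target comparison: Nt = Nt_ref * (Y/Q) / (Y_ref/Q_ref),
# assuming identical geometry and cross-section for both measurements.
Nt_ref = 4.4e17              # O atoms/cm^2 in a Ta2O5 reference (assumed value)
Y_ref, Q_ref = 1200, 10.0    # counts and collected charge (uC) on the reference
Y, Q = 900, 12.0             # counts and collected charge (uC) on the unknown

Nt = Nt_ref * (Y / Q) / (Y_ref / Q_ref)   # O atoms/cm^2 in the unknown sample
```

Only the ratio of the two charge measurements enters, so the high relative precision of current integration mentioned above is exactly what the method needs.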
METHOD AUTOMATION

Automation of the acquisition of NRA spectra is the same as in RBS. The only task that can be automated is changing the sample. Since most measurements are done in high vacuum, some mechanism is needed to change samples without breaking the vacuum. If the scattering chamber is equipped with a load-lock mechanism, the sample change is easy and convenient and does not require automation. Without a load lock, large sample holders that can hold several samples are usually used. In this case, stepping motors connected to a computer can be used to move from one sample to another. In PIGE or resonant depth profiling, an automatic energy scanning system synchronized with the data acquisition electronics is desirable. Sophisticated energy scanning systems have been developed by Amsel et al. (1983) and by Meier and Richter (1990). An alternative to these methods is complete computer control of the accelerator and the analyzing magnet.
ION-BEAM TECHNIQUES
DATA ANALYSIS AND INITIAL INTERPRETATION

Overall Surface Content (Film Thickness)

As mentioned above, when the cross-section can be considered flat across the thickness of the film, the overall surface content can be determined easily. Figure 5 shows a spectrum from an 834-keV deuteron beam incident on 100 nm SiO2 on bulk Si. The 16O(d, p)17O reaction has two proton groups that show up as two separate peaks in the spectrum. Since the (d, p1) cross-section has a plateau around 850 keV, we need to consider only the p1 peak. Using Equation 1, we can easily calculate the number of oxygen atoms per square centimeter. In most cases reference standards are used (thin Ta2O5 layers on Ta backing or a thin SiO2 layer on Si); therefore, exact knowledge of the detector solid angle, the cross-section, and the absolute value of the collected charge is not necessary. If the reference sample contains Nt_ref oxygen atoms per square centimeter, the number of oxygen atoms per square centimeter in the unknown sample is

$$N_t = N_t^{\mathrm{ref}}\, \frac{Y\, Q_C^{\mathrm{ref}}}{Y_{\mathrm{ref}}\, Q_C} \qquad (6)$$

where $Q_C^{\mathrm{ref}}$ and $Q_C$ are the collected charges and $Y_{\mathrm{ref}}$ and $Y$ are the p1 peak areas for the reference standard and for the unknown sample, respectively.

Figure 5. Measured spectrum from an 834-keV deuteron beam on 100 nm SiO2 on Si.

There are two peaks in the spectrum that need further explanation. The peak at higher energies is the result of the 12C(d, p)13C reaction from the hydrocarbon layer deposited during ion bombardment. The broad peak between the p0 and p1 peaks is due to the D(d, p)T reaction. Since this particular sample was used as a reference standard, a considerable amount of deuterium had been implanted into it during the measurements and is now a detectable component of the target.

Nonresonant Depth Profiling

When the cross-section cannot be considered constant, the spectrum becomes the convolution of the concentration profile with the energy broadening of the incident ions and of the detected reaction particles. The interpretation is then not as simple. The energy-depth conversion can be calculated using Equations 2 and 3. If the cross-section changes only slowly, the spectrum shows the main features of the concentration profile, but this cannot be considered quantitative analysis. To extract the concentration profile from the spectrum, the spectrum has to be simulated and the assumed concentration profile changed until acceptable agreement between the simulation and the measurement is reached. Several computer programs are available that can do these simulations and fits, e.g., ANALRA (Johnston, 1993) and SENRAS (Vizkelethy, 1990). These programs are available from Sigmabase (see Internet Resources).

Resonant Depth Profiling

In resonant depth profiling, the spectrum is usually not recorded; instead, the particles or γ photons are counted in an energy window. The data that carry the depth profile information are the excitation function, i.e., the number of counts vs. the incident energy. The excitation function is the convolution of the resonance cross-section, the energy profile of the incident beam as a function of depth, and the concentration profile. Qualitative features of the concentration profile can be deduced from the excitation function. Quantitative concentration profiles can be obtained by simulating the excitation function and using least-squares fitting to match it to the measured excitation function. [A more detailed discussion can be found in Hirvonen (1995).] Several computer programs have also been developed for profiling purposes, most recently in Smulders (1986), Lappalainen (1986), and Rickards (1991). The SPACES program (Vickridge and Amsel, 1990) was developed especially for high-resolution surface studies using very narrow resonances. This program is also available from Sigmabase. A simple example is shown in Figure 6. In this experiment an 18O profile was measured in YBaCuO using the already mentioned 18O(p, α)15N reaction. From the spectrum

Figure 6. Excitation function of the 18O(p, α)15N reaction measured on an 18O-enriched YBaCuO sample (Cheang-Wong et al., 1992).
NUCLEAR REACTION ANALYSIS AND PROTON-INDUCED GAMMA RAY EMISSION
we can deduce qualitatively that there is an 18O enrichment at the surface. Using Equation 6, we can estimate the thickness of the enriched region (the width of the peak divided by the stopping power of protons around the resonance energy), which is about 20 nm. Simulation of the excitation curve with a nonlinear fit to the measured excitation function gives a 15-nm-thick layer enriched to 30% 18O, with a 3.5% constant volume enrichment. A recent review of computer simulation methods in IBA can be found in Vizkelethy (1994).

SPECIMEN MODIFICATION

Although NRA and PIGE are nondestructive techniques, certain modifications of the samples can occur due to the so-called beam effect. Since a high-energy ion beam bombards the sample surface during the measurement, considerable heat transfer occurs. A 1-MeV ion beam with a 1-µA/cm² current (not unusual in NRA and PIGE) deposits 1 W/cm² of heat in the sample. This can be a serious problem with biological samples, and the measurement can lead to the destruction of the sample. In samples with good heat conduction, the heat does not cause a big problem, but diffusion and slight structural changes can occur. Samples with poor heat conduction will suffer local overheating that can induce considerable changes in the composition of the sample. Apart from the heat delivered to the sample, the large measuring doses can cause considerable damage to single-crystal samples. This is especially significant with heavy-ion beams, such as the 15N beam used to profile hydrogen via the 1H(15N, αγ)12C reaction; in this case significant hydrogen loss can occur during the analysis. Heavy ions can also cause ion beam mixing and sputtering. To avoid these problems, the beam should be spread out over a large area of the sample if high current must be used, or the current should be kept at the minimum acceptable level.
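The heat-load figure quoted above follows from multiplying the beam energy by the particle current density; for singly charged ions the electron charges cancel, so 1 MeV at 1 µA/cm² gives exactly 1 W/cm². A minimal sketch (the function name and example values are illustrative):

```python
def beam_heat_load(energy_mev, current_ua_per_cm2, charge_state=1):
    """Power density (W/cm^2) deposited by an ion beam.
    Each microampere of charge corresponds to 1e-6 / (q * 1.602e-19) ions
    per second, each depositing energy_mev * 1.602e-13 J, so the electron
    charges cancel and P [W/cm^2] = E [MeV] * I [uA/cm^2] / charge_state."""
    return energy_mev * current_ua_per_cm2 / charge_state

# The example from the text: a 1-MeV beam at 1 uA/cm^2
p = beam_heat_load(1.0, 1.0)
```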
PROBLEMS

Several factors can lead to false results. Most of them can be minimized with careful design of the measurement. A frequently encountered problem is the inability to measure the beam current precisely. The secondary electrons emitted from the sample can falsify the current measurement and thus the measurement of the collected charge. To minimize the effect of secondary electron emission, several techniques can be used:

1. A suppression cage around the sample holder will minimize the number of escaping electrons, but it is not possible to build a perfect suppression cage. A much simpler technique is to apply a positive bias to the sample, but this alternative cannot be used for every sample.

2. Isolating the scattering chamber from the rest of the beamline and using the whole chamber as a Faraday cup is also a good solution, but in many cases the chamber
is connected electrically to other equipment that cannot be isolated.

3. There are different transmission Faraday cup designs in which the current is measured not on the sample but in a separate Faraday cup that intercepts the beam in front of the sample several times per second.

4. As described above (see Practical Aspects of the Method), standards can be used, so that only relative (reproducible) charge measurements are needed.

Another problem is hydrocarbon layer deposition during the measurement. This can cause serious problems when narrow resonances are used for depth profiling: the incident ions lose energy in a hydrocarbon layer of unknown thickness, which can lead to a false depth scale if the energy loss in the layer is significant. Since hydrocarbon layer formation is slower in higher vacuum, the best way to minimize its effect is to maintain good vacuum (see GENERAL VACUUM TECHNIQUES). Cold traps in the beamline and in the scattering chamber can significantly reduce hydrocarbon layer formation. Another way to minimize the effect of the carbon layer in resonance profiling (when many measurements have to be made on the same sample) is to move the beam spot frequently to a fresh area.

Interference from the elements of the matrix can be another problem. This is a particular danger in deuteron-induced reactions. Many light elements have (d, p) or (d, α) reactions, and the protons and α particles can overlap in the spectrum. Careful study of the kinematics and the choice of an appropriate absorber foil thickness can solve this problem, although in certain cases the interference cannot be avoided. In PIGE, when there are interfering reactions, the use of a Ge(Li) or HPGe detector can help. The better resolution of germanium detectors allows separation of the γ peaks from the different reactions, but at the cost of lower efficiency.

A special problem arises in single-crystal samples: accidental channeling.
When the direction of the incident ions coincides with one of the main axes of the crystal, the ions are steered into these channels and the reaction yield becomes lower than it would otherwise be. (There is also a difference between the energy loss of ions in a channel and in a random direction, which can affect the energy-to-depth conversion.) The solution is to tilt the sample 5° to 7° with respect to the direction of the beam. (There are cases in which channeling is preferred: using NRA or PIGE combined with channeling, the lattice location of atoms can be determined.)

Electrically nonconducting samples will charge up under ion bombardment to voltages of several kilovolts, and this can modify the beam energy. This can be especially detrimental when narrow (few hundred electron volts wide) resonances are used for depth profiling. There are several ways to avoid surface charging: (1) a thin conducting coating (usually carbon) on the surface of the sample can reduce the charging significantly; (2) supplying low-energy electrons from a hot filament will neutralize the accumulated positive charge on the sample surface.
A problem can also arise from careless use of the fitting programs. As in any deconvolution problem, there is usually no unique solution determined by the measured yield curve alone. (One example is the fact that these deconvolutions tend to give an oscillating depth profile, which is in most cases obviously incorrect.) Extra boundary conditions and assumptions about the profile are needed to make the solution acceptable.
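One common way to impose such extra assumptions, shown here only as an illustration and not as the method of any of the cited programs, is to solve the discretized deconvolution as a non-negative least-squares problem with a smoothness penalty:

```python
import numpy as np
from scipy.optimize import nnls

def smooth_nonneg_profile(response, yields, lam=1.0):
    """Recover a depth profile c >= 0 from measured yields ~ response @ c.
    A second-difference (Tikhonov) penalty with weight lam suppresses the
    oscillating solutions that unconstrained deconvolution tends to give;
    lam is a hypothetical tuning parameter chosen per data set."""
    n = response.shape[1]
    d2 = np.diff(np.eye(n), n=2, axis=0)              # second-difference operator
    a_aug = np.vstack([response, np.sqrt(lam) * d2])  # augmented least-squares system
    y_aug = np.concatenate([yields, np.zeros(d2.shape[0])])
    profile, _ = nnls(a_aug, y_aug)
    return profile

# Synthetic check: Gaussian instrumental broadening acting on a step profile
x = np.arange(50)
response = np.exp(-0.5 * ((x[:, None] - x[None, :]) / 3.0) ** 2)
true_profile = (x < 20).astype(float)
c = smooth_nonneg_profile(response, response @ true_profile, lam=0.1)
```

The non-negativity constraint and the roughness penalty play the role of the "extra boundary conditions" mentioned above: they select a physically plausible solution from the many profiles that reproduce the measured yields equally well.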
LITERATURE CITED

Ajzenberg-Selove, F. 1987. Energy levels of light nuclei A = 18-20. Nucl. Phys. A475:1-198.

Ajzenberg-Selove, F. 1988. Energy levels of light nuclei A = 5-10. Nucl. Phys. A490:1.

Ajzenberg-Selove, F. 1990. Energy levels of light nuclei A = 11-12. Nucl. Phys. A506:1-158.

Ajzenberg-Selove, F. 1991. Energy levels of light nuclei A = 13-15. Nucl. Phys. A523:1-196.

Amsel, G., d'Artemare, E., and Girard, E. 1983. A simple, digitally controlled, automatic, hysteresis free, high precision energy scanning system for van de Graaff type accelerators. Nucl. Instrum. Methods 205:5-26.

Amsel, G. and Davies, J. A. 1983. Precision standard reference targets for microanalysis with nuclear reactions. Nucl. Instrum. Methods 218:177-182.

Amsel, G., Pászti, F., Szilágyi, E., and Gyulai, J. 1992. p, d, and α particle discrimination in NRA: Thin, adjustable sensitive zone semiconductor detectors revisited. Nucl. Instrum. Methods B63:421-433.

Amsel, G. and Samuel, D. 1967. Microanalysis of the stable isotopes of oxygen by means of nuclear reactions. Anal. Chem. 39:1689-1698.

Bird, J. R., Campbell, B. L., and Price, P. B. 1974. Prompt nuclear analysis. Atomic Energy Rev. 12:275-342.

Blatt, J. M. and Weisskopf, V. 1994. Theoretical Nuclear Physics. Dover Publications, New York.

Chaturvedi, U. K., Steiner, U., Zak, O., Krausch, G., Shatz, G., and Klein, L. 1990. Structure at polymer interfaces determined by high-resolution nuclear reaction analysis. Appl. Phys. Lett. 56:1228-1230.

Cheang-Wong, J. C., Ortega, C., Siejka, J., Trimaille, I., Sacuto, A., Balkanski, M., and Vizkelethy, G. 1992. RBS analysis of thin amorphous YBaCuO films: Comparison with direct determination of oxygen contents by NRA. Nucl. Instrum. Methods B64:169-173.

Damjantschitsch, H., Weiser, G., Heusser, G., Kalbitzer, S., and Mannsperger, H. 1983. An in-beam-line low-level system for nuclear reaction γ-rays. Nucl. Instrum. Methods 218:129-140.

Davies, J. A., Lennard, W. N., and Mitchell, I. V. 1995. Pitfalls in ion beam analysis. In Handbook of Modern Ion Beam Materials Analysis (J. R. Tesmer and M. Nastasi, eds.). pp. 343-363. Materials Research Society, Pittsburgh, Pa.

Endt, P. M. and van der Leun, C. 1978. Energy levels of A = 21-44 nuclei. Nucl. Phys. A310:1-752.

Foster, L., Vizkelethy, G., Lee, M., Tesmer, J. R., and Nastasi, M. 1995. Particle-particle nuclear reaction cross sections. In Handbook of Modern Ion Beam Materials Analysis (J. R. Tesmer and M. Nastasi, eds.). pp. 549-572. Materials Research Society, Pittsburgh, Pa.

Green, P. F. and Doyle, B. L. 1986. Silicon elastic recoil detection studies of polymer diffusion: Advantages and disadvantages. Nucl. Instrum. Methods B18:64-70.

Hirvonen, J.-P. 1995. Nuclear reaction analysis: Particle-gamma reactions. In Handbook of Modern Ion Beam Materials Analysis (J. R. Tesmer and M. Nastasi, eds.). pp. 167-192. Materials Research Society, Pittsburgh, Pa.

Hirvonen, J.-P. and Lappalainen, R. 1995. Particle-gamma data. In Handbook of Modern Ion Beam Materials Analysis (J. R. Tesmer and M. Nastasi, eds.). pp. 573-613. Materials Research Society, Pittsburgh, Pa.

Horn, K. M. and Lanford, W. A. 1990. Suppression of background radiation in BGO and NaI detectors used in nuclear reaction analysis. Nucl. Instrum. Methods B45:256-259.

Jarjis, R. A. 1979. Internal Report, University of Manchester, U.K.

Johnston, P. N. 1993. ANALRA—charged particle nuclear analysis software for the IBM PC. Nucl. Instrum. Methods B79:506-508.

Kuhn, D., Rauch, F., and Baumann, H. 1990. A low-background detection system using a BGO detector for sensitive hydrogen analysis with the 1H(15N, αγ)12C reaction. Nucl. Instrum. Methods B45:252-255.

Lappalainen, R. 1986. Application of the NRB method in range, diffusion, and lifetime measurements. Ph.D. Thesis, University of Helsinki, Finland.

Maurel, B. 1980. Stochastic Theory of Fast Charged Particle Energy Loss. Application to Resonance Yield Curves and Depth Profiling. Ph.D. Thesis, University of Paris, France.

Maurel, B., Amsel, G., and Nadai, J. P. 1982. Depth profiling with narrow resonances of nuclear reactions: Theory and experimental use. Nucl. Instrum. Methods 197:1-14.

Meier, J. H. and Richter, F. W. 1990. A useful device for scanning the beam energy of a van de Graaff accelerator. Nucl. Instrum. Methods B47:303-306.

Möller, W. 1978. Background reduction in D(3He, α)H depth profiling experiments using a simple electrostatic deflector. Nucl. Instrum. Methods 157:223-227.

Möller, W., Hufschmidt, M., and Kamke, D. 1977. Large depth profile measurements of D, 3He, and 6Li by deuterium induced nuclear reactions. Nucl. Instrum. Methods 140:157-165.

Rickards, J. 1991. Fluorine studies with a small accelerator. Nucl. Instrum. Methods B56/57:812-815.

Simpson, J. C. B. and Earwaker, L. G. 1984. A computer simulation of nuclear reaction spectra with applications in analysis and depth profiling of light elements. Vacuum 34:899-902.

Smulders, P. J. M. 1986. A deconvolution technique with smooth, non-negative results. Nucl. Instrum. Methods B14:234-239.

Tilley, D. R., Weller, H. R., and Hasan, H. H. 1987. Energy levels of light nuclei A = 3. Nucl. Phys. A474:1-60.

Tilley, D. R., Weller, H. R., and Cheves, C. M. 1993. Energy levels of light nuclei A = 16-17. Nucl. Phys. A564:1-183.

Tilley, D. R., Weller, H. R., Cheves, C. M., and Chasteler, R. M. 1995. Energy levels of light nuclei A = 18-19. Nucl. Phys. A595:1-170.

Vickridge, I. C. 1990. Stochastic Theory of Fast Ion Energy Loss and Its Application to Depth Profiling Using Narrow Nuclear Resonances. Applications in Stable Isotope Tracing Experiments for Materials Science. Ph.D. Thesis, University of Paris, France.

Vickridge, I. C. and Amsel, G. 1990. SPACES: A PC implementation of the stochastic theory of energy loss for narrow resonance profiling. Nucl. Instrum. Methods B45:6-11.

Vizkelethy, G. 1990. Simulation and evaluation of nuclear reaction spectra. Nucl. Instrum. Methods B45:1-5.

Vizkelethy, G. 1994. Computer simulation of ion beam methods in analysis of thin films. Nucl. Instrum. Methods B89:122-130.

Vizkelethy, G. 1995. Nuclear reaction analysis: Particle-particle reactions. In Handbook of Modern Ion Beam Materials Analysis (J. R. Tesmer and M. Nastasi, eds.). pp. 139-165. Materials Research Society, Pittsburgh, Pa.
KEY REFERENCES

Amsel, G. and Lanford, W. A. 1984. Nuclear reaction technique in materials analysis. Ann. Rev. Nucl. Part. Sci. 34:435-460.

Excellent review paper of NRA.

Deconnick, G. 1978. Introduction to Radioanalytical Chemistry. Elsevier, Amsterdam.

Excellent reviews of the methods.

Feldman, L. C. and Mayer, J. W. 1986. Fundamentals of Surface and Thin Film Analysis. North-Holland Publishing, New York.

See the chapter on "Nuclear techniques: Activation analysis and prompt radiation analysis," pp. 283-310. Good discussion of the fundamental physics involved in NRA.

Hirvonen, 1995. See above.

Detailed discussion of PIGE with several worked-out examples.

Lanford, W. A. 1995. Nuclear reactions for hydrogen analysis. In Handbook of Modern Ion Beam Materials Analysis (J. R. Tesmer and M. Nastasi, eds.). pp. 193-204. Materials Research Society, Pittsburgh, Pa.

Detailed discussion of hydrogen detection with NRA.

Peisach, M. 1992. Nuclear reaction analysis. In Elemental Analysis by Particle Accelerators (Z. B. Alfassi and M. Peisach, eds.). pp. 351-383. CRC Press, Boca Raton, Fla.

General discussion of NRA with valuable references.

Vizkelethy, 1995. See above.

Detailed discussion of NRA with worked-out examples and discussion of useful reactions.

INTERNET RESOURCES

Lund Nuclear Data Service, http://nucleardata.nuclear.lu.se/nucleardata
National Nuclear Data Center, http://www.nndc.bnl.gov/
Sigmabase, http://ibaserver.physics.isu.edu/sigmabase
T-2 Nuclear Information Service, http://t2.lanl.gov/
Triangle Universities Nuclear Laboratory, http://www.tunl.duke.edu/NuclData

APPENDIX

Energy Relations in Nuclear Reactions

In a charged particle reaction, the energies of the emitted particles in the laboratory coordinate system are (see Fig. 7)

$$E_{\mathrm{light}} = B\left(\cos\psi \pm \sqrt{\frac{D}{B} - \sin^2\psi}\right)^{\!2} (E_0 + Q) \qquad (7)$$

$$E_{\mathrm{heavy}} = A\left(\cos\varphi \pm \sqrt{\frac{C}{A} - \sin^2\varphi}\right)^{\!2} (E_0 + Q) \qquad (8)$$

where E0 is the energy of the incident ions, Q is the Q value of the reaction, and A, B, C, and D are given by

$$A = \frac{M_1 M_4}{(M_1+M_2)(M_3+M_4)}\,\frac{E_0}{E_0+Q} \qquad B = \frac{M_1 M_3}{(M_1+M_2)(M_3+M_4)}\,\frac{E_0}{E_0+Q}$$

$$C = \frac{M_2 M_3}{(M_1+M_2)(M_3+M_4)}\left(1 + \frac{M_1 Q}{M_2 (E_0+Q)}\right) \qquad D = \frac{M_2 M_4}{(M_1+M_2)(M_3+M_4)}\left(1 + \frac{M_1 Q}{M_2 (E_0+Q)}\right) \qquad (9)$$

where M1, M2, M3, and M4 are the masses of the incident ion, the target nucleus, the lighter reaction product, and the heavier reaction product, respectively. In Equation 7 only the plus sign is used unless B > D, in which case $\psi_{\max} = \sin^{-1}\sqrt{D/B}$; in Equation 8 only the plus sign is used unless A > C, in which case $\varphi_{\max} = \sin^{-1}\sqrt{C/A}$. The emitted particle then loses energy along its path until it leaves the target. The energy loss of the ions inward and outward is described by Equations 2 and 3.

Figure 7. Reaction kinematics.

Energy Spread (Depth Resolution)

Equations 2 and 3 are true only on average. Since the energy loss is a stochastic process, the ions, after passing through a certain thickness, will have an energy distribution rather than a sharp energy. This phenomenon is called straggling. There are several models that describe energy straggling; in most cases the simplest one, Bohr straggling, is satisfactory. [One exception is the case of very narrow resonances, which is described in detail in Maurel (1980) and Vickridge (1990).] There are other factors that cause broadening of the energy in the detected spectrum. The following contribute to the energy spread of detected particles in NRA: (1) initial energy spread of the incident beam, (2) straggling of the incident ions, (3) multiple scattering, (4) geometric spread due to the finite acceptance angle of the detector and to the finite beam size, (5) straggling of the outgoing particles, (6) energy resolution of the detector, and (7) energy straggling in the absorber foil. A detailed treatment of most of these factors can be found in Vizkelethy (1994). In PIGE and resonant depth profiling with NRA, only a few of these contribute to the depth resolution. Since the energy of the detected particles or γ
rays is not used to extract concentration profile information, only the initial energy spread, the straggling of the incident beam, and the resonance width determine the depth resolution. (The contribution from multiple scattering is negligible since near-normal incidence is used.)

Calculation of the Excitation Function for Resonance Depth Profiling

The excitation function at incident energy E0 is

$$Y(E_0) \propto \sigma(E_0) * h(E_0) * \int_0^\infty c(x)\, g(E_0; x)\, dx \qquad (10)$$

where σ(E) is the reaction cross-section, h(E) is the initial energy distribution of the ion beam, c(x) is the concentration profile of the atoms to be detected, g(E; x) is the probability that an ion loses energy E when penetrating a thickness x, and the asterisk denotes convolution. [This convolution is a very complicated calculation; for details see Maurel (1980) and Vickridge (1990).] When the resonance is not extremely narrow, the initial energy spread of the ions is not too small, and the profile is not measured very close to the surface (i.e., the energy loss is large enough that the straggling can be considered Gaussian), the function g(E; x) can be approximated by a Gaussian:

$$g(E; x) = \frac{1}{\sqrt{2\pi s^2(x)}}\, e^{-[E - E(x)]^2 / 2 s^2(x)} \qquad (11)$$

where E(x) is the same as Ein(x) in Equation 3 and s(x) is the root-mean-square energy straggling at x. If the resonance cross-section is given by a Breit-Wigner resonance (Blatt and Weisskopf, 1994),

$$\sigma(E) = \sigma_0\, \frac{\Gamma^2/4}{\Gamma^2/4 + (E - E_R)^2} \qquad (12)$$

where σ0 is the strength, Γ is the width, and ER is the energy of the resonance, then the straggling, the Doppler broadening, and the initial energy spread can be combined into a single Gaussian with depth-dependent width. Evaluating Equation 10, the excitation function is

$$Y(E_0) \propto \int_0^\infty \frac{c(x)\, \sigma_0\, \Gamma \sqrt{\pi}}{\sqrt{8}\, S(x)}\, \mathrm{Re}\, w\!\left(\frac{E_0 - E_R}{\sqrt{2}\, S(x)} + i\,\frac{\Gamma}{\sqrt{8}\, S(x)}\right) dx \qquad (13)$$

where Re w(z) is the real part of the complex error function and S(x) is the width of the combined Gaussian.

GYÖRGY VIZKELETHY
Idaho State University
Pocatello, Idaho
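A numerical sketch of Equation 13 is shown below, using scipy.special.wofz for the complex error function w(z). Everything here is illustrative: the depth grid, the constant stopping power used to map depth to mean beam energy (a simplification of Equation 3), and all parameter values are hypothetical.

```python
import numpy as np
from scipy.special import wofz  # Faddeeva function w(z); Re w gives the Voigt shape

def excitation_function(e0_values, x, c, s, sigma0, gamma, e_res, stopping):
    """Relative yield vs. incident energy E0 for resonance depth profiling
    (Equation 13).  x: uniform depth grid (nm); c: concentration profile;
    s: combined Gaussian width S(x) (keV); sigma0, gamma, e_res: resonance
    strength, width, and energy (keV); stopping: assumed-constant stopping
    power (keV/nm), so the mean beam energy at depth x is E0 - stopping*x."""
    dx = x[1] - x[0]
    yields = []
    for e0 in e0_values:
        e_mean = e0 - stopping * x                     # mean ion energy at depth x
        z = (e_mean - e_res) / (np.sqrt(2) * s) + 1j * gamma / (np.sqrt(8) * s)
        integrand = c * sigma0 * gamma * np.sqrt(np.pi) / (np.sqrt(8) * s) * wofz(z).real
        yields.append(np.sum(integrand) * dx)          # simple Riemann sum over depth
    return np.array(yields)

# Hypothetical surface-enriched profile scanned through a 629-keV resonance
x = np.linspace(0.0, 100.0, 401)          # depth (nm)
c = np.exp(-x / 15.0)                     # enrichment decaying over ~15 nm
s = 1.0 + 0.05 * x                        # combined width grows with depth (keV)
y = excitation_function(np.linspace(620.0, 650.0, 61), x, c, s,
                        sigma0=1.0, gamma=0.1, e_res=629.0, stopping=0.4)
```

Scanning E0 above the resonance energy moves the resonant depth deeper into the sample, so the measured curve maps out the concentration profile smeared by S(x), exactly the behavior the least-squares fitting programs invert.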
PARTICLE-INDUCED X-RAY EMISSION

INTRODUCTION

Particle-induced x-ray emission (PIXE) is an elemental analysis technique that employs a mega-electron-volt energy beam of charged particles from a small electrostatic
accelerator to induce characteristic x-ray emission from the inner shells of atoms in the specimen. The accelerator can be single-ended or tandem. Most PIXE work is done with 2- to 4-MeV proton beams carrying currents of a few nanoamperes within a beam whose diameter is typically a few millimeters; a small amount of work with deuteron and helium beams has been reported, and the use of heavier ion beams has been explored. The emitted x rays are nearly always detected in the energy-dispersive mode using a Si(Li) spectrometer. Although wavelength-dispersive x-ray spectrometry would provide superior energy resolution, its low geometric efficiency is a significant disadvantage for PIXE, in which potential specimen damage by heating limits the beam current and therefore the emitted x-ray intensity. In the proton microprobe, magnetic or electrostatic quadrupole lenses are employed to focus the beam to micrometer spot size; this makes it possible to perform micro-PIXE analysis of very small features and also, by sweeping this microbeam along a preselected line or over an area of the specimen, to determine element distributions in one or two dimensions in a fully quantitative manner. Most proton microprobes realize a beam spot size smaller than a few micrometers at a beam current of at least 0.1 nA. In trace element analysis or imaging by micro-PIXE, the high efficiency of a Si(Li) detector, placed as close to the specimen as possible, is again mandatory for x-ray collection. To date, PIXE and micro-PIXE have most often been used to conduct trace element analysis in specimens whose major element composition is already known or has been measured by other means, such as electron microprobe analysis. However, both variants are capable of providing major element analysis for elements of atomic number down to Z = 11.
Competing Methods

In the absence of beam focusing, conventional PIXE is an alternative to x-ray fluorescence analysis (XRFA; see X-RAY MICROPROBE FOR FLUORESCENCE AND DIFFRACTION ANALYSIS) but requires more complex equipment. Micro-PIXE is a powerful and seamless complement to electron microprobe analysis (EPMA), a longer-established microbeam x-ray emission method based on excitation by kilo-electron-volt electron beams; detection limits of a few hundred parts per million in EPMA compare with limits of a few parts per million in micro-PIXE. While the electron microprobe is ubiquitous, there are only about 50 proton microprobes around the world, but many of these can provide beam time on short notice. Conducted with the highly polarized x radiation from a synchrotron storage ring equipped with insertion devices such as wigglers and undulators, XRFA now provides a strong competitor to micro-PIXE in terms of both spatial resolution (a few micrometers) and detection limits (sub-parts per million). The x-ray spectrum from an undulator is highly collimated, extremely intense, and nearly monochromatic; the exciting x-ray energy can be tuned to lie just above the absorption edge energy of the element of greatest interest, where the photoabsorption cross-section is highest, thereby optimizing the detection limit. However, there are only a handful of synchrotron facilities, and beam time is at a premium. A
much cheaper method of obtaining an x-ray microbeam is through the use of a conventional fine-focus x-ray tube and a nonimaging optical system based upon focusing capillaries (Rindby, 1993). Total-reflection XRFA (Klockenkamper et al., 1992) offers similar detection limits, but this method is restricted to very thin films deposited on a totally reflecting substrate and lacks the versatility of PIXE and XRFA as regards different specimen types. One of PIXE's merits relative to XRFA is the possibility of deploying other ion beam analysis techniques simultaneously or sequentially; these include Rutherford backscattering spectroscopy (see MEDIUM-ENERGY BACKSCATTERING AND FORWARD-RECOIL SPECTROMETRY and HEAVY-ION BACKSCATTERING SPECTROMETRY), nuclear reaction analysis (NUCLEAR REACTION ANALYSIS AND PROTON-INDUCED GAMMA RAY EMISSION), ionoluminescence, and scanning transmission ion microscopy (STIM). Optical spectroscopy methods based on emission (AES—atomic emission spectroscopy) and absorption (AAS—atomic absorption spectroscopy) are the workhorses of trace element analysis in materials that can be reduced to liquid form or dissolved in a liquid and then atomized. The inductively coupled plasma (ICP) has become the method of choice for inducing light emission, and ICP-AES attains detection limits of 1 to 30 ng/mL of liquid; the limits referred to the original material depend on the dilution factor and typically reach down to 0.1 ppm. Interference effects among optical lines are more complex than is the case with x-ray lines in PIXE. Use of mass spectrometry (ICP-MS) reduces interferences, provides more complete elemental coverage than ICP-AES, and has much less element-to-element variation in sensitivity than does ICP-AES; its matrix effects, however, are less straightforward to handle than those of PIXE.
Overall, PIXE is more versatile as regards specimen type and preparation than ICP-AES and ICP-MS and has the advantages of straightforward matrix correction, smoothly varying sensitivity with atomic number, and high accuracy. But for conventional bulk analysis, the highly developed optical and mass spectrometry methods will frequently be more accessible and entirely satisfactory and provide excellent detection limit without the need for an accelerator. For particular specimen types such as aerosol particulates and awkwardly shaped specimens (e.g., in archaeology) requiring nondestructive analysis, PIXE has unique advantages. The ability to handle awkwardly shaped specimens by extracting the proton beam through a thin window into the laboratory milieu is particularly valuable in archaeometric applications. The nondestructive nature, high spatial resolution, parts per million detection limits, and high accuracy of micro-PIXE together with the simultaneous deployment of STIM and backscattering make it very competitive with other microprobe methods. It has been applied intensively in the analysis of individual mineral grains and in zoning phenomena within these, individual fly-ash particles, single biological cells, and thin tissue slices containing features such as Alzheimer’s plaques (Johansson et al., 1995). Secondary ion mass spectrometry (SIMS) often provides better detection limits but is destructive. The combination of laser ablation microprobe with inductively
coupled plasma (ICP) excitation and mass spectrometry offers resolution of some tens of micrometers and detection limits as low as 0.5 ppm; however, the ablation basis of this method renders matrix effects and standardization more complex than in micro-PIXE. Detailed accounts of the fundamentals and of many applications can be found in two recent books on PIXE (Johansson and Campbell, 1988; Johansson et al., 1995), the first of which provides considerable historical perspective. The present offering is an overview, including some typical applications. The proceedings of the triennial international conferences on PIXE provide both an excellent historical picture and an account of recent developments and new applications.
PRINCIPLES OF THE METHOD

The most general relationship between element concentrations in a specimen and the intensity of detected x rays of each element is derived for the simple case of K x rays (see Appendix A). The L and M x rays may be dealt with in similar manner, with the added complexity that in these cases ionization occurs in three and five subshells, respectively, and there is the probability of vacancies being transferred by the Coster-Kronig effect to higher subshells prior to the x-ray emission occurring. In principle, Equation 6 could be used to perform standardless analysis; this would involve a complete reliance on the underlying theory and the database and on the assumed properties of the x-ray detector. At the other extreme, by employing standards that very closely mimic the major element (matrix) composition of a specimen, the PIXE analyst working on trace elements could completely avoid dependence upon theory and rely only upon calibration curves relating x-ray yield to concentration. In practice, there is much merit in adopting an intermediate stance. We recommend a fundamental parameter approach using the well-understood theories of the interaction of charged particle beams and photon beams with matter and of characteristic x-ray emission from inner shell vacancies but also relying on a small set of standards that need bear only limited resemblance to the specimens at hand. In this approach the equation used to derive element concentrations from measured x-ray intensities is

Y(Z) = Y1(Z) H εZ tZ CZ Q    (1)
where Y(Z) is the measured intensity of x rays in the principal line of element Z (concentration CZ), Y1(Z) is the theoretically computed intensity per unit beam charge per unit detector solid angle per unit concentration, H is an instrumental constant that subsumes the detector solid angle and any calibration factor required to convert an indirect measurement of beam charge to units of microcoulombs, Q is the direct or indirect measure of beam charge, tZ is the x-ray transmission fraction through any absorbers (see Practical Aspects of the Method) that are deliberately interposed between specimen and detector, and εZ is the detector's intrinsic efficiency. It is also straightforward to
calculate the contribution of secondary x rays fluoresced when proton-induced x rays are absorbed in the specimen and to correct for these. The various atomic physics quantities required are obtained from databases that have been developed by fitting appropriately parameterized expressions (in part based on theory and in part semiempirical) to either compilations of experimental data or theoretically generated values. The H value is determined via Equation 1 by use of appropriate standards or standard reference materials. It is implicit in the direct use of Equation 1 that the major elements have already been identified and that their concentrations are known a priori. This permits computation of the integrals that involve matrix effects (slowing down of the protons and attenuation of the x rays); Equation 1 then provides the trace element concentrations. However, if the x-ray lines of the major elements are observed in the spectrum, Equation 1 may be solved in an iterative manner to provide both major and trace element concentrations. The method is fully quantitative, and its accuracy is determined in part by the accuracy of the database. The theoretically computed x-ray intensity uses ionization cross-sections and stopping powers for protons, x-ray mass attenuation coefficients, x-ray fluorescence yields and Coster-Kronig probabilities for the various subshells, and relative intensities of the lines within each x-ray series (e.g., K, L1, . . .) emitted by a given element. For ionization cross-sections, the choice is among the hydrogenic model calculations of Cohen and Harrigan (1985, 1989) or of Liu and Cipolla (1996) based on the so-called ECPSSR model of Brandt and Lapicki (1981), the equivalent but more sophisticated ECPSSR treatment of Chen and Crasemann (1985, 1989) based on self-consistent field wave functions, or experimental cross-section compilations such as those of Paul and Sacher (1989) for the K shell and Orlic et al. (1994) for the L subshells. 
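Equation 1 is linear in the concentration, so once the theoretical yield, instrumental constant, filter transmission, and detector efficiency are in hand, a point determination reduces to a simple inversion. A minimal sketch, with purely hypothetical numbers (no real calibration is implied):

```python
def concentration(Y, Y1, H, eps, t, Q):
    """Invert Equation 1, Y(Z) = Y1(Z) * H * eps_Z * t_Z * C_Z * Q,
    for the concentration C_Z of element Z."""
    return Y / (Y1 * H * eps * t * Q)

# Hypothetical values: 5.0e4 counts in the principal line, theoretical
# yield 2.0e3 counts per (uC x sr x unit concentration), H = 1.2e-3 sr,
# detector efficiency 0.95, filter transmission 0.60, collected charge 10 uC.
C_Z = concentration(Y=5.0e4, Y1=2.0e3, H=1.2e-3, eps=0.95, t=0.60, Q=10.0)
```

In a real analysis Y1(Z) comes from the matrix integrals described in Appendix A, and H is fixed by measurements on standards.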
Proton stopping powers (i.e., rate of energy loss as a function of distance traveled in an element) are usually taken from one of the various compilations of Ziegler and his colleagues, which are summarized in a report by the International Commission on Radiation Units and Measurements (1993). A variety of mass attenuation coefficient schemes ranging from theoretical to semiempirical have been used, and the XCOM database provided by the National Institute of Standards and Technology (NIST; Berger and Hubbell, 1987) appears to us to be admirably suited. The relative x-ray intensities within the K and L series are invariably taken from the calculations of Scofield (1974a,b, 1975) and from fits to these by Campbell and Wang (1989), although corrections must be made for the K x rays of the elements 21 < Z < 30 where configuration mixing effects arising from the open 3d subshell cause divergences; relative intensities for the M series are given by Chen and Crasemann (1984). Atomic fluorescence and Coster-Kronig probabilities may be taken from the theoretical calculations of Chen et al. (1980a,b, 1981, 1983) or from critical assessments of experimental data such as those of Krause (1979) and Bambynek (see Hubbell, 1989). Campbell and Cookson (1983) have assessed how the various uncertainties in all these quantities are transmitted into accuracy estimates for PIXE, but the best route for assessing overall
accuracy remains the analysis of known reference materials. The intrinsic efficiency of the Si(Li) x-ray detector enters Equation 1 and so must be known. This efficiency is essentially unity in the x-ray energy region between 5 and 15 keV. At higher energies εZ falls off because of penetration through the silicon crystal (typically 3 to 5 mm thick). At lower energies, εZ falls off due to attenuation in the beryllium or polymer vacuum window, the metal contact, and internal detector effects such as the escape of the primary photoelectrons and Auger electrons created in x-ray interactions with silicon atoms and loss of the secondary ionization electrons due to trapping and diffusion. The related issues of detector efficiency and resolution function (lineshape) are dealt with below (see Appendix B). A basic assumption of the method is that the element distribution in the volume analyzed is homogeneous. The presence of subsurface inclusions, for example, would negate this assumption. Of course, micro-PIXE can be used to probe optically visible inhomogeneities such as zoning and grain boundary phenomena in minerals and to search for optically invisible inhomogeneities in, for example, air particulate deposits from cascade impactors. It can then be used to select homogeneous regions for analysis. Given (1) homogeneity and (2) knowledge of the major element concentrations, PIXE or micro-PIXE analysis is fully quantitative, with the accuracy and the detection limits determined by the type of specimen. A great deal of PIXE work is conducted on specimens that are so thin that the proton energy is scarcely altered in traversing the specimen and there is negligible attenuation of x rays. Films of fine particulates collected from the ambient air by sampling devices are an example. In such cases, a similar approach may be taken with concentration CZ replaced in Equation 1 by the areal density of elements in the specimen.
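The energy dependence of εZ described above can be captured, to first order, by a two-factor model: transmission through the entrance window times the fraction of photons absorbed in the crystal. The sketch below assumes this simplified model and illustrative Be and Si thicknesses; mass attenuation coefficients must be supplied from a database such as XCOM:

```python
import math

RHO_BE, RHO_SI = 1.85, 2.33  # densities, g/cm^3

def intrinsic_efficiency(mu_be, mu_si, be_um=8.0, si_mm=3.0):
    """Sketch of Si(Li) intrinsic efficiency: Be-window transmission
    times Si-crystal absorption.  mu_be, mu_si are mass attenuation
    coefficients (cm^2/g) of Be and Si at the x-ray energy of interest."""
    window = math.exp(-mu_be * RHO_BE * be_um * 1e-4)        # window loss
    crystal = 1.0 - math.exp(-mu_si * RHO_SI * si_mm * 0.1)  # absorbed fraction
    return window * crystal
```

At mid-range energies both factors are near unity; the model omits the contact layer and the internal-loss effects noted above, which require the more careful treatment of Appendix B.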
The main engineering advance responsible for PIXE was the advent of the Si(Li) x-ray detector in the late 1960s. Micro-PIXE was made possible by the development of high-precision magnetic quadrupole focusing lenses in the 1970s. The acceptance of PIXE and other accelerator-based ion beam techniques stimulated advances in accelerator design that have resulted in a new generation of compact, highly stable electrostatic accelerators provided primarily for ion beam analysis.
PRACTICAL ASPECTS OF THE METHOD

Most PIXE work is done with nanoampere currents of 2- to 4-MeV protons transmitted in a vacuum beam line to the specimen. However, some laboratories extract the beam into the laboratory through a thin Kapton window to deal with unwieldy or easily damaged specimens such as manuscripts or objects of art (Doyle et al., 1991). As with EPMA, most of the technical work for the user lies in specimen preparation (see Sample Preparation). Conduct of PIXE analysis requires an expert accelerator operator to steer and focus the beam onto the specimen. As indicated above (see Principles of the Method), the intrinsic efficiency of the Si(Li) detector as a function of
x-ray energy is an important practical aspect. This function may be determined from, e.g., the manufacturer's data on crystal thickness and the contact and window thickness. More accurate determinations of these quantities may be effected via methods outlined below (see Appendix B). The lineshape or resolution function of the detector is required (see Data Analysis and Initial Interpretation) for the interpretation of complex PIXE spectra with their many overlapping lines. Guidance in this direction is also provided below (see Appendix B). Specimens are generally classified as thin or thick. Thin specimens are those that cause negligible reduction in the energy of the protons, and they are usually deposited on a substrate of trace-element-free polymer film; examples include microtome slices of tissue, residue from dried drops of fluid containing suspended or dissolved solids, or films of atmospheric particulate material collected by a sampling device. Thick specimens are defined as those having sufficient areal density to stop the beam within the specimen, and they may be self-supporting or may reside on a substrate. As indicated below (see Appendix A), their analysis requires full attention to the effect of the matrix in slowing down the protons and in attenuating the excited characteristic x rays. The x-ray production decreases rapidly with depth but is significant for some 10 to 30 μm. Examples of thick specimens include mineral grains, geological thin sections, metallurgical specimens, and archaeological artefacts such as jewelry, bronzes, and pottery. Specimens between these limiting cases are referred to as having intermediate thickness, and their thickness must be known if matrix effects are to be accurately handled. In early PIXE work, many intermediate specimens were treated as if they were thin; i.e., matrix effects were neglected.
However, it is now customary to correct for these, using, for example, an auxiliary proton scattering or transmission measurement to determine specimen thickness and major element composition. With specimens that stop the beam, current and charge measurements are usually accomplished very simply by having the specimen electrically insulated from the chamber and by connecting it to a charge integrator. If thick specimens are not conducting, they must be coated with a very thin layer of carbon to prevent charging; neglecting this results in periodic spark discharge, which in turn causes intense bremsstrahlung background in the x-ray spectrum. Secondary electrons are emitted from the specimen, potentially causing an error of up to 50% in beam charge determination. These electrons must be returned to the specimen by placing a suitable electrode or grid close by, at a negative potential of typically 100 V. In thin specimens, the beam is transmitted into a graphite-lined Faraday cup, also with an electron suppressor, and the charge is integrated. Electronic dead time effects must be accounted for. After detection of an event, there is a finite processing time in the electronic system, during which any subsequent events will be lost. A dead time signal from the pulse processor may be used to gate off the charge integrator and effect the necessary correction. Alternatively, an on-demand beam deflection system may be used. Here, the proton beam passes between two plates situated about
1 m upstream of the specimen and carrying equal voltages; detection of an x ray triggers a unit that grounds one plate as rapidly as possible, thereby deflecting the beam onto a tantalum collimator. The beam is restored when electronic processing is complete, so no corrections for dead time are required. This approach has two further advantages. The first is that specimen heating effects are reduced by removing the beam when its presence is not required. The second is that pile-up of closely spaced x-ray events is substantially decreased, thereby removing undesired artefacts from the spectrum. The finite counting rate capacity of a Si(Li) detector and its associated pulse processor and multichannel pulse height analyzer demand that unnecessary contributions in the x-ray spectrum be minimized and as much of the capacity as possible be used for the trace element x rays of interest. Most thin-specimen analyses in the atmospheric or biological science context are performed with a Mylar absorber, typically 100 to 200 μm thick, employed to reduce the intense bremsstrahlung background that lies below 5 keV in the spectrum and whose intensity increases rapidly toward lower energy. With thick metallurgical, geological, or archaeological specimens, x-ray lines of major elements dominate the spectrum, making it necessary to reduce their intensities with an aluminum absorber of thickness typically 100 to 1000 μm. For example, in silicate minerals, the K x rays of the major elements Na, Mg, Al, and Si occupy most of the counting rate capability; an aluminum foil of thickness 100 μm reduces the intensity of these x rays to a level such that they are still present at useful intensity in the spectrum, but the x rays from trace elements of higher atomic number can now be seen, with detection limits approaching the 1-ppm level. The thickness of such filters must be accurately determined.
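The effect of such a filter follows from the Beer-Lambert law, tZ = exp[-(μ/ρ)ρx]. The sketch below uses rough, illustrative magnitudes for the mass attenuation coefficient of aluminum (of the order found in the XCOM database, not authoritative values) to show how strongly a 100-μm foil suppresses the low-energy Si K x rays while passing roughly half of the ~10-keV trace element x rays:

```python
import math

RHO_AL = 2.70  # density of aluminum, g/cm^3

def transmission(mu_over_rho, thickness_um):
    """Beer-Lambert filter transmission t = exp(-(mu/rho) * rho * x)."""
    return math.exp(-mu_over_rho * RHO_AL * thickness_um * 1e-4)

# Illustrative mass attenuation coefficients for Al (cm^2/g):
t_si_k = transmission(1000.0, 100.0)  # ~1.7 keV (Si K): strongly suppressed
t_hard = transmission(25.0, 100.0)    # ~10-keV trace element lines
```

The steep energy dependence of μ/ρ is what lets one filter thickness trade the intense light-element lines against the trace element lines of interest.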
As a thin-specimen example, Figure 1 shows the PIXE spectrum of an air particulate standard reference material. Proton-induced x-ray emission has proven extremely powerful in aerosol analysis, and special samplers have been devised to match its capabilities (Cahill, 1995). The low detection limits facilitate urban studies with short sampling intervals in order to determine pollution patterns over a day. At the other extreme, in remote locations, robust samplers of the IMPROVE type (Eldred et al., 1990) run twice weekly for 24-h periods at sites around the world in a project aimed at assembling continental and global data on visibility and particulate composition (Eldred and Cahill, 1994). The low detection limits also match PIXE perfectly with cascade impactors, where the sample is divided according to particle diameter. Figure 2 (Maenhaut et al., 1996) presents measured detection limits for two cascade impactors that are well suited to PIXE in that they create on each impaction stage a deposit of small diameter that may be totally enveloped by a proton beam of a few millimeters in diameter. Today, PIXE appears in about half of all analyses of atmospheric aerosols published in the standard aerosol journals. Figure 3 shows an example spectrum from an "almost thin" specimen where an auxiliary technique is used to determine thickness and major element composition so that matrix corrections may be applied in the data reduction. The specimen is a slice of plant tissue, and the
Figure 1. PIXE spectra of the BCR-128 fly ash reference standard, measured at Guelph with two x-ray detectors: (A) 8-μm window, no absorber; (B) 25-μm window, 125-μm Mylar absorber. Proton energy was 2.5 MeV.
Figure 3. PIXE and proton backscattering energy spectra from a focal deposit of heavy metals in a plant root (Watt et al., 1991). Proton energy was 3 MeV. The continuous curves represent a fit (PIXE case) and a simulation (BS case). Reproduced by permission of Elsevier Science.
Figure 2. Detection limits for the PIXE International cascade impactor (PCI) and a small-deposit low-pressure impactor (SDI). Note that the detection limit depends upon the impactor stage. From Maenhaut et al. (1996) with permission of Elsevier Science.
auxiliary technique is proton backscattering, whose results, acquired simultaneously, are in the lower panel of the figure. Many PIXE systems employ an annular silicon surface barrier detector upstream from the specimen and situated on the beam axis so that the beam passes through the annulus. This affords a detection angle close to 180° for scattered particles, which results in optimum energy resolution. This combination has proved to be a powerful means to determine major element concentrations in a large range of biological applications. Another powerful technique ancillary to micro-PIXE in the analysis of thin tissue slices is STIM (scanning transmission ion microscopy). Here, a charged particle detector situated downstream from the specimen at about 15° to 25° to the beam axis records the spectrum of forward-scattered particles that reflects energy losses from both nuclear scattering and ion-electron collisions. This technique is effective in identifying structures such as individual cultured cells on a substrate, individual aerosol particles, and lesions in medical tissue. It has been used to identify neuritic plaques in brain tissue from Alzheimer's disease patients (Landsberg et al., 1992), thereby eliminating the need for immunohistochemical staining and so removing a sample preparation step that has the potential for contamination. Simultaneous recording of STIM, backscatter, and PIXE spectra and two-dimensional images has become a standard approach with biological specimens whose structure remains unaffected by preparation. Micro-PIXE elegantly complements electron probe microanalysis in the trace element analysis of mineral grains using the same specimens and extending detection limits down to the parts per million level (Campbell and Czamanske, 1998). One major niche is the study of precious and other metals in sulfide ores.
Figure 4 shows spectra of three iron-nickel sulfides (pentlandites) from a Siberian ore deposit; the large dynamic range between major elements such as iron (over 30% concentration in this case) and trace elements at tens to hundreds of parts
Figure 4. PIXE spectra of three Siberian pentlandites (Czamanske et al., 1992). The spectra were recorded using 3-MeV protons and an aluminum filter of 352 μm thickness. Element concentrations (in ppm) are (a) Se 261, Pd 2540, Ag 112, Te 54; (b) Se 80, Pd 132; (c) Se 116, Pd < 5, Ag 34, Te 36, Pb 1416.
Figure 5. Strontium concentration profile in the growth zones of an otolith from an Arctic char collected from the Jayco River in the Northwest Territories of Canada (Babaluk et al., 1997). A 30-min micro-PIXE linescan was conducted using a 10 × 10-μm beam of 3-MeV protons. Reproduced by permission of the Arctic Institute of Canada.
per million concentration is typical of the earth science application area, and it necessitates use of absorbing filters (several hundred micrometers of aluminum) to depress the intensities of the intense x rays of the lighter major elements. Another important area is the study of trace elements in silicate minerals, where PIXE has aided in the development of geothermometers and fingerprinting approaches that are proving immensely powerful in, for example, assessing kimberlites for diamond potential (Griffin and Ryan, 1995). Micro-PIXE is also widely used in trace element studies of zoned minerals, where the zoning contains information on the history of the mineral. Figure 5 shows a recently developed zoning application in calcium carbonate of biological origin—the otolith of an Arctic fish (Babaluk et al., 1997). The oscillatory behavior of Sr content in annual growth rings reflects annual migration of the fish from a freshwater lake to the open ocean, starting at age 9 years in the example shown. Micro-PIXE is now being applied in population studies for stock management. Many more examples, including exotic ones in the fields of art and archaeometry, may be found in the book by Johansson et al. (1995). A unique example of nondestructive analysis is the use of micro-PIXE to identify use of different inks in individual letters in manuscripts. An example of a partly destructive analysis is the extraction of a very narrow core from a painting and the subsequent scanning of a microbeam along its length to identify and characterize successive paint layers. In the absence of interfering peaks, the detection limit for an element is determined by the intensity of the continuous background on which its principal x-ray peak is superimposed in the measured spectrum. This background is due mainly to secondary electron bremsstrahlung,
whose intensity diminishes rapidly with increasing photon energy, but it can be augmented by gamma rays from nuclear reactions in light elements such as Na and F; these produce an essentially flat contribution to the background, visible at higher x-ray energies beyond the lower energy region where the bremsstrahlung dominates. The usual criterion is that a peak is detectable if its integrated intensity (I) exceeds a three-standard-deviation (3√B) fluctuation of the underlying background intensity (B). The desirability of a thin substrate in the case of thin specimens is obvious, since the substrate contributes only background to the spectrum. The ratio I/3√B increases as the square root of the collected charge, suggesting that detection limits will be minimized by maximizing beam current and measuring time. Beam current is often limited by specimen damage, and measuring time may be limited by cost. Detection limits should be quoted for a given beam charge and detector geometry. The Z dependence of detection limits is U shaped, as shown in the examples of Figure 6, and this reflects the theoretical x-ray intensity expression that is included in Equation 1. For many thick specimen types, major elements contribute intense peaks and both pile-up continua and pile-up peaks to the spectrum, and these can locally worsen detection limits considerably; electronic pile-up rejection or on-demand beam deflection, which minimize such artefacts, are therefore necessary parts of a PIXE system. The hazards of the technique are those involved with operation of any small accelerator operating with low beam currents. While radiation fields just outside the specimen chamber are very small, steps are necessary to shield radiation from beam-defining slits and from the accelerator itself. Special precautions (Doyle et al., 1991) are necessary if the beam is extracted through a window into the laboratory to analyze large or fragile specimens.

Figure 6. Micro-PIXE detection limits measured at Guelph for sulfide minerals and silicate glasses. Upper plot (left scale): triangles, pentlandites (350-μm Al filter); dots, pyrrhotite (250-μm Al filter); 10 μC of 3-MeV protons in a 5 × 10-μm spot. Lower plot (right scale): triangles, BHVO-1 basalt fused to glass; dots, rhyolitic glass RLS-158; 2.5 μC of 3-MeV protons in a 5 × 10-μm spot; Al filter 250 μm.

METHOD AUTOMATION
Specimen chambers usually accommodate many specimens, and computer-controlled stepping motors, either inside or outside the vacuum, are employed to expose these sequentially to the beam and to control all other aspects of data acquisition, e.g., charge measurement and selection of absorbing filter. In the case of micro-PIXE, the coordinates for each point analysis, linescan, or area map may be recorded by optical examination at the outset, permitting subsequent unsupervised data acquisition. Human intervention is only needed to break vacuum and insert new specimen loads into the analysis chamber, reevacuate, and input the parameters of the measurement (e.g., integrated charge per specimen). However, the accelerator itself is rarely automated, and a technical expert is required to steer and focus the beam prior to the analysis.
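The unsupervised acquisition sequence just described can be caricatured as a simple batch loop. Every function name below is hypothetical, standing in for whatever motor, filter-changer, and data-acquisition interfaces a given laboratory provides:

```python
def run_batch(specimens, move_to, select_filter, acquire):
    """Hypothetical sketch of unsupervised multi-specimen acquisition.
    Each specimen record holds pre-recorded coordinates, the absorbing
    filter to use, and the preset integrated charge."""
    spectra = {}
    for spec in specimens:
        move_to(spec["x"], spec["y"])        # computer-controlled stepper motors
        select_filter(spec["filter"])        # absorber chosen per specimen
        spectra[spec["name"]] = acquire(spec["charge_uC"])  # run to preset charge
    return spectra
```

Steering and focusing the beam itself remains a manual, expert task, as noted.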
DATA ANALYSIS AND INITIAL INTERPRETATION

On the assumption that the volume sampled by the beam is homogeneous, the method is fully quantitative. Prior to analysis of measured spectra, checks are necessary to ensure that the expected detector energy resolution was maintained and that counting rate was in the proper domain. The main step in data analysis is a nonlinear least-squares fit of a model energy-dispersive x-ray spectrum to the measured spectrum. The software requires the database mentioned earlier to describe the x-ray production process, a representation (Gaussian with corrections, as outlined below; see Appendix B) of the peaks in the energy-dispersed spectrum, and a means of dealing with continuous background; the latter may be fitted by an appropriate expression or removed by applying an appropriate algorithm prior to the least-squares fitting. Accurate knowledge of the properties of the detector and of any x-ray absorbers introduced between specimen and detector is necessary. The quantities determined by the fit are the heights, and therefore the intensities, of the principal x-ray line of each element present in the spectrum. The intensities of all the remaining lines of each element are then set by that of the element's main line together with the relative x-ray intensity ratios provided by the database; these ratios are adjusted to reflect the effects of both absorber transmission and detector efficiency. For thick specimens, the database intensity ratios are also adjusted to reflect the effects of the matrix. This last adjustment is straightforward in a trace element determination in a known matrix; the matrix corrections to the ratios remain fixed through the iterations of the fit. The more complex case of major element measurement will be discussed below. Given the
very large number of x-ray lines present, the number of pile-up combinations can run into hundreds, and these double and triple peaks must also be modeled. A sorting process is used to eliminate pile-up combinations that are too weak to be of importance. The final intensities provided by the fit for the principal lines of each element present are corrected for secondary fluorescence contributions, for pulse pile-up, and dead time effects (if necessary), prior to conversion to concentrations using Equation 1. There is a linear relationship between the x-ray energy and the corresponding channel number of the peak centroid in the spectrum, and there is a linear relationship between the variance of the Gaussian peaks and the corresponding x-ray energy. The four system calibration parameters inherent in these relationships may be fixed or variable in the fitting process. The latter option has merit in that it accounts for small changes due to electronic drifts and does not require that the system be preset in any standard way. If the major element concentrations are known a priori, then the second step, the conversion of these peak intensities to concentrations, is accomplished directly from Equation 1 or a variant thereof using the instrumental constant H, which is determined either using standards or from known major elements in the specimen itself. If the major elements have to be determined by the PIXE analysis, then Equation 1 has to be solved by an iterative process, starting with some estimate of concentrations and repeating the least-squares fit until consistency is achieved between the concentrations utilized and generated by Equation 1. Various software packages are available for the above tasks (Johansson et al., 1995). They differ mainly in the databases that they adopt and in their approaches to handling the continuous background component of the spectra. 
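The two linear calibration relationships just mentioned involve four system parameters in all: an energy offset and gain, plus two terms governing how the Gaussian variance grows with energy. A sketch, with illustrative parameter values of plausible magnitude for Si(Li) systems (not taken from any specific detector):

```python
import math

def channel_energy(c, offset_keV, gain_keV_per_ch):
    """Linear energy calibration: E = offset + gain * channel."""
    return offset_keV + gain_keV_per_ch * c

def peak_sigma(E_keV, noise_var, fano_slope):
    """Linear variance relation: sigma^2 = noise_var + fano_slope * E,
    i.e., an electronic-noise term plus a statistics term."""
    return math.sqrt(noise_var + fano_slope * E_keV)

E = channel_energy(500, offset_keV=0.05, gain_keV_per_ch=0.02)  # 10.05 keV
width = peak_sigma(E, noise_var=0.004, fano_slope=0.0004)
```

Leaving these four parameters free in the fit absorbs small electronic drifts, as noted above.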
The first option for the background is to add an appropriate expression to the peak model, thus describing the whole spectrum in mathematical terms, and then determine the parameter values in the background expression via the fitting process; the most common choice to describe the background has been an exponential polynomial of order up to 6 for the electron bremsstrahlung plus a linear function for the gamma ray component. The first of these two expressions must be modified to describe filter transmission and detector efficiency effects; the second needs no modification, because high-energy gamma rays are not affected by the absorbers and windows. While one particular form of expression may cope well, in an empirical sense, with a given type of specimen, there is little basis for assuming that the expressions described will be universally satisfactory. This has led to approaches that involve removing the background continuum by mathematically justifiable means. The simplest such approach, developed in the electron microprobe analogue, involves convoluting the spectrum with the "top-hat" filter shown in Figure 7. Such a convolution reduces a linear background to zero and will therefore be effective if the continuum is essentially linear within the dimensions of the filter. These dimensions are prescribed by Schamber (1977) as UW = 1 FWHM and LW = 0.5 FWHM, where FWHM is the full width at half-maximum of the resolution function at the
x-ray energy specified.

Figure 7. Top-hat filter and its effect on a Gaussian peak with linear background: UW and LW are the widths of the central and outer lobes.

A more sophisticated multistep approach to continuum removal uses a peak-clipping algorithm to remove all peaks from the measured spectrum, and Ryan et al. (1988) have developed a version of this that is configured to cope with the very large dynamic range in peak heights that are encountered in PIXE spectra. First, the measured spectrum is smoothed by a "low-statistics digital filter" of variable width; at each channel of the spectrum this width is determined by the intensity in that channel and its vicinity. As the filter moves from a valley toward a peak, the smoothing interval is reduced to prevent the peak intensity from influencing the smoothing of the valley region. In a second step, a double-logarithmic transformation of channel content is applied, i.e.,

z = ln[ln(y + 1) + 1]    (2)

in order to compress the dynamic range. The third step is a multipass peak clipping, in which each channel content z(c) is replaced by the lesser of z(c) and

z̄ = [z(c + w) + z(c − w)]/2    (3)
where the scan width w is twice the FWHM corresponding to channel c. After no more than 24 passes, during which w is reduced by a factor of 2, a smooth background remains. The final step is to transform this background back through the inverse of Equation 2. The accuracy of this so-called SNIP (statistics-sensitive nonlinear iterative peak-clipping) approach has been demonstrated in many PIXE analyses of geochemical reference materials (Ryan et al., 1990). The VAX software package GEOPIXE (Ryan et al., 1990) is designed for thick specimens encountered in geochemistry and mineralogy and can also provide accurate elemental analysis of subsurface fluid inclusions (Ryan et al., 1995). It is installed in various micro-PIXE laboratories concerned with geochemical applications. It takes the SNIP approach to dealing with the background continuum. The widely used PC package GUPIX (Maxwell et al., 1989, 1995) deals with thin-, thick-, intermediate-, and multiple-layer specimens, employing the simple top-hat filter method to strip background. GUPIX offers the
option, useful in mineralogy, of analyzing in terms of oxides rather than elements; this enables the user to compare the sum of generated oxide concentrations to 100% and draw appropriate conclusions. GUPIX also allows the user to include one "invisible" element in the list of elements to be included in the calculation, requiring the sum of concentrations of all "visible" elements and the invisible element to sum to 100%; this can be used to determine oxygen content in oxide minerals or sulfide content in sulfide minerals when the oxygen and sulfur K x rays are not observable in the spectrum (as is usually the case). The GUPIX code can provide analysis and thickness determination of multilayer film structures, provided that the elements whose concentrations in a given layer are to be determined are not also present at unknown concentrations in any other layer. GUPIX is presently being extended to cope with analysis using deuteron and helium beams (Maxwell et al., 1997). The above has been concerned with PIXE analysis for element concentrations in a fixed area on the specimen. In imaging applications, the beam is scanned over the specimen, and a record is built of the coordinates and channel number of each recorded x-ray event. From this record, a one- or two-dimensional image may be reconstructed using the spectrum intensity in preset "windows" to represent amounts of each element in a semiquantitative fashion. Obviously, such an approach can encounter errors due to peak overlap, but these may be dealt with by appropriate nonlinear least-squares spectrum fitting upon conclusion of the analysis. It is, however, desirable to transform observed window intensities into element concentrations in real time, updating as the data accumulate. Such a dynamic analysis method has been developed by Ryan and Jamieson (1993), and it enables on-line accumulation of PIXE maps that are inherently overlap resolved and background subtracted.
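The SNIP transform-and-clip loop described earlier (Equations 2 and 3) is compact enough to sketch. The version below simplifies the published algorithm: the initial low-statistics smoothing is omitted, and the late-pass halving of the scan width is one plausible schedule rather than the schedule of Ryan et al. (1988):

```python
import math

def snip_background(y, fwhm=8.0, passes=24):
    """Simplified SNIP continuum estimate for a 1-D spectrum y (counts)."""
    # Double-log transform compresses the dynamic range (Equation 2).
    z = [math.log(math.log(v + 1.0) + 1.0) for v in y]
    w = int(2 * fwhm)                       # scan width ~ 2 x FWHM
    for p in range(passes):
        clipped = z[:]
        for c in range(len(z)):
            lo, hi = max(c - w, 0), min(c + w, len(z) - 1)
            # Replace z(c) by the lesser of z(c) and the mean of its
            # neighbours at +/- w (Equation 3).
            clipped[c] = min(z[c], 0.5 * (z[lo] + z[hi]))
        z = clipped
        if p >= passes - 8 and w > 1:       # shrink the window in late passes
            w //= 2
    # Invert the double-log transform to recover background in counts.
    return [math.exp(math.exp(v) - 1.0) - 1.0 for v in z]
```

Subtracting the returned continuum from a spectrum leaves the peaks standing on an essentially flat baseline.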
Returning now to direct analysis, standard reference materials of a similar nature to each specimen type of interest should be analyzed to demonstrate, e.g., accuracy and detection limits, which are determined in part by the measurement and in part by the fitting procedure. There are many examples of the accurate analysis of standard reference materials (SRMs). Table 1 shows the results of PIXE analysis by Maenhaut (1987) of the European Community fly-ash reference material used to simulate exposed filters from air particulate samplers. Agreement between measured and certified values is excellent, except for the elements of very low atomic number; these light elements tend to occur in larger soil-derived particles, and the data treatment used did not account for particle size effects. Table 2 summarizes a PIXE study of eight biological standard reference materials prepared for PIXE analysis by two different methods; one method (A) involved direct deposition of freeze-dried powder on the polymer substrate, and the other (D) involved Teflon bomb digestion of the powder followed by deposition and drying of aliquots. Table 3 shows micro-PIXE analyses of geochemical reference standards from three laboratories that specialize in earth science applications of micro-PIXE; these are the Commonwealth Scientific and Industrial Research Organization (CSIRO) in Australia, the University of Guelph
Table 1. Analysis by PIXE of Thin Films (BCR 128) Containing Certified Fly Ash (BCR 38)a Element Mge Ale Sie Pe Se Ke Cae Tie V Cr Mn Fe Ni Cu Zn Ga Ge As Br Rb Sr Zr Pb
Reference Valuea;b
PIXE Valuec
PIXE/ Referenced
9.45 0.34 f 127.5 4.9 f 227 12 f
8.3 1.1 104.0 1.6 176.0 2.1 1.60 0.15 4.17 0.03 30.6 0.7 13.35 0.16 5.07 0.12 340 2 161 16 454 10 32.9 0.5 192 8 169 8 572 18 56.2 4.5 g 16.3 4.7 g 49 11 g 29 7 g 183 18 g 187 18 g 169 23 g 225 26 g
0.88 0.12 0.82 0.01 0.78 0.01
3.89 0.13 f 34.1 1.6 f 13.81 0.63 f 334 23 178 12 479 16 33.8 0.7 194 26 176 9 581 29
48.0 2.3
262 11
1.07 0.01 0.90 0.02 0.97 0.01 1.02 0.07 0.90 0.09 0.95 0.02 0.97 0.01 0.99 0.04 0.96 0.05 0.98 0.03
1.02 0.23
0.86 0.10
a
Concentrations are given in micrograms per gram unless indicated otherwise. b Reference values and associated errors are for the certified fly ash (BCR 38) unless indicated otherwise. c PIXE data are averages and standard deviations, based on the analysis of five thin-film samples, unless indicated otherwise. d Values were obtained by dividing both the PIXE result and its associated error by the reference concentration. e Concentration in millligrams per gram. f Concentration and standard deviation obtained by averaging round-robin values for the fly ash. g Result derived from the sum spectrum, obtained by summing the PIXE spectra of the five films analyzed; the associated error is the error from counting statistics.
in Canada, and the National Accelerator Center at Faure in South Africa; details of these analyses are given by Campbell and Czamanske (1998). In such studies, the accuracy of the PIXE technique is usually assessed by comparing the mean element concentrations over several replicate samples (or spots on the same sample in the micro-PIXE case). In these and other such studies, the accuracy is typically a few percent when the concentration is well above the detection limit. The matter of reproducibility or precision is discussed below (see Problems).
SAMPLE PREPARATION In thin-specimen work, it is straightforward to deposit thin films of powder or fluid (which is then dried) or microtome slices of biological tissue onto a polymer substrate. In aerosol specimens, the particulate material is deposited on an appropriate substrate, usually Nuclepore or Teflon filters, by the air-sampling device. Reduction of original bulk
PARTICLE-INDUCED X-RAY EMISSION Table 2. Comparison between Certified Values and PIXE Dataa for 16 Elements in 8 Reference Materials
Element
Number of Certified Values
K Ca Cr Mn Fe Ni Cu Zn As Se Br Rb Sr Mo Ba Pb
7 7 5 7 8 2 8 6 6 3 1 7 6 3 1 7
material to a fluid or powder draws on the standard repertoire of grinding, ashing, and digestion techniques. Powdered material may be suspended in a liquid that is pipetted onto the substrate. Similarly, an aliquot of liquid containing dissolved material may be allowed to dry on the substrate. However, nonuniform deposits left from drying liquid drops may result in poorer than optimal reproducibility. The beam diameter is kept less than the specimen diameter to avoid problems of nonuniform specimen at the edges. If specimens are thin enough, there are no charging problems, but on occasion graphite coating has been necessary. With thick specimens, a polished surface must be presented to the beam. If the specimen is an insulator, it must be carbon coated to prevent charge buildup. Geological specimens are prepared for micro-PIXE precisely as they are for electron probe microanalysis. The options are thin sections (30 to 50 mm) or multiple grains ‘‘potted’’ in epoxy resin.
Average Absolute % Difference Relative to Certified Valueb Method A
Method D
2.1 (7) 6.6 (7) [17 (4)] 3.3 (6) 5.9 (8) [19 (2)] 5.5 (8) 2.3 (6) 7.8 (2) [0.6 (1)] 0.7 (1) 2.0 (7) 2.6 (5) [0.6 (1)]
5.7 (5) 51 (7) [19 (4)] 3.1 (6) 3.7 (8) [25 (2)] 3.0 (8) 4.0 (6) 6.2 (2) [15 (1)]
4.4 (4)
3.0 (4)
5.6 (7) 3.1 (5) [1.4 (1)]
SPECIMEN MODIFICATION
a
With standard deviation from counting statistics 0 so that the lowest energy configuration is where the spins are parallel (a ferromagnet), then the magnon dispersion along the edge of the cube (the [100] direction) is given by EðqÞ ¼ 8 JS ½sin2 ðqa=2Þ. At each wave vector q, the magnon energy is different, and a neutron can interact with the system of spins and either create a magnon at ðq; EÞ, with a concomitant change of momentum and loss of energy of the neutron, or conversely destroy a magnon with a gain in energy. The observed change in momentum and energy for the neutron can then be used to map out the magnon dispersion relation. Neutron scattering is particularly well suited for such inelastic scattering studies since neutrons typically have energies that are comparable to the energies of excitations in the solid, and therefore the neutron energy changes are large and easily measured. The dispersion relations can then be measured over the entire Brillouin zone (see, e.g., Lovesey, 1984). Additional information about the nature of the excitations can be obtained by polarized inelastic neutron scattering techniques, which are finding increasing use. Spin wave scattering is represented by the raising and lowering operators S ¼ Sx iSy , which cause a reversal of the neutron spin when the magnon is created or destroyed. These ‘‘spin-flip’’ cross-sections are denoted by (þ ) and ( þ). If the neutron polarization P is parallel to the momentum transfer Q; P k Q, then spin angular momentum is conserved (as there is no orbital contribution in this case). In this experimental geometry, we can only create a spin wave in the ( þ) configuration, which at the same time causes the total magnetization of the sample to decrease by one unit (1mB ). Alternatively, we can destroy a spin wave only in the (þ ) configuration, while increasing the magnetization by one unit. 
This gives us a unique way to unambiguously identify the spin wave scattering, and polarized beam techniques in general can be used to distinguish magnetic from nuclear scattering in a manner similar to the case of Bragg scattering. Finally, we note that the magnetic Bragg scattering is comparable in strength to overall magnetic inelastic scattering. However, all the Bragg scattering is located at a single point in reciprocal space, while the inelastic scattering is distributed throughout the (three-dimensional) Brilouin zone. Hence, when actually making inelastic measurements to determine the dispersion of the excitations, one can only observe a small portion of the dispersion surface at any one time, and thus the observed inelastic scattering is typically two to three orders of magnitude less intense than the Bragg peaks. Consequently these are much more time-consuming measurements, and larger samples are needed to offset the reduction in intensity. Of course, a successful determination of the dispersion relations yields a complete determination of the fundamental atomic interactions in the solid.
1332
NEUTRON TECHNIQUES
Figure 1. Calculated (solid curve) and observed intensities at room temperature for a powder of antiferromagnetically ordered YBa2Fe3O8. The differences between calculated and observed are shown at the bottom. (Huang et al., 1992.)
PRACTICAL ASPECTS OF THE METHOD Diffraction As an example of Bragg scattering, a portion of the powder diffraction pattern from a sample of YBa2Fe3O8 is shown in Figure 1 (Huang et al., 1992). The solid curve is a Rietveld refinement (Young, 1993) of both the antiferromagnetic and crystallographic structure for the sample. From this type of data, we can determine the full crystal structure; lattice parameters, atomic positions in the unit cell, site occupancies, etc. We can also determine the magnetic structure and value of the ordered moment. The results of the analysis are shown in Figure 2; the crystal structure is identical to the structure for the YBa2Cu3O7 high-TC cuprate superconductor, with the Fe replacing the Cu, and the magnetic structure is also the same as has been observed for the Cu spins in the oxygen-reduced (YBa2Cu3O6) semiconducting material. Experimentally, we can recognize the magnetic scattering by several characteristics. First, it should be temperature-dependent, and the Bragg peaks will vanish above the ordering temperature. Figure 3 shows the temperature dependence of the intensity of the peak at a scattering angle of 19.58 in Figure 1. The data clearly reveal a phase transition at the Ne´ el temperature for YBa2Fe3O8 of 650 K (Natali Sora et al., 1994); the Ne´ el temperature is where long-range antiparallel order of the spins first occurs. Above the antiferromagnetic phase transition, this peak completely disappears, indicating that it is a purely magnetic Bragg peak. A second characteristic is that the magnetic intensities become weak at high scattering angles (not shown), as f ðgÞ typically falls off strongly with increasing angle. A third, more elegant, technique is to use polarized neutrons. The polarization technique can be used at any temperature, and for any material, regardless of whether or not it has a crystallographic distortion (e.g., via magnetoelastic interactions) associated with the magnetic transition. 
It is more involved and time-consuming
experimentally, but yields an unambiguous identification and separation of magnetic and nuclear Bragg peaks. First consider the case where P k g, which is generally achieved by having a horizontal magnetic field, which must also be oriented along the scattering vector. In this geometry, all the magnetic scattering is spin-flip, while the nuclear scattering is always non-spin-flip. Hence for a magnetic Bragg peak the spin-flip scattering should be twice as strong as for the P ? g configuration (vertical
Figure 2. Crystal and magnetic structure for YBa2Fe3O8 deduced from the data of Figure 1. (Huang et al., 1992.)
MAGNETIC NEUTRON SCATTERING
1333
Figure 3. Temperature dependence of the intensity of the magnetic reflection found at a scattering angle of 19.58 in Figure 1. The Ne´ el temper ature is 650 K. (Natali et al., 1994.)
field), while the nuclear scattering is non-spin-flip scattering and independent of the orientation of P and g. Figure 4 shows the polarized beam results for the same two peaks, at scattering angles (for this wavelength) of 308 and 358; these correspond to the peaks at 19.58 and 238 in Figure 1. The top section of the figure shows the data for the P ? g configuration. The peak at 308 has the identical intensity for both spin-flip and non-spin-flip scattering, and hence we conclude that this scattering is purely magnetic in origin, as inferred from Figure 3. The peak at 358, on the other hand, has strong intensity for (þ þ), while the intensity for ( þ) is smaller by a factor of 1/11, the instrumental flipping ratio in this measurement. Hence, ideally there would be no spin-flip scattering, and this peak is identified as a pure nuclear reflection. The center row shows the same peaks for the P k g configuration, while the bottom row shows the subtraction of the P ? g spin-flip scattering from the P k g spin-flip scattering. In this subtraction procedure, instrumental background, as well as all nuclear scattering cross-sections, cancel, isolating the magnetic scattering. We see that there is magnetic intensity only for the low angle position, while no intensity survives the subtraction at the 358 peak position. These data unambiguously establish that the 308 peak is purely magnetic, while the 358 peak is purely nuclear. This simple example demonstrates how the technique works; obviously it would play a much more critical role in cases where it is not clear from other means what is the origin of the peaks, such as in regimes where the magnetic and nuclear peaks overlap, or in situations where the magnetic transition is accompanied by a structural distortion. If needed, a complete ‘‘magnetic diffraction pattern’’ can be obtained and analyzed with these polarization techniques. 
In cases where there is no significant coupling of the magnetic and lattice systems, on the other hand, the subtraction technique can also be used to obtain the magnetic diffraction pattern (see, e.g., Zhang et al., 1990). This technique is especially useful for low-temperature magnetic phase transitions where the Debye-Waller effects can be
Figure 4. Polarized neutron scattering. The top portion of the figure is for P ? g, where the open circles show the non-spin-flip scattering and the filled circles are the observed scattering in the spin-flip configuration. The low angle peak has equal intensity for both cross-sections, and thus is identified as a pure magnetic reflection, while the ratio of the (þþ) to (þ) scattering for the high-angle peak is 11, the instrumental flipping ratio. Hence this is a pure nuclear reflection. The center portion of the figure is for P ? g and the bottom portion is the subtraction of the P ? g spin-flip scattering from the data for P k g. Note that in the subtraction procedure all background and nuclear cross-sections cancel, isolating the magnetic scattering. (Huang et al., 1992.)
safely neglected. Figure 5 shows the diffraction patterns for Tm2Fe3Si5, which is an antiferromagnetic material that becomes superconducting under pressure (Gotaas et al., 1987). The top part of the figure shows the diffraction pattern obtained above the antiferromagnetic ordering temperature of 1.1 K, where just the nuclear Bragg peaks are observed. The middle portion of the figure shows the low-temperature diffraction pattern in the ordered state, which contains both magnetic and nuclear Bragg peaks, and the bottom portion shows the subtraction, which gives the magnetic diffraction pattern.
1334
NEUTRON TECHNIQUES
Figure 5. Diffraction patterns for the antiferromagnetic superconductor Tm2Fe3Si5. The top part of the figure shows the nuclear diffraction pattern obtained above the antiferromagnetic ordering temperature of 1.1 K, the middle portion of the figure shows the low-temperature diffraction pattern in the ordered state, and the bottom portion shows the subtraction of the two, which gives the magnetic diffraction pattern. (Gotaas et al., 1987.)
Another example of magnetic Bragg diffraction is shown in Figure 6. Here we show the temperature dependence of the intensity of an antiferromagnetic Bragg peak on a single crystal of the high-TC superconductor (TC 92 K) ErBa2Cu3O7 (Lynn et al., 1989). The magnetic interactions of the Er moments in this material are highly anisotropic, and this system turns out to be an ideal twodimensional (2D) (planar) Ising antiferromagnet; the solid curve is Onsager’s exact solution to the S ¼ 12, 2D Ising model (Onsager, 1944), and we see that it provides an excellent representation of the experimental data.
Figure 6. Temperature dependence of the sublattice magnetization for the Er spins in superconducting ErBa2Cu3O7, measured on a single crystal weighing 31 mg. The solid curve is Onsager’s exact theory for the 2D, S ¼ 12, Ising model (Lynn et al., 1989).
Figure 7. Neutron diffraction scans of the (111) reflection along the [001] growth axis direction (scattering vector Qz ) for (A) ˚ )CoO(30 A ˚ )]50 and (B) [Fe3O4/(100 A ˚ )CoO(100 A ˚ )]50 [Fe3O4/(100 A superlattices, taken at 78 K in zero applied field. The closed circles, open circles, and triangles indicate data taken after zerofield cooling (initial state), field cooling with an applied field (H ¼ 14 kOe) in the [110] direction, and field cooling with an applied field in the [110] direction, respectively. The inset illustrates the scattering geometry. The dashed lines indicate the temperature-and field-independent Fe3O4 component (Ijiri et al., 1998).
A final diffraction example is shown in Figure 7, where the data for two Fe3O4/CoO superlattices are shown (Ijiri et al., 1998). The superlattices consist of 50 repeats of ˚ thick layers of magnetite, which is ferrimagnetic, 100-A ˚ thick layers of the antiferromagnet and either 30- or 100-A CoO. The superlattices were grown epitaxially on singlecrystal MgO substrates, and thus these may be regarded as single-crystal samples. These scans are along the growth direction (Qz ), and show the changes in the magnetic scattering that occur when the sample is cooled in an applied field, versus zero-field cooling. These data, together with polarized beam data taken on the same samples, have elucidated the origin of the technologically important exchange biasing effect that occurs in these magnetic superlattices. It is interesting to compare the type and quality of data that are represented by these three examples. The powder diffraction technique is quite straightforward, both to obtain and analyze the data. In this case, typical sample sizes are 1–20 g, and important and detailed information can be readily obtained with such sample sizes in a few hours of spectrometer time, depending on the particulars of the problem. The temperature dependence of the order parameter in ErBa2Cu3O7, on the other hand, was obtained on a single crystal weighing only 31 mg. Note that the statistical quality of the data is much better than for the powder sample, even though the sample is more than 2 orders of magnitude smaller; this is because it is a single crystal and all the scattering is directed into a single peak, rather than scattering into a powderdiffraction ring. The final example was for Fe3O4/CoO
MAGNETIC NEUTRON SCATTERING
Figure 8. Comparison of the observed and calculated magnetic form factor for the cubic K2IrCl6 system, showing the enormous directional anisotropy along the spin direction ([001] direction) compared with perpendicular to the spin direction. The contribution of the Cl moments along the spin direction, arising from the covalent bonding, is also indicated; no net moment is transferred to the Cl ions that are perpendicular to the spin direction. (Lynn et al., 1976.)
superlattices, where the weight of the superlattices that contribute to the scattering is 1 mg. Thus it is clear that interesting and successful diffraction experiments can be carried out on quite small samples. The final example in this section addresses the measurement of a magnetic form factor, in the cubic antiferromagnetic K2IrCl6. The Ir ions occupy a face-centered cubic (fcc) lattice, and the 5d electrons carry a magnetic moment that is covalently bonded to the six Cl ions located octahedrally around each Ir ion. One of the interesting properties of this system is that there is charge transfer from the Ir equally onto the six Cl ions. A net spin transfer, however, only occurs for the two Cl ions along the direction in which the Ir spin points. This separation of spin and charge degrees of freedom leads to a very unusual form factor that is highly anisotropic, as shown in Figure 8 (Lynn et al., 1976). Note the rapid decrease in the form factor as one proceeds along the (c axis) spin direction, from the (0,1,1/2), to the (0,1,3/2), then to the (0,1,5/2) Bragg peak, which in fact has an unobservable intensity. In this example, 30% of the total moment is transferred onto the Cl ions, which is why these covalency effects are so large in this compound. It should be noted that, in principle, x-ray scattering should be able to detect the charge transferred onto the (six) Cl ions as well, but this is much more difficult to observe because it is a very small fraction of the total charge. In the magnetic case, in contrast, it is a large percentage effect and has a different symmetry than the lattice, which makes the covalent effects much easier to observe and interpret.
1335
Figure 9. Spin-flip scattering observed for the amorphous Invar Fe86B14 isotropic ferromagnetic system in the P k Q configuration. Spin waves are observed for neutron energy gain (E < 0) in the (þ) cross-section, and for neutron energy loss (E > 0) in the (þ) configuration. (Lynn et al., 1993.)
Inelastic Scattering There are many types of magnetic excitations and fluctuations that can be measured with neutron scattering techniques, such as magnons, critical fluctuations, crystal field excitations, magnetic excitons, and moment/valence fluctuations. To illustrate the basic technique, consider an isotropic ferromagnet at sufficiently long wavelengths (small q). The spin wave dispersion relation is given by Esw ¼ DðTÞq2 , where D is the spin wave ‘‘stiffness’’ constant. The general form of the spin wave dispersion relation is the same for all isotropic ferromagnets, a requirement of the (assumed) perfect rotational symmetry of the magnetic system, while the numerical value of D depends on the details of the magnetic interactions and the nature of the magnetism. One example of a prototypical isotropic ferromagnet is amorphous Fe86B14. Figure 9 shows an example of polarized beam inelastic neutron scattering data taken on this system (Lynn et al., 1993). These data were taken with the neutron polarization P parallel to the momentum transfer QðP k QÞ, where we should be able to create a spin wave only in the (þ) configuration, or destroy a spin wave only in the (þ ) configuration. This is precisely what we see in the data—for the ( þ) configuration the spin waves can only be observed for neutron energy loss scattering (E > 0), while for the (þ ) configuration, spin waves can only be observed in neutron energy gain (E < 0). This behavior of the scattering uniquely identifies these excitations as spin waves. Data like these can be used to measure the renormalization of the spin waves as a function of temperature, as well as to determine the lifetimes as a function of wave vector and temperature. An example of the renormalization of the ‘‘stiffness’’ constant D for the magnetoresistive oxide Tl2Mn2O7 is shown in Figure 10 (Lynn et al., 1998). 
Here, the wave vector dependence of the dispersion relation has been determined at a series of q’s, and the stiffness parameter is extracted from the data. The variation in the stiffness parameter is then plotted, and indicates a
1336
NEUTRON TECHNIQUES
Figure 10. Temperature dependence of the spin wave stiffness DðTÞ in the magnetoresistive Tl2Mn2O7 pyrochlore, showing that the spin waves renormalize as the ferromagnetic transition is approached. (Lynn et al., 1998.) Below TC the material is a metal, while above TC it exhibits insulator behavior.
smooth variation with a ferromagnetic transition temperature of 123 K. These measurements can then be compared directly with theoretical calculations. They can also be compared with other experimental observations, such as magnetization measurements, whose variation with temperature originates from spin wave excitations. Finally, these types of measurements can be extended all the way to the zone boundary on single crystals. Figure 11 shows an example of the spin wave dispersion relation for the ferromagnet La0.85Sr0.15MnO3, a system that undergoes a metal-insulator transition that is associated with the transition to ferromagnetic order (Vasiliu-Doloc et al., 1998). The top part of the figure shows the dispersion relation for the spin wave energy along two high-symmetry directions, and the solid curves are fits to a simple nearest-neighbor spin-coupling model. The overall trend of the data is in reasonable agreement with the model, although there are some clear discrepancies as well, indicating that a more sophisticated model will be needed in order to obtain quantitative agreement. In addition to the spin wave energies, though, information about the intrinsic lifetimes can also be determined, and these linewidths are shown in the bottom part of the figure for both symmetry directions. In the simplest type of model, no intrinsic spin wave linewidths at all would be expected at low temperatures, while we see here that the observed linewidths are very large and highly anisotropic, indicating that an itinerant-electron type of model is more appropriate for this system.
Figure 11. Ground state spin wave dispersion along the (0,1,0) and (0,0,1) directions measured to the zone boundary for the magnetoresistive manganite La0.85Sr0.15MnO3. The solid curves are a fit to the dispersion relation for a simple Heisenberg exchange model. The bottom part of the figure shows the intrinsic linewidths of the excitations. In the standard models, intrinsic linewidths are expected only at elevated temperatures. The large observed linewidths demonstrate the qualitative inadequacies of these models. (Vasiliu-Doloc et al., 1998.)
trons, with energies in the meV range. Thus all neutron scattering instrumentation is located at centralized national facilities, and each facility generally has a wide variety of instrumentation that has been designed and constructed for the specific facility, and is maintained and scheduled by facility scientists. Generally, any of these instruments can be used to observe magnetic scattering, and with the number and diversity of spectrometers in operation it is not practical to review the instrumentation here; the interested user should contact one of the facilities directly. The facilities themselves operate for periods of weeks at a time continuously, and hence, since the early days of neutron scattering all the data collection has been accomplished by automated computer control.
SAMPLE PREPARATION METHOD AUTOMATION Neutrons for materials research are produced by one of two processes; fission of U235 in a nuclear reactor, or by the spallation process where high energy protons from an accelerator impact on a heavy-metal target like W and explode the nuclei. Both techniques produce highenergy (MeV) neutrons, which are then thermalized or sub-thermalized in a moderator (such as heavy water) to produce a Maxwellian spectrum of thermal or cold neu-
As a general rule, there is no particular preparation that is required for the sample, in the sense that there is no need, for example, to polish the surface, cleave it under high vacuum, or perform similar procedures. Neutrons are a deeply penetrating bulk probe, and hence generally do not interact with the surface. There are two exceptions to this rule. One is when utilizing polarized neutrons in the investigation of systems with a net magnetization, where a rough surface can cause the variations in the local
MAGNETIC NEUTRON SCATTERING
magnetic field that depolarize the neutrons. In such cases, the surface may need to be polished. The second case is for neutron reflectometry, where small-angle mirror reflection is used to directly explore the magnetization profile of the surface and interfacial layers. In this case, the samples need to be optically flat over the full size of the sample (typically 1 to 10 cm2), and of course the quality of the film in terms of, e.g., surface roughness and epitaxial quality becomes part of the investigation. For magnetic single crystal diffraction, typical sample sizes should be no more than a few mm3 in order to avoid primary and secondary extinction effects; if the sample is too large, extinction will always limit the accuracy of the data obtained. Since the magnetic Bragg intensity is proportional to the square of the ordered moment, any extinction effects will depend on the value of hmz i at a particular temperature and field, while the ideal size of the sample can vary by three to four orders of magnitude, depending on the value of the saturation moment. Powder diffraction requires gram-sized samples, generally in the range of 1 to 20 g. The statistical quality of the data that can be obtained is directly related to the size, so in principle there is a direct trade-off between sample size and the time required to collect a set of diffraction data at a particular temperature or magnetic field. Sample sizes smaller than 1 g can also be measured, but since the neutron facilities are heavily oversubscribed, in practice a smaller sample size translates into fewer data sets, so an adequate-sized sample is highly desirable. The cross-sections for inelastic scattering are typically two to three orders of magnitude smaller than elastic cross-sections, and therefore crystals for inelastic scattering typically must be correspondingly larger in comparison with single-crystal diffraction. 
Consequently, for both powder diffraction and inelastic scattering, the general rule is the bigger the sample the better the data. The exception is if one or more of the elements in a material has a substantial nuclear absorption cross-section, in which case the optimal size of the sample is then determined by the inverse of this absorption length. The absorption cross-sections are generally directly proportional to the wavelength of the neutrons being employed for a particular measurement, and the optimal sample size then depends on the details of the spectrometer and the neutron wavelength(s). For the absorption cross-sections of all the elements, see Internet Resources. In some cases, particular isotopes can be substituted to avoid some large absorption cross-sections, but generally these are expensive and/ or are only available in limited quantity. Therefore isotopic substitution can be employed only for a few isotopes that are relatively inexpensive, such as deuterium or boron, or in scientifically compelling cases where the cost can be justified.
DATA ANALYSIS AND INITIAL INTERPRETATION One of the powers of neutron scattering is that it is a very versatile tool and can be used to probe a wide variety of magnetic systems over enormous ranges of length scale ˚ ) and dynamical energy (10 neV to (0.1 to 10000 A
1337
1 eV). The instrumentation to provide these measurement capabilities is equally diverse, but data analysis software is generally provided for each instrument to perform the initial interpretation of the data. Moreover, the technique is usually sufficiently fundamental and the interpretation sufficiently straightforward that the initial data analysis is often all that is needed. PROBLEMS Neutron scattering is a very powerful technique, but in general it is a flux-limited, and this usually requires a careful tradeoff between instrumental resolution and signal. There can also be unwanted cross-sections that appear and contaminate the data, and so one must of course exercise care in collecting the data and be vigilant in its interpretation. If a time-of-flight spectrometer is being employed, for example, there may be frame overlap problems that can mix the fastest neutrons with the slowest, and give potentially spurious results. Similarly, if the spectrometer employs one or more single crystals to monochromate or analyze the neutron energies, then the crystal can reflect higher-order wavelengths [since Bragg’s law is nl ¼ 2dsin ðyÞ] which can give spurious peaks. Sometimes these can be suppressed with the use of filters, but more generally it is necessary to vary the spectrometer conditions and retake the data in order to identify genuine cross-sections from spurious ones. This identification process can consume considerable beam time. Another problem can be encountered in identifying the magnetic Bragg scattering in powders using the subtraction technique. If the temperature is sufficiently low, the nuclear spins may begin to align with the electronic spins, particularly if there is a large hyperfine field. Significant polarization can occur at temperatures of a few degrees kelvin and lower, and the nuclear polarization can be confused with electronic magnetic ordering. 
Another problem can be encountered if the temperature dependence of the Debye-Waller factor is significant, or if there is a structural distortion associated with the magnetic transition, such as through a magnetoelastic interaction. In either case the nuclear intensities at the two temperatures will not be identical, and thus will not cancel correctly in the subtraction. It is up to the experimenter to decide whether this is a problem within the statistical precision of the data. A problem can also occur if there is significant thermal expansion, whereby the diffraction peaks shift position with temperature; by significant, we mean that the shift is noticeable in comparison with the instrumental resolution employed. Again it is up to the experimenter to decide whether this is a problem. In both of these latter cases, though, a full refinement of the combined magnetic and nuclear structures in the ordered phase can be carried out. Alternatively, polarized beam techniques can be employed to separate the magnetic and nuclear cross-sections unambiguously. Finally, if one uses the variant of the subtraction technique in which a magnetic field is applied, the field can cause the powder particles to reorient if there is substantial magnetocrystalline anisotropy in the sample. This preferred orientation will remain when the field is removed; it will be evident in the nuclear peak intensities and must be taken into account.
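The subtraction technique itself amounts to differencing two powder patterns and propagating the counting statistics, which sets the precision within which the experimenter must judge whether the nuclear intensities truly cancel. A schematic sketch, assuming simple Poisson counting statistics and hypothetical toy count data:

```python
import math

def subtract_patterns(counts_low_T, counts_high_T):
    """Estimate the magnetic Bragg intensity by subtracting a pattern taken
    above the ordering temperature from one taken below it.  For Poisson
    counting statistics the uncertainties add in quadrature, so
    sigma = sqrt(N_low + N_high) at each point."""
    diff = [lo - hi for lo, hi in zip(counts_low_T, counts_high_T)]
    sigma = [math.sqrt(lo + hi) for lo, hi in zip(counts_low_T, counts_high_T)]
    return diff, sigma

# Toy patterns (hypothetical counts): only the second point carries
# genuine magnetic intensity; the others differ only by counting noise.
low_T = [100, 450, 102]
high_T = [98, 105, 99]
diff, sigma = subtract_patterns(low_T, high_T)

# A difference peak is significant only where it exceeds several sigma.
significant = [d > 3 * s for d, s in zip(diff, sigma)]
```

Effects such as a temperature-dependent Debye-Waller factor or thermal expansion show up in this picture as residuals at the nuclear peak positions that exceed the statistical uncertainty.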
We close with a remark on the use of polarized neutrons. Highly polarized neutron beams can be produced by single-crystal diffraction from a few special magnetic materials, by reflection from magnetic mirrors, or by transmission of the beam through a few specific nuclei whose absorption cross-section is strongly spin dependent. The choices are limited, and consequently most spectrometers are not equipped with polarization capability. On instruments that do offer a polarized-beam option, one generally has to sacrifice instrumental performance in terms of both resolution and intensity. Polarization techniques are therefore typically not used routinely, but rather to answer some specific question or to make an unambiguous identification of a cross-section, most often after measurements have already been carried out with unpolarized neutrons.
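As an illustration of how polarized-beam data identify a cross-section unambiguously: with the neutron polarization parallel to the scattering vector Q, magnetic scattering appears in the spin-flip channel and nuclear coherent scattering in the non-spin-flip channel (Moon et al., 1969). The sketch below corrects for finite beam polarization with a simple symmetric leakage model, taking eps = 1/(R + 1) for flipping ratio R; the exact correction is instrument dependent, and the count values and flipping ratio used here are hypothetical:

```python
def correct_leakage(m_sf, m_nsf, flipping_ratio):
    """Correct measured spin-flip (SF) and non-spin-flip (NSF) intensities
    for imperfect beam polarization.  Model assumption: a fraction
    eps = 1/(R + 1) of each true channel leaks into the other, giving a
    2x2 mixing that is inverted analytically."""
    eps = 1.0 / (flipping_ratio + 1.0)
    det = 1.0 - 2.0 * eps
    i_sf = ((1.0 - eps) * m_sf - eps * m_nsf) / det
    i_nsf = ((1.0 - eps) * m_nsf - eps * m_sf) / det
    return i_sf, i_nsf

# Hypothetical counts with P parallel to Q and flipping ratio R = 19:
# after correction, i_sf is the magnetic cross-section and i_nsf the
# nuclear coherent cross-section.
i_sf, i_nsf = correct_leakage(120.0, 1000.0, flipping_ratio=19.0)
```

Note how much of the raw spin-flip count (120) is actually leakage from the strong nuclear channel; without the correction, a weak magnetic signal could easily be overestimated.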
ACKNOWLEDGMENTS

I would like to thank my colleagues, Julie Borchers, Qing Huang, Yumi Ijiri, Nick Rosov, Tony Santoro, and Lida Vasiliu-Doloc, for their assistance in the preparation of this unit. There are vast numbers of studies in the literature on the various aspects presented here. The specific examples used for illustrative purposes were chosen primarily for the author's convenience in obtaining the figures and his familiarity with the work.
LITERATURE CITED

Bacon, G. E. 1975. Neutron Diffraction, 3rd ed. Oxford University Press, Oxford.

Balcar, E. and Lovesey, S. W. 1989. Theory of Magnetic Neutron and Photon Scattering. Oxford University Press, New York.

Blume, M. 1961. Orbital contribution to the magnetic form factor of Ni++. Phys. Rev. 124:96–103.

Gotaas, J. A., Lynn, J. W., Shelton, R. N., Klavins, P., and Braun, H. F. 1987. Suppression of the superconductivity by antiferromagnetism in Tm2Fe3Si5. Phys. Rev. B 36:7277–7280.

Huang, Q., Karen, P., Karen, V. L., Kjekshus, A., Lynn, J. W., Mighell, A. D., Rosov, N., and Santoro, A. 1992. Neutron powder diffraction study of the nuclear and magnetic structures of YBa2Cu3O8. Phys. Rev. B 45:9611–9619.

Ijiri, Y., Borchers, J. A., Erwin, R. W., Lee, S.-H., van der Zaag, P. J., and Wolf, R. M. 1998. Perpendicular coupling in exchange-biased Fe3O4/CoO superlattices. Phys. Rev. Lett. 80:608–611.

Lovesey, S. W. 1984. Theory of Neutron Scattering from Condensed Matter, Vol. 2. Oxford University Press, New York.

Lynn, J. W., Clinton, T. W., Li, W.-H., Erwin, R. W., Liu, J. Z., Vandervoort, K., and Shelton, R. N. 1989. 2D and 3D magnetic order of Er in ErBa2Cu3O7. Phys. Rev. Lett. 63:2606–2610.

Lynn, J. W., Rosov, N., and Fish, G. 1993. Polarization analysis of the magnetic excitations in Invar and non-Invar amorphous ferromagnets. J. Appl. Phys. 73:5369–5371.

Lynn, J. W., Shirane, G., and Blume, M. 1976. Covalency effects in the magnetic form factor of Ir in K2IrCl6. Phys. Rev. Lett. 37:154–157.

Lynn, J. W., Vasiliu-Doloc, L., and Subramanian, M. 1998. Spin dynamics of the magnetoresistive pyrochlore Tl2Mn2O7. Phys. Rev. Lett. 80:4582–4586.

Moon, R. M., Riste, T., and Koehler, W. C. 1969. Polarization analysis of thermal neutron scattering. Phys. Rev. 181:920–931.

Natali, S. I., Huang, Q., Lynn, J. W., Rosov, N., Karen, P., Kjekshus, A., Karen, V. L., Mighell, A. D., and Santoro, A. 1994. Neutron powder diffraction study of the nuclear and magnetic structures of the substitutional compounds (Y1−xCax)Ba2Fe3O8+δ. Phys. Rev. B 49:3465–3472.

Onsager, L. 1944. Crystal statistics. I. A two-dimensional model with an order-disorder transition. Phys. Rev. 65:117–149.

Price, D. L. and Sköld, K. 1987. Methods of Experimental Physics: Neutron Scattering. Academic Press, Orlando, Fla.

Squires, G. L. 1978. Thermal Neutron Scattering. Cambridge University Press, New York.

Trammell, G. T. 1953. Magnetic scattering of neutrons from rare earth ions. Phys. Rev. 92:1387–1393.

Vasiliu-Doloc, L., Lynn, J. W., Moudden, A. H., de Leon-Guevara, A. M., and Revcolevschi, A. 1998. Structure and spin dynamics of La0.85Sr0.15MnO3. Phys. Rev. B 58:14913–14921.

Williams, G. W. 1988. Polarized Neutrons. Oxford University Press, New York.

Young, R. A. 1993. The Rietveld Method. Oxford University Press, Oxford.

Zhang, H., Lynn, J. W., Li, W-H., Clinton, T. W., and Morris, D. E. 1990. Two- and three-dimensional magnetic order of the rare earth ions in RBa2Cu4O8. Phys. Rev. B 41:11229–11236.

KEY REFERENCES

Bacon, 1975. See above.
This text is more for the experimentalist, treating experimental procedures and the practicalities of taking and analyzing data. It does not contain some of the newest techniques, but has most of the fundamentals. It is also rich in the history of many of the techniques.

Balcar and Lovesey, 1989. See above.
More recent work that specifically addresses the theory for the case of magnetic neutron as well as x-ray scattering.

Lovesey, 1984. See above.
This text treats the theory of magnetic neutron scattering in depth. Vol. 1 covers nuclear scattering.

Moon et al., 1969. See above.
This is the classic article that describes the triple-axis polarized beam technique, with examples of all the fundamental measurements that can be made with polarized neutrons. Very readable.

Price and Sköld, 1987. See above.
A recent compendium that covers a variety of topics in neutron scattering, in the form of parts by various experts.

Squires, 1978. See above.
This book is more of a graduate introductory text to the subject of neutron scattering.

Williams, 1988. See above.
This textbook focuses on the use of polarized neutrons, with all the details.

Young, 1993. See above.
This text details the profile refinement technique for powder diffraction.

INTERNET RESOURCES

Magnetic Form Factors
http://papillon.phy.bnl.gov/form.html
Numerical values of the free-ion magnetic form factors.

http://www.ncnr.nist.gov/resources/n-lengths/
Values of the coherent nuclear scattering amplitudes and other nuclear cross-sections.

J. W. LYNN
University of Maryland
College Park, Maryland
INDEX Ab initio theory computational analysis, 72–74 electronic structure analysis local density approximation (LDA) þ U theory, 87 phase diagram prediction, 101–102 metal alloy magnetism, first-principles calculations, paramagnetic Fe, Ni, and Co, 189 molecular dynamics (MD) simulation, surface phenomena, 156, 158 neutron powder diffraction, power data sources, 1297–1298 phonon analysis, 1324–1325 x-ray powder diffraction, structure determination, 845 Abrasion testing, basic principles, 317 Absolute zero, thermodynamic temperature scale, 32 Absorption coefficient, x-ray absorption fine structure (XAFS) spectroscopy, 875 Acceleration, tribological testing, 333 Accidental channeling, nuclear reaction analysis (NRA) and proton-induced gamma ray emission (PIGE) and, 1207–1208 Accuracy calculations energy-dispersive spectrometry (EDS), standardless analysis, 1150 metal alloy bonding, 139–141 ‘‘Acheson’’ graphite, combustion calorimetry, 372 Achromatic lenses, optical microscopy, 670 Acoustic analysis. See also Scanning acoustic microscopy (SAM) impulsive stimulated thermal scattering (ISTS) vs., 744–749 Adhesive mechanisms, tribological and wear testing, 324–325 Adsorbate materials, scanning tunneling microscopy (STM) analysis, 1115 Adsorption cryopump operation, 9–10 vacuum system principles, outgassing, 2–3 Aerial image, optical microscopy, 668 ALCHEMI (atom location by channeling electron microscopy), diffuse intensities, metal alloys, concentration waves, multicomponent alloys, 259–260 ALGOL code, neutron powder diffraction, Rietveld refinements, 1306–1307 Alignment protocols Auger electron spectroscopy (AES), specimen alignment, 1161 ellipsometry polarizer/analyzer, 738–739 PQSA/PCSA arrangements, 738–739 impulsive stimulated thermal scattering (ISTS), 751–752
liquid surface x-ray diffraction, instrumetation criteria, 1037–1038 Optical alignment, Raman spectroscopy of solids, 706 surface x-ray diffraction beamline alignment, 1019–1020 diffractometer alignment, 1020–1021 laser alignment, 1014 sample crystallographic alignment, 1014–1015 ultraviolet photoelectron spectroscopy (UPS), 729–730 Alkanes, liquid surface x-ray diffraction, surface crystallization, 1043 All-electron methods, metal alloy bonding, precision measurements, self-consistency, 141–142 Allen-Cahn equation, microstructural evolution, 122 All-metal flange seal, configuration, 18 Alloys. See Metal alloys Alternating current (AC) losses, superconductors, electrical transport measurement alternatives to, 473 applications, 472 Alternating-gradient magnetometer (AGM), principles and applications, 534–535 Altitude, mass, weight, and density definitions, 25–26 Aluminum metallographic analysis deformed high-purity aluminum, 69 7075-T6 aluminum alloys, microstructural evaluation, 68–69 vacuum system construction, 17 weight standards, 26–27 Aluminum-lithium alloys, phase diagram prediction, 108–110 Aluminum-nickel alloys, phase diagram prediction, 107 American Society for Testing and Materials (ASTM) balance classification, 27–28 fracture toughness testing crack extension measurement, 308 crack tip opening displacement (CTOD) (d), 307 standards for, 302 sample preparation protocols, 287 tensile test principles, 284–285 weight standards, 26–27 Amperometric measurements, electrochemical profiling, 579–580 Amplitude reduction, x-ray absorption fine structure (XAFS) spectroscopy, single scattering picture, 871–872
Analog-to-digital converter (ADC) heavy-ion backscattering spectrometry (HIBS), 1279 ion-beam analysis (IBA) experiments, ERD/ RBS techniques, 1185–1186 polarization-modulation ellipsometer, 739 superconductors, electrical transport measurements, voltmeter properties, 477 Analytical chemistry, combustion calorimetry, 377–378 Analytical electron microscope (AEM), energydispersive spectrometry (EDS), automation, 1141 Analyzers ellipsometry, 738 relative phase/amplitude calculations, 739–740 trace element accelerator mass spectrometry (TEAMS), magnetic/electrostatic analyzer calibration, 1241–1242 x-ray photoelectron spectroscopy (XPS), 980–982 ANA software, surface x-ray diffraction data analysis, 1024 lineshape analysis, 1019 Anastigmats lenses, optical microscopy, 669 Angle calculations liquid surface x-ray diffraction, instrumetation criteria, 1037–1038 surface x-ray diffraction, 1021 Angle-dependent tensors, resonant scattering analysis, 909–910 Angle-dispersive constant-wavelength diffractometer, neutron powder diffraction, 1289–1290 Angle-resolved x-ray photoelectron spectroscopy (ARXPS) applications, 986–988 silicon example, 997–998 Angular dependence, resonant scattering, 910–912 tensors, 916 Angular efficiency factor, x-ray photoelectron spectroscopy (XPS), elemental composition analysis, 984 Anisotropy metal alloy magnetism, 191–193 MAE calculations, 192 magnetocrystalline Co-Pt alloys, 199–200 pure Fe, Ni, and Co, 193 permanent magnets, 497–499 Annular detectors, scanning transmission electron microscopy (STEM), 1091 Anomalous transmission resonant magnetic x-ray scattering, ferromagnets, 925 two-beam diffraction, 231–232
Antiferromagnetism basic principles, 494 magnetic neutron scattering, 1335 magnetic x-ray scattering nonresonant scattering, 928–930 resonant scattering, 930–932 neutron powder diffraction, 1288–1289 principles and equations, 524 Antiphase domain boundary (APB), coherent ordered precipitates, microstructural evolution, 123 Antiphase domains (APDs), coherent ordered precipitates, microstructural evolution, 122–123 Antishielding factors, nuclear quadrupole resonance (NQR), 777–778 Anti-Stokes scattering, Raman spectroscopy of solids, semiclassical physics, 701–702 Aperture diffraction scanning transmission electron microscopy (STEM) phase-contrast illumination vs., 1095–1097 probe configuration, 1097–1098 transmission electron microscopy (TEM), 1078 Archard equation, wear testing, 334 Areal density heavy-ion backscattering spectrometry (HIBS), 1281 ion-beam analysis (IBA), ERD/RBS equations, 1188–1189 Argon hangup, cryopump operation, 10 Array detector spectrometers, ultraviolet/visible absorption (UV-VIS) spectroscopy, 693 Arrhenius behavior chemical vapor deposition (CVD) model, gasphase chemistry, 169 deep level transient spectroscopy (DLTS), semiconductor materials, 423 data analysis, 425 thermal analysis and, 343 thermogravimetric (TG) analysis, 346–347 kinetic theory, 352–354 Arrott plots, Landau magnetic phase transition, 529–530 Assembly procedures, vacuum systems, 19 Associated sets, lattice alloys, x-ray diffraction, local atomic correlation, 217 Astigmatism, transmission electron microscopy (TEM), 1079 Asymmetric units crystallography, space groups, 47–50 diffuse scattering techniques, 889 Asymmetry parameter neutron powder diffraction, axial divergence peak asymmetry, 1292–1293 nuclear quadrupole resonance (NQR) nutation nuclear resonance spectroscopy, 783–784 zero-field energy leevels, 777–778 Atmospheric characteristics, gas analysis, simultaneous thermogravimetry (TG)differential thermal analysis (TG-DTA), 394 Atomic absorption spectroscopy (AAS), 
particleinduced x-ray emission (PIXE) and, 1211 Atomic concentrations, x-ray photoelectron spectroscopy (XPS) composition analysis, 994–996 elemental composition analysis, 984–986 Atomic emission spectroscopy (AES), particleinduced x-ray emission (PIXE) and, 1211 Atomic energy levels, ultraviolet photoelectron spectroscopy (UPS), 727–728 Atomic force microscopy (AFM) liquid surfaces and monomolecular layers, 1028
low-energy electron diffraction (LEED), 1121 magnetic domain structure measurements, magnetic force microscopy (MFM), 549–550 scanning electrochemical microscopy (SECM) and, 637 scanning transmission electron microscopy (STEM) vs., 1093 scanning tunneling microscopy (STM) vs., 1112 surface x-ray diffraction and, 1007–1008 Atomic layer construction, low-energy electron diffraction (LEED), 1134–1135 Atomic magnetic moments local moment origins, 513–515 magnetism, general principles, 491–492 Atomic probe field ion microscopy (APFIM), transmission electron microscopy (TEM) and, 1064 Atomic resolution spectroscopy (ARS), scanning transmission electron microscopy (STEM), Z-contrast imaging, 1103–1104 Atomic short-range ordering (ASRO) principles diffuse intensities, metal alloys, 256 basic definitions, 252–254 competitive strategies, 254–255 concentration waves density-functional approach, 260–262 first-principles, electronic-structure calculations, 263–266 multicomponent alloys, 257–260 effective cluster interactions (ECI), hightemperature experiments, 255–256 hybridization in NiPt alloys, charge correlation effects, 266–268 magnetic coupling and chemical order, 268 mean-field results, 262 mean-field theory, improvement on, 262–267 multicomponent alloys, Fermi-surface nesting and van Hove singularities, 269–270 pair-correlation functions, 256–257 sum rules and mean-field errors, 257 temperature-dependent shifts, 273 magnetism, metal alloys, 190–191 gold-rich AuFe alloys, 198–199 iron-vanadium alloys, 196–198 Ni-Fe alloys, energetics and electronic origins, 193–196 Atomic sphere approximation (ASA) linear muffin tin orbital method (LMTO), electronic topological transitions, van Hove singularities in CuPt, 272–273 metal alloy bonding, precision calculations, 143 phase diagram prediction, 101–102 Atomic structure surface x-ray diffraction, 1010–1011 ultraviolet photoelectron spectroscopy (UPS), symmetry characterization, photoemission process, 726–727 x-ray powder 
diffraction, candidate atom position search, 840–841 Atomistic Monte Carlo method, microstructural evolution, 117 Atomistic simulation, microstructural evolution modeling and, 128–129 Auger electron spectroscopy (AES) automation, 1167 basic principles, 1159–1160 chemical effects, 1162–1163 competitive and related techniques, 1158–1159 data analysis and interpretation, 1167–1168 depth profiling, 1165–1167 fluorescence and diffraction analysis, 940 instrumentation criteria, 1160–1161 ion-beam analysis (IBA) vs., 1181 ion-excitation peaks, 1171–1173 ionization loss peaks, 1171
limitations, 1171–1173 line scan properties, 1163–1164 low-energy electron diffraction (LEED), sample preparation, 1125–1126 mapping protocols, 1164–1165 nuclear reaction analysis (NRA) and protoninduced gamma ray emission (PIGE) and, 1201–1202 plasmon loss peaks, 1171 qualitative analysis, 1161–1162 quantitative analysis, 1169 research background, 1157–1159 sample preparation, 1170 scanning electron microscopy (SEM) and, 1051–1052 scanning tunneling microscopy (STM) and basic principles, 1113 sample preparation, 1117 sensitivity and detectability, 1163 specimen modification, 1170–1171 alignment protocols, 1161 spectra categories, 1161 standards sources, 1169–1170, 1174 x-ray absorption fine structure (XAFS), detection methods, 876–877 x-ray photoelectron spectroscopy (XPS), 978–980 comparisons, 971 final-state effects, 975–976 kinetic energy principles, 972 sample charging, 1000–1001 Auger recombination, carrier lifetime measurement, 431 free carrier absorption (FCA), 441–442 Automated procedures Auger electron spectroscopy (AES), 1167 bulk measurements, 404 carrier lifetime measurement free carrier absorption (FCA), 441 photoconductivity (PC), 445–446 photoluminescence (PL), 451–452 deep level transient spectroscopy (DLTS), semiconductor materials, 423–425 differential thermal analysis (DTA)/differential scanning calorimetry (DSC), 370 diffuse scattering techniques, 898–899 electrochemical quartz crystal microbalance (EQCM), 659–660 electron paramagnetic resonance (EPR), 798 ellipsometry, 739 energy-dispersive spectrometry (EDS), 1141 gas analysis, simultaneous techniques, 399 hardness test equipment, 319 heavy-ion backscattering spectrometry (HIBS), 1280 impulsive stimulated thermal scattering (ISTS), 753 low-energy electron diffraction (LEED), 1127 magnetic domain structure measurements holography, 554–555 Lorentz transmission electron microscopy, 552 spin polarized low-energy electron microscopy (SPLEEM), 557 x-ray magnetic circular dichroism (XMCD), 555–556 
magnetic neutron scattering, 1336 magnetometry, 537 limitations, 538 magnetotransport in metal alloys, 565–566 medium-energy backscattering, 1269 micro-particle-induced x-ray emission (MicroPIXE), 1216 nuclear reaction analysis (NRA) and protoninduced gamma ray emission (PIGE), 1205–1206
INDEX particle-induced x-ray emission (PIXE), 1216 phonon analysis, 1323 photoluminescence (PL) spectroscopy, 686 scanning electron microscopy (SEM), 1057–1058 scanning tunneling microscopy (STM) analysis, 1115–1116 semiconductor materials, Hall effect, 414 single-crystal x-ray structure determination, 860 superconductors, electrical transport measurements, 480–481 surface magneto-optic Kerr effect (SMOKE), 573 surface measurements, 406 test machine design conditions, 286–287 thermal diffusivity, laser flash technique, 387 thermogravimetric (TG) analysis, 356 thermomagnetic analysis, 541–544 trace element accelerator mass spectrometry (TEAMS), 1247 transmission electron microscopy (TEM), 1080 tribological and wear testing, 333–334 ultraviolet photoelectron spectroscopy (UPS), 731–732 ultraviolet/visible absorption (UV-VIS) spectroscopy, 693 x-ray absorption fine structure (XAFS) spectroscopy, 878 x-ray magnetic circular dichroism (XMCD), 963 x-ray microfluorescence/microdiffraction, 949 x-ray photoelectron spectroscopy (XPS), 989 Automatic recording beam microbalance, thermogravimetric (TG) analysis, 347–350 Avalanche mechanisms, pn junction characterization, 469 AVE averaging software, surface x-ray diffraction, data analysis, 1025 Avrami-Erofeev equation, time-dependent neutron powder diffraction, 1299–1300 Axial dark-field imaging, transmission electron microscopy (TEM), 1071 Axial divergence, neutron powder diffraction, peak asymmetry, 1292–1293 Background properties diffuse scattering techniques, inelastic scattering backgrounds, 890–893 fluorescence analysis, 944 heavy-ion backscattering spectrometry (HIBS), 1281 single-crystal neutron diffraction, 1314–1315 x-ray absorption fine structure (XAFS) spectroscopy, 878–879 x-ray photoelectron spectroscopy (XPS) data analysis, 990–991 survey spectrum, 974 Background radiation energy-dispersive spectrometry (EDS), filtering algorithgm, 1143–1144 liquid surface x-ray diffraction, error sources, 1045 nuclear reaction 
analysis (NRA) and protoninduced gamma ray emission (PIGE) and, 1205 Back-projection imaging sequence, magnetic resonance imaging (MRI), 768 Backscattered electrons (BSEs), scanning electron microscopy (SEM) data analysis, 1058 detector criteria and contrast images, 1055– 1056 signal generation, 1051–1052 Backscatter loss correction Auger electron spectroscopy (AES), quantitative analysis, 1169 energy-dispersive spectrometry (EDS)
standardless analysis, 1149 stray radiation, 1155 Backscatter particle filtering ion-beam analysis (IBA), ERD/RBS examples, 1194–1197 medium-energy backscattering, 1261–1262 data analysis protocols, 1269–1270 nuclear reaction analysis (NRA)/protoninduced gamma ray emission (PIGE), particle filtering, 1204 x-ray absorption fine structure (XAFS) spectroscopy, 871–872 Bain transformation, phase diagram prediction, static displacive interactions, 105–106 Bakeout procedure, surface x-ray diffraction, ultrahigh-vacuum (UHV) systems, 1023– 1024 Balance design and condition classification of balances, 27–28 mass measurement techniques, mass, weight, and density definitions, 25–26 Ball indenter, Rockwell hardness values, 323 ‘‘Ballistic transport reaction’’ model, chemical vapor deposition (CVD) free molecular transport, 170 software tools, 174 Band energy only (BEO) theory, diffuse intensities, metal alloys, concentration waves, first-principles calcuations, electronic structure, 264–266 Band gap measurements, semiconductor materials, semiconductor-liquid interfaces, 613–614 Bandpass thermometer, operating principles, 37 Band theory of solids magnetic moments, 515–516 metal alloy magnetism, electronic structure, 184–185 transition metal magnetic ground state, itinerant magnetism at zero temperature, 182–183 Band-to-band recombination, photoluminescence (PL) spectroscopy, 684 Baseline determination, superconductors, electrical transport measurement, 481–482 Basis sets, metal alloy bonding, precision calculations, 142–143 Bath cooling, superconductors, electrical transport measurements, 478 Bayard-Alpert gauge, operating principles, 14–15 Beamline components liquid surface x-ray diffraction, alignment protocols, 1037–1038 magnetic x-ray scattering, 925–927 harmonic contamination, 935 surface x-ray diffraction, alignment protocols, 1019–1020 Bearings, turbomolecular pumps, 7–8 Beer-Lambert generation function laser spot scanning (LSS), semiconductor-liquid interface, 626–628 
ultraviolet/visible absorption (UV-VIS) spectroscopy, quantitative analysis, 690– 691 Beer’s law, photoconductivity (PC), carrier lifetime measurement, 444–445 Bellows-sealed feedthroughs, ultrahigh vacuum (UHV) systems, 19 Bending magnetic devices magnetic x-ray scattering, beamline properties, 926–927 x-ray magnetic circular dichroism (XMCD), 957–959 Benzoic acid, combustion calorimetry, 372 Bessel function, small-angle scattering (SAS), 220–221
Bethe approximation energy-dispersive spectrometry (EDS), standardless analysis, 1148–1149 ion beam analysis (IBA), 1176 multiple-beam diffraction, 239 Bethe-Slater curve, ferromagnetism, 525–526 Bias conditions, deep level transient spectroscopy (DLTS), semiconductor materials, 421–422 Bidimensional rotating frame NQRI, emergence of, 787–788 Bilayer materials, angle-resolved x-ray photoelectron spectroscopy (ARXPS), 987–988 Bimolecular reaction rates, chemical vapor deposition (CVD) model, gas-phase chemistry, 169 Binary collisions, particle scattering, 51 general formula, 52 Binary/multicomponent diffusion applications, 155 dependent concentration variable, 151 frames of reference and diffusion coefficients, 147–150 Kirkendall effect and vacancy wind, 149 lattice-fixed frame of reference, 148 multicomponent alloys, 150–151 number-fixed frame of reference, 147–148 tracer diffusion and self-diffusion, 149–150 transformation between, 148–149 volume-fixed frame of reference, 148 linear laws, 146–147 Fick’s law, 146–147 mobility, 147 multicomponent alloys Fick-Onsager law, 150 frames of reference, 150–151 research background, 145–146 substitutional and interstitial metallic systems, 152–155 B2 intermetallics, chemical order, 154–155 frame of reference and concentration variables, 152 interdiffusion, 155 magnetic order, 153–154 mobilities and diffusivities, 152–153 temperature and concentration dependence of mobilities, 153 Binary phase information, differential thermal analysis (DTA)/differential scanning calorimetry (DSC), 368 Binding energy Auger electron spectroscopy (AES), 1159–1160 x-ray photoelectron spectroscopy (XPS) chemical state information, 986 final-state effects, 975–976 B2 intermetallics, binary/multicomponent diffusion, 154–155 Biological materials scanning electrochemical microscopy (SECM), 644 scanning tunneling microscopy (STM) analysis, 1115 Biot-Savart law, electromagnet structure and properties, 499–500 Bipolar junction transistors (BJTs), 
characterization basic principles, 467–469 competitive and complementary techniques, 466–467 limitations of, 471 measurement equipment sources and selection criteria, 471 protocols and procedures, 469–470 research background, 466–467 sample preparation, 470–471
Bipolar magnetic field gradient, magnetic resonance imaging (MRI), flow imaging sequence, 769–770 Bipotentiostat instrumentation, scanning electrochemical microscopy (SECM), 642– 643 Birefringent polarizer, ellipsometry, 738 Bitter pattern imaging, magnetic domain structure measurements, 545–547 Black-body law, radiation thermometer, 36–37 ‘‘Bleach-out’’ effect, nuclear quadrupole resonance (NQR), spin relaxation, 780 Bloch wave vector diffuse intensities, metal alloys, first-principles calcuations, electronic structure, 265–266 dynamical diffraction, 228 scanning transmission electron microscopy (STEM) dynamical diffraction, 1101–1103 strain contrast imaging, 1106–1108 solid-solution alloy magnetism, 183–184 Bode plots corrosion quantification, electrochemical impedance spectroscopy (EIS), 599–603 semiconductor-liquid interfaces, differential capacitance measurements, 617–619 Body-centered cubic (bcc) cells iron (Fe) magnetism local moment fluctuation, 187–188 Mo¨ ssbauer spectroscopy, 828–830 x-ray diffraction, structure-factor calculations, 209 Bohr magnetons magnetic fields, 496 magnetic moments, atomic and ionic magnetism, local moment origins, 513–515 paramagnetism, 493–494 transition metal magnetic ground state, itinerant magnetism at zero temperature, 182–183 Bohr velocity, ion beam analysis (IBA), 1175– 1176 Boltzmann constant chemical vapor deposition (CVD) model gas-phase chemistry, 169 kinetic theory, 171 liquid surface x-ray diffraction, simple liquids, 1040–1041 phonon analysis, 1318 pn junction characterization, 467–468 thermal diffuse scattering (TDS), 213 Boltzmann distribution function Hall effect, semiconductor materials, 412 quantum paramagnetiic response, 521–522 semiconductor-liquid interface, dark currentpotential characteristics, 607 Bond distances diffuse scattering techniques, 885 single-crystal x-ray structure determination, 861–863 Bonding-antibonding Friedel’s d-band energetics, 135 metal alloys, transition metals, 136–137 Bonding 
principles metal alloys limits and pitfalls, 138–144 accuracy limits, 139–141 all-electron vs. pseudopotential methods, 141–142 basis sets, 142–143 first principles vs. tight binding, 141 full potentials, 143 precision issues, 141–144 self-consistency, 141 structural relaxations, 143–144 phase formation, 135–138
bonding-antibonding effects, 136–137 charge transfer and electronegativities, 137–138 Friedel’s d-band energetics, 135 size effects, 137 topologically close-packed phases, 135–136 transition metal crystal structures, 135 wave function character, 138 research background, 134–135 surface phenomena, molecular dynamics (MD) simulation, 156–157 x-ray photoelectron spectroscopy (XPS), initialstate effects, 976–978 Boonton 72B meter, capacitance-voltage (C-V) characterization, 460–461 Borie-Sparks (BS) technique, diffuse scattering techniques, 886–889 data interpretation, 895–897 Born approximation distorted-wave Born approximation grazing-incidence diffraction (GID), 244–245 surface x-ray diffraction measurements, 1011 liquid surface x-ray diffraction non-specular scattering, 1033–1034 reflectivity measurements, 1032–1033 multiple-beam diffraction, second-order approximation, 238–240 Born-Oppenheimer approximation electronic structure computational analysis, 74–75 metal alloy bonding, accuracy calculations, 140–141 Born-von Ka´ rma´ n model, phonon analysis, 1319– 1323 data analysis, 1324–1325 Borosilicate glass, vacuum system construction, 17 Boundary conditions dynamical diffraction, basic principles, 229 impulsive stimulated thermal scattering (ISTS) analysis, 754–756 multiple-beam diffraction, NBEAM theory, 238 two-beam diffraction diffracted intensities, 233 dispersion surface, 230–231 Bound excitons, photoluminescence (PL) spectroscopy, 682 Boxcar technique, deep level transient spectroscopy (DLTS), semiconductor materials, 424–425 Bragg-Brentano diffractometer, x-ray powder diffraction, 838–839 Bragg-Fresnel optics, x-ray microprobes, 946–947 Bragg reflection principles diffuse intensities, metal alloys, atomic shortrange ordering (ASRO) principles, 257 diffuse scattering techniques intensity computations, 901–904 research background, 882–884 dynamical diffraction applications, 225 multiple Bragg diffraction studies, 225–226 grazing-incidence diffraction (GID), 
241–242 inclined geometry, 243 ion-beam analysis (IBA), ERD/RBS equations, 1188–1189 liquid surface x-ray diffraction, instrumetation criteria, 1037–1038 local atomic order, 214–217 magnetic neutron scattering diffraction techniques, 1332–1335 inelastic scattering, 1331 magnetic diffraction, 1329–1330 polarized beam technique, 1330–1331 subtraction technque, 1330 magnetic x-ray scattering
data analysis and interpretation, 934–935 resonant antiferromagnetic scattering, 931–932 surface magnetic scattering, 932–933 multiple-beam diffraction, basic principles, 236–237 neutron powder diffraction angle-dispersive constant-wavelength diffractometer, 1290 microstrain broadening, 1294–1296 particle size effect, 1293–1294 peak shape, 1292 positional analysis, 1291–1292 probe configuration, 1286–1289 time-of-flight diffractomer, 1290–1291 nonresonant magnetic x-ray scattering, 920–921 phonon analysis, 1320 scanning transmission electron microscopy (STEM) incoherent scattering, 1099–1101 phase-contrast illumination, 1094–1097 single-crystal neutron diffraction instrumentation criteria, 1312–1313 protocols, 1311–1313 research background, 1307–1309 small-angle scattering (SAS), 219–222 surface/interface x-ray diffraction, 218–219 surface x-ray diffraction crystallographic alignment, 1014–1015 error detection, 1018–1019 thermal diffuse scattering (TDS), 211–214 transmission electron microscopy (TEM) Ewald sphere construction, 1066–1067 extinction distances, 1068 Kikuchi line origin and scattering, 1075– 1076 selected-area diffraction (SAD), 1072–1073 structure and shape factor analysis, 1065– 1066 two-beam diffraction boundary conditions and Snell’s law, 230–231 Darwin width, 232 diffracted intensities, 234–235 diffracted wave properties, 229–230 dispersion equation solution, 233 hyperboloid sheets, 230 wave field amplitude ratios, 230 x-ray birefringence, 232–233 x-ray standing wave (XWS) diffraction, 232, 235–236 x-ray absorption fine structure (XAFS) spectroscopy, energy resolution, 877 x-ray diffraction, 209 X-ray microdiffraction analysis, 945 x-ray powder diffraction, 837–838 crystal lattice and space group determination, 840 data analysis and interpretation, 839–840 error detection, 843 integrated intensities, 840 Brale indenter, Rockwell hardness values, 323 Bravais lattices diffuse intensities, metal alloys, concentration waves, 258–260 Raman active 
vibrational modes, 709–710, 720–721 single-crystal x-ray structure determination, crystal symmetry, 854–856 space groups, 46–50 transmission electron microscopy (TEM), structure and shape factor analysis, 1065–1066 Breakdown behavior, pn junction characterization, 469
INDEX
Breit-Wigner equation, neutron powder diffraction, probe configuration, 1287–1289 Bremsstrahlung isochromat spectroscopy (BIS) ultraviolet photoelectron spectroscopy (UPS) and, 723 x-ray photoelectron spectroscopy (XPS), 979–980 Brewster angle microscopy (BAM) carrier lifetime measurement, free carrier absorption (FCA), 441 liquid surfaces and monomolecular layers, 1028 liquid surface x-ray diffraction Langmuir monolayers, 1043 p-polarized x-ray beam, 1047 Bright-field imaging reflected-light optical microscopy, 676–680 transmission electron microscopy (TEM) complementary dark-field and selected-area diffraction (SAD), 1082–1084 deviation vector and parameter, 1068 protocols and procedures, 1069–1071 thickness modification, deviation parameter, 1081–1082 Brillouin zone dimension diffuse intensities, metal alloys, concentration waves, first-principles calculations, electronic structure, 265–266 magnetic neutron scattering, inelastic scattering, 1331 magnetotransport in metal alloys, magnetic field behavior, 563–565 metal alloy bonding, precision calculations, 143 quantum paramagnetic response, 521–522 ultraviolet photoelectron spectroscopy (UPS), energy band dispersion, 725 Brinell hardness testing automated methods, 319 basic principles, 317–318 data analysis and interpretation, 319–320 hardness values, 317–318, 323 limitations and errors, 322 procedures and protocols, 318–319 research background, 316–317 sample preparation, 320 specimen modification, 320–321 Brittle fracture, fracture toughness testing load-displacement curves, 304 Weibull distribution, 311 Broadband radiation thermometer, operating principles, 37 Bulk analysis secondary ion mass spectrometry (SIMS) and, 1237 trace element accelerator mass spectrometry (TEAMS) data analysis protocols, 1249–1250 impurity measurements, 1247–1249 Bulk capacitance measurements, electrochemical photocapacitance spectroscopy (EPS), semiconductor materials, 624 Bulk chemical free energy coherent ordered
precipitates, microstructural evolution, 123 metal alloy bonding, accuracy calculations, 140–141 microstructural evolution, 119–120 Bulk crystals, surface x-ray diffraction, 1008–1009 Bulk measurements automation, 404 basic principles, 403 limitations, 405 protocols and procedures, 403–404 sample preparation, 404–405 semiconductor materials, Hall effect, 416–417 Burgers vector
stress-strain analysis, yield-point phenomena, 283–284 transmission electron microscopy (TEM), defect analysis, 1085 Burial pumping, sputter-ion pump, 11 Butler-Volmer equation, corrosion quantification, Tafel technique, 593–596 Cadmium plating, metallographic analysis, measurement on 4340 steel, 67 Cagliotti coefficients, neutron powder diffraction, constant wavelength diffractometer, 1304–1305 Cahn-Hilliard (CH) diffusion equation continuum field method (CFM), 114 diffusion-controlled grain growth, 127–128 interfacial energy, 120 numerical algorithms, 129–130 microscopic field model (MFM), 116 microstructural evolution, field kinetics, 122 Calibration protocols combustion calorimetry, 379 differential thermal analysis (DTA)/differential scanning calorimetry (DSC), 366–367 diffuse scattering techniques, absolute calibration, measured intensities, 894 electron paramagnetic resonance (EPR), 797 energy-dispersive spectrometry (EDS), 1156 gas analysis, simultaneous thermogravimetry (TG)-differential thermal analysis (TG-DTA), 393 hot cathode ionization gauges, 16 magnetometry, 536 mass measurement process assurance, 28–29 trace element accelerator mass spectrometry (TEAMS) data acquisition, 1252 magnetic/electrostatic analyzer calibration, 1241–1242 transmission electron microscopy (TEM), 1088–1089 x-ray absorption fine structure (XAFS) spectroscopy, 877 Calorimetric bomb apparatus, combustion calorimetry, 375–377 Calorimetry.
See Combustion calorimetry CALPHAD group diffuse intensities, metal alloys, basic principles, 254 phase diagram predictions aluminum-lithium analysis, 108–110 aluminum-nickel analysis, 107 basic principles, 91 cluster approach, 91–92, 96–99 cluster expansion free energy, 99–101 electronic structure calculations, 101–102 ground-state analysis, 102–104 mean-field approach, 92–96 nickel-platinum analysis, 107–108 nonconfigurational thermal effects, 106–107 research background, 90–91 static displacive interactions, 104–106 Camera equation, transmission electron microscopy (TEM), selected-area diffraction (SAD), 1072–1073 Cantilever magnetometer, principles and applications, 535 Capacitance diaphragm manometers, applications and operating principles, 13 Capacitance meters, capacitance-voltage (C-V) characterization limitations of, 463–464 selection criteria, 460–461 Capacitance-voltage (C-V) characterization pn junctions, 467 research background, 401
semiconductors basic principles, 457–458 data analysis, 462–463 electrochemical profiling, 462 instrument limitations, 463–464 mercury probe contacts, 461–462 profiling equipment, 460–461 protocols and procedures, 458–460 research background, 456–457 sample preparation, 463 trapping effects, 464–465 Capillary tube electrodes, scanning electrochemical microscopy (SECM), 648–659 Carbon materials, Raman spectroscopy of solids, 712–713 Carnot efficiency, thermodynamic temperature scale, 32 Carrier decay transient extraction, carrier lifetime measurement, free carrier absorption (FCA), 441 Carrier deficit, carrier lifetime measurement, 428–429 Carrier lifetime measurement characterization techniques, 433–435 device related techniques, 435 diffusion-length-based methods, 434–435 optical techniques, 434 cyclotron resonance (CR), 805–806 free carrier absorption, 438–444 automated methods, 441 basic principles, 438–440 carrier decay transient, 441 computer interfacing, 441 data analysis and interpretation, 441–442 depth profiling, sample cross-sectioning, 443 detection electronics, 440–441 geometrical considerations, 441 lifetime analysis, 441–442 lifetime depth profiling, 441 lifetime mapping, 441–442 limitations, 443–444 probe laser selection, 440 processed wafers, metal and highly doped layer removal, 443 pump laser selection, 440 sample preparation, 442–443 virgin wafers, surface passivation, 442–443 generation lifetime, 431 photoconductivity, 444–450 basic principles, 444 data analysis, 446–447 high-frequency range, automation, 445–446 limitations, 449–450 microwave PC decay, 447–449 radio frequency PC decay, 449 sample preparation, 447 standard PC decay method, 444–445 photoluminescence, 450–453 automated procedures, 451–452 data analysis and interpretation, 452–453 deep level luminescence, 450–451 limitations, 453 near-band-gap emission, 450 photon recycling, 451 shallow impurity emission, 450 physical quantities, 433 recombination mechanisms, 429–431 selection
criteria, characterization methods, 453–454 steady-state, modulated, and transient methods, 435–438 data interpretation problems, 437 limitations, 437–438 modulation-type method, 436 pulsed-type methods, 437–438
Carrier lifetime measurement (Continued) quasi-steady-state-type method, 436–437 surface recombination and diffusion, 432–433 theoretical background, 401, 427–429 trapping techniques, 431–432 Cartesian coordinates, Raman active vibrational modes, 710 Case hardening, microindentation hardness testing, 318 Catalyst analysis, thermogravimetric (TG) analysis, gas-solid reactions, 356 Cathode burial, sputter-ion pump, 11 Cathode-ray tubes (CRT), scanning electron microscopy (SEM), 1052–1053 Cathodoluminescence (CL) technique, pn junction characterization, 466–467 Cauchy’s inequality, single-crystal x-ray structure determination, direct method computation, 865–868 Cavity resonators, microwave measurement techniques, 409 Cellular automata (CA) method, microstructural evolution, 114–115 Center-of-mass reference frame forward-recoil spectrometry, data interpretation, 1270–1271 particle scattering, kinematics, 54–55 Central-field theory, particle scattering, 57–61 cross-sections, 60–61 deflection functions, 58–60 approximation, 59–60 central potentials, 59 hard spheres, 58–59 impact parameter, 57–58 interaction potentials, 57 materials analysis, 61 shadow cones, 58 Central potentials, particle scattering, central-field theory, 57 deflection function, 59 Ceramic materials turbomolecular pumps, 7 x-ray photoelectron spectroscopy (XPS), 988–989 CFD-ACE, chemical vapor deposition (CVD) model, hydrodynamics, 174 Cgs magnetic units general principles, 492–493 magnetic vacuum permeability, 511 Chain of elementary reactions, chemical vapor deposition (CVD) model, gas-phase chemistry, 168–169 Channeling approximation, scanning transmission electron microscopy (STEM), dynamical diffraction, 1102–1103 Channel plate arrays, x-ray photoelectron spectroscopy (XPS), 982–983 Channeltrons, x-ray photoelectron spectroscopy (XPS), 982–983 Character table, vibrational Raman spectroscopy, group theoretical analysis, 718 Charge-balance equation (CBE), semiconductor materials, Hall effect, 412,
416 Charge correlation effects, diffuse intensities, metal alloys concentration waves, first-principles calculations, electronic structure, 266 hybridization in NiPt alloys and, 266–268 Charge-coupled device (CCD) low-energy electron diffraction (LEED), instrumentation and automation, 1126–1127 photoluminescence (PL) spectroscopy, 684 Raman spectroscopy of solids, dispersed radiation measurement, 707–708 single-crystal x-ray structure determination
basic principles, 850 detector components, 859–860 transmission electron microscopy (TEM) automation, 1080 phase-contrast imaging, 1096–1097 Charged particle microprobes, x-ray microprobes, 940–941 Charge-funneling effects, ion-beam-induced charge (IBIC)/single event upset (SEU) microscopy, 1228 Charge transfer cyclic voltammetry, non-reversible charge-transfer reactions, 583–584 metal alloy bonding, 137–138 semiconductor-liquid interface photoelectrochemistry, thermodynamics, 606–607 time-resolved photoluminescence spectroscopy (TRPL), 630–631 Charpy V-Notch impact testing, fracture toughness testing, 302 Chemically modified electrodes (CMEs), electrochemical profiling, 579–580 Chemical order diffuse intensities, metal alloys, magnetic effects and, 268 diffuse scattering techniques basic principles, 884–885 research background, 882–884 substitutional and interstitial metallic systems, 154–155 Chemical polishing, ellipsometry, surface preparations, 741–742 Chemical shift imaging (CSI) Auger electron spectroscopy (AES), 1162–1163 nuclear magnetic resonance basic principles, 764–765 imaging sequence, 770 x-ray photoelectron spectroscopy (XPS) applications, 986 initial-state effects, 976–978 reference spectra, 996–998 research background, 970–972 Chemical thinning, transmission electron microscopy (TEM), sample preparation, 1086 Chemical vapor deposition (CVD), simulation models basic components, 167–173 free molecular transport, 169–170 gas-phase chemistry, 168–169 hydrodynamics, 170–172 kinetic theory, 172 plasma physics, 170 radiation, 172–173 surface chemistry, 167–168 limitations, 175–176 research background, 166–167 software applications free molecular transport, 174 gas-phase chemistry, 173 hydrodynamics, 174 kinetic theory, 175 plasma physics, 174 radiation, 175 surface chemistry, 173 Chemisorption, sputter-ion pump, 10–11 CHEMKIN software, chemical vapor deposition (CVD) models limitations, 175–176 surface chemistry, 173 Chromatic aberration optical microscopy,
669–670 transmission electron microscopy (TEM), lens defects and resolution, 1078 Chromatography, evolved gas analysis (EGA) and, 396–397
Circular dichroism. See X-ray magnetic circular dichroism Circularly polarized x-ray (CPX) sources, x-ray magnetic circular dichroism (XMCD), 957–959 figure of merit, 969–970 Classical mechanics magnetic x-ray scattering, 919–920 Raman spectroscopy of solids, electromagnetic radiation, 699–701 resonant scattering analysis, 906–907 surface magneto-optic Kerr effect (SMOKE), 570–571 Classical phase equilibrium, phase diagram predictions, 91–92 Clausius-Mosotti equation, electric field gradient, nuclear couplings, nuclear quadrupole resonance (NQR), 777 Cleaning procedures, vacuum systems, 19 Clip-on gage, mechanical testing, extensometers, 285 Closed/core shell diamagnetism, dipole moments, atomic origin, 512 Cluster magnetism, spin glass materials, 516–517 Cluster variational method (CVM) diffuse intensities, metal alloys, 255 concentration waves, multicomponent alloys, 259–260 mean-field theory and, 262–263 phase diagram prediction nickel-platinum alloys, 107–108 nonconfigurational thermal effects, 106–107 phase diagram predictions, 96–99 free energy calculations, 99–100 Cmmm space group, crystallography, 47–50 Coarse-grained approximation, continuum field method (CFM), 118 Coarse-grained free energy formulation, microstructural evolution, 119 Coaxial probes, microwave measurement techniques, 409 Cobalt alloys magnetism, magnetocrystalline anisotropy energy (MAE), 193 magnetocrystalline anisotropy energy (MAE), Co-Pt alloys, 199–200 paramagnetism, first-principles calculations, 189 Cobalt/copper superlattices, surface magneto-optic Kerr effect (SMOKE), 573–574 Cockcroft-Walton particle accelerator, nuclear reaction analysis (NRA)/proton-induced gamma ray emission (PIGE), 1203–1204 Coherence analysis low-energy electron diffraction (LEED), estimation techniques, 1134 Mössbauer spectroscopy, 824–825 phonon analysis, neutron scattering, 1325–1326 scanning transmission electron microscopy (STEM) phase-contrast imaging vs., 1093–1097 research background, 1091–1092
strain contrast imaging, 1106–1108 Coherent ordered precipitates, microstructural evolution, 122–126 Coherent potential approximation (CPA) diffuse intensities, metal alloys concentration waves, first-principles calcuations, electronic structure, 264–266 hybridization in NiPt alloys, charge correlation effects, 267–268 metal alloy magnetism, magnetocrystalline anisotropy energy (MAE), 192–193 solid-solution alloy magnetism, 183–184 Cohesive energies, local density approximation (LDA), 79–81
Cold cathode ionization gauge, operating principles, 16 Collective magnetism, magnetic moments, 515 Collision diameter, particle scattering, deflection functions, hard spheres, 58–59 Combination bearings systems, turbomolecular pumps, 7 Combustion calorimetry commercial sources, 383 data analysis and interpretation, 380–381 limitations, 381–382 protocols and procedures, 375–380 analytical chemistry and, 377–378 calibration protocols, 379 experimental protocols, 377 measurement protocols, 378–379 non-oxygen gases, 380 standard state corrections, 379–380 research background, 373–374 sample preparation, 381 thermodynamic principles, 374–375 Combustion energy computation, combustion calorimetry, enthalpies of formation computation vs., 380–381 Compact-tension (CT) specimens, fracture toughness testing crack extension measurement, 309–311 load-displacement curve measurement, 308 sample preparation, 311–312 stress intensity and J-integral calculations, 314–315 Compensator, ellipsometry, 738 Compliance calculations, fracture toughness testing, 315 Compton scattering diffuse scattering techniques, inelastic scattering background removal, 891–893 x-ray absorption fine-structure (XAFS) spectroscopy, 875–877 x-ray microfluorescence, 942 Computational analysis. See also specific computational techniques, e.g.
Schrödinger equation basic principles, 71–74 electronic structure methods basic principles, 74–77 dielectric screening, 84–87 Green’s function Monte Carlo (GFMC), electronic structure methods, 88–89 GW approximation, 84–85 local-density approximation (LDA) + U, 87 Hartree-Fock theory, 77 local-density approximation (LDA), 77–84 local-density approximation (LDA) + U, 86–87 quantum Monte Carlo (QMC), 87–89 SX approximation, 85–86 variational Monte Carlo (VMC), 88 low-energy electron diffraction (LEED), 1134–1135 phase diagram prediction aluminum-lithium analysis, 108–110 aluminum-nickel analysis, 107 basic principles, 91 cluster approach, 91–92, 96–99 cluster expansion free energy, 99–101 electronic structure calculations, 101–102 ground-state analysis, 102–104 mean-field approach, 92–96 nickel-platinum analysis, 107–108 nonconfigurational thermal effects, 106–107 research background, 90–91 static displacive interactions, 104–106 research background, 71 thermogravimetric (TG) analysis, 354–355 Computer interface, carrier lifetime measurement, free carrier absorption (FCA), 441
Computer x-ray analyzer (CXA), energy-dispersive spectrometry (EDS), 1137–1140 automation, 1141 Concentration overpotentials, semiconductor materials, J-E behavior corrections, 611–612 Concentration variables, substitutional and interstitial metallic systems, 152 Concentration waves, diffuse intensities, metal alloys, 252–254 as competitive strategy, 255 density-functional theory (DFT), 260–262 first-principles, electronic-structure calculations, 263–266 multicomponent alloys, 257–260 Conductance, vacuum system design, 19–20 Conduction cooling, superconductors, electrical transport measurements, 478 Conductive substrates, Hall effect, semiconductor materials, 417 Conductivity measurements Hall effect, 411–412 research background, 401–403 theoretical background, 401 Configuration-interaction theory, electronic structure analysis, 75 Conservation laws, ultraviolet photoelectron spectroscopy (UPS), energy band dispersion, 724–725 Considère construction, nonuniform plastic deformation, 282 Constant current regime photoconductivity (PC), carrier lifetime measurement, 445 scanning electrochemical microscopy (SECM), 643 Constant field regime, photoconductivity (PC), carrier lifetime measurement, 445 Constant fraction discriminators, heavy-ion backscattering spectrometry (HIBS), 1279 Constant Q scan, phonon analysis, triple-axis spectrometry, 1322–1323 Constant wavelength diffractometer, neutron powder diffraction, optical properties, 1303–1304 Constraint factors neutron powder diffraction, 1301 static indentation hardness testing, 317 Construction materials, vacuum systems, 17 Contact materials superconductors, electrical transport measurement, 474 sample contacts, 483 superconductors, electrical transport measurements, properties, 477–478 tribological and wear testing, 324–325 Contact size, Hall effect, semiconductor materials, 417 Continuity equation, chemical vapor deposition (CVD) model, hydrodynamics, 171 Continuous-current ramp measurements, superconductors,
electrical transport measurements, 479 Continuous interfaces, liquid surface x-ray diffraction, reflectivity measurements, 1031–1033 Continuous-loop wire saw cutting, metallographic analysis, sample preparation, 65 Continuous magnetic fields, laboratory settings, 505 Continuous-wave (CW) experiments, electron paramagnetic resonance (EPR) calibration, 797 error detection, 800–802 microwave power, 796 modulation amplitude, 796–797
non-X-band frequencies, 795–796 sensitivity, 797 X-band with rectangular resonator, 794–795 Continuous wave nuclear magnetic resonance (NMR), magnetic field measurements, 510 Continuum field method (CFM), microstructural evolution applications, 122 atomistic simulation, continuum process modeling and property calculation, 128–129 basic principles, 117–118 bulk chemical free energy, 119–120 coarse-grained approximation, 118 coarse-grained free energy formulation, 119 coherent ordered precipitates, 122–126 diffuse-interface nature, 119 diffusion-controlled grain growth, 126–128 elastic energy, 120–121 external field energies, 121 field kinetic equations, 121–122 future applications, 128 interfacial energy, 120 limits of, 130 numerical algorithm efficiency, 129–130 research background, 114–115 slow variable selection, 119 theoretical basis, 118 Continuum process modeling, microstructural evolution modeling and, 128–129 Contrast images, scanning electron microscopy (SEM), 1054–1056 Control issues, tribological testing, 332–333 Controlled-rate thermal analysis (CRTA), mass loss, 345 Conventional transmission electron microscopy (CTEM). 
See Transmission electron microscopy (TEM) Convergent-beam electron diffraction (CBED), transmission electron microscopy (TEM), selected-area diffraction (SAD), 1072–1073 Cooling options and procedures, superconductors, electrical transport measurements, 478–479 Coordinate transformation, resonant scattering techniques, 916–917 Copper-nickel-zinc alloys Fermi-surface nesting and van Hove singularities, 268–270 ordering wave polarization, 270–271 Copper-platinum alloys, topological transitions, van Hove singularities, 271–273 Core-level absorption spectroscopy, ultraviolet photoelectron spectroscopy (UPS) and, 725–726 Core-level emission spectroscopy, ultraviolet photoelectron spectroscopy (UPS) and, 725–726 Core-level photoelectron spectroscopy, ultraviolet photoelectron spectroscopy (UPS) and, 725–726 Corrosion studies electrochemical quantification electrochemical impedance spectroscopy (EIS), 599–603 linear polarization, 596–599 research background, 592–593 Tafel technique, 593–596 scanning electrochemical microscopy (SECM), 643–644 thermogravimetric (TG) analysis, gas-solid reactions, 355–356 Corrosive reaction gases, thermogravimetric (TG) analysis, instrumentation and apparatus, 348
Coster-Kronig transition Auger electron spectroscopy (AES), 1159–1160 particle-induced x-ray emission (PIXE), 1211–1212 Coulomb interaction electronic structure analysis dielectric screening, 84 local density approximation (LDA) + U theory, 87 GW approximation and, 75 ion-beam analysis (IBA), ERD/RBS equations, 1187–1189 Mössbauer spectroscopy, isomer shift, 821–822 particle scattering central-field theory, cross-sections, 60–61 shadow cones, 58 Coulomb potential, metal alloy bonding, accuracy calculations, 139–141 Coulomb repulsion, metal alloy magnetism, 180 Counterflow mode, leak detection, vacuum systems, 21–22 Count rate range, energy-dispersive spectrometry (EDS) measurement protocols, 1156 optimization, 1140 Coupling potential energy coherent ordered precipitates, microstructural evolution, 125–126 magnetic x-ray scattering errors, 935–936 Cowley-Warren (CW) order parameter. See Warren-Cowley order parameter Crack driving force (G), fracture toughness testing basic principles, 303–305 defined, 302 Crack extension measurement fracture toughness testing, 308 J-integral approach, 311 SENB and CT specimens, 315 hardness testing, 320 Crack instability and velocity, load-displacement curves, fracture toughness testing, 304 Crack propagation, hardness testing, 320 Crack tip opening displacement (CTOD) (δ), fracture toughness testing, 307 errors, 313 sample preparation, 312 Critical current density (Jc) superconductors, electrical transport measurement, 472 data analysis and interpretation, 481 superconductors, electrical transport measurements, current-carrying area, 476 Critical current (Ic) superconducting magnets, stability and losses, 501–502 superconductors, electrical transport measurement, data analysis and interpretation, 481 superconductors, electrical transport measurements, 472 criteria, 474–475 current-carrying area, 476 Critical temperature (Tc) ferromagnetism, 495 superconductors, electrical transport measurement, 472 data analysis and
interpretation, 481 magnetic measurements vs., 473 Cromer-Liberman tabulation, diffuse scattering techniques, resonant scattering terms, 893–894 Cross-modulation, cyclotron resonance (CR), 812 Cross-sections diffuse scattering techniques, resonant scattering terms, 893–894
ion-beam analysis (IBA), 1181–1184 ERD/RBS equations, 1187–1189 non-Rutherford cross-sections, 1189–1190 magnetic neutron scattering, error detection, 1337–1338 medium-energy backscattering, 1261 metal alloy magnetism, atomic short range order (ASRO), 191 nonresonant antiferromagnetic scattering, 929–930 nuclear reaction analysis (NRA) and proton-induced gamma ray emission (PIGE), Q values, 1205 particle scattering, central-field theory, 60–61 phonon analysis, 1318–1323 resonant magnetic x-ray scattering, 924 superconductors, electrical transport measurement, area definition, 482 x-ray microfluorescence, 943 Crucible apparatus, thermogravimetric (TG) analysis, 351–352 Cryopumps applications, 9 operating principles and procedures, 9–10 Crystallography crystal systems, 44 lattices, 44–46 Miller indices, 45–46 metal alloy bonding, phase formation, 135 neutron powder diffraction, refinement techniques, 1300–1302 point groups, 42–43 resonant x-ray diffraction, 905 space groups, 46–50 symbols, 50 surface x-ray diffraction measurements, 1010–1011 refinement limitations, 1018 symmetry operators, 39–42 improper rotation axes, 39–40 proper rotation axes, 39 screw axes and glide planes, 40–42 symmetry principles, 39 x-ray diffraction, 208–209 Crystal structure conductivity measurements, research background, 402–403 diffuse scattering techniques, 885–889 dynamical diffraction, applications, 225 Mössbauer spectroscopy, defect analysis, 830–831 photoluminescence (PL) spectroscopy, 685–686 single-crystal neutron diffraction, quality issues, 1314–1315 single-crystal x-ray structure determination refinements, 856–858 symmetry properties, 854–856 surface x-ray diffraction, alignment protocols, 1014–1015 x-ray powder diffraction, lattice analysis, 840 Crystal truncation rods (CTR) liquid surface x-ray diffraction, non-specular scattering, 1036, 1038–1039 surface x-ray diffraction basic properties, 1009 error detection, 1018–1019 interference effects, 219 profiles, 1015
reflectivity, 1015–1016 silicon surface example, 1017–1018 surface x-ray scattering, surface roughness, 1016 Crystal structure, materials characterization, 1 CSIRO Heavy Ion Analytical Facility (HIAF), trace element accelerator mass spectrometry (TEAMS) research at, 1245
Cubic lattice alloys diffuse scattering techniques, static displacement recovery, 897–898 x-ray diffraction, local atomic correlation, 216–217 Curie law collective magnetism, 515 paramagnetism, 493–494 quantum paramagnetic response, 521–522 Curie temperature band theory of magnetism, 515–516 face-centered cubic (fcc) iron, moment alignment vs. moment formation, 194–195 ferromagnetism, 523–524 metal alloy magnetism, 181 substitutional and interstitial metallic systems, 153–154 thermogravimetric (TG) analysis, temperature measurement errors and, 358–359 Curie-Weiss law ferromagnetism, 524 paramagnetism, 493–494 thermomagnetic analysis, 543–544 Current carriers, illuminated semiconductor-liquid interface, J-E equations, 608 Current-carrying area, superconductors, electrical transport measurement, 476 Current density, semiconductor-liquid interface, 607–608 Current image tunneling spectroscopy (CITS), image acquisition, 1114 Current-potential curve, corrosion quantification, linear polarization, 597–599 Current ramp rate, superconductors, electrical transport measurements, 479 Current sharing, superconductors, electrical transport measurement, other materials and, 474 Current supply, superconductors, electrical transport measurements, 476–477 Current transfer, superconductors, electrical transport measurement, 475–476 Current-voltage (I-V) measurement technique low-energy electron diffraction (LEED) quantitative analysis, 1130–1131 quantitative measurement, 1124–1125 pn junction characterization basic principles, 467–469 competitive and complementary techniques, 466–467 limitations of, 471 measurement equipment sources and selection criteria, 471 protocols and procedures, 469–470 research background, 466–467 sample preparation, 470–471 superconductors, electrical transport measurement, baseline determination, 481–482 Cyclic voltammetry electrochemical profiling data analysis and interpretation, 586–587 electrocatalytic mechanism, 587–588 electrochemical cells, 584
electrolytes, 586 limitations, 590–591 platinum electrode example, 588–590 potentiostats and three-electrode chemical cells, 584–585 reaction reversibility, 582–584 non-reversible charge-transfer reactions, 583–584 quasireversible reaction, 583 total irreversible reaction, 583 total reversible reaction, 582–583 research background, 580–582
sample preparation, 590 specimen modification, 590 working electrodes, 585–586 feedback mode, scanning electrochemical microscopy (SECM), 638–640 Cyclotron resonance (CR) basic principles, 806–808 cross-modulation, 812 data analysis and interpretation, 813–814 far-infrared (FIR) sources, 809 Fourier transform FIR magneto-spectroscopy, 809–810 laser far infrared (FIR) magneto-spectroscopy, 810–812 limitations, 814–815 optically-detected resonance (ODR) spectroscopy, 812–813 protocols and procedures, 808–813 quantum mechanics, 808 research background, 805–806 sample preparation, 814 semiclassical Drude model, 806–807 Cylinders, small-angle scattering (SAS), 220–221 Cylindrically-focused excitation pulse, vs. impulsive stimulated thermal scattering (ISTS), 745–746 Cylindrical mirror analyzer (CMA) Auger electron spectroscopy (AES), 1160–1161 ultraviolet photoelectron spectroscopy (UPS), 729 x-ray photoelectron spectroscopy (XPS), 980–982 Dangling bonds, carrier lifetime measurement, surface passivation, free carrier absorption (FCA), 442–443 Dark current-potential characteristics, semiconductor-liquid interface, 607 Dark-field imaging reflected-light optical microscopy, 676–680 transmission electron microscopy (TEM) axial dark-field imaging, 1071 complementary bright-field and selected-area diffraction (SAD), 1082–1084 deviation vector and parameter, 1068 protocols and procedures, 1069–1071 thickness modification, deviation parameter, 1081–1082 Darwin width neutron powder diffraction, peak shapes, 1292 two-beam diffraction, 232 dispersion equation solution, 233 Data reduction, thermal diffusivity, laser flash technique, 389 DC Faraday magnetometer automation, 537 principles and applications, 534 Deadtime correction function energy-dispersive spectrometry (EDS), 1137–1140 optimization, 1141 particle-induced x-ray emission (PIXE), 1213–1216 Debye-Grüneisen framework, phase diagram prediction, nonconfigurational thermal effects, 106–107 Debye length, pn
junction characterization, 467–468 Debye-Scherrer geometry liquid surface x-ray diffraction, instrumentation criteria, 1037–1038 neutron powder diffraction axial divergence peak asymmetry, 1292–1293 quantitative phase analysis, 1297 x-ray powder diffraction, 838–839 error detection, 843
Debye temperature, semiconductor materials, Hall effect, 415–416 Debye-Waller factor diffuse intensity computation, 901–904 liquid surface x-ray diffraction, grazing incidence and rod scans, 1036 magnetic neutron scattering, 1329–1330 diffraction techniques, 1333–1335 Mössbauer spectroscopy, recoil-free fraction, 821 phonon analysis, 1320 scanning transmission electron microscopy (STEM) incoherent scattering, 1100–1101 strain contrast imaging, 1107–1108 single-crystal neutron diffraction and, 1310–1311 surface x-ray diffraction crystallographic refinement, 1018 crystal truncation rod (CTR) profiles, 1015 thermal diffuse scattering (TDS), 211–214 x-ray absorption fine structure (XAFS) spectroscopy, disordered states, 873 x-ray powder diffraction, 837–838 Decomposition temperature, thermogravimetric (TG) analysis, 351–352 stability/reactivity measurements, 355 Deep level luminescence, carrier lifetime measurement, photoluminescence (PL), 450 Deep level transient spectroscopy (DLTS) basic principles, 419–423 emission rate, 420–421 junction capacitance transient, 421–423 capacitance-voltage (C-V) analysis, comparisons, 401, 456–457 data analysis and interpretation, 425 limitations, 426 pn junction characterization, 467 procedures and automation, 423–425 research background, 418–419 sample preparation, 425–426 semiconductor defects, 418–419 Defect analysis dynamical diffraction, applications, 225 photoluminescence (PL) spectroscopy, transitions, 683–684 transmission electron microscopy (TEM) Kikuchi lines and deviation parameter, 1085 values for, 1084–1085 Deflection functions nuclear reaction analysis (NRA)/proton-induced gamma ray emission (PIGE), particle filtering, 1204 particle scattering, central-field theory, 58–60 approximation, 59–60 central potentials, 59 hard spheres, 58–59 Density definitions, 24–26 surface phenomena, molecular dynamics (MD) simulation, layer density and temperature variation, 164 Density-functional theory (DFT) diffuse intensities, metal
alloys basic principles, 254 concentration waves, 260–262 first-principles calculations, electronic structure, 263–266 limitations, 255 electronic structure analysis, local density approximation (LDA), 77–78 metal alloy bonding, accuracy calculations, 139–141 transition metal magnetic ground state, itinerant magnetism at zero temperature, 181–183
Density measurements basic principles, 24 impulsive stimulated thermal scattering (ISTS), 749 indirect techniques, 24 mass, weight, and density definitions, 24–26 mass measurement process assurance, 28–29 materials characterization, 1 weighing devices, balances, 27–28 weight standards, 26–27 Dependent concentration variable, binary/multicomponent diffusion, 151 substitutional and interstitial metallic systems, 152 Depletion effects, Hall effect, semiconductor materials, 417 Depletion widths, capacitance-voltage (C-V) characterization, 458–460 Depth of focus, optical microscopy, 670 Depth-profile analysis Auger electron spectroscopy (AES), 1165–1167 carrier lifetime measurement, free carrier absorption (FCA), 443 heavy-ion backscattering spectrometry (HIBS), 1276 ion-beam analysis (IBA), ERD/RBS techniques, 1191–1197 nuclear reaction analysis (NRA) and proton-induced gamma ray emission (PIGE) and, 1202–1203 nonresonant depth profiling, 1206 secondary ion mass spectrometry (SIMS) and, 1237 semiconductor materials, Hall effect, 415–416 surface x-ray diffraction measurements, grazing-incidence measurement, 1016–1017 trace element accelerator mass spectrometry (TEAMS) data analysis protocols, 1251–1252 impurity measurements, 1250–1251 Depth resolution ion beam analysis (IBA) ERD/RBS techniques, 1183–1184 periodic table, 1177–1178 nuclear reaction analysis (NRA)/proton-induced gamma ray emission (PIGE), equations, 1209–1210 Derivative thermal analysis, defined, 337–338 Derivative thermogravimetry (DTG) curve schematic, 345 defined, 338 kinetic theory, 352–354 procedures and protocols, 346–347 Design criteria, vacuum systems, 19–20 Detailed balance principle carrier lifetime measurement, 429 deep level transient spectroscopy (DLTS), semiconductor materials, 420–421 Detection methods Auger electron spectroscopy (AES), 1163 carrier lifetime measurement, free carrier absorption (FCA), 440–441 energy-dispersive spectrometry (EDS), limitations of, 1151–1152 ion-beam analysis
(IBA), ERD/RBS geometries, 1185–1186 trace element accelerator mass spectrometry (TEAMS), high-energy beam transport, analysis, and detection, 1239 x-ray absorption fine structure (XAFS) spectroscopy, 875–877 x-ray magnetic circular dichroism (XMCD), 959 Detector criteria diffuse scattering techniques, 894–897 energy-dispersive spectrometry (EDS), 1137–1140
INDEX
Detector criteria (Continued) solid angle protocols, 1156 standardless analysis, efficiency parameters, 1149 fluorescence analysis, 944 ion-beam analysis (IBA), ERD/RBS techniques, 1185–1186 magnetic x-ray scattering, 927–928 nuclear reaction analysis (NRA)/proton-induced gamma ray emission (PIGE), 1204 particle-induced x-ray emission (PIXE), 1212–1216 silicon detector efficiency, 1222–1223 photoluminescence (PL) spectroscopy, 684 scanning electron microscopy (SEM), 1054–1056 single-crystal neutron diffraction, 1313 x-ray photoelectron spectroscopy (XPS), 982–983 Deuteron acceleration, nuclear reaction analysis (NRA)/proton-induced gamma ray emission (PIGE), equipment criteria, 1203–1204 Deviation vector and parameter, transmission electron microscopy (TEM), 1067–1068 Kikuchi lines, defect contrast, 1085 specimen orientation, 1077–1078 specimen thickness, 1081–1082 Device-related techniques, carrier lifetime measurement, 435 Diagonal disorders, diffuse intensities, metal alloys, concentration waves, first-principles calculations, electronic structure, 266 Diamagnetism basic principles, 494 closed/core shell diamagnetism, dipole moments, atomic origin, 512 Diaphragm pumps applications, 4 operating principles, 5 Dichroic signal, x-ray magnetic circular dichroism (XMCD), 953–955 basic theory, 955–957 data analysis, 963–964 Dielectric constants, ellipsometry, 737 Dielectric screening, electronic structure analysis, 84–87 Dielectric tensor, surface magneto-optic Kerr effect (SMOKE), 570–571 Difference equations, neutron powder diffraction, stacking faults, 1296 Differential capacitance measurements, semiconductor-liquid interfaces, 616–619 Differential cross-section, particle scattering, central-field theory, 60 Differential interference contrast (DIC), reflected-light optical microscopy, 679–680 Differential ion pump, operating principles, 11 Differential scanning calorimetry (DSC).
See also Simultaneous thermogravimetry (TG)-differential thermal analysis/differential scanning calorimetry (TG-DTA/DSC) automated procedures, 370 basic principles, 363–366 data analysis and interpretation, 370–371 defined, 337–339 limitations, 372 protocols and procedures, 366–370 applications, 367–370 calibration, 366–367 instrumentation, 367 zero-line optimization, 366 research background, 362–363 sample preparation, 371 specimen modification, 371–372 thermal analysis, 339 thermogravimetric (TG) analysis, 346
Differential scattering cross-sections, ion-beam analysis (IBA), 1181–1184 Differential thermal analysis (DTA). See also Simultaneous thermogravimetry (TG)-differential thermal analysis (TG-DTA) automated procedures, 370 basic principles, 363–366 data analysis and interpretation, 370–371 defined, 337–339 limitations, 372 protocols and procedures, 366–370 applications, 367–370 calibration, 366–367 instrumentation, 367 zero-line optimization, 366 research background, 362–363 sample preparation, 371 simultaneous techniques for gas analysis, research background, 392–393 specimen modification, 371–372 thermogravimetric (TG) analysis, 346 Diffraction techniques. See also specific techniques, e.g. X-ray diffraction defined, 207 density definition and, 26 diffuse scattering and, 884 magnetic neutron scattering, 1332–1335 Mössbauer spectroscopy, 824–825 transmission electron microscopy (TEM) bright-field/dark-field imaging, 1070–1071 lattice defect diffraction contrast, 1068–1069 pattern indexing, 1073–1074 Diffractometer components liquid surface x-ray diffraction, instrumentation criteria, 1037–1038 neutron powder diffraction angle-dispersive constant-wavelength diffractometer, 1289–1290 time-of-flight diffractometer, 1290–1291 surface x-ray diffraction, 1009–1010 alignment protocols, 1020–1021 angle calculations, 1021 five-circle diffractometer, 1013–1014 Diffuse intensities metal alloys atomic short-range ordering (ASRO) principles, 256 concentration waves density-functional approach, 260–262 first-principles, electronic-structure calculations, 263–266 multicomponent alloys, 257–260 mean-field results, 262 mean-field theory, improvement on, 262–267 pair-correlation functions, 256–257 sum rules and mean-field errors, 257 competitive and related techniques, 254–255 computational principles, 252–254 data analysis and interpretation, 266–273 Cu2NiZn ordering wave polarization, 270–271 CuPt van Hove singularities, electronic topological transitions, 271–273 magnetic coupling
and chemical order, 268 multicomponent alloys, 268–270 Ni-Pt hybridization and charge correlation, 266–268 temperature-dependent shift, ASRO peaks, 273 high-temperature experiments, effective interactions, 255–256 two-beam diffraction boundary condition, 233 Bragg case, 234–235 integrated intensities, 235
Laue case, 233–234 Diffuse-interface approach, microstructural evolution, 119 Diffuse scattering techniques intensity computations, 886–889 absolute calibration, measured intensities, 894 derivation protocols, 901–904 liquid surface x-ray diffraction non-specular scattering, 1038–1039 simple liquids, 1041 surface x-ray scattering, 1016 x-ray and neutron diffuse scattering applications, 889–894 automation, 897–898 bond distances, 885 chemical order, 884–885 comparisons, 884 competitive and related techniques, 883–884 crystalline solid solutions, 885–889 data analysis and interpretation, 894–896 diffuse x-ray scattering techniques, 889–890 inelastic scattering background removal, 890–893 limitations, 898–899 measured intensity calibration, 894 protocols and procedures, 884–889 recovered static displacements, 896–897 research background, 882–884 resonant scattering terms, 893–894 sample preparation, 898 Diffusion coefficients binary/multicomponent diffusion, 146 multicomponent diffusion, frames of reference, 150–151 substitutional and interstitial metallic systems, 152–153 Diffusion-controlled grain growth, microstructural evolution, 126–128 Diffusion imaging sequence, magnetic resonance imaging (MRI), 769 Diffusion length carrier lifetime measurement free carrier absorption (FCA), limitations of, 443–444 methods based on, 434–435 surface recombination and diffusion, 432–433 semiconductor materials, semiconductor-liquid interfaces, 614–616 Diffusion potential, capacitance-voltage (C-V) characterization, p-n junctions, 457–458 Diffusion processes, binary/multicomponent diffusion applications, 155 dependent concentration variable, 151 frames of reference and diffusion coefficients, 147–150 linear laws, 146–147 multicomponent alloys Fick-Onsager law, 150 frames of reference, 150–151 research background, 145–146 substitutional and interstitial metallic systems, 152–155 Diffusion pumps applications, 6 operating principles and procedures, 6–7 Diamagnetism, principles of, 511
Dimensionless quantities, coherent ordered precipitates, microstructural evolution, 124–126 Dimer-adatom-stacking fault (DAS) model, surface x-ray diffraction, silicon surface example, 1017–1018 Diode structures, characterization basic principles, 467–469
competitive and complementary techniques, 466–467 limitations of, 471 measurement equipment sources and selection criteria, 471 protocols and procedures, 469–470 research background, 466–467 sample preparation, 470–471 Dipole magnetic moments atomic origin, 512–515 closed/core shell diamagnetism, 512 coupling theories, 519–527 antiferromagnetism, 524 ferrimagnetism, 524–525 ferromagnetism, 522–524 Heisenberg model and exchange interactions, 525–526 helimagnetism, 526–527 paramagnetism, classical and quantum theories, 519–522 local atomic moments collective magnetism, 515 ionic magnetism, 513–515 neutron powder diffraction, 1288–1289 nuclear quadrupole resonance (NQR), nuclear moments, 775–776 Dirac equation computational analysis, applications, 71 Mössbauer effect, 819–820 Mössbauer spectroscopy, hyperfine magnetic field (HMF), 823–824 Dirac’s delta function, phonon analysis, 1318 Direct configurational averaging (DCA), phase diagram prediction cluster variation, 99 electronic structure, 101–102 ground-state analysis, 102–104 Direct detection procedures, nuclear quadrupole resonance (NQR), 780–781 Direct methods single-crystal x-ray structure determination, 860–861 computations, 865–868 surface x-ray diffraction measurements, crystallographic refinement, 1018 x-ray absorption spectroscopy, 870 Discrete Fourier transform (DFT), thermal diffusivity, laser flash technique, 388–389 Disk-shaped electrodes, scanning electrochemical microscopy (SECM), 648–659 Disordered local moment (DLM) face-centered cubic (fcc) iron, moment alignment vs.
moment formation, 193–195 metal alloy magnetism first-principles calculations, 188–189 gold-rich AuFe alloys, atomic short range order (ASRO), 198–199 iron-vanadium alloys, 196–198 local exchange splitting, 189–190 local moment fluctuation, 187–188 Disordered states, x-ray absorption fine structure (XAFS) spectroscopy, 872–873 Dispersed radiation measurement, Raman spectroscopy of solids, 707–708 Dispersion corrections transmission electron microscopy (TEM), sample preparation, 1087 x-ray diffraction, scattering power and length, 210 Dispersion equation, two-beam diffraction, solution, 233 Dispersion surface dynamical diffraction, basic principles, 228–229 two-beam diffraction, 230–231 boundary conditions and Snell’s law, 230–231 hyperboloid sheets, 230
Poynting vector and energy flow, 231 wave field amplitude ratios, 230 Dispersive transmission measurement, x-ray absorption fine structure (XAFS) spectroscopy, 875–877 Displacement principle density definition and, 26 diffuse scattering techniques diffuse intensity computations, 902–904 static displacement recovery, 897–898 Distorted-wave Born approximation (DWBA) grazing-incidence diffraction (GID), 244–245 liquid surface x-ray diffraction, non-specular scattering, 1034–1036 surface x-ray diffraction measurements, grazing incidence, 1011 Documentation protocols in thermal analysis, 340–341 thermogravimetry (TG), 361–362 tribological testing, 333 Doniach-Sunjic peak shape, x-ray photoelectron spectroscopy (XPS), 1006 Donor-acceptor transitions, photoluminescence (PL) spectroscopy, 683 Doping dependence, carrier lifetime measurement, 431 Double-beam spectrometer, ultraviolet/visible absorption (UV-VIS) spectroscopy, 692–693 Double-counting corrections, diffuse intensities, metal alloys, concentration waves, first-principles calculations, electronic structure, 264–266 Double-crystal monochromator liquid surface x-ray diffraction, error detection, 1044–1045 surface x-ray diffraction, beamline alignment, 1019–1020 Double-resonance experiments electron-nuclear double resonance (ENDOR), 796 nuclear quadrupole resonance (NQR), frequency-swept Fourier-transform spectrometers, 780–781 Double sum techniques, diffuse intensity computations, 902–904 Double-critical-angle phenomenon, grazing-incidence diffraction (GID), 243 DPC microscopy, magnetic domain structure measurements, Lorentz transmission electron microscopy, 551–552 Drive stiffness, tribological and wear testing, 328–329 Drude model, cyclotron resonance (CR), 806–807 Drude-Sommerfeld "free electron" model, magnetotransport in metal alloys, 562–563 Dry gases, vacuum system principles, outgassing, 2–3 Dufour effect, chemical vapor deposition (CVD) model, hydrodynamics, 171 "Dummy" orientation matrix (DOM), surface
x-ray diffraction crystallographic alignment, 1014–1015 protocols, 1026 Dynamical diffraction applications, 225–226 Bragg reflections, 225–226 crystals and multilayers, 225 defect topology, 225 grazing-incidence diffraction, 226 internal field-dependent phenomena, 225 basic principles, 227–229 boundary conditions, 229 dispersion surface, 228–229 fundamental equations, 227–228 internal fields, 229 grazing incidence diffraction (GID), 241–246
distorted-wave Born approximation, 244–245 evanescent wave, 242 expanded distorted-wave approximation, 245–246 inclined geometry, 243 multilayers and superlattices, 242–243 specular reflectivity, 241–242 literature sources, 226–227 multiple-beam diffraction, 236–241 basic principles, 236–237 NBEAM theory, 237–238 boundary conditions, 238 D-field component eigenequation, 237 eigenequation matrix, 237–238 intensity computations, 238 numerical solution strategy, 238 phase information, 240 polarization density matrix, 241 polarization mixing, 240 second-order Born approximation, 238–240 standing waves, 240 three-beam interactions, 240 scanning transmission electron microscopy (STEM), Z-contrast imaging, 1101–1103 theoretical background, 224–225 two-beam diffraction, 229–236 anomalous transmission, 231–232 Darwin width, 232 diffuse intensities boundary condition, 233 Bragg case, 234–235 integrated intensities, 235 Laue case, 233–234 dispersion equation solution, 233 dispersion surface properties, 230–231 boundary conditions, Snell’s law, 230–231 hyperboloid sheets, 230 Poynting vector and energy flow, 231 wave field amplitude ratios, 230 Pendellösung, 231 standing waves, 235–236 x-ray birefringence, 232–233 x-ray standing waves (XSWs), 232 Dynamical processes, nuclear quadrupole resonance (NQR), 790 Dynamic random-access memory (DRAM) device heavy-ion backscattering spectrometry (HIBS) ion-beam-induced charge (IBIC)/single event upset (SEU) microscopy, 1230 Dynamic stability conditions, superconducting magnets, 502 Dynamic thermomechanometry, defined, 339 Echo-planar imaging sequence, magnetic resonance imaging (MRI), 768–769 ECPSSR model, particle-induced x-ray emission (PIXE), 1212 Eddy-current measurements, non-contact techniques, 407–408 Effective cluster interactions (ECIs) diffuse intensities, metal alloys, 255 high-temperature experiments, 255–256 mean-field results, 263 phase diagram prediction, ground-state analysis, 102–104 phase diagram predictions, cluster
variational method (CVM), 98–99 Effective drift diffusion model, chemical vapor deposition (CVD), plasma physics, 170 Effectively probed volume, carrier lifetime measurement, free carrier absorption (FCA), probe selection criteria, 440 Effective mass approximation (EMA), cyclotron resonance (CR), 806 quantum mechanics, 808
Eigenequations dynamical diffraction, boundary conditions, 229 multiple-beam diffraction D-field components, 237 matrix form, 237–238 Elastic collision particle scattering, 51–52 diagrams, 52–54 phase diagram prediction, static displacive interactions, 105–106 Elastic constants impulsive stimulated thermal scattering (ISTS), 745–746 local density approximation (LDA), 81–82 Elastic energy, microstructural evolution, 120–121 Elastic ion scattering composition analysis applications, 1191–1197 basic concepts, 1181–1184 detector criteria and detection geometries, 1185–1186 equations, 1186–1189 experimental protocols, 1184–1186 limitations, 1189–1191 nuclear reaction analysis (NRA) and proton-induced gamma ray emission (PIGE) and, 1201–1202 research background, 1179–1181 Rutherford backscattering spectroscopy (RBS), 1178 Elastic properties impulsive stimulated thermal scattering (ISTS) analysis, 747–749 tension testing, 279 Elastic recoil detection analysis (ERDA) composition analysis applications, 1191–1197 basic concepts, 1181–1184 detector criteria and detection geometries, 1185–1186 equations, 1186–1189, 1199–1200 experimental protocols, 1184–1186 limitations, 1189–1191 research background, 1179–1181 elastic scattering, 1178 nuclear reaction analysis (NRA) and proton-induced gamma ray emission (PIGE), 1201–1202 Elastomers, O-ring seals, 17 Electrical and electronic measurements bulk measurements, 403–405 conductivity measurement, 401–403 microwave techniques, 408–410 non-contact methods, 407–408 research background, 401 surface measurements, 405–406 Electrical feedthroughs, vacuum systems, 18 Electrical interference, thermogravimetric (TG) analysis, mass measurement errors and, 357–358 Electrical-resistance thermometer, operating principles, 36 Electrical resistivity, magnetotransport in metal alloys magnetic field behavior, 563–565 research applications, 559 transport equations, 560–561 zero field behavior, 561–563 Electrical transport measurements, superconductors
automation of, 480–481 bath temperature fluctuations, 486–487 competitive and related techniques, 472–473 contact materials, 474 cooling options and procedures, 478–479
critical current criteria, 474–475 current-carrying area for critical current to current density, 476 current contact length, 477 current ramp rate and shape, 479 current supply, 476–477 current transfer and transfer length, 475–477 data analysis and interpretation, 481–482 electromagnetic phenomena, 479 essential theory current sharing, 474 four-point measurement, 473–474 Ohm’s law, 474 power law transitions, 474 generic protocol, 480 instrumentation and data acquisition, 479–480 lead shortening, 486 magnetic field strength extrapolation and irreversibility field, 475 maximum measurement current determination, 479 probe thermal contraction, 486 research background, 472–473 sample handling/damage, 486 sample heating and continuous-current measurements, 486 sample holding and soldering/making contacts, 478 sample preparation, 482–483 sample quality, 487 sample shape, 477 self-field effects, 487 signal-to-noise ratio parameters, 483–486 current supply noise, 486 grounding, 485 integration period, 486 pick-up loops, 483–484 random noise and signal spikes, 485–486 thermal electromotive forces, 484–485 specimen modification, 483 thermal cycling, 487 troubleshooting, 480 voltage tap placement, 477 voltmeter properties, 477 zero voltage definition, 474 Electric discharge machine (EDM), fracture toughness testing, sample preparation, 312 Electric-discharge machining, metallographic analysis, sample preparation, 65 Electric field gradient (EFG) Mössbauer spectroscopy data analysis and interpretation, 831–832 electric quadrupole splitting, 822–823 hyperfine interactions, 820–821 nuclear couplings, nuclear quadrupole resonance (NQR), 776–777 semiconductor-liquid interface, photoelectrochemistry, thermodynamics, 606 two-dimensional exchange NQR spectroscopy, 785 Electric phenomena, superconductors, electrical transport measurements, 479 Electric potential distributions, semiconductor-liquid interface, photoelectrochemistry, thermodynamics, 606 Electric potential drop
technique, fracture toughness testing, crack extension measurement, 308 Electric quadrupole splitting, Mössbauer spectroscopy, 822–823 Electric resistance (R), magnetotransport in metal alloys, basic principles, 559 Electrocatalytic mechanism, cyclic voltammetry, 587–588
Electrochemical analysis. See also Photoelectrochemistry capacitance-voltage (C-V) characterization, 462 corrosion quantification electrochemical impedance spectroscopy (EIS), 599–603 linear polarization, 596–599 research background, 592–593 Tafel technique, 593–596 cyclic voltammetry data analysis and interpretation, 586–587 electrocatalytic mechanism, 587–588 electrochemical cells, 584 electrolytes, 586 limitations, 590–591 platinum electrode example, 588–590 potentiostats and three-electrode chemical cells, 584–585 reaction reversibility, 582–584 non-reversible charge-transfer reactions, 583–584 quasireversible reaction, 583 total irreversible reaction, 583 total reversible reaction, 582–583 research background, 580–582 sample preparation, 590 specimen modification, 590 working electrodes, 585–586 quartz crystal microbalance (QCM) automation, 659–660 basic principles, 653–658 data analysis and interpretation, 660 equivalent circuit, 654–655 film and solution effects, 657–658 impedance analysis, 655–657 instrumentation criteria, 648–659 limitations, 661 quartz crystal properties, 659 research background, 653 sample preparation, 660–661 series and parallel frequency, 660 specimen modification, 661 research background, 579–580 scanning electrochemical microscopy (SECM) biological and polymeric materials, 644 competitive and related techniques, 637–638 constant-current imaging, 643 corrosion science applications, 643–644 feedback mode, 638–640 generation/collection mode, 641–642 instrumentation criteria, 642–643 limitations, 646–648 reaction rate mode, 640–641 research background, 636–638 specimen modification, 644–646 tip preparation protocols, 648–649 Electrochemical cell design, semiconductor materials, photocurrent/photovoltage measurements, 609–611 Electrochemical cells, cyclic voltammetry, 584 Electrochemical dissolution, natural oxide films, ellipsometric measurement, 742 Electrochemical impedance spectroscopy (EIS), corrosion quantification, 599–603 
Electrochemical photocapacitance spectroscopy (EPS), semiconductor materials, 623–626 bulk capacitance measurements, 624 electrode bias voltage, 625–626 electrolyte selection, 625 experimental protocols, 626 kinetic rates, 625 limitations, 626 measurement frequency, 625 surface capacitance measurements, 624–625 surface recombination capture coefficient, 625
Electrochemical polishing, ellipsometry, surface preparations, 742 Electrode bias voltage, electrochemical photocapacitance spectroscopy (EPS), semiconductor materials, 625–626 Electrode materials, cyclic voltammetry, 585–586 Electrolytes cyclic voltammetry, 586 semiconductor materials, electrochemical photocapacitance spectroscopy (EPS), 625 Electrolytic polishing, metallographic analysis deformed high-purity aluminum, 69 sample preparation, 66–67 Electromagnetic radiation, classical physics, Raman spectroscopy of solids, classical physics, 699–701 Electromagnetic spectrum, optical imaging, 666 Electromagnets applications, 496 structure and properties, 499–500 x-ray magnetic circular dichroism (XMCD), error sources, 966 Electron area detectors, single-crystal x-ray structure determination, 859–860 Electron beam induced current (EBIC) technique carrier lifetime measurement, diffusion-length-based methods, 434–435 ion-beam-induced charge (IBIC) microscopy and, 1223–1224 topographic analysis, 1226–1227 pn junction characterization, 466–467 Electron beams, Auger electron spectroscopy (AES), 1160–1161 Electron deformation densities, nuclear quadrupole resonance (NQR), 777 Electron diffraction single-crystal neutron diffraction and, 1309 single-crystal x-ray structure determination, 851 x-ray diffraction vs., 836 Electron diffuse scattering, transmission electron microscopy (TEM), Kikuchi lines, 1075 Electronegativities, metal alloy bonding, 137–138 Electron energy loss spectroscopy (EELS) scanning transmission electron microscopy (STEM) atomic resolution spectroscopy, 1103–1104 comparisons, 1093 instrumentation criteria, 1104–1105 transmission electron microscopy (TEM) and, 1063–1064 x-ray photoelectron spectroscopy (XPS), survey spectrum, 974 Electron excitation, energy-dispersive spectrometry (EDS), 1136 Electron flux equations, x-ray photoelectron spectroscopy (XPS), elemental composition analysis, 983–985 Electron guns low-energy electron diffraction (LEED),
1126–1127 scanning electron microscopy (SEM) instrumentation criteria, 1053–1057 selection criteria, 1061 x-ray photoelectron spectroscopy (XPS), sample charging, 1000–1001 Electron holography, magnetic domain structure measurements, 554–555 Electronic balances, applications and operation, 28 Electronic density of states (DOS) diffuse intensities, metal alloys concentration waves, first-principles calculations, electronic structure, 264–266 hybridization in NiPt alloys, charge correlation effects, 267–268
nickel-iron alloys, atomic long and short range order (ASRO), 193–196 transition metal magnetic ground state, itinerant magnetism at zero temperature, 182–183 Electronic phase transitions, ultraviolet photoelectron spectroscopy (UPS), 727 Electronic structure computational analysis basic principles, 74–77 dielectric screening, 84–87 diffuse intensities, metal alloys concentration waves, first-principles calculations, 263–266 topological transitions, van Hove singularities in CuPt, 271–273 Green’s function Monte Carlo (GFMC), electronic structure methods, 88–89 GW approximation, 84–85 local-density approximation (LDA)+U, 87 Hartree-Fock theory, 77 local-density approximation (LDA), 77–84 local-density approximation (LDA)+U, 86–87 metal alloy magnetism, 184–185 nickel-iron alloys, atomic long and short range order (ASRO), 193–196 phase diagram prediction, 101–102 quantum Monte Carlo (QMC), 87–89 resonant x-ray diffraction, 905 SX approximation, 85–86 variational Monte Carlo (VMC), 88 Electron microprobe analysis (EMPA), microparticle-induced x-ray emission (MicroPIXE) and, 1210–1211 Electron multipliers, x-ray photoelectron spectroscopy (XPS), 982–983 Electron-nuclear double resonance (ENDOR) basic principles, 796 error detection, 801 Electron paramagnetic resonance (EPR) automation, 798 basic principles, 762–763, 793–794 calibration, 797 continuous-wave (CW) experiments microwave power, 796 modulation amplitude, 796–797 non-rectangular resonators, 796 non-X-band frequencies, 795–796 sensitivity parameters, 797 X-band with rectangular resonator, 794–795 data analysis and interpretation, 798 electron-nuclear double resonance (ENDOR), 796 instrumentation criteria, 804 limitations, 800–802 pulsed/Fourier transform EPR, 796 research background, 792–793 sample preparation, 798–799 specimen modification, 799–800 Electron penetration, energy-dispersive spectrometry (EDS), 1155–1156 Electrons, magnetism, general principles, 492 Electron scattering energy-dispersive
spectrometry (EDS), matrix corrections, 1145 resonant scattering analysis, 907 Electron scattering quantum chemistry (ESQC), scanning tunneling microscopy (STM), 1116–1117 Electron sources, x-ray photoelectron spectroscopy (XPS), 978–980 Electron spectrometers, ultraviolet photoelectron spectroscopy (UPS), 729 alignment protocols, 729–730 automation, 731–732 Electron spectroscopy for chemical analysis (ESCA)
Auger electron spectroscopy (AES) vs., 1158–1159 ultraviolet photoelectron spectroscopy (UPS) and, 725–726 x-ray photoelectron spectroscopy (XPS), comparisons, 971 Electron spin echo envelope modulation (ESEEM), electron paramagnetic resonance (EPR) and, 801 Electron spin resonance (ESR), basic principles, 762–763 Electron stimulated desorption (ESD), hot cathode ionization gauges, 15–16 Electron stopping, energy-dispersive spectrometry (EDS), matrix corrections, 1145 Electron stripping, trace element accelerator mass spectrometry (TEAMS), secondary ion acceleration and, 1238–1239 Electron techniques Auger electron spectroscopy (AES) automation, 1167 basic principles, 1159–1160 chemical effects, 1162–1163 competitive and related techniques, 1158–1159 data analysis and interpretation, 1167–1168 depth profiling, 1165–1167 instrumentation criteria, 1160–1161 ion-excitation peaks, 1171–1173 ionization loss peaks, 1171 limitations, 1171–1173 line scan properties, 1163–1164 mapping protocols, 1164–1165 plasmon loss peaks, 1171 qualitative analysis, 1161–1162 quantitative analysis, 1169 research background, 1157–1159 sample preparation, 1170 sensitivity and detectability, 1163 specimen modification, 1170–1171 alignment protocols, 1161 spectra categories, 1161 standards sources, 1169–1170, 1174 energy-dispersive spectrometry (EDS) automation, 1141 background removal, 1143–1144 basic principles, 1136–1140 collection optimization, 1140–1141 deadtime correction, 1141 energy calibration, 1140 escape peaks, 1139 limitations, 1153–1156 electron preparation, 1155–1156 ground loops, 1153–1154 ice accumulation, 1155 light leakage, 1154–1155 low-energy peak distortion, 1155 stray radiation, 1155 matrix corrections, 1145–1147 electron scattering, 1145 electron stopping, 1145 secondary x-ray fluorescence, 1145 x-ray absorption, 1145 measurement protocols, 1156–1157 nonuniform detection efficiency, 1139–1140 peak overlap, deconvolution, 1144–1145 qualitative analysis, 1141–1143 
quantitative analysis, 1143 research background, 1135–1136 resolution/count rate range, 1140 sample preparation, 1152 specimen modification, 1152–1153 spectrum distortion, 1140 standardless analysis, 1147–1148 accuracy testing, 1150 applications, 1151
Electron techniques (Continued) first-principles standardless analysis, 1148–1149 fitted-standards standardless analysis, 1149–1150 sum peaks, 1139 x-ray detection limits, 1151–1152 low-energy electron diffraction (LEED) automation, 1127 basic principles, 1121–1125 complementary and related techniques, 1120–1121 data analysis and interpretation, 1127–1131 instrumentation criteria, 1125–1127 limitations, 1132 qualitative analysis basic principles, 1122–1124 data analysis, 1127–1128 quantitative measurements basic principles, 1124–1125 data analysis, 1128–1131 research background, 1120–1121 sample preparation, 1131–1132 specimen modification, 1132 research background, 1049–1050 scanning electron microscopy (SEM) automation, 1057–1058 contrast and detectors, 1054–1056 data analysis and interpretation, 1058 electron gun, 1061 image formation, 1052–1053 imaging system components, 1061, 1063 instrumentation criteria, 1053–1054 limitations and errors, 1059–1060 research background, 1050 resolution, 153 sample preparation, 1058–1059 selection criteria, 1061–1063 signal generation, 1050–1052 specimen modification, 1059 techniques and innovations, 1056–1057 vacuum system and specimen handling, 1061 scanning transmission electron microscopy (STEM) atomic resolution spectroscopy, 1103–1104 coherent phase-contrast imaging and, 1093–1097 competitive and related techniques, 1092–1093 data analysis and interpretation, 1105–1108 object function retrieval, 1105–1106 strain contrast, 1106–1108 dynamical diffraction, 1101–1103 incoherent scattering, 1098–1101 weakly scattered objects, 1111 limitations, 1108 manufacturing sources, 1111 probe formation, 1097–1098 protocols and procedures, 1104–1105 research background, 1090–1093 sample preparation, 1108 specimen modification, 1108 scanning tunneling microscopy (STM) automation, 1115–1116 basic principles, 1113–1114 complementary and competitive techniques, 1112–1113 data analysis and interpretation, 1116–1117 image acquisition, 1114
limitations, 1117–1118 material selection and limitations, 1114–1115 research background, 1111–1113 sample preparation, 1117 transmission electron microscopy (TEM) automation, 1080
basic principles, 1064–1069 deviation vector and parameter, 1066–1067 Ewald sphere construction, 1066 extinction distance, 1067–1068 lattice defect diffraction contrast, 1068–1069 structure and shape factors, 1065–1066 bright field/dark-field imaging, 1069–1071 data analysis and interpretation, 1080–1086 bright field/dark-field, and selected-area diffraction, 1082–1084 defect analysis values, 1084–1085 Kikuchi lines and deviation parameter, defect contrast, 1085–1086 shape factor effect, 1080–1081 specimen thickness and deviation parameter, 1081–1082 diffraction pattern indexing, 1073–1074 Kikuchi lines and specimen orientation, 1075–1078 deviation parameters, 1077–1078 electron diffuse scattering, 1075 indexing protocols, 1076–1077 line origins, 1075–1076 lens defects and resolution, 1078–1080 aperture diffraction, 1078 astigmatism, 1079 chromatic aberration, 1078 resolution protocols, 1079–1080 spherical aberration, 1078 limitations, 1088–1089 research background, 1063–1064 sample preparation, 1086–1087 dispersion, 1087 electropolishing and chemical thinning, 1086 ion milling and focused gallium ion beam thinning, 1086–1087 replication, 1087 ultramicrotomy, 1087 selected-area diffraction (SAD), 1071–1073 specimen modification, 1087–1088 tilted illumination and diffraction, 1071 tilting specimens and electron beams, 1071 Electron transfer rate, scanning electrochemical microscopy (SECM), reaction rate mode, 640–641 Electron yield measurements, x-ray magnetic circular dichroism (XMCD), 961–962 Electropolishing, transmission electron microscopy (TEM), sample preparation, 1086 Elemental composition analysis, x-ray photoelectron spectroscopy (XPS), 983–986 Ellipsoids, small-angle scattering (SAS), 220 Ellipsometry automation, 739 intensity-measuring ellipsometer, 739 null ellipsometers, 739 data analysis and interpretation, 739–741 optical properties from phase and amplitude changes, 740–741 polarizer/analyzer phase and amplitude changes, 739–740 dielectric constants,
737 limitations, 742 liquid surfaces and monomolecular layers, 1028 optical conductivity, 737 optical constants, 737 PQSA/PCSA component arrangement, 738 relative phase/amplitude readings, 740 protocols and procedures, 737–739 alignment, 738–739 compensator, 738 light source, 738 polarizer/analyzer, 738
reflecting surfaces, 735–737 reflectivity, 737 research background, 735 sample preparation, 741–742 Emanation thermal analysis, defined, 338 Embedded-atom method (EAM) electronic structure analysis, phase diagram prediction, 101–102 molecular dynamics (MD) simulation, surface phenomena, 159 interlayer surface relaxation, 161 metal surface phonons, 161–162 Emissivity measurements deep level transient spectroscopy (DLTS), semiconductor materials, 420–421 radiation thermometer, 37 Emittance measurements, radiation thermometer, 37 “Empty lattice” bands, ultraviolet photoelectron spectroscopy (UPS), 730 Endothermic reaction differential thermal analysis (DTA)/differential scanning calorimetry (DSC), 370–371 thermal analysis and principles of, 342–343 Energy band dispersion collective magnetism, 515 ultraviolet photoelectron spectroscopy (UPS), 723–725 automation, 731–732 solid materials, 727 Energy calibration energy-dispersive spectrometry (EDS), optimization, 1140–1141 nuclear reaction analysis (NRA)/proton-induced gamma ray emission (PIGE) equations, 1209 resonance depth profiling, 1204–1205 semiconductor-liquid interface, photoelectrochemistry, thermodynamics, 605–606 x-ray absorption fine structure (XAFS) spectroscopy, 877 Energy content, magnetic field effects and applications, 496 Energy-dispersive spectrometry (EDS) automation, 1141 background removal, 1143–1144 basic principles, 1136–1140 collection optimization, 1140–1141 deadtime correction, 1141 energy calibration, 1140 escape peaks, 1139 ion-beam analysis (IBA) vs., 1181 limitations, 1153–1156 electron preparation, 1155–1156 ground loops, 1153–1154 ice accumulation, 1155 light leakage, 1154–1155 low-energy peak distortion, 1155 stray radiation, 1155 matrix corrections, 1145–1147 electron scattering, 1145 electron stopping, 1145 secondary x-ray fluorescence, 1145 x-ray absorption, 1145 measurement protocols, 1156–1157 nonuniform detection efficiency, 1139–1140 peak overlap, deconvolution, 1144–1145
qualitative analysis, 1141–1143 quantitative analysis, 1143 research background, 1135–1136 resolution/count rate range, 1140 sample preparation, 1152 specimen modification, 1152–1153 spectrum distortion, 1140 standardless analysis, 1147–1148
accuracy testing, 1150 applications, 1151 first-principles standardless analysis, 1148–1149 fitted-standards standardless analysis, 1149–1150 sum peaks, 1139 x-ray detection limits, 1151–1152 Energy-dispersive spectroscopy (EDS) transmission electron microscopy (TEM) and, 1063–1064 x-ray magnetic circular dichroism (XMCD), 962 x-ray photoelectron spectroscopy (XPS), comparisons, 971 Energy-filtered transmission electron microscopy (EFTEM) basic principles, 1063–1064 bright-field/dark-field imaging, 1069 Energy flow, two-beam diffraction, dispersion surface, 231 Energy precision, resonance spectroscopies, 761–762 Energy product (BH)max, permanent magnets, 497–499 Energy resolution ion-beam analysis (IBA), ERD/RBS techniques, 1191–1197 x-ray absorption fine structure (XAFS) spectroscopy, 877 Engineering strain, stress-strain analysis, 280–281 Enhanced backscattering spectrometry (EBS) composition analysis applications, 1191–1197 basic concepts, 1181–1184 detector criteria and detection geometries, 1185–1186 equations, 1186–1189 experimental protocols, 1184–1186 limitations, 1189–1191 research background, 1179–1181 ion-beam analysis (IBA) ERD/RBS examples, 1194–1197 non-Rutherford cross-sections, 1189–1191 Enthalpy chemical vapor deposition (CVD) model, hydrodynamics, 171 combustion calorimetry, 371–372 enthalpies of formation computation, 380–381 differential thermal analysis (DTA)/differential scanning calorimetry (DSC) applications, 368–370 calibration protocols, 366–367 thermal analysis and principles of, 342–343 Entropy.
See also Maximum entropy method (MEM) phase diagram predictions, cluster variational method (CVM), 96–99 thermodynamic temperature scale, 31–32 Environmental testing, stress-strain analysis, 286 Equal arm balance classification, 27 mass, weight, and density definitions, 25–26 Equilibrium charge density, semiconductor-liquid interface, photoelectrochemistry, thermodynamics, 606 Equilibrium constant, thermal analysis and principles of, 342–343 Equivalent circuit model electrochemical quartz crystal microbalance (EQCM), 654–655 semiconductor-liquid interfaces, differential capacitance measurements, 616–617 Equivalent positions space groups, 48–50
symmetry operators, 41–42 Erosion mechanics, tribological and wear testing, 324–325 Error detection Auger electron spectroscopy (AES), 1171–1173 carrier lifetime measurement, photoconductivity (PC) techniques, 446–447 combustion calorimetry, 381–382 corrosion quantification electrochemical impedance spectroscopy (EIS), 602–603 linear polarization, 599 Tafel technique, 596 cyclic voltammetry, 590–591 cyclotron resonance (CR), 814–815 deep level transient spectroscopy (DLTS), semiconductor materials, 426 differential thermal analysis (DTA)/differential scanning calorimetry (DSC), 372 diffuse scattering techniques, 899–900 electrochemical quartz crystal microbalance (EQCM), 661 electron paramagnetic resonance (EPR), 800–802 ellipsometric measurement, 742 energy-dispersive spectrometry (EDS), 1153–1156 electron preparation, 1155–1156 ground loops, 1153–1154 ice accumulation, 1155 light leakage, 1154–1155 low-energy peak distortion, 1155 stray radiation, 1155 fracture toughness testing, 312–313 Hall effect, semiconductor materials, 417 hardness testing, 322 heavy-ion backscattering spectrometry (HIBS), 1281–1282 impulsive stimulated thermal scattering (ISTS) analysis, 757–758 ion-beam-induced charge (IBIC)/single event upset (SEU) microscopy, 1233 liquid surface x-ray diffraction, 1043–1045 low-energy electron diffraction (LEED), 1132 magnetic neutron scattering, 1337–1338 magnetic x-ray scattering, 935 magnetometry, 538–539 magnetotransport in metal alloys, 567–568 mass measurement process assurance, 28–29 medium-energy backscattering and forward-recoil spectrometry, 1271–1272 neutron powder diffraction, 1300–1302 nuclear magnetic resonance, 772 nuclear quadrupole resonance (NQR), 789–790 nuclear reaction analysis (NRA) and proton-induced gamma ray emission (PIGE) and, 1207–1208 particle-induced x-ray emission (PIXE) analysis, 1220 phonon analysis, 1326–1327 photoluminescence (PL) spectroscopy, 687 pn junction characterization, 471 Raman spectroscopy of solids, 713–714
reflected-light optical microscopy, 680–681 resonant scattering, 914–915 scanning electrochemical microscopy (SECM), 646–648 scanning electron microscopy (SEM), 1059–1060 scanning transmission electron microscopy (STEM), 1108 scanning tunneling microscopy (STM), 1117–1118 simultaneous thermogravimetry (TG)-differential thermal analysis (TG-DTA), 399
single-crystal neutron diffraction, 1314–1315 single-crystal x-ray structure determination, 863–864 surface magneto-optic Kerr effect (SMOKE) evolution, 575 surface x-ray diffraction measurements, 1018–1019 thermal diffusivity, laser flash technique, 390 thermogravimetric (TG) analysis mass measurement errors, 357–358 temperature measurement errors, 358–359 trace element accelerator mass spectrometry (TEAMS), 1253 transmission electron microscopy (TEM), 1088–1089 tribological and wear testing, 335 ultraviolet photoelectron spectroscopy (UPS), 733 ultraviolet/visible absorption (UV-VIS) spectroscopy, 696 wear testing, 330–331 x-ray absorption fine structure (XAFS) spectroscopy, 880 x-ray magnetic circular dichroism (XMCD), 965–966 x-ray microfluorescence/microdiffraction, 950–951 x-ray photoelectron spectroscopy (XPS), 999–1001 x-ray powder diffraction, 842–843 Escape peaks, energy-dispersive spectrometry (EDS), 1139 qualitative analysis, 1142–1143 Estimated standard deviations, neutron powder diffraction, 1305–1306 Etching procedures, metallographic analysis 7075-T6 anodized aluminum alloy, 69 sample preparation, 67 4340 steel, 67 cadmium plating composition and thickness, 68 Ettingshausen effect, magnetotransport in metal alloys, transport equations, 561 Euler differencing technique, continuum field method (CFM), 129–130 Eulerian cradle mount, single-crystal neutron diffraction, 1312–1313 Euler-Lagrange equations, diffuse intensities, metal alloys, concentration waves, density-functional theory (DFT), 260–262 Euler’s equation, crystallography, point groups, 42–43 Evanescent wave, grazing-incidence diffraction (GID), 242 Evaporation ellipsometric measurements, vacuum conditions, 741 gas analysis, simultaneous thermogravimetry (TG)-differential thermal analysis (TG-DTA), 393 Everhart process measurement assurance program, mass measurement process assurance, 28–29 Everhart/Thornley (ET) detector, scanning electron microscopy (SEM), 1055–1056 Evolved gas analysis (EGA)
chromatography and, 396–397 defined, 338 simultaneous techniques mass spectroscopy for thermal degradation studies, 395–396 research background, 393 Evolved gas detection (EGD) chemical and physical principles, 398–399 defined, 338
Ewald sphere construction dynamical diffraction, basic principles, 228–229 single-crystal x-ray structure determination, 853 transmission electron microscopy (TEM), 1066–1067 Kikuchi line orientation, 1078 tilting specimens and electron beams, 1071 Ewald-von Laue equation, dynamical diffraction, 227–228 Excess carriers, carrier lifetime measurement, 428–429 free carrier absorption (FCA), 438–439 Exchange current density, semiconductor-liquid interface, dark current-potential characteristics, 607 Exchange effects, resonant magnetic x-ray scattering, 922–924 Exchange-splitting band theory of magnetism, 515–516 metal alloy paramagnetism, finite temperatures, 187 ultraviolet photoelectron spectroscopy (UPS), energy band dispersion, 727 Excitation functions energy-dispersive spectrometry (EDS), standardless analysis, 1148–1149 nuclear reaction analysis (NRA)/proton-induced gamma ray emission (PIGE), resonant depth profiling, 1210 Exciton-polaritons, photoluminescence (PL) spectroscopy, 682 Excitons, photoluminescence (PL) spectroscopy, 682 Exothermic reaction differential thermal analysis (DTA)/differential scanning calorimetry (DSC), 370–371 thermal analysis and principles of, 342–343 Expanded distorted-wave approximation (EDWA), grazing-incidence diffraction (GID), 245–246 Extended programmable read-only memory (EPROM), ion-beam-induced charge (IBIC)/single event upset (SEU) microscopy, 1229–1230 Extended x-ray absorption fine structure (EXAFS) spectroscopy comparisons with other techniques, 870 detection methods, 875–877 diffuse scattering techniques, comparisons, 883–884 disordered states, 873 L edges, 873 micro-XAFS, 943 research background, 869–870 Extensometers, mechanical testing, 285 External field energy, continuum field method (CFM), 121 Extinction conditions neutron powder diffraction, microstrain broadening, 1295 single-crystal x-ray structure determination, crystal symmetry, 856 transmission electron microscopy (TEM), extinction distance, 1068–1069 Extreme
temperatures, stress-strain analysis, 285–286 Extrinsic range of operation, capacitance-voltage (C-V) characterization, 460 Fabry factor, resistive magnets, power-field relations, 503 Face-centered cubic (fcc) cells diffuse intensities, metal alloys, multicomponent alloys, Fermi-surface nesting and van Hove singularities, 269–270
iron magnetism, moment alignment vs. moment formation, 193–194 phonon analysis, 1319–1323 x-ray diffraction, structure-factor calculations, 209 Faraday coils, automated null ellipsometers, 739 Faraday constant, corrosion quantification, Tafel technique, 593–596 Faraday cup energy-dispersive spectrometry (EDS), stray radiation, 1155 ion beam analysis (IBA), ERD/RBS techniques, 1184–1186 low-energy electron diffraction (LEED), 1126–1127 Faraday effect. See also DC Faraday magnetometer magnetic domain structure measurements, magneto-optic imaging, 547–549 surface magneto-optic Kerr effect (SMOKE), 570–571 x-ray magnetic circular dichroism (XMCD), 953–955 Far field regime, scanning tunneling microscopy (STM), 1112–1113 Far-infrared (FIR) radiation, cyclotron resonance (CR) Drude model, 807 error detection, 814–815 Fourier transform FIR magneto-spectroscopy, 809–810 laser magneto-spectroscopy (LMS), 810–812 optically-detected resonance (ODR) spectroscopy, 812–813 research background, 805–806 sources, 809 Fast Fourier transform algorithm continuum field method (CFM), 129–130 corrosion quantification, electrochemical impedance spectroscopy (EIS), 600–603 nuclear quadrupole resonance spectroscopy, temperature, stress, and pressure measurements, 788 scanning tunneling microscopy (STM), error detection, 1117–1118 Feedback mode, scanning electrochemical microscopy (SECM), 636 basic principles, 638–640 FEFF program, x-ray absorption fine structure (XAFS) spectroscopy, data analysis, 879 Fermi contact interaction, Mössbauer spectroscopy, hyperfine magnetic field (HMF), 823–824 Fermi-Dirac probability deep level transient spectroscopy (DLTS), semiconductor materials, 420–421 metal alloy paramagnetism, finite temperatures, 186–187 semiconductor-liquid interface, flat-band potential measurements, 628–629 Fermi energy level band theory of magnetism, 516 capacitance-voltage (C-V) characterization, trapping measurements, 465 pn junction characterization, 467–468 scanning tunneling
microscopy (STM), 1113–1114 semiconductor-liquid interface, photoelectrochemistry, thermodynamics, 606 transition metal magnetic ground state, itinerant magnetism at zero temperature, 182–183 ultraviolet photoelectron spectroscopy (UPS) automation, 731–732 electronic phase transitions, 727
energy band dispersion, 727 full width at half maximum values, 730 photoemission vs. inverse photoemission, 722–723 Fermi filling factor, diffuse intensities, metal alloys, concentration waves, first-principles calculations, electronic structure, 265–266 Fermi’s golden rule resonant scattering analysis, 908–909 x-ray absorption fine structure (XAFS) spectroscopy, single-scattering picture, 870–872 Fermi’s pseudopotential, neutron powder diffraction, probe configuration, 1287–1289 Fermi-surface nesting electronic topological transitions, van Hove singularities in CuPt, 272–273 multicomponent alloys, 268–270 Ferrimagnetism, principles and equations, 524–525 Ferroelectric materials, microbeam analysis, strain distribution, 948–949 Ferromagnetism basic principles, 494–495 collective magnetism as, 515 magnetic x-ray scattering, 924–925 nonresonant scattering, 930 resonant scattering, 932 mean-field theory, 522–524 permanent magnets, 497–499 ultraviolet photoelectron spectroscopy (UPS), energy band dispersion, 725, 727 Feynman-Peierls’ inequality, metal alloy magnetism atomic short range order (ASRO), 190–191 first-principles calculations, 188–189 Fiberoptic thermometer, operating principles, 37 Fick-Onsager law, binary/multicomponent diffusion, 150 Fick’s law binary/multicomponent diffusion basic equations, 146–147 Fick-Onsager extension, 150 general formulation, 147 mobility, 147 number-fixed frame of reference, 147–148 chemical vapor deposition (CVD) model, hydrodynamics, 171 semiconductor-liquid interface, laser spot scanning (LSS), 626–628 Field cooling, superconductor magnetization, 518–519 Field cycling methods, nuclear quadrupole resonance (NQR) indirect detection, 781 spatially resolved imaging, 788–789 Field-dependent diffraction, dynamical diffraction, applications, 225 Field-effect transistor (FET), energy-dispersive spectrometry (EDS), 1137–1140 Field emission electron guns (FESEMs) applications, 1057 scanning electron microscopy (SEM) instrumentation
criteria, 1054 resolution parameters, 1053 selection criteria, 1061 Field factor, electromagnet structure and properties, 499–500 Field ion microscopy (FIM) diffuse scattering techniques, comparisons, 884 transmission electron microscopy (TEM) and, 1064 Field kinetic equations, microstructural evolution, 121–122 Field simulation. See also Continuum field method (CFM)
microstructural evolution, 113–117 atomistic Monte Carlo method, 117 cellular automata (CA) method, 114–115 continuum field method (CFM), 114–115 conventional front-tracking, 113–114 inhomogeneous path probability method (PPM), 116–117 mesoscopic Monte Carlo method, 114–115 microscopic field model, 115–116 microscopic master equations, 116–117 molecular dynamics (MD), 117 Figure of merit ultraviolet photoelectron spectroscopy (UPS) light sources, 734–735 x-ray magnetic circular dichroism (XMCD), circularly polarized x-ray sources, 969–970 Film properties chemical vapor deposition (CVD) models, limitations, 175–176 electrochemical quartz crystal microbalance (EQCM), 657–658 impulsive stimulated thermal scattering (ISTS), 744–746, 745–746 applications, 749–752 automation, 753 competitive and related techniques, 744–746 data analysis and interpretation, 753–757 limitations, 757–758 procedures and protocols, 746–749 research background, 744–746 sample preparation and specimen modification, 757 ion-beam analysis (IBA), ERD/RBS techniques, 1192–1197 nuclear reaction analysis (NRA) and proton-induced gamma ray emission (PIGE), 1206 particle-induced x-ray emission (PIXE), 1213–1216 single-crystal x-ray structure determination, 859–860 surface phenomena, molecular dynamics (MD) simulation, 156–157 Filtering procedures, x-ray absorption fine structure (XAFS) spectroscopy, 878–879 Final-state effects, x-ray photoelectron spectroscopy (XPS), 974–976 Finite difference equation, continuum field method (CFM), 129–130 Finite-pulse-time effect, thermal diffusivity, laser flash technique, 388–389 First Law of thermodynamics combustion calorimetry and principles of, 374–375 radiation thermometer, 36–37 thermal analysis and principles of, 341–343 thermodynamic temperature scale, 31–32 First-principles calculations diffuse intensities, metal alloys basic principles, 254 concentration waves density-functional theory (DFT), 260–262 electronic-structure calculations, 263–266
energy-dispersive spectrometry (EDS), standardless analysis, 1148–1149 metal alloy bonding, precision measurements, 141 metal alloy magnetism, 188–189 limitations, 201 paramagnetic Fe, Ni, and Co, 189 molecular dynamics (MD) simulation, surface phenomena, 158 Fitted-standards standardless analysis, energy-dispersive spectrometry (EDS), 1149–1150 Five-circle diffractometer, surface x-ray diffraction, 1013–1014 angle calculations, 1021
Fixed spin moment (FSM) calculations, transition metal magnetic ground state, itinerant magnetism at zero temperature, 183 Fixture size and properties, fracture toughness testing, 312–313 Flash diffusivity measurements, thermal diffusivity, laser flash technique, 390 Flash getter pump, applications, 12 Flat-band potential measurements, semiconductor-liquid interface, 628–630 Flow imaging sequence, magnetic resonance imaging (MRI), 769–770 Fluctuating Local Band (FLB) theory, metal alloy magnetism, local moment fluctuation, 187–188 FLUENT software, chemical vapor deposition (CVD) model, hydrodynamics, 174 Fluorescence microscopy energy-dispersive spectrometry (EDS), matrix corrections, 1145 liquid surfaces and monomolecular layers, 1028 x-ray absorption fine-structure (XAFS) spectroscopy basic principles, 875–877 errors, 879 x-ray magnetic circular dichroism (XMCD), 961–962 x-ray microfluorescence analysis, 942–945 automation, 949 background signals, 944 characteristic radiation, 942–943 data analysis, 950 detector criteria, 944 error detection, 950–951 fluorescence yields, 942 micro XAFS, 943 penetration depth, 943–944 photoabsorption cross-sections, 943 sample preparation, 949–950 x-ray microprobe protocols and procedures, 941–942 research background, 939–941 Fluorescence yield, energy-dispersive spectrometry (EDS), standardless analysis, 1148 Flux density, permanent magnets, 497–499 Flux-gate magnetometer, magnetic field measurements, 506 Flux magnetometers flux-integrating magnetometer, 533 properties of, 533–534 Flux sources, permanent magnets as, 499 Flying spot technique, carrier lifetime measurement, diffusion-length-based methods, 435 Focused ion beam (FIB) thinning, transmission electron microscopy (TEM), sample preparation, 1086–1087 Force field changes, surface phenomena, molecular dynamics (MD) simulation, 156–157 Force magnetometers, principles and applications, 534–535 Foreline traps, oil-sealed pumps, 4 Forward-recoil spectrometry applications, 
1268–1269 automation, 1269 backscattering data, 1269–1270 basic principles, 1261–1265 complementary and alternative techniques, 1261 data analysis and interpretation, 1269–1271 protocols for, 1270–1271 forward-recoil data, 1270–1271 instrumentation criteria, 1267–1268 limitations, 1271–1272
research background, 1259–1261 resolution, 1267 safety issues, 1269 sensitivity parameters, 1266–1267 spectrometer efficiency, 1265–1266 time-of-flight spectrometry (TOF), 1265 Foucault microscopy, magnetic domain structure measurements, Lorentz transmission electron microscopy, 551–552 FOURC software, resonant scattering, experimental design, 914 Fourier transform infrared (FTIR) spectrometry gas analysis, 398 simultaneous techniques for gas analysis, research background, 393 Fourier transform magneto-spectroscopy (FTMS), cyclotron resonance (CR) basic principles, 808–812 data analysis and interpretation, 813–814 error detection, 814–815 Fourier transform Raman spectroscopy (FTRS), principles and protocols, 708–709 Fourier transforms chemical vapor deposition (CVD) model, radiation, 172–173 dynamical diffraction, 228 liquid surface x-ray diffraction, non-specular scattering, 1035–1036 low-energy electron diffraction (LEED), quantitative measurement, 1125 magnetic neutron scattering, 1329–1330 magnetic resonance imaging (MRI), 765–767 metal alloy magnetism, atomic short range order (ASRO), 191 Mössbauer effect, momentum space representation, 819–820 nuclear quadrupole resonance (NQR), data analysis and interpretation, 789 scanning transmission electron microscopy (STEM), probe configuration, 1098 single-crystal x-ray structure determination, direct method computation, 866–868 small-angle scattering (SAS), two-phase model, 220 x-ray absorption fine structure (XAFS) spectroscopy data analysis and interpretation, 878–879 single scattering picture, 871–872 x-ray diffraction, crystal structure, 208–209 x-ray microdiffraction analysis, 944–945 Four-point bulk measurement basic principles, 403 conductivity measurements, 402 protocols and procedures, 403–404 superconductors, electrical transport measurement, 473–474 Four-point probing, conductivity measurements, 402 Fracture toughness testing basic principles, 302–307 crack driving force (G), 304–305 crack extension
measurement, 308 crack tip opening displacement (CTOD) (d), 307 data analysis and interpretation sample preparation, 311–312 type A fracture, 308–309 type B fracture, 309–311 type C fracture, 311 errors and limitations, 312–313 J-integral, 306–307 load-displacement behaviors measurement and recording apparatus, 307–308 notched specimens, 303–304 research background, 302 stress intensity factor (K), 305–306
Fracturing techniques, metallographic analysis, sample preparation, 65 Frames of reference and diffusion coefficients binary/multicomponent diffusion, 147–150 Kirkendall effect and vacancy wind, 149 lattice-fixed frame of reference, 148 multicomponent alloys, 150–151 number-fixed frame of reference, 147–148 substitutional and interstitial metallic systems, 152 tracer diffusion and self-diffusion, 149–150 transformation between, 148–149 volume-fixed frame of reference, 148 multicomponent diffusion, 150–151 Frank-Kasper phase metal alloy bonding, size effects, 137 transition metal crystals, 135 Free-air displacement, oil-sealed pumps, 3 Free carrier absorption (FCA), carrier lifetime measurement, 438–444 automated methods, 441 basic principles, 438–440 carrier decay transient, 441 computer interfacing, 441 data analysis and interpretation, 441–442 depth profiling, sample cross-sectioning, 443 detection electronics, 440–441 geometrical considerations, 441 lifetime analysis, 441–442 lifetime depth profiling, 441 lifetime mapping, 441–442 limitations, 443–444 optical techniques, 434 probe laser selection, 440 processed wafers, metal and highly doped layer removal, 443 pump laser selection, 440 research background, 428 sample preparation, 442–443 virgin wafers, surface passivation, 442–443 Free-electron concentration, deep level transient spectroscopy (DLTS), semiconductor materials, 421 Free-electron laser, cyclotron resonance (CR), 808–809 Free electron model, magnetotransport in metal alloys, 562–563 Free energy calculations continuum field method (CFM) bulk chemical free energy, 119–120 coarse-grained approximation, 118 coarse-grained free energy formulation, 119 diffuse-interface nature, 119 elastic energy, 120–121 external field energy, 121 field kinetic equations, 121–122 interfacial energy, 120 slow variable selection, 119 microstructural evolution, diffusion-controlled grain growth, 126–128 phase diagram predictions, cluster variational method (CVM), 96–99, 
99–100 Free excess carriers, carrier lifetime measurement, 438–439 Free induction decay (FID). See also Pseudo free induction decay magnetic resonance imaging (MRI), 765–767 nuclear magnetic resonance, 765 nuclear quadrupole resonance (NQR), data analysis and interpretation, 789 nutation nuclear resonance spectroscopy, 784 Free molecular transport, chemical vapor deposition (CVD) model basic components, 169–170 software tools, 174
Free-space radiation, microwave measurement techniques, 409 Free-to-bound transitions, photoluminescence (PL) spectroscopy, 683 Frequency-swept continuous wave detection, nuclear quadrupole resonance (NQR), 780–781 Frequency-swept Fourier-transform spectrometers, nuclear quadrupole resonance (NQR), 780–781 Fresnel microscope, magnetic domain structure measurements, Lorentz transmission electron microscopy, 551–552 Fresnel reflection coefficients ellipsometry reflecting surfaces, 736–737 relative phase/amplitude calculations, 739–740 grazing-incidence diffraction (GID), 241–242 distorted-wave Born approximation (DWBA), 244–245 liquid surface x-ray diffraction non-specular scattering, 1034–1036 reflectivity measurements, 1029–1031 multiple stepwise and continuous interfaces, 1031–1033 surface magnetic scattering, 933–934 Friction, high-strain-rate testing, 299 Friction coefficient, tribological and wear testing, 325–326 drive stiffness and, 328–329 equipment and measurement techniques, 326–327 Friction tests, tribological and wear testing categories and classification, 328–329 data analysis and interpretation, 334–335 Friedel’s law d-band energetics, bonding-antibonding, 135 single-crystal x-ray structure determination, 854 crystal symmetry, 854–856 Fringe field sensors, magnetic field measurements, 506 Front-tracking simulation, microstructural evolution, 113–114 Frozen phonon calculations, phonon analysis, 1324–1325 Full potentials, metal alloy bonding, precision calculations, 143 FULLPROF software, neutron powder diffraction, Rietveld refinements, 1306 Full width at half-maximum (FWHM) diffuse scattering techniques, inelastic scattering background removal, 891–893 energy-dispersive spectrometry (EDS), 1137–1140 heavy-ion backscattering spectrometry (HIBS), 1277 medium-energy backscattering, 1267 neutron powder diffraction constant wavelength diffractometer, 1303–1304 instrumentation criteria, 1292 microstrain broadening, 1294–1295 peak shapes, 1292 Rietveld 
refinements, 1296 particle-induced x-ray emission (PIXE) analysis, 1217–1218 Raman spectroscopy of solids, 710–712 ultraviolet photoelectron spectroscopy (UPS), 730 x-ray microprobes, tapered capillary devices, 945 x-ray photoelectron spectroscopy (XPS), 989 background subtraction, 990–991 peak position, 992, 1005
x-ray powder diffraction, 837–838 Furnace windings, thermogravimetric (TG) analysis, 348–349 GaAs structures, capacitance-voltage (C-V) characterization, 462–463 Galvani potential, corrosion quantification, Tafel technique, 593–596 Gamma ray emission, Mössbauer effect, 818–820 Gärtner equation, semiconductor-liquid interfaces, diffusion length, 614–616 Gas analysis combustion calorimetry, 380 simultaneous techniques for automated testing, 399 benefits and limitations, 393–394 chemical and physical methods, 398–399 commercial TG-DTA equipment, 394–395 evolved gas analysis and chromatography, 396–397 gas and volatile product collection, 398 infrared spectrometry, 397–398 limitations, 399 mass spectroscopy for thermal degradation and, 395–396 research background, 392–393 TG-DTA principles, 393 thermal analysis with, 339 time-resolved x-ray powder diffraction, 845–847 Gas chromatography (GC), simultaneous techniques for gas analysis, research background, 393 Gas density detector, evolved gas analysis (EGA) and chromatography, 397 Gas-filled thermometer, operating principles, 35–36 Gas flow, superconductors, electrical transport measurements, 478 Gas-liquid chromatography (GLC), evolved gas analysis (EGA) and, 397 Gas permeation, O-ring seals, 18 Gas-phase chemistry, chemical vapor deposition (CVD) model basic components, 168–169 software tools, 173 Gas-sensing membrane electrodes, evolved gas detection (EGD), 399 Gas-solid reactions, thermogravimetric (TG) analysis, 355–356 Gaussian distribution liquid surface x-ray diffraction, simple liquids, 1040–1041 x-ray absorption fine structure (XAFS) spectroscopy, 873 Gaussian peak shapes, x-ray photoelectron spectroscopy (XPS), 1005–1007 Gauss-Mehler quadrature, particle scattering, central-field theory, deflection function, 60 Gebhart absorption-factor method, chemical vapor deposition (CVD) model, radiation, 173 Gelfand-Levitan-Marchenko (GLM) method, liquid surface x-ray diffraction, scattering length density (SLD)
analysis, 1039–1043 Generalized gradient approximations (GGAs) local density approximation (LDA), 78–79 heats of formation and cohesive energies, 80–81 structural properties, 78–79 metal alloy bonding, accuracy calculations, 139–141 metal alloy magnetism, 185–186 General-purpose interface bus (GPIB), thermomagnetic analysis, 541–544 Generation/collection (GC) mode, scanning electrochemical microscopy (SECM)
INDEX
basic principles, 641–642 error detection, 647–648 properties and principles, 636–637 Generation lifetime, carrier lifetime measurement, 431 Geometrical issues, carrier lifetime measurement, free carrier absorption (FCA), 441 GEOPIXE software, particle-induced x-ray emission (PIXE) analysis, 1217–1218 Georgopoulos/Cohen (GC) technique, diffuse scattering techniques, 887–889 data interpretation, 896–897 Getter pumps classification, 12 nonevaporable getter pumps (NEGs), 12 G-factor, resistive magnets, power-field relations, 503 Gibbs additive principle, continuum field method (CFM), bulk chemical free energy, 119–120 Gibbs energy activation barrier, substitutional and interstitial metallic systems, temperature and concentration dependence, 153 Gibbs free energy combustion calorimetry, 371–372 phase diagram predictions, mean-field approximation, 92–96 thermal analysis and principles of, 342–343 Gibbs grand potential, metal alloy paramagnetism, finite temperatures, 186–187 Gibbs phase rule, phase diagram predictions, 91–92 Gibbs triangle, diffuse intensities, metal alloys concentration waves, multicomponent alloys, 259–260 multicomponent alloys, Fermi-surface nesting and van Hove singularities, 269–270 Glass transition temperatures, differential thermal analysis (DTA)/differential scanning calorimetry (DSC), 368–369 Glide planes, symmetry operators, 40–42 Glow discharge mass spectroscopy (GDMS), heavy-ion backscattering spectrometry (HIBS) and, 1275 Gold impurities, deep level transient spectroscopy (DLTS), semiconductor materials, 419 Gold-rich AuFe alloys atomic short range order (ASRO), 198–199 diffuse intensities, metal alloys, magnetic coupling and chemical order, 268 Gorsky-Bragg-Williams (GBW) free energies diffuse intensities, metal alloys, mean-field results, 262 phase diagram predictions cluster variational method (CVM), 99–100 mean-field approximation, 95–96 Gradient corrections, local density approximation (LDA), 78–79 Gradient energy term,
interfacial energy, 120 Gradient-recalled echo sequence, magnetic resonance imaging (MRI), 768 Grain growth, microstructural evolution, diffusion-controlled grain growth, 126–128 Graphite furnace atomic absorption mass spectroscopy (GFAA-MS), heavy-ion backscattering spectrometry (HIBS) and, 1275 Grating criteria, spectrometers/monochromators, Raman spectroscopy of solids, 706–707 Grazing-incidence diffraction (GID), 241–246 applications, 226 distorted-wave Born approximation, 244–245 evanescent wave, 242 expanded distorted-wave approximation, 245–246
inclined geometry, 243 liquid surface x-ray diffraction alkane surface crystallization, 1043 components, 1038–1039 error sources, 1044–1045 Langmuir monolayers, 1042–1043 liquid metals, 1043 non-specular scattering, 1036 literature sources, 226 multilayers and superlattices, 242–243 specular reflectivity, 241–242 surface x-ray diffraction measurements basic principles, 1011 measurement protocols, 1016–1017 Grease lubrication bearings, turbomolecular pumps, 7 Green’s function GW approximation and, 75 solid-solution alloy magnetism, 184 Green’s function Monte Carlo (GFMC), electronic structure analysis, 88–89 Griffith criterion, fracture toughness testing, crack driving force (G), 305 Grounding, superconductors, electrical transport measurements, signal-to-noise ratio, 485 Ground loops, energy-dispersive spectrometry (EDS), 1153–1154 Ground-state analysis atomic/ionic magnetism, ground-state multiplets, 514–515 phase diagram prediction, 102–104 transition metal magnetic ground state, itinerant magnetism at zero temperature, 181–183 Group theoretical analysis Raman spectroscopy of solids, vibrational Raman spectroscopy, 702–704 character tables, 718 point groups and matrix representation, symmetry operations, 717–718 vibrational modes of solids, 720–722 vibrational selection rules, 716–720 resonant scattering, 909–910 Grüneisen theory, metal alloy magnetism, negative thermal expansion, 195–196 GSAS software, neutron powder diffraction axial divergence peak asymmetry, 1293 Rietveld refinements, 1306 Guinier-Preston (GP) zones, transmission electron microscopy (TEM), 1080–1081 Guinier approximation, small-angle scattering (SAS), 221 GUPIX software, particle-induced x-ray emission (PIXE) analysis, 1217–1218 GW approximation, electronic structure analysis, 75 dielectric screening, 84–85 local density approximation (LDA)+U theory, 87 Gyromagnetic ratio local moment origins, 514–515 nuclear magnetic resonance, 763–765 Half-mirror reflectors, reflected-light optical
microscopy, 675–676 Half-rise times, thermal diffusivity, laser flash technique, 390 Half-width at half maximum (HWHM) cyclotron resonance (CR), 814 x-ray photoelectron spectroscopy (XPS), peak position, 992 Hall effect capacitance-voltage (C-V) characterization, 465 semiconductor materials automated testing, 414 basic principles, 411–412
data analysis and interpretation, 414–416 equations, 412–413 limitations, 417 protocols and procedures, 412–414 research background, 411 sample preparation, 416–417 sensitivity, 414 sensors, magnetic field measurements, 506–508 surface magnetometry, 536 Hall resistance, magnetotransport in metal alloys basic principles, 560 magnetic field behavior, 563–565 Hardenability properties, hardness testing, 320 Hardness testing automated methods, 319 basic principles, 317–318 data analysis and interpretation, 319–320 limitations and errors, 322 procedures and protocols, 318–319 research background, 316–317 sample preparation, 320 specimen modification, 320–321 Hardness values calculation of, 323 static indentation hardness testing, 317–318 Hard spheres, particle scattering, deflection functions, 58–59 Hardware components. See also Software tools vacuum systems, 17–19 all-metal flange seals, 18 electrical feedthroughs, 18 O-ring flange seals, 17–18 rotational/translational feedthroughs, 18–19 valves, 18 Harmonic contamination liquid surface x-ray diffraction, 1044–1045 magnetic x-ray scattering, 935 x-ray magnetic circular dichroism (XMCD), 965–966 Harmonic content, x-ray absorption fine structure (XAFS) spectroscopy, 877 Hartree-Fock (HF) theory electronic structure analysis basic components, 77 GW approximation, 84–85 summary, 75 metal alloy bonding, accuracy calculations, 140–141 Heat balance equation, thermal diffusivity, laser flash technique, 391–392 Heat-flux differential scanning calorimetry (DSC) basic principles, 364–365 thermal analysis, 339 Heating curve determination, defined, 338 Heating rate gas analysis, simultaneous thermogravimetry (TG)-differential thermal analysis (TG-DTA), 394 thermogravimetric (TG) analysis, 351–352 Heating-rate curves, defined, 338 Heat loss, thermal diffusivity, laser flash technique, 388–389 Heats of formation, local density approximation (LDA), 79–81 Heat transfer relations chemical vapor deposition (CVD) model, radiation, 172–173
differential thermal analysis (DTA)/differential scanning calorimetry (DSC), 368–369 thermal diffusivity, laser flash technique, 385–386 thermogravimetric (TG) analysis, 350–352 Heavy-atom model, single-crystal x-ray structure determination, 860–861 computational techniques, 868–869
Heavy-ion backscattering spectrometry (HIBS) applications, 1279–1280 automation, 1280 basic principles, 1275–1277 competitive, complementary, and alternative strategies, 1275 data analysis and interpretation, 1280–1281 areal density, 1281 background and sensitivity, 1281 mass measurements, 1281 elastic scattering, 1178 instrumentation criteria, 1277–1280 sensitivity and mass resolution, 1278–1279 limitations, 1281–1282 research background, 1273–1275 safety protocols, 1280 sample preparation, 1281 Heavy product, particle scattering, nuclear reactions, 57 Heisenberg-Dirac Hamiltonian, metal alloy magnetism, 180 Heisenberg exchange theory collective magnetism, 515 ferromagnetism, 525–526 surface magneto-optic Kerr effect (SMOKE), 571 Heitler-London series, metal alloy bonding, accuracy calculations, 139–141 Helicopter effect, turbomolecular pumps, venting protocols, 8 Helimagnetism, principles and equations, 526–527 Helium gases, cryopump operation, 9–10 Helium mass spectrometer leak detector (HMSLD), leak detection, vacuum systems, 21–22 Helium vapor pressure, ITS-90 standard, 33 Helmholtz free energy Landau magnetic phase transition theory, 529–530 semiconductor-liquid interfaces, differential capacitance measurements, 617–619 Hemi-spherical analysis, ultraviolet photoelectron spectroscopy (UPS), 729 Hemispherical sector analyzer (HSA), x-ray photoelectron spectroscopy (XPS), 980–982 Hermann-Mauguin symmetry operator improper rotation axis, 39–40 single-crystal x-ray structure determination, crystal symmetry, 855–856 Heterodyne detection, impulsive stimulated thermal scattering (ISTS), 749 Heterogeneous broadening, nuclear quadrupole resonance (NQR), 790 Heterojunction bipolar transistors (HBTs), characterization basic principles, 467–469 competitive and complementary techniques, 466–467 limitations of, 471 measurement equipment sources and selection criteria, 471 protocols and procedures, 469–470 research background, 466–467 sample preparation, 470–471 High-alumina 
ceramics, vacuum system construction, 17 High-energy beam transport, analysis, and detection, trace element accelerator mass spectrometry (TEAMS), 1239 High energy electron diffraction, transmission electron microscopy (TEM), deviation vector and parameter, 1068 High-energy ion beam analysis (IBA) elastic recoil detection analysis (ERDA), 1178 heavy-ion backscattering, 1178
nuclear reaction analysis (NRA), 1178–1179 particle-induced x-ray emission, 1179 periodic table, 1177–1178 research background, 1176–1177 Rutherford backscattering spectrometry, 1178 Higher-order contamination, phonon analysis, 1326–1327 High-field sensors, magnetic field measurements, 506–510 Hall effect sensors, 506–508 magnetoresistive sensors, 508–509 nuclear magnetic resonance (NMR), 509–510 search coils, 509 High-frequency ranges carrier lifetime measurement, automated photoconductivity (PC), 445–446 electron paramagnetic resonance (EPR), continuous-wave (CW) experiments, 795–796 High level injection, carrier lifetime measurement, 428–429 Highly doped layers, carrier lifetime measurement, metal/highly doped layer removal, free carrier absorption (FCA), 443 High-pressure magnetometry, principles and applications, 535 High-resolution spectrum, x-ray photoelectron spectroscopy (XPS), 974–978 appearance criteria, 993–994 final-state effects, 974–976 initial-state effects, 976–978 silicon example, 997–998 High-resolution transmission electron microscopy (HRTEM) astigmatism, 1079 basic principles, 1064 bright-field/dark-field imaging, 1071 data analysis, 1081 sample preparation, 1108 scanning transmission electron microscopy (STEM) vs., 1090–1092 phase-contrast imaging, 1096–1097 High-strain-rate testing basic principles, 290–292 data analysis and interpretation, 296–298 limitations, 299–300 method automation, 296 protocols for, 292–296 stress-state equilibrium, 294–295 temperature effects, 295–296 sample preparation, 298 specimen modification, 298–299 theoretical background, 288–290 High-temperature electron emitter, hot cathode ionization gauges, 15 High-temperature superconductors (HTSs) electrical transport measurement applications, 472 contact materials, 474 current-carrying area, 476 current sharing with other materials, 474 current supply, 476–477 powder-in-tube HTSs, sample quality issues, 486 magnetic measurements vs., 473 superconducting magnets, 500–502 
High-temperature testing, stress-strain analysis, 286 High-vacuum pumps, classification, 6–12 cryopumps, 9–10 diffusion pumps, 6–7 getter pumps, 12 nonevaporable getter pumps (NEGs), 12 sputter-ion pumps, 10–12 sublimation pumps, 12 turbomolecular pumps, 7–9
Holographic imaging, magnetic domain structure measurements, 554–555 Holographic notch filter (HNF), Raman spectroscopy of solids, optics properties, 706 Homogeneous materials carrier lifetime measurement, 428–429 thermal diffusivity, laser flash technique, 391–392 Hönl corrections, x-ray diffraction, 210 Hooke’s law, elastic deformation, 281 Hopkinson bar technique design specifications for, 292–294 high-strain-rate testing historical background, 289 protocols for, 292–296 stress-state equilibrium, 294–295 temperature effects, 295–296 tensile/torsional variants, 289–290 Hot cathode ionization gauges calibration stability, 16 electron stimulated desorption (ESD), 15–16 operating principles, 14–15 Hot corrosion, electrochemical impedance spectroscopy (EIS), 601–603 Hot zone schematics, thermogravimetric (TG) analysis, 348–349 Hund’s rules magnetic moments, 512 atomic and ionic magnetism, local moment origins, 513–515 metal alloy magnetism, 180–181 electronic structure, 184–185 Hybridization, diffuse intensities, metal alloys, charge correlation effects, NiPt alloys, 266–268 Hybrid magnets, structure and properties, 503–504 Hydraulic-driven mechanical testing system, fracture toughness testing, load-displacement curve measurement, 307–308 Hydrocarbon layer deposition, nuclear reaction analysis (NRA) and proton-induced gamma ray emission (PIGE) and, error detection, 1207–1208 Hydrodynamics, chemical vapor deposition (CVD) model basic components, 170–171 software tools, 174 Hyperboloid sheets, two-beam diffraction, dispersion surface, 230 Hyperfine interactions, Mössbauer spectroscopy body-centered cubic (bcc) iron solutes, 828–830 data analysis and interpretation, 831–832 magnetic field splitting, 823–824 overview, 820–821 Hysteresis loop magnetism principles, 494–495 permanent magnets, 497–499 surface magneto-optic Kerr effect (SMOKE), 569–570 experimental protocols, 571–572 Ice accumulation, energy-dispersive spectrometry (EDS), 1155 Ideal diode equation, pn
junction characterization, 467–468 Ideality factor, pn junction characterization, 470 Ignition systems, combustion calorimetry, 376–377 Illuminated semiconductor-liquid interface J-E equations, 608–609 monochromatic illumination, 610 polychromatic illumination, 610–611 Illumination modes, reflected-light optical microscopy, 676–680
Image-analysis measurement, hardness test equipment, automated reading, 319 Image-enhancement techniques, reflected-light optical microscopy, 676–680 Image formation protocols scanning electron microscopy (SEM), 1052–1053 quality control, 1059–1060 selection criteria, 1061, 1063 scanning tunneling microscopy (STM), 1114 Imaging plate systems, single-crystal x-ray structure determination, 860 Impact parameters particle scattering, central-field theory, 57–58 wear testing protocols, 331 Impedance analysis, electrochemical quartz crystal microbalance (EQCM), 655–657 Impingement of projectiles, materials characterization, 1 Impulse response, scanning electrochemical microscopy (SECM), feedback mode, 639–640 Impulsive stimulated thermal scattering (ISTS) applications, 749–752 automation, 753 competitive and related techniques, 744–746 data analysis and interpretation, 753–757 limitations, 757–758 procedures and protocols, 746–749 research background, 744–746 sample preparation and specimen modification, 757 vendor information, 759 Impurity measurements, trace element accelerator mass spectrometry (TEAMS) bulk analysis, 1247–1249 depth-profiling techniques, 1250–1251 Inclined geometry, grazing-incidence diffraction (GID), 243 Incoherent imaging, scanning transmission electron microscopy (STEM) phase-contrast illumination vs., 1094–1097 probe configuration, 1098 research background, 1091–1092 scattering devices, 1098–1101 weakly scattering objects, 1111 Indexing procedures low-energy electron diffraction (LEED), qualitative analysis, 1122–1124 neutron powder diffraction, 1297–1298 transmission electron microscopy (TEM) diffraction pattern indexing, 1073–1074 Kikuchi line indexing, 1076–1077 Indirect detection techniques, nuclear quadrupole resonance (NQR), field cycling methods, 781 Indirect mass measurement techniques, basic principles, 24 Indirect structural technique, x-ray absorption spectroscopy, 870 Induced currents, magnetic field effects and applications,
496 Inductively coupled plasma (ICP) mass spectrometry heavy-ion backscattering spectrometry (HIBS) and, 1275 particle-induced x-ray emission (PIXE), 1211 trace element accelerator mass spectrometry (TEAMS) and, 1237 Inelastic scattering diffuse scattering subtraction, 890–893 magnetic neutron scattering basic principles, 1331 data analysis, 1335–1336 particle scattering, 52 diagrams, 52–54 x-ray photoelectron spectroscopy (XPS), survey spectrum, 974
Inertia, high-strain-rate testing, 299 Inert/reactive atmosphere, thermogravimetric (TG) analysis, 354–355 Infrared absorption spectroscopy (IRAS), solids analysis, vs. Raman spectroscopy, 699 Infrared (IR) spectrometry gas analysis and, 397–398 magnetic neutron scattering and, 1321 Inhomogeneity Hall effect, semiconductor materials, 417 x-ray photoelectron spectroscopy (XPS), initial-state effects, 978 Inhomogeneous path probability method (PPM), microstructural evolution, 116–117 Initial-state effects, x-ray photoelectron spectroscopy (XPS), 976–978 In situ analysis time-dependent neutron powder diffraction, 1300 transmission electron microscopy (TEM), specimen modification, 1088 Instrumentation criteria capacitance-voltage (C-V) characterization, 463–464 combustion calorimetry, 375–377 differential thermal analysis (DTA)/differential scanning calorimetry (DSC), 367 selection criteria, 373 electrochemical quartz crystal microbalance (EQCM), 658–659 automated procedures, 659–660 gas analysis, simultaneous thermogravimetry (TG)-differential thermal analysis (TG-DTA), sample holder criteria, 394 heavy-ion backscattering spectrometry (HIBS), 1279 impulsive stimulated thermal scattering (ISTS), 749–752 ion-beam-induced charge (IBIC)/single event upset (SEU) microscopy, 1228–1230 liquid surface x-ray diffraction, 1036–1038 low-energy electron diffraction (LEED), 1125–1126 magnetic x-ray scattering, 934–935 medium-energy backscattering, 1267–1268 neutron powder diffraction, peak shape, 1292 nuclear magnetic resonance, 770–771 nuclear reaction analysis (NRA)/proton-induced gamma ray emission (PIGE), 1203–1204 optical microscopy, 668–671 phonon analysis, triple-axis spectrometry, 1320–1323 resonant scattering, 912–913 scanning electrochemical microscopy (SECM), 642–643 scanning electron microscopy (SEM), basic components, 1053–1057 scanning transmission electron microscopy (STEM), sources, 1111 single-crystal neutron diffraction, 1312–1313 single-crystal x-ray structure
determination, 859–860 single event upset (SEU) microscopy, 1227–1228 superconductors, electrical transport measurements, 476–477, 479–480 surface x-ray diffraction, 1011–1015 crystallographic alignment, 1014–1015 five-circle diffractometer, 1013–1014 laser alignment, 1014 sample manipulator, 1012–1013 vacuum system, 1011–1012 thermal diffusivity, laser flash technique, 385–386
trace element accelerator mass spectrometry (TEAMS), 1258 ultraviolet photoelectron spectroscopy (UPS), 733 x-ray magnetic circular dichroism (XMCD), 957–962 circularly polarized sources, 957–959 detector devices, 959 measurement optics, 959–962 x-ray photoelectron spectroscopy (XPS), 978–983 analyzers, 980–982 detectors, 982 electron detection, 982–983 maintenance, 983 sources, 978–980 Instrumentation errors, hardness testing, 322 Instrumented indentation testing, basic principles, 317 Integrated intensity magnetic x-ray scattering, nonresonant antiferromagnetic scattering, 929–930 small-angle scattering (SAS), 222 two-beam diffraction, 235 x-ray powder diffraction, Bragg peaks, 840 Integrated vertical Hall sensor, magnetic field measurements, 507–508 Intensity computations Auger electron spectroscopy (AES), 1167–1168 diffuse scattering techniques, 886–889 absolute calibration, measured intensities, 894 derivation protocols, 901–904 low-energy electron diffraction (LEED), quantitative measurement, 1124–1125 multiple-beam diffraction, NBEAM theory, 238 particle-induced x-ray emission (PIXE), 1216–1217 x-ray intensities and concentrations, 1221–1222 surface x-ray diffraction, crystal truncation rods (CTR), 1009 Intensity-measuring ellipsometer, automation, 739 Interaction potentials molecular dynamics (MD) simulation, surface phenomena, 159 particle scattering, central-field theory, 57 Interdiffusion, substitutional and interstitial metallic systems, 155 Interfacial energy, microstructural evolution, 120 Interference effects nuclear reaction analysis (NRA) and proton-induced gamma ray emission (PIGE), 1205 surface/interface x-ray diffraction, crystal truncation rods (CTR), 219 x-ray absorption fine structure (XAFS) spectroscopy, single scattering picture, 871–872 x-ray diffraction crystal structure, 208–209 single-crystal x-ray structure determination, 851–858 Interlayer surface relaxation, molecular dynamics (MD) simulation, room temperature structure and
dynamics, 161 Internal fields, dynamical diffraction, basic principles, 229 Internal magnetic field, ferromagnetism, 522–524 International Organization for Legal Metrology (OIML), weight standards, 26–27 International symmetry operators, improper rotation axis, 39–40 International temperature scales (ITS) IPTS-27 scale, 32–33 IPTS-68 scale, 33 ITS-90, 33
Interparticle interference, small-angle scattering (SAS), 222 Interplanar spacing, transmission electron microscopy (TEM), diffraction pattern indexing, 1073–1074 Interpolating gas thermometer, ITS-90 standard, 33 Inverse heating-rate curves, defined, 338 Inverse modeling, impulsive stimulated thermal scattering (ISTS), 753–754 Inverse photoemission, ultraviolet photoelectron spectroscopy (UPS) vs. photoemission, 722–723 valence electron characterization, 723–724 Inverse photoemission (IPES), metal alloy magnetism, local exchange splitting, 189–190 Ion beam analysis (IBA) accelerator mass spectrometry, 1235 elastic ion scattering for composition analysis applications, 1191–1197 basic concepts, 1181–1184 detector criteria and detection geometries, 1185–1186 equations, 1186–1189 experimental protocols, 1184–1186 limitations, 1189–1191 research background, 1179–1181 forward-recoil spectrometry applications, 1268–1269 automation, 1269 backscattering data, 1269–1270 basic principles, 1261–1265 complementary and alternative techniques, 1261 data analysis and interpretation, 1269–1271 forward-recoil data, 1270–1271 instrumentation criteria, 1267–1268 limitations, 1271–1272 research background, 1259–1261 resolution, 1267 safety issues, 1269 sensitivity parameters, 1266–1267 spectrometer efficiency, 1265–1266 time-of-flight spectrometry (TOS), 1265 heavy-ion backscattering spectrometry (HIBS) applications, 1279–1280 automation, 1280 basic principles, 1275–1277 competitive, complementary, and alternative strategies, 1275 data analysis and interpretation, 1280–1281 areal density, 1281 background and sensitivity, 1281 mass measurements, 1281 instrumentation criteria, 1277–1280 sensitivity and mass resolution, 1278–1279 limitations, 1281–1282 research background, 1273–1275 safety protocols, 1280 sample preparation, 1281 high-energy ion beam analysis (IBA) elastic recoil detection analysis (ERDA), 1178 heavy-ion backscattering, 1178 nuclear reaction analysis (NRA), 1178–1179 
particle-induced x-ray emission, 1179 periodic table, 1177–1178 research background, 1176–1177 Rutherford backscattering spectrometry, 1178 medium-energy backscattering applications, 1268–1269 automation, 1269 backscattering data, 1269–1270 basic principles, 1261–1265 complementary and alternative techniques, 1261
data analysis and interpretation, 1269–1271 forward-recoil data, 1270–1271 instrumentation criteria, 1267–1268 limitations, 1271–1272 research background, 1258–1259, 1259–1261 resolution, 1267 safety issues, 1269 sensitivity parameters, 1266–1267 spectrometer efficiency, 1265–1266 time-of-flight spectrometry (TOS), 1265 nuclear reaction analysis (NRA) automation, 1205–1206 background, interference, and sensitivity, 1205 cross-sections and Q values, 1205 data analysis, 1206–1207 energy relations, 1209 energy scanning, resonance depth profiling, 1204–1205 energy spread, 1209–1210 instrumentation criteria, 1203–1204 limitations, 1207–1208 nonresonant methods, 1202–1203 research background, 1200–1202 resonant depth profiling, 1203 specimen modification, 1207 standards, 1205 unwanted particle filtering, 1204 particle-induced x-ray emission (PIXE) automation, 1216 basic principles, 1211–1212 competing methods, 1210–1211 data analysis, 1216–1218 limitations, 1220 protocols and techniques, 1213–1216 research background, 1210–1211 sample preparation, 1218–1219 specimen modification, 1219–1220 proton-induced gamma ray emission (PIGE) automation, 1205–1206 background, interference, and sensitivity, 1205 cross-sections and Q values, 1205 data analysis, 1206–1207 energy relations, 1209 energy scanning, resonance depth profiling, 1204–1205 energy spread, 1209–1210 instrumentation criteria, 1203–1204 limitations, 1207–1208 nonresonant methods, 1202–1203 research background, 1200–1202 resonant depth profiling, 1203 specimen modification, 1207 standards, 1205 unwanted particle filtering, 1204 radiation effects microscopy basic principles, 1224–1228 instrumentation criteria, 1228–1230 ion-induced damage, 1232–1233 limitations, 1233 quantitative analysis, pulse height interpretations, 1225–1226 research background, 1223–1224 semiconductor materials, 1225 SEU microscopy, 1227–1228 static random-access memory (SRAM), 1230–1231 specimen modification, 1231–1232 topographical contrast,
1226–1227 research background, 1175–1176 trace element accelerator mass spectrometry (TEAMS) automation, 1247 bulk analysis impurity measurements, 1247–1249
measurement data, 1249–1250 complementary, competitive and alternative methods, 1236–1238 inductively coupled plasma mass spectrometry, 1237 neutron activation-accelerator mass spectrometry (NAAMS), 1237 neutron activation analysis (NAA), 1237 secondary-ion mass spectrometry (SIMS), 1236–1237 selection criteria, 1237–1238 sputter-initiated resonance ionization spectrometry (SIRIS), 1237 data analysis and interpretation, 1247–1252 calibration of data, 1252 depth-profiling data analysis, 1251–1252 impurity measurements, 1250–1251 facilities profiles, 1242–1246 CSIRO Heavy Ion Analytical Facility (HIAF), 1245 Naval Research Laboratory, 1245–1246 Paul Scherrer Institute (PSI)/ETH Zurich Accelerator SIMS Laboratory, 1242–1244 Technical University Munich Secondary Ion AMS Facility, 1245 University of North Texas Ion Beam Modification and Analysis Laboratory, 1246 University of Toronto IsoTrace Laboratory, 1244–1245 facility requirements, 1238 future applications, 1239 high-energy beam transport, analysis, and detection, 1239 historical evolution, 1246–1247 impurity measurements bulk analysis, 1247–1249 depth-profiling, 1250–1251 instrumentation criteria, 1239–1247 magnetic and electrostatic analyzer calibration, 1241–1242 ultraclean ion source design, 1240–1241 instrumentation specifications and suppliers, 1258 limitations, 1253 research background, 1235–1238 sample preparation, 1252–1253 secondary-ion acceleration and electron-stripping system, 1238–1239 specimen modification, 1253 ultraclean ion sources, negatively charged secondary-ion generation, 1238 Ion-beam-induced charge (IBIC) microscopy basic principles, 1224–1228 instrumentation criteria, 1228–1230 ion-induced damage, 1232–1233 limitations, 1233 quantitative analysis, pulse height interpretations, 1225–1226 research background, 1223–1224 semiconductor materials, 1225 SEU microscopy, 1227–1228 static random-access memory (SRAM), 1230–1231 specimen modification, 1231–1232 topographical contrast, 1226–1227 Ion
burial, sputter-ion pump, 11 Ion excitation Auger electron spectroscopy (AES), peak error detection, 1171–1173 energy-dispersive spectrometry (EDS), 1136 Ionic magnetism, local moment origins, 513–515 Ionic migration, energy-dispersive spectrometry (EDS), specimen modification, 1152–1153
Ion-induced damage, ion-beam-induced charge (IBIC)/single event upset (SEU) microscopy, 1232–1233 Ionization cross-section, energy-dispersive spectrometry (EDS), standardless analysis, 1148 Ionization gauges cold cathode type, 16 hot cathode type calibration stability, 16 electron stimulated desorption (ESD), 15–16 operating principles, 14–15 Ionization loss peaks, Auger electron spectroscopy (AES), error detection, 1171 Ion milling, transmission electron microscopy (TEM), sample preparation, 1086–1087 Ion pumps, surface x-ray diffraction, 1011–1012 Ion scattering spectroscopy (ISS), Auger electron spectroscopy (AES) vs., 1158–1159 Iron (Fe) alloys. See also Nickel-iron alloys magnetism atomic short range order (ASRO), 191 gold-rich AuFe alloys, 198–199 magnetocrystalline anisotropy energy (MAE), 193 moment formation and bonding, FeV alloys, 196 Mössbauer spectroscopy, body-centered cubic (bcc) solutes, 828–830 paramagnetism, first-principles calculations, 189 Irreversibility field (Hirr), superconductors, electrical transport measurement applications, 472 extrapolation, 475 magnetic measurement vs., 473 Ising energy, phase diagram prediction, static displacive interactions, 104–106 Isobaric mass change determination, defined, 338 Isolation valves, sputter-ion pump, 12 Isomers, combustion calorimetry, 377–378 Isomer shift, Mössbauer spectroscopy basic principles, 821–822 hyperfine interactions, 820–821 Isothermal method, thermogravimetric (TG) analysis, kinetic theory, 352–354 Isotropic materials, reflected-light optical microscopy, 678–680 Itinerant magnetism, transition metal magnetic ground state, zero temperature, 181–183 J-E equations, semiconductor-liquid interface concentration overpotentials, 611–612 illuminated cases, 608–609 kinetic properties, 631–633 series resistance overpotentials, 612 J-integral approach, fracture toughness testing basic principles, 306–307 crack extension measurement, 311 research background, 302 sample preparation, 312
SENB and CT specimens, 314–315 unstable fractures, 309 Johnson noise, scanning tunneling microscopy (STM), 1113–1114 Joint leaks, vacuum systems, 20 Junction capacitance transient, deep level transient spectroscopy (DLTS), semiconductor materials, 421–423 Junction diodes, deep level transient spectroscopy (DLTS), semiconductor materials, 420 Junction field-effect transistors (JFETs), characterization basic principles, 467–469 competitive and complementary techniques, 466–467
limitations of, 471 measurement equipment sources and selection criteria, 471 protocols and procedures, 469–470 research background, 466–467 sample preparation, 470–471 Katharometer, evolved gas analysis (EGA) and chromatography, 396–397 Keating model, surface x-ray diffraction measurements, crystallographic refinement, 1018 Keithley-590 capacitance meter, capacitance-voltage (C-V) characterization, 460–461 Kelvin-Onsager relations, magnetotransport in metal alloys, 560–561 Kelvin relations, magnetotransport in metal alloys, basic principles, 560 Kerr effect magnetometry magnetic domain structure measurements, magneto-optic imaging, 547–549 principles and applications, 535–536 surface magneto-optic Kerr effect (SMOKE) automation, 573 classical theory, 570–571 data analysis and interpretation, 573–574 limitations, 575 multilayer formalism, 571 phenomenological origins, 570 protocols and procedures, 571–573 quantum theory, ferromagnetism, 571 research background, 569–570 sample preparation, 575 Kikuchi-Barker coefficients, phase diagram predictions, cluster variational method (CVM), 99–100 Kikuchi lines scanning transmission electron microscopy (STEM), instrumentation criteria, 1104–1105 transmission electron microscopy (TEM) deviation vector and parameter, 1067–1068 defect contrast, 1085 specimen orientation, 1075–1078 deviation parameters, 1077–1078 electron diffuse scattering, 1075 indexing protocols, 1076–1077 line origins, 1075–1076 Kinematic theory diffuse scattering techniques, crystal structure, 885–889 dynamical diffraction, theoretical background, 224–225 ion beam analysis (IBA), 1176 ion-beam analysis (IBA), elastic two-body collision, 1181–1184 liquid surface x-ray diffraction, reflectivity measurements, 1032–1033 particle scattering, 51–57 binary collisions, 51 center-of-mass and relative coordinates, 54–55 elastic scattering and recoiling, 51–52 inelastic scattering and recoiling, 52 nuclear reactions, 56–57 relativistic collisions, 55–56
scattering/recoiling diagrams, 52–54 single-crystal neutron diffraction and, 1310–1311 transmission electron microscopy (TEM) deviation vector and parameter, 1068 structure and shape factor analysis, 1065–1066 x-ray diffraction crystalline material, 208–209 lattice defects, 210
1363
local atomic arrangement - short-range ordering, 214–217 research background, 206 scattering principles, 206–208 small-angle scattering (SAS), 219–222 cylinders, 220–221 ellipsoids, 220 Guinier approximation, 221 integrated intensity, 222 interparticle interference, 222 K=0 extrapolation, 221 Porod approximation, 221–222 size distribution, 222 spheres, 220 two-phase model, 220 structure factor, 209–210 surface/interface diffraction, 217–219 crystal truncation rods, 219 two-dimensional diffraction rods, 218–219 thermal diffuse scattering (TDS), 210–214 Kinetic energy Auger electron spectroscopy (AES), 1159–1160 chemical vapor deposition (CVD) model basic components, 171 software tools, 175 electrochemical photocapacitance spectroscopy (EPS), semiconductor materials, 625 low-energy electron diffraction (LEED), quantitative analysis, 1129–1131 semiconductor-liquid interface steady-state J-E data, 631–633 time-resolved photoluminescence spectroscopy (TRPL), 630–631 thermal analysis and, 343 thermogravimetric (TG) analysis, applications, 352–354 ultraviolet photoelectron spectroscopy (UPS), photoemission process, 726–727 x-ray photoelectron spectroscopy (XPS), 971–972 analyzer criteria, 981–982 background subtraction, 990–991 Kirchhoff’s law of optics, radiation thermometer, 36–37 Kirkendall effect, binary/multicomponent diffusion, vacancy wind and, 149 Kirkendall shift velocity, binary/multicomponent diffusion, frames of reference, transition between, 148–149 Kirkpatrick-Baez (KB) mirrors, x-ray microprobes, 946 KLM line markers, energy-dispersive spectrometry (EDS), qualitative analysis, 1142–1143 Knock-on damage, transmission electron microscopy (TEM), metal specimen modification, 1088 Knoop hardness testing automated methods, 319 basic principles, 317–318 data analysis and interpretation, 319–320 hardness values, 317–318, 323 limitations and errors, 322 procedures and protocols, 318–319 research background, 316–317 sample preparation, 320 specimen modification,
320–321 Knudsen diffusion coefficient, chemical vapor deposition (CVD) model, free molecular transport, 169–170 Knudsen effect combustion calorimetry, 371–372 thermogravimetric (TG) analysis, instrumentation and apparatus, 349–350
Kohn-Sham equation diffuse intensities, metal alloys, concentration waves, density-functional theory (DFT), 260–262 metal alloy bonding accuracy calculations, 139–141 precision measurements, self-consistency, 141 metal alloy magnetism competitive and related techniques, 185–186 magnetocrystalline anisotropy energy (MAE), 192–193 solid-solution alloy magnetism, 183–184 transition metal magnetic ground state, itinerant magnetism at zero temperature, 182–183 Koopmans’ theorem, electronic structure analysis, Hartree-Fock (HF) theory, 77 Korringa-Kohn-Rostoker (KKR) method band theory of magnetism, layer Korringa Kohn Rostoker (LKKR) technique, 516 diffuse intensities, metal alloys concentration waves, first-principles calculations, electronic structure, 264–266 hybridization in NiPt alloys, charge correlation effects, 267–268 metal alloy magnetism, magnetocrystalline anisotropy energy (MAE), 192–193 phase diagram prediction, electronic structure, 101–102 solid-solution alloy magnetism, 183–184 Kossel cones, transmission electron microscopy (TEM), Kikuchi line indexing, 1076–1077 Kovar, vacuum system construction, 17 k-points, metal alloy bonding, precision calculations, Brillouin zone sampling, 143 Kramers-Kronig relation, resonant scattering analysis, 907 Krivoglaz-Clapp-Moss formula, diffuse intensities, metal alloys concentration waves, density-functional theory (DFT), 260–262 mean-field results, 262–263 Kronecker delta function, 209 Kruschov equation, wear testing, 334 Laboratory balances, applications, 27 Lambert cosine law, radiation thermometer, 37 Lamb modes impulsive stimulated thermal scattering (ISTS), 754–755 Mössbauer effect, 818–820 Landau levels, cyclotron resonance (CR), 805–806 quantum mechanics, 808 Landau-Lifshitz equation, resonant scattering analysis, 909–910 Landau theory, magnetic phase transitions, 529–530 thermomagnetic analysis, 544 Landé g factor, local moment origins, 514–515 Langevin function diamagnetism, dipole moments, atomic origin,
512 ferromagnetism, 524 paramagnetism, classical and quantum theories, 520–522 superparamagnetism, 522 Langmuir monolayers, liquid surface x-ray diffraction data analysis, 1041–1043 grazing incidence and rod scans, 1036 Lanthanum hexaboride electron guns Auger electron spectroscopy (AES), 1160–1161 scanning electron microscopy (SEM) instrumentation criteria, 1054 selection criteria, 1061 Laplace equation
magnetotransport in metal alloys, 560–561 thermal diffusivity, laser flash technique, 388–389 Larmor frequency closed/core shell diamagnetism, dipole moments, atomic origin, 512 nuclear magnetic resonance, 764–765 nuclear quadrupole resonance (NQR), Zeeman-perturbed NRS (ZNRS), 782–783 Laser components impulsive stimulated thermal scattering (ISTS), 750–752 photoluminescence (PL) spectroscopy, 684 Raman spectroscopy of solids, radiation sources, 705 surface x-ray diffraction, alignment protocols, 1014 ultraviolet/visible absorption (UV-VIS) spectroscopy, 690 Laser flash technique, thermal diffusivity automation of, 387 basic principles, 384–385 data analysis and interpretation, 387–389 limitations, 390 protocols and procedures, 385–387 research background, 383–384 sample preparation, 389 specimen modification, 389–390 Laser magneto-spectroscopy (LMS), cyclotron resonance (CR) basic principles, 808–812 data analysis and interpretation, 813–814 error detection, 814–815 far-infrared (FIR) sources, 810–812 Laser spot scanning (LSS), semiconductor-liquid interface, photoelectrochemistry, 626–628 Lattice-fixed frame of reference binary/multicomponent diffusion, 148 substitutional and interstitial metallic systems, 152 Lattice systems.
See also Reciprocal lattices conductivity measurements, research background, 401–403 crystallography, 44–46 Miller indices, 45–46 deep level transient spectroscopy (DLTS), 419 defects, x-ray diffraction, 210 density definition and, 26 local atomic arrangement - short-range ordering, 214–217 metal alloy magnetism, atomic short range order (ASRO), 191 molecular dynamics (MD) simulation, surface phenomena, 158–159 phonon analysis, basic principles, 1317–1318 Raman active vibrational modes, 709–710, 720–721 single-crystal x-ray structure determination, crystal symmetry, 854–856 thermal diffuse scattering (TDS), 212–214 transition metal magnetic ground state, itinerant magnetism at zero temperature, 183 transmission electron microscopy (TEM), defect diffraction contrast, 1068–1069 ultraviolet photoelectron spectroscopy (UPS), 730 x-ray powder diffraction, crystal lattice determination, 840 Laue conditions diffuse scattering techniques, 887–889 low-energy electron diffraction (LEED), 1122 single-crystal neutron diffraction, instrumentation criteria, 1312–1313 single-crystal x-ray structure determination, 853
protocols and procedures, 858–860 two-beam diffraction anomalous transmission, 231–232 diffracted intensities, 233–234 hyperboloid sheets, 230 Pendellösung, 231 x-ray standing wave (XSW) diffraction, 232 x-ray diffraction crystal structure, 208–209 local atomic correlation, 215–217 Layer density, surface phenomena, molecular dynamics (MD) simulation, temperature variation, 164 Layer Korringa Kohn Rostoker (LKKR) technique, band theory of magnetism, 516 Lead zirconium titanate film, impulsive stimulated thermal scattering (ISTS), 751–752 Leak detection, vacuum systems, 20–22 Least-squares computation energy-dispersive spectrometry (EDS), peak overlap deconvolution, multiple linear least squares (MLLS) method, 1144–1145 neutron powder diffraction refinement algorithms, 1300–1301 Rietveld refinements, 1296 single-crystal x-ray structure determination, crystal structure refinements, 857–858 thermal diffusivity, laser flash technique, 389 L edges resonant scattering angular dependent tensors L=2, 916 L=2 measurements, 910–911 L=4 measurements, 911–912 x-ray absorption fine structure (XAFS) spectroscopy, 873 Lennard-Jones parameters, chemical vapor deposition (CVD) model, kinetic theory, 171 Lens defects and resolution optical microscopy, 669 scanning electron microscopy (SEM), 1054 transmission electron microscopy (TEM), 1078–1080 aperture diffraction, 1078 astigmatism, 1079 chromatic aberration, 1078 resolution protocols, 1079–1080 spherical aberration, 1078 Lenz-Jensen potential, ion-beam analysis (IBA), non-Rutherford cross-sections, 1190 Lenz’s law of electromagnetism diamagnetism, 494 closed/core shell diamagnetism, dipole moments, atomic origin, 512 superconductor magnetization, 518–519 Level-crossing double resonance NQR nutation spectroscopy, 785 LHPM program, neutron powder diffraction axial divergence peak asymmetry, 1293 Rietveld refinements, 1306 Lifetime analysis carrier lifetime measurement, free carrier absorption (FCA), 441–442 x-ray photoelectron
spectroscopy (XPS), 973–974 Lifetime characterization. See Carrier lifetime measurement Lifetime depth profiling, carrier lifetime measurement, free carrier absorption (FCA), 441 Lifetime mapping, carrier lifetime measurement, free carrier absorption (FCA), 441–442 Light-emitting diodes (LEDs) carrier lifetime measurement, photoluminescence (PL), 451–452 deep level transient spectroscopy (DLTS), semiconductor materials, 419
Light-ion backscattering, medium-energy backscattering, trace-element sensitivity, 1266–1267 Light leakage, energy-dispersive spectrometry (EDS), 1154–1155 Light product, particle scattering, nuclear reactions, 57 Light scattering, Raman spectroscopy of solids, semiclassical physics, 701–702 Light sources ellipsometry, 738 ultraviolet photoelectron spectroscopy (UPS), 728–729 Linear combination of atomic orbitals (LCAOs), metal alloy bonding, precision calculations, 142–143 Linear dependence, carrier lifetime measurement, free carrier absorption (FCA), 438–439 Linear elastic fracture mechanics (LEFM) fracture toughness testing crack tip opening displacement (CTOD) (d), 307 stress intensity factor (K), 306 fracture-toughness testing, 302 basic principles, 303 Linear laws, binary/multicomponent diffusion, 146–147 Linearly variable differential transformer (LVDT), stress-strain analysis, low-temperature testing, 286 Linear muffin tin orbital method (LMTO), electronic topological transitions, van Hove singularities in CuPt, 272–273 Linear polarization, corrosion quantification, 596–599 Line scans, Auger electron spectroscopy (AES), 1163–1164 Lineshape analysis, surface x-ray diffraction, 1019 Liouville’s theorem, ultraviolet photoelectron spectroscopy (UPS), figure of merit light sources, 734–735 Liquid alkanes, liquid surface x-ray diffraction, surface crystallization, 1043 Liquid-filled thermometer, operating principles, 35 Liquid metals, liquid surface x-ray diffraction, 1043 Liquid nitrogen (LN2) temperature trap, diffusion pump, 6–7 Liquid surfaces, x-ray diffraction basic principles, 1028–1036 competitive and related techniques, 1028 data analysis and interpretation, 1039–1043 Langmuir monolayers, 1041–1043 liquid alkane crystallization, 1043 liquid metals, 1043 simple liquids, 1040–1041 non-specular scattering GID, diffuse scattering, and rod scans, 1038–1039 reflectivity measurements, 1033 p-polarized x-ray beam configuration, 1047 reflectivity, 1029–1036
Born approximation, 1033–1034 distorted-wave Born approximation, 1034–1036 Fresnel reflectivity, 1029–1031 grazing incidence diffraction and rod scans, 1036 instrumentation, 1036–1038 multiple stepwise and continuous interfaces, 1031–1033 non-specular scattering, 1033 research background, 1027–1028
specimen modification, 1043–1045 Literature sources dynamical diffraction, 226–227 on thermal analysis, 343–344 Lithium models, computational analysis, 72–74 Load-displacement behaviors, fracture toughness testing J-integral approach, 306–307 measurement and recording apparatus, 307–308 notched specimens, 303–304 stable crack mechanics, 309–311 unstable fractures, 308–309 Loading modes, fracture toughness testing, 302–303 Load-lock procedures, surface x-ray diffraction, ultrahigh-vacuum (UHV) systems, 1023 Load-to-precision ratio (LPR), thermogravimetric (TG) analysis, 347–350 Local atomic correlation, x-ray diffraction, short-range ordering, 214–217 Local atomic dipole moments collective magnetism, 515 ionic magnetism, 513–515 Local exchange splitting, metal alloy magnetism, 189–190 atomic short range order (ASRO), 191 Localized material deposition and dissolution, scanning electrochemical microscopy (SECM), 645–646 Local measurements, carrier lifetime measurement, device-related techniques, 435 Local moment fluctuation, metal alloy magnetism, 187–188 Local spin density approximation (LDA) computational theory, 74 diffuse intensities, metal alloys, concentration waves, first-principles calculations, electronic structure, 263–266 electronic structure analysis basic components, 77–84 elastic constants, 81 extensions, 76–77 gradient corrections, 78 GW approximation, 84–85 heats of formation and cohesive energies, 79–81 implementation, 75–76 magnetic properties, 81–83 optical properties, 83–84 structural properties, 78–79 summary, 75 SX approximation, 85–86 metal alloy bonding, accuracy calculations, 139–141 metal alloy magnetism, limitations, 201 Local spin density approximation (LDA)+U theory electronic structure analysis, 75 dielectric screening, 86–87 metal alloy magnetism, competitive and related techniques, 186 Local spin density approximation (LSDA) metal alloy magnetism, competitive and related techniques, 185–186 transition metal magnetic ground state,
itinerant magnetism at zero temperature, 181–183 Lock-in amplifiers/photon counters, photoluminescence (PL) spectroscopy, 685 Longitudinal Kerr effect, surface magneto-optic Kerr effect (SMOKE), 571–572 Longitudinal relaxation time, nuclear quadrupole resonance (NQR), 779–780 Long-range order (LRO), diffuse intensities, metal alloys, 253–254 Lorentz force
electromagnet structure and properties, 499–500 magnetic field effects and applications, 496 Mössbauer spectroscopy, 817–818 pulse magnets, 504–505 Lorentzian equation magnetic x-ray scattering, 934–935 two-beam diffraction, diffracted intensities, 234 Lorentzian peak shape, x-ray photoelectron spectroscopy (XPS), 1005–1006 Lorentz transmission electron microscopy basic principles, 1064 magnetic domain structure measurements, 551–552 Lossy samples, electron paramagnetic resonance (EPR), 799 Low-energy electron diffraction (LEED) automation, 1127 basic principles, 1121–1125 calculation protocols, 1134–1135 coherence estimation, 1134 complementary and related techniques, 1120–1121 data analysis and interpretation, 1127–1131 instrumentation criteria, 1125–1127 limitations, 1132 liquid surfaces and monomolecular layers, 1028 qualitative analysis basic principles, 1122–1124 data analysis, 1127–1128 quantitative measurements basic principles, 1124–1125 data analysis, 1128–1131 research background, 1120–1121 sample preparation, 1131–1132 scanning tunneling microscopy (STM), sample preparation, 1117 specimen modification, 1132 surface x-ray diffraction and, 1007–1008 ultraviolet photoelectron spectroscopy (UPS), sample preparation, 732–733 Low-energy electron microscopy (LEEM), magnetic domain structure measurements spin polarized low-energy electron microscopy (SPLEEM), 556–557 x-ray magnetic circular dichroism (XMCD), 555 Low-energy ion scattering (LEIS) Auger electron spectroscopy (AES) vs., 1158–1159 ion beam analysis (IBA), 1175–1176 medium-energy backscattering, 1259 Low-energy peak distortion, energy-dispersive spectrometry (EDS), 1155 Low-field sensors, magnetic field measurements, 506 Low-temperature photoluminescence (PL) spectroscopy, band identification, 686–687 Low-temperature superconductors (LTSs), electrical transport measurement applications, 472 contact materials, 474 current sharing with other materials, 474 Low-temperature testing, stress-strain
analysis, 286 Lubricated materials, tribological and wear testing, 324–325 properties of, 326 Lüders strain, stress-strain analysis, 282 Luggin capillaries, semiconductor electrochemical cell design, photocurrent/photovoltage measurements, 610 Lumped-circuit resonators, electron paramagnetic resonance (EPR), 796 Lumped deposition model, chemical vapor deposition (CVD), 167–168 limitations of, 175–176
Macroscopic properties, computational analysis, 72–74 Madelung potential, x-ray photoelectron spectroscopy (XPS) chemical state information, 986 initial-state effects, 977–978 Magnetic bearings, turbomolecular pumps, 7 Magnetic circular dichroism (MCD) magnetic domain structure measurements, 555–556 automation of, 555–556 basic principles, 555 procedures and protocols, 555 ultraviolet photoelectron spectroscopy (UPS) and, 726 Magnetic diffraction, magnetic neutron scattering, 1329–1330 Magnetic dipoles, magnetism, general principles, 492 Magnetic domain structure measurements Bitter pattern imaging, 545–547 holography, 554–555 Lorentz transmission electron microscopy, 551–552 magnetic force microscopy (MFM), 549–550 magneto-optic imaging, 547–549 scanning electron microscopy (Types I and II), 550–551 polarization analysis, 552–553 scanning Hall probe and scanning SQUID microscopes, 557 spin polarized low-energy electron microscopy, 556–557 theoretical background, 545 x-ray magnetic circular dichroism, 555–556 Magnetic effects, diffuse intensities, metal alloys, chemical order and, 268 Magnetic field gradient Mössbauer spectroscopy, hyperfine magnetic field (HMF), 823–824 spatially resolved nuclear quadrupole resonance, 785–786 Magnetic fields continuous magnetic fields, 505 effects and applications, 496 electromagnets, 499–500 generation, 496–505 laboratories, 505 magnetotransport in metal alloys free electron models, 563–565 research applications, 559 transport equations, 560–561 measurements high-field sensors, 506–508 low-field sensors, 506 magnetoresistive sensors, 508–509 nuclear magnetic resonance, 509–510 research background, 505–506 search coils, 509 permanent magnets, 497–499 pulse magnets, 504–505 laboratories, 505 research background, 495–496 resistive magnets, 502–504 hybrid magnets, 503–504 power-field relations, 502–503 superconducting magnets, 500–502 changing fields stability and losses, 501–502 protection, 501 quench and training, 501 Magnetic
force microscopy (MFM), magnetic domain structure measurements, 549–550 Magnetic form factor, resonant magnetic x-ray scattering, 922–924 Magnetic induction, principles of, 511
Magnetic measurements, superconductors, electric transport measurements vs., 472–473 Magnetic moments band theory of solids, 515–516 collective magnetism, 515 dipole moment coupling, 519–527 antiferromagnetism, 524 ferrimagnetism, 524–525 ferromagnetism, 522–524 Heisenberg model and exchange interactions, 525–526 helimagnetism, 526–527 paramagnetism, classical and quantum theories, 519–522 dipole moments, atomic origin, 512–515 closed/core shell diamagnetism, 512 ionic magnetism, local atomic moment origins, 513–515 magnetic field quantities, 511–512 neutron powder diffraction, 1288–1289 research background, 511 spin glass and cluster magnetism, 516–517 superconductors, 517–519 Magnetic neutron scattering automation, 1336 data analysis, 1337 diffraction applications, 1332–1335 inelastic scattering basic principles, 1331 protocols for, 1335–1336 limitations, 1337–1338 magnetic diffraction, 1329–1330 neutron magnetic diffuse scattering, 904–905 polarized beam technique, 1330–1331 research background, 1328–1329 sample preparation, 1336–1337 subtraction techniques, 1330 Magnetic order, binary/multicomponent diffusion, substitutional and interstitial metallic systems, 153–154 Magnetic permeability defined, 492–493 vacuum permeability, 511 Magnetic phase transition theory critical exponents, 530 Landau theory, 529–530 thermodynamics, 528–529 Magnetic phenomena, superconductors, electrical transport measurements, 479 Magnetic resonance imaging (MRI). See also Nuclear magnetic resonance (NMR) applications, 767–771 basic principles and applications, 762–763 permanent magnets, 496 superconducting magnets, 500–502 theoretical background, 765–767 Magnetic sector mass spectrometer, pressure measurements, 16–17 Magnetic short-range order (MSRO), face-centered cubic (fcc) iron, moment alignment vs.
moment formation, 194–195 Magnetic susceptibility, defined, 492–493 Magnetic x-ray scattering data analysis and interpretation, 934–935 hardware criteria, 925–927 limitations, 935–936 magnetic neutron scattering and, 1320–1321 nonresonant scattering antiferromagnets, 928–930 ferromagnets, 930 research background, 918–919 theoretical concepts, 920–921 research background, 917–919 resonant scattering antiferromagnets, 930–932
ferromagnets, 932 research background, 918–919 theoretical concepts, 921–924 sample preparation, 935 spectrometer criteria, 927–928 surface magnetic scattering, 932–934 theoretical concepts, 919–925 ferromagnetic scattering, 924–925 nonresonant scattering, 920–921 resonant scattering, 921–924 Magnetism. See also Paramagnetism antiferromagnetism, 494 diamagnetism, 494 ferromagnetism, 494–495 local density approximation (LDA), 81, 83 metal alloys anisotropy, 191–193 MAE calculations, 192 pure Fe, Ni, and Co, 193 approximation techniques, 185–186 atomic short range order, 190–191 atoms-to-solids transition, 180–181 bonding accuracy calculations, 140–141 data analysis and interpretation, 193–200 ASRO in FeV, 196–198 ASRO in gold-rich AuFe alloys, 198–199 atomic long- and short-range order, NiFe alloys, 193 magnetic moments and bonding in FeV alloys, 196 magnetocrystalline anisotropy, Co-Pt alloys, 199–200 moment alignment vs. formation in fcc Fe, 193–195 negative thermal expansion effects, 195–196 electronic structure and Slater-Pauling curves, 184–185 finite temperature paramagnetism, 186–187 first-principles theories, 188–189 paramagnetic Fe, Ni, and Co, 189 limitations of analysis, 200–201 local exchange splitting, 189–190 local moment fluctuations, 187–188 research background, 180 solid-solution alloys, 183–184 transition metal ground state, 181–183 susceptibility and permeability, 492–493 theory and principles, 491–492 thermogravimetric (TG) analysis, mass measurement errors and, 357–358 units, 492 Magnetization magnetic field effects and applications, 496 magnetic moment band theory of solids, 515–516 collective magnetism, 515 dipole moment coupling, 519–527 antiferromagnetism, 524 ferrimagnetism, 524–525 ferromagnetism, 522–524 Heisenberg model and exchange interactions, 525–526 helimagnetism, 526–527 paramagnetism, classical and quantum theories, 519–522 dipole moments, atomic origin, 512–515 closed/core shell diamagnetism, 512 ionic magnetism, local atomic
moment origins, 513–515 magnetic field quantities, 511–512 research background, 511 spin glass and cluster magnetism, 516–517 superconductors, 517–519 Magnetocrystalline anisotropy energy (MAE), metal alloy magnetism
calculation techniques, 192–193 Co-Pt alloys, 199–200 Magnetoelastic scattering, magnetic x-ray scattering errors, 935–936 Magnetometers/compound semiconductor sensors, Hall effect, 507 Magnetometry automation, 537 limitations, 538 basic principles, 531–533 calibration procedures, 536 data analysis and interpretation, 537–538 flux magnetometers, 533–534 force magnetometers, 534–535 high-pressure magnetometry, 535 limitations, 538–539 research background, 531 rotating-sample magnetometer, 535 sample contamination, 538 sample preparation, 536–537 surface magnetometry, 535–536 vibrating-coil magnetometer, 535 Magneto-optic imaging magnetic domain structure measurements, 547–549 surface magneto-optic Kerr effect (SMOKE) automation, 573 classical theory, 570–571 data analysis and interpretation, 573–574 limitations, 575 multilayer formalism, 571 phenomenological origins, 570 protocols and procedures, 571–573 quantum theory, ferromagnetism, 571 research background, 569–570 sample preparation, 575 Magneto-optic Kerr effect (MOKE) surface magneto-optic Kerr effect (SMOKE) evolution, 569–570 superlattice structures, 574 x-ray magnetic circular dichroism (XMCD), 953–955 Magnetoresistive sensors, magnetic field measurements, 508–509 Magnetotransport properties, metal alloys automated procedures, 565–566 data analysis and interpretation, 566 Hall resistance, 560 limitations, 567–568 magnetic field behaviors, 563–565 measured vs.
intrinsic quantities, 559–560 research background, 559 sample preparation, 566–567 thermal conductance, 560 thermopower, 560 transport equations, 560–561 zero magnetic field behaviors, 561–563 Majority carrier currents, illuminated semiconductor-liquid interface, J-E equations, 608 Many-body interaction potentials molecular dynamics (MD) simulation, surface phenomena, 158–159 transition metal magnetic ground state, itinerant magnetism at zero temperature, 181–183 Mapping techniques, Auger electron spectroscopy (AES), 1164–1165 Mass, definitions, 24–26 Mass loss, thermogravimetric (TG) analysis, 345 sample preparation, 357 Mass measurements basic principles, 24 combustion calorimetry, 378–379 cyclotron resonance (CR), 806
electrochemical quartz crystal microbalance (EQCM), 653 gas analysis, simultaneous thermogravimetry (TG)-differential thermal analysis (TG-DTA), 394 heavy-ion backscattering spectrometry (HIBS), 1281 indirect techniques, 24 mass, weight, and density definitions, 24–26 mass measurement process assurance, 28–29 materials characterization, 1 thermogravimetric (TG) analysis, 345–346 error detection, 357–358 weighing devices, balances, 27–28 weight standards, 26–27 Mass resolution heavy-ion backscattering spectrometry (HIBS), 1278–1279 ion beam analysis (IBA), 1178 ion-beam analysis (IBA), ERD/RBS techniques, 1191–1197 Mass spectrometers evolved gas analysis (EGA), thermal degradation studies, 395–396 leak detection, vacuum systems, 21–22 pressure measurements, 16–17 Mass transfer, thermogravimetric (TG) analysis, 350–352 Material errors, hardness testing, 322 Materials characterization common concepts, 1 particle scattering applications, 61 ultraviolet/visible absorption (UV-VIS) spectroscopy competitive and related techniques, 690 interpretation, 694 measurable properties, 689 procedures and protocols, 691–692 Matrix corrections, energy-dispersive spectrometry (EDS), 1145–1147 Matrix factor, x-ray photoelectron spectroscopy (XPS), elemental composition analysis, 985–986 Matrix representations, vibrational Raman spectroscopy, symmetry operators, 717–720 Matrix techniques, liquid surface x-ray diffraction, reflectivity measurements, 1031–1033 Matthiessen’s rule, magnetotransport in metal alloys, 562–563 Maximum entropy method (MEM) neutron powder diffraction, structure factor relationships, 1298 nutation nuclear resonance spectroscopy, 784 rotating frame NQR imaging, 787–788 scanning transmission electron microscopy (STEM), data interpretation, 1105–1106 Maximum measurement current, superconductors, electrical transport measurements, 479 Maximum-sensitivity regime, photoconductivity (PC), carrier lifetime measurement, 445 Maxwell’s equation dynamical diffraction, 228
microwave measurement techniques, 408–409 Mean-field approximation (MFA) diffuse intensities, metal alloys atomic short-range ordering (ASRO) principles, errors, 257 basic principles, 254 concentration waves, 262 metal alloy magnetism, first-principles calculations, 188–189 phase diagram predictions, 92–96 aluminum-nickel alloys, 107
cluster variational method (CVM), 99–100 Mean-field theory collective magnetism, 515 ferrimagnetism, 525 ferromagnetism, 522–524 Mean-square atomic displacement, surface phenomena, molecular dynamics (MD) simulation, high-temperature structure and dynamics, 162–163 Mean-square vibrational amplitudes, surface phenomena, molecular dynamics (MD) simulation, room temperature structure and dynamics, 160–161 Measured quantities, magnetotransport in metal alloys, intrinsic quantities vs., 559–560 Measurement errors energy-dispersive spectrometry (EDS), 1136 protocols, 1156–1157 hardness testing, 322 Measurement frequency, electrochemical photocapacitance spectroscopy (EPS), semiconductor materials, 625 Measurement time, semiconductor materials, Hall effect, 416–417 Mechanical abrasion, metallographic analysis 7075-T6 anodized aluminum alloy, 68 deformed high-purity aluminum, 69 sample preparation, 66 Mechanical polishing ellipsometric measurements, 741 metallographic analysis 7075-T6 anodized aluminum alloy, 68 deformed high-purity aluminum, 69 sample preparation, 66 4340 steel, 67 Mechanical testing. See specific testing protocols, e.g. 
Tension testing data analysis and interpretation, 287–288 elastic deformation, 281 method automation, 286–287 nonuniform plastic deformation, 282 research background, 279 sample preparation, 287 stress/strain analysis, 280–282 curve form variables, 282–283 definitions, 281–282 material and microstructure, 283 temperature and strain rate, 283 yield-point phenomena, 283–284 tension testing basic principles, 279–280 basic tensile test, 284–285 elastic properties, 279 environmental testing, 286 extreme temperatures and controlled environments, 285–286 high-temperature testing, 286 low-temperature testing, 286 plastic properties, 279–280 specimen geometry, 285 testing machine characteristics, 285 tribological testing acceleration, 333 automation of, 333–334 basic principles, 324–326 control factors, 332–333 data analysis and interpretation, 334 equipment and measurement techniques, 326–327 friction coefficients, 326 friction testing, 328–329 general procedures, 326 limitations, 335 research background, 324 results analysis, 333 sample preparation, 335
Mechanical testing. See specific testing protocols, e.g. Tension testing (Continued) test categories, 327–328 wear coefficients, 326–327 wear testing, 329–332 uniform plastic deformation, 282 Mechanical tolerances, tribological testing, 332–333 MEDIC algorithm, neutron powder diffraction, structure factor relationships, 1298 Medium-boundary/medium-propagation matrices, surface magneto-optic Kerr effect (SMOKE), 576–577 Medium-energy backscattering applications, 1268–1269 automation, 1269 backscattering data, 1269–1270 basic principles, 1261–1265 complementary and alternative techniques, 1261 data analysis and interpretation, 1269–1271 forward-recoil data, 1270–1271 instrumentation criteria, 1267–1268 limitations, 1271–1272 research background, 1258–1259, 1259–1261 resolution, 1267 safety issues, 1269 sensitivity parameters, 1266–1267 spectrometer efficiency, 1265–1266 time-of-flight spectrometry (TOF), 1265 Medium-energy ion scattering (MEIS), ion beam analysis (IBA), 1175–1176 Mega-electron-volt deuterons, high-energy ion beam analysis (IBA), 1176–1177 Meissner effect diamagnetism, 494 superconductor magnetization, 517–519 Mercury-filled thermometer, operating principles, 35 Mercury probe contacts, capacitance-voltage (C-V) characterization, 461–462 Mesoscopic Monte Carlo method, microstructural evolution, 114–115 Mesoscopic properties, neutron powder diffraction, 1293–1296 microstrain broadening, 1294–1295 particle size effect, 1293–1294 stacking faults, 1295–1296 Metal alloys. See also Transition metals; specific alloys bonding limits and pitfalls, 138–144 accuracy limits, 139–141 all-electron vs. pseudopotential methods, 141–142 basis sets, 142–143 first principles vs.
tight binding, 141 full potentials, 143 precision issues, 141–144 self-consistency, 141 structural relaxations, 143–144 phase formation, 135–138 bonding-antibonding effects, 136–137 charge transfer and electronegativities, 137–138 Friedel’s d-band energetics, 135 size effects, 137 topologically close-packed phases, 135–136 transition metal crystal structures, 135 wave function character, 138 research background, 134–135 diffuse intensities atomic short-range ordering (ASRO) principles, 256 concentration waves density-functional approach, 260–262
first-principles, electronic-structure calculations, 263–266 multicomponent alloys, 257–260 mean-field results, 262 mean-field theory, improvement on, 262–267 pair-correlation functions, 256–257 sum rules and mean-field errors, 257 competitive and related techniques, 254–255 computational principles, 252–254 data analysis and interpretation, 266–273 Cu2NiZn ordering wave polarization, 270–271 CuPt van Hove singularities, electronic topological transitions, 271–273 magnetic coupling and chemical order, 268 multicomponent alloys, 268–270 Ni-Pt hybridization and charge correlation, 266–268 temperature-dependent shift, ASRO peaks, 273 high-temperature experiments, effective interactions, 255–256 liquid surface x-ray diffraction, liquid metals, 1043 magnetism anisotropy, 191–193 MAE calculations, 192 pure Fe, Ni, and Co, 193 approximation techniques, 185–186 atomic short range order, 190–191 atoms-to-solids transition, 180–181 data analysis and interpretation, 193–200 ASRO in FeV, 196–198 ASRO in gold-rich AuFe alloys, 198–199 atomic long- and short-range order, NiFe alloys, 193 magnetic moments and bonding in FeV alloys, 196 magnetocrystalline anisotropy, Co-Pt alloys, 199–200 moment alignment vs. formation in fcc Fe, 193–195 negative thermal expansion effects, 195–196 electronic structure and Slater-Pauling curves, 184–185 finite temperature paramagnetism, 186–187 first-principles theories, 188–189 paramagnetic Fe, Ni, and Co, 189 limitations of analysis, 200–201 local exchange splitting, 189–190 local moment fluctuations, 187–188 research background, 180 solid-solution alloys, 183–184 transition metal ground state, 181–183 magnetotransport properties automated procedures, 565–566 data analysis and interpretation, 566 Hall resistance, 560 limitations, 567–568 magnetic field behaviors, 563–565 measured vs. 
intrinsic quantities, 559–560 research background, 559 sample preparation, 566–567 thermal conductance, 560 thermopower, 560 transport equations, 560–561 zero magnetic field behaviors, 561–563 photoluminescence (PL) spectroscopy, broadening, 682 scanning tunneling microscopy (STM) analysis, 1115 surface phenomena, molecular dynamics (MD) simulation, metal surface phonons, 161–162
transmission electron microscopy (TEM), specimen modification, 1088 ultraviolet photoelectron spectroscopy (UPS), sample preparation, 732–733 Metallographic analysis, sample preparation basic principles, 63–64 cadmium plating composition and thickness, 4340 steel, 68 etching procedures, 67 mechanical abrasion, 66 microstructural evaluation 7075-T6 anodized aluminum, 68–69 deformed high-purity aluminum, 69 4340 steel sample, 67–68 mounting protocols and procedures, 66 polishing procedures, 66–67 sectioning protocols and procedures, 65–66 strategic planning, 64–65 Metal organic chemical vapor deposition (MOCVD), pn junction characterization, 469–470 Metal-semiconductor (MS) junctions, capacitance-voltage (C-V) characterization, 457–458 Metamagnetic state, Landau magnetic phase transition, 530 Microbalance measurement, thermogravimetric (TG) analysis, error detection, 358–359 Microbeam applications trace element accelerator mass spectrometry (TEAMS), ultraclean ion sources, 1240 x-ray microprobes strain distribution ferroelectric sample, 948–949 tensile loading, 948 trace element distribution, SiC nuclear fuel barrier, 947–948 Microchannel plates (MCP) carrier lifetime measurement, photoluminescence (PL), 451–452 heavy-ion backscattering spectrometry (HIBS) sensitivity and mass resolution, 1278–1279 time-of-flight spectrometry (TOS), 1278 semiconductor-liquid interfaces, transient decay dynamics, 621–622 Microfabrication of specimens, scanning electrochemical microscopy (SECM), 644–645 Microindentation hardness testing applications, 319 case hardening, 318 Micro-particle-induced x-ray emission (MicroPIXE) applications, 1215–1216 basic principles, 1212 competitive methods, 1210–1211 research background, 1210 Microscopic field model (MFM), microstructural evolution, 115–117 Microscopic master equations (MEs), microstructural evolution, 116–117 Microstrain broadening, neutron powder diffraction, 1294–1296 Microstructural evaluation fracture toughness testing, crack 
extension measurement, 311 metallographic analysis aluminum alloys, 7075-T6, 68–69 deformed high-purity aluminum, 69 steel samples, 4340 steel, 67–68 stress-strain analysis, 283 Microstructural evolution continuum field method (CFM) applications, 122 atomistic simulation, continuum process modeling and property calculation, 128–129 basic principles, 117–118 bulk chemical free energy, 119–120
coarse-grained approximation, 118–119 coarse-grained free energy formulation, 119 coherent ordered precipitates, 122–126 diffuse-interface nature, 119 diffusion-controlled grain growth, 126–128 elastic energy, 120–121 external field energies, 121 field kinetic equations, 121–122 future applications, 128 interfacial energy, 120 limits of, 130 numerical algorithm efficiency, 129–130 research background, 114–115 slow variable selection, 119 theoretical basis, 118 field simulation, 113–117 atomistic Monte Carlo method, 117 cellular automata (CA) method, 114–115 continuum field method (CFM), 114–115 conventional front-tracking, 113–114 inhomogeneous path probability method (PPM), 116–117 mesoscopic Monte Carlo method, 114–115 microscopic field model, 115–116 microscopic master equations, 116–117 molecular dynamics (MD), 117 morphological patterns, 112–113 Microwave measurement techniques basic principles, 408–409 carrier lifetime measurement, photoconductivity (PC) techniques, 447–449 electron paramagnetic resonance (EPR), continuous-wave (CW) experiments, 796 protocols and procedures, 409–410 semiconductor-liquid interfaces, time-resolved microwave conductivity, 622–623 Microwave PC-decay technique, carrier lifetime measurement, optical techniques, 434 Microwave reflectance power, carrier lifetime measurement, photoconductivity (PC) techniques, 447–449 Micro-x-ray absorption fine structure (XAFS) spectroscopy, fluorescence analysis, 943 Miller-Bravais indices, lattice systems, 46 Miller indices lattice systems, 45–46 magnetic neutron scattering, polarized beam technique, 1330–1331 phonon analysis, 1320 surface x-ray diffraction, 1008 x-ray powder diffraction, 838 single-crystal x-ray structure determination, 853 Minimum-detectable limits (MDLs), fluorescence analysis, 940–941 Minority carrier injection carrier lifetime measurement, 428–429 semiconductor-liquid interface illuminated J-E equations, 608 laser spot scanning (LSS), 626–628 Mirror planes group 
theoretical analysis, vibrational Raman spectroscopy, 703–704 single-crystal x-ray structure determination, crystal symmetry, 855–856 Miscibility gap, phase diagram predictions, mean-field approximation, 94–96 Mixed-potential theory, corrosion quantification, Tafel technique, 593–596 Mobility measurements binary/multicomponent diffusion Fick’s law, 147 substitutional and interstitial metallic systems, 152–153 substitutional and interstitial metallic systems, tracer diffusion, 154–155
Modulated differential scanning calorimetry (MDSC), basic principles, 365–366 Modulated free carrier absorption, carrier lifetime measurement, 440 Modulation amplitude, electron paramagnetic resonance (EPR), continuous-wave (CW) experiments, 796–797 Modulation-type measurements, carrier lifetimes, 435–438 data interpretation issues, 437 limitations of, 437 Molecular beam epitaxy (MBE) low-energy electron diffraction (LEED), 1121 surface magneto-optic Kerr effect (SMOKE), superlattice structures, 573–574 Molecular dissociation microstructural evolution, 117 sputter-ion pump, 11 Molecular drag pump, application and operation, 5 Molecular dynamics (MD) simulation electronic structure analysis, phase diagram prediction, 101–102 phase diagram prediction, static displacive interactions, 105–106 surface phenomena data analysis and interpretation, 160–164 higher-temperature dynamics, 161–162 interlayer relaxation, 161 layer density and temperature variation, 164 mean-square atomic displacement, 162–163 metal surface phonons, 161–162 room temperature structure and dynamics, 160–161 thermal expansion, 163–164 limitations, 164 principles and practices, 158–159 related theoretical techniques, 158 research background, 156 surface behavior, 156–157 temperature effects on surface behavior, 157 Molecular flow, vacuum system design, 19 Molecular orbitals, ultraviolet photoelectron spectroscopy (UPS), 727–728 Molecular point group, group theoretical analysis, vibrational Raman spectroscopy, 703–704 Molière approximation, particle scattering, central-field theory, deflection function, 60 Moment formation metal alloy magnetism, 180 FeV alloys, 196 local moment fluctuation, 187–188 nickel-iron alloys, atomic long and short range order (ASRO), 195–196 metal alloy paramagnetism, finite temperatures, 187 Momentum matching, low-energy electron diffraction (LEED), sample preparation, 1131–1132 Momentum space representation, Mössbauer effect, 819–820 Monochromatic illumination, 
semiconductor-liquid interface, 610 Monochromators. See Spectrometers/monochromators Monomolecular layers, x-ray diffraction basic principles, 1028–1036 competitive and related techniques, 1028 data analysis and interpretation, 1039–1043 Langmuir monolayers, 1041–1043 liquid alkane crystallization, 1043 liquid metals, 1043 simple liquids, 1040–1041
non-specular scattering GID, diffuse scattering, and rod scans, 1038–1039 reflectivity measurements, 1033 p-polarized x-ray beam configuration, 1047 reflectivity, 1029–1036 Born approximation, 1033–1034 distorted-wave Born approximation, 1034–1036 Fresnel reflectivity, 1029–1031 grazing incidence diffraction and rod scans, 1036 instrumentation, 1036–1038 multiple stepwise and continuous interfaces, 1031–1033 non-specular scattering, 1033 research background, 1027–1028 specimen modification, 1043–1045 Monotonic heating, thermal diffusivity, laser flash technique, 386–387 Monte Carlo simulation chemical vapor deposition (CVD) model free molecular transport, 170 radiation, 173 diffuse intensities, metal alloys concentration waves, multicomponent alloys, 259–260 effective cluster interactions (ECIs), 255–256 energy-dispersive spectrometry (EDS), standardless analysis, backscatter loss correction, 1149 scanning electron microscopy (SEM), signal generation, 1050–1052 Mössbauer effect, basic properties, 818–820 Mössbauer spectroscopy basic properties, 761–762 bcc iron alloy solutes, 828–830 coherence and diffraction, 824–825 crystal defects and small particles, 830–831 data analysis and interpretation, 831–832 diffuse scattering techniques, comparisons, 883–884 electric quadrupole splitting, 822–823 hyperfine interactions, 820–821 magnetic field splitting, 823–824 isomer shift, 821–822 magnetic neutron scattering and, 1321 Mössbauer effect, 818–820 nuclear excitation, 817–818 phase analysis, 827–828 phonons, 824 radioisotope sources, 825–826 recoil-free fraction, 821 relaxation phenomena, 824 research background, 816–817 sample preparation, 832 single-crystal x-ray structure determination, 851 synchrotron sources, 826–827 valence and spin determination, 827 Mott detector, ultraviolet photoelectron spectroscopy (UPS), 729 Mott-Schottky plots electrochemical photocapacitance spectroscopy (EPS), surface capacitance measurements, 625 semiconductor-liquid interface 
differential capacitance measurements, 617–619 flat-band potential measurements, 628–629 Mounting procedures, metallographic analysis 7075-T6 anodized aluminum alloy, 68 deformed high-purity aluminum, 69 sample preparation, 66 4340 steel, 67
Muffin-tin approximation linear muffin tin orbital method (LMTO), electronic topological transitions, van Hove singularities in CuPt, 272–273 metal alloy bonding, precision calculations, 143 Multichannel analyzer (MCA) energy-dispersive spectrometry (EDS), 1137–1140 heavy-ion backscattering spectrometry (HIBS), 1279 ion-beam analysis (IBA), 1182–1184 ERD/RBS examples, 1191–1197 surface x-ray diffraction basic principles, 1010 crystallographic alignment, 1014–1015 Multicomponent alloys concentration waves, 257–260 diffuse intensities, Fermi-surface nesting, and van Hove singularities, 268–270 Multilayer structures dynamical diffraction, applications, 225 grazing-incidence diffraction (GID), 242–243 surface magneto-optic Kerr effect (SMOKE), 571 Multiphonon mechanism, carrier lifetime measurement, 429 Multiple-beam diffraction, 236–241 basic principles, 236–237 literature sources, 226 NBEAM theory, 237–238 boundary conditions, 238 D-field component eigenequation, 237 eigenequation matrix, 237–238 intensity computations, 238 numerical solution strategy, 238 phase information, 240 polarization density matrix, 241 polarization mixing, 240 second-order Born approximation, 238–240 standing waves, 240 three-beam interactions, 240 Multiple linear least squares (MLLS) method, energy-dispersive spectrometry (EDS), peak overlap deconvolution, 1144–1146 Multiple scattering magnetic x-ray scattering error detection, 935 nonresonant scattering, 929–930 x-ray absorption fine structure (XAFS) spectroscopy, 873–874 Multiple-specimen measurement technique, fracture toughness testing, crack extension measurement, 308 Multiple stepwise interfaces, liquid surface x-ray diffraction, reflectivity measurements, 1031–1033 Multiplet structures metal alloy bonding, accuracy calculations, 140–141 x-ray photoelectron spectroscopy (XPS), final-state effects, 976 Multiple-wavelength anomalous diffraction (MAD), resonant scattering, 905 Multipole contributions, resonant scattering analysis, 909–910 
Muon spin resonance, basic principles, 762–763 Nanotechnology, surface phenomena, molecular dynamics (MD) simulation, 156–157 Narrow-band thermometer, operating principles, 37 National Institute of Standards and Technology (NIST), weight standards, 26–27 Natural oxide films, electrochemical dissolution, ellipsometric measurement, 742
Naval Research Laboratory, trace element accelerator mass spectrometry (TEAMS) research at, 1245–1246 NBEAM theory, multiple-beam diffraction, 237–238 boundary conditions, 238 D-field component eigenequation, 237 eigenequation matrix, 237–238 intensity computations, 238 numerical solution strategy, 238 Near-band-gap emission, carrier lifetime measurement, photoluminescence (PL), 450 Near-edge x-ray absorption fine structure (NEXAFS) spectroscopy. See also X-ray absorption near-edge structure (XANES) spectroscopy micro-XAFS, 943 ultraviolet photoelectron spectroscopy (UPS) and, 726 Near-field regime, scanning tunneling microscopy (STM), 1112–1113 Near-field scanning optical microscopy (NSOM) magnetic domain structure measurements, 548–549 pn junction characterization, 467 Near-surface effects magnetic x-ray scattering errors, 936 nuclear reaction analysis (NRA) and proton-induced gamma ray emission (PIGE) and, 1202 Necking phenomenon nonuniform plastic deformation, 282 stress-strain analysis, 280–281 Néel temperature, antiferromagnetism, 524 Nernst equation cyclic voltammetry quasireversible reaction, 583 totally reversible reaction, 582–583 semiconductor-liquid interface, photoelectrochemistry, thermodynamics, 605–606 Nernst-Ettingshausen effect, magnetotransport in metal alloys, transport equations, 561 Net magnetization, nuclear magnetic resonance, 765 Neutron activation-accelerator mass spectrometry, trace element accelerator mass spectrometry (TEAMS) and, 1237 Neutron activation analysis (NAA) nuclear reaction analysis (NRA) and proton-induced gamma ray emission (PIGE) and, 1201–1202 trace element accelerator mass spectrometry (TEAMS) and, 1237 Neutron diffuse scattering applications, 889–894 automation, 897–898 bond distances, 885 chemical order, 884–885 comparisons, 884 competitive and related techniques, 883–884 crystalline solid solutions, 885–889 data analysis and interpretation, 894–896 diffuse x-ray scattering techniques, 889–890 inelastic scattering 
background removal, 890–893 limitations, 898–899 magnetic diffuse scattering, 904–905 magnetic x-ray scattering, comparisons, 918–919 measured intensity calibration, 894 protocols and procedures, 884–889 recovered static displacements, 896–897 research background, 882–884 resonant scattering terms, 893–894 sample preparation, 898
Neutron powder diffraction angle-dispersive constant-wavelength diffractometer, 1289–1290 constant wavelength diffractometer wavelength, 1304–1305 constraints, 1301 crystallographic refinements, 1300–1301 data analysis, 1291–1300 ab initio structure determination, 1297–1298 axial divergence and peak asymmetry, 1292–1293 Bragg reflection positions, 1291–1292 indexing techniques, 1297–1298 particle structure determination, 1297 peak shape, 1292 quantitative phase analysis, 1296–1297 structure factor extraction, 1298 structure solution, 1298 estimated standard deviations, 1305–1306 limitations, 1300–1302 mesoscopic properties, 1293–1296 microstrain broadening, 1294–1295 particle size effect, 1293–1294 stacking faults, 1295–1296 probe characteristics, 1286–1289 reliability factors, 1305 research background, 1285–1286 restraints, 1301–1302 Rietveld analysis protocols, 1296 software programs, 1306–1307 sample preparation, 1291 single-crystal neutron diffraction and, 1307–1309 single-crystal x-ray structure determination, 850–851 time-dependent neutron powder diffraction, 1298–1300 time-of-flight diffractometers, 1290–1291 x-ray diffraction vs., 836 Neutron sources, single-crystal neutron diffraction, 1312–1313 Neutron techniques magnetic neutron scattering automation, 1336 data analysis, 1337 diffraction applications, 1332–1335 inelastic scattering basic principles, 1331 protocols for, 1335–1336 limitations, 1337–1338 magnetic diffraction, 1329–1330 polarized beam technique, 1330–1331 research background, 1328–1329 sample preparation, 1336–1337 subtraction techniques, 1330 neutron powder diffraction angle-dispersive constant-wavelength diffractometer, 1289–1290 constant wavelength diffractometer wavelength, 1304–1305 constraints, 1301 crystallographic refinements, 1300–1301 data analysis, 1291–1300 ab initio structure determination, 1297–1298 axial divergence and peak asymmetry, 1292–1293 Bragg reflection positions, 1291–1292 indexing techniques, 1297–1298 particle 
structure determination, 1297 peak shape, 1292 quantitative phase analysis, 1296–1297 structure factor extraction, 1298 structure solution, 1298 estimated standard deviations, 1305–1306
limitations, 1300–1302 mesoscopic properties, 1293–1296 microstrain broadening, 1294–1295 particle size effect, 1293–1294 stacking faults, 1295–1296 probe characteristics, 1286–1289 reliability factors, 1305 research background, 1285–1286 restraints, 1301–1302 Rietveld analysis protocols, 1296 software programs, 1306–1307 sample preparation, 1291 time-dependent neutron powder diffraction, 1298–1300 time-of-flight diffractometers, 1290–1291 phonon studies automation, 1323 basic principles, 1317–1318 data analysis, 1324–1325 instrumentation and applications, 1318–1323 triple-axis spectrometry, 1320–1323 limitations, 1326–1327 research background, 1316–1317 sample preparation, 1325–1326 specimen modification, 1326 research background, 1285 single-crystal neutron diffraction applications, 1311–1312 data analysis and interpretation, 1313–1314 neutron sources and instrumentation, 1312–1313 research background, 1307–1309 sample preparation, 1314–1315 theoretical background, 1309–1311 Newton-Raphson algorithm, diffuse intensities, metal alloys, mean-field theory and, 263 Newton’s equations, molecular dynamics (MD) simulation, surface phenomena, 159 Newton’s law of cooling, combustion calorimetry, 379 Nickel-iron alloys, atomic short range order (ASRO), energetics and electronic origins, 193–196 Nickel-platinum alloys diffuse intensities, hybridization in NiPt alloys, charge correlation effects, 266–268 magnetism first-principles calculations, 189 paramagnetism, 189 magnetocrystalline anisotropy energy (MAE), 193 phase diagram predictions, cluster variational method (CVM), 107–108 Nonconfigurational thermal effects, phase diagram prediction, 106–107 Nonconsumable abrasive wheel cutting, metallographic analysis, sample preparation, 65 Non-contact measurements basic principles, 407 protocols and procedures, 407–408 Nonlinearity, fracture toughness testing load-displacement curves, 303–304 stable 
crack mechanics, 309–311 Nonlinear least-squares analysis, thermal diffusivity, laser flash technique, 389 Nonlinear optical effects, cyclotron resonance (CR), far-infrared (FIR) radiation sources, 809 Nonohmic contacts, Hall effect, semiconductor materials, 417 Nonrelativistic collisions, particle scattering conversions, 62–63
fundamental and recoiling relations, 62 Nonresonant depth-profiling, nuclear reaction analysis (NRA) and proton-induced gamma ray emission (PIGE) and basic principles, 1202 data analysis, 1206 Nonresonant magnetic x-ray scattering antiferromagnets, 928–930 ferromagnets, 930 research background, 918–919 theoretical concepts, 920–921 Nonreversible charge-transfer reactions, cyclic voltammetry, 583–584 Non-Rutherford cross-sections, ion-beam analysis (IBA) ERD/RBS techniques, 1191–1197 error detection, 1189–1190 Non-specular scattering, liquid surface x-ray diffraction GID, diffuse scattering, and rod scans, 1038–1039 measurement protocols, 1033–1036 Nonuniform detection efficiency, energy-dispersive spectrometry (EDS), 1139–1140 Nonuniform laser beam, thermal diffusivity, laser flash technique, 390 Nonuniform plastic deformation, stress-strain analysis, 282 Nordsieck’s algorithm, molecular dynamics (MD) simulation, surface phenomena, 159 Normalization procedures, x-ray absorption fine structure (XAFS) spectroscopy, 878–879 Notch filter microbeam analysis, 948 Raman spectroscopy of solids, optics properties, 705–706 Notch-toughness testing research background, 302 sample preparation, 312 n-type metal-oxide semiconductor (NMOS), ion-beam-induced charge (IBIC)/single event upset (SEU) microscopy, 1229–1230 Nuclear couplings, nuclear quadrupole resonance (NQR), 776–777 Nuclear excitation, Mössbauer spectroscopy, 817–818 Nuclear magnetic resonance (NMR) applications, 767–771 back-projection imaging sequence, 768 basic principles, 762–763 chemical shift imaging sequence, 770 data analysis and interpretation, 771 diffusion imaging sequence, 769 echo-planar imaging sequence, 768–769 flow imaging sequence, 769–770 gradient recalled echo sequence, 768 instrumentation setup and tuning, 770–771 limitations, 772 magnetic field effects, 496 measurement principles, 509–510 magnetic neutron scattering and, 1321 sample preparation, 772 single-crystal neutron diffraction and, 1309 
solid-state imaging, 770 spin-echo sequence, 768 superconducting magnets, 500–502 theoretical background, 763–765 three-dimensional imaging, 770 Nuclear moments, nuclear quadrupole resonance (NQR), 775–776 Nuclear quadrupole resonance (NQR) data analysis and interpretation, 789 direct detection techniques, spectrometer criteria, 780–781 dynamics limitations, 790
first-order Zeeman perturbation and line shapes, 779 heterogeneous broadening, 790 indirect detection, field cycling methods, 781 nuclear couplings, 776–777 nuclear magnetic resonance (NMR), comparisons, 761–762 nuclear moments, 775–776 one-dimensional Fourier transform NQR, 781–782 research background, 775 sensitivity problems, 789 spatially resolved NQR, 785–789 field cycling methods, 788–789 magnetic field gradient method, 785–786 rotating frame NQR imaging, 786–788 temperature, stress, and pressure imaging, 788 spin relaxation, 779–780 spurious signals, 789–790 Sternheimer effect and electron deformation densities, 777 two-dimensional zero-field NQR, 782–785 exchange NQR spectroscopy, 785 level-crossing double resonance NQR nutation spectroscopy, 785 nutation NRS, 783–784 Zeeman-perturbed NRS (ZNRS), 782–783 zero field energy levels, 777–779 higher spin nuclei, 779 spin 1 levels, 777–778 spin 3/2 levels, 778 spin 5/2 levels, 778–779 Nuclear reaction analysis (NRA) automation, 1205–1206 background, interference, and sensitivity, 1205 cross-sections and Q values, 1205 data analysis, 1206–1207 energy relations, 1209 energy scanning, resonance depth profiling, 1204–1205 energy spread, 1209–1210 high-energy ion beam analysis (IBA) and, 1176–1177 instrumentation criteria, 1203–1204 limitations, 1207–1208 nonresonant methods, 1202–1203 particle scattering, kinematics, 56–57 research background, 1178–1179, 1200–1202 resonant depth profiling, 1203 specimen modification, 1207 standards, 1205 unwanted particle filtering, 1204 Null ellipsometry automation, 739 defined, 735 schematic, 737–738 Null Laue method, diffuse scattering techniques, 887–889 data interpretation, 895–897 Number-fixed frame of reference, binary/multicomponent diffusion, 147–148 Numerical algorithms microstructural evolution, 129–130 multiple-beam diffraction, NBEAM theory, 238 x-ray photoelectron spectroscopy (XPS), composition analysis, 994–996 Numerical aperture, optical microscopy, 669 Nutation 
nuclear resonance spectroscopy level-crossing double resonance NQR nutation spectroscopy, 785 zero-field nuclear quadrupole resonance, 783–784 n value semiconductor-liquid interface, current density-potential properties, 607–608
n value (Continued) superconductors-to-normal (S/N) transition, electrical transport measurement, 472 determination, 482 Nyquist representation corrosion quantification, electrochemical impedance spectroscopy (EIS), 599–603 semiconductor-liquid interfaces, differential capacitance measurements, 617–619 Object function retrieval, scanning transmission electron microscopy (STEM) data interpretation, 1105–1106 incoherent imaging, weakly scattering objects, 1111 Objective lenses, optical microscopy, 668 Oblique coordinate systems, diffuse intensities, metal alloys, concentration waves, multicomponent alloys, 259–260 Oblique-light illumination, reflected-light optical microscopy, 676–680 “Off-axis pinhole” device, surface x-ray diffraction, diffractometer alignment, 1020–1021 Off-diagonal disorder, diffuse intensities, metal alloys, concentration waves, first-principles calculations, electronic structure, 266 Offset, Hall effect sensors, 508 Ohmmeter design, bulk measurements, 403–404 Ohm’s law electrochemical quartz crystal microbalance (EQCM), impedance analysis, 655–657 superconductors, electrical transport measurement, 473–474 Oil composition, diffusion pumps, selection criteria, 6 Oil-free (dry) pumps classification, 4–5 diaphragm pumps, 4–5 molecular drag pump, 5 screw compressor, 5 scroll pumps, 5 sorption pumps, 5 Oil lubrication bearings, turbomolecular pumps, 7 Oil-sealed pumps applications, 3 foreline traps, 4 oil contamination, avoidance, 3–4 operating principles, 3 technological principles, 3–4 One-body contact, tribological and wear testing, 324–325 One-dimensional Fourier transform NQR, implementation, 781–782 One-electron model, x-ray magnetic circular dichroism (XMCD), 956 dichroism principles and notation, 968–969 One-step photoemission model, ultraviolet photoelectron spectroscopy (UPS), 726–727 One-wave analysis, high-strain-rate testing data analysis and interpretation, 296–298 Hopkinson bar technique, 291–292 Onsager cavity-field corrections, 
diffuse intensities, metal alloys, mean-field theory and, 262–263 Open circuit voltage decay (OCVD) technique, carrier lifetime measurement, 435 Optical beam induced current (OBIC) technique, carrier lifetime measurement, diffusion-length-based methods, 434–435 Optical conductivity, ellipsometry, 737 Optical constants, ellipsometry, 737 Optical imaging and spectroscopy electromagnetic spectrum, 666 ellipsometry automation, 739
intensity-measuring ellipsometer, 739 null ellipsometers, 739 data analysis and interpretation, 739–741 optical properties from phase and amplitude changes, 740–741 polarizer/analyzer phase and amplitude changes, 739–740 dielectric constants, 737 limitations, 742 optical conductivity, 737 optical constants, 737 protocols and procedures, 737–739 alignment, 738–739 compensator, 738 light source, 738 polarizer/analyzer, 738 reflecting surfaces, 735–737 reflectivity, 737 research background, 735 sample preparation, 741–742 impulsive stimulated thermal scattering (ISTS) applications, 749–752 automation, 753 competitive and related techniques, 744–746 data analysis and interpretation, 753–757 limitations, 757–758 procedures and protocols, 746–749 research background, 744–746 sample preparation and specimen modification, 757 optical microscopy adjustment protocols, 671–674 basic components, 668–671 research background, 667–668 photoluminescence (PL) spectroscopy alloy broadening, 682 automation, 686 band-to-band recombination, 684 bound excitons, 682 defect-level transitions, 683–684 donor-acceptor and free-to-bound transitions, 683 excitons and exciton-polaritons, 682 experimental protocols, 684–685 limitations, 686 low-temperature PL spectra band interpretation, 686–687 crystal characterization, 685–686 phonon replicas, 682–683 research background, 681–682 room-temperature PL plating, 686 curve-fitting, 686 sample preparation, 686 specimen modification, 686 Raman spectroscopy of solids competitive and related techniques, 699 data analysis and interpretation, 709–713 active vibrational modes, aluminum compounds, 709–710 carbon structure, 712–713 crystal structures, 710–712 dispersed radiation measurement, 707–708 electromagnetic radiation, classical physics, 699–701 Fourier transform Raman spectroscopy, 708–709 group theoretical analysis, vibrational Raman spectroscopy, 702–704 character tables, 718 point groups and matrix representation, symmetry operations, 717–718 
vibrational modes of solids, 720–722 vibrational selection rules, 716–720 light scattering, semiclassical physics, 701–702 limitations, 713–714
optical alignment, 706 optics, 705–706 polarization, 708 quantitative analysis, 713 radiation sources, 704–705 research background, 698–699 spectrometer components, 706–707 theoretical principles, 699–700 research background, 665, 667 ultraviolet photoelectron spectroscopy (UPS) alignment procedures, 729–730 atoms and molecules, 727–728 automation, 731 competitive and related techniques, 725–726 data analysis and interpretation, 731–732 electronic phase transitions, 727 electron spectrometers, 729 energy band dispersion, 723–725 solid materials, 727 light sources, 728–729 limitations, 733 photoemission process, 726–727 photoemission vs. inverse photoemission, 722–723 physical relations, 730 sample preparation, 732–733 sensitivity limits, 730–731 surface states, 727 valence electron characterization, 723–724 ultraviolet/visible absorption (UV-VIS) spectroscopy applications, 692 array detector spectrometer, 693 automation, 693 common components, 692 competitive and related techniques, 690 dual-beam spectrometer, 692–693 limitations, 696 materials characterization, 691–692, 694–695 materials properties, 689 qualitative/quantitative analysis, 689–690 quantitative analysis, 690–691, 693–694 research background, 688–689 sample preparation, 693, 695 single-beam spectrometer, 692 specimen modification, 695–696 Optically-detected resonance (ODR) spectroscopy, cyclotron resonance (CR), 812–813 Optical microscopy adjustment protocols, 671–674 basic components, 668–671 reflected-light optical microscopy illumination modes and image-enhancement techniques, 676–680 limitations, 680–681 procedures and protocols, 675–680 research background, 674–675 sample preparation, 680 research background, 667–668 Optical parametric oscillation (OPO), cyclotron resonance (CR), far-infrared (FIR) radiation sources, 809 Optical properties ellipsometric measurement, 740–741 local density approximation (LDA), 83–84 scanning electron microscopy (SEM), 1054 quality control, 1059–1060 x-ray magnetic 
circular dichroism (XMCD), 959–962 x-ray microprobes, 945–947 Optical pyrometry ITS-90 standard, 33 operating principles, 37 Optical reciprocity theorem, grazing-incidence diffraction (GID), distorted-wave Born approximation (DWBA), 244–245
INDEX Optical techniques, carrier lifetime measurement, 434 Optics properties, Raman spectroscopy of solids, 705–706 Optimization protocols, energy-dispersive spectrometry (EDS), 1140–1141 Orbital angular momentum number, metal alloy bonding, wave function variation, 138 Orbital effects metal alloy bonding, accuracy calculations, 140–141 x-ray magnetic circular dichroism (XMCD), 953–955 Orientation matrices, surface x-ray diffraction crystallographic alignment, 1014–1015 protocols, 1026 O-ring seals configuration, 17–18 elastomer materials for, 17 rotational/translation motion feedthroughs, 18–19 Ornstein-Zernike equation diffuse intensities, metal alloys, concentration waves, density-functional theory (DFT), 261–262 metal alloy magnetism, atomic short range order (ASRO), 190–191 ORTEP program, single-crystal x-ray structure determination, 861–864 Outgassing Auger electron spectroscopy (AES), sample preparation, 1170 hot cathode ionization gauges, 15 vacuum system principles, 2–3 pumpdown procedures, 19 x-ray photoelectron spectroscopy (XPS), sample preparation, 998–999 Overcorrelation, diffuse intensities, metal alloys, mean-field results, 262 Overpotentials, semiconductor materials, J-E behavior, 611–612 series resistance, 612 Oxidative mechanisms, tribological and wear testing, 324–325 Oxides, scanning tunneling microscopy (STM) analysis, 1115 Oxygen purity, combustion calorimetry, 377–378 error detection and, 381–382 Pair correlations diffuse intensities, metal alloys atomic short-range ordering (ASRO) principles, 256–257 basic definitions, 252–254 diffuse scattering techniques, 888–889 Palladium tips, scanning tunneling microscopy (STM), 1114 Parabolic rate law, corrosion quantification, electrochemical impedance spectroscopy (EIS), 600–603 Parallel frequencies, electrochemical quartz crystal microbalance (EQCM), 660 Parallel geometry, carrier lifetime measurement, free carrier absorption (FCA), 439–440 Paramagnetism classical and quantum theories, 519–522
electron paramagnetic resonance (EPR), theoretical background, 793 metal alloys, finite temperatures, 186–187 nuclear magnetic resonance data analysis, 771–772 principles of, 511–512 spin glass materials, cluster magnetism of, 516–517 theory and principles, 493–494
Parratt formalism, liquid surface x-ray diffraction, reflectivity measurements, 1031 Partial differential equations (PDEs), front-tracking simulation, microstructural evolution, 113–114 Partial pressure analyzer (PPA), applications, 16 Particle accelerator ion beam analysis (IBA), 1176 nuclear reaction analysis (NRA)/proton-induced gamma ray emission (PIGE), 1203–1204 Particle detection, medium-energy backscattering, 1260–1261 Particle erosion, wear testing and, 329–330 Particle filtering, nuclear reaction analysis (NRA)/proton-induced gamma ray emission (PIGE), 1204 Particle-induced x-ray emission (PIXE) automation, 1216 basic principles, 1211–1212 competing methods, 1210–1211 data analysis, 1216–1218 limitations, 1220 protocols and techniques, 1213–1216 research background, 1179, 1210–1211 sample preparation, 1218–1219 specimen modification, 1219–1220 Particle scattering central-field theory, 57–61 cross-sections, 60–61 deflection functions, 58–61 approximation, 59–60 central potentials, 59 hard spheres, 58–59 impact parameter, 57–58 interaction potentials, 57 materials analysis, 61 shadow cones, 58 kinematics, 51–57 binary collisions, 51 center-of-mass and relative coordinates, 54–55 elastic scattering and recoiling, 51–52 inelastic scattering and recoiling, 52 nuclear reactions, 56–57 relativistic collisions, 55–56 scattering/recoiling diagrams, 52–54 nonrelativistic collisions conversions, 62–63 fundamental and recoiling relations, 62 research background, 51 Particle size distribution Mössbauer spectroscopy, 830–831 neutron powder diffraction, 1293–1294 small-angle scattering (SAS), 222 thermogravimetric (TG) analysis, 357 Particle structure determination, neutron powder diffraction, 1297 Passive film deposition, corrosion quantification, Tafel technique, 596 Patterson mapping neutron powder diffraction stacking faults, 1296 structure factor relationships, 1298 single-crystal x-ray structure determination, 858–859 heavy-atom computation, 868–869 x-ray powder
diffraction, candidate atom position search, 840–841 Pauli exclusion principle, metal alloy magnetism, 180 Pauli paramagnetism band theory of magnetism, 516 defined, 493–494 local moment origins, 513–515
Paul Scherrer Institute (PSI)/ETH Zurich Accelerator SIMS Laboratory, trace element accelerator mass spectrometry (TEAMS) research at, 1242–1244 PBE functionals, local density approximation (LDA), gradient corrections, 78 Peak current measurement, cyclic voltammetry, 587 Peak height measurements Auger electron spectroscopy (AES), 1169 ion-beam analysis (IBA), 1183–1184 Peak identification Auger electron spectroscopy (AES), 1162 ion excitation, 1171–1173 energy-dispersive spectrometry (EDS), deconvolution protocols, 1144–1145 x-ray photoelectron spectroscopy (XPS), 991–992 reference spectra, 996–998 PEAK integration software, surface x-ray diffraction, data analysis, 1024–1025 Peak position, x-ray photoelectron spectroscopy (XPS), 992 Peak shape, neutron powder diffraction, mesoscopic properties, 1293–1296 microstrain broadening, 1294–1295 particle size effect, 1293–1294 stacking faults, 1295–1296 Peak shapes neutron powder diffraction, 1292 axial divergence asymmetry, 1292–1293 instrumental contributions, 1292 x-ray photoelectron spectroscopy (XPS), classification, 1004–1007 Peltier heat, magnetotransport in metal alloys basic principles, 560 transport equations, 561 Pendellösung period, two-beam diffraction, 231 diffracted intensities, 234 Pendry’s R factor, low-energy electron diffraction (LEED), quantitative analysis, 1130–1131 Penetration depth, x-ray microfluorescence, 943–944 Penning ionization gauge (PIG), operating principles, 16 Periodic boundary conditions, molecular dynamics (MD) simulation, surface phenomena, 158–159 limitations from, 164 Periodic heat flow, thermal diffusivity, laser flash technique, 383–384 Periodic table, ion beam analysis (IBA), 1177–1178 Permanent magnets applications, 496 structures and properties, 497–499 Permittivity parameters microwave measurement techniques, 409 cavity resonators, 409–410 coaxial probes, 409 surface x-ray diffraction, defined, 1028–1029 Perpendicular geometry, carrier lifetime measurement, free carrier
absorption (FCA), 439–440 Phase-contrast imaging reflected-light optical microscopy, 679–680 scanning transmission electron microscopy (STEM) coherence analysis, 1093–1097 dynamical diffraction, 1101–1103 research background, 1092 Phase diagrams, predictions aluminum-lithium analysis, 108–110 aluminum-nickel analysis, 107 basic principles, 91 cluster approach, 91–92, 96–99 cluster expansion free energy, 99–101 electronic structure calculations, 101–102
Phase diagrams, predictions (Continued) ground-state analysis, 102–104 mean-field approach, 92–96 nickel-platinum analysis, 107–108 nonconfigurational thermal effects, 106–107 research background, 90–91 static displacive interactions, 104–106 Phase equilibrium, basic principles, 91–92 Phase field method. See also Continuum field method (CFM) Phase formation metal alloy bonding, 135–138 bonding-antibonding effects, 136–137 charge transfer and electronegativities, 137–138 Friedel’s d-band energetics, 135 size effects, 137 topologically close-packed phases, 135–136 transition metal crystal structures, 135 wave function character, 138 multiple-beam diffraction, 240 Phase retarders, x-ray magnetic circular dichroism (XMCD), 958–959 Phase shifts low-energy electron diffraction (LEED), 1125 Mössbauer spectroscopy, 827–828 Raman spectroscopy of solids, 711–712 scanning transmission electron microscopy (STEM), research background, 1092 x-ray absorption fine structure (XAFS) spectroscopy, single scattering picture, 871–872 Phenomenological models, phonon analysis, 1324–1325 PHOENICS-CVD, chemical vapor deposition (CVD) model, hydrodynamics, 174 Phonons magnetotransport in metal alloys, 562–563 Mössbauer spectroscopy, 824 neutron analysis automation, 1323 basic principles, 1317–1318 data analysis, 1324–1325 instrumentation and applications, 1318–1323 triple-axis spectrometry, 1320–1323 limitations, 1326–1327 research background, 1316–1317 sample preparation, 1325–1326 specimen modification, 1326 photoluminescence (PL) spectroscopy, replicas, 682–683 spectral densities, metal surfaces, molecular dynamics (MD) simulation, 161–162 thermal diffuse scattering (TDS), 212–214 Photoabsorption, x-ray microfluorescence basic principles, 942 cross-sections, 943 Photoconductivity (PC) carrier lifetime measurement, 444–450 basic principles, 444 data analysis, 446–447 high-frequency range, automation, 445–446 limitations, 449–450 microwave PC decay, 447–449 optical techniques, 434
radio frequency PC decay, 449 research background, 428 sample preparation, 447 standard PC decay method, 444–445 steady-state methods, 435–438 cyclotron resonance (CR), 812 Photocurrent density-potential behavior, semiconductor-liquid junctions, 609 Photocurrent/photovoltage measurements, semiconductor materials, 605–613 charge transfer at equilibrium, 606–607
electrochemical cell design, 609–610 J-E equations, 608–609 concentration overpotentials, 611–612 series resistance overpotentials, 612 raw current-potential data, 611 sample preparation, 612–613 semiconductor-liquid interface, 610–611 current density-potential properties, 607–608 dark current-potential characteristics, 607 thermodynamics, 605–606 Photodiode array (PDA), Raman spectroscopy of solids, dispersed radiation measurement, 707–708 Photoelectrochemistry, semiconductor materials electrochemical photocapacitance spectroscopy, 623–626 photocurrent/photovoltage measurements, 605–613 charge transfer at equilibrium, 606–607 electrochemical cell design, 609–610 J-E equations, 608–609 concentration overpotentials, 611–612 series resistance overpotentials, 612 raw current-potential data, 611 sample preparation, 612–613 semiconductor-liquid interface, 610–611 current density-potential properties, 607–608 dark current-potential characteristics, 607 junction thermodynamics, 605–606 research background, 605 semiconductor-liquid interface band gap measurement, 613–614 differential capacitance measurements, 616–619 diffusion length, 614–616 flat-band potential measurements, 628–630 J-E data, kinetic analysis, 631–632 laser spot scanning, 626–628 photocurrent/photovoltage measurements, 610–611 current density-potential properties, 607–608 dark current-potential characteristics, 607 junction thermodynamics, 605–606 transient decay dynamics, 619–622 surface recombination velocity, time-resolved microwave conductivity, 622–623 time-resolved photoluminescence spectroscopy, interfacial charge transfer kinetics, 630–631 Photoelectromagnetic (PEM) effect, carrier lifetime measurement, steady-state methods, 436 Photoelectron detectors, x-ray photoelectron spectroscopy (XPS), 982–983 Photoemission angle-resolved x-ray photoelectron spectroscopy (ARXPS), 987–988 metal alloy magnetism, local exchange splitting, 189–190 ultraviolet photoelectron spectroscopy (UPS) automation, 731–732
vs. inverse photoemission, 722–723 limitations, 733 process, 726–727 valence electron characterization, 723–724 x-ray photoelectron spectroscopy (XPS), 971–972 survey spectrum, 973–974 Photoluminescence (PL) spectroscopy alloy broadening, 682 automation, 686 band-to-band recombination, 684 bound excitons, 682 carrier lifetime measurement, 450–453
automated procedures, 451–452 data analysis and interpretation, 452–453 deep level luminescence, 450–451 limitations, 453 near-band-gap emission, 450 photon recycling, 451 research background, 428 shallow impurity emission, 450 steady-state methods, 436 defect-level transitions, 683–684 donor-acceptor and free-to-bound transitions, 683 excitons and exciton-polaritons, 682 experimental protocols, 684–685 limitations, 686 low-temperature PL spectra band interpretation, 686–687 crystal characterization, 685–686 phonon replicas, 682–683 research background, 681–682 room-temperature PL plating, 686 curve-fitting, 686 sample preparation, 686 specimen modification, 686 Photometric ellipsometry, defined, 735 Photomultiplier tubes (PMTs) magnetic x-ray scattering, detector criteria, 927–928 photoluminescence (PL) spectroscopy, 684 Raman spectroscopy of solids, dispersed radiation measurement, 707–708 semiconductor-liquid interfaces, transient decay dynamics, 621–622 Photon energy energy-dispersive spectrometry (EDS), 1136, 1137–1140 ultraviolet photoelectron spectroscopy (UPS), 728–729 light sources, 728–729 wavelength parameters, 730 Photon recycling, carrier lifetime measurement, photoluminescence (PL), 451 Physical quantities, carrier lifetime measurement, 433 Pick-up loops, superconductors, electrical transport measurements, signal-to-noise ratio, 483–484 Piezo-birefringent elements, polarization-modulation ellipsometer, 739 Piezoelectric behavior, electrochemical quartz crystal microbalance (EQCM), 653–654 Pirani gauge, applications and operation, 14 Placement effects, Hall effect, semiconductor materials, 417 Planar devices magnetic x-ray scattering, beamline properties, 927 x-ray magnetic circular dichroism (XMCD), 958–959 Planck’s constant chemical vapor deposition (CVD) model, gas-phase chemistry, 169 deep level transient spectroscopy (DLTS), semiconductor materials, 421 neutron powder diffraction, probe configuration, 1286–1289 phonon analysis, 1318 radiation
thermometer, 36–37 Plane wave approach electronic structure analysis, 76 phase diagram prediction, 101–102 metal alloy bonding, precision calculations, 142–143 Plasma physics chemical vapor deposition (CVD) model basic components, 170
software tools, 174 ion beam analysis (IBA), ERD/RBS techniques, 1184–1186 Plasmon loss Auger electron spectroscopy (AES), error detection, 1171 x-ray photoelectron spectroscopy (XPS), final-state effects, 974–978 Plasticity fracture toughness testing crack driving force (G), 305 load-displacement curves, 304 semibrittle materials, 311 nonuniform plastic deformation, 282 tension testing, 279–280 uniform plastic deformation, 282 Platinum alloys. See also Nickel-platinum alloys cyclic voltammetry experimental protocols, 588–590 working electrode, 585–586 electrodes, scanning electrochemical microscopy (SECM), 649 low-energy electron diffraction (LEED), qualitative analysis, 1127–1128 magnetocrystalline anisotropy energy (MAE), Co-Pt alloys, 199–200 scanning tunneling microscopy (STM) tip configuration, 1113–1114 tip preparation, 1117 Platinum resistance thermometer, ITS-90 standard, 33 pn junctions capacitance-voltage (C-V) characterization basic principles, 457–458 data analysis, 462–463 electrochemical profiling, 462 instrument limitations, 463–464 mercury probe contacts, 461–462 profiling equipment, 460–461 protocols and procedures, 458–460 research background, 456–457 sample preparation, 463 trapping effects, 464–465 characterization basic principles, 467–469 competitive and complementary techniques, 466–467 limitations of, 471 measurement equipment sources and selection criteria, 471 protocols and procedures, 469–470 research background, 466–467 sample preparation, 470–471 deep level transient spectroscopy (DLTS), semiconductor materials, 425–426 semiconductor-liquid interface, current density-potential properties, 607–608 Point groups crystallography, 42–43 crystal systems, 44 group theoretical analysis, vibrational Raman spectroscopy, 703–704 matrix representations, 717–720 Point spread function (PSF), scanning electrochemical microscopy (SECM), feedback mode, 639–640 Poisson’s equation capacitance-voltage (C-V) characterization, 457–458 neutron powder
diffraction, refinement algorithms, 1300–1301 Poisson’s ratio fracture-toughness testing, linear elastic fracture mechanics (LEFM), 303 high-strain-rate testing data interpretation and analysis, 296–298
inertia, 299 impulsive stimulated thermal scattering (ISTS), 753–755 Polarimeter devices, magnetic x-ray scattering, 928 Polarimetric spectroscopy. See Ellipsometry Polarizability of materials, Raman spectroscopy of solids, electromagnetic radiation, 701 Polarization analysis copper-nickel-zinc alloys, ordering wave polarization, 270–271 multiple-beam diffraction, mixing, 240 Raman spectroscopy of solids, 708 resonant magnetic x-ray scattering, 922–924 resonant scattering, 910–912 experimental design, 914 instrumentation, 913 ultraviolet photoelectron spectroscopy (UPS), automation, 731–732 x-ray absorption fine structure (XAFS) spectroscopy, 873 Polarization density matrix, multiple-beam diffraction, 241 Polarization-modulation ellipsometer, automation of, 739 Polarized beam technique, magnetic neutron scattering, 1330–1331 error detection, 1337–1338 Polarized-light microscopy, reflected-light optical microscopy, 677–680 Polarizers ellipsometry, 738 relative phase/amplitude calculations, 739–740 magnetic x-ray scattering, beamline properties, 925–927 Polar Kerr effect, surface magneto-optic Kerr effect (SMOKE), 571–572 Polishing procedures, metallographic analysis sample preparation, 66–67 4340 steel, cadmium plating composition and thickness, 68 Polychromatic illumination, semiconductor-liquid interface, 610 Polymeric materials impulsive stimulated thermal scattering (ISTS) analysis, 746–749 Raman spectroscopy of solids, 712 scanning electrochemical microscopy (SECM), 644 transmission electron microscopy (TEM), specimen modification, 1088 x-ray photoelectron spectroscopy (XPS), 988–989 Polynomial expansion coefficients, coherent ordered precipitates, microstructural evolution, 123–124 ‘‘Pop-in’’ phenomenon, load-displacement curves, fracture toughness testing, 304 Porod approximation, small-angle scattering (SAS), 221–222 Porous materials, thermal diffusivity, laser flash technique, 390 Position-sensitive detector (PSD) liquid surface x-ray diffraction,
non-specular scattering, 1039 surface x-ray diffraction, 1010 crystallographic alignment, 1014–1015 diffractometer alignment, 1021 grazing-incidence measurement, 1016–1017 time-dependent neutron powder diffraction, 1299 Positive-intrinsic-negative (PIN) diode detector photoluminescence (PL) spectroscopy, 684 x-ray powder diffraction, 838–839 Potential energy, fracture toughness testing, crack driving force (G), 305
Potentiometric measurements, electrochemical profiling, 579–580 Potentiostats cyclic voltammetry, 584–585 electrochemical profiling, 579–580 scanning electrochemical microscopy (SECM), 642–643 Powder diffraction file (PDF), x-ray powder diffraction, 844 Power-compensation differential scanning calorimetry (DSC) basic principles, 364–365 thermal analysis, 339 Power-field relations, resistive magnets, 502–503 Power law transitions, superconductors, electrical transport measurement, 474 Poynting vector Raman spectroscopy of solids, electromagnetic radiation, 700–701 two-beam diffraction, dispersion surface, 231 P-polarized x-ray beam, liquid surface x-ray diffraction, 1047 Preamplifier design, ion-beam-induced charge (IBIC)/single event upset (SEU) microscopy, 1228–1230 Precision measurements, metal alloy bonding, 141–144 all-electron vs. pseudopotential methods, 141–142 basis sets, 142–143 Brillouin zone sampling, 143 first principles vs. tight binding, 141 full potentials, 143 self-consistency, 141 structural relaxation, 143–144 Pressure measurements capacitance diaphragm manometers, 13 leak detection, vacuum systems, 21–22 mass spectrometers, 16–17 nuclear quadrupole resonance spectroscopy, 788 thermal conductivity gauges applications, 13 ionization gauge cold cathode type, 16 hot cathode type, 14–16 operating principles and procedures, 13–14 Pirani gauge, 14 thermocouple gauges, 14 total and partial gauges, 13 Primitive unit cells lattices, 44–46 x-ray diffraction, structure-factor calculations, 209 Principal component analysis (PCA), x-ray photoelectron spectroscopy (XPS), 992–993 Probe configuration contact problems, pn junction characterization, 471 neutron powder diffraction, 1286–1289 nuclear quadrupole resonance (NQR), frequency-swept Fourier-transform spectrometers, 780–781 scanning transmission electron microscopy (STEM), 1097–1098 transmission electron microscopy (TEM), lens resolution, 1079–1080 Probe lasers, carrier lifetime measurement, free carrier 
absorption (FCA), selection criteria, 440 Processed wafers, carrier lifetime measurement, metal/highly doped layer removal, free carrier absorption (FCA), 443 Processing operations, vacuum systems, 19 PROFIL program, neutron powder diffraction, 1306
Projectiles heavy-ion backscattering spectrometry (HIBS), 1276–1277 impingement, materials characterization, 1 ion beam analysis (IBA), ERD/RBS techniques, 1184–1186 Property calculation, microstructural evolution modeling and, 128–129 Proton-induced gamma ray emission (PIGE) automation, 1205–1206 background, interference, and sensitivity, 1205 cross-sections and Q values, 1205 data analysis, 1206–1207 energy relations, 1209 energy scanning, resonance depth profiling, 1204–1205 energy spread, 1209–1210 instrumentation criteria, 1203–1204 limitations, 1207–1208 nonresonant methods, 1202–1203 research background, 1200–1202 resonant depth profiling, 1203 specimen modification, 1207 standards, 1205 unwanted particle filtering, 1204 Proton scattering, ion-beam analysis (IBA), ERD/RBS examples, 1195–1197 Pseudo free induction decay, rotating frame NQR imaging, 786–788 Pseudopotential approach electronic structure analysis, 75 metal alloy bonding precision measurements, self-consistency, 141–142 semiconductor compounds, 138 phonon analysis, 1324–1325 Pulsed Fourier transform NMR basic principles, 765 electron paramagnetic resonance (EPR), 796 Pulsed magnetic fields, laboratory settings, 505 Pulsed MOS capacitance technique carrier lifetime measurement, 431, 435 Pulsed step measurements, superconductors, electrical transport measurements, current ramp rate, 479 Pulsed-type measurement techniques, carrier lifetime measurement, 436–438 Pulse height measurements, ion-beam-induced charge (IBIC)/single event upset (SEU) microscopy, 1225–1226 Pulse laser impulsive stimulated thermal scattering (ISTS), 750–752 thermal diffusivity, laser flash technique, 385–386 Pulse magnets cyclotron resonance (CR), laser magnetospectroscopy (LMS), 811–812 structure and properties, 504–505 Pulsed nuclear magnetic resonance (NMR), magnetic field measurements, 510 Pump beam photons, carrier lifetime measurement, free carrier absorption (FCA), 439 Pumpdown procedures, vacuum systems, 19 Pumping
speed sputter-ion pump, 11 vacuum system principles, outgassing, 2–3 Pump lasers, carrier lifetime measurement, free carrier absorption (FCA), selection criteria, 440 Purity of materials, differential thermal analysis (DTA)/differential scanning calorimetry (DSC), 367–368
Quadrupolar moments Mössbauer spectroscopy, electric quadrupole splitting, 822–823 nuclear quadrupole resonance (NQR), 776 resonant magnetic x-ray scattering, 922–924 Quadrupole mass filter, applications and operating principles, 17 Qualitative analysis Auger electron spectroscopy (AES), 1161–1162 energy-dispersive spectrometry (EDS), 1141–1143 low-energy electron diffraction (LEED) basic principles, 1122–1124 data analysis, 1127–1128 mass measurement process assurance, 28–29 ultraviolet/visible absorption (UV-VIS) spectroscopy, competitive and related techniques, 689–690 Quantitative analysis Auger electron spectroscopy (AES), 1169 differential thermal analysis, defined, 363 energy-dispersive spectrometry (EDS), 1143 standardless analysis, 1147–1148 accuracy testing, 1150 applications, 1151 first-principles standardless analysis, 1148–1149 fitted-standards standardless analysis, 1149–1150 ion-beam-induced charge (IBIC)/single event upset (SEU) microscopy, pulse height interpretation, 1225–1226 low-energy electron diffraction (LEED) basic principles, 1124–1125 data analysis, 1128–1131 medium-energy backscattering, 1271–1272 neutron powder diffraction, 1296–1297 Raman spectroscopy of solids, 713 ultraviolet/visible absorption (UV-VIS) spectroscopy competitive and related techniques, 689–690 interpretation, 693–694 procedures and protocols, 690–691 x-ray powder diffraction, phase analysis, 842, 844–845 Quantum mechanics cyclotron resonance (CR), 808 low-energy electron diffraction (LEED), quantitative analysis, 1129–1131 resonant scattering analysis, 908–909 surface magneto-optic Kerr effect (SMOKE), 571 Quantum Monte Carlo (QMC), electronic structure analysis, 75 basic principles, 87–89 Quantum paramagnetic response, principles and equations, 521–522 Quarter wave plates, magnetic x-ray scattering, beamline properties, 927 Quartz crystal microbalance (QCM) basic principles, 24 electrochemical analysis automation, 659–660 basic principles, 653–658 data analysis
and interpretation, 660 equivalent circuit, 654–655 film and solution effects, 657–658 impedance analysis, 655–657 instrumentation criteria, 658–659 limitations, 661 quartz crystal properties, 659 research background, 653 sample preparation, 660–661 series and parallel frequency, 660 specimen modification, 661 Quasi-Fermi levels, carrier lifetime measurement, trapping, 431–432
Quasireversible reaction, cyclic voltammetry, 583 Quasi-steady-state-type measurement technique, carrier lifetime measurement, 436–437 Quench detection electrical transport measurements, superconductors, 480–481 superconducting magnets, 501 Q values heavy-ion backscattering spectrometry (HIBS), 1275–1276 nuclear reaction analysis (NRA) and proton-induced gamma ray emission (PIGE), cross-sections and, 1205 nuclear reaction analysis (NRA)/proton-induced gamma ray emission (PIGE), particle filtering, 1204 Radiation effects microscopy basic principles, 1224–1228 instrumentation criteria, 1228–1230 ion-induced damage, 1232–1233 limitations, 1233 quantitative analysis, pulse height interpretations, 1225–1226 research background, 1223–1224 semiconductor materials, 1225 SEU microscopy, 1227–1228 static random-access memory (SRAM), 1230–1231 specimen modification, 1231–1232 topographical contrast, 1226–1227 Radiation sources chemical vapor deposition (CVD) model basic components, 172–173 software tools, 175 energy-dispersive spectrometry (EDS), stray radiation, 1155 ion-beam analysis (IBA), hazardous exposure, 1190 liquid surface x-ray diffraction, damage from, 1044–1045 Raman spectroscopy of solids, 704–705 x-ray microfluorescence, 942–943 Radiation thermometer, operating principles, 36–37 Radiative process, carrier lifetime measurement, 430 Radiofrequency (RF) magnetic resonance imaging (MRI), 765–767 nuclear quadrupole resonance (NQR), research background, 775 PC-decay, carrier lifetime measurement optical techniques, 434 principles and procedures, 449 Radiofrequency (RF) quadrupole analyzer, secondary ion mass spectrometry (SIMS) and, 1236–1237 Radioisotope sources, Mössbauer spectroscopy, 825–826 Radiolysis, transmission electron microscopy (TEM), polymer specimen modification, 1088 Radius of gyration, small-angle scattering (SAS), 221 Raman active vibrational modes aluminum crystals, 709–710 solids, 720–721 Raman spectroscopy of solids competitive and related
techniques, 699 data analysis and interpretation, 709–713 active vibrational modes, aluminum compounds, 709–710 carbon structure, 712–713 crystal structures, 710–712 dispersed radiation measurement, 707–708 electromagnetic radiation, classical physics, 699–701
Fourier transform Raman spectroscopy, 708–709 group theoretical analysis, vibrational Raman spectroscopy, 702–704 character tables, 718 point groups and matrix representation, symmetry operations, 717–718 vibrational modes of solids, 720–722 vibrational selection rules, 716–720 light scattering, semiclassical physics, 701–702 limitations, 713–714 magnetic neutron scattering and, 1321 optical alignment, 706 optics, 705–706 polarization, 708 quantitative analysis, 713 radiation sources, 704–705 research background, 698–699 spectrometer components, 706–707 theoretical principles, 699–700 Random noise, superconductors, electrical transport measurements, signal-to-noise ratio, 485 Random-phase approximation (RPA), electronic structure analysis, 84–87 Rapid-scan FT spectrometry, cyclotron resonance (CR), 810 Rare earth ions, atomic/ionic magnetism, ground-state multiplets, 514–515 Ratio thermometer, operating principles, 37 Rayleigh scattering impulsive stimulated thermal scattering (ISTS) analysis, 754–755 Raman spectroscopy of solids, optics properties, 705–706 scanning transmission electron microscopy (STEM) incoherent scattering, 1099–1101 phase-contrast illumination, 1094–1097 transmission electron microscopy (TEM), aperture diffraction, 1078 R-curve behavior, fracture toughness testing, 302 crack driving force (G), 305 sample preparation, 311–312 Reaction onset temperatures, differential thermal analysis (DTA)/differential scanning calorimetry (DSC), 368–369 Reaction rate mode, scanning electrochemical microscopy (SECM), 640–641 Reaction reversibility.
See Reversible reactions Reactive gases cryopump, 9–10 turbomolecular pumps, bearing failure and, 8 ‘‘Reactive sticking coefficient’’ concept, chemical vapor deposition (CVD) model, surface chemistry, 168 Readout interpretation magnetic resonance imaging (MRI), readout gradient, 768 thermometry, 35 Rear-face temperature rise, thermal diffusivity, laser flash technique, 384–385 equations for, 391–392 error detection, 390 Rebound hardness testing, basic principles, 316 Reciprocal lattices phonon analysis, basic principles, 1317–1318 surface x-ray scattering, mapping protocols, 1016 transmission electron microscopy (TEM) deviation vector and parameter, 1067–1068 diffraction pattern indexing, 1073–1074 tilting specimens and electron beams, 1071 x-ray diffraction, crystal structure, 209 Recoil-free fraction, Mössbauer spectroscopy, 821 Recoiling, particle scattering
diagrams, 52–54 elastic recoiling, 51–52 inelastic recoiling, 52 nonrelativistic collisions, 62 Recombination mechanisms carrier lifetime measurement, 429–431 limitations, 438 photoluminescence (PL), 452–453 surface recombination and diffusion, 432–433 photoluminescence (PL) spectroscopy, band-to-band recombination, 684 pn junction characterization, 468–469 Rectangular resonator, electron paramagnetic resonance (EPR), continuous-wave (CW) experiments, 794–795 Redhead Extractor gauge, operating principles, 14–15 Redox potentials, semiconductor-liquid interface, current density-potential properties, 607–608 Reduced mass, particle scattering, kinematics, 54 Reference electrodes, semiconductor electrochemical cell design, photocurrent/photovoltage measurements, 610 Reference materials, tribological testing, 333 Refinement algorithms, neutron powder diffraction, 1300–1301 Reflected-light optical microscopy basic principles, 667 illumination modes and image-enhancement techniques, 676–680 limitations, 680–681 procedures and protocols, 675–680 research background, 674–675 sample preparation, 680 Reflection high-energy electron diffraction (RHEED) low-energy electron diffraction (LEED), comparisons, 1121 surface magneto-optic Kerr effect (SMOKE), 573–574 surface x-ray diffraction and, 1007–1008 ultraviolet photoelectron spectroscopy (UPS), sample preparation, 732–733 Reflection polarimetry.
See Ellipsometry Reflectivity ellipsometry, 735–737 defined, 737 liquid surface x-ray diffraction, 1029–1036 Born approximation, 1033–1034 distorted-wave Born approximation, 1034–1036 error sources, 1044–1045 Fresnel reflectivity, 1029–1031 grazing incidence diffraction and rod scans, 1036 instrumentation, 1036–1038 multiple stepwise and continuous interfaces, 1031–1033 non-specular scattering, 1033–1036 simple liquids, 1040–1041 surface x-ray diffraction, 1015–1016 Refractive lenses, x-ray microprobes, 946 REITAN program, neutron powder diffraction, Rietveld refinements, 1306 Relative amplitude ellipsometric measurement angle of incidence errors, 742 basic principles, 735 optical properties, 740–741 polarizer/analyzer readings, 739–740 ellipsometry, reflecting surfaces, 736–737 Relative coordinates, particle scattering, kinematics, 54–55 Relative energy, particle scattering, kinematics, 54
Relative phase, ellipsometric measurement angle of incidence errors, 742 basic principles, 735 optical properties, 740–741 polarizer/analyzer readings, 739–740 reflecting surfaces, 736–737 Relative sensitivity factors (RSFs), secondary ion mass spectrometry (SIMS) and, 1236–1237 Relativistic collisions, particle scattering, kinematics, 55–56 Relaxation measurements Mössbauer spectroscopy, 824 non-contact techniques, 408 x-ray photoelectron spectroscopy (XPS), final-state effects, 975–976 Relaxation time approximation (RTA), semiconductor materials, Hall effect, 412 interpretation, 416 Reliability factors low-energy electron diffraction (LEED), quantitative analysis, 1130–1131 neutron powder diffraction, 1305 Renninger peaks, multiple-beam diffraction, 236–237 Renormalized forward scattering (RFS), low-energy electron diffraction (LEED), 1135 Repeated-cycle deformation, tribological and wear testing, 324–325 Replication process, transmission electron microscopy (TEM), sample preparation, 1087 Residual gas analyzer (RGA), applications, 16 Resistive magnets, structure and properties, 502–504 hybrid magnets, 503–504 power-field relations, 502–503 Resistivity. See Conductivity measurements; Electrical resistivity Resolution ellipsoid, phonon analysis, triple-axis spectrometry, 1322–1323 Resolution parameters energy-dispersive spectrometry (EDS), 1137–1140 measurement protocols, 1156 optimization, 1140 medium-energy backscattering, 1267 optical microscopy, 669 scanning electron microscopy (SEM), 1053 surface x-ray diffraction, diffractometer components, 1010 transmission electron microscopy (TEM), lens defects and, 1079–1080 x-ray photoelectron spectroscopy (XPS), 993–994 Resonance methods.
See also Nonresonant techniques cyclotron resonance (CR) basic principles, 806–808 cross-modulation, 812 data analysis and interpretation, 813–814 far-infrared (FIR) sources, 809 Fourier transform FIR magnetospectroscopy, 809–810 laser far infrared (FIR) magnetospectroscopy, 810–812 limitations, 814–815 optically-detected resonance (ODR) spectroscopy, 812–813 protocols and procedures, 808–813 quantum mechanics, 808 research background, 805–806 sample preparation, 814 semiclassical Drude model, 806–807 electron paramagnetic resonance (EPR) automation, 798 basic principles, 762–763, 793–794 calibration, 797
Resonance methods. See also Nonresonant techniques (Continued) continuous-wave experiments, X-band with rectangular resonator, 794–795 data analysis and interpretation, 798 electron-nuclear double resonance (ENDOR), 796 instrumentation criteria, 804 limitations, 800–802 microwave power, 796 modulation amplitude, 796–797 non-rectangular resonators, 796 non-X-band frequencies, 795–796 pulsed/Fourier transform EPR, 796 research background, 792–793 sample preparation, 798–799 sensitivity parameters, 797 specimen modification, 799–800 magnetic resonance imaging (MRI), theoretical background, 765–767 Mössbauer spectroscopy basic properties, 761–762 bcc iron alloy solutes, 828–830 coherence and diffraction, 824–825 crystal defects and small particles, 830–831 data analysis and interpretation, 831–832 electric quadrupole splitting, 822–823 hyperfine interactions, 820–821 magnetic field splitting, 823–824 isomer shift, 821–822 Mössbauer effect, 818–820 nuclear excitation, 817–818 phase analysis, 827–828 phonons, 824 radioisotope sources, 825–826 recoil-free fraction, 821 relaxation phenomena, 824 research background, 816–817 sample preparation, 832 synchrotron sources, 826–827 valence and spin determination, 827 nuclear magnetic resonance imaging (NMRI) applications, 767–771 back-projection imaging sequence, 768 basic principles, 762–763 chemical shift imaging sequence, 770 data analysis and interpretation, 771 diffusion imaging sequence, 769 echo-planar imaging sequence, 768–769 flow imaging sequence, 769–770 gradient recalled echo sequence, 768 instrumentation setup and tuning, 770–771 limitations, 772 sample preparation, 772 solid-state imaging, 770 spin-echo sequence, 768 theoretical background, 763–765 three-dimensional imaging, 770 nuclear quadrupole resonance (NQR) data analysis and interpretation, 789 direct detection techniques, spectrometer criteria, 780–781 dynamics limitations, 790 first-order Zeeman perturbation and line shapes, 779 heterogeneous
broadening, 790 indirect detection, field cycling methods, 781 nuclear couplings, 776–777 nuclear magnetic resonance (NMR), comparisons, 761–762 nuclear moments, 775–776 one-dimensional Fourier transform NQR, 781–782 research background, 775 sensitivity problems, 789 spatially resolved NQR, 785–789
field cycling methods, 788–789 magnetic field gradient method, 785–786 rotating frame NQR imaging, 786–788 temperature, stress, and pressure imaging, 788 spin relaxation, 779–780 spurious signals, 789–790 Sternheimer effect and electron deformation densities, 777 two-dimensional zero-field NQR, 782–785 exchange NQR spectroscopy, 785 level-crossing double resonance NQR nutation spectroscopy, 785 nutation NRS, 783–784 Zeeman-perturbed NRS (ZNRS), 782–783 zero field energy levels, 777–779 higher spin nuclei, 779 spin 1 levels, 777–778 spin 3/2 levels, 778 spin 5/2 levels, 778–779 research background, 761–762 Resonant depth profiling nuclear reaction analysis (NRA) and proton-induced gamma ray emission (PIGE) and, 1203 data analysis, 1206–1207 energy calibration, 1204–1205 nuclear reaction analysis (NRA)/proton-induced gamma ray emission (PIGE), excitation functions, 1210 Resonant magnetic x-ray scattering antiferromagnets, 930–932 ferromagnets, 932 anomalous dispersion effects, 925 research background, 918–919 theoretical concepts, 921–924 Resonant nuclear reaction analysis (RNRA), research background, 1178–1179 Resonant Raman scattering, diffuse scattering techniques, 889 inelastic scattering background removal, 890–893 Resonant scattering techniques angular dependent tensors, 916 calculation protocols, 909–912 L=2 measurements, 910–911 L=4 measurements, 911–912 comparisons with other methods, 905–906 coordinate transformation, 916–917 data analysis and interpretation, 914–915 experiment design, 914 instrumentation criteria, 912–913 magnetic x-ray scattering antiferromagnets, 930–932 ferromagnets, 932 research background, 918–919 theoretical concepts, 921–924 materials properties measurements, 905 polarization analysis, 913 research background, 905 sample preparation, 913–914 tensor structure factors, transformation, 916 x-ray diffraction principles classical mechanics, 906–907 quantum mechanics, 908–909 theory, 906 Response function, particle-induced x-ray emission
(PIXE), 1222–1223 Restraint factors, neutron powder diffraction, 1301–1302 Reverse-bias condition, deep level transient spectroscopy (DLTS), semiconductor materials, 421–422 Reverse currents, pn junction characterization, 470 Reverse recovery (RR) technique, carrier lifetime measurement, 435
Reversible reactions chemical vapor deposition (CVD) model, gas-phase chemistry, 169 cyclic voltammetry, 582–584 non-reversible charge-transfer reactions, 583–584 quasireversible reaction, 583 total irreversible reaction, 583 total reversible reaction, 582–583 Rice-Ramsperger-Kassel-Marcus (RRKM) theory, chemical vapor deposition (CVD) model, gas-phase chemistry, 169 Rietveld refinements neutron powder diffraction axial divergence peak asymmetry, 1292–1293 data analysis, 1296 estimated standard deviations, 1305–1306 particle structure determination, 1297 quantitative phase analysis, 1296–1297 reliability factors, 1305 software tools, 1306–1307 single-crystal x-ray structure determination, 850–851 time-resolved x-ray powder diffraction, 845–847 x-ray powder diffraction candidate atom position search, 840–841 estimated standard deviations, 841–842 final refinement, 841 quantitative phase analysis, 842 structural detail analysis, 847–848 Rising-temperature method, thermogravimetric (TG) analysis, kinetic theory, 353–354 RKKY exchange, ferromagnetism, 526–527 Rockwell hardness testing automated methods, 319 basic principles, 317–318 data analysis and interpretation, 319–320 hardness values, 317–318, 323 limitations and errors, 322 procedures and protocols, 318–319 research background, 316–317 sample preparation, 320 specimen modification, 320–321 ROD program, surface x-ray diffraction crystal truncation rod (CTR) profiles, 1015 data analysis, 1025 silicon surface example, 1017–1018 Rod scans.
See Crystal truncation rods (CTR) Rolling friction coefficient, tribological and wear testing, 326 Room-temperature photoluminescence (PL) mapping applications, 686 curve fitting procedures, 687 Root-mean-square techniques, diffuse scattering techniques, inelastic scattering background removal, 892–893 1ro parameter, microstructural evolution coherent ordered precipitates, 124–126 field kinetic equations, 121–122 Rotating-analyzer ellipsometer, automation of, 739 Rotating frame NQR imaging, spatially resolved nuclear quadrupole resonance, 786–788 Rotating-sample magnetometer (RSM), principles and applications, 535 Rotating seal design, surface x-ray diffraction, 1013 Rotational/translation motion feedthroughs, vacuum systems, 18–19 Rotation axes single-crystal x-ray structure determination, crystal symmetry, 854–856 surface x-ray diffraction, five-circle diffractometer, 1013–1014
symmetry operators improper rotation axis, 39–40 proper rotation axis, 39 Roughing pumps oil-free (dry) pumps, 4–5 oil-sealed pumps, 3–4 technological principles, 3 RUMP program, ion-beam analysis (IBA), ERD/RBS equations, 1189 Rutherford backscattering spectroscopy (RBS). See also Medium-energy backscattering elastic scattering, 1178 heavy-ion backscattering spectrometry (HIBS) and, 1275–1276 high-energy ion beam analysis (IBA), 1176–1177 ion-beam composition analysis applications, 1191–1197 basic concepts, 1181–1184 detector criteria and detection geometries, 1185–1186 equations, 1186–1189, 1199–1200 experimental protocols, 1184–1186 limitations, 1189–1191 research background, 1179–1181 low-energy electron diffraction (LEED), comparisons, 1121 medium-energy backscattering, 1261–1262 nuclear reaction analysis (NRA) and proton-induced gamma ray emission (PIGE), 1201–1202 Rutherford scattering fluorescence and diffraction analysis, 940 particle scattering, central-field theory, cross-sections, 60–61 scanning transmission electron microscopy (STEM) dynamical diffraction, 1101–1103 research background, 1091–1092 Safety protocols heavy-ion backscattering spectrometry (HIBS), 1280 medium-energy backscattering, 1269 nuclear magnetic resonance, 771 Sample-and-hold technique, deep level transient spectroscopy (DLTS), semiconductor materials, 424–425 Sample manipulator components, surface x-ray diffraction, 1012–1013 Sample preparation Auger electron spectroscopy (AES), 1170 bulk electronic measurements, 404 capacitance-voltage (C-V) characterization, 463 carrier lifetime measurement free carrier absorption (FCA), 442–443 cross-sectioning, depth profiling, 443 photoconductivity (PC) techniques, 447 combustion calorimetry, 381 cyclic voltammetry, 590 cyclotron resonance (CR), 814 deep level transient spectroscopy (DLTS), semiconductor materials, 425–426 differential thermal analysis (DTA)/differential scanning calorimetry (DSC), 371 diffuse scattering techniques, 899
electrochemical quartz crystal microbalance (EQCM), 660–661 electron paramagnetic resonance (EPR), 798–799 ellipsometric measurements, 741–742 energy-dispersive spectrometry (EDS), 1152 fracture toughness testing, 311–312 gas analysis, simultaneous thermogravimetry (TG)-differential thermal analysis (TG-DTA), 394 hardness testing, 320
heavy-ion backscattering spectrometry (HIBS), 1281 high-strain-rate testing, 298 impulsive stimulated thermal scattering (ISTS) analysis, 757 low-energy electron diffraction (LEED) cleaning protocols, 1125–1126 protocols, 1131–1132 magnetic domain structure measurements Bitter pattern imaging, 545–547 holography, 554–555 Lorentz transmission electron microscopy, 552 magnetic force microscopy (MFM), 550 magneto-optic imaging, 547–549 scanning electron microscopy with polarization analysis (SEMPA), 553 spin polarized low-energy electron microscopy (SPLEEM), 557 magnetic neutron scattering, 1336–1337 magnetic x-ray scattering, 935 magnetometry, 536–537 contamination problems, 538 magnetotransport in metal alloys, 566–567 mechanical testing, 287 metallographic analysis basic principles, 63–64 cadmium plating composition and thickness, 4340 steel, 68 etching procedures, 67 mechanical abrasion, 66 microstructural evaluation 7075-T6 anodized aluminum, 68–69 deformed high-purity aluminum, 69 4340 steel sample, 67–68 mounting protocols and procedures, 66 polishing procedures, 66–67 sectioning protocols and procedures, 65–66 strategic planning, 64–65 micro-particle-induced x-ray emission (micro-PIXE) analysis, 1219 Mössbauer spectroscopy, 832 nuclear magnetic resonance, 772 particle-induced x-ray emission (PIXE) analysis, 1218–1219 phonon analysis, 1325–1326 photoluminescence (PL) spectroscopy, 687 pn junction characterization, 470–471 reflected-light optical microscopy, 680 resonant scattering, 913–914 scanning electron microscopy (SEM), 1058–1059 vacuum systems, 1061 scanning transmission electron microscopy (STEM), 1108 scanning tunneling microscopy (STM), 1117 semiconductor materials Hall effect, 416–417 photoelectrochemistry, 612–613 single-crystal neutron diffraction, 1314 single-crystal x-ray structure determination, 862–863 superconductors, electrical transport measurements contact issues, 483 geometric options, 482 handling and soldering/making contacts, 478 holder
specifications, 478, 489 quality issues, 486 sample shape, 477 support issues, 482–483 surface magneto-optic Kerr effect (SMOKE) evolution, 575 surface x-ray diffraction, crystallographic alignment, 1014–1015 thermal diffusivity, laser flash technique, 389
thermogravimetric (TG) analysis, 350–352, 356–357 thermomagnetic analysis, 541–544 trace element accelerator mass spectrometry (TEAMS), 1252–1253 transmission electron microscopy (TEM), 1086–1087 dispersion, 1087 electropolishing and chemical thinning, 1086 ion milling and focused gallium ion beam thinning, 1086–1087 replication, 1087 ultramicrotomy, 1087 tribological and wear testing, 335 ultraviolet photoelectron spectroscopy (UPS), 732–733 ultraviolet/visible absorption (UV-VIS) spectroscopy, 693, 695 x-ray absorption fine structure (XAFS) spectroscopy, 879–880 x-ray magnetic circular dichroism (XMCD) general techniques, 964–965 magnetization, 962–963 x-ray microfluorescence/microdiffraction, 949–950 x-ray photoelectron spectroscopy (XPS), 998–999 composition analysis, 994–996 sample charging, 1000–1001 x-ray powder diffraction, 842 Saturation magnetization, basic principles, 495 Sauerbrey relationship, electrochemical quartz crystal microbalance (EQCM), piezoelectric behavior, 654 Sawing techniques, metallographic analysis, sample preparation, 65 Sayre equation, single-crystal x-ray structure determination, direct method computation, 865–868 Scaling laws, magnetic phase transition theory, 530 Scanning acoustic microscopy (SAM), transmission electron microscopy (TEM) and, 1064 Scanning categories, surface x-ray diffraction, 1027 Scanning electrochemical microscopy (SECM) biological and polymeric materials, 644 competitive and related techniques, 637–638 constant-current imaging, 643 corrosion science applications, 643–644 feedback mode, 638–640 generation/collection mode, 641–642 instrumentation criteria, 642–643 limitations, 646–648 localized material deposition and dissolution, 645–646 reaction rate mode, 640–641 research background, 636–638 specimen modification, 644–646 tip preparation protocols, 648–649 Scanning electron microscopy (SEM) Auger electron spectroscopy (AES) depth-profile analysis, 1166–1167 specimen alignment, 1161 automation, 1057–1058
contrast and detectors, 1054–1056 data analysis and interpretation, 1058 electron gun, 1061 energy-dispersive spectrometry (EDS), automation, 1141 fracture toughness testing, crack extension measurement, 308 image formation, 1052–1053 imaging system components, 1061, 1063 instrumentation criteria, 1053–1054
Scanning electron microscopy (SEM) (Continued) limitations and errors, 1059–1060 magnetic domain structure measurements, type I and II protocols, 550–551 metallographic analysis, 7075-T6 anodized aluminum alloy, 69 research background, 1050 resolution, 1053 sample preparation, 1058–1059 scanning transmission electron microscopy (STEM) vs., 1092–1093 selection criteria, 1061–1063 signal generation, 1050–1052 specimen modification, 1059 techniques and innovations, 1056–1057 vacuum system and specimen handling, 1061 x-ray photoelectron spectroscopy (XPS), sample charging, 1000–1001 Scanning electron microscopy with polarization analysis (SEMPA), magnetic domain structure measurements, 552–553 Scanning monochromators, x-ray magnetic circular dichroism (XMCD), 960–962 Scanning probe microscopy (SPM), scanning tunneling microscopy (STM) vs., 1112 Scanning profilometer techniques, wear testing, 332 Scanning transmission electron microscopy (STEM) automation, 1080 transmission electron microscopy (TEM), comparison, 1063–1064, 1090–1092 x-ray absorption fine structure (XAFS) spectroscopy and, 875–877 Z-contrast imaging atomic resolution spectroscopy, 1103–1104 coherent phase-contrast imaging and, 1093–1097 competitive and related techniques, 1092–1093 data analysis and interpretation, 1105–1108 object function retrieval, 1105–1106 strain contrast, 1106–1108 dynamical diffraction, 1101–1103 incoherent scattering, 1098–1101 weakly scattered objects, 1111 limitations, 1108 manufacturing sources, 1111 probe formation, 1097–1098 protocols and procedures, 1104–1105 research background, 1090–1093 sample preparation, 1108 specimen modification, 1108 Scanning transmission ion microscopy (STIM), particle-induced x-ray emission (PIXE) and, 1211 Scanning tunneling microscopy (STM) automation, 1115–1116 basic principles, 1113–1114 complementary and competitive techniques, 1112–1113 data analysis and interpretation, 1116–1117 image acquisition, 1114 limitations, 1117–1118 liquid
surfaces and monomolecular layers, 1028 low-energy electron diffraction (LEED), 1121 material selection and limitations, 1114–1115 research background, 1111–1113 sample preparation, 1117 scanning electrochemical microscopy (SECM) and, 637 scanning transmission electron microscopy (STEM) vs., 1093 surface x-ray diffraction and, 1007–1008 Scattered radiation. See also Raman spectroscopy of solids
thermal diffusivity, laser flash technique, 390 Scattering analysis cyclotron resonance (CR), 805–806 low-energy electron diffraction (LEED), 1135 Scattering length density (SLD) liquid surface x-ray diffraction data analysis and interpretation, 1039–1043 Langmuir monolayers, 1041–1043 liquid alkane crystallization, 1043 liquid metals, 1043 simple liquids, 1040–1041 error sources, 1044–1045 non-specular scattering, 1033–1036 reflectivity measurements, 1028–1033 surface x-ray diffraction, defined, 1028–1029 Scattering power and length tribological and wear testing, 326–327 x-ray diffraction, 210 Scattering theory defined, 207 kinematic principles, 207–208 Scherrer equation, neutron powder diffraction, microstrain broadening, 1294–1295 Scherrer optimum aperture, scanning transmission electron microscopy (STEM), probe configuration, 1097–1098 Schönflies notation, symmetry operators, improper rotation axis, 39–40 Schottky barrier diodes capacitance-voltage (C-V) characterization basic principles, 457–458 data analysis, 462–463 electrochemical profiling, 462 instrument limitations, 463–464 mercury probe contacts, 461–462 profiling equipment, 460–461 protocols and procedures, 458–460 research background, 456–457 sample preparation, 463 trapping effects, 464–465 deep level transient spectroscopy (DLTS), semiconductor materials, 425–426 Schrödinger equation computational analysis, applications, 71 cyclotron resonance (CR), 808 electronic structure, 74–75 dielectric screening, 84 phase diagram prediction, 101–102 Mössbauer spectroscopy, isomer shift, 821–822 transition metal magnetic ground state, itinerant magnetism at zero temperature, 182–183 x-ray diffraction, scattering power and length, 210 Scintillation counters, single-crystal x-ray structure determination, 859–860 Scintillation detectors, single-crystal x-ray structure determination, 859–860 Scratch testing, basic principles, 316–317 Screening function, particle scattering, central-field theory, deflection
function approximation, 59–60 Screw axes single-crystal x-ray structure determination, crystal symmetry, 854–856 symmetry operators, 40–42 Screw compressor, application and operation, 5 Screw-driven mechanical testing system, fracture toughness testing, load-displacement curve measurement, 307–308 Scroll pumps, application and operation, 5 Search coils, magnetic field measurements, 508 Secondary electrons (SEs), scanning electron microscopy (SEM) contamination issues, 1059–1060 contrast images, 1055–1056
data analysis, 1058 signal generation, 1051–1052 Secondary ion accelerator mass spectrometry (SIAMS), trace element accelerator mass spectrometry (TEAMS) and, 1235–1237 Secondary ion generation, trace element accelerator mass spectrometry (TEAMS) acceleration and electron-stripping system, 1238–1239 ultraclean ion sources, 1238 Secondary ion mass spectrometry (SIMS) Auger electron spectroscopy (AES) vs., 1158–1159 composition analysis, nuclear reaction analysis (NRA) and proton-induced gamma ray emission (PIGE) and, 1201–1202 heavy-ion backscattering spectrometry (HIBS) and, 1275 ion beam analysis (IBA) and, 1175–1176, 1181 medium-energy backscattering, 1261 pn junction characterization, 467 scanning tunneling microscopy (STM) vs., 1113 trace element accelerator mass spectrometry (TEAMS) and, 1235–1237 bulk impurity measurements, 1248–1249 ultraclean ion source design, 1241 Secondary x-ray fluorescence, energy-dispersive spectrometry (EDS), matrix corrections, 1145 Second-harmonic effects magnetic domain structure measurements, magneto-optic imaging, 548–549 surface magneto-optic Kerr effect (SMOKE), 569–570 Second Law of thermodynamics thermal analysis and principles of, 342–343 thermodynamic temperature scale, 31–32 Second-order Born approximation, multiple-beam diffraction, 238–240 Sectioning procedures, metallographic analysis cadmium plating composition and thickness, 4340 steel, 68 deformed high-purity aluminum, 69 microstructural evaluation 4340 steel, 67 7075-T6 anodized aluminum alloy, 68 sample preparation, 65–66 Selected-area diffraction (SAD), transmission electron microscopy (TEM) basic principles, 1071–1073 complementary bright-field and dark-field techniques, 1082–1084 data analysis, 1081 diffraction pattern indexing, 1073–1074 Selection rules, ultraviolet photoelectron spectroscopy (UPS), photoemission process, 726–727 Self-consistency, metal alloy bonding, precision measurements, 141 Self-diffusion, binary/multicomponent diffusion, 149–150
Self-field effects, superconductors, electrical transport measurements, signal-to-noise ratio, 486 Self-interaction correction, electronic structure analysis, 75 Semibrittle materials, fracture toughness testing, plasticity, 311 Semiclassical physics cyclotron resonance (CR), Drude model, 806–807 Raman spectroscopy of solids, light scattering, 701–702 Semiconductor-based thermometer, operating principles, 36 Semiconductor materials
capacitance-voltage (C-V) characterization basic principles, 457–458 data analysis, 462–463 electrochemical profiling, 462 instrument limitations, 463–464 mercury probe contacts, 461–462 profiling equipment, 460–461 protocols and procedures, 458–460 research background, 456–457 sample preparation, 463 trapping effects, 464–465 carrier lifetime measurement characterization techniques, 433–435 device related techniques, 435 diffusion-length-based methods, 434–435 optical techniques, 434 free carrier absorption, 438–444 automated methods, 441 basic principles, 438–440 carrier decay transient, 441 computer interfacing, 441 data analysis and interpretation, 441–442 depth profiling, sample cross-sectioning, 443 detection electronics, 440–441 geometrical considerations, 441 lifetime analysis, 441–442 lifetime depth profiling, 441 lifetime mapping, 441–442 limitations, 443–444 probe laser selection, 440 processed wafers, metal and highly doped layer removal, 443 pump laser selection, 440 sample preparation, 442–443 virgin wafers, surface passivation, 442–443 generation lifetime, 431 photoconductivity, 444–450 basic principles, 444 data analysis, 446–447 high-frequency range, automation, 445–446 limitations, 449–450 microwave PC decay, 447–449 radio frequency PC decay, 449 sample preparation, 447 standard PC decay method, 444–445 photoluminescence, 450–453 automated procedures, 451–452 data analysis and interpretation, 452–453 deep level luminescence, 450–451 limitations, 453 near-band-gap emission, 450 photon recycling, 451 shallow impurity emission, 450 physical quantities, 433 recombination mechanisms, 429–431 selection criteria, characterization methods, 453–454 steady-state, modulated, and transient methods, 435–438 data interpretation problems, 437 limitations, 437–438 modulation-type method, 436 pulsed-type methods, 437–438 quasi-steady-state-type method, 436–437 surface recombination and diffusion, 432–433 theoretical background, 401, 427–429 trapping
techniques, 431–432 conductivity measurements, research background, 401–403 deep level transient spectroscopy (DLTS), 418–419 basic principles emission rate, 420–421 junction capacitance transient, 421–423
data analysis and interpretation, 425 limitations, 426 procedures and automation, 423–425 research background, 418–419 sample preparation, 425–426 semiconductor defects, 418–419 electronic measurement, 401 energy-dispersive spectrometry (EDS), basic principles, 1136–1140 Hall effect automated testing, 414 basic principles, 411–412 data analysis and interpretation, 414–416 equations, 412–413 limitations, 417 protocols and procedures, 412–414 research background, 411 sample preparation, 416–417 sensitivity, 414 heavy-ion backscattering spectrometry (HIBS), 1279–1280 ion-beam-induced charge (IBIC)/single event upset (SEU) microscopy, 1225, 1229–1230 photoelectrochemistry electrochemical photocapacitance spectroscopy, 623–626 photocurrent/photovoltage measurements, 605–613 charge transfer at equilibrium, 606–607 electrochemical cell design, 609–610 J-E equations, 608–609 concentration overpotentials, 611–612 series resistance overpotentials, 612 raw current-potential data, 611 sample preparation, 612–613 semiconductor-liquid interface, 610–611 current density-potential properties, 607–608 dark current-potential characteristics, 607 junction thermodynamics, 605–606 research background, 605 semiconductor-liquid interface band gap measurement, 613–614 differential capacitance measurements, 616–619 diffusion length, 614–616 flat-band potential measurements, 628–630 J-E data, kinetic analysis, 631–632 laser spot scanning, 626–628 photocurrent/photovoltage measurements, 610–611 current density-potential properties, 607–608 dark current-potential characteristics, 607 junction thermodynamics, 605–606 transient decay dynamics, 619–622 surface recombination velocity, time-resolved microwave conductivity, 622–623 time-resolved photoluminescence spectroscopy, interfacial charge transfer kinetics, 630–631 Raman spectroscopy of solids, 712 scanning tunneling microscopy (STM) analysis, 1114–1115 ultraviolet photoelectron spectroscopy (UPS), surface analysis, 727–728 x-ray
photoelectron spectroscopy (XPS), 988–989 Sensing elements, thermometry, 34–35 Sensitive-tint plate, reflected-light optical microscopy, 678–680 Sensitivity measurements Auger electron spectroscopy (AES), 1163
quantitative analysis, 1169 carrier lifetime measurement, microwave photoconductivity (PC) techniques, 448–449 electron paramagnetic resonance (EPR), continuous-wave (CW) experiments, 797 heavy-ion backscattering spectrometry (HIBS), 1276, 1278–1279 background and, 1281 ion beam analysis (IBA), periodic table, 1177–1178 medium-energy backscattering, 1266–1267 nuclear quadrupole resonance (NQR), 789 nuclear reaction analysis (NRA) and proton-induced gamma ray emission (PIGE), 1205 semiconductor materials, Hall effect, 414 ultraviolet photoelectron spectroscopy (UPS), 730–731 Series resistance capacitance-voltage (C-V) characterization, 464 electrochemical quartz crystal microbalance (EQCM), 660 overpotentials, semiconductor materials, J-E behavior corrections, 612 SEXI protocol, rotating frame NQR imaging, 787–788 Shadow cones, particle scattering, central-field theory, 58 Shake-up/shake-off process, x-ray photoelectron spectroscopy (XPS), final-state effects, 976 Shallow impurity emission, carrier lifetime measurement, photoluminescence (PL), 450 Shape factor analysis, transmission electron microscopy (TEM), 1065–1066 data analysis, 1080–1081 deviation vector and parameter, 1067–1068 Sharp crack, fracture toughness testing, stress field parameters, 314 Shearing techniques, metallographic analysis, sample preparation, 65 SHELX direct method procedure, single-crystal x-ray structure determination, 862 computational techniques, 867–868 Shirley background, x-ray photoelectron spectroscopy (XPS), 990–991 Shockley-Read-Hall (SRH) mechanism, carrier lifetime measurement, 429–430 free carrier absorption (FCA), 442 photoluminescence (PL), 452–453 Short-range ordering (SRO).
See also Atomic short range ordering (ASRO) diffuse scattering techniques, 886–889 absolute calibration, measured intensities, 894 x-ray scattering measurements, 889–890 metal alloy magnetism, local moment fluctuation, 187–188 neutron magnetic diffuse scattering, 904–905 x-ray diffraction local atomic arrangement, 214–217 local atomic correlation, 214–217 Signal generation ion beam analysis (IBA), 1175–1176 scanning electron microscopy (SEM), 1050–1052 Signal-to-noise ratio parameters cyclic voltammetry, 591 ion-beam-induced charge (IBIC)/single event upset (SEU) microscopy, 1233 nuclear magnetic resonance, 772 nuclear quadrupole resonance (NQR), data analysis and interpretation, 789 superconductors, electrical transport measurements, 483–486 current supply noise, 486 grounding, 485 integration period, 486
Signal-to-noise ratio parameters (Continued) pick-up loops, 483–484 random noise and signal spikes, 485–486 thermal electromotive forces, 484–485 x-ray photoelectron spectroscopy (XPS), 993–994 Silicon Hall sensors, magnetic field measurements, 507 Silicon surfaces low-energy electron diffraction (LEED), qualitative analysis, 1127–1128 particle-induced x-ray emission (PIXE), detector criteria, 1222–1223 surface x-ray diffraction, 1017–1018 SI magnetic units, general principles, 492–493 Simple liquids, liquid surface x-ray diffraction, 1040–1041 Simulation techniques, tribological and wear testing, 325 Simultaneous techniques gas analysis automated testing, 399 benefits and limitations, 393–394 chemical and physical methods, 398–399 commercial TG-DTA equipment, 394–395 evolved gas analysis and chromatography, 396–397 gas and volatile product collection, 398 infrared spectrometry, 397–398 limitations, 399 mass spectroscopy for thermal degradation and, 395–396 research background, 392–393 TG-DTA principles, 393 thermal analysis, 339 Simultaneous thermogravimetry (TG)-differential thermal analysis/differential scanning calorimetry (TG-DTA/DSC), gas analysis, 394 Simultaneous thermogravimetry (TG)-differential thermal analysis (TG-DTA) commercial sources, 394–395 gas analysis, 393 limitations, 399 Sine integral function scanning transmission electron microscopy (STEM), incoherent scattering, 1100 x-ray absorption fine structure (XAFS) spectroscopy, single scattering picture, 871–872 Single-beam spectrometer, ultraviolet/visible absorption (UV-VIS) spectroscopy, 692 Single-crystal neutron diffraction applications, 1311–1312 data analysis and interpretation, 1313–1314 neutron sources and instrumentation, 1312–1313 research background, 1307–1309 sample preparation, 1314–1315 theoretical background, 1309–1311 x-ray diffraction vs., 836 Single-crystal x-ray diffraction basic properties, 836 single-crystal neutron diffraction and, 1307–1309 Single-crystal x-ray structure
determination automation, 860 competitive and related techniques, 850–851 derived results interpretation, 861–862 initial model structure, 860–861 limitations, 863–864 nonhydrogen atom example, 862 protocols and procedures, 858–860 data collection, 859–860 research background, 850–851 sample preparation, 862–863 specimen modification, 863
x-ray crystallography principles, 851–858 crystal structure refinement, 856–858 crystal symmetry, 854–856 Single-cycle deformation, tribological and wear testing, 324–325 Single-domain structures, low-energy electron diffraction (LEED), qualitative analysis, 1123–1124 Single-edge notched bend (SENB) specimens, fracture toughness testing crack extension measurement, 308 J-integral approach, 311 sample preparation, 311–312 stress intensity and J-integral calculations, 314–315 unstable fractures, 308–309 Single-event upset (SEU) imaging basic principles, 1224–1228 instrumentation criteria, 1228–1230 ion-induced damage, 1232–1233 limitations, 1233 microscopic instrumentation, 1227–1228 static random-access memory (SRAM), 1230–1231 quantitative analysis, pulse height interpretations, 1225–1226 research background, 1223–1224 semiconductor materials, 1225 specimen modification, 1231–1232 static random-access memory (SRAM) devices, 1230–1231 topographical contrast, 1226–1227 Single-pan balance, classification, 27 Single-phase materials, Mössbauer spectroscopy, recoil-free fraction, 821 Single-scattering representation, x-ray absorption fine structure (XAFS) spectroscopy, 870–872 Size effects Auger electron spectroscopy (AES), sample preparation, 1170 diffuse intensities, metal alloys, hybridization in NiPt alloys, charge correlation effects, 268 metal alloy bonding, 137 semiconductor compounds, 138 semiconductor materials, Hall effect, 416–417 Slab analysis, ion-beam analysis (IBA), ERD/RBS equations, 1186–1189 Slater determinants electronic structure analysis Hartree-Fock (HF) theory, 77 local density approximation (LDA), 77–78 metal alloy bonding, precision calculations, 142–143 Slater-Pauling curves band theory of magnetism, 516 metal alloy magnetism, electronic structure, 184–185 Sliding friction coefficient, tribological and wear testing, 326 Sliding models, wear testing protocols, 331 Slip line field theory, static indentation hardness testing, 317 Slow-scan FT
spectroscopy, cyclotron resonance (CR), 810 Slow variables, continuum field method (CFM), 119 Small-angle neutron scattering (SANS) magnetic neutron scattering and, 1320 neutron powder diffraction and, 1288–1289 Small-angle scattering (SAS) local atomic correlation, short-range ordering, 217 x-ray diffraction, 219–222 cylinders, 220–221
ellipsoids, 220 Guinier approximation, 221 integrated intensity, 222 interparticle interference, 222 K=0 extrapolation, 221 Porod approximation, 221–222 size distribution, 222 spheres, 220 two-phase model, 220 Small-angle x-ray scattering (SAXS) diffuse scattering techniques, absolute calibration, measured intensities, 894 low-energy electron diffraction (LEED), comparisons, 1120–1121 Small-particle magnetism, Mössbauer spectroscopy, 830–831 Small-spot imaging scanning tunneling microscopy (STM), 1113 x-ray photoelectron spectroscopy (XPS), 982 Smoothing routines, x-ray photoelectron spectroscopy (XPS), 994 Snell’s law dynamical diffraction, boundary conditions, 229 ellipsometry, reflecting surfaces, 736–737 liquid surface x-ray diffraction, reflectivity measurements, 1029–1031 two-beam diffraction, dispersion surface, 230–231 Software tools chemical vapor deposition (CVD) models free molecular transport, 174 gas-phase chemistry, 173 hydrodynamics, 174 kinetic theory, 175 limitations of, 175–176 plasma physics, 174 radiation, 175 surface chemistry, 173 diffuse scattering techniques, 898–899 electrical transport measurements, superconductors, 480–481 electron paramagnetic resonance (EPR), 798 Mössbauer spectroscopy, 831–832 neutron powder diffraction axial divergence peak asymmetry, 1293 indexing procedures, 1298 Rietveld refinements, 1306–1307 particle-induced x-ray emission (PIXE) analysis, 1217–1218 phonon analysis, triple-axis spectrometry, 1320–1323 scanning electrochemical microscopy (SECM), 642–643 single-crystal x-ray structure determination, 858–860 superconductors, electrical transport measurement, data analysis, 482 surface x-ray diffraction crystal truncation rod (CTR) profiles, 1015 data analysis, 1024–1025 lineshape analysis, 1019 Solenoids, electromagnet structure and properties, 499–500 Solids analysis Raman spectroscopy competitive and related techniques, 699 data analysis and interpretation, 709–713 active vibrational modes, aluminum compounds,
709–710 carbon structure, 712–713 crystal structures, 710–712 dispersed radiation measurement, 707–708 electromagnetic radiation, classical physics, 699–701 Fourier transform Raman spectroscopy, 708–709
group theoretical analysis, vibrational Raman spectroscopy, 702–704 character tables, 718 point groups and matrix representation, symmetry operations, 717–718 vibrational modes of solids, 720–722 vibrational selection rules, 716–720 light scattering, semiclassical physics, 701–702 limitations, 713–714 optical alignment, 706 optics, 705–706 polarization, 708 quantitative analysis, 713 radiation sources, 704–705 research background, 698–699 spectrometer components, 706–707 theoretical principles, 699–700 ultraviolet photoelectron spectroscopy (UPS), energy band dispersion, 727 x-ray photoelectron spectroscopy (XPS), research background, 970–972 Solid-solid reactions, thermogravimetric (TG) analysis, 356 Solid-solution alloys, magnetism, 183–184 Solid-state detectors, fluorescence analysis, 944 Solid-state imaging nuclear magnetic resonance, 770 single-crystal x-ray structure determination, 851 Solution resistance corrosion quantification, Tafel technique, 596 electrochemical quartz crystal microbalance (EQCM), 657–658 Soret effect, chemical vapor deposition (CVD) model, hydrodynamics, 171 Sorption pump, application and operation, 5 Source-to-specimen distance, x-ray diffraction, 206 Space groups crystallography, 46–50 single-crystal x-ray structure determination, crystal symmetry, 855–856 x-ray powder diffraction, crystal lattice determination, 840 Spatially resolved nuclear quadrupole resonance basic principles, 785–789 field cycling methods, 788–789 magnetic field gradient method, 785–786 rotating frame NQR imaging, 786–788 temperature, stress, and pressure imaging, 788 Specimen geometry fracture toughness testing load-displacement curves, 304 stress intensity factor (K), 306 stress-strain analysis, 285 thermogravimetric (TG) analysis, 351–352 transmission electron microscopy (TEM), deviation parameter, 1077–1078 Specimen modification Auger electron spectroscopy (AES) alignment protocols, 1161 basic principles, 1170–1171 cyclic voltammetry, 590 differential
thermal analysis (DTA)/differential scanning calorimetry (DSC), 371–372 electrochemical quartz crystal microbalance (EQCM), 661 electron paramagnetic resonance (EPR), 799–800 energy-dispersive spectrometry (EDS), 1152–1153 position protocols, 1156 fracture toughness testing, alignment, 312–313 high-strain-rate testing, 298–299 impulsive stimulated thermal scattering (ISTS) analysis, 757
ion-beam-induced charge (IBIC)/single event upset (SEU) microscopy, 1231–1232 liquid surface x-ray diffraction, 1043–1045 low-energy electron diffraction (LEED), 1132 magnetic domain structure measurements Lorentz transmission electron microscopy, 552 magneto-optic imaging, 548–549 nuclear reaction analysis (NRA) and proton-induced gamma ray emission (PIGE) and, 1207 particle-induced x-ray emission (PIXE) error detection, 1219–1220 protocols, 1212–1216 phonon analysis, 1326 photoluminescence (PL) spectroscopy, 687 scanning electrochemical microscopy (SECM), 644–645 scanning electron microscopy (SEM), 1059 scanning transmission electron microscopy (STEM), 1108 single-crystal x-ray structure determination, 863 superconductors, electrical transport measurement, 483 thermal diffusivity, laser flash technique, 389 trace element accelerator mass spectrometry (TEAMS), 1253 transmission electron microscopy (TEM) basic principles, 1087–1088 thickness modification, deviation parameter, 1081–1082 tribological testing, 332–333 ultraviolet/visible absorption (UV-VIS) spectroscopy, 695–696 x-ray absorption fine structure (XAFS) spectroscopy, 880 x-ray photoelectron spectroscopy (XPS), 999 Specimen-to-detector distance, x-ray diffraction, 206 Spectral resolution Auger electron spectroscopy (AES), 1161 nuclear quadrupole resonance (NQR), data analysis and interpretation, 789 Spectroelectrochemical quartz crystal microbalance (SEQCM), instrumentation criteria, 659 Spectrometers/monochromators diffuse scattering techniques, 894–897 electron spectrometers, ultraviolet photoelectron spectroscopy (UPS), 729 alignment protocols, 729–730 automation, 731–732 liquid surface x-ray diffraction, instrumentation criteria, 1036–1038 magnetic x-ray scattering, 925, 927–928 medium-energy backscattering, efficiency protocols, 1265–1266 nuclear quadrupole resonance (NQR), direct detection techniques, 780–781 phonon analysis, triple-axis spectrometry, 1320–1323 photoluminescence (PL) spectroscopy, 684
Raman spectroscopy of solids, criteria for, 706–707 surface x-ray diffraction, beamline alignment, 1019–1020 ultraviolet/visible absorption (UV-VIS) spectroscopy, components, 692 x-ray absorption fine structure (XAFS) spectroscopy detection methods, 875–877 glitches, 877–878 x-ray magnetic circular dichroism (XMCD), 959–962 glitches, 966
Spectrum channel width and number, energy-dispersive spectrometry (EDS), measurement protocols, 1156 Spectrum distortion, energy-dispersive spectrometry (EDS), 1140 Spectrum subtraction, Auger electron spectroscopy (AES), 1167–1168 SPECT software diffuse scattering techniques, 898–899 x-ray microfluorescence/microdiffraction, 949 Specular reflectivity grazing-incidence diffraction (GID), 241–242 distorted-wave Born approximation (DWBA), 244–245 liquid surface x-ray diffraction, measurement protocols, 1028–1033 surface x-ray diffraction, crystal truncation rod (CTR) profiles, 1015–1016 Spheres, small-angle scattering (SAS), 220 Spherical aberration, transmission electron microscopy (TEM), lens defects and resolution, 1078 Spherical sector analysis, Auger electron spectroscopy (AES), 1160–1161 “SPICE” models, chemical vapor deposition (CVD), plasma physics, 170 Spin density functional theory (SDFT) metal alloy magnetism atomic short range order (ASRO), 190–191 competitive and related techniques, 185–186 first-principles calculations, 188–189 limitations, 200–201 transition metal magnetic ground state, itinerant magnetism at zero temperature, 181–183 Spin determination, Mössbauer spectroscopy, 827 Spin-echo sequence, magnetic resonance imaging (MRI) basic principles, 768 theoretical background, 767 Spin 1 energy level, nuclear quadrupole resonance (NQR), 777–778 Spin 3/2 energy level, nuclear quadrupole resonance (NQR), 778 Spin 5/2 energy level, nuclear quadrupole resonance (NQR), 778 Spin glass materials, cluster magnetism, 516–517 Spin-lattice relaxation, nuclear magnetic resonance, 764–765 Spin moments nuclear quadrupole resonance (NQR), 776 x-ray magnetic circular dichroism (XMCD), 953–955 Spin-orbit interaction metal alloy bonding, accuracy calculations, 140–141 surface magneto-optic Kerr effect (SMOKE), 571 x-ray photoelectron spectroscopy (XPS), 973–974 Spin packets, nuclear magnetic resonance, 764–765 Spin-polarization effects, metal alloy bonding,
accuracy calculations, 140–141 Spin polarized low-energy electron microscopy (SPLEEM), magnetic domain structure measurements, 556–557 Spin-polarized photoemission (SPPE) measurements, x-ray magnetic circular dichroism (XMCD) comparison, 955 Spin properties, nuclear magnetic resonance theory, 763–765 Spin relaxation, nuclear quadrupole resonance (NQR), 779–780 Spin-spin coupling, nuclear magnetic resonance, 764–765
Split Hopkinson pressure bar technique, high-strain-rate testing, 290 automation procedures, 296 data analysis and interpretation, 297–298 limitations, 299–300 sample preparation, 298 specimen modification, 298–299 stress-state equilibrium, 294–295 temperature effects, 295–296 Split-Pearson VII function, neutron powder diffraction, axial divergence peak asymmetry, 1293 Spontaneous magnetization collective magnetism, 515 ferromagnetism, 523–524 Spot profile analysis LEED (SPA-LEED) instrumentation criteria, 1126–1127 surface x-ray diffraction and, 1008 Spot profile analysis RHEED (SPA-RHEED), surface x-ray diffraction and, 1008 Spurious signaling nuclear quadrupole resonance (NQR), 789–790 phonon analysis, 1326–1327 Sputter-initiated resonance ionization spectrometry (SIRIS) secondary ion mass spectrometry (SIMS) and, 1236–1237 trace element accelerator mass spectrometry (TEAMS) and, 1235, 1237 Sputter-ion pump applications, 10 operating principles and procedures, 10–12 Sputter profiling angle-resolved x-ray photoelectron spectroscopy (ARXPS), 986–988 Auger electron spectroscopy (AES), depth-profile analysis, 1165–1167 ion beam analysis (IBA), ERD/RBS techniques, 1184–1186 Stability/reactivity, thermogravimetric (TG) analysis, 355 Stable crack mechanics, fracture toughness testing, 309–311 Stacking faults, neutron powder diffraction, 1295–1296 Stainless steel vacuum system construction, 17 weight standards, 26–27 Staircase step measurements, superconductors, electrical transport measurements, current ramp rate, 479 Standardless analysis, energy-dispersive spectrometry (EDS), 1147–1148 accuracy testing, 1150 applications, 1151 first-principles standardless analysis, 1148–1149 fitted-standards standardless analysis, 1149–1150 Standard reference material Auger electron spectroscopy (AES), sources, 1169–1170, 1174 energy-dispersive spectrometry (EDS), matrix corrections, 1146–1147 particle-induced x-ray emission (PIXE) analysis, 1218 Standards mass measurement process
assurance, 28–29 nuclear reaction analysis (NRA) and proton-induced gamma ray emission (PIGE), 1205 in thermal analysis, 340 weight, 26–27 Standard state corrections, combustion calorimetry, 379–380 Standing waves
multiple-beam diffraction, 240 two-beam diffraction, 235–236 Startup procedures, turbomolecular pumps, 8 Static displacive interactions diffuse scattering techniques, recovered displacements, 897–898 phase diagram prediction, 104–106 Static indentation hardness testing applications and nomenclature, 318–319 automated methods, 319 basic principles, 317–318 data analysis and interpretation, 319–320 limitations and errors, 322 procedures and protocols, 318–319 research background, 316–317 sample preparation, 320 specimen modification, 320–321 Static magnetic field, nuclear magnetic resonance, instrumentation and tuning, 770–771 Static random-access memory (SRAM) devices, ion-beam-induced charge (IBIC)/single event upset (SEU) microscopy, 1230–1231 Statistics-insensitive nonlinear peak-clipping (SNIP) algorithm, particle-induced x-ray emission (PIXE) analysis, 1217–1218 “Steady-state” carrier profile, carrier lifetime measurement, surface recombination and diffusion, 432–433 Steady-state techniques, carrier lifetime measurement, 435–438 data interpretation issues, 437 limitations of, 437–438 quasi-steady-state-type method, 436–437 Steel samples, metallographic analysis, 4340 steel cadmium plating composition and thickness, 68 microstructural evaluation, 67–68 Sternheimer effect, nuclear quadrupole resonance (NQR), 777 Stoichiometry, combustion calorimetry, 378 Stokes–Poincaré parameters, multiple-beam diffraction, polarization density matrix, 241 Stokes Raman scattering, Raman spectroscopy of solids, semiclassical physics, 701–702 Stoner enhancement factor, band theory of magnetism, 516 Stoner theory band theory of magnetism, 515–516 Landau magnetic phase transition, 529–530 Stoner-Wohlfarth theory, transition metal magnetic ground state, itinerant magnetism at zero temperature, 181–183 Stopping power, energy-dispersive spectrometry (EDS), standardless analysis, 1148–1149 Strain contrast imaging, scanning transmission electron microscopy (STEM), data interpretation,
1106–1108 Strain distribution, microbeam analysis ferroelectric sample, 948–949 tensile loading, 948 Strain-gauge measurements, high-strain-rate testing, 300 Strain rate fracture toughness testing, crack driving force (G), 305 stress-strain analysis, 283 Stress field, fracture toughness testing, sharp crack, 314 Stress intensity factor (K), fracture toughness testing basic principles, 305–306 defined, 302 SENB and CT specimens, 314–315 stable crack mechanics, 309–311 two-dimensional representation, 303
unstable fractures, 309 Stress measurements impulsive stimulated thermal scattering (ISTS), 745–746 nuclear quadrupole resonance spectroscopy, 788 Stress-state equilibrium, high-strain-rate testing, 294–295 Stress-strain analysis data analysis and interpretation, 287–288 elastic deformation, 281 fracture toughness testing, crack driving force (G), 304–305 high-strain-rate testing, Hopkinson bar technique, 290–292 mechanical testing, 280–282 curve form variables, 282–283 definitions, 281–282 material and microstructure, 283 temperature and strain rate, 283 yield-point phenomena, 283–284 Stretched exponentials, carrier lifetime measurement, 432 Structural phase transformation, thermomagnetic analysis, 542–544 Structural relaxation, metal alloy bonding, precision calculations, 143–144 Structure-factor calculations, x-ray diffraction crystal structure, 208–209 principles and examples, 209–210 Structure factor relationships neutron powder diffraction individual extraction, 1298 solution techniques, 1298 resonant scattering, 909–910 tensor structure factors, 916 single-crystal neutron diffraction and, 1309–1311 single-crystal x-ray structure determination, 852–853 direct method computation, 866–868 surface x-ray diffraction, 1010–1011 error detection, 1018–1019 transmission electron microscopy (TEM) analysis, 1065–1066 x-ray powder diffraction, 845 detail analysis, 847–848 Structure inversion method (SIM), phase diagram predictions, cluster variational method (CVM), 99 Sublimation pumps applications, 12 operating principles, 12 surface x-ray diffraction, 1011–1012 Substitutional and interstitial metallic systems, binary/multicomponent diffusion, 152–155 B2 intermetallics, chemical order, 154–155 frame of reference and concentration variables, 152 interdiffusion, 155 magnetic order, 153–154 mobilities and diffusivities, 152–153 temperature and concentration dependence of mobilities, 153 Substrate generation/tip collection (SG/TC), scanning electrochemical microscopy 
(SECM), 641–642 Subtraction technique, magnetic neutron scattering, 1330 error detection, 1337–1338 Sum peaks, energy-dispersive spectrometry (EDS), 1139 qualitative analysis, 1142–1143 Sum rules diffuse intensities, metal alloys, atomic short-range ordering (ASRO) principles, 257
x-ray magnetic circular dichroism (XMCD), 956–957 data analysis, 964 error detection, 966 Super-Borrmann effect, multiple-beam standing waves, 240 Super command format, surface x-ray diffraction, 1025–1026 Superconducting magnets changing fields stability and losses, 501–502 protection, 501 quench and training, 501 structure and properties, 500–502 Superconducting quantum interference device (SQUID) magnetometry automation of, 537 components, 532 principles and applications, 534 nuclear quadrupole resonance (NQR) detector criteria, 781 nuclear moments, 776 thermomagnetic analysis, 540–544 Superconductors diamagnetism, 494 electrical transport measurements automation of, 480–481 bath temperature fluctuations, 486–487 competitive and related techniques, 472–473 contact materials, 474 cooling options and procedures, 478–479 critical current criteria, 474–475 current-carrying area for critical current to current density, 476 current contact length, 477 current ramp rate and shape, 479 current supply, 476–477 current transfer and transfer length, 475–477 data analysis and interpretation, 481–482 electromagnetic phenomena, 479 essential theory current sharing, 474 four-point measurement, 473–474 Ohm’s law, 474 power law transitions, 474 generic protocol, 480 instrumentation and data acquisition, 479–480 lead shortening, 486 magnetic field strength extrapolation and irreversibility field, 475 maximum measurement current determination, 479 probe thermal contraction, 486 research background, 472–473 sample handling/damage, 486 sample heating and continuous-current measurements, 486 sample holding and soldering/making contacts, 478 sample preparation, 482–483 sample quality, 487 sample shape, 477 self-field effects, 487 signal-to-noise ratio parameters, 483–486 current supply noise, 486 grounding, 485 integration period, 486 pick-up loops, 483–484 random noise and signal spikes, 485–486 thermal electromotive forces, 484–485 specimen modification, 483 thermal cycling,
487 troubleshooting, 480 voltage tap placement, 477
voltmeter properties, 477 zero voltage definition, 474 magnetization, 517–519 permanent magnets in, 496 Superlattice structures grazing-incidence diffraction (GID), 242–243 magnetic neutron scattering, diffraction, 1334–1335 surface magneto-optic Kerr effect (SMOKE), 573–574 Superparamagnetism, principles of, 522 Surface analysis Auger electron spectroscopy (AES), 1158–1159 automation, 406 basic principles, 405–406 limitations, 406 nuclear reaction analysis (NRA) and proton-induced gamma ray emission (PIGE), film thickness, 1206 protocols and procedures, 406 Surface barrier detectors (SBDs) heavy-ion backscattering spectrometry (HIBS), 1276 ion-beam analysis (IBA) experiments, 1181–1184 ERD/RBS techniques, 1185–1186 medium-energy backscattering, backscattering techniques, 1262–1265 Surface capacitance measurements, electrochemical photocapacitance spectroscopy (EPS), semiconductor materials, 624–625 Surface degradation, electrochemical photocapacitance spectroscopy (EPS), semiconductor materials, 626 Surface extended x-ray absorption fine structure (EXAFS) analysis, low-energy electron diffraction (LEED), 1121 Surface magnetic scattering, protocols and procedures, 932–934 Surface magnetometry, principles and applications, 535–536 Surface magneto-optic Kerr effect (SMOKE) automation, 573 classical theory, 570–571 data analysis and interpretation, 573–574 limitations, 575 medium-boundary and medium-propagation matrices, 576–577 multilayer formalism, 571 phenomenological origins, 570 protocols and procedures, 571–573 quantum theory, ferromagnetism, 571 research background, 569–570 sample preparation, 575 ultrathin limit, 577 Surface photovoltage (SPV) technique, carrier lifetime measurement, diffusion-length-based methods, 435 Surface preparation carrier lifetime measurement passivation, free carrier absorption (FCA), 442–443 recombination and diffusion, 432–433 chemical vapor deposition (CVD) model basic components, 167–168 software tools, 173 ellipsometry polishing
protocols, 741–742 reflecting surfaces, 735–737 molecular dynamics (MD) simulation data analysis and interpretation, 160–164 higher-temperature dynamics, 161–162 interlayer relaxation, 161 layer density and temperature variation, 164
mean-square atomic displacement, 162–163 metal surface phonons, 161–162 room temperature structure and dynamics, 160–161 thermal expansion, 163–164 limitations, 164 principles and practices, 158–159 related theoretical techniques, 158 research background, 156 surface behavior, 156–157 temperature effects on surface behavior, 157 pn junction characterization, 468 surface/interface x-ray diffraction, 217–219 crystal truncation rods, 219 two-dimensional diffraction rods, 218–219 ultraviolet photoelectron spectroscopy (UPS), 727 sample preparation, 732–733 Surface recombination capture coefficient, electrochemical photocapacitance spectroscopy (EPS), semiconductor materials, 625 Surface recombination velocity carrier lifetime measurements, time-resolved microwave conductivity, 622–623 semiconductor-liquid interfaces, transient decay dynamics, 620–622 Surface reconstruction, molecular dynamics (MD) simulation, 156–157 Surface relaxation, molecular dynamics (MD) simulation, 156–157 Surface roughness angle-resolved x-ray photoelectron spectroscopy (ARXPS), 988 liquid surface x-ray diffraction, reflectivity measurements, 1033 surface x-ray scattering, 1016 temperature effects, molecular dynamics (MD) simulation, 157 Surface x-ray diffraction angle calculations, 1021 basic principles, 1008–1011 crystallographic measurements, 1010–1011 grazing incidence, 1011 measurement instrumentation, 1009–1010 surface properties, 1008–1009 beamline alignment, 1019–1020 competitive and related strategies, 1007–1008 data analysis and interpretation, 1015–1018 crystal truncation rod (CTR) profiles, 1015 diffuse scattering, 1016 grazing incidence measurements, 1016–1017 reciprocal lattice mapping, 1016 reflectivity, 1015–1016 silicon surface analysis, 1017–1018 software programs, 1024–1025 diffractometer alignment, 1020–1021 instrumentation criteria, 1011–1015 crystallographic alignment, 1014–1015 five-circle diffractometer, 1013–1014 laser alignment, 1014 sample manipulator, 1012–1013 
vacuum system, 1011–1012 limitations, 1018–1019 liquid surfaces basic principles, 1028–1036 competitive and related techniques, 1028 data analysis and interpretation, 1039–1043 Langmuir monolayers, 1041–1043 liquid alkane crystallization, 1043 liquid metals, 1043 simple liquids, 1040–1041 non-specular scattering GID, diffuse scattering, and rod scans, 1038–1039 reflectivity measurements, 1033
Surface x-ray diffraction (Continued) p-polarized x-ray beam configuration, 1047 reflectivity, 1029–1036 Born approximation, 1033–1034 distorted-wave Born approximation, 1034–1036 Fresnel reflectivity, 1029–1031 grazing incidence diffraction and rod scans, 1036 instrumentation, 1036–1038 multiple stepwise and continuous interfaces, 1031–1033 non-specular scattering, 1033 research background, 1027–1028 specimen modification, 1043–1045 research background, 1007–1008 scan categories, 1027 super command format, 1025–1026 ultrahigh-vacuum (UHV) systems bakeout procedure, 1023–1024 basic principles, 1011–1012 load-lock procedure, 1023 protocols and procedures, 1022–1024 venting procedures, 1023 Survey spectrum Auger electron spectroscopy (AES), 1163–1164 x-ray photoelectron spectroscopy (XPS), 972–974 analyzer criteria, 981–982 post-processing of, 1000–1001 SX approximation, electronic structure analysis, 85–86 Symbols, in thermal analysis, 339–340 Symmetry analysis crystallography, 39–42 improper rotation axes, 39–40 proper rotation axes, 39 screw axes and glide planes, 40–42 in crystallography, 39 resonant scattering, 906 resonant scattering analysis, 909–910 single-crystal x-ray structure determination, crystal symmetry, 854–856 surface x-ray diffraction, error detection, 1019 transmission electron microscopy (TEM), diffraction pattern indexing, 1073–1074 vibrational Raman spectroscopy group theoretical analysis, 718–720 matrix representations, 717–720 Synchrotron radiation diffuse scattering techniques, 882–884 magnetic x-ray scattering beamline properties, 925–927 hardware criteria, 925–926 Mössbauer spectroscopy, radiation sources, 826–827 surface x-ray diffraction, diffractometer components, 1009–1010 ultraviolet photoelectron spectroscopy (UPS), 728–729 commercial sources, 734 x-ray absorption fine structure (XAFS) spectroscopy, 877 x-ray photoelectron spectroscopy (XPS), 980 x-ray powder diffraction, 839 Tafel technique corrosion quantification, 593–596
limitations, 596 linear polarization, 596–599 protocols and procedures, 594–596 cyclic voltammetry, quasireversible reaction, 583 Tapered capillary devices, x-ray microprobes, 945 Taylor expansion series coherent ordered precipitates, microstructural evolution, 123
small-angle scattering (SAS), 221 thermal diffuse scattering (TDS), 211–214 Taylor rod impact test, high-strain-rate testing, 290 Technical University Munich Secondary Ion AMS Facility, trace element accelerator mass spectrometry (TEAMS) research at, 1245 Temperature cross-sensitivity, Hall effect sensors, 508 Temperature-dependent Hall (TDH) measurements, semiconductor materials, 414–416 Temperature factors, neutron powder diffraction, 1301 Temperature gradients, thermogravimetric (TG) analysis, 350–352 Temperature measurement binary/multicomponent diffusion, substitutional and interstitial metallic systems, 153 combustion calorimetry, 378–379 control of, 37–38 corrosion quantification, electrochemical impedance spectroscopy (EIS), 600–603 defined, 30–31 differential thermal analysis (DTA)/differential scanning calorimetry (DSC), 367–368 fixed points, ITS-90 standard, 34 gas analysis, simultaneous thermogravimetry (TG)-differential thermal analysis (TG-DTA), 394 high-strain-rate testing, 295–296 international temperature scale (ITS), development of, 32–34 measurement resources, 38 metal alloy paramagnetism, finite temperatures, 186–187 nuclear quadrupole resonance spectroscopy, 788 stress-strain analysis, 283 extreme temperatures and controlled environments, 285–286 surface phenomena, molecular dynamics (MD) simulation high-temperature structure and dynamics, 161–164 room temperature structure and dynamics, 160–161 variations caused by, 157 thermal diffusivity, laser flash technique, 387–389 thermogravimetric (TG) analysis error detection, 358–359 instrumentation and apparatus, 349–350 Tensile-strength measurement, hardness testing, 320 Tensile test basic principles, 284–285 high-strain-rate testing, Hopkinson bar technique, 289–290 microbeam analysis, strain distribution, 948 Tension testing basic principles, 279–280 basic tensile test, 284–285 elastic properties, 279 environmental testing, 286 extreme temperatures and controlled environments, 285–286
high-temperature testing, 286 low-temperature testing, 286 plastic properties, 279–280 specimen geometry, 285 stress/strain analysis, 280–282 curve form variables, 282–283 definitions, 281–282 material and microstructure, 283
temperature and strain rate, 283 yield-point phenomena, 283–284 testing machine characteristics, 285 Tensor elements, magnetotransport in metal alloys, transport equations, 560–561 Tensor structure factors, resonant scattering, transformation, 916 Test machines automated design, 286–287 mechanical testing, 285 Thermomagnetometry/differential thermomagnetometry/DTA, temperature measurement errors and, 358–359 Thermal analysis data interpretation and reporting, 340–341 definitions and protocols, 337–339 gas analysis, simultaneous thermogravimetry (TG)-differential thermal analysis (TG-DTA), 395 kinetic theory, 343 literature sources, 343–344 nomenclature, 337–339 research background, 337 scanning tunneling microscopy (STM) analysis, thermal drift, 1115 standardization, 340 symbols, 339–340 thermodynamic theory, 341–343 Thermal conductivity gauges, pressure measurements applications, 13 ionization gauge cold cathode type, 16 hot cathode type, 14–16 operating principles and procedures, 13–14 Pirani gauge, 14 thermocouple gauges, 14 magnetotransport in metal alloys basic principles, 560 magnetic field behavior, 563–565 research applications, 559 zero field behavior, 561–563 Thermal cycling, superconductors, electrical transport measurements, signal-to-noise ratio, 486 Thermal degradation evolved gas analysis (EGA), mass spectrometry for, 395–396 gas analysis, simultaneous thermogravimetry (TG)-differential thermal analysis (TG-DTA), 393 Thermal diffuse scattering (TDS) diffuse scattering techniques, data interpretation comparisons, 895–897 neutron diffuse scattering, comparisons, 885 surface x-ray diffraction, crystallographic alignment, 1014–1015 x-ray diffraction, 210–214 Thermal diffusivity, laser flash technique automation of, 387 basic principles, 384–385 data analysis and interpretation, 387–389 limitations, 390 protocols and procedures, 385–387 research background, 383–384 sample preparation, 389 specimen modification, 389–390 Thermal effects, phase diagram
prediction, nonconfigurational thermal effects, 106–107 Thermal electromotive forces, superconductors, electrical transport measurements, signal-to-noise ratio, 484–485 Thermal expansion metal alloy magnetism, negative effect, 195–196
INDEX
surface phenomena, molecular dynamics (MD) simulation, 163–164 Thermal neutron scattering, time-dependent neutron powder diffraction, 1299–1300 Thermal vibrations, scanning transmission electron microscopy (STEM), incoherent scattering, 1100 Thermal volatilization analysis (TVA), gas analysis and, 398 Thermistor, operating principles, 36 Thermoacoustimetry, defined, 339 Thermobalance apparatus, thermogravimetric (TG) analysis, 347–350 Thermocouple feedthroughs, vacuum systems, 18 Thermocouple gauge, applications and operation, 14 Thermocouple thermometer, operating principles, 36 Thermocoupling apparatus differential thermal analysis (DTA), 364–365 thermogravimetric (TG) analysis, 349–350 Thermodilatometry, defined, 339 Thermodynamics combustion calorimetry and principles of, 374–375 magnetic phase transition theory, 528–529 semiconductor photoelectrochemistry, semiconductor-liquid interface, 605–606 thermal analysis and principles of, 341–343 Thermodynamic temperature scale, principles of, 31–32 Thermoelectrometry, defined, 339 Thermogravimetry (TG).
See also Simultaneous thermogravimetry (TG)-differential thermal analysis (TG-DTA) apparatus, 347–350 applications, 352–356 computational analysis, 354–355 gas-solid reactions, 355–356 kinetic studies, 352–354 solid-solid reactions, 356 thermal stability/reactivity, 355 automated procedures, 356 basic principles, 346–347 data analysis and interpretation, 356 defined, 338, 344–345 documentation protocols, 361–362 experimental variables, 350–352 limitations, 357–359 mass measurement errors, 357–358 temperature measurement errors, 358–359 research background, 344–346 sample preparation, 356–357 simultaneous techniques for gas analysis, research background, 392–393 Thermomagnetic analysis automation, 541–542 basic principles, 540–541 data analysis and interpretation, 543–544 research background, 540 structural phase transformations, 542–543 Thermomagnetic errors, Hall effect, semiconductor materials, 417 Thermomagnetometry, defined, 339 Thermomechanical analysis (TMA), defined, 339 Thermometry combustion calorimetry, 376–377 definitions, 30–31 electrical-resistance thermometers, 36 fixed temperature points, 34 gas-filled thermometers, 35–36 helium vapor pressure, 33 international temperature scales, 32–34 IPTS-27 scale, 32–33 ITS-90 development, 33–34 pre-1990s revisions, 33
interpolating gas thermometer, 33 liquid-filled thermometers, 35 optical pyrometry, 34 platinum resistance thermometer, 33–34 radiation thermometers, 36–37 readout interpretation, 35 semiconductor-based thermometers, 36 sensing elements, 34–35 temperature control, 37–38 temperature-measurement resources, 38 thermistors, 36 thermocouple thermometers, 36 thermodynamic temperature scale, 31–32 Thermoparticulate analysis, defined, 338 Thermopower (S), magnetotransport in metal alloys basic principles, 560 magnetic field behavior, 563–565 research applications, 559 transport equations, 561 zero field behavior, 561–563 Thermoptometry, defined, 339 Thermosonimetry, defined, 339 Thickness parameters medium-energy backscattering, 1261 scanning transmission electron microscopy (STEM) dynamical diffraction, 1102–1103 instrumentation criteria, 1105 transmission electron microscopy (TEM), deviation parameters, 1081–1082 x-ray absorption fine structure (XAFS) spectroscopy, 879 x-ray magnetic circular dichroism (XMCD), 965–966 Thin-detector techniques, nuclear reaction analysis (NRA)/proton-induced gamma ray emission (PIGE), particle filtering, 1204 Thin-film structures conductivity measurements, 402 Hall effect, semiconductor materials, depletion effects, 417 impulsive stimulated thermal scattering (ISTS), 744–746 applications, 749–752 automation, 753 competitive and related techniques, 744–746 data analysis and interpretation, 753–757 limitations, 757–758 procedures and protocols, 746–749 research background, 744–746 sample preparation and specimen modification, 757 ion-beam analysis (IBA), ERD/RBS techniques, 1191–1197 medium-energy backscattering applications, 1268–1269 backscattering techniques, 1261–1265 particle-induced x-ray emission (PIXE), 1213–1216 ultraviolet photoelectron spectroscopy (UPS), energy band dispersion, 725 x-ray photoelectron spectroscopy (XPS) reference spectra, 996–998 research background, 970–972 Third Law of thermodynamics, thermodynamic
temperature scale, 31–32 Thomson coefficient, magnetotransport in metal alloys, basic principles, 560 Thomson scattering magnetic x-ray scattering, classical theory, 920 resonant scattering analysis, 906–907 quantum mechanics, 908–909 x-ray diffraction, 210 Three-beam interactions, dynamic diffraction, 240
Three-dimensional imaging magnetic resonance, 770 scanning electron microscopy (SEM), 1056–1057 ultraviolet photoelectron spectroscopy (UPS), energy band dispersion, 725 Three-dimensional lattices crystal systems, 45–46 low-energy electron diffraction (LEED), 1122 Three-dimensional standing wave, multiple-beam diffraction, 240 Three-electrode electrochemical cells, cyclic voltammetry, potentiostats, 584–585 Three-step photoemission model, ultraviolet photoelectron spectroscopy (UPS), 726–727 Three-wave analysis, high-strain-rate testing data analysis and interpretation, 296–298 Hopkinson bar technique, 291–292 Tight binding (TB) method electronic structure analysis, phase diagram prediction, 101–102 metal alloy bonding, precision measurements vs. first-principles calculations, 141 Tilted illumination and diffraction, transmission electron microscopy (TEM), axial dark-field imaging, 1071 Tilting specimens, transmission electron microscopy (TEM), diffraction techniques, 1071 Time-dependent Ginzburg-Landau (TDGL) equation continuum field method (CFM), 114 diffusion-controlled grain growth, 127–128 numerical algorithms, 129–130 microscopic field model (MFM), 116 microstructural evolution, field kinetics, 122 Time-dependent junction capacitance, deep level transient spectroscopy (DLTS), semiconductor materials, 422–423 Time-dependent neutron powder diffraction, applications and protocols, 1298–1300 Time-dependent wear testing, procedures and protocols, 330 Time-of-flight diffractometer, neutron powder diffraction, 1290–1291 Bragg reflection positions, 1291–1292 Time-of-flight spectrometry (TOS) heavy-ion backscattering spectrometry (HIBS), 1278 medium-energy backscattering, 1261 backscattering techniques, 1262–1265 error detection, 1271–1272 protocols and procedures, 1265 nuclear reaction analysis (NRA)/proton-induced gamma ray emission (PIGE), 1204 secondary ion mass spectrometry (SIMS) and, 1236–1237 Time-resolved microwave conductivity, semiconductor-liquid interfaces,
surface recombination velocity, 622–623 Time-resolved photoluminescence (TRPL) semiconductor interfacial charge transfer kinetics, 630–631 semiconductor-liquid interfaces, transient decay dynamics, 620–622 Time-resolved x-ray powder diffraction, protocols, 845–847 Time-to-amplitude converter (TAC) carrier lifetime measurement, photoluminescence (PL), 451–452 heavy-ion backscattering spectrometry (HIBS), 1279 medium-energy backscattering, 1268 Time-to-digital converter (TDC), medium-energy backscattering, 1268
Tip position modulation (TPM) scanning electrochemical microscopy (SECM) constant current regime, 643 protocols, 648–659 scanning tunneling microscopy (STM), 1117 Top-loading balance, classification, 27–28 Topographic analysis dynamical diffraction, applications, 225 ion-beam-induced charge (IBIC)/single event upset (SEU) microscopy, 1226–1227 scanning tunneling microscopy (STM), 1116–1117 Topological distribution diffusion-controlled grain growth, 128 van Hove singularities, copper-platinum alloys, 271–273 Topologically close-packed phases (tcps), metal alloy bonding, 135–136 size effects, 137 Toroidal analyzers, ultraviolet photoelectron spectroscopy (UPS), 729 Torsional testing, high-strain-rate testing, Hopkinson bar technique, 289–290 Total current under illumination, illuminated semiconductor-liquid interface, J-E equations, 608 Totally irreversible reaction, cyclic voltammetry, 583 Totally reversible reaction, cyclic voltammetry, 582–583 Total-reflection x-ray fluorescence (TXRF) spectroscopy, heavy-ion backscattering spectrometry (HIBS) and, 1275 semiconductor manufacturing, 1280 Tougaard background, x-ray photoelectron spectroscopy (XPS), 990–991 Toughness property, tension testing, 280 Trace element accelerator mass spectrometry (TEAMS) automation, 1247 bulk analysis impurity measurements, 1247–1249 measurement data, 1249–1250 complementary, competitive and alternative methods, 1236–1238 inductively coupled plasma mass spectrometry, 1237 neutron activation-accelerator mass spectrometry (NAAMS), 1237 neutron activation analysis (NAA), 1237 secondary-ion mass spectrometry (SIMS), 1236–1237 selection criteria, 1237–1238 sputter-initiated resonance ionization spectrometry (SIRIS), 1237 data analysis and interpretation, 1247–1252 calibration of data, 1252 depth-profiling data analysis, 1251–1252 impurity measurements, 1250–1251 facilities profiles, 1242–1246 CSIRO Heavy Ion Analytical Facility (HIAF), 1245 Naval Research Laboratory, 1245–1246 Paul Scherrer 
Institute (PSI)/ETH Zurich Accelerator SIMS Laboratory, 1242–1244 Technical University Munich Secondary Ion AMS Facility, 1245 University of North Texas Ion Beam Modification and Analysis Laboratory, 1246 University of Toronto IsoTrace Laboratory, 1244–1245 facility requirements, 1238 future applications, 1239
high-energy beam transport, analysis, and detection, 1239 historical evolution, 1246–1247 impurity measurements bulk analysis, 1247–1249 depth-profiling, 1250–1251 instrumentation criteria, 1239–1247 magnetic and electrostatic analyzer calibration, 1241–1242 ultraclean ion source design, 1240–1241 instrumentation specifications and suppliers, 1258 limitations, 1253 research background, 1235–1238 sample preparation, 1252–1253 secondary-ion acceleration and electron-stripping system, 1238–1239 specimen modification, 1253 ultraclean ion sources, negatively charged secondary-ion generation, 1238 Trace element distribution, microbeam analysis, 947–948 Trace-element sensitivity, medium-energy backscattering, 1266–1267 Tracer diffusion binary/multicomponent diffusion, 149–150 substitutional and interstitial metallic systems, 154–155 Training effect, superconducting magnets, 501 Transfer length, superconductors, electrical transport measurement, 475–476 Transfer pumps, technological principles, 3 Transient decay dynamics, semiconductor-liquid interfaces, 619–622 Transient gating (TG) methods, impulsive stimulated thermal scattering (ISTS), 744 Transient ion-beam-induced charge (IBIC) microscopy, basic principles, 1228 Transient measurement techniques, carrier lifetime measurement, 435–438 data interpretation, 437 free carrier absorption (FCA), 439 limitations, pulsed-type methods, 437–438 selection criteria, 453–454 Transient response curves, thermal diffusivity, laser flash technique, 387–389 Transition metals atomic/ionic magnetism, ground-state multiplets, 514–515 bonding-antibonding, 136–137 crystal structure, tight-binding calculations, 135 magnetic ground state, itinerant magnetism at zero temperature, 181–183 surface phenomena, molecular dynamics (MD) simulation, 159 Transition operator, resonant scattering analysis, quantum mechanics, 908–909 Transition temperature measurement, differential thermal analysis (DTA)/differential scanning calorimetry (DSC), 367–368
Transmission electron microscopy (TEM) automation, 1080 basic principles, 1064–1069 deviation vector and parameter, 1066–1067 Ewald sphere construction, 1066 extinction distance, 1067–1068 lattice defect diffraction contrast, 1068–1069 structure and shape factors, 1065–1066 bright field/dark-field imaging, 1069–1071 data analysis and interpretation, 1080–1086
bright field/dark-field, and selected-area diffraction, 1082–1084 defect analysis values, 1084–1085 Kikuchi lines and deviation parameter, defect contrast, 1085–1086 shape factor effect, 1080–1081 specimen thickness and deviation parameter, 1081–1082 diffraction pattern indexing, 1073–1074 ion-beam analysis (IBA) vs., 1181 Kikuchi lines and specimen orientation, 1075–1078 deviation parameters, 1077–1078 electron diffuse scattering, 1075 indexing protocols, 1076–1077 line origins, 1075–1076 lens defects and resolution, 1078–1080 aperture diffraction, 1078 astigmatism, 1079 chromatic aberration, 1078 resolution protocols, 1079–1080 spherical aberration, 1078 limitations, 1088–1089 magnetic domain structure measurements, Lorentz transmission electron microscopy, 551–552 pn junction characterization, 467 research background, 1063–1064 sample preparation, 1086–1087 dispersion, 1087 electropolishing and chemical thinning, 1086 ion milling and focused gallium ion beam thinning, 1086–1087 replication, 1087 ultramicrotomy, 1087 scanning electron microscopy (SEM) vs., 1050 phase-contrast illumination, 1093–1097 scanning transmission electron microscopy (STEM) vs., 1090–1092 scanning tunneling microscopy (STM) vs., 1112–1113 selected-area diffraction (SAD), data analysis, 1080–1081 specimen modification, 1087–1088 tilted illumination and diffraction, 1071 tilting specimens and electron beams, 1071 Transmission lines, microwave measurement techniques, 409 Transmission measurements x-ray absorption fine structure (XAFS) spectroscopy, 875–877 x-ray photoelectron spectroscopy (XPS), elemental composition analysis, 984–985 Transmission x-ray magnetic circular dichroism (TXMCD), magnetic domain structure measurements, 555–556 Transmitted-light microscopy, optical microscopy and, 667 Transport equations, magnetotransport in metal alloys, 560–561 Transport measurements. 
See also Magnetotransport properties theoretical background, 401 Transport phenomena model, chemical vapor deposition (CVD), basic components, 167 Transverse Kerr effect, surface magneto-optic Kerr effect (SMOKE), 571–572 Transverse magnetization, nuclear magnetic resonance, 764–765 Transverse magnetoresistance effects, magnetotransport in metal alloys, 564–566 Transverse relaxation time, nuclear quadrupole resonance (NQR), 779–780
Trap-assisted Auger recombination, carrier lifetime measurement, 430–431 Trapping mechanisms capacitance-voltage (C-V) characterization, 464–465 carrier lifetime measurement, 431–432 Traveling wave model, x-ray diffraction, 206 Tribological testing acceleration, 333 automation of, 333–334 basic principles, 324–326 control factors, 332–333 data analysis and interpretation, 334 equipment and measurement techniques, 326–327 friction coefficients, 326 friction testing, 328–329 general procedures, 326 limitations, 335 research background, 324 results analysis, 333 sample preparation, 335 test categories, 327–328 wear coefficients, 326–327 wear testing, 329–332 Triclinic systems, crystallography, space groups, 50 Triode ionization gauge, 14–15 Triode sputter-ion pump, operating principles, 11 Triple-axis spectrometry, phonon analysis, 1320–1323 error detection, 1326–1327 TRISO-coated fuel particles, microbeam analysis, 947–948 Trouton’s rule, thermal analysis and, 343 Tungsten electron guns, scanning electron microscopy (SEM) instrumentation criteria, 1054–1057 selection criteria, 1061 Tungsten filaments Auger electron spectroscopy (AES), 1160–1161 hot cathode ionization gauges, 15 Tungsten tips, scanning tunneling microscopy (STM), 1117 Tuning protocols, nuclear magnetic resonance, 770–771 Tunneling mechanisms.
See also Scanning tunneling microscopy (STM) pn junction characterization, 469 Turbomolecular pumps applications, 7 bearings for, 7 operating principles and procedures, 7–9 reactive gas problems, 8 startup procedure, 8 surface x-ray diffraction, 1011–1012 venting protocols, 8 Two-beam diffraction, 229–236 anomalous transmission, 231–232 Darwin width, 232 diffuse intensities boundary condition, 233 Bragg case, 234–235 integrated intensities, 235 Laue case, 233–234 dispersion equation solution, 233 dispersion surface properties, 230–231 boundary conditions, Snell’s law, 230–231 hyperboloid sheets, 230 Poynting vector and energy flow, 231 wave field amplitude ratios, 230 Pendellösung, 231 standing waves, 235–236 x-ray birefringence, 232–233 x-ray standing waves (XSWs), 232
Two-body contact, tribological and wear testing, 324–325 categories for, 327–328 equipment and measurement techniques, 326–327 wear testing categories, 329–332 Two-dimensional exchange NQR spectroscopy, basic principles, 785 Two-dimensional Fourier transform nuclear magnetic resonance data analysis, 771–772 nutation nuclear resonance spectroscopy, 783–784 Zeeman-perturbed NRS (ZNRS), 782–783 Two-dimensional lattices crystal systems, 44–46 low-energy electron diffraction (LEED), 1122 surface/interface x-ray diffraction, 218–219 Two-dimensional states, ultraviolet photoelectron spectroscopy (UPS), energy band dispersion, 724–725 Two-dimensional zero-field NQR level-crossing double resonance NQR nutation spectroscopy, 785 Zeeman-perturbed nuclear resonance spectroscopy, 782–783 zero-field nutation NRS, 783–784 Two-phase model, small-angle scattering (SAS), 220 Two-point measurement techniques bulk measurements, 403 protocols and procedures, 403–404 conductivity measurements, 402 Two-wave analysis, high-strain-rate testing, Hopkinson bar technique, 291–292 Ultraclean ion sources, trace element accelerator mass spectrometry (TEAMS) design criteria, 1239–1240 negatively charged secondary ion generation, 1238 Ultrahigh-vacuum (UHV) systems all-metal flange seals, 18 assembly, processing and operation, 19 basic principles, 1–2 bellows-sealed feedthroughs, 19 construction materials, 17 hot cathode ionization gauges, 15 low-energy electron diffraction (LEED), comparisons, 1121 O-ring seals, limits of, 18 scanning tunneling microscopy (STM), 1117 sputter-ion pump, 10–11 surface magneto-optic Kerr effect (SMOKE), 573 surface x-ray diffraction bakeout procedure, 1023–1024 basic principles, 1011–1012 load-lock procedure, 1023 protocols and procedures, 1022–1024 venting procedures, 1023 valve construction, 18 x-ray absorption fine structure (XAFS), 876–877 x-ray photoelectron spectroscopy (XPS) instrumentation, 978–983 research background, 970–972 Ultralarge-scale integrated circuits, heavy-ion backscattering spectrometry (HIBS), 1280
Ultramicrobalances, thermogravimetric (TG) analysis, 347–350 Ultramicroelectrode (UME), scanning electrochemical microscopy (SECM) feedback mode, 638–640 properties and applications, 636
Ultramicrotomy, transmission electron microscopy (TEM), sample preparation, 1087 Ultrasonic microhardness testing, basic principles, 316 Ultrathin limit, surface magneto-optic Kerr effect (SMOKE), 577 Ultraviolet photoelectron spectroscopy (UPS) alignment procedures, 729–730 atoms and molecules, 727–728 automation, 731 competitive and related techniques, 725–726 data analysis and interpretation, 731–732 electronic phase transitions, 727 electron spectrometers, 729 energy band dispersion, 723–725 solid materials, 727 light sources, 728–729 figure of merit, 734–735 limitations, 733 photoemission process, 726–727 photoemission vs. inverse photoemission, 722–723 physical relations, 730 sample preparation, 732–733 sensitivity limits, 730–731 surface states, 727 synchrotron light sources, 734 valence electron characterization, 723–724 x-ray photoelectron spectroscopy (XPS), comparisons, 971 Ultraviolet/visible absorption (UV-VIS) spectroscopy applications, 692 array detector spectrometer, 693 automation, 693 common components, 692 competitive and related techniques, 690 dual-beam spectrometer, 692–693 limitations, 696 materials characterization, 691–692, 694–695 materials properties, 689 qualitative/quantitative analysis, 689–690 quantitative analysis, 690–691, 693–694 research background, 688–689 sample preparation, 693, 695 single-beam spectrometer, 692 specimen modification, 695–696 Uncertainty principle diffuse scattering techniques, 887–889 resonance spectroscopies, 761–762 x-ray photoelectron spectroscopy (XPS), composition analysis, 994–996 Uniaxial flow stress, static indentation hardness testing, 317 Uniform plastic deformation, stress-strain analysis, 282 Units (magnetism), general principles, 492–493 Universal gas constant, corrosion quantification, Tafel technique, 593–596 University of North Texas Ion Beam Modification and Analysis Laboratory, trace element accelerator mass spectrometry (TEAMS) research at, 1246 University of Toronto IsoTrace Laboratory, 
trace element accelerator mass spectrometry (TEAMS) at, 1244–1245 Unloading compliance technique, fracture toughness testing, crack extension measurement, 308 Unstable crack growth, fracture toughness testing, load-displacement behavior, 304–305 Unstable fractures, fracture toughness testing, 308–309
Upper critical field strength (Hc2), superconductors electrical transport measurements, extrapolation, 475 magnetic measurements, 473 magnetization, 519 superconductors-to-normal (S/N) transition, electrical transport measurement, estimation protocols, 482 Vacancy wind, binary/multicomponent diffusion, Kirkendall effect, 149 Vacuum systems basic principles, 1–3 outgassing, 2–3 leak detection, 20–22 load-lock antechamber, basic principles, 1–2 nuclear reaction analysis (NRA)/protoninduced gamma ray emission (PIGE), 1203–1204 pumps surface x-ray diffraction, 1011–1012 technological principles, 3 scanning electron microscopy (SEM), 1061 scanning transmission electron microscopy (STEM), error detection, 1108 standards and units, 1 surface x-ray diffraction, basic properties, 1011–1012 system construction and design assembly, processing and operation, 19 design criteria, 19–20 hardware components, 17–19 materials, 17 technological principles cryopumps, 9–10 diffusion pumps, 6–7 getter pumps, 12 high-vacuum pumps, 5–6 nonevaporable getter pumps (NEGs), 12 oil-free (dry) pumps, 4–5 oil-sealed pumps, 3–4 roughing pumps, 3 scroll pumps, 5 sorption pumps, 5 sputter-ion pumps, 10–12 sublimation pumps, 12 thermal conductivity gauges for pressure measurement, 13–17 total/partial pressure measurement, 13 turbomolecular pumps, 7–9 vacuum pumps, 3 ultraviolet photoelectron spectroscopy (UPS), automation, 731–732 Valence bands, x-ray photoelectron spectroscopy (XPS), 973–974 Valence electron characterization Mo¨ ssbauer spectroscopy, 827 ultraviolet photoelectron spectroscopy (UPS), 723–724 Valence potential, x-ray photoelectron spectroscopy (XPS), initial-state effects, 977–978 Valves, vacuum systems, 18 Van de Graaff particle accelerator, nuclear reaction analysis (NRA)/proton-induced gamma ray emission (PIGE), 1203–1204 Van der Pauw method conductivity measurements, 402 Hall effect, semiconductor materials, relaxation time approximation (RTA), 413 semiconductor materials, Hall 
effect, automated apparatus, 414 surface measurements, 406 Van der Waals forces
liquid surface x-ray diffraction, simple liquids, 1040–1041 neutron powder diffraction, 1302 vacuum system principles, outgassing, 2–3 Van Hove singularities, diffuse intensities, metal alloys copper-platinum alloys, 271–273 multicomponent alloys, 268–270 Van’t Hoff equation differential thermal analysis (DTA)/differential scanning calorimetry (DSC), 368 thermal analysis and principles of, 342–343 Vapor phase deposition (VPD), heavy-ion backscattering spectrometry (HIBS), 1274 Vapor phase epitaxy (VPE) reactor, capacitance-voltage (C-V) characterization, 462–463 Variability in test results, tribological and wear testing, 335 Variable frequency Hewlett-Packard meters, capacitance-voltage (C-V) characterization, 460–461 Variable-pressure scanning electron microscopy (SEM) basic principles, 1056–1057 specimen modification, 1059 Variational Monte Carlo (VMC), electronic structure analysis, 88–89 Venting protocols surface x-ray diffraction, ultrahigh-vacuum (UHV) systems, 1023 turbomolecular pumps, 8 Vertical illuminators, reflected-light optical microscopy, 675–676 Vibrating-coil magnetometer (VCM), principles and applications, 535 Vibrating-sample magnetometer (VSM) automation, 537 properties and applications, 533–534 thermomagnetic analysis, 540–544 Vibrational Raman spectroscopy, group theoretical analysis, 702–704 character tables, 718 point groups and matrix representation, symmetry operations, 717–718 vibrational modes of solids, 720–722 vibrational selection rules, 716–720 Vibrational selection rules, group theoretical analysis, 716–717 symmetry operators, 718–720 Vibrational spectra, group theoretical analysis, 717–720 Vickers hardness testing automated methods, 319 basic principles, 317–318 data analysis and interpretation, 319–320 hardness values, 317–318, 323 limitations and errors, 322 procedures and protocols, 318–319 research background, 316–317 sample preparation, 320 specimen modification, 320–321 Virgin wafers, carrier lifetime measurement, free carrier
absorption (FCA), 442–443 Virtual leaks, vacuum systems, 20–22 Viscous flow oil-sealed pumps, oil contamination, avoidance, 4 vacuum system design, 20 Voigt function, x-ray photoelectron spectroscopy (XPS), peak shape, 1006 Volatile products energy-dispersive spectrometry (EDS), loss mechanisms, 1153 gas analysis and, 398
Voltage spikes, superconductors, electrical transport measurements, signal-to-noise ratio, 485 Voltage tap, superconductors, electrical transport measurements, placement protocols, 477 Voltmeter properties, superconductors, electrical transport measurements, 477 Volume deformation (VD), phase diagram prediction, static displacive interactions, 104–106 Volume-fixed frame of reference, binary/multicomponent diffusion, 148 Vortex melting transition, superconductor magnetization, 519 Warren-Cowley order parameter diffuse intensities, metal alloys atomic short-range ordering (ASRO) principles, pair correlations, 257 basic definitions, 252–254 concentration waves, density-functional theory (DFT), 261–262 copper-nickel-zinc alloys, 269–270 effective cluster interactions (ECIs), 255–256 diffuse scattering techniques, 886–889 protocols and procedures, 889–894 x-ray diffraction, local atomic correlation, 215–217 Wave equations, high-strain-rate testing, Hopkinson bar technique, 290–292 Wave field amplitude ratios, two-beam diffraction, dispersion surface, 230 Wave function character metal alloy bonding, 138 ultraviolet photoelectron spectroscopy (UPS), photoemission process, 726–727 Waveguide mode velocities, impulsive stimulated thermal scattering (ISTS) analysis, 756–757 Wavelength-dispersive x-ray spectrometry (WDS), energy-dispersive spectrometry (EDS), matrix corrections, 1145–1147 Wavelength properties fluorescence analysis, dispersive spectrometers, 944 impulsive stimulated thermal scattering (ISTS), 746–749 Wave measurements, thermal diffusivity, laser flash technique vs., 383–384 Weak-beam dark-field (WBDF) imaging, transmission electron microscopy (TEM), 1071 Weak scattering sources magnetic x-ray scattering, 935 scanning transmission electron microscopy (STEM), incoherent imaging, weakly scattering objects, 1111 Wear coefficients, tribological and wear testing, 326–327 Wear curves, wear testing protocols, 330–331 Wear-rate-vs.-usage curve, wear testing
protocols, 331 Wear testing acceleration, 333 automation of, 333–334 basic principles, 317, 324–326 classification of, 324–325 control factors, 332–333 data analysis and interpretation, 334 equipment and measurement techniques, 326–327 friction coefficients, 326 friction testing, 328–329 general procedures, 326 limitations, 335
research background, 324 results analysis, 333 sample preparation, 335 techniques and procedures, 329–332 test categories, 327–328 wear coefficients, 326–327 Weight definitions, 24–26 standards for, 26–27 Weibull distribution, fracture toughness testing, brittle materials, 311 Weiss molecular field constant ferromagnetism, 523–524 surface magneto-optic Kerr effect (SMOKE), 571 Wigner-Eckart theorem, resonant scattering analysis, 909–910 Williamson-Hall plot, neutron powder diffraction, microstrain broadening, 1294–1295 Wollaston prism, reflected-light optical microscopy, 679–680 Wood notation, low-energy electron diffraction (LEED), qualitative analysis, 1123–1124 Working distance, optical microscopy, 670 Working electrode, cyclic voltammetry, 585–586 X-band frequencies, electron paramagnetic resonance (EPR), continuous-wave (CW) experiments, rectangular resonator, 794–795 XCOM database, particle-induced x-ray emission (PIXE), 1212 X-ray absorption, energy-dispersive spectrometry (EDS), matrix corrections, 1145 X-ray absorption fine structure (XAFS) spectroscopy automation, 878 data analysis and interpretation, 878–879 detection methods, 875–877 energy calibration, 877 energy resolution, 877 harmonic content, 877 limitations, 880 micro-XAFS, fluorescence analysis, 943 monochromator glitches, 877–878 related structures, 870 research background, 869–870 sample preparation, 879–880 simple picture components, 872–874 disorder, 872–873 L edges, 873 multiple scattering, 873–874 polarization, 873 single-crystal x-ray structure determination, 851 single-scattering picture, 870–871 specimen modification, 880 synchrotron facilities, 877 x-ray photoelectron spectroscopy (XPS), 971 X-ray absorption near-edge structure (XANES) spectroscopy.
See also Near-edge x-ray absorption fine structure (NEXAFS) spectroscopy comparisons x-ray absorption (XAS), 870 research background, 874–875 ultraviolet photoelectron spectroscopy (UPS) and, 726 X-ray birefringence, two-beam diffraction, 232–233 X-ray crystallography, single-crystal x-ray structure determination, 851–858 crystal structure refinement, 856–858 crystal symmetry, 854–856 X-ray diffraction kinematic theory crystalline material, 208–209
lattice defects, 210 local atomic arrangement - short-range ordering, 214–217 research background, 206 scattering principles, 206–208 small-angle scattering (SAS), 219–222 cylinders, 220–221 ellipsoids, 220 Guinier approximation, 221 integrated intensity, 222 interparticle interference, 222 K=0 extrapolation, 221 Porod approximation, 221–222 size distribution, 222 spheres, 220 two-phase model, 220 structure factor, 209–210 surface/interface diffraction, 217–219 crystal truncation rods, 219 two-dimensional diffraction rods, 218–219 thermal diffuse scattering (TDS), 210–214 liquid surfaces and monomolecular layers basic principles, 1028–1036 competitive and related techniques, 1028 data analysis and interpretation, 1039–1043 Langmuir monolayers, 1041–1043 liquid alkane crystallization, 1043 liquid metals, 1043 simple liquids, 1040–1041 non-specular scattering GID, diffuse scattering, and rod scans, 1038–1039 reflectivity measurements, 1033 p-polarized x-ray beam configuration, 1047 reflectivity, 1029–1036 Born approximation, 1033–1034 distorted-wave Born approximation, 1034–1036 Fresnel reflectivity, 1029–1031 grazing incidence diffraction and rod scans, 1036 instrumentation, 1036–1038 multiple stepwise and continuous interfaces, 1031–1033 non-specular scattering, 1033 research background, 1027–1028 specimen modification, 1043–1045 low-energy electron diffraction (LEED), comparisons, 1120–1121 neutron powder diffraction vs., 1286–1289 resonant scattering techniques calculation protocols, 909–912 L=2 measurements, 910–911 L=4 measurements, 911–912 classical mechanics, 906–907 comparisons with other methods, 905–906 data analysis and interpretation, 914–915 experiment design, 914 instrumentation criteria, 912–913 materials properties measurements, 905 polarization analysis, 913 quantum mechanics, 908–909 research background, 905 sample preparation, 913–914 theory, 906 single-crystal x-ray structure determination automation, 860 competitive and related techniques,
850–851 derived results interpretation, 861–862 initial model structure, 860–861 limitations, 863–864 nonhydrogen atom example, 862 protocols and procedures, 858–860 data collection, 859–860
    research background, 850–851
    sample preparation, 862–863
    specimen modification, 863
    x-ray crystallography principles, 851–858
      crystal structure refinement, 856–858
      crystal symmetry, 854–856
  surface x-ray diffraction
    angle calculations, 1021
    basic principles, 1008–1011
      crystallographic measurements, 1010–1011
      grazing incidence, 1011
      measurement instrumentation, 1009–1010
      surface properties, 1008–1009
    beamline alignment, 1019–1020
    competitive and related strategies, 1007–1008
    data analysis and interpretation, 1015–1018
      crystal truncation rod (CTR) profiles, 1015
      diffuse scattering, 1016
      grazing incidence measurements, 1016–1017
      reciprocal lattice mapping, 1016
      reflectivity, 1015–1016
      silicon surface analysis, 1017–1018
    diffractometer alignment, 1020–1021
    instrumentation criteria, 1011–1015
      crystallographic alignment, 1014–1015
      five-circle diffractometer, 1013–1014
      laser alignment, 1014
      sample manipulator, 1012–1013
      vacuum system, 1011–1012
    limitations, 1018–1019
    research background, 1007–1008
  transmission electron microscopy (TEM) and, 1064
X-ray diffuse scattering
  applications, 889–894
  automation, 897–898
  bond distances, 885
  chemical order, 884–885
  comparisons, 884
  competitive and related techniques, 883–884
  crystalline solid solutions, 885–889
  data analysis and interpretation, 894–896
  diffuse x-ray scattering techniques, 889–890
  inelastic scattering background removal, 890–893
  limitations, 898–899
  measured intensity calibration, 894
  protocols and procedures, 884–889
  recovered static displacements, 896–897
  research background, 882–884
  resonant scattering terms, 893–894
  sample preparation, 898
X-ray fluorescence analysis. See X-ray microfluorescence analysis
X-ray magnetic circular dichroism (XMCD)
  automation, 963
  basic principles, 955–957
  circularly polarized x-ray sources, 957–959
  competitive/related techniques, 954–955
  data analysis and interpretation, 963–964
  detection protocols, 959
  limitations, 965–966
  magnetic domain structure measurements, 555–556
    automation of, 555–556
    basic principles, 555
    procedures and protocols, 555
  magnetic x-ray scattering, comparisons, 919
    nonresonant ferromagnetic scattering, 930
  measurement optics, 959–962
  research background, 953–955
  sample magnetization, 962–963
  sample preparation, 964–965
X-ray magnetic linear dichroism (XMLD), magnetic domain structure measurements, 555
X-ray microdiffraction
  analytic applications, 944–945
  basic principles, 941–942
  data analysis, 950
  error detection, 950–951
  sample preparation, 949–950
X-ray microfluorescence analysis, 942–945
  background signals, 944
  characteristic radiation, 942–943
  detector criteria, 944
  fluorescence yields, 942
  micro XAFS, 943
  particle-induced x-ray emission (PIXE) and, 1210–1211
  penetration depth, 943–944
  photoabsorption cross-sections, 943
  scanning transmission electron microscopy (STEM), atomic resolution spectroscopy, 1103–1104
  X-ray microprobe protocols and procedures, 941–942
  research background, 939–941
X-ray microprobes
  microbeam applications
    strain distribution
      ferroelectric sample, 948–949
      tensile loading, 948
    trace element distribution, SiC nuclear fuel barrier, 947–948
  protocols and procedures, 941–942
  research background, 939–941
  sources, 945
X-ray photoelectron spectroscopy (XPS)
  applications, 983–988
    chemical state, 986
    elemental composition analysis, 983–986
    materials-specific issues, 988–989
  Auger electron spectroscopy (AES) vs., 1158–1159
  automation, 989
  basic principles, 971–978
    final-state effects, 974–976
    high-resolution spectrum, 974–978
    initial-state effects, 976–978
    survey spectrum, 972–974
  competitive and related techniques, 971
  data analysis, 989–994
    appearance protocols, 993–994
    background subtraction, 990–991
    peak integration and fitting, 991–992
    peak position and half-width, 992
    principal components, 992–993
  instrumentation criteria, 978–983
    analyzers, 980–982
    detectors, 982
    electron detection, 982–983
    maintenance, 983
    sources, 978–980
  interpretation protocols, 994–998
    chemical state information, 996–998
    composition analysis, 994–996
  ion-beam analysis (IBA) vs., 1181
  limitations, 999–1001
  low-energy electron diffraction (LEED), sample preparation, 1125–1126
  nuclear reaction analysis (NRA) and proton-induced gamma ray emission (PIGE) and, 1201–1202
  research background, 970–971
  sample preparation, 998–999
  specimen modification, 999
  ultraviolet photoelectron spectroscopy (UPS) and, 725–726
X-ray powder diffraction
  ab initio structure determination, 845
  basic principles, 836–838
  Bragg peak integrated intensity extraction, 840
  candidate atom positions, 840–841
  competitive and related techniques, 836
  crystal lattice and space group determination, 840
  data analysis and interpretation, 839–842
  database comparisons of known materials, 843–844
  limitations, 842–843
  protocols and procedures, 838–839
  quantitative phase analysis, 842, 844–845
  research background, 835–836
  Rietveld refinements, 841–842
  sample preparation, 842
  single-crystal neutron diffraction and, 1307–1309
  structure determination in known analogs, 845
  structure determination protocols, 847–848
  time-resolved x-ray powder diffraction, 845–847
X-ray resonant exchange scattering (XRES), x-ray magnetic circular dichroism (XMCD) comparison, 955
X-ray scattering and spectroscopy
  diffuse scattering techniques, measurement protocols, 889–894
  magnetic x-ray scattering
    data analysis and interpretation, 934–935
    hardware criteria, 925–927
    limitations, 935–936
    nonresonant scattering
      antiferromagnets, 928–930
      ferromagnets, 930
      theoretical concepts, 920–921
    research background, 917–919
    resonant scattering
      antiferromagnets, 930–932
      ferromagnets, 932
      theoretical concepts, 921–924
    sample preparation, 935
    spectrometer criteria, 927–928
    surface magnetic scattering, 932–934
    theoretical concepts, 919–925
      ferromagnetic scattering, 924–925
      nonresonant scattering, 920–921
      resonant scattering, 921–924
  research background, 835
X-ray self-absorption, energy-dispersive spectrometry (EDS), standardless analysis, 1149
X-ray standing wave (XSW) diffraction
  literature sources, 226
  low-energy electron diffraction (LEED), comparisons, 1120–1121
  two-beam diffraction, 232, 235–236
XRS-82 program, neutron powder diffraction, 1306
Yafet-Kittel triangular configuration, ferromagnetism, 525–526
Yield-point phenomena, stress-strain analysis, 283–284
Yield strength
  hardness testing, 320
  heavy-ion backscattering spectrometry (HIBS), 1277
  ion-beam analysis (IBA), ERD/RBS equations, 1186–1189
Young's modulus
  fracture-toughness testing, linear elastic fracture mechanics (LEFM), 303
  high-strain-rate testing, Hopkinson bar technique, 291–292
  impulsive stimulated thermal scattering (ISTS), 753–755
  stress-strain analysis, 280–281
Zachariasen formalism, single-crystal neutron diffraction, 1314–1315
ZBL approximation, particle scattering, central-field theory, deflection function, 60
Z-contrast imaging
  diffuse scattering techniques, comparisons, 884
  scanning transmission electron microscopy (STEM)
    atomic resolution spectroscopy, 1103–1104
    coherent phase-contrast imaging and, 1093–1097
    competitive and related techniques, 1092–1093
    data analysis and interpretation, 1105–1108
      object function retrieval, 1105–1106
      strain contrast, 1106–1108
    dynamical diffraction, 1101–1103
    incoherent scattering, 1098–1101
      weakly scattered objects, 1111
    limitations, 1108
    manufacturing sources, 1111
    probe formation, 1097–1098
    protocols and procedures, 1104–1105
    research background, 1090–1093
    sample preparation, 1108
    specimen modification, 1108
  transmission electron microscopy (TEM), 1063–1064
Zeeman effects
  nuclear quadrupole resonance (NQR), perturbation and line shapes, 779
  splitting
    Pauli paramagnetism, 493–494
    quantum paramagnetic response, 521–522
Zeeman-perturbed NRS (ZNRS), nuclear quadrupole resonance (NQR), basic principles, 782–783
Zero field cooling (ZFC), superconductor magnetization, 518–519
Zero-line optimization, differential thermal analysis (DTA)/differential scanning calorimetry (DSC), 366
Zero magnetic field
  magnetotransport in metal alloys, 561–563
  nuclear quadrupole resonance (NQR)
    energy levels, 777–779
      higher spin nuclei, 779
      spin 1 levels, 777–778
      spin 3/2 levels, 778
      spin 5/2 levels, 778–779
    two-dimensional zero-field NQR, 782–785
      exchange NQR spectroscopy, 785
      level-crossing double resonance NQR nutation spectroscopy, 785
      nutation NRS, 783–784
      Zeeman-perturbed NRS (ZNRS), 782–783
Zeroth Law of Thermodynamics, thermodynamic temperature scale, 31–32
Zero voltage definition, superconductors, electrical transport measurement, 474
Zinc alloys. See also Copper-nickel-zinc alloys
  x-ray diffraction, structure-factor calculations, 209
Zone plates, x-ray microprobes, 945–946