Springer Series in
MATERIALS SCIENCE
106
Springer Series in MATERIALS SCIENCE

Editors: R. Hull · R.M. Osgood, Jr. · J. Parisi · H. Warlimont
The Springer Series in Materials Science covers the complete spectrum of materials physics, including fundamental principles, physical properties, materials theory and design. Recognizing the increasing importance of materials science in future device technologies, the book titles in this series reflect the state of the art in understanding and controlling the structure and properties of all important classes of materials.

98 Physics of Negative Refraction and Negative Index Materials: Optical and Electronic Aspects and Diversified Approaches. Editors: C.M. Krowne and Y. Zhang
99 Self-Organized Morphology in Nanostructured Materials. Editors: K. Al-Shamery and J. Parisi
100 Self Healing Materials: An Alternative Approach to 20 Centuries of Materials Science. Editor: S. van der Zwaag
101 New Organic Nanostructures for Next Generation Devices. Editors: K. Al-Shamery, H.-G. Rubahn, and H. Sitter
102 Photonic Crystal Fibers: Properties and Applications. By F. Poli, A. Cucinotta, and S. Selleri
103 Polarons in Advanced Materials. Editor: A.S. Alexandrov
104 Transparent Conductive Zinc Oxide: Basics and Applications in Thin Film Solar Cells. Editors: K. Ellmer, A. Klein, and B. Rech
105 Dilute III–V Nitride Semiconductors and Material Systems: Physics and Technology. Editor: A. Erol
106 Into The Nano Era: Moore’s Law Beyond Planar Silicon CMOS. Editor: H.R. Huff
107 Organic Semiconductors in Sensor Applications. Editors: D.A. Bernards, R.M. Owens, and G.G. Malliaras
108 Evolution of Thin-Film Morphology: Modeling and Simulations. By M. Pelliccione and T.-M. Lu
109 Reactive Sputter Deposition. Editors: D. Depla and S. Mahieu
110 The Physics of Organic Superconductors and Conductors. Editor: A. Lebed
Volumes 50–97 are listed at the end of the book.
Howard R. Huff Editor
Into The Nano Era Moore’s Law Beyond Planar Silicon CMOS
With 136 Figures
Dr. Howard R. Huff
2116 Cumberland Hill Drive, Henderson, NV 89052, USA
E-mail: [email protected]

Series Editors:

Professor Robert Hull
University of Virginia, Dept. of Materials Science and Engineering, Thornton Hall, Charlottesville, VA 22903-2442, USA

Professor R.M. Osgood, Jr.
Microelectronics Science Laboratory, Department of Electrical Engineering, Columbia University, Seeley W. Mudd Building, New York, NY 10027, USA

Professor Jürgen Parisi
Universität Oldenburg, Fachbereich Physik, Abt. Energie- und Halbleiterforschung, Carl-von-Ossietzky-Strasse 9–11, 26129 Oldenburg, Germany

Professor Hans Warlimont
Institut für Festkörper- und Werkstoffforschung, Helmholtzstrasse 20, 01069 Dresden, Germany
Springer Series in Materials Science ISSN 0933-033X ISBN 978-3-540-74558-7
e-ISBN 978-3-540-74559-4
Library of Congress Control Number: 2008925374 © Springer-Verlag Berlin Heidelberg 2009 This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable to prosecution under the German Copyright Law. The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. Typesetting: Data prepared by VTEX using a Springer TEX macro package Cover concept: eStudio Calamar Steinen Cover production: WMX Design GmbH, Heidelberg SPIN: 12112692 57/3180/VTEX Printed on acid-free paper 987654321 springer.com
This snapshot of the IC industry and several opportunities for enhanced growth in the coming nanotechnology era is dedicated, in memoriam, to Robert Cahn and Fred Seitz, who passed away during the preparation of this book. Robert Cahn (1924–2007), FRS, University of Cambridge, and Fred Seitz (1912–2008), President Emeritus, Rockefeller University, and Past President, National Academy of Sciences
Foreword: Silicon and Electronics
The readership of this monograph, Into The Nano Era – Moore’s Law Beyond Planar Silicon CMOS, may be surprised to find that there was a time when silicon materials did not reign supreme. While silicon was utilized both before and during World War I for coded wireless detectors, it was quickly replaced by vacuum tube electronics in the late 1910s and 1920s. Ham radio proponents in the 1920s often preferred to use galena (PbS), a naturally occurring mineral that was much less expensive than polycrystalline silicon. In the late 1930s and with the advent of World War II, however, silicon became the preferred material for radar detectors. Silicon has continued to be the dominant (and pre-eminent) material during the rest of the twentieth century (although germanium and silicon transistors were commercialized during the 1950s). During the last several decades, we have come from the electronics revolution initiated by the transistor in the late 1940s to the microelectronics revolution, exemplified by the integrated circuit (IC) that was invented in the late 1950s, to today, where we are on the verge of the nano-technology revolution. Of course, many other materials are utilized in today’s most advanced ICs and surely this will be the case in the nano-technology of the future; yet, the base still appears to include silicon. Howard Huff and his authors have developed this monograph to guide us into the nano-technology era by focusing on some current aspects of silicon materials relevant to the fabrication of ICs and several potential opportunities in the nano era. The importance of defects and their control, both in the as-grown silicon and during the chip-making process, is emphasized. Indeed, the admonition to examine carefully the quality of the silicon used in chips before rather than after making the chips repeats one of the basic principles that we discovered during World War II.
That is, we found by using metallurgical-grade (polycrystalline) silicon in the 1940s that the erratic behavior and irreproducibility of the electronic characteristics of detectors were dependent on the prior history of the silicon material utilized. This monograph not only stirs up old memories of the earlier days but brings me to further appreciate the fabrication of today’s opto-electronic devices. And such also appears to be the case with the “bottom-up” fabrication technology in the nano era. Indeed, it appears that the drive to miniaturization is finally approaching the stage where quantum effects will become of the essence, which is quite an achievement when I recall the ham radio years of the 1920s and the earliest silicon materials utilized for radar in the late 1930s and early 1940s. I wish Howard Huff and the personnel involved in the creation of this monograph and its readership well in the exciting and never-ending journey towards the next revolution in information and communications technology – the nano era. New York, March 2008
Fred Seitz (deceased March 2008)
Foreword: Silicon and the III–V’s: Semiconductor Electronics (Electron, Hole, and Photon) Forever
Without silicon and the III–V semiconductors, today’s world of electronics does not exist, would not exist, likely could not exist. There is no substitute for the semiconductor, Si ranking at the top. I learned about transistors and semiconductors from John Bardeen, and then about diffused Si devices, in their inception, with John Moll (and Carl Frosch and the oxide) before learning further from work, colleagues, meetings, and journal articles. Very early, for example, it was a trick of junction assembly of my Si tunnel diodes that revealed phonon-assisted tunneling so strikingly (1959), the first unambiguous experiment showing inelastic tunneling, which made it possible for R.H. Hall and me to introduce into solid-state science and technology (via Si!) the now-universal tunneling spectroscopy. Why deviate from Si, why go off exploring the III–V’s, when Si proved to be so rich and wondrous – with Bell Labs’ oxide and diffused device technology at its pinnacle; indeed, the very technology that, moving west, spawned the “chip” and Silicon Valley? And what about Si, and its further role? Can we now be so bold (so rude) and commit the sin of even asking the question? At the 1962 Institute of Radio Engineers (I.R.E., now the I.E.E.E.) Solid State Device Research Conference, Art D’Asaro and I engaged in a friendly argument with Bob Noyce in which we defended the case for the III–V’s (light emitters) while Bob argued for a still greater future for Si. Bob knew that Art and I, at Bell Labs, knew about Si from the beginning. Why leave it? We, and Noyce, were both right and both wrong! The two, Si and the III–V’s, are complementary. We need both. We need the electron, hole, and photon, the three so unique in performance and tied together so incestuously across the energy gap. Recall: no energy gap, no semiconductor, no electron and hole, no transistor, no light emitter, no solar cell! What else is like this, and technologically so tractable? Nothing approaches the uniqueness of the semiconductor in what it does and in how it allows us to impose, to render amazing tiny sub-microscopic connected active-device geometries in a crystalline substance, in a nano-ordered substance, and as a consequence realize unbelievable electronic functions – the “chip.” We can now properly ask: When we were shown in John Moll’s group (Bell Labs, 1954–1955) a bag full of DuPont Si needles – nano-rods, as it were – should we have tried to attack at such an opportune moment nano-assembly? To, say, assemble at once active microscopic circuitry? Or should we have proceeded, as happened, to grow crystals from the needles (i.e., “self-assemble” bulk crystalline Si atom-by-atom) and proceed bit-by-bit to the “chip”? To be sure, should we have proceeded, Oh, so slowly but, Oh, so successfully? Who could have predicted, in the beginning, all that would be needed to make today’s Si “chip”? And now, in contrast, where and what is the science and technology of direct nano-assembly? Is it, say, an ultra-tiny complex system that must take on great variety and form and not be just the bland simple atom-stacking of crystal growth? Is this (a complex system) even possible without invoking some form of sorcery, i.e., without facing the abyss of total guessing or outright chicanery? Does it make sense and in what substance? Do we wish to abandon Si? If so, why? We not only build in Si, it teaches us. For example, it is the Si p–n–p–n switch, in its successful form as the thyristor, that teaches us why a CMOS element in a “chip” breaks down or why a III–V transistor laser switches and exhibits negative resistance.
As a matter of fact, it was the p–n–p–n switch that took Si to “Silicon Valley.” It is Si that we have most studied and understand best, and that informs us further in how to realize a still smaller and more sophisticated “chip.” If there is anything past the integrated circuit, the “chip,” it is Si that guides us towards it. From the standpoint of the III–V semiconductor, heterostructures and direct energy gaps, and quantum wells, we see silicon’s strengths and weaknesses. We see, in comparison with III–V’s, better and worse choices, what can be done profitably and what cannot. Silicon, from 1-ton single crystal ingots to the tiniest integrated circuits, is so valuable and such a perfect guide to what is possible in the construction of ultra-small devices, that we must continue to study it. We cannot afford not to, and thus owe a considerable debt to our colleagues Howard Huff and his authors for exposing us to more Si science and technology as we enter the nanotechnology era. The most questionable topic is that of device self-assembly. We know it works for crystal growth, even in the case of a 1-ton Si crystal, but does it work for the most intricate and tiniest integrated circuits? Note that carbon self-assembles into diamond, but we polish and pattern to develop the mirror facets that make diamond an attractive and expensive jewel. When Si self-assembles, it is too simple. We pattern and process it, at increasingly tiny size, into a more complex and useful form, into an integrated circuit, a “chip.” Now how small can it be, or do we look for other ways (heterojunctions, quantum wells, etc.) to obtain higher performance? Concerning “self-assembly,” where and what is the science to make it real and not merely a wish or just a name? I consider Si and its study, with the aid and added perspective of the III–V semiconductor (and quantum wells), as holding the answer to whether “self-assembly” makes sense. We all get old studying the abundantly rich and fertile semiconductor, but the semiconductor itself, because of the gift of the electron, hole, and photon, and their amazingly connected performance, does not weaken or age. It is not going away. We have no choice but to study Si and the III–V family of materials. Nothing else has worked so well in electronics or promises so much more. There is reason for the semiconductor to prevail, and for us to welcome this new book of Howard Huff and his authors. Urbana, April 2008
Nick Holonyak, Jr.
Preface
The revolutionary impact of the discovery of transistor action by John Bardeen and Walter Brattain of Bell Labs in December 1947 was not anticipated. Similarly, the importance of William Shockley’s invention at Bell Labs in January 1948 of the junction transistor (which was not experimentally demonstrated until 1950, although proof-of-concept using a non-collinear configuration was shown in 1949) was not recognized immediately. The transistor’s potential was only recognized after it became evident during the 1950s that the transistor – with its much lower power dissipation – could be used to do significantly more than simply mimic vacuum tube electronics in solid state. It was the invention of the Integrated Circuit (IC) by Jack Kilby of Texas Instruments in 1958 (germanium in the mesa configuration) and, independently, by Bob Noyce of Fairchild in 1959 (silicon in the planar configuration, built upon Jean Hoerni’s research at Fairchild in late 1957), that initiated the microelectronics revolution. Even then, however, the implications were barely perceived. The bipolar IC entered into high-volume production in the mid-to-late 1960s, followed by the MOSFET IC in the early 1970s. Patrick Haggerty’s vision at Texas Instruments in the early 1960s of the pervasiveness of the silicon microelectronics revolution, based on the concept of the “learning curve” (i.e., the concomitant reduction in the cost of fabrication with the increased volume of production) and market elasticity, was one of immeasurable significance to the fledgling IC industry. Concurrently, Gordon Moore at Fairchild Semiconductor in 1965 made a remarkably prescient assessment of memory component growth, based initially on bipolar and then on MOS memory density trends: a semi-log graph of the number of memory bits in an IC versus the date of initial production was a straight line, representing almost a doubling each year.
Moore’s observation (updated at Intel in 1975 to about 18 months per doubling and subsequently reaffirmed in 1995) showed that a viable market was indeed achievable, and gave impetus to the industry. His analysis became enshrined as Moore’s law and set the cadence for technology advancement, e.g., as laid out in the International Technology Roadmap for Semiconductors (ITRS). These
business-oriented considerations, moreover, combined with Bob Dennard’s invention of the one-transistor/one-capacitor dynamic random access memory (DRAM) cell at IBM in 1968 and the related transistor scaling methodologies introduced by Dennard and colleagues at IBM in 1972, established the paradigm for the progression of IC fabrication technology (from a minimum feature size of about 10 μm in the early 1970s to sub-35 nm in the present era) that has facilitated the explosive growth and application of the MOSFET IC (and subsequently the CMOS IC) during the past 35 years. The myriad of new electronic products and the creation of new market segments were not (and perhaps could not be) foreseen by the researchers involved. Indeed, Robert Lucky noted in Engineering Tomorrow (edited by J. Fouke, T.E. Bell, and D. Dooling, IEEE Press, 2000) that “there is no a priori way to determine what will tip a market. It’s a fundamental instance of chaos in group dynamics. And that makes it fundamentally difficult to predict future societal behaviors in the adoption of technologies.” More than luck is involved; nevertheless, the next application is often a surprise. And here we are, on the brink of the 50th anniversary of the invention of the IC, in the nano-technology era, wherein critical dimensions on an IC chip, such as the physical channel length, are less than 35 nm. Will silicon continue to be the preeminent active semiconductor material, and will Moore’s law continue unabated, albeit in a broader economic venue? Indeed, are we wiser now in comprehending that fundamental research, per se, inevitably will lead to new material and device configurations as well as new market opportunities, barely (if at all) perceived at the present time? The research agenda is yet our best opportunity to spawn new innovations to sustain industry expansion to the next major set(s) of global applications.
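The cadence just described – a doubling roughly every 18 months and the shrink from a minimum feature size of about 10 μm in the early 1970s to sub-35 nm today – is easy to put into rough numbers. The following sketch is purely illustrative; the function names and the ~0.7× per-generation linear shrink factor are our own working assumptions, not figures taken from any chapter of this book:

```python
def moore_density(years, doubling_period_months=18):
    """Relative transistor density after `years`, doubling every period."""
    return 2 ** (years * 12 / doubling_period_months)

def feature_size(start_nm, generations, shrink=0.7):
    """Minimum feature size after n generations of ~0.7x linear scaling."""
    return start_nm * shrink ** generations

# An 18-month doubling period compounds to roughly 100x per decade:
print(round(moore_density(10)))         # -> 102

# About 16 generations take 10 um (10,000 nm) into the sub-35 nm regime:
print(round(feature_size(10_000, 16)))  # -> 33
```

Under these assumed parameters, roughly three decades of doubling yield a million-fold density increase, which is consistent in spirit with the 35-year progression the preface describes.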
In that regard, this monograph addresses these questions by reflecting upon the scientific and technological breakthroughs that enabled the microelectronics era, providing a firm foundation for ensuing research, and offering a glimpse of what is to come in the nano-technology era. Accordingly, a review and assessment of topics fundamental to silicon materials and MOSFET device structures is presented, to identify potential nano-technology research directions and possible nano-technology applications. The monograph is divided into three sections, similar to the format of the Spring 2005 issue of INTERFACE (published by The Electrochemical Society), from which this book has its genesis. The first section reviews aspects of the historical foundations of our industry. The second section proceeds to examine the silicon material and device structures that are the foundation for state-of-the-art IC technology. The third section then presents perspectives of future directions for the nano-technology era. Interestingly, the authors do not anticipate that the current silicon materials/IC industry infrastructure will simply dissolve. The global captains of industry, in point of fact, would not allow this. Rather, the initial new applications in the nano-technology era may indeed come about via the integration and merging of new materials with (leading-edge) IC structures, forging new applications that may be presently envisioned, even as the IC industry drives towards the sub-10-nm physical MOSFET channel
length. It is the anticipation of what comes next, however, that will require our most creative perceptions and, most probably, will produce the greatest surprise(s).

Historical Background

Silicon and, more recently, related group-IV material systems such as silicon-germanium have been utilized for IC fabrication over the past ∼45 years or so. While silicon and group-IV material systems are anticipated to continue to be utilized in future IC products, the group III–V materials may also concurrently be adopted in order to achieve continued improvement in the device active channel characteristics and related IC performance. Robert Cahn presents an historical perspective of silicon and the silicon revolution in an enchanting introduction titled Silicon: Child and Progenitor of Revolution. The phenomenal growth of the IC industry is discussed in a decidedly upbeat fashion by Dan Hutcheson in The Economic Implications of Moore’s Law. Perhaps Gordon Moore described it best when he recently noted that “. . . you are once again reminded that this is no longer just an industry, but an economic and cultural phenomenon, a crucial force at the heart of the modern world.” Moore further noted that “no exponential is forever; but ‘forever’ can be delayed.” Indeed, we will depend on a new generation of research personnel to maintain and, perhaps, extend Moore’s law into the nano-technology world and the next group of big applications.

State-of-the-Art

The characterization, annihilation, and selective utilization of defects to achieve superior IC performance, yield, and reliability is a cornerstone of the IC industry. Because many of the phenomena discussed are structure-sensitive, the “process–structure–property” approach is used to describe the characteristics of modern electronic/opto-electronic ICs which utilize III–V compounds in conjunction with silicon (and germanium again).
Specifically, the fabrication process determines the material structure, which in turn determines the subsequent material properties and, therefore, the IC characteristics. Jim Chelikowsky notes that “computers built with silicon can be used to solve for the electronic properties of silicon itself.” Chelikowsky reviews these computational approaches from first principles in Using Silicon to Understand Silicon. Stefan Estreicher continues this first-principles study of point defects in silicon in Theory of Defects in Si: Past, Present and Challenges. These theoretical considerations, in combination with microscopic experiments, have led to an understanding of silicon that is unmatched by that of any other material studied in the technological era. The selective utilization of defects, as grown in the silicon crystal as well as process-induced during device/IC fabrication, and their mutual interactions, has achieved superior IC performance. Andrei Istratov, Tonio Buonassisi, and Eicke Weber pursue several aspects of these phenomena and, in particular, indicate the viability of such an approach for the rapidly expanding defect-engineered silicon photovoltaics initiative (with quantities of silicon usage fast approaching that of the IC industry) in Structural, Elemental, and Chemical Complex
Defects in Silicon and Their Impact on Silicon Devices. Materials science and engineering will continue to be critical, but it appears that the art and science wherein the properties of materials may be dictated not so much by what atoms the materials consist of (taking some liberty here) but rather by how they are arranged together will be the sine qua non of opto-electronic devices and circuits in the nano era. The theme of defects and their control may be further extended by realizing that the surface itself may be considered a giant defect, as noted by H.C. Gatos of M.I.T. and others in the 1960s. The characterization and control of the silicon surface is a fundamental requirement for stable device and IC characteristics. Martin Frank and Yves Chabal present our current understanding of surfaces and interfaces, as well as their unique position in silicon micro-electronics, in Surface and Interface Chemistry for Gate Stacks on Silicon. This section concludes with two device-focused articles. Patricia Mooney presents a summary of current trends in silicon-based nano-electronics – in particular the enhancement of carrier mobilities – in Enhanced Carrier Mobility for Improved CMOS Performance. The use of variously configured, sequential compositions and combinations of strained silicon-germanium (utilizing carbon as appropriate) to produce strain at the silicon channel surface for various MOSFET configurations permits electron and hole mobilities higher than predicted by the universal mobility curves. Further materials opportunities are noted, wherein an NMOS [PMOS] transistor exhibits optimal electron [hole] mobility for the (100) [(110)] silicon wafer orientation (in the ⟨110⟩ direction for both surfaces). Methods of fabricating substrates to enhance both NMOS and PMOS performance are described as hybrid orientation technology (HOT) and simplified hybrid orientation technology (SHOT).
Finally, Tsu-Jae King Liu and Leland Chang discuss a host of silicon-based advanced transistor structures and associated materials, based on the conventional “top-down” IC fabrication methodology, in Transistor Scaling to the Limit. They note that these efforts are expected to extend the ITRS to a physical channel length in the single digits of nanometers, consistent with IC leakage current, power-supply voltage, and power-delay product specifications.

Future Directions

The final section of this monograph covers several evolving opportunities for future nano-technology. Ted Kamins discusses the alternative “bottom-up” approach for device fabrication in the nano-world in Beyond CMOS Electronics: Self-Assembled Nanostructures. Here we see the concept of “self-assembly,” introduced by way of an example in the fabrication of in-plane nanowires (5 nm in diameter by several hundred nm in length) for connections between active circuit components to enhance IC performance. Indeed, we are still basically using silicon and its myriad fabrication process technologies in conjunction with the self-assembly concept. Mircea R. Stan, Garrett S. Rose, and Matthew M. Ziegler then discuss Hybrid CMOS/Molecular Integrated Circuits. The authors look to further the pervasiveness of silicon technology by “piggy-backing” the nano-technology world onto the ever-shrinking IC devices on a chip. In these initial nano-technology applications,
the authors suggest that the (nano) molecular assembled structure will be electrically connected to the upper surface of a programmable logic array (PLA), based on majority-carrier logic, with appropriate wiring schemas. It is anticipated that a nano-technology single-electron transistor can be operated in conjunction with a CMOS logic IC at room temperature (an extremely important requirement), thereby enhancing the performance of advanced logic CMOS devices beyond what they could achieve on their own. Delving further into the nano-world, Andre DeHon notes that at the current stage of the micro/nano-electronics revolution, we no longer have the orders-of-magnitude difference between the size of the IC and the constituent atoms that previously allowed the crafting of large collections of atoms into “perfect” devices. Accordingly, Andre notes that circuit designers and architects now need to take some of the responsibility for dealing with truly atomic-scale imperfections and uncertainty in Sublithographic Architecture: Shifting the Responsibility for Perfection. Finally, David P. DiVincenzo discusses Quantum Computing. Besides the potential realization of qubits (quantum bits) in Josephson junction circuits and ion traps, the author discusses the role of semiconductor quantum dots. He notes that III–V heterostructures might indeed facilitate the fabrication of a quantum computer. Interestingly, the scientific literature is also discussing the utilization of an isolated silicon double quantum dot as a qubit. The author notes at the end of his article: “It may be hoped that in ten years the details of this chapter will be thoroughly obsolete, and completely new and unanticipated effects will have been seen and controlled in such a way that it makes the path to a quantum computer clear.
We will see.” Indeed, we shall see more clearly as we enter the nano-technology era to identify the next big technologies that can be wrought from the nano-world for the betterment of humankind. Finally, we are fortunate to have four additional brief contributions to the monograph rounding out this perspective of Into The Nano Era: Moore’s Law Beyond Planar Silicon CMOS. Fred Seitz leads off with a brief introductory comment, Silicon and Electronics, about the evolution of electronics over the past 75 years. This is followed by Nick Holonyak’s reflections on Silicon and The III–V’s: Semiconductor Electronics (Electron, Hole, and Photon) Forever. We conclude with two afterwords by the Nobel Prize awardees Herb Kroemer and Horst Stormer. Herb Kroemer’s contribution is titled Nano-Whatever: Do We Really Know Where We Are Heading?, reprinted from Phys. Stat. Sol. (a) 202, No. 6, 957–964 (2005). Horst Stormer’s afterword is titled Silicon Forever! Really?, reprinted from Solid-State Electronics 50, No. 4, 516–519 (2006). Clearly, we will all benefit from these colleagues sharing their perspectives with us as we enter the nano-technology era.

Acknowledgments

We appreciate the contributions of Len Feldman and Konstantin Likharev for their participation in the earlier version of this endeavor, published by The Electrochemical Society in the Spring 2005 issue (14, No. 1) of INTERFACE. We also appreciate Mary Yess, Deputy Executive Director of The Electrochemical Society, for her
fine assistance during the development of the original INTERFACE issue. Appreciation and thanks are due to Claus Ascheron, Physics Editorial IV, Executive Editor Physics, and Ms. Adelheid Duhm, Associate Editor Physics, of Springer-Verlag for their strong interest and guidance throughout the development and production of this book. Finally, this book is dedicated to Robert Cahn and Fred Seitz, both of whom passed away during the preparation of Into the Nano Era: Moore’s Law Beyond Planar Silicon CMOS. Henderson, NV, April 2008
Howard R. Huff
Contributors’ Acknowledgement
This volume would not have come into existence without the unrelenting efforts of Howard Huff and his dedicated wife Helen. Howard Huff succeeded in describing in the Preface the whole history of the semiconductor revolution of the last 60 years, but he left out one important person: Huff, as he likes to be called, himself. First, he promoted the field through his own research at Fairchild and, later, National Semiconductor. Later still, his work had even more impact on the field; we can say that Huff’s contributions were key for the Si IC technology following the (accelerated) Moore’s law. This became especially obvious during his years at SEMATECH, where he was instrumental in establishing the ITRS Si roadmap. He went on to make sure that regular, very carefully worked-out updates were created, mainly by consensus of the members of the different task teams. This careful work made the Si roadmap immensely valuable for all planning processes, especially those of the equipment supplier industry. Recently, his interest has been focused specifically on issues related to the gate stack and the search for new gate dielectrics. In addition, he was responsible for the big Si conferences organized every four years for more than 40 years with The Electrochemical Society, the proceedings of which contain an impressive body of knowledge in the field of Si materials science and technology and continue to be frequently cited. Huff kept saying in the last decade of the last millennium that the Si roadmap was only laid out until 2010, as he would be dead afterwards and thus did not care beyond that date. Well, Huff, in this single instance we might prove you wrong: we expect to have you with us beyond 2010, and the ITRS Si roadmap, of course, now stretches far beyond 2010.
It has been a special honor for all of us to work closely with Huff in this book project, and we look forward to exciting initiatives in this important field from Huff in the years to come! Tonio Buonassisi, Patricia Cahn, Yves J. Chabal, Leland Chang, Jim Chelikowsky, Andre DeHon, David P. DiVincenzo, Stefan K. Estreicher, Martin M. Frank, Nick Holonyak, Jr., Dan Hutcheson, Andrei Istratov, Ted Kamins, Herbert Kroemer, Tsu-Jae King Liu, Patricia M. Mooney, Garrett S. Rose, Fred Seitz, Mircea R. Stan, Horst L. Stormer, Eicke R. Weber, Matthew M. Ziegler
Contents
Foreword: Silicon and Electronics . . . . . vii
Foreword: Silicon and the III–V's: Semiconductor Electronics (Electron, Hole, and Photon) Forever . . . . . ix
Preface . . . . . xiii
List of Contributors . . . . . xxvii

Part I Historical Background

1 Silicon: Child and Progenitor of Revolution
R.W. Cahn† . . . . . 3
References . . . . . 9
2 The Economic Implications of Moore's Law
G.D. Hutcheson . . . . . 11
2.1 Introduction . . . . . 11
2.2 Moore's Law: A Description . . . . . 12
2.3 The History of Moore's Law . . . . . 12
2.4 The Microeconomics of Moore's Law . . . . . 23
2.5 The Macroeconomics of Moore's Law . . . . . 30
2.6 Moore's Law Meets Moore's Wall: What Is Likely to Happen . . . . . 32
2.7 Conclusion . . . . . 35
Appendix A . . . . . 36
References . . . . . 38
† Deceased.
Part II State-of-the-Art

3 Using Silicon to Understand Silicon
J.R. Chelikowsky . . . . . 41
3.1 Introduction . . . . . 41
3.2 The Electronic Structure Problem . . . . . 42
3.2.1 The Empirical Pseudopotential Method . . . . . 42
3.2.2 Ab Initio Pseudopotentials and the Electronic Structure Problem . . . . . 47
3.3 New Algorithms for the Nanoscale: Silicon Leads the Way . . . . . 50
3.4 Optical Properties of Silicon Quantum Dots . . . . . 52
3.5 Doping Silicon Nanocrystals . . . . . 55
3.6 The Future . . . . . 58
References . . . . . 58
4 Theory of Defects in Si: Past, Present, and Challenges
S.K. Estreicher . . . . . 61
4.1 Introduction . . . . . 61
4.2 From Empirical to First-Principles . . . . . 63
4.3 First-Principles Theory . . . . . 66
4.4 First-Principles Theory at Non-zero Temperatures . . . . . 70
4.5 Discussion . . . . . 73
References . . . . . 74
5 Structural, Elemental, and Chemical Complex Defects in Silicon and Their Impact on Silicon Devices
A.A. Istratov, T. Buonassisi, E.R. Weber . . . . . 79
5.1 Introduction . . . . . 79
5.2 Defect Interactions in Single-Crystalline Silicon . . . . . 80
5.3 Precipitation Behavior, Chemical State, and Interaction of Copper with Extended Defects in Single-Crystalline and Multicrystalline Silicon . . . . . 84
5.4 Precipitation Behavior, Chemical State, and Interaction of Iron with Extended Defects in Silicon . . . . . 91
5.5 Pathways for Metal Contamination in Solar Cells . . . . . 96
5.6 Effect of Thermal Treatments on Metal Distributions and on Device Performance . . . . . 99
5.7 Discussion: Chemical States of Metals in mc-Si . . . . . 101
5.8 Discussion: Interactions between Metals and Structural Defects . . . . . 104
5.9 Discussion: Engineering of Metal-Related Nanodefects by Altering the Distributions and Chemical States of Metals in mc-Si . . . . . 106
5.10 Summary and Conclusions . . . . . 108
References . . . . . 109
6 Surface and Interface Chemistry for Gate Stacks on Silicon
M.M. Frank, Y.J. Chabal . . . . . 113
6.1 Introduction: The Silicon/Silicon Oxide Interface at the Heart of Electronics . . . . . 113
6.2 Current Practices and Understanding of Silicon Cleaning . . . . . 115
6.2.1 Introduction . . . . . 115
6.2.2 Silicon Cleans Leading to Oxidized Silicon Surfaces . . . . . 116
6.2.3 Si Cleans Leading to Hydrogen-Terminated Silicon Surfaces . . . . . 124
6.2.4 Microscopic Origin of Silicon Oxidation . . . . . 136
6.2.5 Initial Oxidation of Hydrogen-Terminated Silicon . . . . . 137
6.3 High-Permittivity ("High-k") Gate Stacks . . . . . 147
6.3.1 Introduction . . . . . 147
6.3.2 Silicon Surface Preparation and High-k Growth: The Impact of Thin Oxide Films on Nucleation and Performance . . . . . 148
6.3.3 Post-Treatment of the High-k Layer: Nitridation . . . . . 156
6.3.4 The pFET Threshold Voltage Issue: Oxygen Vacancies . . . . . 157
6.3.5 Threshold Voltage Control: Oxygen and Metal Ions . . . . . 158
6.4 Conclusion . . . . . 161
References . . . . . 161
7 Enhanced Carrier Mobility for Improved CMOS Performance
P.M. Mooney . . . . . 169
7.1 Introduction . . . . . 169
7.2 Enhanced Carrier Mobility in Si under Biaxial Tensile Strain . . . . . 169
7.2.1 Devices . . . . . 170
7.2.2 Strain-Relaxed SiGe Buffer Layers . . . . . 171
7.2.3 SGOI and SSOI Substrates . . . . . 175
7.2.4 Defect-Free (Elastic) Strain Relaxation . . . . . 178
7.3 Enhanced Hole Mobility via Biaxial Compressive Strain . . . . . 181
7.4 Other Methods to Increase Carrier Mobility for Si CMOS Applications . . . . . 183
7.4.1 Hybrid Crystal Orientation . . . . . 183
7.4.2 Uniaxial Strain . . . . . 184
7.5 Summary . . . . . 185
References . . . . . 186

8 Transistor Scaling to the Limit
T.-J.K. Liu, L. Chang . . . . . 191
8.1 Introduction . . . . . 191
8.2 Planar Bulk MOSFET Scaling . . . . . 193
8.3 Thin-Body Transistor Structures . . . . . 196
8.3.1 Ultra-Thin Body (UTB) MOSFET . . . . . 197
8.3.2 Double-Gate (DG) MOSFET . . . . . 199
8.3.3 Tri-Gate (TG) MOSFET . . . . . 205
8.3.4 Back-Gated (BG) MOSFET . . . . . 205
8.4 Fundamental Scaling Limit and Ultimate MOSFET Structure . . . . . 207
8.5 Advanced Gate-Stack Materials . . . . . 209
8.5.1 High-k Gate Dielectrics . . . . . 209
8.5.2 Metallic Gate Electrode Materials . . . . . 210
8.6 Performance Enhancement Approaches . . . . . 213
8.6.1 Enhancement of Carrier Mobilities . . . . . 213
8.6.2 Reduction of Parasitic Components . . . . . 215
8.6.3 Alternative Switching Devices . . . . . 216
8.7 Summary . . . . . 216
References . . . . . 217
Part III Future Directions

9 Beyond CMOS Electronics: Self-Assembled Nanostructures
T.I. Kamins . . . . . 227
9.1 Introduction . . . . . 227
9.1.1 Conventional "Top-Down" Fabrication . . . . . 227
9.1.2 "Bottom-Up" Fabrication . . . . . 228
9.2 Strain-Induced Nanostructures . . . . . 229
9.3 Metal-Catalyzed Nanowires . . . . . 235
9.3.1 Catalyst Nanoparticles . . . . . 235
9.3.2 Nanowire Growth . . . . . 238
9.3.3 Germanium and Compound-Semiconductor Nanowires . . . . . 241
9.3.4 Doping Nanowires . . . . . 243
9.3.5 Connecting Nanowires . . . . . 244
9.3.6 Comparison of Semiconducting Nanowires and Carbon Nanotubes . . . . . 250
9.4 Potential Applications of Metal-Catalyzed Nanowires . . . . . 251
9.4.1 Field-Effect Transistors . . . . . 251
9.4.2 Field-Effect Sensors . . . . . 252
9.4.3 Interconnections . . . . . 252
9.5 Summary . . . . . 253
References . . . . . 254

10 Hybrid CMOS/Molecular Integrated Circuits
M.R. Stan, G.S. Rose, M.M. Ziegler . . . . . 257
10.1 Introduction . . . . . 257
10.1.1 Top-Down Fabrication vs. Bottom-Up Assembly . . . . . 257
10.1.2 Typical Molecular Device Characteristics . . . . . 258
10.2 MolMOS: Integrating CMOS and Nanoelectronics . . . . . 259
10.2.1 The CMOS/Nano Interface . . . . . 260
10.2.2 CMOS/Nano Co-design . . . . . 262
10.3 The Crossbar Array for Molecular Electronics . . . . . 264
10.3.1 Molecular Memory Structures . . . . . 265
10.3.2 Programmable Logic via the Crossbar Array . . . . . 267
10.3.3 Signal Restoration at the Nanoscale: The Goto Pair . . . . . 268
10.3.4 Hysteresis and NDR based Devices in Programmable Logic . . . . . 270
10.4 MolMOS Architecture . . . . . 272
10.4.1 The CMOS Interface & I/O Considerations . . . . . 272
10.4.2 Augmenting the PMLA with CMOS . . . . . 272
10.4.3 Array Access for Programmability . . . . . 273
10.4.4 A More Complete Picture of the Overall Architecture . . . . . 274
10.5 Circuit Simulation of MolMOS System . . . . . 276
10.5.1 Device Modeling for Circuit Simulation . . . . . 276
10.5.2 Functional Verification of a Stand-Alone Nanoscale PMLA . . . . . 276
10.6 Conclusions and Future Directions . . . . . 278
References . . . . . 279
11 Sublithographic Architecture: Shifting the Responsibility for Perfection
A. DeHon . . . . . 281
11.1 Revising the Model . . . . . 281
11.2 Bottom-Up Feature Definition . . . . . 282
11.3 Regular Architectures . . . . . 283
11.4 Statistical Effects Above the Device Level . . . . . 283
11.4.1 Defect and Variation Tolerance . . . . . 283
11.4.2 Differentiation . . . . . 284
11.5 NanoPLA Architecture . . . . . 285
11.6 Defect Tolerance . . . . . 288
11.6.1 Wire Sparing . . . . . 288
11.6.2 Crosspoint Defects . . . . . 290
11.6.3 Variations . . . . . 291
11.6.4 Roundup . . . . . 291
11.7 Testing and Configuration . . . . . 292
11.8 New Abstraction Hierarchy . . . . . 293
11.8.1 Lessons from Data Storage . . . . . 293
11.8.2 Abstraction Hierarchy for Computation . . . . . 293
11.9 Conclusions . . . . . 295
References . . . . . 295
12 Quantum Computing
D.P. DiVincenzo . . . . . 297
12.1 What Is Quantum Computing? . . . . . 297
12.2 History . . . . . 298
12.3 Fundamentals . . . . . 299
12.4 Quantum Algorithms . . . . . 301
12.5 Realizing a Quantum Computer . . . . . 303
12.6 Physical Implementations . . . . . 306
12.6.1 Josephson Junction Circuits . . . . . 306
12.6.2 Semiconductor Quantum Dots . . . . . 308
12.6.3 Ion Traps . . . . . 311
12.6.4 Outlook . . . . . 312
References . . . . . 312

Part IV Afterwords

13 Nano-Whatever: Do We Really Know Where We Are Heading? Phys. Stat. Sol. (a) 202(6), 957–964 (2005)
H. Kroemer . . . . . 317
13.1 Introduction: "Nano-Talk = Giga-Hype?" . . . . . 317
13.2 From Physics and Technology to New Applications . . . . . 317
13.2.1 Kroemer's Lemma of New Technology . . . . . 317
13.2.2 Three Examples . . . . . 318
13.2.3 Lessons . . . . . 319
13.3 Roots of Nano-Technology . . . . . 320
13.4 Back to the Future: Beyond a Single Degree of Quantization . . . . . 320
13.4.1 Quantum Wires . . . . . 320
13.4.2 Quantum Dots . . . . . 321
13.5 More Challenges . . . . . 322
13.5.1 Lithography Alternatives for the Nanoscale . . . . . 322
13.5.2 "Loose" Nanoparticles . . . . . 322
13.6 "Other" Quantization Effects . . . . . 323
13.6.1 Charge Quantization and Coulomb Blockade . . . . . 323
13.6.2 Magnetic Flux Quantization . . . . . 323
13.6.3 Spintronics . . . . . 324
13.7 Meta-Materials . . . . . 324
13.8 Research vs. Applications Re-visited . . . . . 325
13.9 Conclusion . . . . . 326
References . . . . . 326
14 Silicon Forever! Really? Solid-State Electr. 50(4), 516–519 (2006)
H.L. Stormer . . . . . 327
14.1 Motivation . . . . . 327
14.2.1 The End of Scaling . . . . . 327
14.2.2 The "Beginning" of Architecture . . . . . 328
14.2.3 Silicon Stands Tall . . . . . 329
14.2.4 The Silicon Wart . . . . . 330
14.2.5 Beyond Lithography . . . . . 331
14.3 Conclusion . . . . . 332
14.4 Acknowledgements and Disclaimer . . . . . 333
14.5 Citations . . . . . 333
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335
List of Contributors
Tonio Buonassisi
Assistant Professor of Mechanical Engineering
Massachusetts Institute of Technology
77 Massachusetts Avenue, 35-213
Cambridge, MA 02139, USA
[email protected]

Robert W. Cahn†

Yves J. Chabal
Department of Materials Science and Engineering
University of Texas at Dallas
800 W. Campbell Rd., M/S RL10
Richardson, TX 75080, USA
[email protected]

Leland Chang
Manager, Design and Technology Solutions
IBM T.J. Watson Research Center
P.O. Box 218
Yorktown Heights, NY 10598, USA
[email protected]

James R. Chelikowsky
Institute for Computational Engineering and Sciences (ICES) (C0200)
ACES Building, Room 4.324
201 East 24th Street
ACES 1 University Station
University of Texas at Austin
Austin, TX 78712, USA
[email protected]

Andre DeHon
Department of Electrical and Systems Engineering
University of Pennsylvania
200 S. 33rd Street
Philadelphia, PA 19104, USA
[email protected]

David P. DiVincenzo
IBM T.J. Watson Research Center
P.O. Box 218
Yorktown Heights, NY 10598, USA
[email protected]

Stefan K. Estreicher
Paul Whitfield Horn Professor
Physics Department - MS 1051
Texas Tech University
Lubbock, TX 79409, USA
[email protected]

Martin M. Frank
IBM T.J. Watson Research Center
Room 5-117, 1101 Kitchawan Road
Yorktown Heights, NY 10598, USA
[email protected]

Garrett S. Rose
Dept. of ECE, Polytechnic University
Five MetroTech Center
Brooklyn, NY 11021, USA
[email protected]

Dan Hutcheson
VLSI Research, Inc.
2880 Lakeside Drive, Suite 350
Santa Clara, CA 95054, USA
[email protected]

Fred Seitz
President Emeritus, Rockefeller University
New York, NY 10065, USA

Andrei Istratov
9635 NW Shadywood Ln.
Portland, OR 97229, USA
[email protected]

Ted Kamins
Hewlett-Packard Laboratories
1501 Page Mill Road - M/S 1123
Palo Alto, CA 94304-1100, USA
[email protected]

Herbert Kroemer
ECE Department
University of California
Santa Barbara, CA 93106-9560, USA
[email protected]

Tsu-Jae King Liu
42063 Benbow Drive
Fremont, CA 94539, USA
[email protected]

Patricia M. Mooney
Professor and Canada Research Chair in Semiconductor Physics
Physics Department, Simon Fraser University
8888 University Drive
Burnaby, BC V5A 1S6, Canada
[email protected]

Mircea R. Stan
Dept. of ECE, University of Virginia
351 McCormick Road
Charlottesville, VA 22904, USA
[email protected]

Horst L. Stormer
Dept. of Physics, Columbia University
704 Pupin Hall (Mail Code 5255)
538 West 120th Street
New York, NY 10027, USA
[email protected]

Eicke R. Weber
Fraunhofer-Institut für Solare Energiesysteme ISE
Heidenhofstr. 2
79110 Freiburg, Germany
[email protected]

Matthew M. Ziegler
IBM T.J. Watson Research Center
1101 Kitchawan Road, P.O. Box 218
Yorktown Heights, NY 10598, USA
[email protected]

Part I
Historical Background
1 Silicon: Child and Progenitor of Revolution
R.W. Cahn
Antoine Lavoisier, the pioneering French chemist who (together with Joseph Priestley in England) identified oxygen as an element and gave it its name, concluded in 1789 that quartz was probably a compound of an as-yet-undiscovered but presumably extremely common element. That was also the year in which the French Revolution broke out. Five years later, the Jacobins accused Lavoisier of offences against the people and cut off his head, thereby nearly cutting off the new chemistry as well. It was not until 1824 that Jöns Berzelius in Sweden succeeded in confirming Lavoisier's speculation by isolating silicon. Argument at once broke out among the scientific elite as to whether the newly found element was a metal or an insulator. It took more than a century to settle that disagreement decisively: as so often when all-or-nothing alternatives are fiercely argued, the truth turned out to be neither all nor nothing.

Silicon and oxygen are in fact the two most abundant elements in the earth's crust and are also very common in our galaxy. Why in particular is silicon so common? Our modern understanding of nucleosynthesis got under way at about the same time as the invention of the transistor. The great British astronomer Fred Hoyle in 1946 [1] took the first steps in working out how hydrogen first fused to generate helium and how multiple helium nuclei might then fuse to produce carbon, which in turn would fuse with further helium nuclei to generate progressively heavier elements (all of which astronomers simply call 'metals'). An apparently insurmountable energy barrier stood in the way of the combination of beryllium and helium to generate carbon; Hoyle proposed a possible way around this roadblock, and in one of the great triumphs of modern astronomy he joined with several American colleagues to prove in detail that this escape route was indeed correct [2]. The synthesis of elements up to silicon and iron proceeds in the interiors of stars at temperatures exceeding 10⁹ K.
Further nucleosynthesis, of heavier elements, mostly takes place in supernovae, which are hotter still. Silicon is one of the most stable elements against both fusion and fission, which is very appropriate for an element that has proved so crucial for humanity. Silicon
is in fact often used by astronomers as a reference standard when they estimate the cosmic abundances of the different elements.

Nucleosynthesis is today sometimes utilised for the improvement of semiconducting devices. The minority silicon isotope 30Si can be transmuted into 31P by bombardment at ambient temperature with thermal neutrons. This was first discovered by Lark-Horovitz in 1951 [3] and later applied to practical devices requiring extremely uniform phosphorus doping; the recent history of this approach, with its benefits and drawbacks, is set out by Wilkes [4].

Towards the end of the nineteenth century, silicon found a growing role as an alloying element for iron. The British metallurgist Robert Hadfield discovered some interesting properties in iron–silicon alloys with a few mass per cent of silicon and very little carbon. Systematic experiments at the end of the century by William Barrett in Dublin, Ireland, culminated in the single-phase iron–silicon alloys that have been used for transformer laminations for more than a century, saving significant money because transformers made with these alloys have very low core losses. The American metallurgist T.D. Yensen (who later introduced vacuum melting for these alloys) estimated as early as 1921 [5] that in the first 15 years of silicon–iron, the use of this alloy family had returned savings in electrical power generation and transmission sufficient to finance the building of the Panama Canal – and this was before the mastery of crystallographic textures further improved the performance of silicon–iron transformer laminations. This early use of silicon thus foreshadowed the extraordinary financial savings and untold applications that resulted from the introduction of transistors and integrated circuits half a century later. A detailed account of the development of silicon–iron was written by J.L. Walter of the GE (Central) Research Laboratory [6].

The electrical uses of silicon began hesitantly.
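For concreteness, the neutron transmutation doping mentioned above is radiative neutron capture followed by beta decay; the ≈2.62 h half-life of 31Si quoted here is a standard nuclear-data value rather than a figure from this chapter:

    30Si + n → 31Si + γ,    31Si → 31P + β⁻   (t½ ≈ 2.62 h)

Because thermal neutrons penetrate the whole ingot and the 30Si isotope is distributed homogeneously, the resulting phosphorus is far more uniformly placed than diffusion or melt doping can achieve – the basis of the "extremely uniform phosphorus doping" referred to above.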
Crystal rectification, making use of cat's-whisker counter-electrodes, developed into the early detectors for wireless telegraphy, and coarse-grained silicon of merely "metallurgical-grade" purity (99%) was used until World War I, when vacuum tubes began to take over the role of detectors. According to a brilliant historical overview of electronic developments involving silicon [7], Jürgen Rottgardt in Germany in 1938 reported on extensive research into the possible use of cat's-whisker crystal rectifying junctions in the microwave region, which was becoming important for the incipient development of radar. Rottgardt concluded that the silicon–tungsten combination was particularly promising as a detector in this wavelength range. It was developed into a practical detector by Herbert Skinner in Britain during World War II, and independently by Russell Ohl and George Southworth at the Bell Telephone Laboratories in America. This approach gradually gained ground against the devotees of vacuum tubes because of its higher operating frequency; each advance in the field was fiercely resisted by the exponents of the preceding orthodoxy. Seitz and Einspruch [7] tell us that in 1941 Skinner wrote a bitter little poem, which included the words: "And so alone / we, fighting every inch of the way, / against those ingrained elephants of inertia / against. . . prejudice and hardened pride. . . / we fought (through forests thick with self-satisfaction) / to shorter electromagnetic wavelengths."
A proper understanding of the electrical properties of silicon was slow in coming. The term "semiconductor" appears to have been first used by Alessandro Volta in 1782. Humphry Davy in London, in 1821, first established that ordinary metals become poorer conductors as they heat up, while some years later Michael Faraday, working in the same laboratory as Davy, discovered a number of compounds that conducted electricity better as they became warmer. Attention soon focused on silver sulphide, Ag2S, which was thoroughly studied; this compound is today known to exhibit a semiconductor–metal transition. In the early days it proved impossible to obtain good reproducibility, and it became the orthodoxy that semiconductors must be impure to function as such and, ipso facto, were not respectable materials, because impurities necessarily vary from one sample to another. Until the end of the 1930s, most physicists looked down their noses at semiconductors and kept clear of them; some, like Wolfgang Pauli, expressed themselves in positively violent terms: a semiconductor, declared Pauli, is a "Schweinerei".
Sir Alan Herries Wilson, 1905–1995. Photograph by Godfrey Wilson, collection of the National Portrait Gallery, London, reproduced with permission
The man who changed all this was Alan Herries Wilson, a theoretical physicist in Cambridge who, as a young man, spent a sabbatical with Werner Heisenberg in Leipzig and applied the brand-new field of quantum mechanics to issues of electrical conduction, first in metals and then in semiconductors, as reported in two Royal Society papers in 1931 [8, 9]. When he returned to Cambridge, Wilson urged that attention be paid to germanium but, as he expressed it long afterwards in a retrospective essay [10], "the silence was deafening" in response. He was told that devoting attention to semiconductors, those messy entities, was likely to blight his career among physicists. He ignored these dire warnings, and in 1939 he brought out his famous book, Semi-conductors and Metals [11], which interpreted semiconductor properties, including the much-doubted phenomenon of intrinsic semiconduction, in terms of electronic energy bands. His academic career does indeed seem to have been blighted: despite his intellectual distinction, he was not promoted in Cambridge. At the end of World War II he abandoned his university functions (a cousin of mine was his last research student) and embarked on a long and notably successful career as an industrialist, culminating in the post of chief executive of a leading pharmaceutical company. He kept clear of electronics.

It was only in the 1940s that n- and p-type domains in silicon were observed and their nature identified, by the metallurgists Jack Scaff and Henry Theuerer at the Bell Laboratories, collaborating with Ohl and Southworth. They determined that the sense of rectification in point-contact mode was opposite on either side of a p/n junction. Many years later, Scaff published an account of these early researches [12].

The recognition that the way forward for transistor technology lay in the use of single crystals did not come until the early 1950s.
Gordon Teal at the Bell Laboratories was the visionary who pushed this recognition through against fierce opposition. Teal, incidentally, was a devoted admirer of Wilson’s great book. The process of silicon crystal growth was enhanced by W.C. Dash in 1958/59 in a way that got rid of almost all dislocations and their associated electrical effects. The role of various defects, including dislocations, and more generally the role of materials science in microelectronics “past, present and future”, has been surveyed by Mahajan [13, 14]. The other recognition that came in the 1950s was the imperative need for extreme purity, in germanium and in silicon. True, the material for transistors had to be doped to create controlled n- and p-type domains, but such doping only worked if it was applied to ultrapure starting material. In those early days, the essential approach was zone-refining, invented at the Bell Laboratories by a chemical engineer, William Pfann: it involved the passage of successive narrow molten zones along an ingot, gradually sweeping impurities to one end. For a decade at least, zone-refining was the inescapable technique for achieving ultrapure, crystalline germanium, at a time when this semiconductor was the material of choice for transistors. However, this technique was not applicable to silicon, owing to its reactivity with the walls of the zone-refining chamber at silicon’s melting point of 1414 °C. For silicon, thereafter, chemical purification using silicon halides and silane was used. It seems that zone-refining is still used today with germanium intended for radiation detectors. Students of electronics today may not sufficiently appreciate the importance of zone-refining, without which the age of solid-state electronics, including microcircuits and nanocircuits, would have been substantially delayed. The developments which I have very concisely sketched here are beautifully treated at length in an outstanding book by Riordan and Hoddeson [15].

After the long years during which semiconductors, including silicon, were widely held in contempt, silicon has now become the most studied element in the periodic table, having overtaken iron nearly 40 years ago. The physics, chemistry and processing technology of silicon captivate a ceaseless procession of highly skilled scientists and engineers. The methods developed for shaping silicon monocrystals on an ultrafine scale, making use of controlled etching, oxidation and vacuum deposition, have recently led to some unexpected applications. The whole field of microelectromechanical systems (MEMS) is based on this technology; materials issues in MEMS have recently been reviewed [16]. MEMS already has some mass applications, including acceleration sensors for automotive airbags and tire monitoring systems, but the newest uses have some way to go before mass application. A recent study describes the microfabrication of a high-pressure bipropellant rocket engine, starting with a stack of single-crystal silicon wafers. The engine, weighing 1.2 g and generating just 1 N of thrust at a thrust power rating of 750 W, might be used “on future generations of spacecraft including microsatellites and very small launch vehicles” and be used for “servicing existing satellites” [17]. A parallel study describes the design of a silicon micro-turbo-generator [18]. Such a “micro-engine” is intended to be able to produce 50 W of electrical power in a device measuring less than a cubic centimetre while consuming 7–8 g of jet fuel per hour; it would achieve more than ten times the power and energy density of current batteries, at a reasonable cost.
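The zone-refining mechanism described above can be made quantitative. The text does not give the equation, but the standard single-pass result (due to Pfann) for a uniform starting concentration C₀, an effective distribution coefficient k (< 1 for impurities rejected by the growing solid) and a molten-zone length l is C(x) = C₀[1 − (1 − k)e^(−kx/l)]. A minimal sketch, with illustrative (assumed) numbers:

```python
import math

def single_pass_profile(x, c0, k, zone_len):
    """Impurity concentration at position x after one zone pass.

    Standard Pfann single-pass result, assuming a uniform starting
    concentration c0, a constant effective distribution coefficient k
    (< 1 for impurities swept toward the far end of the ingot) and a
    molten-zone length zone_len. Valid for x up to (ingot length -
    zone length); the last zone solidifies by normal freezing instead.
    """
    return c0 * (1.0 - (1.0 - k) * math.exp(-k * x / zone_len))

# Illustrative, assumed values: k = 0.1, a 5 cm zone, c0 normalized to 1.
c0, k, zone_len = 1.0, 0.1, 5.0
for x_cm in (0, 10, 50, 100):
    print(x_cm, round(single_pass_profile(x_cm, c0, k, zone_len), 4))
```

At the starting end the concentration drops to k·C₀ after a single pass, and the swept-out impurities pile up toward the far end; repeated passes were how Pfann reached the extreme purities germanium required.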
As a contribution to this form of design, the fracture strength of silicon on a very fine scale has been systematically examined in relation to factors such as the etching technique used for shaping MEMS [19]. Silicon is used in these futuristic designs not because, in mechanical engineering terms, it is the ideal material (it clearly is not), but because it can be shaped with the extreme precision needed, using techniques perfected in the microelectronics industry. I have pointed out that silicon is sometimes used as a reference element in assessing the abundances of different elements in the galaxy. This is by no means the only such use of silicon. As long ago as 1956, the International Union of Crystallography resolved to organise a project on the precision measurement of lattice parameters. Sixteen laboratories worldwide took part, all measuring the same batches of silicon and tungsten powders, and mostly using photographic diffraction methods; the results were published in 1960 [20]. Agreement was only about one part in 10⁴, including random and systematic errors. This was disconcertingly poor. Thirty years later [21], techniques had greatly improved, and in fact a silicon powder known unromantically as SRM640B was certified by the National Bureau of Standards (renamed the National Institute of Standards and Technology) to have a lattice parameter determined over 100 times more precisely than the 1960 measurement. In the interim, another completely different approach to measuring the lattice parameter of silicon, this time
in the form of a single crystal, was invented by Bonse and Hart [22]. This made use of an X-ray interferometer which came to be known as an “Ångström ruler” [23]. The device is cut from a single, highly perfect silicon crystal. X-ray interference produces a series of fringes, the spacing of which is sampled by a separate, backlash-free moving crystal whose motion is measured by means of an optical (light) interferometer. The X-ray wavelength does not need to be known. The outcome is that the lattice parameter of silicon can be measured in terms of an optical wavelength, which is in fact the modern international length standard, and thereby single-crystal silicon became a reliable secondary length standard. The latest projected use of electronic-grade silicon, the most unexpected of all, is as a tool in one of the last great unsolved problems in metrology: the science of ultimate standards. Of the seven base units of the International System of Units (the SI) – the meter, kilogram, second, ampere, kelvin, mole and candela – only the kilogram is still defined in terms of a material: the standard kilogram, made of platinum–iridium alloy and kept under conditions of extraordinary care in a vault in Paris. Three of the base units, the ampere, mole and candela, require reference to the kilogram. Metrologists the world over are now engaged in an extremely demanding research program to replace the metal standard with another standard based upon an “invariant of nature”. Two alternative approaches are being examined: the watt balance and the X-ray crystal density (XRCD) method using silicon. The watt balance involves balancing the gravitational pull on a metal mass against an electromagnetic force derived from a coil immersed in a magnetic field: the outcome is to relate the kilogram to Planck’s constant, h. The XRCD method relies on measuring a large-scale mass in terms of the mass of a silicon atom. This method is deeply linked to Avogadro’s constant.
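The Ångström-ruler principle described above reduces to fringe counting: over one common stage displacement L, the optical interferometer counts m fringes of a laser of known wavelength λ (one fringe per λ/2 of travel) while the X-ray interferometer counts n fringes (one per lattice spacing d), so d = (m/n)(λ/2) with no knowledge of the X-ray wavelength. A sketch of just this arithmetic, with assumed numbers (a He–Ne wavelength and the approximate Si (220) spacing, used here only to simulate the counts):

```python
# Idealized fringe-count arithmetic behind the "Angstrom ruler".
# The same displacement L satisfies L = n*d = m*(lam/2), so the
# lattice spacing d follows from the two fringe counts alone.
lam = 632.9914e-9        # He-Ne laser wavelength in m (assumed value)
d_true = 192.0156e-12    # approx. Si (220) spacing in m, only to fake counts

L = 1.0e-3               # 1 mm of common travel (assumed)
n = L / d_true           # X-ray fringes counted (idealized, fractional)
m = L / (lam / 2.0)      # optical fringes counted

d_measured = (m / n) * (lam / 2.0)   # recovered lattice spacing
print(d_measured)
```

As written this is circular (the counts are simulated from d_true); it is meant only to show why counting two sets of fringes ties a lattice spacing directly to the optical length standard.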
Silicon has been chosen because the microelectronics industry has shown how to make a monocrystal of unique perfection and purity. Such a crystal is shaped into a sphere weighing nominally one kilogram, and polished to a sphericity so perfect that if it were expanded to the size of the earth, the highest mountain would be about 7 meters in height. The diameter of the sphere is measured by laser interferometry, and the number of atoms in the sphere is deduced from the lattice parameter, itself measured by means of X-ray interferometry (the Ångström ruler introduced above). The benefit of using a “perfect” sphere is that a single size measurement (the diameter) suffices to determine the volume of the crystal. The essentials of both methods, the watt balance and XRCD, and the implications of each for metrology, are set out accessibly in a recent article [24]. These implications are analysed in great depth in two very recent papers [25, 26]; the first is titled “Redefinition of the kilogram: a decision whose time has come”. However, nobody in the metrology community appears to be willing just yet to express a positive preference between the two approaches – the matter is just too delicate. At present, there is a mysterious mismatch between the results of the two approaches [27]. Both methods have been examined in several countries; the chase is firmly international. Thus, the current silicon sphere has been produced in Australia. Scientists in Germany, USA, Britain, Belgium and Russia are engaged in this enterprise. The next objective is for a Russian team to produce a sufficient amount of silicon enriched to 99.99% in the majority isotope, ²⁸Si, to manufacture a sphere in which the atomic weight is known to a higher precision than for natural silicon. The hope is that such a sphere will reduce the uncertainty of measuring Avogadro’s constant (on the basis of the present standard kilogram) to better than one part in 10⁷, or alternatively (on the basis of the accepted value of Avogadro’s constant) allow a standard kilogram of silicon to be reproduced with this kind of precision. This whole approach is only possible because of the devoted labours of generations of microelectronics specialists. My title for this introduction – Silicon: Child and Progenitor of Revolution – indicates that while the identification of silicon in a certain sense derives from a political revolution, its modern study has generated one scientific revolution after another. These revolutions all stem from the world of microelectronics, which itself involves successive revolutions.
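The XRCD bookkeeping sketched above is simple arithmetic: a diamond-cubic unit cell of side a holds 8 silicon atoms, so a sphere of volume V contains 8V/a³ atoms, and with the molar mass M the Avogadro constant follows. A rough illustration with round, assumed values (density, lattice parameter and molar mass of natural silicon are all approximate here):

```python
import math

# X-ray crystal density (XRCD) arithmetic with approximate values.
rho = 2329.0         # density of silicon, kg/m^3 (assumed)
a = 543.102e-12      # cubic lattice parameter, m (approximate)
M = 28.0855e-3       # molar mass of natural Si, kg/mol (approximate)
m_sphere = 1.0       # nominally 1 kg sphere

V = m_sphere / rho                        # sphere volume from mass and density
d = (6.0 * V / math.pi) ** (1.0 / 3.0)    # diameter: about 93.6 mm for 1 kg
n_atoms = 8.0 * V / a**3                  # 8 atoms per diamond-cubic unit cell
N_A = n_atoms * M / m_sphere              # estimate of Avogadro's constant

print(d, n_atoms, N_A)
```

With these inputs the sphere comes out roughly 94 mm across and N_A lands near 6.02 × 10²³ per mole, which is exactly why a single diameter measurement plus the Ångström ruler can tie a macroscopic mass to a count of atoms.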
References

1. F. Hoyle, Mon. Not. R. Astron. Soc. 106, 343 (1946)
2. F. Hoyle, E.M. Burbidge, G.R. Burbidge, W. Fowler, Rev. Mod. Phys. 29, 547 (1957)
3. K. Lark-Horowitz, in Proc. Conf. Semiconducting Materials, Reading, UK (Butterworth, London, 1951), p. 47
4. J.G. Wilkes, in Processing of Semiconductors, ed. by K.A. Jackson. Materials Science and Technology (ed. by R.W. Cahn et al.), vol. 16 (VCH, Weinheim, 1996), p. 19
5. T.D. Yensen, Elec. J. (March 1921)
6. J.L. Walter, in The Sorby Centennial Symposium on the History of Metallurgy, ed. by C.S. Smith (Gordon and Breach, New York, 1965), p. 519
7. F. Seitz, N.G. Einspruch, Electronic Genie: The Tangled History of Silicon (University of Illinois Press, Urbana, 1998)
8. A.H. Wilson, Proc. R. Soc. Lond. A 133, 458 (1931)
9. A.H. Wilson, Proc. R. Soc. Lond. A 134, 277 (1931)
10. A.H. Wilson, Proc. R. Soc. Lond. A 371, 39 (1980)
11. A.H. Wilson, Semi-conductors and Metals (Cambridge University Press, Cambridge, 1939)
12. J.H. Scaff, Metall. Trans. 1, 561 (1970)
13. S. Mahajan, K.S. Sree Harsha, Principles of Growth and Processing of Semiconductors (McGraw-Hill, Boston, 1999)
14. S. Mahajan, Prog. Mat. Sci. 49, 487 (2004)
15. M. Riordan, L. Hoddeson, Crystal Fire: The Birth of the Information Age (W.W. Norton, New York, 1997)
16. S.M. Spearing, Acta Mater. 48, 179 (2000)
17. A.P. London, A.A. Ayón, A.H. Epstein, S.M. Spearing, T. Harrison, Y. Peles, J.L. Kerrebrock, Sens. Actuators A 92, 351 (2001)
18. K.-S. Chen, S.M. Spearing, N.N. Nemeth, AIAA J. 39, 720 (2001)
19. K.-S. Chen, A. Ayón, S.M. Spearing, J. Am. Ceram. Soc. 83, 1476 (2000)
20. W. Parrish, Acta Cryst. 13, 838 (1960)
21. M. Hart, R.J. Cernik, W. Parrish, H. Toraya, J. Appl. Cryst. 23, 286 (1990)
22. U. Bonse, M. Hart, Appl. Phys. Lett. 6, 155 (1965)
23. M. Hart, Brit. J. Appl. Phys. (J. Phys. D), Ser. 2, 1, 1405 (1968)
24. I. Robinson, Phys. World 17, 31 (May 2004)
25. I.M. Mills, P.J. Mohr, T.J. Quinn, B.N. Taylor, E.R. Williams, Metrologia 42, 71 (2005)
26. R.S. Davis, Philos. Trans. R. Soc. Lond. A 363, 2249 (2005)
27. P. Becker, H. Bettin, H.-U. Danzebrink, M. Gläser, U. Kuetgens, A. Nicolaus, D. Schiel, P. de Bièvre, S. Valkiers, P. Taylor, Metrologia 40, 271 (2003)
2 The Economic Implications of Moore’s Law

G.D. Hutcheson
2.1 Introduction

One hundred nanometers is a fundamental technology landmark: it is the demarcation point between microtechnology and nanotechnology. The semiconductor industry crossed it just after the second millennium had finished. In less than 50 years, the industry had gone from transistors made in mils (one-thousandth of an inch, or 25.4 microns), to integrated circuits popularized as microchips, and then, as the third millennium dawned, to nanochips. At this writing, nanochips are the largest single sector of nanotechnology. This is in spite of many a nanotechnology expert’s prediction that semiconductors would be dispatched to the dustbin of science, where tubes and core memory lie long dead. Classical nanotechnologists should not feel any disgrace, as pundits making bad predictions about the end of technology progression go back to the 1960s. Indeed, even Gordon Moore wondered, as he wrote his classic paper in 1965, whether his observation would hold into the 1970s. Semiconductors owe their amazing resilience to Moore’s law. To truly understand their greater impact, one must understand Moore’s law. Moore’s law is predicated on shrinking the critical features of the planar process: the smaller these features, the more bits that can be packed into a given area. The most critical feature size is the physical gate length, as shrinking it not only makes the transistor smaller, it makes it faster. But we are fast approaching the limits of what can be done by scaling. What changes are needed to keep the silicon miracle going, especially as we approach the nano era? This book examines these changes from a technical standpoint, because barriers to Moore’s law have always been overcome with new technology. However, these barriers are ultimately expressed economically and have important ramifications far beyond the industry itself. Moore’s law is not only an expression of a powerful engine for economic growth in the industry, but also for the economy as a whole.
This chapter reviews Moore’s law and the economic implications that it poses. It shows how the continuation of Moore’s law provides a foundation for future economic growth and, as such, sets the stage for a technical treatise on the nano era.
2.2 Moore’s Law: A Description

Looking back thirty years after Gordon E. Moore first published his observations which would become known as Moore’s law, he mused, “The definition of ‘Moore’s Law’ has come to refer to almost anything related to the semiconductor industry that when plotted on semi-log paper approximates a straight line” [1]. Indeed, this abuse of the meaning of Moore’s law has led to a great deal of confusion about what exactly it is. Simply put, Moore’s law [2] postulates that the level of chip complexity that can be manufactured for minimal cost is an exponential function that doubles in a fixed period of time. So for any given period, the optimal component density would be:

    Ct = 2 · Ct−1,    (2.1)

where Ct = component count in period t and Ct−1 = component count in the prior period. This first part would have been of little economic import had Moore not also observed that the minimal cost of manufacturing a chip was decreasing at a rate that was nearly inversely proportional to the increase in the number of components. Thus, the other critical part of Moore’s law is that the cost of making any given integrated circuit at optimal transistor density levels is essentially constant in time. So the cost per component, or transistor, is cut roughly in half for each tick of Moore’s clock:

    Mt = Mt−1 / 2,    (2.2)

where Mt = manufacturing cost per component in period t and Mt−1 = manufacturing cost per component in the prior period. These two functions have proven remarkably resilient over the years, as can be seen in Fig. 2.1.¹ The periodicity, or Moore’s clock cycle, was originally set forth as a doubling every year. In 1975, Moore gave a second paper on the subject. While the plot of data showed the doubling each year had been met, the integration growth for MOS logic was slowing to a doubling every year-and-a-half [3]. So in this paper he predicted that the rate of doubling would further slow to once every two years. He never updated this latter prediction. Between 1975 and 2006, the average rate between MPUs and DRAMs ran right at a doubling every two years.
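Iterating (2.1) and (2.2) on a two-year clock (the post-1975 cadence cited above) shows the compounding at work. A minimal sketch; the 1975 starting values here (65,000 components, a tenth of a cent per component) are assumed for illustration, not taken from the chapter:

```python
def moores_law(c0, m0, ticks):
    """Iterate Eqs. (2.1) and (2.2): each tick of Moore's clock
    doubles the component count C and halves the cost per component M."""
    c, m = c0, m0
    history = [(c, m)]
    for _ in range(ticks):
        c, m = 2 * c, m / 2.0   # C_t = 2 * C_{t-1},  M_t = M_{t-1} / 2
        history.append((c, m))
    return history

# Assumed starting point: 65,000 components in 1975 at $0.001 each,
# one tick every two years through 2005 (15 ticks).
hist = moores_law(65_000, 1e-3, 15)
c_final, m_final = hist[-1]
print(c_final)   # 65,000 * 2**15 = 2,129,920,000 components
print(m_final)   # 1e-3 / 2**15 dollars per component
```

Note that the product C·M is invariant under each tick, which is precisely the "essentially constant chip cost" half of the law.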
¹ The forces behind the law were still strongly in effect when Gordon Moore retired in 2001, leading him to quip to the author that “Moore’s law had outlived Moore’s career.”

Fig. 2.1. Five decades of Moore’s law

2.3 The History of Moore’s Law

Moore’s law is indelibly linked to the history of our industry and the economic benefits that it has provided over the years. Gordon Moore has tried repeatedly to dismiss the notion that it was a law, insisting instead that it was just an observation. It was actually Carver Mead who first called the relationship “Moore’s law.” Either way, the term became famous because Moore had proved amazingly perceptive about how technology would drive the industry and indeed the world. Moore’s observations about semiconductor technology are not without precedent. As early as 1887, Karl Marx, in predicting the coming importance of science and technology in the twentieth century, noted that for every question science answered, it created two new ones, and that the answers were generated at minimal cost in proportion to the productivity gains made [4]. His was an observation of its times, referring to mechanics, whose importance to the industrial age’s development had been largely questioned by economists up to that point [5] (much as the productivity gains of computers in the latter twentieth century are still debated today) [6]. More important was Marx’s observation that science and engineering had proved to be a reasonably predictable way of advancing productivity. Moreover, investments in science and engineering led to technology, which paid off in a way that grew economies, not just military might. Today, no one questions that science was at the heart of the industrial age, as it led to important inventions like the cotton gin, the steam engine, the internal combustion engine, and the fractional horsepower electric motor, to name a few. Nevertheless, it is the exponential growth of scientific “answers” that led to these, as well as to the invention of the transistor in 1947, and ultimately the integrated circuit in 1958, which led to
Moore’s observation that became known as a law, and in turn launched the information revolution.² The progress of science into the twentieth century would ultimately lead to the invention of the transistor, which is really where the story of the information revolution from the semiconductor perspective starts. Like all great inventions, it relied on the prior work of others. The solid-state amplifier was conceptualized by Julius Edgar Lilienfeld. He filed for a patent on October 8, 1926 and it was granted on January 28, 1930 (U.S. patent No. 1745175). While Lilienfeld didn’t use the term Field Effect Transistor (FET), Oskar Heil did in British Patent No. 439457, dated March 2, 1934. Heil became the first person to use the term semiconductor. While neither ever gained from these patents, these works established the basis of all modern-day MOS technology, even though neither author was cognizant of the concept of an inversion layer [7]. That is, the technology was not there to build it. Soon after World War II, figuring out how to make a solid-state switch would become the Holy Grail of research, as vacuum tube and electromechanical relay-based switching networks and computers were already proving too unreliable. John Bardeen, Walter Brattain and William Shockley were pursuing workable solid-state devices at Bell Labs in the late 1940s when Bardeen and Brattain invented the point-contact semiconductor amplifier (i.e., the point-contact transistor) on December 16, 1947 [7]. It was Brattain and Bardeen who discovered transistor action, not Shockley. Shockley’s contribution was to invent injection and the p–n junction transistor. Bardeen, Brattain and Shockley, nevertheless, all properly shared the 1956 Nobel Prize in physics. It was these efforts that would set the stage for the invention of the integrated circuit in 1958 and Moore’s observation seven years later.
The story of the integrated circuit centers on the paths of two independent groups, one at Fairchild and the other at Texas Instruments (TI), who in their collision created the chain reaction that created the modern semiconductor industry. It is more than a story of technology. It is a story about the triumph of human endeavor and the victory of good over bad management. It begins with the “Traitorous Eight” leaving Shockley Transistor in 1957 to start Fairchild Semiconductor (the eight were Julius Blank, Victor Grinich, Jean Hoerni, Eugene Kleiner, Jay Last, Gordon Moore, Robert Noyce and Sheldon Roberts). They had been frustrated at Shockley because they wanted to move away from the four-layer device (thyristor) that had been developed at Bell Labs, and use lithography and diffusion to build silicon transistors with what would be called the mesa process. Fairchild was the first company to specialize exclusively in making its transistors out of silicon. Their expertise for pulling this off was a rare balance: Bob Noyce and Jay Last on litho and etch, Gordon Moore and Jean Hoerni on diffusion, Sheldon Roberts on silicon crystal growing and Gene Kleiner on the financial side. The mesa process was so named because cross-sectional views of the device revealed the steep sides and flat top of a mesa (it mounted the base on top of the collector). Debuted in 1958, it was the immediate rage throughout the industry, because transistors could be uniformly mass-produced for the first time. But the mesa process would not survive, because its transistors were unreliable due to contamination problems. They were also costly due to their labor intensity, as the contacts were hand-painted. It was Jean Hoerni who – in seeking a solution to these problems – came up with the planar process, which diffused the base down into the collector. It was flat and it included an oxide passivation layer. So the interconnect between the base, emitter and collector could be made by evaporating aluminum (PVD) onto the oxide and etching it. This was a revolutionary step that, with the exception of the damascene process, is the basis for almost all semiconductor manufacturing today. It is so important that many consider Jean Hoerni the unknown soldier whose contributions were the real seed for the integrated circuit (IC). This is because the aluminum layer would make it easy to interconnect multiple transistors.

² These observations are often imitated as well. For example, Monsanto’s 1997 annual report proclaimed Monsanto’s Law, which is “the ability to identify and use genetic information is doubling every 12 to 24 months. This exponential growth in biological knowledge is transforming agriculture, nutrition, and health care in the emerging life sciences industry.” Its measure is the number of registered genetic base pairs, which grew from nil to almost 1.2 billion between 1982 and 1997. Magnetic memory has seen a similar parallel to Moore’s law as it shrinks the size of a magnetic pixel. Life sciences gains are a direct result of the increased modeling capability of ever more powerful computers. Magnetic memory’s gains are a direct result of chip manufacturing methodologies being applied to this field. Both are a direct result of the benefits gained from Moore’s law. Indeed, Paul Allen of Microsoft fame has credited his observation that there would be a need for increasingly powerful software as a direct result of learning about Moore’s law. He reasoned that this would be the outcome of ever more powerful chips and computers, and then convinced Bill Gates there was a viable future in software – something no major systems maker ever believed until it was too late.
The planar process was the basis for Fairchild’s early success and was considered so important that it was kept secret until 1960. Its process methodology was not revealed until after the IC had been invented. At the time, however, it was only viewed as an important improvement in manufacturing. The first work to actually interconnect transistors to build an IC was actually occurring halfway across the United States. Jack Kilby joined TI in May of 1958 and had to work through its mass vacation in July. A new employee, with no vacation time built-up, he was left alone to ruminate over his charge of working on microminiaturization. It was then that he came up with the idea of integrating transistors, capacitors and resistors onto a single substrate. It could have been a repeat of what happened at Shockley Semiconductor. Kilby’s bosses were skeptical. But instead of chasing him off, they encouraged him to prove his ideas. TI already had a mesa transistor on the market, which was made on germanium slices (TI used to call “die and wafers” “bar and slices”). Jack took one of these slices and cut it up into narrow bars (which may be why TI called chips “bars” versus the more commonly used word “die”). He then built an integrated phase-shift oscillator from one bar with a germanium mesa transistor on it and another with a distributed RC network. Both were bonded to a glass substrate and connected with a gold wire. He then built a flip-flop with multiple mesa transistors wire-bonded together, proving the methodology was universal in October of 1958. This was the first integrated circuit ever made. It was unveiled in March 1959 at the Institute of Radio Engineers show.
Back at Fairchild, during January of 1959, Bob Noyce entered in his notebook a series of innocuous ideas about the possibility of integrating circuits using Hoerni’s planar process: isolating the transistors in silicon with reverse-biased p–n junctions, and wiring them together with the PVD-Litho-Etch process using an adherent layer of Al on the SiO2. This was then put away until word of Jack Kilby’s invention rocked the world later in March 1959. While many derided Kilby’s work as a technology that would never yield, with designs that were fixed and difficult to change, Noyce sat up and took notice. Putting it all together, Noyce and his team at Fairchild would architect what would become the mainstream manufacturing process for fabricating integrated circuits on silicon wafers.³ Transistors, capacitors, and resistors could now be integrated onto a single substrate. The reasons why this method was so important would be codified in Moore’s 1965 paper. In 1964, Electronics Magazine asked Moore, then at Fairchild Semiconductor, to write about what trends he thought would be important in the semiconductor industry over the next ten years for its 35th anniversary issue. He and Fairchild were at the forefront of what would be a revolution with silicon. However, when Moore sat down to write the paper that would become so famous for its law, ICs were relatively new – having been commercialized only a year or two earlier. Many designers didn’t see a use for them and, worse, some still argued over whether transistors would replace tubes. A few even saw ICs as a threat: if the system could be integrated into an IC, who would need system designers? Indeed, even Moore may have been skeptical early on. Robert Graham recalled that in 1960, when he was a senior Fairchild sales and marketing executive, Moore had told him, “Bob, do not oversell the future of integrated circuits. ICs will never be as cheap as the same function implemented using discrete components” [8].
In fact, Moore didn’t actually notice the trend until he was writing the paper [9]. Nevertheless, by 1964 Moore saw the growing complexity and falling cost, and this understanding of the process convinced him that ICs would come to dominate. Fairchild was trying to move the market from transistors to ICs. Moore was also convinced that ICs would play an important role, and he was writing the paper that would convince the world.

³ Ironically, Kilby’s method for integrating circuits gets little credit for being what is now widely viewed as one of the most important paths into the future. In practice, his invention was what would be renamed hybrid circuits, which would then be renamed Multi-Chip Modules (MCM), then Multi-Chip Packages (MCP), and now System In a Package (SIP). It is clear today that System-On-a-Chip (SOC) is limited by the constraints of process complexity and cost; and so Jack Kilby’s original integrated circuit is finally becoming a critical mainstream technology. Unfortunately, while he gets credit for the invention of the IC, few give him credit for inventing a method that had to wait 40 years to become critical in a new millennium. Most give both Jack Kilby and Bob Noyce credit as co-inventors of the IC because of these differences; they both came up with similar ideas independently, and it was Jack Kilby who prodded Bob Noyce into action. TI would go on to become an industry icon. Fairchild’s early successes would turn to failure under bad management and suffer a palace revolt, similar to the one at Shockley, in 1967. It was called the Fairchild brain drain and resulted in the founding of 13 start-ups within a year. Noyce and Moore would leave to start up Intel in 1968. But that’s another story.
Titled “Cramming More Components onto Integrated Circuits,” Moore’s paper was published by Electronics Magazine in its April 19, 1965 issue, on page 114. Its subtitle was “With unit cost falling as the number of components per circuit rises, by 1975 economics may dictate squeezing as many as 65,000 components on a single chip of silicon.” This issue’s contents exemplify how few really understood the importance of the IC. Ahead of it was the cover article by RCA’s legendary David Sarnoff who, facing retirement, reminisced about “Electronics’ first 35 years” with a look ahead. Behind this were articles titled “The Future of Electronics in 1930 – Predictions from Our First Issue” and “A Forward Look at Electronics – Looking Farther into the Future” (both written by the magazine’s authors). Then there appeared an article from Motorola, by Dan Noble, titled “Wonderland for Consumers – New Electronic Products Will Make Life Easier.” All of these papers ran ahead of Moore’s. Indeed, Moore’s paper would have been at the back of the magazine had it not been for what would prove to be mundane papers titled “Changing the Nature of Research for Space,” “Light on the Future of the Laser,” “More and Faster Communications” and “Computers to Run Electric Utility Plants.” At the time, Electronics Magazine was the most respected publication covering the industry, and it had assembled the best visionaries possible. Yet, with the exception of Moore’s paper, it was mostly classic forecasting “through the rear-view mirror.” His paper would be the only thing remembered from this issue. In fact, those who entered the industry in the 1990s wouldn’t even recognize the magazine: it is now defunct, outlived by the very Moore’s law article it once carried. Moore’s paper proved so long-lasting because it was more than just a prediction. The paper provided the basis for understanding how and why ICs would transform the industry.
Moore considered user benefits, technology trends, and the economics of manufacturing in his assessment. Thus he had described the basic business model for the semiconductor industry – a business model that lasted through the end of the millennium. From a user perspective, Moore’s major points in favor of ICs were that they had proven to be reliable, they lowered system costs and they often improved performance. He concluded, “Thus a foundation has been constructed for integrated electronics to pervade all of electronics.” This was one of the first times the word “pervade” had ever been published with respect to semiconductors; during this time frame it was first used by both Moore and Patrick Haggerty of TI. Since then, the theme of increasing pervasiveness has been a feature of almost all semiconductor forecasts.⁴ From a manufacturing perspective, Moore’s major points in favor of ICs were that integration levels could be systematically increased based on continuous improvements in largely existing manufacturing technology. The number of circuits that could be integrated at the same yield had already been systematically increasing for these reasons. He saw no reason to believe that integration levels of 65,000 components would not be achieved by 1975, or that the pace of a doubling each year would not remain constant. He pointed to multilayer metalization and optical lithography as key to achieving these goals. Multilayer metalization meant that single transistors could be wired together to form ICs. But of far greater import was the mention of optical lithography. Prior to the invention of the planar process, the dominant device was the mesa transistor. It was made by hand-painting black wax over the areas to be protected from etchant. While the tool was incredibly inexpensive (a 10-cent camel’s hair brush), the process was incredibly labor intensive [10].⁵ The introduction of optical lithography meant transistors could be made simultaneously by the thousands on a wafer. This dramatically lowered the cost of making transistors, to the degree that, by the mid-1960s, packaging costs swamped the cost of making the transistor itself. From an economics perspective, Moore recognized the business import of these manufacturing trends and wrote, “Reduced cost is one of the big attractions of integrated electronics, and the cost advantage continues to increase as the technology evolves toward the production of larger and larger circuit functions on a single semiconductor substrate. For simple circuits, the cost per component is nearly inversely proportional to the number of components, the result of the equivalent package containing more components.” As important as these concepts proved to be, it was still not clear that the arguments would stand the test of time. Package costs now overwhelmed silicon processing costs. These costs were not subject to Moore’s law, and technical efforts were shifting to lowering them. Fairchild was reducing packaging costs, which were still highly labor intensive, by moving its assembly lines offshore.

⁴ Pervasiveness is another misused word. Many have used it during boom times to argue that the semiconductor industry would no longer be cyclical and thus, not have a downturn. While semiconductors have been increasingly pervasive since the dawn of the industry, this fact has not been a factor in alleviating the industry’s inherent cyclicality.
Texas Instruments and Motorola, among others, were developing and building automatic assembly equipment. Many, even those at Fairchild, were still struggling with how to make a profitable business out of ICs. While transistors could be integrated, designing and marketing circuits that customers could easily use proved more complicated. The industry had no design standards. Fairchild had developed circuits with Resistor–Transistor Logic (RTL), but customers were using Diode–Transistor Logic (DTL). Moreover, Fairchild was undergoing major internal turmoil as political battles raged throughout the company. Many start-up companies were spinning out of it as senior executives left. The most famous of these spin-offs was Intel, whose lead founders were no less than Robert Noyce and Gordon Moore. Lacking the packaging infrastructure of Fairchild but having the cream of its research capability, Intel’s strength was in its founders’ ability to build complex circuits and their deep understanding of Moore’s law. They leveraged this by focusing on memories, which Moore believed had huge market potential and could be more easily integrated – both in terms of putting large numbers of transistors on silicon and in integrating them into customers’ designs.6 He was also convinced that the only way to compete effectively was by making the silicon more valuable than the package, offsetting the natural advantage in packaging that Fairchild, Motorola, and TI had. The strategy worked. Intel became the memory IC market leader in the early 1970s. It had started with the SRAM (Static Random Access Memory) and soon introduced the DRAM (Dynamic Random Access Memory), which proved to be very integrable and was much more cost effective than the ferrite core memories used by computer makers at the time. It was at this point that Moore’s law began to morph into the idea that bits were doubling every year or two. Transistors were now old fashioned, and few any longer questioned the practicality of ICs. It was important to move on and use Moore’s law as a way to demonstrate the viability of the emerging memory market. At the time, there was more to the marketing of Moore’s law than most ever understood. The strategies taken would become a critical competitive advantage for Intel – enough of an advantage to keep it ahead of TI, which also focused on memories and had far more manufacturing infrastructure. Bob Graham, another Intel founder, recalled7 that there was a problem with Moore’s law: it was too fast. Its cycle called for a doubling every year, but systems designers needed more than a doubling to justify a new design, and they typically fielded a new design only every three-to-four years. Graham’s strategy for harnessing the power of Moore’s law was to match the chip design cycle to the system designers’. The difference between the nodes of Moore’s clock cycles and these memory nodes would lead to much confusion. But according to Graham, Intel used this confusion to keep competitors at bay when its early memory strategies were plotted. It was considered highly confidential – a big secret – that the real generational nodes were based on a quadrupling, not a doubling. Moore’s law, in his view, was also a marketing head fake.
5 See also [11, Chap. 3, pp. 53–56].
Intel introduced each new generation of memories with a 4× increase in bits about every three years. Each of its generations was closely matched to customers’ design cycles. Competitors fell behind because they followed the letter of Moore’s law: they tried to get ahead by introducing new chips with every 2× increase, but these interim generations failed – from Intel’s first 64-bit memory all the way to the 64 M-bit generation. This cycle of a 4× increase per generation was not broken until the 128 M-bit DRAM came to market three decades later, in the early 2000s. Tax law and capital depreciation also played a critical role in determining the pacing of nodes. The United States’ MACRS (Modified Accelerated Cost Recovery System) tax code for capital depreciation specified that computers be fully depreciated over six years. If systems makers had designed new systems every year, there would have been six generations of computers per depreciation cycle – clearly too many for customers. Either customers would have over-bought and had to write off equipment that was not fully depreciated, or the annual market would have been split into a sixth of its potential, or some compromise between the two
6 See also [11, pp. 181–185].
7 Private conversations with the author.
Table 2.1. Average months to double device complexity

Year         Overall   MPU   DRAM
1959–1965      12        –     –
1966–1975      17       33    17
1976–1985      22       22    25
1986–1995      32       38    25
1996–2005      27       26    22
1976–2005      24       24    24
would have happened. Early on, systems makers paced designs so that at least one half of potential customers would replace their systems every two-to-three years. Competitive pressures likely accounted for the more rapid cycling of system designs in the early 1960s, when the computer market was still highly competitive. The market consolidated during the latter 1960s and IBM emerged as the dominant supplier. There are no technical reasons to explain why the node pacing slowed. But from a market perspective, the pace should have slowed naturally as IBM sought to leverage its dominance to extend the life of designs, thereby amortizing design costs over longer periods and enhancing profitability. IBM was sued for monopolistic trade practices, and one of the claims was that it intentionally slowed technology. Whatever the reason, node pacing slowed to a rate of one design node every three years. Moore’s clock was running at roughly half that period. In 1975, Moore wrote an update to the 1965 paper and revised his predictions. While technically his prediction of 65,000 components had come true, it was based on a 16 K-bit CCD memory – a technology well out of the mainstream. The largest mainstream memory – the 16 K-bit DRAM, which contained less than half this number of components – would not be in production until 1976. Between 1965 and 1975 the pace had actually slowed to a doubling every 17 months, or roughly every year-and-a-half. Later, Moore’s law was widely quoted by others as being every 18 months. But, despite being widely referenced as the source, this is a periodicity that Moore never gave. The 1975 paper actually predicted the periodicity would be a doubling every two years [3]. This would turn out to be extremely accurate, though seldom quoted with any accuracy. Contrary to what many have thought, the finer points of the accuracy of Moore’s law never really mattered. The real import of Moore’s law was that it had proved a predictable business model.
It gave confidence in the industry’s future because it was predictable. One could plan for it and invest in it on the basis that the integration scale would always rise in a year or two, obsolescing the electronics already in the field and creating new demand, because the unobtainable and the confusing would become affordable and easy to use. This then fed back to reinforce the law, as engineers planned for it and designed more feature-rich products or products that were easier to use. As Moore later put it, Moore’s law “had become a self-fulfilling prophecy” [9]. But heavy international competition and technical issues loomed in the future.
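The arithmetic behind these periodicities is simple compound doubling: over an interval of m months, a technology whose complexity doubles every T months grows by a factor of 2^(m/T). A minimal sketch (the function name is mine, not the chapter’s):

```python
def growth_factor(months: float, doubling_period_months: float) -> float:
    """Multiplication of component count over `months`, given one
    doubling every `doubling_period_months` (Moore's clock period)."""
    return 2.0 ** (months / doubling_period_months)

# Moore's 1965 extrapolation: a doubling every 12 months for a decade
# is a 1024x increase -- on the order of the jump to the 65,000
# components he predicted for 1975.
print(round(growth_factor(120, 12)))   # 1024

# The pace actually observed for 1966-1975 was a doubling every
# 17 months: only about a 133x increase over the same decade.
print(round(growth_factor(120, 17)))   # 133
```

The gap between a 12-month and a 17-month clock compounds to nearly an order of magnitude over a decade, which is why the exact periodicity mattered so much less than the fact that the doubling was dependable.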
2 The Economic Implications of Moore’s Law
It was at about this time that Japan seized on Moore’s law as a planning mechanism. Without it, the industry appeared to move in random directions; Moore’s law made it easy to plan, and by 1975 the law had been clearly proven. The DRAM explosion was already under way at TI, AMD, IBM, Intel, and Mostek, and Moore’s law made it all understandable. Now believers in Moore’s law, the Japanese saw that since memory demand and device type were very predictable, the market would play to their strengths. Moreover, cost of manufacturing was critical to success – one of their key strategic strengths. Moore’s law was the basis of the arguments that prompted them to start their government-funded VLSI program in 1976, whose goal was to build a working 64 K-bit DRAM. They believed that if the VLSI program could do this, their semiconductor industry could leverage the results to build its own. Japan was already successful in 4 K-bit DRAMs and its potential with 16 K-bit DRAMs looked promising. One of the keys to Japan’s success was the use of multiple development teams. Each team worked on the same generation, walking it from research, through development, and into manufacturing. In contrast, the West had highly stratified walls between these three areas. Research departments threw their results over the wall to development, and so forth into manufacturing. Manufacturing often used few of these efforts because they were seldom compatible with its processes; instead it built designs coming directly from marketing, because those were known to sell. Japan got the edge because it could get new designs to manufacturing faster and its cost of manufacturing was fundamentally lower. Japanese firms had lower labor rates, and their line workers typically stayed with a company several times longer. This combined with the Japanese penchant for detail.
TI had a facility in Japan, and this facility consistently yielded higher than its American facilities for these reasons.8 The Japanese also had newer equipment, having invested heavily in the late 1970s. Capital was tough to get in the late 1970s for American chipmakers. They had cut their investments to low levels, which would ultimately give Japan another source of yield advantage. But the real shocker came in 1980, when Hewlett-Packard (HP) published an internal study comparing quality between American- and Japanese-made DRAMs. It showed Japan to have higher-quality DRAMs. American chipmakers cried foul – that this was tested-in quality and that Japanese suppliers sent HP more thoroughly tested parts. Indeed, independent studies did later show that Japanese-made DRAMs obtained on the open market were of no higher quality than American ones. However, it was too late to change the perception that HP’s announcement had created (a perception that remains to this day). Whether or not the quality was tested-in, the one clear advantage the Japanese had was yield. It was typically 10–15% higher than in equivalent American fabs, and this gave the Japanese a fundamental cost advantage. When the downturn hit in 1981, these yield differences allowed Japanese companies to undercut American companies on 16 K-bit DRAMs. This, combined with the downturn, caused American producers to make further cuts in capital investment, putting them further behind. At the time, the Chairman of NEC noted that NEC had no fab older than five years; the average American fab was eight years old. So by the early 1980s, Japan came to dominate 64 K-bit memories. By 1985, America’s giants were bleeding heavily. Intel was among the worst at manufacturing memories and was forced out of the memory business. The technical issues with the 64 K-bit DRAM proved to be enormous. The most commonly used lithography tool of the day, the projection aligner, would not make it to this generation because it lacked the needed overlay alignment capability. Something new had to be developed, and the industry was not ready. The transition to stepping aligners proved much more traumatic than anyone expected. Steppers had the potential for very high yields, but unless the reticle was perfect and free of particles, yield would be zero, because the stepper would repeat these defects across the wafer. The result was a three-year hiatus in Moore’s law: 64 K-bit DRAMs did not enter volume production until 1982 – a full three years after they should have arrived – taking six years to pass from the 16 K-bit node. Another transition occurred in the early 1980s that favored Intel. NMOS began to run out of steam and could not scale well below one micron. Some even predicted that we had hit Moore’s wall. But the industry escaped by transitioning to CMOS, one of whose benefits was that performance scaled easily with shrinks. An engineer in Intel’s research organization observed this and recognized its importance to microprocessors. Moreover, having exited memories, it was important that Intel not lose the brand value of Moore’s law, given that its originator was chairman of the company. So marketing morphed Moore’s law a second time. It had started as the number of transistors doubling, then the number of bits; now it was speed or, more comprehensively, performance. This new form would serve Intel and the industry well.
8 Conversations with Howard Moss, a Board Member of Texas Instruments at the time, 1985.
In the early 1990s, the pace of integration increased again. Several factors drove this change. It could be argued that the manufacturing challenges of the early 1980s had been overcome, yet there was significant fear that new hurdles looming in the near future would be insurmountable. America’s semiconductor industry had just instituted the roadmap process for planning and coordinating semiconductor development. As it turned out, this focused pre-competitive research and development as never before. Instead of hundreds of companies duplicating efforts, it allowed them to start from a common understanding. The result was an increased pace of technology development. At the same time, efforts to reinvigorate competitiveness led companies to adopt time-to-market measures of effectiveness, which naturally accelerated the pace of development. Also, the shift from mainframes and minicomputers to personal computers had dramatically altered the competitive landscape in the 1980s. IBM had quickly come to dominate the PC market in the early 1980s, and its dominance might never have been challenged. However, on August 2, 1985, the IBM senior executives who ran and had developed its PC business violated a major corporate rule and boarded Delta Airlines flight 191 to Dallas, Texas. The flight encountered wind shear and crashed on landing. IBM’s understanding of how
the market was evolving, as well as its leadership capability in the still-emerging PC market, perished. Unlike all earlier IBM computers, the PC had been designed with an open architecture, which meant it could be easily cloned. Outside of this small group, IBM really did not understand how to respond to the hordes of clone makers who had entered the market. Applied to pricing and self-obsolescence, the clone makers’ slash-and-burn strategies made IBM’s defenses about as useful as France’s Maginot line in World War II. As IBM lost its leadership, the pace of technical change accelerated again, to limits set primarily by technical developments. At the 1995 Semiconductor Industry Association (SIA) forecast dinner, Gordon Moore gave a retrospective on 30 years of Moore’s law. He claimed to be more surprised than anyone that the pace of integration had kept up for so long. He had given up on trying to predict its end, but commented that it was an exponential and all exponentials must end. It did not stop the standing ovation he received when he concluded, “This can’t go on indefinitely – because by 2050. . . we’re everything.”
2.4 The Microeconomics of Moore’s Law

The essential economic statement of Moore’s law is that the evolution of technology brings more components, and thus greater functionality, for the same cost. This cost reduction is largely responsible for the exponential growth in transistor production over the years. Lower production cost has led to an amazing ability not only to produce transistors on a massive scale, but to consume them as well. In any given year since the industry’s birth, the production of transistors has averaged 41% of the cumulative total ever produced up until then, and anomalies such as the millennial boom have had no effect on this regularity. Cumulative production has crossed 12 orders of magnitude since the industry’s inception; by comparison, automobile production did not cross a single order of magnitude over the same period. So what makes Moore’s law work? The law itself describes only two variables: transistor count and cost. Behind these are the fundamental technological underpinnings that drive them. There are three primary technical factors that make Moore’s law possible: reductions in feature size, increased yield, and increased packing density. The first two are largely driven by improvements in manufacturing, the last largely by improvements in design methodology. Design methodology changes have been significant over the years, coming as both continuous and step-function improvements. The earliest step-function improvements were reductions in the number of transistors needed to store a bit. The industry started with 6-transistor memory cells. In the late 1960s, designers quickly figured out how to reduce this to four, then three, and, finally, to the 1-transistor/1-capacitor DRAM cell developed by R.H. Dennard [12].
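The regularity noted above – each year’s output averaging about 41% of everything produced before it – is exactly what constant exponential growth looks like: if annual production grows by a fixed rate g, the ratio of one year’s production to the prior cumulative total converges to g. A quick numerical check (the function name is mine, not the chapter’s):

```python
def annual_to_cumulative_ratio(growth_rate: float, years: int) -> float:
    """Year-n production divided by the cumulative total produced
    before year n, for production growing by `growth_rate` per year."""
    production = [(1.0 + growth_rate) ** n for n in range(years)]
    return production[-1] / sum(production[:-1])

# With ~41% annual unit growth, the ratio settles at ~0.41 -- the
# regularity the text reports for transistor production.
r = annual_to_cumulative_ratio(0.41, 30)
print(round(r, 3))
```

The convergence is quick, which is why the 41% figure holds in “any given year” rather than only on average.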
Fig. 2.2. Worldwide transistor production (all, including foundry, merchant, and captive producers)

While the 1-transistor cell did not affect Moore’s law as measured in transistors, it did when measured in bits: a 4-Kbit memory with a 6-transistor cell needed 24 K transistors, but could now be made with only 4 K transistors using a 1-transistor/1-capacitor cell. This was an enormous advance. Cost-per-bit plummeted, and it further added to the mythical proportions of Moore’s law, as customers saw little real difference between transistors and bits. What they were most interested in was reductions in cost-per-function, and designers had delivered. There were less well-known contributions as well. The development of Computer-Aided Design (CAD) in the early 1980s was a significant turning point. Now known as Electronic Design Automation (EDA), CAD’s first contribution was to prevent the ending of Moore’s law. As the industry progressed from MSI to LSI levels of integration, the number of transistors to be wired together was becoming too large for humans to handle. Laying out the circuit diagram and cutting the rubylith9 by hand to wire together 10,000 transistors, with three connections each, had to have been a daunting task. The 64 K-bit DRAM loomed large in this picture as the decade turned. With just over 100,000 transistors, it would be the first commercial VLSI-grade
9 In those days, masks were made by drawing the circuit out on a large piece of paper that sometimes covered the floor of a large room. Then a laminated piece of Mylar called rubylith was used to make the initial mask pattern. Rubylith was made of two Mylar films, one clear and one red. A razor-edged knife was used to cut strips of the red Mylar away; initially this was done by hand. One had to be careful not to cut the clear Mylar underneath, so that the red Mylar could be pulled away, leaving the final mask pattern. This pattern was then reduced to circuit dimensions to make a mask master.
Table 2.2. Integration scale measures used in the 1970s

SSI    Small Scale Integration
MSI    Medium Scale Integration
LSI    Large Scale Integration
VLSI   Very Large Scale Integration   >100,000 Transistors
chip produced in volume – and it was a point of hot competition. So being able to automate the layout process was a great leap forward. The next step for design tools was to improve their layout capability, which improved packing density. Early layout tools were fast, but humans could lay out a circuit manually in 20–30% of the area. Today, no one would manually lay out an entire circuit with millions of transistors. Yet even today, EDA tools do not offer the most efficient packing density. Designers who want the smallest die will “handcraft” portions of a circuit. This is still commonly done when a market is large and the die-size reduction can justify the cost of handcrafting. Performance improvements are another way that design has directly affected Moore’s law. They are particularly important to the third version of Moore’s law, which measures the gain in circuit performance over time. Scaling theory states that transistor switching speed increases in inverse proportion to physical gate length. However, a designer can improve on this by using the switching of the transistors more efficiently. Transistors switch with the clock of the circuit, and early processor architectures required several clock cycles per instruction, so a 1-GHz clock might deliver only about 300 Million Instructions Per Second (MIPS). Using techniques like parallel processing, pipelining, superscalar execution, and fractional clocks, designers have systematically improved this to the point where three-to-five instructions per clock cycle can be achieved. Thus, a processor with a 1-GHz clock can exhibit run rates of 3,000-to-5,000 MIPS. Considering that 1 MIPS was state-of-the-art for a circa-1980 mainframe processor, the subsequent architectural gains have been quite significant. Design tools are having further impacts today; one is their ability to improve testing.
Without these tools, test costs would explode – or worse, the circuits would be untestable, making further gains in integration scale pointless. Another impact is the ability to automatically lay out the patterns needed to make reticles with optical proximity correction and phase-shifting, which substantially reduces printable feature sizes. But it is important to realize that none of these gains would have been possible without ever more powerful, cost-effective computers. All of these benefits were made possible by Moore’s law itself. Hence, instead of running down, Moore’s law is a self-fulfilling prophecy that runs up. Indeed, many of the manufacturing improvements since the 1980s have come only because Moore’s law had made computing power so cheap that it could be distributed throughout the factory and in the tools, be used to design the tools, and even perform the engineering and economic analysis needed to make more efficient decisions.
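The MIPS arithmetic quoted above is simply clock rate times instructions retired per cycle. A minimal sketch (the function name is mine, not the chapter’s):

```python
def mips(clock_hz: float, instructions_per_cycle: float) -> float:
    """Instruction throughput in millions of instructions per second."""
    return clock_hz * instructions_per_cycle / 1e6

# Several cycles per instruction (early architectures): a 1-GHz clock
# at ~3.3 cycles per instruction delivers roughly 300 MIPS.
print(round(mips(1e9, 1 / 3.3)))   # 303

# Three-to-five instructions per cycle (pipelined, superscalar designs):
print(mips(1e9, 3), mips(1e9, 5))  # 3000.0 5000.0
```

The order-of-magnitude gap between the two cases is the architectural gain the text describes, achieved at the same clock rate.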
G.D. Hutcheson
Fig. 2.3. Five decades of critical dimension shrinks (in nanometers)
Reductions in feature size have made the largest contribution by far, accounting for roughly half of the gains since 1976. Feature sizes are reduced by improvements in lithography methods, which enable smaller critical dimensions (CDs, also known as Minimum Feature Sizes, MFSs) to be manufactured. If the dimensions can be made smaller, then transistors can be made smaller, and hence more can be packed into a given area. This is so important that Moore’s first paper relied entirely on it to explain the process. Improvements in lithography have been the most significant factor behind these gains. They have come from new exposure tools; from resist processing tools and materials; and from etch tools. The greatest change in etch was the transition from wet to dry etching, though most of the older technology is still used today – wet chemistries for both etching and cleaning are the most prominent examples. Improvements in resist processing tools and materials have generally been incremental. Resist processing tools have remained largely unchanged from a physical perspective since they became automated; the changes have mostly been incremental refinements to improve uniformity and thickness control. Resist chemistries have changed dramatically, but these changes are easy to overlook. Moreover, the etch and resist areas have relatively small effects on cost. Exposure tools have gone through multiple generations that track the CD reductions. At the same time, they have been the most costly tools, and so generally garner the most attention when it comes to Moore’s law. Moreover, without improvements in the exposure tool, improvements elsewhere would not have been needed. Exposure tools were not always the most costly tools in the factory: the camel’s hair brush, first used in 1957 to paint on hot wax for mesa transistors, cost little more than 10 cents.
But since that time prices have escalated rapidly, increasing roughly an order of magnitude every decade-and-a-half. By 1974, Perkin–Elmer’s newly introduced projection aligner cost well over $100,000. In 1990, a state-of-the-art i-line stepping aligner cost just over $1 million. When this was first written in 2002, 193-nm ArF excimer laser scanning aligners were about to enter manufacturing, at a shocking $10 million each. As of this writing in late 2006, they cost upwards of $50 million.

Table 2.3. Evolution of lithography technology used to manufacture semiconductors

Year first used     CD          Lithography technology              Etch
in manufacturing    (microns)
1957                254.000     Camel’s hair brush, hand painting   Wet etching
1958                127.000     Silk screen printer
1959                 76.200     Contact printer w/emulsion plates
1964                 16.000     Contact printer w/chrome plates
1972                  8.000     Proximity aligner                   Barrel plasma
1974                  5.000     Projection aligner
1982                  2.000     g-line (436 nm) stepper             Planar plasma
1984                  1.500                                         Reactive ion etching
1988                  1.000     i-line (365 nm) stepper
1990                  0.800
1997                  0.250     248 nm scanner                      High density plasma
2003                  0.100     193 nm scanner

Fig. 2.4. History of average cost to build & equip a wafer fab

Over the decades, these cost increases have been consistently pointed to as a threat to the continuance of Moore’s law. Yet the industry has never hesitated to
adopt these new technologies. It is a testimony to the power of this law that these costs have been absorbed with so little effect; lithography tools have become more productive to offset the increases. Moreover, they are only part of the rising cost picture. The increase in the cost of semiconductor factories has been a recurring theme over the years. In fact, it was first noted in 1987 that there was a link between Moore’s law and wafer fab costs [13]: between 1977 and 1987, wafer fab costs had increased at a rate of 1.7× for every doubling of transistors. In the 1980s, the rising cost of factories was offset primarily by yield increases, so higher tool costs were offset by lower die costs. This relationship stalled in the 1990s, when the rise in tool prices began instead to be offset by increases in productivity: as effective throughputs rose, the number of tools a fab needed to produce the same number of wafers declined. This changed somewhat with the introduction of 300 mm wafers. However, the doubling in fab costs, shown above starting in 2004, is due to the fact that the typical output of a fab in wafers doubled. When normalized for output, fab costs have been roughly constant since the 1990s and have largely ceased to be an issue. In any case, Moore’s law governs the real limit to how fast costs can grow. According to the original 1965 paper, the minimum cost of manufacturing a chip should decrease at a rate nearly inversely proportional to the increase in the number of components. So the cost per component, or transistor, should be cut roughly in half with each tick of Moore’s clock (see (2.1) and (2.2)). However, since that paper first appeared, it has generally been believed that industry growth will not be affected as long as the cost per function drops by at least 30% for every doubling of transistors.
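This 30%/40% relationship can be checked with one line of arithmetic: cost per function is wafer-processing cost divided by transistor count, so with transistors doubling per node, a 1.4× rise in wafer cost still yields a 0.7× cost per function. A sketch under the text’s simplifying assumptions (yield and wafer size held constant; the function name is mine):

```python
def cost_per_function_ratio(transistor_ratio: float,
                            wafer_cost_ratio: float) -> float:
    """Node-to-node change in cost per transistor, ignoring yield and
    wafer-size effects: (change in wafer cost) / (change in transistors)."""
    return wafer_cost_ratio / transistor_ratio

# Doubling transistors at flat wafer cost halves cost per function:
assert cost_per_function_ratio(2.0, 1.0) == 0.5

# If a 30% drop per node suffices (ratio 0.7), wafer cost may rise
# by up to 40% per node (ratio 1.4) and still meet the requirement:
assert abs(cost_per_function_ratio(2.0, 1.4) - 0.7) < 1e-12
```

This is the same bound the chapter attributes to Appendix A: the allowed wafer-cost ratio is the transistor ratio times the required cost-per-function ratio, 2 × 0.7 = 1.4.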
A 30% drop in cost per function with each doubling would allow the manufacturing cost per unit area of silicon to rise by 40% per node of Moore’s law – that is, by a factor of 1.4, twice the 0.7 cost-per-function reduction ratio (Appendix A). This includes everything from the fab cost to materials and labor, but does not take yield or wafer size into account. Thus, if cost per function needs to drop by 30% with each node, wafer costs can theoretically increase by 40%, assuming no yield gains (see Appendix A for proof). Yield is a function of die size, and so is directly dependent on component counts and CD reductions. There are many equations for calculating yield, the most basic of which is the classic Poisson exponential: Y = exp(−A·D·N), where A is the die area, D the defect density per mask layer, and N the number of mask layers. Note that this equation also accounts for increased process complexity as component counts rise. It would seem that yield would be the most significant cost-reducing factor, and in the early days of the industry it was. In the 1970s, yield typically started at 15% when a device was introduced and peaked at 40% as the device matured. Gains of two-to-three times were typical over the life of a chip, and gains of four-to-five times were not uncommon for devices with long lives. Improvements in manufacturing methods increased these gains dramatically during the 1980s and 1990s, primarily through better equipment and cleanroom technology. For example, the switch to VLSI equipment technology such as steppers and plasma
etchers caused initial yields for the 64 K-bit DRAM to come in at 40% in 1982, maturing at 75% three years later. Today, devices typically enter production at 80% yield, rise to 90% within six months, and can achieve close to 100% at maturity – but at best the remaining gain is only a quarter of initial yield. Wafer size has been another cost-reducing factor used over the years. Larger wafers have typically cost only 30% more to process and yet have offered an area increase of 50-to-80%. The move from 200 mm to 300 mm wafers yields an area increase of 125%! Larger wafers have always brought a yield bonus because of their larger “sweet spot” – a relatively larger number of good chips in the inner regions of the wafer – and because they require a new generation of higher-performing equipment. Like the sweet spot of a tennis racket, wafers tend to have the lowest defect density at their centers and the highest at their edges, where they are handled most; process chamber uniformity also tends to suffer most at the wafer’s edge. There are also gains in manufacturing efficiency that occur over time. The result is a continued decrease in manufacturing cost per die. However, the continuation of Moore’s law via reduction in CDs, increased yields, larger wafer sizes, and manufacturing improvements has taken its toll in other areas. Costs have risen significantly over time, as seen in the rise of wafer fab costs. Moreover, the CD reductions have required increasing levels of technical sophistication, with the resulting costs. For example, the camel’s hair brush used for lithography in the 1950s cost only 10 cents; early contact aligners cost $3,000–5,000 in the 1960s and $10,000 by the 1970s; a projection aligner in the late 1970s cost $250,000; the first steppers cost $500,000 in the early 1980s; by 2000, a 193-nm (ArF excimer laser) scanner cost $10 million; and the latest tools can cost upwards of $50 million.
That is an increase of more than eight orders of magnitude over five decades. Moreover, the cost increases are prevalent throughout the fab. Increased speeds have forced a transition from aluminum to copper wiring. Also, silicon-dioxide insulation no longer works well when millions of transistors are switching at 2 GHz, necessitating a switch to interlevel dielectrics with lower permittivity. At the gate level, silicon dioxide will soon no longer be useful as a gate dielectric: scaling has pushed gate oxides to fewer than ten atomic layers, and it will not be long before such films fail to work well. The solution is to replace them with high-k dielectrics, so that the physical thickness can be increased even as the electrical thickness decreases. These new materials are also causing costs to escalate. An evaporator, which could be bought for a few thousand dollars in the early 1970s, now costs $4–5 million. Even diffusion furnaces cost $1 million per tube. As costs have risen, so has risk. There has been a tendency to over-spec requirements to ensure a wide safety margin, which has added to the cost escalation. At some point, the high costs of these technologies will cause Moore's law to cease. As Gordon Moore has put it, "I've learned to live with the term. But it's really not a law; it's a prediction. No exponential runs forever. The key has always been our ability to shrink dimensions and we will soon reach atomic dimensions, which are an absolute limit." But the question is not if,
30
G.D. Hutcheson
it’s when will Moore’s wall appear? “Who knows? I used to argue that we would never get the gate oxide thickness below 1000 angstroms and then later 100. Now we’re below 10 and we’ve demonstrated 30-nm gate lengths. We can build them in the 1000’s. But the real difficulty will be in figuring out how to uniformly build tens of millions of these transistors and wire them together in one chip” [14]. In fact, we routinely build them today in cutting-edge manufacturing. In the end, it is more likely that economic barriers will present themselves before technical roadblocks stop progress [15].
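The "more than eight orders of magnitude" claim for lithography-tool prices can be checked directly. The prices are taken from the passage above; the 50-year span is an approximation made here for illustration:

```python
import math

# Lithography tool price escalation cited in the text.
brush_1950s = 0.10    # camel's-hair brush, dollars
tool_2000s = 50e6     # latest lithography tools, dollars

# Total escalation in orders of magnitude (~8.7).
orders = math.log10(tool_2000s / brush_1950s)

# Implied average annual price growth over ~50 years (~49% per year).
annual_growth = (tool_2000s / brush_1950s) ** (1 / 50) - 1
```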
2.5 The Macroeconomics of Moore's Law

Moore's law was more than a forecast of an industry's ability to improve; it was a statement of semiconductor technology's ability to contribute to economic growth, and even to the improvement of mankind in general. It has a far richer history than the development of semiconductors, which to some extent explains why Moore's law was so readily accepted. This history also explains why there has been an insatiable demand for ever more powerful computers, no matter what people have thought to the contrary. The quest to store, retrieve, and process information is one task that sets humans apart from other animals. The matriarch in a herd of elephants may be somewhat similar to the person in early tribes who memorized historical events in song. But no known animal uses tools to store, retrieve, and process information. Moreover, the social and technological progress of the human race can be directly traced to this attribute. More recent writers have pointed to it as a significant driving force in the emergence of western Europe as the dominant global force of the last millennium [16]. Man's earliest attempts to store, retrieve, and process information date back to prehistoric times, when humans first carved images into stone walls. Then, in ancient times, Sumerian clay tokens developed as a way to track purchases and assets. By 3000 B.C. this early accounting tool had developed into the first complete system of writing on clay tablets. Ironically, these were the first silicon-based storage technologies, and they would be abandoned by 2000 B.C., when the Egyptians developed papyrus-based writing materials. It would take almost four millennia for silicon to stage a comeback as the base material, with the main addition being the ability to process stored information. In 105 A.D. a Chinese court official named Ts'ai Lun invented wood-based paper.
But it wasn’t until Johann Gutenberg invented the movable-type printing press around 1436 that books could be reproduced cost effectively in volume. The first large book was the Gutenberg Bible, published in 1456. Something akin to Moore’s law occurred, as Gutenberg went from printing single pages to entire books in 20 years. At the same time, resolution also improved, allowing finer type as well as image storage. Yet, this was primarily a storage mechanism. It would take at least another 400 years before retrieval would be an issue. In 1876, Melvil Dewey published his classification system that enabled libraries to store and retrieve all the books that were being made by that time. Alan Turing’s “Turing Ma-
Fig. 2.5. Average price and cost per transistor for all semiconductors (in nanodollars)
chine," first described in 1936, was the step that would make possible the transformation from books to computers. So Moore's law can be seen to have a social significance reaching back more than five millennia. The economic value of Moore's law is also understated, because it has been a powerful deflationary force in the world's macro-economy. Inflation is a measure of price changes without any qualitative change; if price per function is declining, it is deflationary. Interestingly, this effect has never been accounted for in the national accounts that measure inflation-adjusted gross domestic product (GDP). The main reason is that, if it were, it would overwhelm all other economic activity. It would also cause measured productivity to soar far beyond even the most optimistic beliefs. This is easy to show, because we know how many devices have been manufactured over the years and what revenues have been derived from their sales. Probably the best set of data for analyzing the economic impact of Moore's law is simply price and cost per transistor. It is exceptionally good because it can easily be translated into a universal measure of value to a user: transistors. Transistors are a good measure because, in economic terms, they translate directly into system functionality; the more transistors, the greater the functionality of the electronic products consumers can buy. These data are shown in Fig. 2.5, and they include both merchant and captive production, so they are a complete measure of industry production. The constancy of this phenomenon is so stunning that even Gordon Moore has questioned its viability. The implications are more stunning still: since a transistor's price in 1954 was 64 million times higher than it is as of this writing, the economic value the industry has brought to the world is almost unimaginable. If we take 2006's market and adjust for inflation, the value of today's integrated circuit production would be 13 peta-dollars, or $13,000 trillion. That is far larger than Gross World Product, which measures the value of output of all the world's economies. Moreover, that doesn't include the value of all semiconductors! So it is hard to overstate the long-term economic impact of the semiconductor industry.
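The implied rate of deflation can be back-computed from the figures in this section. A small sketch, assuming only the 64-million-fold figure and the 1954–2006 span quoted above:

```python
# Implied average annual price decline per transistor, 1954-2006,
# from the text's figure of a 64-million-fold total price reduction.
factor = 64e6
years = 2006 - 1954
annual_decline = 1.0 - factor ** (-1.0 / years)
# Roughly a 29% price drop per year, sustained for five decades.
```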
2.6 Moore's Law Meets Moore's Wall: What Is Likely to Happen

That Moore's law meets Moore's wall and then the show stops, or the contrary belief that Moore's law will buoy unending prosperity in the twenty-first century, have been recurring themes in the media and technical community since the mid-1970s. The pessimists are often led by conservative scientists who have the laws of physics to stand behind them. The optimists are usually led by those who cling to "facts" generated by linear extrapolation. The problem for the optimists is that the issues that loom are not easily amenable to measurement by conventional analysis; eventually, real barriers emerge to limit growth with any technology. Moreover, as Moore himself has often quipped, "No exponential goes on forever." But so far, the optimists have been right. The problem with the pessimists is that they typically rely too much on known facts and do not allow for invention; they leave out the "what they don't know" pieces when assembling the information puzzle. Yet it is the scientific community itself that expands the bounds of knowledge and extends Moore's law beyond what was thought possible. History is replete with really good scientists and engineers who have come up with new things to constantly expand the boundaries of our knowledge, and, as noted above, this is not likely to stop. When anyone asks me about Moore's wall, my first response is: "Moore's wall is in Santa Clara, just outside Intel's Robert Noyce building. If you look closely, you will find the engraved names of people who made career-limiting predictions for the end of Moore's law." This has certainly been the case for those who, over the years, have predicted the coming of Moore's wall within a five-or-ten-year span.
Yet Moore himself said in 1995 that the wall should be finished and in place somewhere around 2040, pointedly observing that otherwise, if things continue at historical growth rates, "we'll be everything." Herein lies the real dilemma. If our industry continues to grow unbounded, it really will become as large as the global economy in the first half of the twenty-first century. This leads to the historical view that, as this occurs, our industry's growth will become bounded by macroeconomic growth. History, however, dispels this idea. At the beginning of the last millennium, rapid advances in agricultural techniques did not slow economic growth. Instead, they buoyed it as they freed up human resources to work on other things, which in turn kicked off the High Middle Ages. Ultimately, this made possible the industrial age in the latter part of the millennium. As industry grew to be a larger part of the economy, it did not slow to the 1% annual economic growth of agricultural economies. While growth did moderate, industry pushed economic growth up to an average of about 3%. Mechanized transportation
allowed centralized manufacturing, so factories could achieve greater economies of scale. This, combined with the mechanization of the factory, greatly improved productivity, allowing higher non-inflationary growth levels. Since the latter half of the 1990s, the United States has been able to achieve regular non-inflationary growth of 4–5%. It is non-inflationary because of productivity gains, and these gains are made possible by information technology. Another factor driving the non-inflationary growth potential of the economy is that information technology tends to be energy-saving as well. One of the real limits of the agricultural age was that its primary fuel was wood: entire forests were decimated in the Middle East, and then in Greece and Italy. The industrial age was prompted by the discovery of fossil fuels. This largely stopped deforestation, but from an economic perspective it also allowed for greater growth potential. Fossil fuels were easier to transport and use, so they too increased productivity. This, combined with the ability to move materials to centralized manufacturing locations and ship products back out by train, led to massive improvements in productivity. The information age takes the next step and relies on electricity. More important, it replaces the need to transport people, materials, and products with information. For example, video teleconferencing allows people to meet without traveling great distances: the voice and image information at both ends is digitized into packets and sent around the world, so that people can communicate without being physically near. At the same time, products can be designed in different places around the world and the information sent so that products can be produced either in low-cost areas or, where transportation costs are high, locally.
For example, for semiconductors designed in the United States in close cooperation with a customer in Europe, it is now a daily event to have the designs sent over the Internet to Texas for reticles to be made, to California for test programs, then to Taiwan to convert wafers into ICs, then to Korea for packaging, with the final product shipped to the customer in Europe. In the case of beer, transporting liquids is far too expensive, so a company in Europe can license its process to brewers in the United States and Japan, where the beer is brewed locally. Using the Internet, the original brewer can monitor production and quality with little need to leave the home factory. The productivity effect seen in the transition from the agricultural to the industrial age is happening again as we move into the information age. It can be argued that macroeconomic growth could rise to as high as 8%, creating a similar growth cap for our industry. What happens when this occurs? It is inevitable that the semiconductor industry's growth will slow from the 15–20% range it averaged over the last half of the twentieth century. The barriers that limit it will be economic, not technical, as Moore's law is a statement of powerful economic forces [15]. Technology barriers first show up as rising costs that go beyond the bounds of economic sense. Transportation technology, for example, can exceed the speed of sound, but economic limits make private jet ownership unattainable for all but a very few. Economic limits make the automobile the most commonly used vehicle in major industrialized countries, and the bicycle in others. But even here, the economic limits of building infrastructure hold average speeds to less than 20 MPH in
Fig. 2.6. Ford motor company’s equivalent to Moore’s law (the early years of the auto industry)
industrial countries (which is one reason the bicycle has become such a popular alternative). If we look to the auto industry for guidance, similar declines in cost during its early years can be found. At the turn of the twentieth century, cars were luxury items that typically sold for $20,000. They were the mainframes of their day, and only the ultra-rich could afford them. Henry Ford revolutionized the auto industry with the introduction of the assembly line. Ford's efforts resulted in a steady reduction in costs, quickly bringing the cost of manufacturing a car to under $1,000. But even Ford's ability to reduce costs had bottomed out by 1918, when the average hit a low of $205 (see Fig. 2.6, which has not been adjusted for inflation). While these gains pale in comparison to those made in semiconductors, the lesson to be learned is that cost gains made by pushing down one technical line of thought will eventually hit a bottom, after which costs rise. Science and engineering can only push limits to the boundaries of the laws of physics. Costs begin to escalate as this is done, because the easy problems are solved first and each next advance is more difficult. At some point, little gain can be made by taking the next step, yet its cost is astronomical. In the case of automobiles, the gains were made by the development and improvement of assembly-line technology. In the case of semiconductors, it has largely been lithography where the gains were made. These are not "economies of scale" as taught in most economics classes, where increased scale drives cost down to a minimum – after which, costs rise. Instead, technology is driving cost. These are economies of technology, and they are some of the
most important underlying factors that make Moore's law possible; they will ultimately bring about its demise when gains can no longer be made. Similar things are happening in semiconductors. Fab equipment prices have risen steadily at annual rates above 10%. This was fine as long as yields rose, offsetting the cost of the steadily shrinking transistors needed to stay on Moore's curve. But yields cannot go up much further, so gains will have to come from productivity improvements. It is important to note that hitting these economic barriers does not mean the end of the semiconductor industry. The industry has lived with Moore's law so long that it is almost a matter of faith, as exemplified in the term "show stopper." The term has been used extensively to highlight the importance of potential limits seen in the industry's "road mapping" of future technologies. Yet it is unlikely that the show will stop when the show stoppers are finally encountered. Just think of the alternatives. Moreover, the auto industry has been quite healthy in the eight decades since it hit its show stoppers. People did not go back to horses as a means of regular transport. As the gains from automation petered out, auto manufacturers shifted their emphasis from low-cost, one-size-fits-all vehicles to many varieties, each with distinct levels of product differentiation. The other hallmarks of the industrial age, trains and planes, also found ways to go on after they hit technical and economic limits. For this to happen in semiconductors, manufacturing will have to become more flexible and design will continue to grow in importance.
2.7 Conclusion

Moore's law has had an amazing run, as well as an unmeasured economic impact. While it is virtually certain that we will face its end sometime this century, it is extremely important that we extend its life as long as possible. However these barriers may ultimately be expressed economically, barriers to Moore's law have always been overcome with new technology. It may take every ounce of creativity from the engineers and scientists who populate this industry, but they have always been up to the task. Moore's law is predicated on shrinking critical features. Since the 1970s, it has always seemed that we are fast approaching the limits of what can be done, only to find that someone has come up with a new idea to get around the barrier. The "red brick wall" has proved more imaginary than real; its real effect has been to spur innovation. So what advice would Gordon give us? I had the chance to ask him several years ago on the day he entered retirement.10 One thing he wanted to point out was that he never liked the term Moore's law: "I've learned to live with the term. But it's really not a law; it's a prediction. No exponential runs forever. The key has always been our ability to shrink dimensions and we will soon reach atomic dimensions, which are an absolute limit." But the question is not if, it's when will Moore's wall appear? "Who knows? I used to argue that we would never get the gate oxide thickness below

10 Personal conversation between Dr. Gordon Moore and the author, May 24, 2001.
1000 angstroms and then later 100. Now we're below 10 and we've demonstrated 30-nm gate lengths. We can build them in the 1000's. But the real difficulty will be in figuring out how to uniformly build tens of millions of these transistors and wire them together in one chip." The key is to keep trying. Keep trying we did: as of this writing, 30-nm gates are just making it to production. Gordon felt that any solution must champion manufacturing because "there is no value in developing something that cannot be built in volume. Back at Fairchild the problem was always in getting something from research into manufacturing. So at Intel we merged the two together." He advised, "Always look for the technical advantage (in cost). I knew we could continue to shrink dimensions for many years, which would double complexity for the same cost. All we had to do was find a product that had the volume to drive our business. In the early days that was memories. We knew it was time to get out of memories when this advantage was lost. The argument at the time was that you had to be in memories because they were the technology driver. But we saw that DRAMs were going off in a different technical direction because problems in bit storage meant they had to develop all these difficult capacitor structures." He also pointed to the need to avoid dependency on specific products: "I've never been good at forecasting. I've been lucky to be in the right place at the right time and know enough to be able to take advantage of it. I always believed in microprocessors but the market wasn't big enough in the early days. Ted Hoff showed that microprocessors could be used for calculators and traffic lights and the volume could come in what we now call embedded controllers. I continued to support it despite the fact that for a long time the business was smaller than the development systems we sold to implement them. But just when memories were going out, microprocessors were coming into their own.
Success came because we always sought to use silicon in unique ways." So what did Gordon have to say about his contribution and the future of our industry? "I helped get the electronics revolution off on the right foot . . . I hope. I think the real benefits of what we have done are yet to come. I sure wish I could be here in a hundred years just to see how it all plays out." The day after this discussion with Gordon, I knew it was the first day of a new era, one without Gordon Moore's oversight. I got up that morning half-wondering whether the sun would rise again to shine on Silicon Valley. It did, reflecting Gordon Moore's ever-present optimism for the future of technology. So has Moore's law, which continues to plug on, delivering benefits to many who will never realize the important contributions of this man and his observation.
Appendix A

Moore's law governs the real limit to how fast costs can grow. Starting with the basic equations (2.1) and (2.2), the optimal component density for any given period is:

Ct = 2 · Ct−1,
where Ct = component count in period t and Ct−1 = component count in the prior period. (Note that the "−1" in the subscript denotes the prior period; it is symbolic, not used mathematically.) According to the original paper given in 1965, the minimal cost of manufacturing a chip should decrease at a rate that is nearly inversely proportional to the increase in the number of components. So the cost per component, or transistor, should be cut roughly in half for each tick of Moore's clock:

Mt = Mt−1 / 2 = 0.5 · Mt−1,
where Mt = manufacturing cost per component in period t and Mt−1 = manufacturing cost per component in the prior period. However, since this paper was first given, it has generally been believed that industry growth will not be affected if the cost per function drops by at least 30% for every doubling of transistors. Thus:

Mt = 0.7 · Mt−1,

since

Mt = Tdct / Ct  and  Mt−1 = Tdct−1 / Ct−1,

where Tdct = total die cost in period t and Tdct−1 = total die cost in the prior period. Thus,

Tdct / Ct = 0.7 · Tdct−1 / Ct−1,
Tdct / (2 · Ct−1) = 0.7 · Tdct−1 / Ct−1,
Tdct = 2 · Ct−1 · 0.7 · Tdct−1 / Ct−1.

Simplified, this reduces to:

Tdct = 2 · 0.7 · Tdct−1 = 1.4 · Tdct−1.
If the cost-per-function reduction ratio differs from 0.7, then

Tdct = 2 · Cpfr · Tdct−1,
where Cpfr = cost-per-function reduction ratio for every node, as required by the market. In general, the manufacturing cost per unit area of silicon can rise by 40% per node of Moore's law (or by twice the cost-per-function reduction ratio requirement). This includes everything from the fab cost to materials and labor. However, it does not take yield or wafer size into account. Adding these two:

Twct = 2 · Cpfr · Twct−1.

So,

Tdct = Twct / (Dpwt · Yt) = 2 · Cpfr · Twct−1 / (W · Dpwt−1 · Yr · Yt−1),

where Twct = total wafer cost requirement in period t, Twct−1 = total wafer cost in the prior period, Dpwt = die-per-wafer in period t, Yt = yielded die-per-wafer in period t, W = ratio of die added with a wafer-size change, Dpwt−1 = die-per-wafer in the prior period, Yr = yield change due to improvements with time, and Yt−1 = yielded die-per-wafer in the prior period.
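These recursions are easy to check numerically. The following sketch iterates them over several nodes and verifies their internal consistency (Mt = Tdct / Ct at every node); it is an illustration added here, not part of the original appendix:

```python
# Iterate the appendix recursions and check their internal consistency.
Cpfr = 0.7                  # cost-per-function reduction ratio from the text
C, M, Tdc = 1.0, 1.0, 1.0   # normalized component count, cost/component, die cost

for node in range(10):
    C *= 2.0                # Ct   = 2 * Ct-1
    M *= Cpfr               # Mt   = 0.7 * Mt-1
    Tdc *= 2.0 * Cpfr       # Tdct = 1.4 * Tdct-1
    # Mt = Tdct / Ct must hold at every node of the derivation.
    assert abs(M - Tdc / C) < 1e-12
```

After ten nodes the component count has grown 1,024-fold while total die cost has grown only about 29-fold, which is the economic engine the appendix describes.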
References

1. G.E. Moore, Lithography and the future of Moore's law. Proc. SPIE 2440, 2–17 (1995)
2. G.E. Moore, The future of integrated electronics. Fairchild Semiconductor (1965). This was the original internal document from which Electronics Magazine would publish "Cramming more components onto integrated circuits" in its April 1965 issue celebrating the 35th anniversary of electronics
3. G.E. Moore, Progress in digital integrated electronics, in International Electron Devices Meeting (IEEE, New York, 1975)
4. K. Marx, Capital (Progress, Moscow, 1978), Chap. 15, Sect. 2
5. J.S. Mill, Principles of Political Economy (London, 1848)
6. G. Ip, Did Greenspan push high-tech optimism on growth too far? The Wall Street Journal, December 28, 2001, pp. A1, A12
7. H.R. Huff, John Bardeen and transistor physics, in Characterization and Metrology for ULSI Technology 2000, AIP Conference Proceedings, vol. 550 (American Institute of Physics, New York, 2001), pp. 3–29
8. G.D. Hutcheson, The Chip Insider. VLSI Research Inc., September 17, 1998
9. Scientific American Interview: Gordon Moore. Scientific American, September 1997
10. W.R. Runyan, K.E. Bean, Semiconductor Integrated Circuit Processing Technology (Addison-Wesley, Reading, 1990), p. 18
11. C.E. Sporck, Spinoff (Saranac Lake, New York, 2001)
12. R.H. Dennard, Field-effect transistor memory. US Patent 3,387,286, issued June 4, 1968
13. G.D. Hutcheson, The VLSI Capital Equipment Outlook. VLSI Research Inc., 1987
14. G.D. Hutcheson, The Chip Insider. VLSI Research Inc., May 25, 2001
15. G.D. Hutcheson, J.D. Hutcheson, Technology and economics in the semiconductor industry. Sci. Am. 274, 54–62 (January 1996)
16. J. Diamond, Guns, Germs, and Steel (W.W. Norton, New York, 1997)
Part II
State-of-the-Art
3 Using Silicon to Understand Silicon

J.R. Chelikowsky
3.1 Introduction

Silicon is the material of our time. We live in the age of silicon; it is all around us in the form of electronic gadgets and computers. However, it has not always been that way. There was a time before silicon. In the 1940s, transistors did not exist at all, and in the early 1950s transistors were made of germanium, not silicon. These germanium transistors were not very reliable; in particular, they were difficult to package and process, and they worked over a very limited temperature range. This changed by the mid-1950s. On May 10, 1954, Texas Instruments announced the invention of the silicon transistor:

A revolutionary new electronic product – long predicted and awaited – became a reality today with the announcement by Texas Instruments Incorporated of the start of commercial production on silicon transistors. By using silicon instead of germanium, the initial commercial silicon transistor immediately raises power outputs and doubles operating temperatures! The potential application of this entirely new transistor is so great that major electronics firms have been conducting silicon experiments for some time. – Texas Instruments Press Release.1

Within ten years of the invention of the silicon transistor, a silicon chip containing over 2000 transistors was constructed. In contrast, the current Pentium-4 processor made by Intel contains 42 million transistors. By the end of this decade, we should see a processor containing one billion transistors. Today's silicon-based computers are capable of teraflop performance,2 and the next generation will exhibit improvements of several orders of magnitude. One can contrast this with the computers available before transistor computers were developed. At best, vacuum-tube computers were capable of several kiloflops, roughly eight orders of magnitude slower than contemporary computers! This

1 http://www.ti.com/corp/docs/company/history/siltransproduction.shtml.
2 A “teraflop” corresponds to one trillion floating point operations per second.
increase in computing speed of about an order of magnitude every decade coincides with Moore's law, which states that the number of transistors per integrated circuit will double every 18 months [1]. This is an amazing and unprecedented progression of technology, and it has affected all of science. A revolution in our understanding of the theory of materials has accompanied this technological revolution. Silicon has served both as a testing-ground material and as the basic material of the computational tools used to study the properties of materials [1]. One measure of scientific impact is to examine the technical literature. Suppose we examine the number of scientific papers on silicon published since the discovery of the silicon chip. A quick search of the scientific and engineering literature suggests that over a quarter of a million papers mentioning silicon have been written over the last 30 years [1]. Of course, many of these papers are not directly related to silicon technology or silicon science, but a good fraction of them are. For anyone interested in electronic-materials research, this is a "treasure trove" of information. Specifically, this vast database can be used to test and benchmark theoretical methodology and approximations. This is an imperative activity, as the quantum theory of materials deals with extraordinarily complex systems. In a macroscopic crystal of silicon, one has on the order of 10²³ electrons and nuclei. In principle, applying the known laws of quantum mechanics would allow one to predict all physical and chemical properties of such a system. However, given the astronomical number of particles and corresponding degrees of freedom, it is absolutely hopeless to extract physically meaningful results without some dramatic approximations. Which approximations will work, and how well? One avenue for resolving this question is to test methods against the silicon database as a reference.
This has been the route taken by most condensed-matter theorists, at least those interested in the electronic properties of solids (and liquids). The confluence of these two megatrends (advances in materials and in computing power), both resulting from the study of silicon, has given materials physicists the ability to apply new concepts, which can be tested against the silicon database, and to invent new algorithms, which can be implemented on high-performance computing platforms. The combination of new ideas and new avenues for computing allows us to predict the properties of materials solely from theoretical constructs. While few would argue that such approaches will replace experimentation in the near term, it is possible to use computers to predict some properties more accurately than experimentation can.
3.2 The Electronic Structure Problem

3.2.1 The Empirical Pseudopotential Method

The first realistic energy bands for electronic materials were constructed using silicon spectroscopic data, i.e., reflectivity and photoemission data. In particular, the energy bands were fixed using the solution of a one-electron Schrödinger equation
Fig. 3.1. Pseudopotential model of a solid. The ion cores (nucleus plus core electrons) are chemically inert. The pseudopotential accurately reproduces the electronic states of the valence electrons
[2, 3]. These band diagrams established a framework that allowed one to interpret and understand transport and optical properties of semiconductors. In order to solve for the energy bands of silicon, the electronic interactions have to be accurately described. When the first energy bands were constructed, the understanding of these interactions was limited and was not amenable to calculations from “first principles.” Several approximations were required. One approximation was to assume that the many-electron problem could be mapped on a one-electron problem. This approximation was justified by the Hartree approximation [3]. Another approximation was to separate the chemically active valence states from the chemically inert core electrons. This resulted in the so-called “pseudopotential” approximation. The pseudopotential model of solids is illustrated in Fig. 3.1. In the 1960s, these pseudopotentials were developed and applied to silicon and related materials such as other semiconductors and simple metals [2]. Establishing the accuracy of these potentials was a key test and the results were strikingly successful [3, 4]. One could accurately replicate the experimental results with just a few parameters. Pseudopotentials can be formally justified by the Phillips–Kleinman cancellation theorem [5], which states that the orthogonality requirement of the valence states to the core can be translated into an effective kinetic term in the potential. This repulsive term cancels the strong attractive part of the Coulomb potential. The resulting pseudopotential can be written as a Fourier expansion in plane waves: Vpa (G)S(G) exp(iG · r), (3.1) Vp (r) = G
where $V_p^a(\mathbf{G})$ is the atomic form factor, i.e., the Fourier coefficient of the pseudopotential, $S(\mathbf{G})$ is the structure factor, and $\mathbf{G}$ is a reciprocal lattice vector. The form
J.R. Chelikowsky
Fig. 3.2. Diamond crystal structure
factors are functions only of the magnitude of the reciprocal lattice vector, provided one assumes a spherical pseudopotential, $V_p^a(r)$, centered on each atom. For silicon in the diamond crystal structure (see Fig. 3.2), the structure factor is given by $S(\mathbf{G}) = \cos(\mathbf{G}\cdot\boldsymbol{\tau})$, where $\boldsymbol{\tau} = a(1,1,1)/8$ is the basis vector and $a$ is the lattice parameter. For Si, $a = 5.43$ Å. For a semiconductor such as silicon, the sum over reciprocal lattice vectors converges so rapidly that only three unique form factors are required to define the potential. The form factors can be extracted from model potentials based on atomic spectra and then fitted to experiment. The available experimental data greatly exceed the number of form factors, so it is not possible to obtain an arbitrary band configuration. The band structure method that requires fitting the form factors to experiment is called the "empirical pseudopotential method," or EPM [2–4]. Most of our understanding of the electronic structure of semiconductors is derived from this method. One of the first applications of the EPM was to determine the energy bands of silicon. The three form factors required were $V_p(G^2 = 3, 8, 11)$, where $G$ is in units of $2\pi/a$. As an example, typical values for the form factors are $V_p(G^2 = 3, 8, 11) = -0.112, 0.028, 0.036$ atomic units,³ respectively. A plane wave basis in Bloch form is commonly used:
$$\psi_{n,\mathbf{k}}(\mathbf{r}) = \sum_{\mathbf{G}} \alpha_n(\mathbf{k},\mathbf{G}) \exp\!\big[i(\mathbf{k}+\mathbf{G})\cdot\mathbf{r}\big], \qquad (3.2)$$
where $\mathbf{k}$ is the wave vector and $n$ is the band index. Pseudopotentials for a semiconductor such as Si require no more than about 50 $\mathbf{G}$-vectors for a converged solution. A matrix of this size is diagonalized to extract the energy bands as a function of $\mathbf{k}$:
$$\det\left| \left[\frac{1}{2}(\mathbf{k}+\mathbf{G})^2 - E_n(\mathbf{k})\right]\delta_{\mathbf{G},\mathbf{G}'} + V_p\big(|\mathbf{G}-\mathbf{G}'|\big)\, S(\mathbf{G}-\mathbf{G}') \right| = 0, \qquad (3.3)$$

³ In atomic units (a.u.), $e = \hbar = m = 1$. Energies are given in hartrees (1 $E_h$ = 27.2 eV) and the atomic unit of length is the bohr (1 $a_0$ = 0.529 Å).
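Equation (3.3) can be solved directly with a few lines of linear algebra. The sketch below is a minimal NumPy illustration, not the authors' code: it assumes the local, spin-free form of the EPM with the three Si form factors quoted above (in hartree) and a plane-wave cutoff of $|G|^2 \le 11$, which yields 51 plane waves.

```python
import numpy as np

BOHR = 0.529177   # Å per bohr
A = 5.43 / BOHR   # Si lattice constant in bohr
HARTREE = 27.2    # eV per hartree

# The three empirical form factors (hartree) for |G|^2 = 3, 8, 11
# (G in units of 2*pi/a), as quoted in the text.
FORM = {3: -0.112, 8: 0.028, 11: 0.036}

def g_vectors(gmax2=11):
    """Integer fcc reciprocal-lattice vectors (h,k,l) in units of 2*pi/a:
    components must be all even or all odd (diamond structure)."""
    gs = []
    for h in range(-3, 4):
        for k in range(-3, 4):
            for l in range(-3, 4):
                parities = {abs(h) % 2, abs(k) % 2, abs(l) % 2}
                if len(parities) == 1 and h*h + k*k + l*l <= gmax2:
                    gs.append((h, k, l))
    return np.array(gs, dtype=float)

def epm_hamiltonian(kpt, gs):
    """Plane-wave EPM Hamiltonian of eq. (3.3), in hartree; kpt in 2*pi/a units."""
    n = len(gs)
    H = np.zeros((n, n))
    kin_scale = 0.5 * (2 * np.pi / A) ** 2  # converts |k+G|^2 to hartree
    for i in range(n):
        H[i, i] = kin_scale * np.sum((kpt + gs[i]) ** 2)
        for j in range(i + 1, n):
            dg = gs[i] - gs[j]
            g2 = int(round(np.dot(dg, dg)))
            if g2 in FORM:
                # S(G) = cos(G . tau), tau = (a/8)(1,1,1)
                s = np.cos(np.pi * np.sum(dg) / 4.0)
                H[i, j] = H[j, i] = FORM[g2] * s
    return H

gs = g_vectors()
E = np.linalg.eigvalsh(epm_hamiltonian(np.zeros(3), gs)) * HARTREE  # at Gamma
print(f"{len(gs)} plane waves; direct gap at Gamma ~ {E[4] - E[3]:.2f} eV")
```

At $\Gamma$ the four lowest eigenvalues are the valence states; the separation between the fourth and fifth levels is the direct $\Gamma_{25'} \to \Gamma_{15}$ gap, which comes out in the few-eV range expected for silicon.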
Fig. 3.3. Energy band structure of crystalline silicon. Two different band structures are illustrated, corresponding to two different pseudopotentials. See Cohen and Chelikowsky [2] for details
where atomic units are used. Equation (3.3) can easily be solved on a laptop computer. In Fig. 3.3, the energy band structure of silicon is illustrated for two different pseudopotentials. The energy bands are similar near the gap, but the valence band width is different. The procedure is empirical because the potentials can be modified to bring the energy bands into agreement with experiment. Given the wave functions and energy bands, it is possible to determine the optical response of the system. A simple approach is to use the Ehrenreich–Cohen dielectric function [6]:
$$\epsilon_2(\omega) = \frac{4\pi^2}{3\omega^2} \sum_{cv} \int_{BZ} \frac{2}{(2\pi)^3}\, \big|M_{cv}(\mathbf{k})\big|^2\, \delta\big(\omega_{cv}(\mathbf{k}) - \omega\big)\, d^3k, \qquad (3.4)$$
where the sum is over all transitions from the filled valence bands ($v$) to the empty conduction bands ($c$), $\omega_{cv}(\mathbf{k}) = E_c(\mathbf{k}) - E_v(\mathbf{k})$, and the integration is over all $\mathbf{k}$ points in the Brillouin zone. The dipole matrix element is given by $|M_{cv}(\mathbf{k})|^2 = |\langle c\mathbf{k}|\nabla|v\mathbf{k}\rangle|^2$. The real part of the dielectric function, $\epsilon_1$, is determined by causality, i.e., by a Kramers–Kronig transformation. Given the dielectric function $\epsilon = \epsilon_1 + i\epsilon_2$, one can write the normal incidence reflectivity in terms of the complex index of refraction, $N$, where $N^2 = \epsilon$, as
$$R(\omega) = \left|\frac{N(\omega) - 1}{N(\omega) + 1}\right|^2. \qquad (3.5)$$
Owing to the quasi-continuous nature of energy bands, reflectivity spectra for solids are not highly structured. This is in strong contrast to atomic
Fig. 3.4. Modulated reflectivity spectra of silicon. Top panel is experiment from Zucca and Shen [51]. The bottom panel gives two theoretical spectra corresponding to the band structures in Fig. 3.3
spectra, which are highly structured with discrete absorption and emission lines. However, it is possible to enhance structural features in the spectra of solids by taking the numerical derivative of the spectrum with respect to an external parameter or to the wavelength of the light. Such "modulated" spectra show structure associated with the van Hove singularities of the band structure [2]. In Fig. 3.4, the wavelength-modulated reflectivity spectrum of silicon is illustrated along with the calculated spectrum. Overall, the agreement is satisfactory, especially with respect to the peak positions. The EPM calculations do not include spin–orbit coupling, which in the case of silicon is small. The "doublet" peak in the experimental spectrum around 3.4 eV arises from spin–orbit effects. Also, the Ehrenreich–Cohen dielectric function does not include local fields [2], nor does it include excitonic effects, i.e., electron–hole interactions. Excitons play only a small role in silicon in terms of altering the band gap, but may have profound effects on the spectral line shape, especially near threshold. This likely accounts for the "sharpness" of the reflectivity spectrum at low energies.
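Equations (3.4)–(3.5) can be illustrated with a toy dielectric function. The sketch below uses a single Lorentz oscillator (illustrative parameters, not silicon data) to generate $\epsilon = \epsilon_1 + i\epsilon_2$, the normal-incidence reflectivity of (3.5), and a numerically differentiated "modulated" spectrum:

```python
import numpy as np

# Toy Lorentz-oscillator dielectric function: one resonance standing in
# for an interband transition. All parameters are illustrative.
w = np.linspace(0.1, 10.0, 500)    # photon energy (eV)
w0, wp, gamma = 3.4, 10.0, 0.4     # resonance, strength, broadening (eV)
eps = 1.0 + wp**2 / (w0**2 - w**2 - 1j * gamma * w)

# Eq. (3.5): reflectivity from the complex index of refraction N = sqrt(eps)
N = np.sqrt(eps)
R = np.abs((N - 1) / (N + 1)) ** 2

# "Modulated" spectrum: the numerical derivative dR/dw sharpens structure
dRdw = np.gradient(R, w)
print(f"peak reflectivity {R.max():.2f} near {w[R.argmax()]:.2f} eV")
```

The high-reflectivity band just above the resonance, where $\epsilon_1$ is negative, mimics the metallic-like reflectance of a semiconductor above its strong interband transitions.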
Values for the form factors of other semiconductors can be found in the literature [2–4]. The EPM does equally well, if not better, in describing the optical features of semiconductors other than Si. It is not unusual for the energy bands to agree with experiment to within ~0.25 eV over a ~15 eV range, i.e., to within a few percent. Suppose the EPM had failed the "Si test" and not yielded an accurate energy band description for this well-known material. Such a failure would have had profound ramifications; it would have suggested that a one-electron description of semiconducting materials was not possible.

3.2.2 Ab Initio Pseudopotentials and the Electronic Structure Problem

While "empirical pseudopotentials" can be used for systems with high symmetry, their use for complex systems becomes problematic. The lack of symmetry means that the number of form factors can be quite large; too large to be fixed by experiment. While it is possible to fix a model potential and extract form factors for an arbitrary reciprocal lattice vector, the transferability of such potentials is often uncertain. The effects of charge transfer and hybridization are not present in empirical pseudopotentials, save for the system for which the pseudopotentials were fit. As an extreme example, Na and Cl pseudopotentials fit to the standard states of Na (a metal) and Cl (a gas composed of covalent molecules) will not likely result in an accurate description of crystalline NaCl (an ionic solid). In most contemporary work, electronic potentials have been fixed not by experiment, but from "first principles." These first-principles pseudopotentials often rely on density functional theory, which is "exact" in principle, but in practice relies on a variety of approximations such as the local density approximation. In this approach, a solution of the electronic structure problem is obtained from the Kohn–Sham equation [7, 8]:
$$\left[\frac{-\hbar^2 \nabla^2}{2m} + V_{\mathrm{ion}}^p(\mathbf{r}) + V_H(\mathbf{r}) + V_{xc}(\mathbf{r})\right]\psi_n(\mathbf{r}) = E_n \psi_n(\mathbf{r}), \qquad (3.6)$$
where $V_{\mathrm{ion}}^p$ is the ion-core pseudopotential, $V_H$ is the Coulomb or Hartree potential, and $V_{xc}$ is the effective exchange-correlation potential. The Hartree potential is obtained by solving a Poisson equation:
$$\nabla^2 V_H(\mathbf{r}) = -4\pi e\, \rho(\mathbf{r}), \qquad (3.7)$$
where $\rho$ is the charge density, given by
$$\rho(\mathbf{r}) = -e \sum_{\mathrm{occup},\,n} |\psi_n(\mathbf{r})|^2. \qquad (3.8)$$
The summation is over all occupied states. Within the local density approximation, the $V_{xc}$ potential is a functional of the charge density: $V_{xc} = V_{xc}[\rho]$. Solving the Kohn–Sham problem corresponds to constructing a self-consistent screening potential ($V_H$ plus $V_{xc}$) based on the charge density. The ion-core pseudopotential is based on an atomic calculation, which is easy to implement
Fig. 3.5. All electron and pseudopotential wave function for the 3s state in silicon. The all electron 3s state has nodes which arise because of an orthogonality requirement to the 1s and 2s core states
[9–11]. In the case of silicon, the ion-core pseudopotential corresponds to the nuclear charge plus the screening potential from the core electrons ($1s^2 2s^2 2p^6$). The ion-core pseudopotential, when so screened by the valence charge, will yield the same solution as the all-electron potential, save for the charge density near the core region. A variety of methods exist to construct pseudopotentials [9]. Almost all these methods are based on "inverting" the Kohn–Sham equation. As a simple example, suppose we consider an atom for which we know the valence wave function, $\psi_v$, and the valence energy, $E_v$. Let us replace the true valence wave function by an approximate pseudo-wave function, $\phi_v^p$. Then the ion-core pseudopotential is given by
$$V_{\mathrm{ion}}^p = E_v - V_H - V_{xc} + \frac{\hbar^2}{2m}\, \frac{\nabla^2 \phi_v^p}{\phi_v^p}. \qquad (3.9)$$
The charge density in this case is $\rho = |\phi_v^p|^2$, from which $V_H$ and $V_{xc}$ can be calculated. The key aspect of this inversion is choosing $\phi_v^p$ to meet several criteria, e.g., $\phi_v^p = \psi_v$ outside the core radius, $r_c$. In Fig. 3.5, the "all-electron" wave function for the silicon 3s state is shown and compared to the "pseudo" 3s state. Unlike the all-electron potential, pseudopotentials are not simple functions of position. For example, the pseudopotential is state dependent, or angular momentum dependent, i.e., in principle one has a different potential for $s, p, d, \ldots$ states. Details can be found in the literature [9, 11].
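The inversion in (3.9) is easy to demonstrate in one dimension. The sketch below is a toy example (Hartree and exchange-correlation terms omitted): given the nodeless wave function $\psi = e^{-x^2/2}$ with $E = 1/2$ a.u., inverting the Schrödinger equation recovers the harmonic potential $x^2/2$:

```python
import numpy as np

# Toy 1D illustration of "inverting" eq. (3.9): given a nodeless wave
# function and its energy, recover the potential. V_H and V_xc are omitted.
x = np.linspace(-3, 3, 601)
h = x[1] - x[0]
psi = np.exp(-x**2 / 2)   # ground state of the harmonic oscillator (a.u.)
E = 0.5

# Second derivative by central differences (interior points only)
d2psi = (psi[2:] - 2 * psi[1:-1] + psi[:-2]) / h**2

# From -psi''/2 + V psi = E psi:  V = E + psi''/(2 psi)
V = E + d2psi / (2 * psi[1:-1])
err = np.max(np.abs(V - 0.5 * x[1:-1]**2))
print(f"max deviation of inverted potential from x^2/2: {err:.2e}")
```

The same logic, applied radially with the screening terms restored, is what pseudopotential generation codes do: the smooth pseudo-wave function is chosen first, and the potential that produces it is computed afterward.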
Fig. 3.6. Total electronic energy for polymorphs of crystalline silicon. The volume is normalized by the experimental value. The absolute energy scale is set by the details of the pseudopotential [12]
Once the Kohn–Sham equation is solved, the total electronic energy of the system can be obtained from knowledge of the energy levels and wave functions:
$$E_{\mathrm{total}} = \sum_{\mathrm{occup},\,n} E_n - \frac{1}{2}\int V_H\, \rho\, d^3r + \int (E_{xc} - V_{xc})\, \rho\, d^3r + E_{\mathrm{ion\text{-}ion}}. \qquad (3.10)$$
The sum is over all occupied states. The second term subtracts off the double-counting terms. The third term subtracts off the exchange-correlation potential and adds in the correct energy density functional. The last term is the ion–ion core repulsion term. Silicon played a crucial role in assessing the validity of this approach, which has proved to be reasonably accurate for electronic materials. Typically, bond lengths can be calculated to within a few percent, although chemically accurate bond energies are more problematic. Initial applications of first-principles pseudopotentials constructed within the local density approximation included studies of silicon polymorphs under pressure [12]. Specifically, Yin and Cohen [12] were the first to show the accuracy of the local density approximation for the total electronic energy of silicon in various crystalline forms. Their work is illustrated in Fig. 3.6. They found that the lowest energy structure calculated was for silicon in the diamond structure, as expected. This work produced some rather remarkable predictions. For example, pseudopotential studies predicted that some high-pressure forms of silicon would be superconducting [13]. This prediction was later confirmed by experiment. More recently, the first molecular dynamics simulations using quantum forces were done on
silicon, including the first theoretical studies to examine the electronic and structural properties of liquid silicon and amorphous silicon [14, 15]. While the theoretical background for calculating ground state properties of many-electron systems is now well established, excited-state properties such as optical spectra present a challenge for density functional theory. Recently developed linear response theory within the time-dependent density-functional formalism provides a new tool for calculating excited-state properties [16–20]. This method, known as the time-dependent local density approximation (TDLDA), allows one to compute the true excitation energies from the conventional, time-independent Kohn–Sham transition energies and wave functions. Within the TDLDA, the electronic transition energies $\Omega_n$ are obtained from the solution of the following eigenvalue problem [18, 19]:
$$\sum_{kl\tau}\left[\omega_{ij\sigma}^2\, \delta_{ik}\delta_{jl}\delta_{\sigma\tau} + 2\sqrt{f_{ij\sigma}\omega_{ij\sigma}}\; K_{ij\sigma,kl\tau}\; \sqrt{f_{kl\tau}\omega_{kl\tau}}\right] F^n = \Omega_n^2 F^n, \qquad (3.11)$$
where $\omega_{ij\sigma} = \epsilon_{j\sigma} - \epsilon_{i\sigma}$ are the Kohn–Sham transition energies, $f_{ij\sigma} = n_{i\sigma} - n_{j\sigma}$ are the differences between the occupation numbers of the $i$-th and $j$-th states, the eigenvectors $F^n$ are related to the transition oscillator strengths, and $K_{ij\sigma,kl\tau}$ is a coupling matrix given by
$$K_{ij\sigma,kl\tau} = \int\!\!\int \phi_{i\sigma}^*(\mathbf{r})\,\phi_{j\sigma}(\mathbf{r}) \left[\frac{1}{|\mathbf{r}-\mathbf{r}'|} + \frac{\partial v_\sigma^{xc}(\mathbf{r})}{\partial \rho_\tau(\mathbf{r}')}\right] \phi_{k\tau}^*(\mathbf{r}')\,\phi_{l\tau}(\mathbf{r}')\, d\mathbf{r}\, d\mathbf{r}', \qquad (3.12)$$
where $i, j, \sigma$ are the occupied state, unoccupied state, and spin indices, respectively, $\phi(\mathbf{r})$ are the Kohn–Sham wave functions, and $v^{xc}(\mathbf{r})$ is the LDA exchange-correlation potential. Some of the first applications of TDLDA were made to silicon clusters and quantum dots [21].
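The structure of eigenproblem (3.11) can be seen in a two-transition toy model; the transition energies, occupations, and coupling matrix below are arbitrary illustrative numbers, not TDLDA output:

```python
import numpy as np

# Toy Casida-type problem, eq. (3.11), with two spin-saturated transitions.
# All numerical values are illustrative, not silicon data.
omega = np.array([0.30, 0.45])   # KS transition energies (hartree)
f = np.array([2.0, 2.0])         # occupation-number differences
K = np.array([[0.05, 0.02],      # symmetric coupling matrix (hartree)
              [0.02, 0.04]])

# Build  omega^2 * delta + 2 sqrt(f*omega) K sqrt(f*omega)  and diagonalize
s = np.sqrt(f * omega)
C = np.diag(omega**2) + 2.0 * np.outer(s, s) * K
Omega = np.sqrt(np.linalg.eigvalsh(C))   # true excitation energies

print("KS transitions :", omega)
print("TDLDA energies :", np.round(Omega, 4))
```

With a positive (repulsive) coupling matrix, both excitation energies are shifted above the bare Kohn–Sham transition energies, which is the qualitative effect of the TDLDA correction.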
3.3 New Algorithms for the Nanoscale: Silicon Leads the Way

The Kohn–Sham problem, cast within the pseudopotential-density functional formalism, is fairly easy to solve for simple elemental crystals such as silicon. For crystalline materials, the number of degrees of freedom is dramatically reduced by symmetry, i.e., by the use of Bloch wave functions, where $\mathbf{k}$ is a good quantum number [10, 11]. However, for systems that lack symmetry, the Kohn–Sham problem remains quite difficult. Examples include confined systems, such as fragments of the bulk crystal, and extended systems, such as an amorphous solid or a liquid. In these cases, a solution of the Kohn–Sham equation involves many degrees of freedom and often scales poorly with the number of electrons in the system. Any solution of this complex problem must be handled by making several numerical and physical approximations [10, 11]. Computing the electronic structure of a known material such as silicon can test these approximations. A popular algorithm for describing the electronic structure of a localized system is based on a real space description. This algorithm solves the Kohn–Sham equation on a grid in real space [22, 23]. This method was first tested against traditional solutions for silicon clusters and quantum dots. The real space approach has become
popular, and several groups have implemented different variations of this general approach [24–27]. We illustrate a particular version of the real space approach based on high-order finite differencing [22, 23, 27]. This approach overcomes many of the complications involved with non-periodic systems, such as replicating the vacuum, and although the resulting matrices can be larger than with other methods such as plane waves, the matrices are sparse. Real space methods are also easier to parallelize than plane wave methods. Even on sequential machines, we find that real space methods can be an order of magnitude faster than plane wave methods [22, 23, 27]. Real space algorithms avoid the use of fast Fourier transforms (FFTs) by performing all calculations in real space instead of Fourier space. A benefit of avoiding FFTs is that these approaches require very few global communications. A key aspect of the success of the finite difference method is the availability of high-order finite difference expansions for the kinetic energy operator [28]. High-order finite difference methods significantly improve convergence of the eigenvalue problem when compared with standard finite difference methods. If one imposes a simple, uniform grid on the system, where the points are described in a finite domain by $(x_i, y_j, z_k)$, we approximate
$$\frac{\partial^2 \Psi}{\partial x^2}(x_i, y_j, z_k) \approx \sum_{n=-M}^{M} C(n)\, \Psi(x_i + nh, y_j, z_k) + O\big(h^{2M+2}\big), \qquad (3.13)$$
where $h$ is the grid spacing, $C(n)$ are the expansion coefficients, and $M$ gives the number of neighboring points. The expansion approximation for the Laplacian is accurate to $O(h^{2M+2})$, given the assumption that $\Psi$ can be accurately approximated by a power series in $h$. This is a good assumption if pseudopotentials are used, as the resulting wave functions are smoothly varying. Algorithms are available to compute the coefficients for arbitrary order in $h$ [28]. With the kinetic energy operator expanded as in (3.13), one can set up the Kohn–Sham equation over a grid. A typical uniform grid configuration for calculating the electronic structure of a localized system is illustrated in Fig. 3.7. In this illustration, a cluster is shown. Outside a given domain the wave functions must vanish. A uniform grid is not a requirement for this procedure, but the problem is considerably more difficult to implement when the grid is non-uniform. For example, if the atoms are moved, the grid should be re-optimized. Moreover, the convergence of the eigenvalue problem is no longer dependent on one grid parameter, but may require an optimization in a multi-parameter space. $\Psi_n$ is computed on the grid by solving the eigenvalue problem:
$$\frac{-\hbar^2}{2m} \sum_{n_1,n_2,n_3=-M}^{M} C(n_1,n_2,n_3)\, \Psi_n(x_i + n_1 h,\, y_j + n_2 h,\, z_k + n_3 h) + \left[V_{\mathrm{ion}}^p(x_i,y_j,z_k) + V_H(x_i,y_j,z_k) + V_{xc}(x_i,y_j,z_k)\right] \Psi_n(x_i,y_j,z_k) = E_n \Psi_n(x_i,y_j,z_k). \qquad (3.14)$$
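The expansion (3.13) is easy to test numerically. The coefficients $C(n)$ below are the standard published central-difference values for $M = 1, 2, 3$ [28]; applying them to $\sin(x)$, whose second derivative is known, shows the error falling rapidly with $M$:

```python
import numpy as np

# Central-difference coefficients C(n) for the second derivative, eq. (3.13),
# for expansion orders M = 1, 2, 3 (standard tabulated values).
COEFFS = {
    1: [1.0, -2.0, 1.0],
    2: [-1/12, 4/3, -5/2, 4/3, -1/12],
    3: [1/90, -3/20, 3/2, -49/18, 3/2, -3/20, 1/90],
}

def d2(f, x, h, M):
    """Approximate f''(x) with a (2M+1)-point high-order finite difference."""
    return sum(c * f(x + (n - M) * h) for n, c in enumerate(COEFFS[M])) / h**2

h, x = 0.1, 1.0
exact = -np.sin(x)   # since f = sin, f'' = -sin
errs = []
for M in (1, 2, 3):
    err = abs(d2(np.sin, x, h, M) - exact)
    errs.append(err)
    print(f"M={M}: error {err:.2e}")
```

Each increment of $M$ buys several orders of magnitude of accuracy at fixed grid spacing, which is why a modest stencil suffices for smooth pseudo-wave functions.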
Fig. 3.7. Schematic of a real space grid for calculating the electronic structure of a localized system. The system of interest is placed in a large domain on which a uniform grid is set down. The wave functions of the system are constrained to vanish outside the domain
There are a number of difficulties that emerge when solving the discretized eigenproblems, besides the sheer size of the matrices. The first, and biggest, challenge is that the number of required eigenvectors is proportional to the number of atoms in the system. This number can grow into the thousands, if not more, for systems such as quantum dots. In addition to storage issues, maintaining the orthogonality of these vectors can be very demanding. Usually, the most computationally expensive part of diagonalization codes is orthogonalization. Second, the relative separation of the eigenvalues decreases as the matrix size increases, and this has an adverse effect on the rate of convergence of the eigenvalue solvers. Preconditioning techniques attempt to alleviate this problem. Real space codes benefit from savings brought about by not needing to store the Hamiltonian matrix, although this benefit may be offset by the need to store large vector bases.
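The sparsity point can be made concrete with a toy one-dimensional "particle in a box" on a uniform grid. This is a minimal sketch, not a production solver: real codes work in three dimensions with pseudopotentials and iterative eigensolvers, and never form the dense matrix built here for illustration.

```python
import numpy as np

# 1D particle in a box on a uniform grid (atomic units). The grid
# Hamiltonian is overwhelmingly sparse; here it is diagonalized densely
# only because the toy problem is small.
L, n = 10.0, 400          # box length and number of interior grid points
h = L / (n + 1)
H = np.zeros((n, n))
i = np.arange(n)
H[i, i] = 1.0 / h**2                   # diagonal of -1/2 d^2/dx^2
H[i[:-1], i[:-1] + 1] = -0.5 / h**2    # off-diagonals; wave functions
H[i[:-1] + 1, i[:-1]] = -0.5 / h**2    # vanish outside the domain

sparsity = np.count_nonzero(H) / H.size
vals = np.linalg.eigvalsh(H)[:4]       # four lowest levels

exact = np.array([(k * np.pi / L) ** 2 / 2 for k in (1, 2, 3, 4)])
print(f"nonzero fraction {sparsity:.3%}; lowest levels {np.round(vals, 4)}")
```

Fewer than 1% of the matrix entries are nonzero even in 1D, and the computed levels match the analytic box energies $k^2\pi^2/2L^2$ to a fraction of a percent; in 3D the nonzero fraction is smaller still, which is what makes iterative, matrix-free solvers attractive.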
3.4 Optical Properties of Silicon Quantum Dots The TDLDA formalism is easy to implement in real space within the higher-order finite difference pseudopotential method [29]. The real space pseudopotential code represents a natural choice for implementing TDLDA due to the real space formulation of the general theory. With other methods, such as the plane wave approach, TDLDA calculations typically require an intermediate real space basis. After the original plane wave calculation has been completed, all functions are transferred into that basis, and the TDLDA response is computed in real space [30]. The additional basis complicates calculations and introduces an extra error. The real space approach
Fig. 3.8. Model of hydrogenated silicon quantum dot. The dot’s surface is passivated with hydrogen atoms to remove electronically active dangling bond states
simplifies implementation and allows us to perform the complete TDLDA response calculation in a single step. One of the first applications of TDLDA centered on hydrogenated silicon clusters and quantum dots. Crystalline silicon is not a very effective optical material. The optical gap in silicon is indirect, i.e., the valence band maximum and the conduction band minimum do not occur at the same wave vector $\mathbf{k}$. This does not allow a direct optical transition, as wave vector momentum cannot be conserved without invoking lattice vibration contributions, or phonons. However, as the size of the system approaches that of the electron–hole interaction, Bloch's theorem ceases to hold. At this point, silicon is transformed into an optically active material. This occurs at length scales on the order of a few nanometers. Porous silicon is an example of an optically active form of silicon, owing to the quantum confinement of the electron–hole pairs [31–34]. Silicon nanocrystals are another form of silicon where the optical properties can be strongly modified. An example of such a system is given in Fig. 3.8. In this example, a nanocrystal of silicon is hydrogenated to remove the electronically active states. A variety of methods have been developed to examine the role of optical confinement in silicon: tight binding [35, 36], empirical pseudopotentials [37–39], quantum Monte Carlo [40], and density functional methods, including time-dependent density functional theory [21, 41, 42]. In the tight binding and empirical pseudopotential methods, parameters from crystalline silicon are extrapolated or scaled to the nano-size regime. This represents a drawback of these theories, as it is problematic how the electron–hole and surface terms should be scaled.

Fig. 3.9. The theoretical optical gap for nanocrystalline silicon as a function of size. Results from tight binding (TB) [35, 52], quasi-particle excitations from density functional theory (QP-LDA) [34], time-dependent local density approximation (TDLDA) [21], and empirical pseudopotential method (EPM) [53]. With the notable exception of the EPM results, the predicted gaps are in good agreement

Quantum Monte Carlo (QMC) methods are in principle exact, but do not generate an optical spectrum and, as such, are quite limited in giving insight into higher excited states. They yield the lowest energy excitations, but current QMC implementations do not provide the oscillator strength. Density functional theory approaches can be used in the static limit. The optical gap is then calculated in two steps. First, the energy to create a non-interacting electron–hole pair is calculated by finding the electron affinity ($A$) and the ionization energy ($I$). The difference, $I - A$, yields the "quasi-particle" gap. This quantity cannot be compared to the optical gap for small dots without including excitonic energies. A simple approach is to compute the electron–hole coulombic energy and add it to the quasi-particle gap. However, it is difficult to include the electron–hole interaction properly. Some attempts have been made along these lines by evaluating the dielectric matrix within density functional theory [43]. In principle, time-dependent density functional theory captures the collective response of the system. The only additional approximation beyond the local density approximation itself is that the system follows the applied field adiabatically [44], i.e., within TDLDA the interactions are assumed to be local in space and time. This assumption may not properly reproduce the excitonic interactions, and concerns have been raised as to whether the method scales properly with size [44, 45]. Despite these reservations, the agreement between the various methods is quite satisfactory, as illustrated in Fig. 3.9. All show a strong blue shift of the absorption edge as the size of the dot is decreased. This trend can be obtained from simple "particle in a box" arguments, i.e., as the size of the box shrinks, the energy levels increase in size and separation. Although this argument is not quite correct in that it assumes
an infinite potential barrier outside the dot, it is qualitatively correct. Typically, the gaps scale as $R^{-n}$, where $n \approx 1.5$. The difference between the EPM and the other methods in Fig. 3.9 may arise from an incorrect transfer of parameters related to the electron–hole interactions from the crystal to the quantum dot. Also, in this particular approach, quantum and classical electrostatic terms are combined in an ad hoc manner. The experimental situation is complex, as a number of different synthesis methods have been used to produce nanocrystals, e.g., see [33, 46]. However, these methods yield gaps that agree in general with those shown in Fig. 3.9.
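The exponent $n$ in a scaling law of this kind is typically extracted by a log-log fit of gap-versus-size data. The sketch below uses synthetic data with $n = 1.5$ built in; the radii, prefactor, and noise level are illustrative numbers, not the published values:

```python
import numpy as np

# Extracting the confinement exponent n from E_gap(R) = E_bulk + C * R^(-n)
# via a log-log fit. The data points are synthetic (n = 1.5 plus 2% noise).
rng = np.random.default_rng(0)
E_bulk = 1.17                                   # bulk Si gap (eV)
R = np.array([0.6, 0.8, 1.0, 1.5, 2.0, 3.0])   # dot radius (nm)
E_gap = E_bulk + 3.0 * R**-1.5 * rng.normal(1.0, 0.02, R.size)

# Slope of log(E_gap - E_bulk) vs log(R) gives -n
n_fit, logC = np.polyfit(np.log(R), np.log(E_gap - E_bulk), 1)
print(f"fitted exponent n = {-n_fit:.2f}")
```

Subtracting the bulk gap before taking logarithms matters: fitting $\log E_{\mathrm{gap}}$ directly would bias the exponent at large radii, where the confinement term is a small correction.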
3.5 Doping Silicon Nanocrystals

In nanocrystals or quantum dots, where the motion of electrons (or holes) is limited in all three dimensions, one expects that both electronic and optical properties will be affected. For example, in bulk semiconductors, shallow donors (or acceptors) are crucial in determining the transport properties required to construct electronic devices [47]. However, these properties are significantly altered in highly confined systems such as quantum dots. Important questions arise as to whether dopants will continue to play a role similar to that in bulk semiconductors in the nano-size regime. As is typical for electronic materials, these questions have been explored in small fragments of silicon, typically passivated with hydrogen atoms. In Fig. 3.8, we illustrate such a system containing over a hundred silicon atoms. To examine the role of a dopant atom, one can consider such a dot doped with a single phosphorus atom. In such a system, the bonding environment of the phosphorus–silicon bond can be strongly modified, resulting in a large change in the ionization energy of the phosphorus donor electron. We can calculate the electron affinity and ionization energy from the total electronic energy of the system (as from (3.10)) in different charge states:
$$I = E_{\mathrm{total}}(N-1) - E_{\mathrm{total}}(N), \qquad A = E_{\mathrm{total}}(N) - E_{\mathrm{total}}(N+1), \qquad (3.15)$$
where there are $N$ electrons in the neutral system. In this situation, the electron affinity and ionization energy correspond to ground state properties for the neutral and charged systems. In contrast to plane wave methods, which use supercells, real space methods require no special assumptions to accommodate charged systems [11, 22]. In a crystal, one can calculate the binding energy of an electron to the donor atom by finding the energy difference between ionizing the phosphorus atom and adding an electron to the intrinsic crystal. The donor energy, $E_d$, is given by
$$E_d = I(\mathrm{P{:}Si}) - A(\mathrm{Si}), \qquad (3.16)$$
where $I(\mathrm{P{:}Si})$ represents the ionization energy of the doped crystal and $A(\mathrm{Si})$ represents the electron affinity of the undoped crystal. For a crystal, $E_d$ is typically a few
Fig. 3.10. The affinity and ionization energies for Si quantum dots doped with P. The ionization energies for doped and undoped dots are illustrated
meV, which accounts for the ionization of donors at normal operating temperatures. The extent of the interaction can be estimated from the Bohr orbit (in a.u.): $a_{\mathrm{Si}} = \epsilon/m_e^*$. A rough estimate can be made using $\epsilon = 15$ and $m_e^* = 0.5\,m_e$ [48], giving $a_{\mathrm{Si}} = 30$ a.u., or about 1.6 nm. We expect quantum dots of silicon with a diameter less than ~6 nm to exhibit significant changes in doping properties owing to quantum confinement. We plot ionization and affinity energies as a function of the quantum dot radius, $R$, in Fig. 3.10. The ionization energies for undoped hydrogenated Si nanocrystals are also given for comparison. The most striking feature in Fig. 3.10 is that the ionization energy for the doped dot shows a very weak dependence on the size of the dot. This differs from the behavior of the ionization energy in undoped Si quantum dots, where this quantity is very large at small radii and gradually decreases, scaling as $R^{-1.1}$. This dependence of the ionization energy on the radius is weaker than the $R^{-2}$ law predicted by effective mass theory. It is, nevertheless, a consequence of the spatial confinement of electrons (holes) within the quantum dot. The absence of a strong size dependence of the ionization energy in the doped dot is largely due to the weak screening present in quantum dots and the physical confinement of the donor electron within the dot. The donor energy, $E_d$, is on the order of several eV, not several meV as for the bulk crystal. In this sense, phosphorus is not a shallow donor in nanoscale silicon dots. Figure 3.11 illustrates the charge density along a line through the P atom in the Si quantum dot. This plot shows the square of the wave function for the highest occupied state, and it confirms the localization of the charge around the phosphorus atom.
Given the charge distribution of the dopant electron, one can evaluate the isotropic hyperfine parameter and the corresponding hyperfine splitting (HFS), which is determined by the contact interaction between the electron and defect nuclei [49]. The method of Van de Walle and Blöchl allows one to extract the isotropic hyperfine parameter and the resulting HFS from knowledge of the charge density at the nuclear
Fig. 3.11. Charge density for the dopant electron associated with P along the [100] direction in a silicon quantum dot. The extent of the dot along this direction is indicated by the arrows
Fig. 3.12. Calculated (•) and experimental () isotropic hyperfine parameter A vs. the dot radius R. The solid line is the best fit to the calculations (the bulk value of the hyperfine parameter, 42 G, was used to obtain this fit). The inset shows the experimental data of Fujii et al. [49] together with the fit to the calculated results. The two sets of experimental points correspond to the average size of the nanocrystals (×) and the size of the nanocrystals () estimated from a comparison of photoluminescence energies for doped and undoped samples
site [50]. The calculated HFS for a P atom positioned at the dot center is given in Fig. 3.12. At small sizes, the HFS is very large owing to the strong localization of the electron around the impurity. As the radius increases, the value of the splitting decreases. Our calculated results scale with the radius of the dot as $R^{-1.5}$ (effective mass theory
gives $R^{-3}$) [50]. In Fig. 3.12, we also present experimental data [49]. The measured values of the HFS fall on the best fit to the calculated results, with the limit of the fit constructed to approach the bulk value. When this study was completed, computational limitations prevented a direct comparison with the experimental size regime [54]. It should be emphasized that we found no strong dependence on the choice of the P site. We examined other sites by replacing one of the Si atoms in each shell with a P atom while retaining the passivating hydrogen atoms. We found that the ionization and binding energies were unchanged to within 10%, independent of the impurity atom position, save for the surface site. Our work illustrates how computational methods can be used to describe doping in silicon nanostructures. Other work on these quantum dots includes extensive studies of the optical properties and recent studies of nanowires composed of silicon.
3.6 The Future

Will silicon continue to play such an important role in understanding the electronic properties of materials? One suspects it will. For example, recent developments in algorithms for nanoscale systems with thousands of atoms have focused on silicon. As long as Moore's law holds for silicon technology, workers will continue to pursue an understanding of the fundamental properties of this material. As such, I see no reason why silicon will not continue to be the testing ground for new theoretical tools.
References 1. J.R. Chelikowsky, Mater. Res. Bull. 27, 951 (2002) 2. M.L. Cohen, J.R. Chelikowsky, Electronic Structure and Optical Properties of Semiconductors, 2nd edn. (Springer, Berlin, 1989) 3. J.R. Chelikowsky, M.L. Cohen, Phys. Rev. B 14, 556 (1976) 4. M.L. Cohen, T.K. Bergstresser, Phys. Rev. 141, 789 (1965) 5. J.C. Phillips, L. Kleinman, Phys. Rev. 116, 287 (1959) 6. H. Ehrenreich, M.H. Cohen, Phys. Rev. 115, 786 (1959) 7. P. Hohenberg, W. Kohn, Phys. Rev. 136, B864 (1964) 8. W. Kohn, L.J. Sham, Phys. Rev. 140, A1133 (1965) 9. N. Troullier, J.L. Martins, Phys. Rev. B 43, 1993 (1991) 10. W. Pickett, Comput. Phys. Rep. 9, 115 (1989) 11. J.R. Chelikowsky, M.L. Cohen, in Handbook of Semiconductors, 2 edn., ed. by T.S. Moss, P.T. Landsberg (Elsevier, Amsterdam, 1992) 12. M.T. Yin, M.L. Cohen, Phys. Rev. Lett. 45, 1004 (1980) 13. K.J. Chang, M.M. Dacorogna, M.L. Cohen, J.M. Mignot, G. Chouteau, G. Martinez, Phys. Rev. Lett. 54, 2375 (1985) 14. R. Car, M. Parrinello, Phys. Rev. Lett. 55, 2471 (1985) 15. R. Car, M. Parrinello, Phys. Rev. Lett. 60, 204 (1988) 16. E.K.U. Gross, W. Kohn, Phys. Rev. Lett. 55, 2850 (1985) 17. E.K.U. Gross, W. Kohn, Adv. Quantum Chem. 21, 255 (1990)
3 Using Silicon to Understand Silicon
18. M. Casida, in Recent Advances in Density-Functional Methods, Part I, ed. by D. Chong (World Scientific, Singapore, 1995), p. 155
19. M. Casida, in Recent Developments and Applications of Modern Density Functional Theory, ed. by J. Seminario (Elsevier, Amsterdam, 1996), p. 391
20. H. Appel, E.K.U. Gross, K. Burke, Phys. Rev. Lett. 90, 043005 (2003)
21. I. Vasiliev, S. Öğüt, J.R. Chelikowsky, Phys. Rev. B 65, 115416 (2002)
22. J.R. Chelikowsky, N. Troullier, Y. Saad, Phys. Rev. Lett. 72, 1240 (1994)
23. J.R. Chelikowsky, Y. Saad, S. Ogut, I. Vasiliev, A. Stathopoulos, Phys. Stat. Sol. (b) 217, 173 (2000)
24. E.L. Briggs, D.J. Sullivan, J. Bernholc, Phys. Rev. B 52, R5471 (1995)
25. J.E. Pask, B.M. Klein, P.A. Sterne, C.Y. Fong, Comput. Phys. Commun. 135, 1 (2001)
26. G. Zumbach, N.A. Modine, E. Kaxiras, Solid State Commun. 99, 57 (1996)
27. T.L. Beck, Rev. Mod. Phys. 74, 1041 (2000)
28. B. Fornberg, D.M. Sloan, Acta Numer. 94, 203 (1994)
29. J.R. Chelikowsky, J. Phys. D: Appl. Phys. 33, R33 (2000)
30. X. Blase, A. Rubio, S.G. Louie, M.L. Cohen, Phys. Rev. B 52, R2225 (1995)
31. J.P. Proot, C. Delerue, G. Allan, Appl. Phys. Lett. 61, 1948 (1992)
32. L.T. Canham, Appl. Phys. Lett. 57, 1046 (1990)
33. M.V. Wolkin, J. Jorne, P.M. Fauchet, G. Allan, C. Delerue, Phys. Rev. Lett. 82, 197 (1999)
34. S. Ogut, J.R. Chelikowsky, S.G. Louie, Phys. Rev. Lett. 79, 1770 (1997)
35. N.A. Hill, K.B. Whaley, Phys. Rev. Lett. 76, 3039 (1996)
36. C. Delerue, M. Lannoo, G. Allan, Phys. Rev. Lett. 84, 2457 (2000)
37. L.W. Wang, A. Zunger, J. Phys. Chem. 98, 2158 (1994)
38. L.W. Wang, A. Zunger, J. Phys. Chem. 100, 2394 (1994)
39. A. Franceschetti, L.W. Wang, A. Zunger, Phys. Rev. Lett. 83, 1269 (1999)
40. A.J. Williamson, J.C. Grossman, R.Q. Hood, A. Puzder, G. Galli, Phys. Rev. Lett. 89, 196803 (2002)
41. B. Delley, E.F. Steigmeier, Phys. Rev. B 47, 1397 (1993)
42. L.E. Ramos, J. Furthmüller, F. Bechstedt, Phys. Rev. B 71, 035328 (2005)
43. S. Ogut, R. Burdick, Y. Saad, J.R. Chelikowsky, Phys. Rev. Lett. 90, 127401 (2003)
44. G. Onida, L. Reining, A. Rubio, Rev. Mod. Phys. 74, 601 (2002)
45. L. Reining, V. Olevano, A. Rubio, G. Onida, Phys. Rev. Lett. 88, 066404 (2002)
46. J. von Behren, T. van Buuren, M. Zacharias, E.H. Chimowitz, P.M. Fauchet, Solid State Commun. 105, 317 (1998)
47. D. Melnikov, J.R. Chelikowsky, Phys. Rev. Lett. 92, 046802 (2004)
48. S.M. Sze, Semiconductor Devices, Physics and Technology, 2nd edn. (Wiley, New York, 2002)
49. M. Fujii, A. Mimura, S. Hayashi, Y. Yamamoto, K. Murakami, Phys. Rev. Lett. 89, 206805 (2002)
50. C.G. Van de Walle, P.E. Blöchl, Phys. Rev. B 47, 4244 (1993)
51. R.R.L. Zucca, Y.R. Shen, Phys. Rev. B 1, 2668 (1970)
52. N.A. Hill, K.B. Whaley, Phys. Rev. Lett. 75, 1130 (1995)
53. F.A. Reboredo, A. Franceschetti, A. Zunger, Phys. Rev. B 61, 13073 (2000)
54. T.-L. Chan, M.L. Tiago, E. Kaxiras, J.R. Chelikowsky, Nano Lett. 8, 596 (2008)
4 Theory of Defects in Si: Past, Present, and Challenges

S.K. Estreicher
4.1 Introduction

Defects greatly affect the mechanical, electrical, optical, and magnetic properties of materials, especially semiconductors [1, 2]. There are many different types of defects, ranging from extended structures (e.g., grain boundaries, interfaces, dislocations, and precipitates), to complexes, to isolated native defects or impurities. The focus in this chapter is on localized defects such as vacancies, self-interstitials, isolated impurities, pairs, and small complexes. These nanometer-size defects play many important roles and are the building blocks of larger defect structures. Understanding the properties of defects begins at this scale.

There are many examples of the beneficial or detrimental roles of defects. Oxygen and nitrogen pin dislocations in Si and allow wafers to undergo a range of processing steps without breaking [3]. Small oxygen precipitates provide internal gettering sites for transition metals, but some oxygen clusters are unwanted donors which must be annealed out [4]. Shallow dopants are often implanted. They contribute electrons to the conduction band or holes to the valence band. Native defects, such as vacancies or self-interstitials, promote or prevent the diffusion of selected impurities, in particular dopants. Self-interstitial precipitates may release self-interstitials, which in turn promote the transient enhanced diffusion of dopants [5]. Transition metal (TM) impurities – and especially the aggregates of TMs – are associated with electron–hole recombination centers. Hydrogen, almost always present at various stages of device processing, passivates the electrical activity of dopants and of many deep-level defects, or forms extended defect structures known as platelets [6]. Mg-doped GaN must be annealed at rather high temperatures to break up the {Mg, H} complexes which suppress p-type doping [7]. Magnetic impurities such as Mn can render a semiconductor ferromagnetic. The list goes on.
In the past, approximate (but sufficient) defect control has been achieved by educated guesses: for example, changing the order of the processing steps, high-temperature anneals, and a variety of such manipulations. Today, sophisticated techniques allow the manufacture of thin layers, virtually defect-free, protected from impurities in the bulk by a buried oxide [8]. The dimensions of the active volume of today's devices are often measured in nm. Defect control at the atomic scale is increasingly required. This in turn implies the need for understanding the basic properties of defects at the atomic level: equilibrium sites; charge states and electrical activity; activation energies for reorientation, diffusion, or dissociation; interactions with common impurities such as hydrogen, carbon, oxygen, or dopants; and so on. Many defects are found in more than one configuration and/or charge state; many of their properties are affected by the position of the Fermi level, exposure to bandgap light, thermal annealing, irradiation, and other external factors.

Much of the microscopic information about defects comes from electrical, optical, and/or magnetic experimental probes. The electrical data is often obtained from capacitance techniques such as deep-level transient spectroscopy (DLTS). The sensitivity of DLTS is very high, and the presence of defects in concentrations as low as 10¹¹ cm⁻³ can be detected. However, even in conjunction with uniaxial stress experiments, this data provides little or no elemental and structural information and, by itself, is insufficient to identify the defect responsible for the electrical activity. Local vibrational mode (LVM) spectroscopy, that is, Raman and Fourier-transform infrared absorption (FTIR), often gives sharp lines characteristic of the Raman- or IR-active LVMs of impurities lighter than the host atoms. When uniaxial stress, annealing, and isotope substitution studies are performed, the experimental data provide a wealth of critical information about a defect. This information can be correlated, for example, with DLTS annealing data. However, the sensitivity of LVM techniques is low compared to that of DLTS.
In the case of Raman, over 10¹⁷ cm⁻³ defect centers must be present in the sub-surface layer exposed to the laser. In the case of FTIR, some 10¹⁶ cm⁻³ defect centers are needed, although much higher sensitivities have been obtained with multiple-internal-reflection FTIR [9]. Photoluminescence (PL) is much more sensitive, sometimes down to 10¹¹ cm⁻³, but the spectra can be more complicated to interpret [10]. Finally, magnetic probes such as electron paramagnetic resonance (EPR) [11–13] and electron–nuclear double resonance (ENDOR) [14] are wonderfully detailed, and a lot of defect-specific data can be extracted: identification of the element(s) involved in the defect and its immediate surroundings, symmetry, spin density maps, etc. However, the sensitivity of EPR is also low compared to that of PL or DLTS; of the order of 10¹⁶ cm⁻³ EPR-active centers are needed. Further, localized gap levels in semiconductors often prefer to be empty or doubly occupied, as most defect centers in semiconductors are unstable in a spin-½ state. The sample must then be illuminated in order to create an EPR-active version of the defect under study [11–13].

There are very few defects for which electrical, optical, and magnetic information is all available. One example is interstitial hydrogen [12, 13, 15, 16]. Such a fortunate situation is the exception, not the rule. In the overwhelming majority of cases, only a fraction of the desired information can be obtained from experiment. Then theoretical input is required. Until a decade ago or so, the predictive power of theory was limited. In recent years, however, it has become much more quantitative, and first-principles theory is now a full-time partner of experiment. The energetics
and vibrational spectra of defects, for example, are predicted reliably. On the other hand, the locations of their electrically active gap levels are calculated with error bars that are generally too large to allow defect identification based on electrical activity alone. In contrast to experiment, which measures the properties of an unknown defect, theory begins with an assumed defect structure and predicts its properties. Then the measured and calculated properties of the defect can be compared.

Following an overview of the evolution of the theory in the past few decades [17] (Sect. 4.2), I will discuss the theoretical approach most commonly used by theorists today (Sect. 4.3). Then selected recent theoretical developments will be highlighted (Sect. 4.4). The chapter concludes with a brief discussion (Sect. 4.5).
4.2 From Empirical to First-Principles

When semiconductor technology emerged during and immediately following World War II, the most pressing issue was to understand doping. Effective mass theory (EMT) [18] described shallow-level impurities as weak perturbations to the perfect crystal. The idea was to write the Schrödinger equation for the nearly free charge carrier, trapped very close to a parabolic band edge, in hydrogenic form with an effective mass determined by the curvature of the band. The calculated binding energy of the charge carrier is that of a hydrogen atom, but reduced by the square of the dielectric constant. As a result, the associated wave function is substantially delocalized, with an effective Bohr radius some 100 times larger than that of the free H atom. A number of refinements to EMT have been proposed [19].

EMT provided a basic understanding of doping. However, it could not be extended to defects that have gap levels "far" from a band edge. These defects are not weak perturbations to the crystal and often involve substantial relaxations and distortions of the crystal, thus significantly disrupting its periodicity. The first such defects to be studied were the byproducts of radiation damage, a hot issue in the early days of the Cold War. EPR data became available for the vacancy [11, 20] and the divacancy [21] in silicon (but not the Si self-interstitial, which has never been detected). Many TM impurities, which are often very active recombination centers, have also been studied by EPR [22]. The EPR studies showed that the vacancy and the divacancy undergo symmetry-lowering Jahn–Teller distortions. Interstitial oxygen, the most common impurity in Czochralski-grown Si, was known to reside at a puckered bond-centered site [23, 24]. Yet it was not realized at first how much energy is involved in such relaxations and distortions.
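The hydrogenic EMT scaling quoted above (binding energy reduced by ε², radius enlarged by ε/m*) can be checked with a few lines of arithmetic. The dielectric constant and single isotropic effective mass below are illustrative textbook-style values for Si, not numbers taken from this chapter; real Si has anisotropic conduction-band valleys.

```python
# Hydrogenic effective-mass estimate for a shallow donor.
# eps_r and m_eff are illustrative assumptions for Si.
RYDBERG_EV = 13.606  # binding energy of the free H atom (eV)
BOHR_ANG = 0.529     # Bohr radius of the free H atom (Angstrom)

def emt_donor(eps_r, m_eff):
    """Return (binding energy in eV, effective Bohr radius in Angstrom)."""
    return RYDBERG_EV * m_eff / eps_r**2, BOHR_ANG * eps_r / m_eff

e_b, a_b = emt_donor(eps_r=11.7, m_eff=0.26)
print(f"E_b ~ {1000 * e_b:.0f} meV, a* ~ {a_b:.0f} Angstrom")
```

With these inputs the donor level sits a few tens of meV below the band edge and the wave function spreads over many lattice constants, consistent with the delocalization discussed above.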
It was believed that the physics and chemistry of defects in semiconductors are properly described (in first order) by assuming that the impurities reside at high-symmetry, undistorted lattice sites. Defects were thought to exist in a stiff host crystal. Lattice relaxations and/or symmetry-lowering distortions were assumed to be small corrections to the energy. The important issue then was to correctly predict trends in the spin densities and electrical activities of specific defect centers in order to explain the EPR and electrical data (see, e.g., [25, 26]). The critical importance of carefully optimizing the geometry around defects, and the magnitudes of the relaxation energies, were not fully appreciated until the 1980s [27, 28].

Green's functions [1, 29–32] were the first theoretical tools used to describe localized defects in semiconductors. These calculations begin with the Hamiltonian H0 of the perfect crystal. Its eigenvalues give the band structure, and its eigenfunctions are Bloch or Wannier functions. In principle, the defect-free host crystal is perfectly described. The localized defect is represented by a Hamiltonian H which includes the defect potential V. The Green's function is G(E) = 1/(E − H). The perturbed energies E coincide with its poles. The new eigenvalues include the gap levels of the defect, and the corresponding eigenfunctions are the defect wave functions. In principle, Green's functions provide an ideal description of the defect in its crystalline environment. In practice, there are many difficulties associated with the Hamiltonian, the construction of perfect-crystal eigenfunctions that can be used as a basis set for the defect calculation [33, 34], and the construction of the defect potential itself.

The first successful Green's function calculations for semiconductors date back to the late 1970s [35–39]. They were used to study charged defects [38, 39] and to calculate forces [40–42], total energies [43, 44], and LVMs [45, 46]. These calculations also provided important clues about the role of native defects in impurity diffusion [47]. However, while Green's functions provide an excellent description of the defect in a crystal, the technique is not very intuitive and is difficult to implement. Clusters or supercells are much easier to use and provide a physically and chemically appealing description of the defect and its immediate surroundings. Green's functions have mostly been abandoned since the mid-1980s, but a rebirth within the GW formalism [48] is now taking place [49].
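The relations underlying this approach can be stated compactly. The following restates the quantities defined above (H0, V, G) together with the standard Dyson equation that connects the perfect-crystal and defect Green's functions:

```latex
% Perfect crystal and defect Hamiltonians, as in the text:
%   H = H_0 + V, \quad G_0(E) = (E - H_0)^{-1}, \quad G(E) = (E - H)^{-1}
% Dyson equation relating the two Green's functions:
G(E) = G_0(E) + G_0(E)\,V\,G(E)
% Defect levels (poles of G in the gap) satisfy
\det\bigl[\,1 - G_0(E)\,V\,\bigr] = 0
```

Because V is localized, the determinant condition only involves a small subspace around the defect, which is what made these calculations tractable.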
In 1967, in order to describe the distortions around a vacancy, Friedel et al. [50] completely ignored the host crystal. They limited their description to rigid linear combinations of atomic orbitals (LCAO) surrounding the defect. Messmer and Watkins [51, 52] expanded this approach to linear combinations of dangling-bond states. These simple quantum-chemical descriptions provided much-needed insight and a correct, albeit qualitative, explanation of the EPR data. Here, the defect was assumed to be so localized that the entire crystal could be ignored in zeroth order.

The natural extension of this work was to include a few host atoms around the defect, thus defining a cluster. The calculations were performed in real space with basis sets consisting of localized functions such as Gaussians or LCAOs. The dangling bonds on the surface atoms had to be tied up in some way, most often with H atoms. Without the underlying crystal and its periodicity, the band structure is missing and the defect's energy eigenvalues cannot be placed within the energy gap. Further, the small size of the clusters used at the time (a dozen atoms or so) artificially confined the wave functions. This affects mostly charged defects, as the charge tends to distribute itself on the surface of the cluster. However, the covalent nature of the local interactions was often well described.

A cluster and the defect it contains form a large molecule. Its Schrödinger equation can be solved using any one of many electronic structure methods. The early work heavily borrowed from quantum chemistry, including extended Hückel theory
[53, 54] and then self-consistent semiempirical Hartree–Fock of the "NDO" type (neglect of differential, or diatomic, overlap): CNDO [55], MNDO [56], MINDO [57]. Geometries could be optimized, albeit often with symmetry assumptions. However, the methods suffered from a variety of problems such as cluster size and surface effects, basis set limitations, and lack of electron correlation. But the main problem was the use of semiempirical parameters. Their values are normally fitted to atomic or molecular data, and their transferability to defects in crystals is questionable.

DeLeo and co-workers extensively used the scattering-Xα method in clusters [58, 59] to study trends for interstitial TM impurities and hydrogen–alkali-metal complexes. The results provided useful but qualitative insight into these issues. Ultimately, their method proved difficult to bring to self-consistency, and the rather arbitrarily defined muffin-tin spheres rendered it poorly suited to the calculation of total energies vs. atomic positions.

In order to bypass the surface problem, cyclic clusters were designed, mostly in conjunction with semiempirical Hartree–Fock. Cyclic clusters can be viewed as clusters to which Born–von Karman periodic boundary conditions are applied [60, 61]. These boundary conditions can be difficult to handle, in particular when 3- and 4-center interactions are included [62].

The method of Partial Retention of Diatomic Differential Overlap (PRDDO) [63, 64] was first used for defects in diamond and silicon in the mid-1980s. It is self-consistent, contains no semiempirical parameters, and allows geometry optimizations to be performed without symmetry assumptions. These calculations scale as N³, where N is the total number of one-electron orbitals in the basis set. In contrast, the true ab initio Hartree–Fock methods scale as N⁴. However, PRDDO is a minimal-basis-set technique and ignores electron correlation.
Its earliest success was to demonstrate [27, 28] the stability of bond-centered hydrogen in c-C and Si. At the time, it was not expected that an impurity as light as H could force a Si–Si bond to stretch by as much as 1.5 to 1.6 Å. PRDDO has been used to study cluster size and surface effects [65] and to describe many defects [66, 67]. The method provides good input geometries for single-point ab initio Hartree–Fock calculations [68]. However, it suffers from the problems associated with all Hartree–Fock techniques, such as unreasonably large gaps and inaccurate LVMs. A number of research groups have used Hartree–Fock and post-Hartree–Fock techniques [69–71] to study defects in clusters, but these efforts have now been mostly abandoned.

Density-functional (DF) theory [72–75] with local basis sets [76] in large clusters allows more quantitative predictions. A DF-based ab initio code developed at the University of Exeter, AIMPRO [77, 78], uses Gaussian basis sets and has been applied to many defect problems (this code handles periodic supercells as well). In addition to geometries and energetics, rather accurate LVMs for light impurities can be predicted [79, 80]. Large clusters have been used [81, 82] to study the distortions around a vacancy or divacancy in Si. However, all clusters suffer from the surface problem and the lack of periodicity.

Substantial progress in the theory of defects in semiconductors occurred in the mid-1980s with the combination of periodic supercells to represent the host crystal, ab-initio-type pseudopotentials [83–85] for the core regions, DF theory for the valence regions, and (classical) ab initio molecular dynamics (MD) simulations [86, 87] for nuclear motion. This combination is now referred to as "first-principles," as opposed to "semiempirical." The parameters in the theory (size of the supercell, k-point sampling, size and type of one-particle basis set, pseudopotential) are not fitted to an experimental database but are determined self-consistently, calculated from first principles, obtained from high-level atomic calculations, or selected by the user. Note that the first supercell calculations were done in the 1970s in conjunction with approximate electronic structure methods [88–90].
4.3 First-Principles Theory

First-principles methods have proven to be powerful and versatile tools to predict quantitatively some key properties of defects. The key features of the theory are as follows. The host crystal is represented by a large unit cell ("supercell"), which is reproduced periodically in all directions of space. The cell contains the defect under study, and the calculations are performed in a single cell. In order to obtain reliable energies, it is necessary to sample the Brillouin zone of the cell. A common scheme is Monkhorst–Pack sampling [91], usually 2 × 2 × 2 or larger. The early supercells were quite small, often 16 or 32 atoms. Today, 64 host atoms is considered rather small, and many authors use cells containing up to a couple hundred atoms.

The nuclei are treated classically using MD simulations [92]. Semiempirical MD simulations are very approximate and very fast. They are appropriate for the study of very large systems [93–96]. Ab initio simulations, much more accurate but much slower, are used for more quantitative predictions. Very fast methods whose computational cost varies linearly with the size of the problem, so-called Order(N) methods, are being developed [97, 98], but their use in the context of defects in semiconductors has so far been limited.

First-principles theory relies on ab initio MD simulations of the Car–Parrinello type [86]. They developed a method for coupling the approximate solution of a huge-basis eigenvalue problem (the stationary electronic states) with the associated ion dynamics. The method is generally applied with plane-wave basis sets because of the great speed of fast Fourier transforms. One of the early applications to defects in silicon was the diffusion of bond-centered hydrogen [99]. An alternative ab initio approach to MD simulations, based on a tight-binding perspective, was proposed by Sankey and co-workers [87]. Their basis sets consist of pseudo-atomic orbitals with s, p, d, … symmetry.
The wave functions are truncated beyond some radius and renormalized. The early version of this code was not self-consistent and was restricted to minimum basis sets (a single atomic-like function for each valence orbital). The next version [100] was self-consistent and could accommodate expanded and polarized basis sets. This is also the case for the flexible SIESTA code (Spanish Initiative for Electronic Simulations with Thousands of Atoms) [97, 101, 102]. The basis sets consist of local orbitals (typically, LCAOs). The method is highly intuitive and allows population analysis and other chemical
information to be calculated. When an atom such as Si is given two sets of 3s and 3p functions plus a set of 3d orbitals, the basis set is sufficient to describe quite well virtually all the chemical interactions of this element, as the contribution of the n = 4 shell of Si is exceedingly small, except under extreme conditions that ground-state theories cannot handle anyway. However, proving that the basis set has converged is more cumbersome with local than with plane-wave basis sets.

Classical MD simulations begin with the Born–Oppenheimer approximation to separate the electrons from the nuclei. For a given configuration of the nuclei at a time t, the DF electronic energy is calculated. The Hellmann–Feynman theorem [103, 104] gives the force on each nucleus, hence its acceleration. Newton's laws of motion are solved, giving the positions and velocities of each nucleus at the time t + Δt. The nuclei are moved to their new positions, assigned their new velocities, and the electronic problem is solved again. The time step Δt (typically one femtosecond) needs to be short relative to the fastest oscillation in the system. The temperature of the cell is defined from the kinetic energy of the nuclei. The electrons remain in their ground state.

MD simulations performed at the nuclear temperature T = 0 K force the geometry to converge toward the nearest minimum of the potential energy. When a sufficient range of a priori plausible initial configurations for a defect is used, such "conjugate gradient" calculations provide all the stable and metastable minima of the potential energy. It is also possible to start at some high (nuclear) temperature and then use simulated quenching to explore the potential energy surface.

The internuclear volume is divided into core regions close to the nuclei and a much larger valence region.
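The Born–Oppenheimer MD loop described above (forces from the electronic problem, Newtonian propagation with a short time step) can be sketched with a velocity-Verlet integrator. Here force_fn is a placeholder standing in for a self-consistent DF force calculation; the harmonic toy system is purely illustrative.

```python
import numpy as np

def velocity_verlet(pos, vel, mass, force_fn, dt, n_steps):
    """Propagate classical nuclei; force_fn plays the role of the
    Hellmann-Feynman forces from the (re-solved) electronic problem."""
    f = force_fn(pos)
    for _ in range(n_steps):
        vel_half = vel + 0.5 * dt * f / mass   # half-kick
        pos = pos + dt * vel_half              # move the nuclei
        f = force_fn(pos)                      # electronic problem at t + dt
        vel = vel_half + 0.5 * dt * f / mass   # second half-kick
    return pos, vel

# Toy stand-in: one nucleus in a harmonic well (unit mass and spring).
force = lambda x: -x
x, v = velocity_verlet(np.array([1.0]), np.array([0.0]),
                       mass=1.0, force_fn=force, dt=0.01, n_steps=628)
# After ~one period (t = 6.28 ~ 2*pi) the trajectory returns near x = 1.
```

Verlet-type integrators are the standard choice here because they conserve energy well over long runs, which matters when the temperature is defined from the nuclear kinetic energy.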
The electrons near the core are removed from the calculation using norm-conserving, angular-momentum-dependent ab initio pseudopotentials [105]. The choice of pseudopotential is dictated by the basis set for the single-particle states. For example, most plane-wave packages use ultra-soft pseudopotentials [106], while SIESTA users prefer pseudopotentials in the Kleinman–Bylander form [107]. The electrons in the valence region are treated using first-principles DF theory, most commonly within the local-density or generalized gradient approximations [75]. Both approximations involve local exchange, resulting in a calculated band gap about half the experimental value. Non-local contributions to electron exchange [108] (of the ab initio Hartree–Fock type) produce a much better band gap but substantially increase the computational cost, as the method is no longer ∼N³ but ∼N⁴, where N is the size of the basis set. In all these calculations, the theorists must worry about supercell and basis set size [109, 110], k-point sampling [111], pseudopotentials, time steps, and other user inputs.

There have been too many applications to even attempt a complete list here. Examples include complexes [112, 113], diffusion coefficients [114], formation energies [115], defect reactions and formation dynamics [116, 117], accurate LVMs [118], and even extended defects [119].

Some of the approximations inherent to the method will resolve themselves as computer power increases, but others are more fundamental. First, the host crystal is a periodic supercell of finite size. Today, typical cells contain some 100 atoms, which means defect concentrations of the order of one atomic percent. This is much higher than in typical experimental situations. Since the defect is periodic, defect levels become defect bands. In cells containing fewer than 64 host atoms, defect bands can be several tenths of an eV wide. Such defect band widths are an artifact of the calculation. Second, the Brillouin zone of the supercell differs from that of the primitive cell of the crystal. The k-point sampling is finite, which does affect total energy differences, especially in cells containing fewer than 100 atoms or so. Third, the optimization of the pseudopotentials resembles an art more than a science. The predictions do depend on how well the core radii and other pseudopotential parameters have been determined. Fourth, the electronic problem is solved in the electronic ground state, with finite basis sets and within the local-density or generalized gradient approximations. There is no information about excited states, and the non-local contributions to electron exchange are ignored. Fifth, the nuclei are treated as classical particles. This works very well for heavy nuclei, but quantum behavior has been observed for interstitial hydrogen in Si, for example [120]. Sixth, even though the zero-point energies associated with selected LVMs can be added [121], the total zero-point energy differences are missing. Finally, the energies of charged defects are not accurately calculated. Even though the supercell is always neutral owing to a neutralizing background charge, the charge distribution in the cell is not uniform. The periodicity of the defect implies that a spurious Madelung energy term is included in the total energy. This term must be removed. Point-charge-type calculations [122] show that this correction may be as large as 0.3 eV for a localized charge in a 64-atom Si cell. This error bar affects formation energies [123]. It also affects the calculated positions of defect-related gap levels, which have yet to be quantitatively predicted on a systematic basis.
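The spurious Madelung term for a charged defect in a repeated cell is often estimated with the leading-order monopole (Makov–Payne-type) correction q²α/(2εL). The sketch below uses illustrative inputs for a cubic 64-atom Si cell (edge 2a₀ with a₀ ≈ 5.43 Å, ε ≈ 11.7); these values are assumptions for the example, not numbers from the text.

```python
# Leading-order monopole correction for a charged defect in a repeated
# cubic cell: E_corr = q^2 * alpha * e^2 / (2 * eps_r * L).
E2_EV_ANG = 14.3996  # e^2/(4 pi eps_0) in eV * Angstrom
ALPHA_SC = 2.8373    # Madelung constant of the simple-cubic point lattice

def madelung_correction(q, eps_r, cell_len_ang):
    """Spurious electrostatic energy (eV) of a point charge q (units of e)
    interacting with its periodic images and neutralizing background."""
    return q**2 * ALPHA_SC * E2_EV_ANG / (2.0 * eps_r * cell_len_ang)

# Cubic 64-atom Si cell: edge 2 * a0, a0 ~ 5.43 Angstrom; eps_r ~ 11.7.
e_corr = madelung_correction(q=1, eps_r=11.7, cell_len_ang=2 * 5.43)
print(f"E_corr ~ {e_corr:.2f} eV")
```

The leading term alone gives a shift of one to two tenths of an eV for a singly charged defect in such a cell; the degree of charge localization and higher multipole terms can push the total correction toward the few-tenths-of-an-eV scale quoted in the text.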
This issue is compounded by the fact that calculations using local exchange (this includes LDA and GGA) seriously underestimate the energy gap. An interesting way to bypass the charged-defect and band-gap problems involves scaling the calculated ionization energies and electron affinities using a known "marker" defect. But there is no universal marker, and the accuracy is of the order of 0.1 to 0.2 eV [124]. Progress in this area is being made [125].

Despite these approximations, the theory accurately predicts a number of important defect properties. Conjugate gradient optimizations provide equilibrium geometries in the stable and metastable states, which connects theory to experimental information obtained, for example, from EPR or uniaxial-stress FTIR. The energetics are in general quite accurate. This includes the relative energies between stable and metastable configurations, as well as formation, migration, reorientation, and binding energies. However, the total energy at a minimum of the potential energy surface is more accurate than at a saddle point, and the identification of the saddle point itself is no trivial matter [126–129]. These energetics connect the theory to all sorts of annealing data. Spin densities are also well predicted. This includes hyperfine parameters [130], which offer a direct link to EPR experiments. Calculating the change in energy associated with small atomic displacements allows the prediction of specific normal-mode frequencies of impurities lighter than the host atoms [131–133]. Some of these LVMs are Raman or FTIR active. Finally, constant-temperature MD simulations sometimes allow the study of defect diffusion or reactions [134, 135].
It is most useful to calculate the complete dynamical matrix of the supercell [118]. Its eigenvalues are the normal-mode frequencies ωs of the system. The orthonormal eigenvectors e^s_αi (i = x, y, z) give the relative displacements of the nuclei α for each mode s. A quantitative measure of how localized a specific mode is on one atom or a group of atoms is provided by a plot of L²{α} = (e^s_αx)² + (e^s_αy)² + (e^s_αz)² vs. s or ωs. Here, {α} may be a single atom (e.g., an isolated impurity) or a sum over a group of atoms (e.g., the impurity's host-atom nearest neighbors). Such a localization plot allows the identification of all the local and pseudolocal vibrational modes in the cell (LVMs and pLVMs, respectively), as well as the resonant modes associated with a specific defect. A local mode is an impurity-related mode with a frequency above the highest normal mode of the crystal, the Γ phonon. A pseudolocal mode is a localized mode located below the Γ phonon. Such modes are sometimes visible as phonon sidebands in PL spectra [136, 137]. Resonant modes occur when a defect-related strain in the crystal locally disturbs host-crystal modes, resulting in a host-atom oscillation close to the Γ phonon. Figure 4.1 shows the localized modes associated with bond-centered hydrogen [27, 28], H+bc, in Si. The calculated asymmetric stretch is the LVM at 2004 cm−1 (measured [15] at 1998 cm−1). The wag modes are a doublet pLVM at 264 cm−1, and the symmetric stretch of the two Si NNs to H is at 409 cm−1.

Fig. 4.1. Localization plot L²{α} of H+bc in the Si64 supercell, with α = H (2004 and 264 cm−1 lines) or its two Si NNs (409 cm−1 line). The dotted line near 540 cm−1 shows the calculated Γ phonon

The knowledge of all the normal modes of the supercell also allows one to prepare a supercell in thermal equilibrium at any temperature for constant-temperature MD runs, without thermalization or a thermostat. It is then possible to calculate vibrational lifetimes and decay channels from first principles [138–140]. Finally, the knowledge of all the normal modes also allows the construction of the phonon density of states g(ω). We obtain this function by evaluating the dynamical matrix at about 90 k points in the Brillouin zone of the supercell. Then, the Helmholtz vibrational free energy Fvib is straightforward to calculate [141], as discussed in the next section. All these issues are reviewed in detail elsewhere [142].
4.4 First-Principles Theory at Non-zero Temperatures

So far, the discussion has centered on the calculation of (spin and charge) densities and electronic total energies. Of interest are total-energy differences, since one needs the relative energy of two metastable configurations of a defect, a formation energy relative to some chemical potential, a binding energy (the energy difference between a bound complex and its dissociation products at their equilibrium sites), etc. The energetics discussed so far were potential-energy differences ΔU, calculated in the ground electronic state at T = 0 K. But the real world exists at non-zero temperatures. Most devices function at or above room temperature. Samples undergo thermal anneals, are implanted and exposed to light, etc. The behavior of defects is temperature dependent, which is obvious when one observes diffusion, association or dissociation reactions, and other phenomena where the temperature is crucial. One approach to this issue involves thermodynamic integration [143], a method which requires extensive Monte Carlo or MD runs. A more direct approach involves calculating explicitly the various contributions to free-energy differences, ΔFvib + ΔFe/h + ΔFrot + ···, as well as the configurational-entropy term TΔSconfig. In most cases, the electronic potential-energy difference ΔUelec is obtained from first-principles DF theory. The vibrational free-energy difference ΔFvib is discussed below. ΔFe/h, the free energy associated with charge carriers, refers only to the difference in the number of carriers for different configurations of a defect. This does not include the background (dopant-related) free carriers: only the change induced by the different electrical activities of different configurations of a localized defect survives in the energy difference. This term has been shown to be very small in all but the most extreme cases [141].
The next term, ΔFrot, occurs when the defect is a (nearly) free rotator, as for the interstitial H2 molecule in Si [144]. This contribution can be calculated [141] analytically from the partition function and will not be discussed here. There may be additional free-energy terms associated with magnetic or spin degrees of freedom; however, high spin multiplicities are rare for impurities in Si. Finally, ΔSconfig is the difference in configurational entropy. It can be very small or even zero when comparing metastable configurations of a defect, but it can also be very important when discussing dissociation and association reactions. At high temperatures, this term often dominates the interactions.

The processes taking place at non-zero temperatures almost always occur at constant pressure, and the Gibbs free energy is then the relevant quantity. As the lattice constant changes, the bond lengths change and so do the normal-mode frequencies, which in turn affect the vibrational free energy. However, in crystalline semiconductors with a high melting point, such as Si, working at constant volume rather than constant pressure is appropriate up to several hundred degrees Celsius. In Si, the thermal expansion coefficient is 4.68 × 10−6 K−1 at room temperature and the phonon frequencies shift slowly with T. Indeed, the difference between the constant-pressure and constant-volume specific heats [145], CP − CV, is 0.0165 J/(mol K) at room temperature, a correction of only 0.08% to CP = 20 J/(mol K). This implies that the constant-volume approximation is likely to work very well at low temperatures and gradually become worse at high temperatures. Therefore, we focus here on the Helmholtz free energy in the harmonic approximation and restrict ourselves to the temperature range where the use of the phonon density of states g(ω), calculated once and for all at T = 0 K, is appropriate. This greatly simplifies the calculations. In fact, it renders them possible, since calculating temperature-dependent anharmonic phonon densities of states is a formidable task.

The phonon densities of states of defect-free crystals are normally calculated from the dynamical matrix of the primitive unit cell evaluated at thousands of q points in the Brillouin zone of the lattice. However, when studying defects, large supercells must be used, and their Brillouin zones are distinct from that of the primitive unit cell. The eigenvalues of the dynamical matrix provide only a small number of normal-mode frequencies, and a phonon density of states extrapolated from those few hundred frequencies leads to a rather poor g(ω). However, evaluating the dynamical matrix at about 90 q points in the Brillouin zone of a Si64 supercell works very well, and the calculated g(ω) closely matches [142] the measured data [146].
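The size of the constant-volume correction quoted above is easy to check, and the harmonic CV itself can be evaluated from a model phonon density of states. The Debye model with θ_D ≈ 645 K is used here purely as an illustrative stand-in for the real g(ω) of Si:

```python
import math

R = 8.314           # gas constant, J/(mol K)
THETA_D = 645.0     # Debye temperature of Si, K (illustrative handbook value)

def debye_cv(temp_k: float, n_steps: int = 2000) -> float:
    """Harmonic heat capacity CV from a Debye-model g(omega), in J/(mol K)."""
    upper = THETA_D / temp_k
    dx = upper / n_steps
    integral = 0.0
    for i in range(n_steps):
        x = (i + 0.5) * dx                       # midpoint rule
        ex = math.exp(x)
        integral += x**4 * ex / (ex - 1.0)**2 * dx
    return 9.0 * R * (temp_k / THETA_D)**3 * integral

cv_300 = debye_cv(300.0)     # close to the CP = 20 J/(mol K) quoted in the text
cv_high = debye_cv(3000.0)   # approaches the Dulong-Petit limit 3R
rel_corr = 0.0165 / 20.0     # (CP - CV)/CP at 300 K from the quoted numbers
print(cv_300, cv_high, 100 * rel_corr)
```

The 0.08% correction confirms that CV is an excellent stand-in for CP at room temperature, which is the justification for the constant-volume approximation used in the rest of the section.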
In the harmonic approximation, the Helmholtz free energy is given by

Fvib(T) = kB T ∫₀^∞ ln[2 sinh(ħω/2kB T)] g(ω) dω,   (4.1)
where kB is the Boltzmann constant. In the perfect cell, the integration is carried out up to the Γ phonon. With a defect in the supercell, the integral extends up to the highest normal mode of the cell (the perturbed Γ phonon) and becomes a discrete sum for the high-frequency LVMs. Fvib(T = 0) is the total zero-point energy of the supercell. Once Fvib is calculated, the vibrational entropy and the specific heat at constant volume are given by

Svib = −(∂Fvib/∂T)V ,   CV = −T (∂²Fvib/∂T²)V .   (4.2)

The latter can be compared to the measured CP in order to determine the temperature up to which the constant-volume and harmonic approximations are appropriate. For c-C, Si, Ge, and GaN [147–149], the approximations work very well up to 600–700 K. At low temperatures, the agreement is excellent, down to the finest features: theory accurately predicts the temperature at which C/T³ exhibits a small peak, the height of the peak, and the isotope splitting at the peak [147–150].

For defects, Fvib(T) is generally a slowly increasing function of T. Except at very low temperatures, it is a linear function of T, and the slope depends on the
defect under consideration [141]. For example, the CH2* complex has two configurations in Si: Cs–Hbc···Si–Hab and Si–Hbc···Cs–Hab, where the subscript s means substitutional, and "ab" and "bc" stand for anti-bonding and bond-centered, respectively. ΔFvib is 0.007 eV at 300 K and 0.014 eV at 500 K, in favor of Si–Hbc···Cs–Hab. The numbers are somewhat larger when comparing the two interstitial hydrogen dimers in Si, the H2 molecule and the H2* complex (Si–Hbc···Si–Hab): here, ΔFvib is 0.041 eV at 300 K and 0.070 eV at 500 K, in favor of H2. These energies appear small, but the corresponding Boltzmann factors are substantial.

The role of the configurational entropy is most important when considering the formation and dissociation of complexes, that is, when calculating binding free energies [151]. If a complex {A, B} has dissociation products A and B, there are often vastly different numbers of configurations in the crystal for the dissociated species and for the complexes, sometimes leading to surprisingly large values of ΔSconfig. The calculation cannot be done in a supercell. Instead, one must use realistic concentrations of the species involved. The details of the calculation depend on the specific situation; an example is discussed below. The difference in configurational entropy per complex is ΔSconfig = (kB/[{A, B}]) ln(Ωpair/Ωnopair), where [{A, B}] is the number of complexes, and Ωpair and Ωnopair are the numbers of configurations with all possible complexes formed and with all complexes dissociated, respectively. One can refine the calculation by introducing a capture radius rc: {A, B} is "dissociated" when no B species is within a sphere of radius rc around any A. The results are not very sensitive to the actual value of rc. However, ΔSconfig changes substantially when the concentrations of the various species involved change.
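The remark that these small ΔFvib values correspond to substantial Boltzmann factors can be made concrete with a two-line calculation using the values quoted in the text (the interpretation as a relative equilibrium population of the two configurations is the standard two-state picture):

```python
import math

K_B = 8.617e-5  # Boltzmann constant, eV/K

def boltzmann_ratio(delta_f_ev: float, temp_k: float) -> float:
    """Relative population favoring the configuration lower in free energy
    by delta_f_ev, for two otherwise equivalent configurations."""
    return math.exp(delta_f_ev / (K_B * temp_k))

# Free-energy differences quoted in the text:
print(boltzmann_ratio(0.007, 300))  # CH2* configurations at 300 K
print(boltzmann_ratio(0.041, 300))  # H2 vs. H2* at 300 K: roughly 5:1
print(boltzmann_ratio(0.070, 500))  # H2 vs. H2* at 500 K: still roughly 5:1
```

A 0.041 eV difference thus already shifts the equilibrium population by a factor of about five at room temperature, even though the energy itself looks negligible.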
Thus, samples containing large or small concentrations of the same species can have very different configurational-entropy terms. Indeed, if A and/or B are abundant, there are many configurations resulting in pairs and relatively few configurations with A far away from B. On the other hand, when both A and B are scarce, there are far fewer ways to make pairs but a great number of dissociated configurations. As a result, the binding free energy of a specific complex at a specific temperature is different in samples containing different concentrations of the species involved.

An example recently discussed in detail [151] involves the binding free energies of two boron–oxygen complexes in Si. Each of them contains one B atom and one interstitial oxygen dimer: {Bs, Oi, Oi} and {Bi, Oi, Oi}. The structure and chemistry of these two complexes are comparable, as are their dissociation products and binding energies at 0 K (about 0.5 and 0.6 eV, respectively). However, the first complex involves a very abundant species, substitutional boron (Bs), while the second involves the much rarer interstitial boron (Bi). Assuming concentrations of 10¹⁹ cm⁻³ for Bs and 10¹⁴ cm⁻³ for {Oi, Oi} and for Bi, combinatorial analysis yields ΔSconfig = −0.515 meV/K for {Bs, Oi, Oi} and −1.538 meV/K for {Bi, Oi, Oi}. In both cases, the contribution of ΔFvib can be ignored. Plots of Eb(T) = ΔUelec + ΔFvib − TΔSconfig yield straight lines with very different slopes. The {Bs, Oi, Oi} complex remains bound up to high temperatures, but the binding free energy of {Bi, Oi, Oi} becomes zero at 400 K: Bi and {Oi, Oi} repel each other above that temperature. The critical temperature T0 at which the binding
free energy becomes zero depends on ΔUelec and on the slope, which is determined by the concentrations of the species involved and the number of sites available. If T0 is below the melting point of the material, the complex dissociates readily above T0.

It is interesting to note that measurements of the dissociation of an {A, B} complex in a crystal are often analyzed using Arrhenius plots. If R is a dissociation rate, one often assumes the temperature dependence to be R exp{−Eb/kBT}. A plot of the (natural) logarithm of this function vs. inverse temperature provides the binding energy (slope) and the logarithm of the rate (intercept). However, Eb is a linear function of T, with a coefficient approximately equal to the difference in configurational entropy between the bound and dissociated species. Thus, R exp{−Eb/kBT} = R exp{ΔSconfig/kB} exp{−Eb(0)/kBT}, and an Arrhenius plot really yields a straight line with slope −Eb(0)/kB (binding energy at 0 K) and intercept ln R + ΔSconfig/kB. Therefore, Arrhenius plots of the dissociation of an {A, B} complex in samples containing different concentrations of A and/or B should produce parallel lines: the slopes are the same, but the intercepts differ. This suggests a way to measure configurational entropies. If we take R = 10¹¹ s⁻¹ and ΔSconfig = −0.5 or −1.0 meV/K, the intercepts will be at 25.3 − 5.8 = 19.5 or 25.3 − 11.6 = 13.7, a measurable change. In fact, an experimental study of the dissociation of substitutional/interstitial carbon pairs in Si did yield different dissociation rates in different samples [152].
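The intercept arithmetic above can be reproduced directly from the quantities given in the text (the attempt rate R = 10¹¹ s⁻¹ and the two ΔSconfig values are the ones quoted there):

```python
import math

K_B_MEV = 8.617e-2   # Boltzmann constant, meV/K
R_ATTEMPT = 1e11     # dissociation attempt rate from the text, s^-1

ln_R = math.log(R_ATTEMPT)   # ~25.3

# Arrhenius intercept ln(R) + dS_config/k_B for the two quoted entropies:
intercepts = [ln_R + ds / K_B_MEV for ds in (-0.5, -1.0)]  # dS in meV/K
for ds, b in zip((-0.5, -1.0), intercepts):
    print(f"dS_config = {ds} meV/K -> intercept {b:.1f}")
```

The two intercepts differ by almost 6 natural-log units, which is indeed an easily measurable shift between otherwise parallel Arrhenius lines.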
4.5 Discussion

In the past half-century, the theory of defects in semiconductors has progressed to the point that experimentalists now seek theoretical support to better explain their data. Today's theory is almost always of the first-principles type, and the predictions are quantitative in many respects: the geometry of a defect is expected to be correct, the binding and various activation energies close to the measured ones, the hyperfine parameters accurate, and the LVMs within a few percent of experiment. However, even though the theoretical approach does not contain parameters fitted to an experimental database, it still contains numerous approximations. These are discussed in great detail elsewhere [153]. Some approximations will disappear as computer power increases. These include supercell size, k-point sampling, basis-set size, the short real times accessible to MD simulations, and the Madelung energy correction for charged defects. The problems associated with the underestimation of the energy gap and the calculation of defect-related gap levels are being addressed by various groups. Density-functional theory is a ground-state theory by construction, and precious little can be said about excited states. The description of electrons in the conduction band and of electron–hole recombination at defect levels are problems which will require a different level of theory.

Progress has been achieved in expanding first-principles theory to non-zero temperatures. The method of thermodynamic integration is used more often now than in
the past [143]. Direct calculations of free energies at constant volume can now be done routinely. Actual numbers for vibrational, rotational, free-carrier, and other free-energy contributions have been published [141]. The critical role of the configurational entropy in processes involving the formation and dissociation of complexes has also been highlighted [151]. These contributions are very significant when discussing issues such as the dynamics of thermal-donor appearance and disappearance with temperature and time, or the dynamics of self-assembly. If the progress achieved by theorists in the past few decades continues at a comparable pace, many of the open questions will be addressed and solved in the coming years. Such developments would be most welcome as the sizes of devices shrink to the point where the relevant samples are too small for detailed experimental analysis.

Although this brief review has focused on recent progress in the supercell density-functional approach, other powerful theoretical tools are being developed. One such tool, only casually mentioned in this chapter, involves Green's functions. This approach is mathematically and computationally more difficult, but the host crystal is in principle perfectly described; the difficulties associated with finding the appropriate defect potential and allowing for geometry optimizations are being discussed [49]. A tool which was not discussed at all here is quantum Monte Carlo [154]. Another method which is rapidly gaining popularity combines very fast semiempirical MD techniques with the much more accurate first-principles method in order to study large defects in supercells containing many thousands of atoms. The trick is to describe at a high level of theory the relatively small regions of a huge supercell that are affected by the dynamics, while treating at an approximate level those regions that do not change much [155].
One example of such an approach is the “learn-on-the-fly” MD study of crack propagation [156]. The difficulty lies in matching the first-principles/semiempirical boundary, an especially difficult task in MD simulations when atoms enter or leave the first-principles region and/or when the number of atoms treated at the first-principles level changes with time. The theory of defects in materials is a rapidly evolving field. We live in interesting times.
Acknowledgments

This work was supported in part by the National Renewable Energy Laboratory and the R. A. Welch Foundation.
References

1. A.M. Stoneham, Theory of Defects in Solids (Clarendon, Oxford, 1985)
2. H.J. Queisser, E.E. Haller, Science 281, 945 (1998)
3. Oxygen in Silicon, ed. by F. Shimura, Semiconductors and Semimetals, vol. 42 (Academic, Boston, 1994)
4. Early Stages of Oxygen Precipitation in Silicon, ed. by R. Jones (Kluwer, Dordrecht, 1996)
5. R. Scholtz, U. Gösele, J.Y.Y. Huh, T.Y. Tan, Appl. Phys. Lett. 72, 200 (1998)
6. S.K. Estreicher, Mat. Sci. Eng. R 14, 319 (1995)
7. S.J. Pearton, in GaN and Related Materials, ed. by S.J. Pearton (Gordon and Breach, Amsterdam, 1997), p. 333
8. G.K. Celler, S. Cristoloveanu, J. Appl. Phys. 93, 4955 (2003)
9. F. Jiang, M. Stavola, A. Rohatgi, D. Kim, J. Holt, H. Atwater, J. Kalejs, Appl. Phys. Lett. 83, 931 (2003)
10. G. Davies, Phys. Rep. 176, 83 (1989)
11. G.D. Watkins, in Deep Centers in Semiconductors, ed. by S.T. Pantelides (Gordon and Breach, New York, 1986), p. 147
12. Y.V. Gorelkinskii, N.N. Nevinnyi, Physica B 170, 155 (1991)
13. Y.V. Gorelkinskii, N.N. Nevinnyi, Mat. Sci. Eng. B 36, 133 (1996)
14. S. Greulich-Weber, J.R. Niklas, E.R. Weber, J.M. Spaeth, Phys. Rev. B 30, 6292 (1984)
15. M. Budde, G. Lüpke, E. Chen, X. Zhang, N.H. Tolk, L.C. Feldman, E. Tarhan, A.K. Ramdas, M. Stavola, Phys. Rev. Lett. 87, 145501 (2001)
16. K. Bonde Nielsen, L. Dobaczewski, S. Søgård, B. Bech Nielsen, Phys. Rev. B 65, 075205 (2002)
17. S.K. Estreicher, Mater. Today 6(6), 26 (2003)
18. W. Kohn, Solid State Phys. 5, 257 (1957)
19. S.T. Pantelides, Rev. Mod. Phys. 50, 797 (1978)
20. G.D. Watkins, in Radiation Damage in Semiconductors (Dunod, Paris, 1964), p. 97
21. G.D. Watkins, J.W. Corbett, Phys. Rev. 138, A543 (1965)
22. G.W. Ludwig, H.H. Woodbury, Solid State Phys. 13, 223 (1962)
23. W. Kaiser, P.H. Keck, C.F. Lange, Phys. Rev. 101, 1264 (1956)
24. J.W. Corbett, R.S. McDonald, G.D. Watkins, J. Phys. Chem. Sol. 25, 873 (1964)
25. G.G. DeLeo, G.D. Watkins, W. Beal Fowler, Phys. Rev. B 25, 4972 (1982)
26. A. Zunger, U. Lindefelt, Phys. Rev. B 26, 5989 (1982)
27. T.L. Estle, S.K. Estreicher, D.S. Marynick, Hyperfine Interact. 32, 637 (1986)
28. T.L. Estle, S.K. Estreicher, D.S. Marynick, Phys. Rev. Lett. 58, 1547 (1987)
29. S.T. Pantelides, in Deep Centers in Semiconductors, ed. by S.T. Pantelides (Gordon and Breach, New York, 1986)
30. G.F. Koster, J.C. Slater, Phys. Rev. 95, 1167 (1954)
31. G.F. Koster, J.C. Slater, Phys. Rev. 96, 1208 (1954)
32. J. Callaway, J. Math. Phys. 5, 783 (1964)
33. J. Callaway, A.J. Hughes, Phys. Rev. 156, 860 (1967)
34. J. Callaway, A.J. Hughes, Phys. Rev. 164, 1043 (1967)
35. J. Bernholc, S.T. Pantelides, Phys. Rev. B 18, 1780 (1978)
36. J. Bernholc, N.O. Lipari, S.T. Pantelides, Phys. Rev. Lett. 41, 895 (1978)
37. G.A. Baraff, M. Schlüter, Phys. Rev. Lett. 41, 892 (1978)
38. G.A. Baraff, E.O. Kane, M. Schlüter, Phys. Rev. Lett. 43, 956 (1979)
39. G.A. Baraff, M. Schlüter, Phys. Rev. B 30, 1853 (1984)
40. M. Scheffler, J.P. Vigneron, G.B. Bachelet, Phys. Rev. Lett. 49, 1756 (1982)
41. U. Lindefelt, Phys. Rev. B 28, 4510 (1983)
42. U. Lindefelt, A. Zunger, Phys. Rev. B 30, 1102 (1984)
43. G.A. Baraff, M. Schlüter, Phys. Rev. B 28, 2296 (1983)
44. R. Car, P.J. Kelly, A. Oshiyama, S.T. Pantelides, Phys. Rev. Lett. 52, 1814 (1984)
45. D.N. Talwar, M. Vandevyver, M. Zigone, J. Phys. C 13, 3775 (1980)
46. R.M. Feenstra, R.J. Hauenstein, T.C. McGill, Phys. Rev. B 28, 5793 (1983)
47. R. Car, P.J. Kelly, A. Oshiyama, S.T. Pantelides, Phys. Rev. Lett. 54, 360 (1985)
48. F. Aryasetiawan, O. Gunnarsson, Rep. Prog. Phys. 61, 237 (1998)
49. M. Scheffler, A. Schindlmayr, in Theory of Defects in Semiconductors, ed. by D. Drabold, S.K. Estreicher (Springer, Berlin, 2006), p. 165
50. J. Friedel, M. Lannoo, G. Leman, Phys. Rev. 164, 1056 (1967)
51. R.P. Messmer, G.D. Watkins, Phys. Rev. Lett. 25, 656 (1970)
52. R.P. Messmer, G.D. Watkins, Phys. Rev. B 7, 2568 (1973)
53. F.P. Larkins, J. Phys. C: Solid State Phys. 4, 3065 (1971)
54. R.P. Messmer, G.D. Watkins, in Proc. Reading Conf. on Radiation Damage and Defects in Semiconductors, ed. by J.E. Whitehouse (Institute of Physics, 1973), p. 255
55. A. Mainwood, J. Phys. C: Solid State Phys. 11, 2703 (1978)
56. J.W. Corbett, S.N. Sahu, T.S. Shi, L.C. Snyder, Phys. Lett. A 93, 303 (1983)
57. P. Deák, L.C. Snyder, Radiat. Eff. Defects Solids 111–112, 77 (1989)
58. G.G. DeLeo, G.D. Watkins, W.B. Fowler, Phys. Rev. B 23, 1819 (1981)
59. G.G. DeLeo, W.B. Fowler, G.D. Watkins, Phys. Rev. B 29, 1851 (1984)
60. P. Deák, L.C. Snyder, Phys. Rev. B 36, 9619 (1987)
61. P. Deák, L.C. Snyder, J.W. Corbett, Phys. Rev. B 45, 11612 (1992)
62. J. Miró, P. Deák, C.P. Ewels, R. Jones, J. Phys.: Condens. Matter 9, 9555 (1997)
63. T.A. Halgren, W.N. Lipscomb, J. Chem. Phys. 58, 1569 (1973)
64. D.S. Marynick, W.N. Lipscomb, Proc. Natl. Acad. Sci. (USA) 79, 1341 (1982)
65. S.K. Estreicher, A.K. Ray, J.L. Fry, D.S. Marynick, Phys. Rev. Lett. 55, 1976 (1985)
66. S.K. Estreicher, Phys. Rev. B 41, 9886 (1990)
67. S.K. Estreicher, Phys. Rev. B 41, 5447 (1990)
68. S.K. Estreicher, Phys. Rev. B 60, 5375 (1999)
69. A.A. Bonapasta, A. Lapiccirella, N. Tomassini, M. Capizzi, Phys. Rev. B 36, 6228 (1987)
70. E. Artacho, F. Ynduráin, Solid State Commun. 72, 393 (1989)
71. R. Luchsinger, P.F. Meier, N. Paschedag, H.U. Suter, Y. Zhou, Philos. Trans. R. Soc. Lond. A 350, 203 (1995)
72. P. Hohenberg, W. Kohn, Phys. Rev. 136, B864 (1964)
73. W. Kohn, L.J. Sham, Phys. Rev. 140, A1133 (1965)
74. L.J. Sham, W. Kohn, Phys. Rev. 145, 561 (1966)
75. W. Kohn, Rev. Mod. Phys. 71, 1253 (1999)
76. M. Saito, A. Oshiyama, Phys. Rev. B 38, 10711 (1988)
77. R. Jones, A. Sayyash, J. Phys. C 19, L653 (1986)
78. R. Jones, P.R. Briddon, in Identification of Defects in Semiconductors, ed. by M. Stavola, Semiconductors and Semimetals, vol. 51 (Academic, Boston, 1998)
79. J.D. Holbech, B. Bech Nielsen, R. Jones, P. Sitch, S. Öberg, Phys. Rev. Lett. 71, 875 (1993)
80. R. Jones, B.J. Coomer, P.R. Briddon, J. Phys.: Condens. Matter 16, S2643 (2004)
81. S. Ogüt, J.R. Chelikowsky, Phys. Rev. Lett. 83, 3852 (1999)
82. S. Ogüt, J.R. Chelikowsky, Phys. Rev. B 64, 245206 (2001)
83. D.R. Hamann, M. Schlüter, C. Chiang, Phys. Rev. Lett. 43, 1494 (1979)
84. G.B. Bachelet, D.R. Hamann, M. Schlüter, Phys. Rev. B 26, 4199 (1982)
85. L. Kleinman, D.M. Bylander, Phys. Rev. Lett. 48, 1425 (1982)
86. R. Car, M. Parrinello, Phys. Rev. Lett. 55, 2471 (1985)
87. O. Sankey, D.J. Niklewski, Phys. Rev. B 40, 3979 (1989)
88. A. Zunger, A. Katzir, Phys. Rev. B 11, 2378 (1975)
89. S.G. Louie, M. Schlüter, J.R. Chelikowsky, M.L. Cohen, Phys. Rev. B 13, 1654 (1976)
90. W. Pickett, M.L. Cohen, C. Kittel, Phys. Rev. B 20, 5050 (1979)
91. H.J. Monkhorst, J.D. Pack, Phys. Rev. B 13, 5188 (1976)
92. S.K. Estreicher, P.A. Fedders, in Computational Studies of New Materials, ed. by D.A. Jelski, T.F. George (World Scientific, Singapore, 1999)
93. R. Smith, D.E. Harrison Jr., B.J. Garrison, Phys. Rev. B 40, 93 (1989)
94. K. Scheerschmidt, D. Conrad, U. Gösele, Comput. Mater. Sci. 7, 40 (1996)
95. K. Scheerschmidt, in Theory of Defects in Semiconductors, ed. by D. Drabold, S.K. Estreicher (Springer, Berlin, 2006), p. 213
96. G. Panzarini, L. Colombo, Phys. Rev. Lett. 73, 1636 (1994)
97. J.M. Soler, E. Artacho, J.D. Gale, A. García, J. Junquera, P. Ordejón, D. Sánchez-Portal, J. Phys.: Condens. Matter 14, 2745 (2002)
98. T. Diaz de la Rubia, G.H. Gilmer, Phys. Rev. Lett. 74, 2507 (1995)
99. F. Buda, G.L. Chiarotti, R. Car, M. Parrinello, Phys. Rev. Lett. 63, 4294 (1989)
100. A.A. Demkov, J. Ortega, O.F. Sankey, M.P. Grumbach, Phys. Rev. B 53, 10441 (1995)
101. D. Sánchez-Portal, P. Ordejón, E. Artacho, J.M. Soler, Int. J. Quant. Chem. 65, 453 (1997)
102. E. Artacho, D. Sánchez-Portal, P. Ordejón, A. García, J.M. Soler, Phys. Stat. Sol. (b) 215, 809 (1999)
103. R.P. Feynman, Phys. Rev. 56, 340 (1939)
104. M. Scheffler, J.P. Vigneron, G.B. Bachelet, Phys. Rev. B 31, 6541 (1985)
105. J.R. Chelikowsky, J. Phys. D: Appl. Phys. 33, R33 (2000)
106. D. Vanderbilt, Phys. Rev. B 41, 7892 (1990)
107. L. Kleinman, D.M. Bylander, Phys. Rev. Lett. 48, 1425 (1982)
108. J. Lento, R.M. Nieminen, J. Phys.: Condens. Matter 15, 4387 (2003)
109. Y. Zhou, R. Luchsinger, P.F. Meier, Phys. Rev. B 51, 4166 (1995)
110. M.J. Puska, S. Pöykkö, M. Pesola, R.M. Nieminen, Phys. Rev. B 58, 1318 (1998)
111. S. Pöykkö, M.J. Puska, M. Alatalo, R.M. Nieminen, Phys. Rev. B 54, 7909 (1996)
112. P.J.H. Denteneer, C.G. Van de Walle, S.T. Pantelides, Phys. Rev. Lett. 62, 1884 (1989)
113. S.B. Zhang, D.J. Chadi, Phys. Rev. B 41, 3882 (1990)
114. P.E. Blöchl, C.G. Van de Walle, S.T. Pantelides, Phys. Rev. Lett. 64, 1401 (1990)
115. C.G. Van de Walle, P.J.H. Denteneer, Y. Bar-Yam, S.T. Pantelides, Phys. Rev. B 39, 10791 (1989)
116. S.K. Estreicher, J.L. Hastings, P.A. Fedders, Phys. Rev. Lett. 82, 815 (1999)
117. S.K. Estreicher, M. Gharaibeh, P.A. Fedders, P. Ordejón, Phys. Rev. Lett. 86, 1247 (2001)
118. J.M. Pruneda, S.K. Estreicher, J. Junquera, J. Ferrer, P. Ordejón, Phys. Rev. B 65, 075210 (2002)
119. Y.-S. Kim, K.J. Chang, Phys. Rev. Lett. 86, 1773 (2001)
120. K. Muro, A.J. Sievers, Phys. Rev. Lett. 57, 897 (1986)
121. C.G. Van de Walle, J. Vac. Sci. Technol. A 16, 1767 (1998)
122. J. Lento, J.-L. Mozos, R.M. Nieminen, J. Phys.: Condens. Matter 14, 2637 (2002)
123. S.B. Zhang, J.E. Northrup, Phys. Rev. Lett. 67, 2339 (1991)
124. J.P. Goss, M.J. Shaw, P.R. Briddon, in Theory of Defects in Semiconductors, ed. by D.A. Drabold, S.K. Estreicher (Springer, Berlin, 2006), p. 69
125. P.A. Schultz, Phys. Rev. Lett. 96, 246401 (2006)
126. G. Mills, H. Jonsson, G.K. Schenter, Surf. Sci. 324, 305 (1995)
127. H. Jonsson, G. Mills, K.W. Jacobsen, in Classical and Quantum Dynamics in Condensed Phase Simulations, ed. by B.J. Berne, G. Ciccotti, D.F. Coker (World Scientific, Singapore, 1998)
128. G. Henkelman, H. Jonsson, J. Chem. Phys. 111, 7010 (1999)
129. A.F. Wright, T.R. Mattsson, J. Appl. Phys. 96, 2015 (2004)
130. C.G. Van de Walle, Phys. Rev. Lett. 64, 669 (1990)
131. S. Limpijumnong, C.G. Van de Walle, Phys. Rev. B 68, 235203 (2003)
132. A.F. Wright, C.H. Seager, S.M. Myers, D.D. Koleske, A.A. Allerman, J. Appl. Phys. 94, 2311 (2003)
133. A. Carvalho, R. Jones, J. Coutinho, P.R. Briddon, J. Phys.: Condens. Matter 17, L155 (2005)
134. S.K. Estreicher, J.L. Hastings, P.A. Fedders, Phys. Rev. B 57, R12663 (1998)
135. S.K. Estreicher, J.L. Hastings, P.A. Fedders, Phys. Rev. Lett. 82, 815 (1999)
136. S.K. Estreicher, D. West, J. Goss, S. Knack, J. Weber, Phys. Rev. Lett. 90, 035504 (2003)
137. S.K. Estreicher, D. West, M. Sanati, Phys. Rev. B 72, R121201 (2005)
138. D. West, S.K. Estreicher, Phys. Rev. Lett. 96, 115504 (2006)
139. K.K. Kohli, G. Davies, N.Q. Vinh, D. West, S.K. Estreicher, T. Gregorkiewicz, K.M. Itoh, Phys. Rev. Lett. 96, 225503 (2006)
140. D. West, S.K. Estreicher, Phys. Rev. B 75, 075206 (2007)
141. S.K. Estreicher, M. Sanati, D. West, F. Ruymgaart, Phys. Rev. B 70, 125209 (2004)
142. S.K. Estreicher, M. Sanati, in Theory of Defects in Semiconductors, ed. by D. Drabold, S.K. Estreicher (Springer, Berlin, 2006), p. 95
143. E.R. Hernández, A. Antonelli, L. Colombo, P. Ordejón, in Theory of Defects in Semiconductors, ed. by D. Drabold, S.K. Estreicher (Springer, Berlin, 2006), p. 115
144. S.K. Estreicher, Acta Phys. Pol. A 102, 403 (2002)
145. P. Flubacher, A.J. Leadbetter, J.A. Morrison, Philos. Mag. 4, 273 (1959) ('cal/g atom' should read 'cal/mol', or the numbers be divided by the atomic mass of Si)
146. F. Widulle, T. Ruf, M. Konuma, I. Silier, M. Cardona, W. Kriegseis, V.I. Ozhogin, Solid State Commun. 118, 1 (2002)
147. M. Cardona, R.K. Kremer, M. Sanati, S.K. Estreicher, T.R. Anthony, Solid State Commun. 133, 465 (2005)
148. R.K. Kremer, M. Cardona, E. Schmitt, J. Blumm, S.K. Estreicher, M. Sanati, M. Bockowski, I. Grzegory, T. Suski, A. Jezowski, Phys. Rev. B 72, 075209 (2005)
149. M. Sanati, S.K. Estreicher, M. Cardona, Solid State Commun. 131, 229 (2004)
150. W. Schnelle, E. Gmelin, J. Phys.: Condens. Matter 13, 6087 (2001)
151. M. Sanati, S.K. Estreicher, Phys. Rev. B 72, 165206 (2005)
152. G. Davies, K.T. Kun, T. Reade, Phys. Rev. B 44, 12146 (1991)
153. R.M. Nieminen, in Theory of Defects in Semiconductors, ed. by D. Drabold, S.K. Estreicher (Springer, Berlin, 2006), p. 29
154. R.J. Needs, in Theory of Defects in Semiconductors, ed. by D. Drabold, S.K. Estreicher (Springer, Berlin, 2006), p. 141
155. G. Csányi, G. Moras, J.R. Kermode, M.C. Payne, A. Mainwood, A. De Vita, in Theory of Defects in Semiconductors, ed. by D. Drabold, S.K. Estreicher (Springer, Berlin, 2006), p. 193
156. G. Csányi, T. Albaret, M.C. Payne, A. De Vita, Phys. Rev. Lett. 93, 175503 (2004)
5 Structural, Elemental, and Chemical Complex Defects in Silicon and Their Impact on Silicon Devices

A.A. Istratov, T. Buonassisi, and E.R. Weber
5.1 Introduction

The electrical properties of silicon are intimately related to the defects and impurities it contains. A hypothetical piece of perfectly pure, defect-free silicon would be a highly resistive material with few practical applications until impurities were added to dope it n-type or p-type. In practice, besides intentionally introduced impurities (dopants), silicon always contains native defects (self-interstitials and vacancies) and unintentional contaminants (oxygen, carbon, transition metals), and may contain structural defects such as voids, stacking faults, dislocations, or, in the case of multicrystalline silicon, grain boundaries. The inherent difficulty in understanding the properties of defects and the mechanisms of their formation is that any silicon wafer is a complicated system in which native defects, dopants, unintentional impurities, and structural defects can interact with and affect each other in a variety of ways. Making sense of this complicated system requires a combination of approaches: the isolated study of individual defects in well-defined, simplified structures, combined with the study of defects within the complex environment of the actual silicon wafer. The final goal is to identify, characterize, and control the defects that are most detrimental to silicon-based device performance. Often, the study of one defect can elucidate the behavior of others. In this chapter, we will show several examples from the literature of how one type of defect can be engineered or understood through control over the other types of defects, and provide examples from our own work on understanding and engineering nanoscale transition-metal defects in silicon. Many of these examples come from photovoltaics, a rapidly growing silicon industry which may in a few years surpass the integrated-circuit industry in revenue if the current growth rate of up to 30% per year is sustained.
5.2 Defect Interactions in Single-Crystalline Silicon

Many defect reactions in silicon are affected by the native defects – vacancies, self-interstitials, and their complexes. The origins of A-defects (now interpreted as clusters of silicon self-interstitials); D-defects (clusters of silicon vacancies); oxidation-induced stacking fault (OSF) rings (rings formed during thermal oxidation of silicon wafers at the boundary that separates the vacancy-rich area near the center of the wafer from the interstitial-rich area near its edges); swirl defects (microscopic dislocation loops arranged in swirl-like patterns); and crystal-originated pits (COPs, voids formed through agglomeration of vacancies) have been important research topics in the past, when it was found that these defects could affect gate-oxide integrity and device performance, yield, and reliability. Starting from the early report of Seeger and Chik [1], it eventually became clear that all these defects are either complexes of vacancies or self-interstitials, or are formed via the interaction of vacancies and self-interstitials with oxygen, and that their formation is determined by the growth conditions of the ingot (such as pulling speed and temperature gradients). Progress in understanding the properties of native defects was hindered by their small equilibrium concentrations at room temperature (although at high temperatures their concentration may reach 10¹³–10¹⁵ cm⁻³) and by the lack of experimental techniques for their direct detection. Therefore, data on the properties of native defects had to be extracted from indirect studies (such as the analysis of defects in ingots grown using different pulling parameters, the kinetics of diffusion of shallow dopants and heavy metals, and the precipitation of oxygen), combined with theoretical modeling. A good historical perspective is given by Bergholz and Gilles [2].
The 1982 study of Voronkov [3] established that the growth parameter V/G (where V is the crystal growth velocity and G is the temperature gradient near the melt–solid interface) determines whether the crystal grows with vacancies or with silicon self-interstitials as the dominant native point defect. Typically, for a given pulling rate, the thermal gradient varies from the center to the edge of an ingot, and so does the V/G ratio; consequently, an ingot may contain both vacancy-rich and interstitial-rich areas, which leads to the formation of an OSF ring. Falster et al. [4, 5] modeled the defect equilibrium between vacancies and self-interstitials and identified the crystal growth conditions required to grow relatively defect-free ingots with a slight predominance of vacancies, which made it possible to suppress the formation of the OSF ring and to reduce the density of native defect clusters. The understanding of the properties of native defects in silicon, gained over many years, led to the development of crystal growth techniques which enable one to engineer the distribution of defects in silicon ingots. For example, one can grow vacancy-rich ingots, interstitial-rich ingots, or ingots in which silicon self-interstitials and vacancies are balanced in such a way that the ingot contains no COPs and no interstitial/vacancy clusters with sizes above the detection threshold of the analytical equipment used in the semiconductor industry. Likewise, solutions to the long-standing problem of controlling the oxygen precipitation behavior in silicon were found through engineering of the silicon native defects. Oxygen is easily incorporated into silicon and is extremely difficult to remove because of its near-unity segregation coefficient from the melt. It
5 Structural, Elemental, and Chemical Complex Defects in Silicon
is usually introduced into Czochralski (CZ) grown silicon from the quartz crucible, which is slowly etched away by the silicon melt. Interstitial oxygen does not significantly affect the electrical properties of the material. However, a large fraction of the oxygen incorporated into the crystal, in concentrations close to its solid-state solubility in silicon (typically in the 10^17–10^18 cm^-3 range), precipitates out during cooling of the ingot or during subsequent anneals of the wafers. Oxygen precipitates in the bulk of the silicon wafer provide sinks for transition metals (internal gettering sites), thus keeping the metals away from the devices [6]. On the other hand, oxygen precipitates formed in the near-surface area of the wafers (in the device area) are detrimental to device yield. Hence, it is important to enhance oxygen precipitation in the bulk while simultaneously preventing the formation of oxygen precipitates in the near-surface layers of the wafers. In the past, this was achieved by controlled out-diffusion of oxygen from the near-surface areas of the wafers during a high-temperature anneal in inert gas. However, it was often difficult to achieve reproducible formation of a denuded zone with the desired thickness, because the formation of oxygen precipitates depends on many factors, including the oxygen content of the ingot, its growth conditions, and its thermal history. Additionally, the cost of such heat treatment was substantial, and the throughput of the annealing furnaces was low. The key to engineering oxygen precipitation was in understanding the defect reactions involved in the nucleation of oxygen precipitates. Falster et al. [7] have shown that the kinetics of nucleation of oxygen precipitates is, to a large extent, determined by the local vacancy concentration.
They suggested the use of short anneals in a controlled ambient followed by a rapid cool to create a vacancy profile in the wafer which stimulates nucleation of oxygen precipitates in the bulk (high local vacancy concentration) but suppresses oxygen precipitation in the near-surface areas (low local vacancy concentration). This technique, called “magic denuded zones,” gives highly reproducible results and depends neither on the oxygen concentration in the wafers nor on the growth conditions of the ingot. It takes advantage of the facts that [7] (a) the equilibrium concentration of vacancies at high temperatures is higher than that of interstitials, and (b) recombination and generation of vacancies and interstitials are much faster at the wafer surfaces than in the bulk. Transition metals and their complexes, precipitates, and inclusions are another class of defects that is a continuous cause of concern for the semiconductor industry. These defects can reduce the efficiencies of silicon-based devices in a variety of ways, e.g., by “spiking” [8–10] or completely shunting [11, 12] pn-junctions, increasing bulk recombination [13, 14], compromising gate oxide integrity [15–17], and increasing pn-junction leakage currents [18–20]. Note that the onset of the latter three phenomena can be caused by minute quantities of impurities, e.g., interstitial transition metal concentrations as low as 10^10–10^11 cm^-3 (i.e., in the parts-per-trillion range). Due to their high diffusivity and solubility in silicon (see [21, 22]), transition metals such as copper and nickel can easily diffuse through the wafer to integrated circuit (IC) devices. The devices contain heavily doped n-type and p-type regions and may include areas with high local strain, which makes them attractive precipitation or segregation sites for metals. Since the device area thickness (in a
range of a micron) is significantly less than the wafer thickness (500–800 microns), agglomeration of the metals dissolved in the wafer at preferred precipitation sites in the device area can increase their local concentration in the device area by several orders of magnitude (in proportion to the ratio of the thicknesses of the wafer and the device area). The task of engineering the distribution of transition metals in the wafer by trapping them at intentionally formed sinks away from the devices is generally referred to as “gettering” [6, 23]. Since gettering sites have to compete with the devices for metals [24], the efficiency of the gettering sites and the optimization of the gettering treatments are important issues. Extended defects, such as dislocations, generally cannot be tolerated in the device area for reasons similar to those discussed above for transition metals. With modern technology and proper seeding, single crystals can be grown dislocation-free. However, dislocations can nucleate at strained areas and interfaces in the device area during processing. Likewise, dislocations often nucleate at metal precipitates. Prevention of the nucleation and propagation of dislocations in the device area was a serious problem in the 1960s–1980s. Eventually, this problem was controlled through the improved quality of silicon wafers, improved device design, and the lower temperature budgets utilized in device processing. The implementation of strained silicon structures and the continuing reduction of device dimensions have led to renewed interest in the nucleation and propagation of dislocations, since relaxation of strain occurs via the propagation of dislocations. The impact of various wafer parameters, such as doping level and impurity content (e.g., oxygen, carbon, and nitrogen), on the nucleation and propagation of misfit dislocations has yet to be fully understood. Understanding the interactions of dislocations with various impurities is very important for two reasons.
First, impurities (such as oxygen) trapped at dislocation cores may arrest their propagation. Second, metal impurities segregated or precipitated at dislocations can dramatically increase the dislocations’ recombination activity [25–27]. In the last few years, silicon photovoltaics (PV) has become a major test ground for more extensive studies of defect reactions in silicon and has motivated many research teams to embark on the analysis of the fundamental physical properties of metal-related defects and their interactions with dislocations and grain boundaries in silicon. There are two reasons for the increased importance of photovoltaic-related research. The first reason is economic: up until the mid-1990s, the PV industry was a fledgling business, and many of the technologies for growing and processing solar cell devices were borrowed directly from the microelectronics industry. The PV silicon supply would often consist of unwanted “silicon scrap” material, including the upper and lower parts of ingots. However, the booming growth rates of the photovoltaic industry (20–45% annually) in the last few years have made it an important player in the silicon market. The amount of electronic-grade silicon consumed by the solar cell industry has become comparable with the needs of the IC industry. Estimates based on the current growth rate of silicon photovoltaics predict that the solar energy market may surpass the value of the IC market within the next 10–15 years. The second reason is the unique combination of the relative simplicity of the solar cell as a device with the complexity of the defect interactions within the material, which could not be adequately understood without state-of-the-art analytical tools and innovative approaches. A solar cell is a flat p–n junction with geometric dimensions typically from 10 × 10 to 20 × 20 cm². Photovoltaic companies face the challenge of ensuring that all incident light is absorbed by the silicon and that the largest possible fraction of the light-generated electrons and holes is collected by the p–n junction rather than recombining at interfaces and defects within the solar cell. Since the solar cell industry has to compete on the commodity market with fossil-fuel-based power generators, these problems have to be solved at the lowest possible cost. A 16%-efficient solar cell translates into an upper limit of $0.50 per gram of fully-processed silicon. For comparison, a Pentium-class microprocessor contains roughly 150–200 mg of silicon and sells at a price of $150–$500, which is equivalent to roughly $2000 per gram of fully-processed silicon. This factor-of-4000 difference in selling price severely limits the commercial viability of expensive, microelectronics-grade silicon purification techniques, cleanroom processing, and high-purity chemicals for solar cell applications. In order to reduce material costs, many photovoltaic companies have chosen to use multicrystalline silicon (mc-Si) or upgraded metallurgical silicon (umg-Si), which may contain a high density of structural defects (dislocations, grain boundaries) along with oxygen, carbon [28–30], and transition metals [31–33]. Transition metals are undoubtedly one of the main culprits of undesirable carrier recombination in the bulk of the cells. Unlike in the single-crystalline CZ and FZ wafers used for integrated circuits and power devices, in mc-Si metals are predominantly found not in the interstitially or substitutionally dissolved form, but rather in tiny (often tens or hundreds of nanometers in size) precipitates or agglomerates.
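The price-per-gram comparison above is simple arithmetic and can be checked directly; the sketch below uses mid-range values of the figures quoted in the text (175 mg of silicon per processor, a $350 selling price), which are illustrative choices, not additional data.

```python
# Back-of-envelope comparison of the value of fully-processed silicon in
# photovoltaics vs. microprocessors, using the rough figures quoted above.

solar_limit_per_g = 0.50   # $/g upper limit implied by a 16%-efficient cell

cpu_mass_g = 0.175         # mid-range of the ~150-200 mg of Si per processor
cpu_price = 350.0          # mid-range of the quoted $150-$500 selling price

cpu_value_per_g = cpu_price / cpu_mass_g
print(f"IC silicon value: ~${cpu_value_per_g:.0f} per gram")      # ~$2000/g
print(f"price gap: ~{cpu_value_per_g / solar_limit_per_g:.0f}x")  # ~4000x
```

Picking the low or high ends of the quoted mass and price ranges moves the gap by a factor of a few, but the orders-of-magnitude conclusion is unchanged.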
The density of these nanoscale metal-rich particles is usually too low for them to be detected by conventional analytical techniques, and yet their removal, passivation, or defect engineering is extremely important for making solar energy economically competitive. Additionally, due to the abundance of structural defects and other impurities, transition metals in multicrystalline silicon solar cells can no longer be considered independently of other impurities, silicon native defects (interstitials and vacancies), and structural defects (such as grain boundaries and dislocations); instead, the complex defect reactions have to be taken into account. In the rest of this chapter, we will show that silicon solar cells offer a unique opportunity to investigate complex reactions involving the formation of detrimental metal-related defects, the dissolution of these defects during heat treatments, and the precipitation of metals at structural defects in silicon. Furthermore, we will present examples of how novel analytical tools can be used to access the properties of nanoscale metal precipitates in silicon. While the majority of the results presented below were obtained on multicrystalline silicon used for solar cells, the conclusions are of a fundamental physical nature and are not specific to PV. The experimental methods that enabled us to analyze metal-rich particles in solar cells are synchrotron-radiation-based X-ray microprobe techniques. These techniques were first applied to solar cells in the late 1990s by McHugo, Thompson, et al. [34–37]. They used an ultra-bright, synchrotron-produced X-ray beam, a few square microns or less in size, focused onto a sample mounted on an X–Y stage near an X-ray fluorescence (XRF) detector. The X-ray fluorescence microscopy (μ-XRF)
technique enabled one to study the spatial distribution and elemental composition of metal-rich particles as small as hundreds or even tens of nm in size. The chemical states of these metal-rich particles (e.g., to distinguish whether a certain particle is composed of iron oxide, silicide, silicate, metal, etc.) can be determined by utilizing the X-ray absorption microspectroscopy (μ-XAS) technique [36, 38, 39]. Finally, one can use the X-ray beam induced current (XBIC) technique [40, 41] to characterize in situ (simultaneously with μ-XRF mapping) the recombination activity of metal-related defects. Applications of these techniques allowed us to understand the preferred chemical state, distribution, and origins of iron and copper contamination in solar cells and shed light on the interaction of these metals with structural defects in silicon. While all metal contaminants are detrimental to solar cells, Cu, Fe, and Ni are the most common and are usually found in higher concentrations in mc-Si than any other metal. Our X-ray microscopy experiments were performed at the Advanced Light Source (ALS) beamlines 10.3.1 and 10.3.2 at Lawrence Berkeley National Laboratory, and at the Advanced Photon Source (APS) beamlines 2-ID-D and 20-ID-B at Argonne National Laboratory. At the time of measurement, the focusing optics of ALS beamlines 10.3.1 and 10.3.2 were adjusted to achieve optimum spot sizes (spatial resolutions) of 2 × 3 μm² and 5 × 7 μm², respectively, at fluxes of ~1 × 10^9 photons/s. The zone plate optics of APS beamline 2-ID-D achieved a spot size of 200 nm in diameter, with a flux of ~10^10 photons/s. Further details of the experimental techniques can be found in [42–44]. A summary of the deliverables of the X-ray microscopy techniques used in this study is presented in Table 5.1.
5.3 Precipitation Behavior, Chemical State, and Interaction of Copper with Extended Defects in Single-Crystalline and Multicrystalline Silicon

Our discussion will start with copper. Copper is a ubiquitous contaminant in silicon-based device technology that can be easily introduced into the bulk of silicon wafers. According to the existing data on the solubility and diffusivity of Cu in Si [22, 45, 46], at only 425°C the equilibrium solubility of Cu in Si is as high as 10^13 cm^-3, and the diffusivity is such that Cu can traverse 220 μm of single-crystalline p-type silicon in under 10 seconds. While interstitial copper is a shallow donor with relatively benign electrical activity [14, 47], copper-rich precipitates are known to severely reduce the minority carrier diffusion length by forming bands of states within the silicon bandgap, thereby providing very effective pathways for recombination [48–52]. The precipitation of copper is unfavorable in structurally perfect p-type silicon because of the significant lattice strain involved in the formation of copper-rich precipitates [53], compounded with the energy required to change the charge state of Cu upon precipitation [47]. Copper precipitation in bulk p-type silicon can occur if the Cu contamination level is sufficiently high and the chemical driving force (electrochemical potential) for precipitation is sufficient to overcome the barrier for precipitation [50, 51, 54–56]. More importantly, even in low concentrations copper readily segregates to or precipitates in the presence of heterogeneous nucleation
Table 5.1. Summary of synchrotron-based analytical microprobe techniques, from [44]

μ-XRF – X-ray fluorescence microscopy
  Sub-technique: μ-SXRF (scanning μ-XRF; often used interchangeably with μ-XRF)
  Deliverables: elemental composition, size, morphology, depth, and spatial distribution of metal-rich particles; grain boundary structure from the elastically scattered peak

μ-XAS – X-ray absorption microspectroscopy
  Sub-technique: XANES (X-ray absorption near-edge spectroscopy)
  Deliverables: dependent on the local unoccupied density of states (a function of local bonding)
  Sub-technique: EXAFS (extended X-ray absorption fine structure)
  Deliverables: structural information – distances to and number of electrons of nearest-neighbor atoms

XBIC – X-ray beam induced current
  Sub-technique: (traditional) XBIC
  Deliverables: recombination activity map, analogous to simple LBIC
  Sub-technique: SR-XBIC (spectrally resolved XBIC)
  Deliverables: minority carrier diffusion length map, analogous to spectrally resolved LBIC
sites, such as stacking faults or certain types of dislocations [57–59]. It is also known that metal-rich particles can be incorporated into structural defects during crystal growth [60]. Multicrystalline silicon (mc-Si) typically contains high transition metal concentrations combined with a high density and variety of structural defects. Not surprisingly, copper-rich particles have been observed at structural defects in poorly-performing regions of some types of mc-Si solar cell material [33, 34, 38, 61, 62], complementing neutron activation analysis (NAA) data reporting Cu concentrations in mc-Si as high as 10^15 cm^-3 [31]. While Cu-rich clusters are undoubtedly not the only type of defect responsible for reducing the efficiencies of mc-Si solar cells, their known recombination activity and repeated observation in poorly-performing regions indicate that they most certainly can be a contributing factor. The chemical states of these Cu-rich particles have wide-reaching implications for predicting the stability of these defects and, ultimately, their impact on mc-Si solar cell devices [39, 63]. For example, it is much more difficult to dissolve and getter copper from copper oxide or copper silicate particles than from copper silicide, because the binding energy of the metal atoms in the oxidized compounds is higher than in the silicide [39]. Previous studies that attempted to determine the chemical state of Cu-rich particles in Si were largely restricted to TEM-based energy-dispersive X-ray spectroscopy (EDX) and diffraction analyses of copper precipitates in samples prepared by in-diffusion of unusually high Cu concentrations, or grown from a heavily Cu-contaminated melt [64–69]. In those studies, a species of copper silicide, η-Cu3Si, was the predominantly observed phase.
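The solubility and diffusion figures quoted at the start of this section (10^13 cm^-3 at 425°C; 220 μm in under 10 s) can be reproduced from Arrhenius fits. The prefactors and activation energies below are assumed literature-style values for Cu in intrinsic silicon (in the spirit of [22, 45]), not numbers given in this chapter, and the 220 μm estimate ignores the reduction of the effective diffusivity by Cu–B pairing in p-type material.

```python
import math

K_B = 8.617e-5  # Boltzmann constant, eV/K

# ASSUMED Arrhenius fits for Cu in Si (literature-style, not from this chapter):
#   D(T) = 3.0e-3 * exp(-0.18 eV / kT)              cm^2/s  (intrinsic diffusivity)
#   S(T) = 5.5e22 * exp(2.4) * exp(-1.49 eV / kT)   cm^-3   (solid solubility)

def cu_diffusivity(T_K: float) -> float:
    return 3.0e-3 * math.exp(-0.18 / (K_B * T_K))

def cu_solubility(T_K: float) -> float:
    return 5.5e22 * math.exp(2.4 - 1.49 / (K_B * T_K))

T = 425 + 273.15                      # 425 C in kelvin
t = (220e-4) ** 2 / cu_diffusivity(T)  # crude x^2/D time to cover 220 um

print(f"S(425 C) ~ {cu_solubility(T):.1e} cm^-3")  # ~1e13 cm^-3
print(f"t(220 um) ~ {t:.1f} s")                    # a few seconds, i.e. < 10 s
```

Both outputs are consistent with the figures quoted in the text, which is the point of the exercise rather than a precise determination.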
The question arises as to whether copper-rich particles in intentionally contaminated samples are of a phase identical to that found in samples containing lower Cu concentrations, representative of what one might encounter in silicon without intentional contamination. For instance, whether Cu can interact with oxygen during crystal growth to form copper-oxide particles has long been an open question. To address this issue, we compared several samples with varying copper and oxygen concentrations, including a low-oxygen multicrystalline float zone (mc-FZ, [70]) sample heavily doped with copper during growth, a silicon/silicon-germanium heterostructure intentionally contaminated with copper after growth, CZ silicon with oxygen precipitates, contaminated after growth, and samples of ingot-grown mc-Si extracted from near the bottom of the ingot without intentional Cu contamination, but with oxygen concentrations as high as 10^18 cm^-3. Intentional contamination was performed by depositing a thin layer of metal on the surface of a sample and diffusing it in at a sufficiently high temperature. The metal contamination level was determined by the equilibrium solubility of the metal at the diffusion temperature. The distribution of Cu in each Si sample was mapped using μ-XRF. The noteworthy observations from each of the samples are as follows: (1) For FZ-Si heavily doped with Cu during crystal growth, irregularly distributed Cu particles were observed at structural defects (grain boundaries and dislocations), as shown in Fig. 5.1(a). This irregular Cu decoration is expected for slow-cooled samples, wherein supersaturated Cu can diffuse to preferred precipitation sites [59]. The observed Cu-rich particles were strongly recombination-active, as revealed by XBIC in Fig. 5.1(b).

Fig. 5.1. (a) Cu-Kα X-ray fluorescence microscopy and (b) X-ray beam induced current maps of float zone silicon contaminated with (3–4) × 10^16 Cu cm^-3 during crystal growth. Notice the strong correlation between the presence of copper-rich particles (a) and the decrease in current collection efficiency (b)

(2) Cu-rich precipitates were observed along misfit dislocations in the Si0.98Ge0.02/Si heterostructure. From the Cu-Kα fluorescence map (Fig. 5.2), one can clearly see the copper contamination along the network of 60° misfit dislocations parallel to the surface, which intersect at 90° angles in agreement with literature observations [25, 71, 72]. The recombination activity of these precipitates has been well established by electron beam induced current (EBIC) and XBIC measurements [25, 73, 74]. (3) In the CZ-Si sample containing 10^6 cm^-3 oxygen precipitates, copper particles are observed at a density of ~1.1 × 10^6 cm^-3 (Fig. 5.3), which is close to the density of oxygen precipitates (1 × 10^6 cm^-3). (4) In the as-grown cast mc-Si material, Cu-rich particles were located at a grain boundary, together with similar amounts of Ni and less abundant Fe, although no intentional contamination was performed. The μ-XRF map in Fig. 5.4 shows the Cu distribution along a representative region of the grain boundary. Although the particle sizes were smaller than the X-ray beam spot size
Fig. 5.2. Cu-Kα X-ray fluorescence microscopy map of a Cu-contaminated Si0.98Ge0.02/Si heterostructure. The misfit dislocations parallel to the surface, intersecting at 90°, are heavily decorated with Cu-rich particles, confirming the tendency of Cu to precipitate in the vicinity of structural defects
Fig. 5.3. Cu-Kα X-ray fluorescence microscopy map of Cu-contaminated Czochralski silicon with ~10^6 oxygen precipitates per cm³. Elliptical Cu-rich particles can be observed, oriented along preferred crystallographic orientations. The density of Cu particles matches the density of oxygen precipitates
Fig. 5.4. Cu-Kα X-ray fluorescence microscopy map along a grain boundary of as-grown cast multicrystalline silicon. Despite the absence of intentional contamination, Cu-rich particles are present
of 200 nm, the number of Cu atoms per particle was determined to fall within the range of (3 ± 1.5) × 10^7. Were all these Cu atoms contained within one large spherical Cu3Si particle, the diameter of such a particle would be approximately 100 ± 15 nm. However, it is also possible that the Cu3Si is distributed among a colony of nanoparticles, as reported in TEM studies of intentionally contaminated monocrystalline Si [69]. Cu K-edge μ-XANES scans of the copper-rich particles in all four samples yielded spectra strikingly similar to that of the Cu3Si standard material (Fig. 5.5(b, c)). It is interesting that although the X-ray absorption spectra of the Cu precipitates found in mc-Si are distinctly different from those of the Cu2O standard, the Cu K-edge absorption onset energy of Cu3Si matches that of the Cu2O standard and is shifted compared to that of the metallic Cu standard. The Cu K-edge absorption energy shift of Cu3Si relative to Cu metal is not typical of all metal silicides. Iron metal and iron silicides, for example, have identical Fe K-edge X-ray absorption onset energies, unlike oxidized iron species, whose K-edge onsets are shifted to higher energies by amounts proportional to the Fe charge state [36, 75]. This behavior of Cu appears to stem from the unique electronic properties of Cu in Si. Copper dissolved in p-type silicon is well known to diffuse predominantly as Cu_i^+ [45]. Recent ab initio Hartree–Fock calculations published by Estreicher [47] indicate that Cu_i^+ will not diffuse as a compact [Ar]3d^10 4s^0 sphere; rather, it promotes some of its electrons from the 3d to the 4sp orbitals to form weak covalent bonds with nearby silicon atoms. Similarly, copper atoms precipitated at certain internal voids are predicted to promote a small fraction of their electrons to 4sp orbitals for covalent overlap with neighboring silicon atoms [47].
Macroscopic studies on, and models of, the properties of copper silicides have also indicated a hybridization of the valence Cu and Si orbitals [65, 76–78]. The increased delocalization of the Cu valence electrons can qualitatively explain the shift of the Cu K absorption edge to higher energies: as they are photo-excited out of the atom, the Cu 1s core electrons experience a greater Coulombic attraction to the Cu nucleus due to reduced electron screening, and thus require higher X-ray energies for photoionization.
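As a cross-check of the ~100 nm diameter quoted above for a (3 ± 1.5) × 10^7-atom Cu cluster, the size follows directly from the atom count if one assumes a single compact spherical precipitate with bulk-like Cu3Si density. The 7.2 g/cm³ density used below is an assumed handbook-style value, not a number from this chapter.

```python
import math

# Estimate the diameter of a single spherical eta-Cu3Si precipitate that
# contains a given number of Cu atoms, assuming bulk-like density.

N_A = 6.022e23                          # Avogadro's number, mol^-1
MOLAR_MASS_CU3SI = 3 * 63.55 + 28.09    # g/mol for the Cu3Si formula unit
DENSITY_CU3SI = 7.2                     # g/cm^3 (ASSUMED handbook-style value)

def cu3si_sphere_diameter_nm(n_cu_atoms: float) -> float:
    # Cu atoms per cm^3 of Cu3Si: 3 Cu per formula unit
    cu_per_cm3 = 3 * DENSITY_CU3SI / MOLAR_MASS_CU3SI * N_A
    volume_cm3 = n_cu_atoms / cu_per_cm3
    d_cm = (6 * volume_cm3 / math.pi) ** (1 / 3)  # sphere: V = pi d^3 / 6
    return d_cm * 1e7                             # cm -> nm

print(f"{cu3si_sphere_diameter_nm(3e7):.0f} nm")  # ~100 nm, as quoted above
```

Propagating the ±1.5 × 10^7 spread in the atom count through the cube root reproduces the quoted ±15 nm uncertainty to within rounding.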
Fig. 5.5. μ-XANES spectra of standard materials (a), and the excellent match of Cu-rich particles in a variety of silicon materials with the Cu3Si standard (b, c, taken at different beamlines)
5.4 Precipitation Behavior, Chemical State, and Interaction of Iron with Extended Defects in Silicon

It is known that even minute concentrations of iron can drastically reduce the minority carrier diffusion length in silicon. Just ~2 × 10^12 cm^-3 of interstitial Fe (Fe_i) or ~2 × 10^13 cm^-3 of iron–boron (Fe_i–B_s) pairs in p-type single-crystalline silicon can reduce the minority carrier diffusion length to 50 μm (τ ~ 1 μs), which is lower than the value desirable for most PV devices of reasonable efficiency [79] and certainly unacceptable for the IC industry. Despite this fact, recent neutron activation analysis (NAA) data on several commercially available mc-Si solar cell materials reveal Fe concentrations as high as 10^14 to 10^16 cm^-3 [32, 33]. The question naturally arises as to how mc-Si solar cells can contain so much iron yet manage to achieve reasonable operating efficiencies. DLTS or μ-PCD measurements [80–83] typically reveal only 10^11–10^13 cm^-3 of interstitial iron (or FeB pairs) in mc-Si. Hence, the majority of the iron must be in agglomerates or precipitates, which must then be less detrimental (per iron atom) to solar cell performance than interstitial iron. Synchrotron-based microprobe techniques were used to systematically characterize iron-rich precipitates and inclusions affecting two very different types of mc-Si solar cell materials: directionally solidified cast mc-Si and SiliconFilm™ sheet material. Directionally solidified mc-Si [84] can take days to cool down after being cast in a typical 200–300 kg ingot. SiliconFilm™ sheet material [85–87] is grown by a different process, involving lower-quality feedstock and high crystal growth rates, which produces 600–800 μm-thick sheets of mc-Si material; a typical cooling time from the silicon melting point to room temperature is measured in minutes.
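The lifetime-to-diffusion-length conversion quoted at the top of this section follows from L = sqrt(D·τ). A minimal check, where the electron diffusion coefficient D_n ≈ 27 cm²/s in moderately doped p-type silicon is an assumed textbook-style value (via the Einstein relation), not a number from this chapter:

```python
import math

# Minority-carrier diffusion length L = sqrt(D * tau).
# D_N is an ASSUMED electron diffusion coefficient in p-type Si.

D_N = 27.0  # cm^2/s (assumed)

def diffusion_length_um(tau_s: float) -> float:
    return math.sqrt(D_N * tau_s) * 1e4  # convert cm to um

print(f"{diffusion_length_um(1e-6):.0f} um")  # tau ~ 1 us -> roughly 50 um
```

With τ ~ 1 μs this gives a diffusion length of about 50 μm, consistent with the figure quoted above.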
While such a high-throughput, low-cost process has the potential to significantly reduce the price of PV [88], it also results in higher densities of structural defects [89] and increased impurity content in the material, including carbon- and oxygen-related defects [81, 87] and transition metals [33, 87]. XBIC maps revealed certain grain boundaries with exceptionally high recombination activity, both in material processed into solar cells (Fig. 5.6(a)) and in unprocessed material (Fig. 5.7(a)). Multiple iron-rich particles were detected by μ-XRF at these locations, as the maps in Fig. 5.6(b) and Fig. 5.7(b) demonstrate. These iron-rich particles populating grain boundaries can be divided into two distinct types with distinct physical properties. First, while the vast majority of iron-rich particles are small (e.g., P1, P3, and P4 in Fig. 5.6(b) and all particles in Fig. 5.7(b)), some rare particles have nearly two orders of magnitude higher μ-XRF Fe counts (e.g., P2 in Fig. 5.6(b); note the log scale of Fe concentration). Analysis by μ-XAS reveals that the particles with smaller Fe counts are composed of iron silicide (FeSi2), while the particles with much larger Fe counts are composed of oxidized iron (Fe2O3), as shown in Fig. 5.8(a) and Fig. 5.8(b), respectively. The average sizes of these two types of iron-rich particle, tabulated in Table 5.2, indicate that the FeSi2 particles are indeed smaller than the Fe2O3 particles (Fig. 5.6(b)). The compositions of these particles also differ, as determined by μ-XRF point scans. While the Fe2O3 particles show appreciable amounts of other contaminants such as Cr, Mn, and Ca (Fig. 5.9(b)), the smaller FeSi2 particles show none of these
Fig. 5.6. (a) Typical XBIC image of a cast mc-Si sample extracted from a fully-processed vertically-oriented (i.e., parallel to the growth direction of the ingot) wafer near the bottom of the ingot. The arrow in (a) points to a recombination-active grain boundary, a region of which was analyzed by μ-XRF in (b). Fe-rich particles are found along the grain boundary, highlighted by the arrow and the dotted line in (b). Properties of the Fe-rich particles labeled “P1” through “P4” are summarized in Table 5.2
Fig. 5.7. (a) Large area XBIC and (b) high-resolution μ-XRF map of the iron distribution at a grain boundary in as-grown cast mc-Si. Several FeSi2 nanoprecipitates are observed. Although some clustering is evident on a micron-scale, on a larger scale these FeSi2 nanoprecipitates are distributed rather homogeneously
above the μ-XRF detection limit (Fig. 5.9(a)). Only in as-grown material can Ni and Cu be found precipitated in the immediate vicinity of FeSi2 in detectable quantities, but not Cr, Mn, Ti, or Ca. The distributions of these particles also differ. While the large Fe2O3 particles are inhomogeneously distributed, the smaller FeSi2 particles appear to be more regularly spaced. Taking into account the attenuation length of the Fe-Kα fluorescence in Si, one calculates an FeSi2 precipitate density of (1.5–2) × 10^6 per cm² of grain boundary surface area in Fig. 5.7, resulting in an average spacing between precipitates of 7–8 μm along the grain boundary. The typical number of iron atoms in each of these nanoprecipitates at the grain boundaries was determined to be (2.9–3.6) × 10^6.
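The spacing estimate above follows from the areal density alone: for precipitates scattered over a grain-boundary plane, the mean nearest-neighbor spacing is roughly 1/sqrt(n_A) in a square-lattice approximation. A quick check with the densities quoted in the text:

```python
import math

# Mean spacing of precipitates on a grain-boundary plane from their areal
# density, using a square-lattice approximation: spacing ~ 1/sqrt(n_A).

def mean_spacing_um(n_per_cm2: float) -> float:
    return 1.0 / math.sqrt(n_per_cm2) * 1e4  # convert cm to um

print(f"{mean_spacing_um(1.5e6):.1f} um")  # lower density bound -> ~8.2 um
print(f"{mean_spacing_um(2.0e6):.1f} um")  # upper density bound -> ~7.1 um
```

Both values fall in the 7–8 μm range quoted above.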
Fig. 5.8. μ-XAS data discern two types of Fe-rich particle in cast mc-Si material: (a) smaller iron silicide (FeSi2) and (b) larger iron oxide (Fe2O3). Data labels (P1, P2, …) correspond to the precipitates viewed in the μ-XRF image in Fig. 5.6(b). The parameters of these precipitates can be found in Table 5.2
Fig. 5.9. Typical μ-XRF point scans for the two types of Fe-rich particle in cast mc-Si: (a) smaller FeSi2 particles without detectable quantities of other metals, and (b) larger Fe2O3 particles wherein iron is accompanied by other elements reminiscent of ceramics, dirt, and furnace components
μ-XRF point spectra (not shown) taken on sheet material reveal Fe present both at grain boundaries and at localized intragranular defects, with a small contribution from Cr in the case of certain intragranular defects. High-resolution μ-XRF maps reveal that the intragranular defects are irregular in shape and consist of agglomerations of many nanoparticles, as shown in Fig. 5.10. In the spaces between intragranular defects, where the XBIC signal is higher, no Fe-rich particles were detected. These intragranular defects may consist of iron precipitated at voids and their associated dislocation bunches, which are known to exist in this material [87, 90]. The grain boundaries were also decorated by Fe-rich nanoparticles, as shown in Fig. 5.10(c). The chemical states of these particles were measured by μ-XAS; by comparison with standard material, it was deduced that the Fe is most likely in the form of iron silicide, as shown in Fig. 5.10(d).
94
A.A. Istratov et al.
Fig. 5.10. (a, b) μ-XRF maps of the Fe distribution within typical intragranular defects in sheet material, noted by points of lower minority carrier lifetime. These defects consist of irregularly-shaped, micron-sized clusters of Fe nanoprecipitates. (c) μ-XRF map of Fe nanoprecipitates (radii ∼23 ± 5 nm) within a typical grain boundary, shown in 3D and in 2D projection. Despite the small size of individual Fe nanoprecipitates, they are estimated to contain a considerable fraction of the total Fe content in this sample due to their high spatial density. (d) The chemical state of the Fe nanoparticles shown in (a–c) is revealed by μ-XAS to be most similar to FeSi2
Some of the Fe-rich particles identified by μ-XRF mapping deviated significantly from the others in terms of morphology, composition, and chemical state. Large, (Fe+Cr+Ni)-rich clusters of particles containing iron in a chemical state most similar to iron silicide, and in proportions reminiscent of stainless steel, were observed (Fig. 5.11). Additionally, a large Fe-rich particle measuring up to 25 μm in diameter could be observed (Fig. 5.12(a)); this particle contained no Cr or Ni above the detection limit. Furthermore, the μ-XAS spectrum of this particle clearly indicates the presence of Fe2O3 (Fig. 5.12(b)). The overall density of these particles is believed to be rather low, given the sighting of only one such particle in a scan area of several mm^2. To summarize, two distinct classes of Fe-rich particles have been observed in mc-Si materials: (1) iron silicide nanoprecipitates, present in higher densities at grain boundaries and intragranular defect clusters, and (2) a lower density of micron-sized
Fig. 5.11. (a) Normalized μ-XRF point spectra of Cr/Fe/Ni-rich particles in the front and back sides of an as-grown sheet material sample. The similar composition of these particles is noted, although the particle on the back surface yields a larger XRF signal, indicating more metals are present. (b) The chemical state of the Fe in such particles found on both the back and front surfaces of the sample is revealed by μ-XAS to be FeSi2 . The coincidence of these three metals in the observed proportions suggests contamination by stainless steel; one can conclude from the μ-XAS spectra that these particles were likely introduced in metallic form and not as oxides. A high-resolution μ-XRF map of the Fe, Ni, and Cr distributions within one such particle is shown below
Fig. 5.12. (a) μ-XRF and (b) μ-XAS of an oxidized iron particle in as-grown sheet material. Note the large size of this particle relative to the iron silicide nanoprecipitates (Fig. 5.7(b), Fig. 5.10(a–c)). Such a particle, with large size and oxidized chemical state, is most likely to be an inclusion
particles, consisting either of elements suggestive of stainless steel (Fe+Cr+Ni) or of oxidized iron.
5.5 Pathways for Metal Contamination in Solar Cells

Based on experimental evidence obtained in our own studies and on other reports in the literature, we were able to define five non-exclusive pathways for incorporating metals into mc-Si material at high temperatures. These pathways are presented in Fig. 5.13: (a) direct incorporation of incompletely dissolved foreign metal-rich particles into the crystal as inclusions, (b) direct precipitation of locally supersaturated impurities from the melt, (c) segregation of metals dissolved in the melt to structural defects, (d) incorporation of metals dissolved in the melt into single-crystalline regions of the material as metal point defects, and (e) diffusion of metals from the growth surfaces into the solidified crystal, which is still at a sufficiently high temperature for rapid diffusion of metals. The latter two mechanisms (d and e) are rather limited in their potential to introduce large amounts of metals into most mc-Si materials. Mechanism (e) affects only the border regions of the crystal [91]. For ribbon or sheet materials only fractions of a millimeter thick, however, this may be a cause for concern, as the impurity species is more likely to diffuse through the entire thickness of the material. As far as mechanism (d) is concerned, simple equilibrium segregation models alone cannot account for the 10^14–10^16 cm^−3 of Fe [33] present in mc-Si materials. Were the amount of Fe incorporated into the final crystal determined simply by the segregation of iron from the melt into single-crystalline regions (as defined by k_Si = 8 × 10^−6,
Fig. 5.13. Graphic representation of non-exclusive contamination pathways for iron in mc-Si, demonstrating the origins of Fe contamination in mc-Si, the physical mechanisms responsible for incorporating large amounts of Fe into the mc-Si material while it is warm (i.e., at temperatures at which impurity atoms are mobile), and the formation mechanisms of the Fe-rich particles one observes in mc-Si material
Table 5.2. Dimensions and chemical states of various Fe-rich particles detected by μ-XRF along a strongly recombination-active grain boundary in cast mc-Si material. Dimensions "X" and "Y" are perpendicular to the crystal growth direction of the directionally-solidified cast multicrystalline silicon ingot, while "Z" is parallel to the growth direction. Dimensions were determined from the full-width half-maximum of the Fe concentration extracted from high-resolution μ-XRF line scans, while chemical states were measured with μ-XAS. Notice the elongation of FeSi2 particles along the crystal growth direction (Z-dimension). Particles P1 through P4 are shown in Fig. 5.6(b). Note also that the beam spot size is roughly 200 nm; thus dimensions given as 200–300 nm may in reality be much smaller

Particle label   Chemical state   X-diameter (nm)   Z-diameter (nm)
P1               FeSi2            290               547
P2               Fe2O3            1250              892
P3               FeSi2            ≤200              710
P4               FeSi2            275               772
P5               Fe2O3            ∼1450             ∼1800
P6               FeSi2            ≤200              570
i.e., the ratio of Fe solubilities in single-crystalline silicon and in the melt [92]), this would imply that the melt at the solid–liquid interface contained as much as 0.01–1% Fe! If in fact this much Fe were present, instability in the solid–liquid interface would arise, and certainly columnar crystal growth with centimeter-sized grains would not proceed as desired in the case of cast mc-Si. The first three mechanisms (a, b, and c) can account for large amounts (parts per billion/million) of iron being incorporated into mc-Si. Evidence for mechanism (a), the inclusion of foreign particles, is provided by the μ-XRF observations of a few metal-rich particles of unusually large sizes (typically ≥1 μm, see Fig. 5.6(a), Fig. 5.12, and Table 5.2). All of these large particles have one or both of the following additional characteristics: (i) the coincidence of iron with large amounts of other (often slowly-diffusing) metal species (e.g., Ca, Ti, Cr, Mn, Ni), the relative proportions of which allude to certain steels or ceramics (e.g., Fig. 5.9(b) and Fig. 5.11), and (ii) an oxidized chemical state (e.g., Fig. 5.8(b), Fig. 5.12(b), and Table 5.2). The latter point is a significant indicator of foreign particles being included in the melt, as it is currently believed that oxidized iron compounds are not thermodynamically favored to form under equilibrium conditions within silicon: the iron–oxygen binding energy is much weaker than that of silicon–oxygen, and thus iron cannot "out-compete" silicon for the oxygen (see the discussion below). However, an Fe2O3 particle inserted into the silicon melt should retain its structural integrity for a limited time, as the melting temperature of Fe2O3 is approximately 1565°C, about 150°C above the melting temperature of Si.
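The equilibrium-segregation estimate above can be checked numerically. In this sketch, k_fe is the segregation coefficient quoted in the text; the atomic density of silicon (5 × 10^22 cm^−3) is a standard value introduced here as an outside assumption:

```python
# Order-of-magnitude check: what melt Fe content would equilibrium
# segregation (C_solid = k * C_melt) require to explain the measured
# Fe concentrations in the crystal?
k_fe = 8e-6    # Fe segregation coefficient (solid/melt), from the text
n_si = 5.0e22  # atomic density of Si, cm^-3 (standard value, assumed)

implied = {}
for c_solid in (1e14, 1e16):       # measured Fe range in mc-Si, cm^-3
    c_melt = c_solid / k_fe        # melt content implied by segregation alone
    implied[c_solid] = 100.0 * c_melt / n_si   # atomic percent
    print(f"C_solid = {c_solid:.0e} cm^-3 -> melt Fe ~ {implied[c_solid]:.3g} at.%")
```

The result, roughly 0.03–3 atomic percent of Fe in the melt, reproduces the percent-level contents that the text argues are implausibly high for stable columnar growth.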
Nevertheless, because molten Si is a reactive environment, it is extremely likely that these particles will, over time, be etched away by the melt, leading to the formation of smaller, more distributed iron silicide particles. If a limited amount of iron dissolves from iron oxide particles, then it is likely that an even greater amount of iron should dissolve from large metallic iron and iron silicide particles, given that (a) the melting temperatures of iron–silicon alloys are below that of silicon [93], and (b) the binding energy of iron atoms to a metallic iron or an iron silicide particle is weaker than to an iron oxide particle [38]. The large, micron-sized (Fe+Cr+Ni)-rich clusters observed by μ-XRF in sheet material (Fig. 5.11), the compositions of which are very similar to stainless steel, were determined by μ-XAS to be composed of FeSi2. One can conclude that this iron was most likely introduced into the melt as metallic stainless steel particles, given that oxide particles would be expected to survive the rapid crystallization process in an oxidized state. It thus seems very plausible that Fe can dissolve from the micron-sized, unoxidized (Fe+Cr+Ni)-rich clusters during crystal growth and subsequent processing, contaminating nearby structural defects. During crystal growth, foreign particles in the melt can become trapped at interruptions in the solid–liquid interface caused by structural defects such as grain boundaries, instead of being pushed forward in the melt by a uniformly advancing solidification front. Faster growth velocities tend to trap larger particles [94]. Thus, stainless steel and iron oxide inclusions in fast-grown sheet material (Fig. 5.11) are typically much larger than those found in slowly-grown cast mc-Si (Fig. 5.6(b), Table 5.2). As iron-rich particles in contact with the melt dissolve, the dissolved iron content in the melt increases. Furthermore, as crystal growth progresses, the segregation of metals from solid to liquid silicon causes an increase of the metal content in the melt, most noticeably at the solid–liquid interface (depending on convective flows in the melt) [95, 96]. When high metal contents are present in the melt, mechanisms (b) and (c), described above, can result in the incorporation of large amounts of iron into the final crystal.
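The progressive enrichment of the melt by segregation can be sketched with a Scheil-type mass balance. This is the idealized complete-mixing limit (it ignores the convective-flow details mentioned above), and the starting Fe content C0 is an arbitrary illustrative value:

```python
# Scheil equation: C_melt(f) = C0 * (1 - f)**(k - 1), where f is the
# solidified fraction and k the segregation coefficient (k << 1 for Fe).
C0 = 1e15   # illustrative initial Fe content of the melt, cm^-3 (assumed)
k = 8e-6    # Fe segregation coefficient, from the text

enrichment = {f: (1.0 - f) ** (k - 1) for f in (0.0, 0.5, 0.9, 0.99)}
for f, factor in enrichment.items():
    print(f"solidified fraction {f:.2f}: melt Fe ~ {C0 * factor:.2e} cm^-3")
```

Because k is essentially zero for Fe, nearly all of the iron is rejected into the shrinking melt volume, so the melt content rises roughly as 1/(1 − f): a factor of ~100 by the time 99% of the ingot has solidified.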
The direct precipitation of iron from the melt, pathway (b), can only occur under special conditions, namely, when the impurity concentration in the melt reaches a critical level, so as to promote the onset of a second phase. It is known that when the convective flows in the melt are not sufficient, the impurity concentration at the solid–liquid interface increases as impurities are rejected from the crystal into the melt. Once the impurity concentration at the interface reaches a critical level, any small perturbation of the interface can result in the precipitation of the metal silicide directly into the crystal [60, 95]. While the convective flows within the melt during the casting process are not likely to allow impurities to reach critical levels at the planar interface, this might not be the case near grain boundaries. It is known from the work of Abrosimov et al. [97] that a grain boundary reaching the solid–liquid interface causes a localized distortion (dip) of the solid–liquid interface. It is conceivable that impurities within this dip may be more sheltered from the convective flow of the melt, where they may be able to reach critical concentrations and precipitate directly. The last mechanism enabling large amounts of iron to be incorporated into mc-Si, pathway (c), is the segregation of metals to structural defects. It is known that the solubility of metals in polysilicon is much higher than in single-crystalline silicon, especially at lower temperatures [98, 99]. This can be explained by the interaction of metals with dangling or reconstructed silicon bonds at structural defects (e.g., grain boundaries), as well as by the reduction of strain energy when impurities settle in the distorted silicon lattice near the structural defects [98, 100–102]. Iron incorporated via this mechanism could either remain homogeneously distributed along grain
boundaries, below the detection limit of μ-XRF, or could perhaps diffuse along grain boundaries and form iron silicide nanoprecipitates. This hypothesis is consistent with the observation of ∼10^14 cm^−3 of Fe distributed as silicide nanoprecipitates along grain boundaries in cast mc-Si, which is the same amount of iron estimated to segregate atomically to grain boundaries at higher temperatures [103]. During sample cooling, supersaturated dissolved metals tend to precipitate at existing defect clusters or form new ones [66, 104, 105]. The actual processes leading to precipitate formation during cooling are complex and not entirely understood, but available data [98] suggest that metals supersaturate faster (i.e., have a stronger temperature dependence of solubility) within regions of the material with fewer structural defects than in regions with elevated amounts of structural defects. Therefore, a strong driving force exists for supersaturated metals within single-crystalline regions to accumulate at grain boundaries and other structural defects during sample cooling. With slow cooling rates and high metal concentrations, a few "large" (tens of nm in diameter) metal silicide precipitates are expected to form. On the other hand, faster cooling offers less time for supersaturated metals to diffuse and form large particles, thus favoring a more homogeneous distribution of metals along structural defects, either atomically or as smaller precipitates [106]. While impeding the introduction of metals into the melt via the feedstock, production equipment, and/or growth surfaces will undoubtedly lead to improved materials, there are often limits imposed by economics, materials suppliers, or crystal growth conditions. Other strategies must therefore be developed in parallel to limit the impact of metals on device performance. These require a fundamental understanding of how metals behave during thermal treatments and solar cell processing.
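The strong driving force for precipitation during cooling follows from the exponential temperature dependence of solubility. A sketch using an Arrhenius-type parameterization for interstitial Fe in Si (the prefactor and the 2.94 eV solution enthalpy are literature-style values chosen for illustration, not data from this chapter):

```python
import math

K_B = 8.617e-5  # Boltzmann constant, eV/K

def fe_solubility(t_c):
    """Illustrative Arrhenius fit for interstitial Fe solubility in Si
    (cm^-3); parameters are assumed literature-style values."""
    return 5.0e22 * math.exp(8.2 - 2.94 / (K_B * (t_c + 273.15)))

c_frozen = fe_solubility(1200)  # Fe dissolved near growth temperature
for t_c in (1000, 800, 600):
    ratio = c_frozen / fe_solubility(t_c)
    print(f"{t_c} C: solubility {fe_solubility(t_c):.1e} cm^-3, "
          f"supersaturation x{ratio:.1e}")
```

Iron frozen into solution at high temperature becomes supersaturated by many orders of magnitude well before the sample reaches room temperature, which is why the cooling-rate profile so strongly determines whether metals end up as a few large precipitates or as a fine, homogeneous distribution.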
5.6 Effect of Thermal Treatments on Metal Distributions and on Device Performance

In the experiment described below, we monitored the effect of the emitter diffusion processing temperature on the distribution of metals and its effect on the electrical properties of the material. Three adjacent vertical slices of the mc-Si ingot, with virtually identical initial crystal structure, were selected for this analysis. The first wafer was left unprocessed and used as a reference. The second and third wafers were processed into solar cells using different temperatures for emitter diffusion from a phosphorus spin-on source; these tasks were performed by colleagues at the Fraunhofer Institute for Solar Energy Systems in Freiburg, Germany. The second wafer was processed at 860°C for 120 seconds, while the third wafer was processed at 1000°C for 20 seconds. Different annealing times were chosen to obtain comparable emitter depths. The average cooling rate after the end of RTP annealing was approximately 100°C/s. No anti-reflection coating was deposited, so as to minimize the effect of hydrogen passivation in these experiments. The solar cell fabricated using low-temperature (860°C) RTP was found to be 20% (rel.) more efficient than the cell fabricated using high-temperature (1000°C) RTP. Most of this change in efficiency was linked to an increase of the minority
Fig. 5.14. Laser beam induced current (LBIC) maps of minority carrier diffusion length in (a) a low-T RTP (860°C, 120 s) solar cell and (b) a high-T RTP (1000°C, 20 s) cell. The low-T RTP sample is 20% (rel.) more efficient than the high-T RTP sample. Note the different minority carrier diffusion length scales in (a) and (b)
carrier diffusion length L_eff, as shown in the laser-beam induced current (LBIC) maps in Fig. 5.14. A characteristic region of material was extracted for synchrotron-based analytical studies from the same location in all three sister wafers. Medium-resolution μ-XRF scans were performed over the same grain boundary in all three samples, as shown in Fig. 5.15. The average metal content per metal precipitate is plotted in Fig. 5.16. Comparison of these samples led us to the following observations: (1) In the as-grown material, multiple iron-rich particles can be seen decorating the grain boundary. μ-XAS analyses (not shown) revealed the chemical state of iron in these precipitates to be most similar to iron silicide. Some copper and nickel precipitates were also observed, although in lower spatial densities. The copper was determined by μ-XAS to be in the form of copper silicide. Some similarly large particles were observed at intragranular locations, probably coinciding with dislocations. (2) In the "low-temperature RTP" sample, some large FeSi2 precipitates remain, with the same count rate (i.e., number of metal atoms per precipitate) as in the as-grown material. However, while faint traces of Ni-rich precipitates can still be detected, the number of nickel atoms per precipitate is much reduced relative to the as-grown sample. Cu3Si precipitates are no longer detectable. (3) In the "high-temperature RTP" sample, FeSi2 precipitates are detected, but they contain on average 50% fewer iron atoms than in the as-grown material, giving evidence for iron silicide precipitate dissolution. No Cu- or Ni-rich precipitates are above the detection limits. These results indicate that low-temperature RTP effectively dissolves most Cu- and Ni-rich precipitates, while FeSi2 precipitates are largely undisturbed. This effect is due to the solubilities and diffusivities of Cu and Ni being orders of magnitude higher than those of Fe.
On the other hand, high-temperature RTP not only completely dissolves the Cu- and Ni-rich precipitates, but it also partially dissolves
Fig. 5.15. Synchrotron-based analysis of the metal content and distribution at grain boundaries in three sister wafers: (a) unprocessed (as-grown) material, (b) high-T RTP (1000°C, 20 s), and (c) low-T RTP (860°C, 120 s). μ-XRF reveals that while some large FeSi2 precipitates remain after low-T RTP, high-T RTP reduces the average metal content of FeSi2 precipitates by 50%, as discussed in the text. The recombination activity of intragranular regions increases with decreasing Fe content at structural defects, as seen in the XBIC images and Fig. 5.14, suggesting that the dissolved iron contaminates nearby regions
the FeSi2 precipitates. Higher temperatures greatly enhance the solubility and diffusivity of iron, allowing it to diffuse away from the precipitates at structural defects and contaminate the intragranular regions of the material. This effect can be seen by comparing the XBIC images in Fig. 5.15: while the as-grown sample exhibits denuded (i.e., metal-lean) zones around the grain boundaries and other structural defects, the high-temperature RTP sample shows exactly the opposite, i.e., grain boundary "bleeding" into the grains. The degradation of material due to high-temperature anneals (especially when followed by fast cooling), as reported in the literature [107–111], is now understood in terms of metal silicide precipitate dissolution.
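The contrast between the two RTP recipes can be made quantitative with a back-of-the-envelope sketch. Both anneals give Fe ample time to diffuse tens of microns (the sqrt(Dt) scale), so the decisive difference is the equilibrium solubility, which is roughly 25 times higher at 1000°C than at 860°C. The Arrhenius parameters below are assumed literature-style values for interstitial Fe in Si, not data from this chapter:

```python
import math

K_B = 8.617e-5  # Boltzmann constant, eV/K

# Illustrative literature-style Arrhenius parameterizations (assumptions).
def solubility(t_c):   # cm^-3
    return 5.0e22 * math.exp(8.2 - 2.94 / (K_B * (t_c + 273.15)))

def diffusivity(t_c):  # cm^2/s
    return 1.3e-3 * math.exp(-0.68 / (K_B * (t_c + 273.15)))

for t_c, dwell_s in ((860, 120), (1000, 20)):
    l_um = math.sqrt(diffusivity(t_c) * dwell_s) * 1e4  # cm -> um
    print(f"{t_c} C / {dwell_s} s: Fe solubility {solubility(t_c):.1e} cm^-3, "
          f"sqrt(Dt) ~ {l_um:.0f} um")
```

In this picture, the high-temperature anneal lets the lattice take up far more Fe from the FeSi2 reservoirs, and the ~100°C/s quench then freezes that dissolved Fe in place, consistent with the observed "bleeding" of recombination activity into the grains.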
5.7 Discussion: Chemical States of Metals in mc-Si

The metal-rich particles in all materials analyzed in this study can be divided into two distinct classes: (a) the metal silicide (e.g., FeSi2, NiSi2, Cu3Si) precipitate,
Fig. 5.16. Average metal content per precipitate analyzed by high-resolution μ-XRF in reference material (Ref), low-T RTP (860°C, 120 s), and high-T RTP (1000°C, 20 s). Nickel and copper silicide precipitates are nearly entirely dissolved during both RTP treatments. High-T RTP reduces the average iron silicide precipitate size by nearly 50%, while low-T RTP dissolves iron silicide precipitates to a much lesser extent
typically up to several tens of nanometers in diameter, most often associated with structural defects, and (b) the occasional larger particle – up to several microns in diameter – which is frequently oxidized, found within the grains, and/or composed of multiple slowly-diffusing metal species reminiscent of foreign material inclusions (e.g., coming from stainless steels or furnace material). Metal silicide nanoprecipitates are the more frequently observed type of metal-rich particle and are likely to have formed from metals incorporated into the crystal in solid solution. Although most classes of metal silicide compounds can be synthesized in the laboratory (e.g., Fe3Si, FeSi, FeSi2), only the most silicon-rich metal silicides (e.g., FeSi2, Cu3Si, NiSi2) are observed in mc-Si materials. This observation is consistent with the assumption that these precipitates are formed in a silicon-rich environment through precipitation of initially-dissolved metal atoms, either in the crystal or in the Si melt. In the past, it has been suggested that oxidized metallic precipitates may form within silicon because many metal species, e.g., Cu and Fe, have higher binding energies to oxidized compounds such as silicates and oxides than to silicides [36, 39, 63]. While it is true that metals bond strongly to oxygen, the same can also be said of silicon, and thus an analysis of whether a metallic oxide, silicate, or silicide will form should take this competitive oxidation potential into consideration. It is known that oxygen can form a very stable and electrically inactive interstitial complex with silicon (Oi), not to mention SiO2. Table 5.3 reproduces the enthalpy of formation per oxygen atom (the figure of merit in a balanced equation) from the individual elements for a selection of oxidized species, demonstrating that when [Si] ≫ [O] > [Cu], equilibrium thermodynamics predicts that silicon will be the predominant oxidized species.
Table 5.3. Enthalpies of formation per mole per oxygen atom at 298.15 K for various oxidized metal species in vacuum. The binding energy of oxygen to silicon is far greater than that of oxygen to iron or to copper. The same is not true for all metals, e.g., hafnium. Data are from [130]

Compound       Δf H° (kJ/mol)
1/2 HfO2       −572.4
1/2 ZrO2       −550.3
1/2 TiO2       −472.0
1/2 SiO2       −455.4
1/4 Fe2SiO4    −370.0
1/4 Fe3O4      −279.6
1/3 Fe2O3      −274.7
Cu2O           −168.6
CuO            −157.3
While the precise values of the enthalpies of formation cited in Table 5.3 do not reflect the additional detailed calculations necessary to account for the formation of a species within a cooling silicon lattice, this treatment provides the conceptual framework from which to analyze the prospect of a metal forming an oxidized species. In the presence of silicon, a strong competitor for oxygen, Cu will likely be reduced or remain unoxidized. This has been demonstrated on a macroscopic level via the observation that an oxidizing Cu3Si layer will first form Cu2O on Cu3Si, then progress to a final state of SiO2 on Cu3Si after annealing [77]. Microscopic calculations predict that although Cui is attracted to Oi because it fits nicely into the void at the interstitial site near the Oi, no covalent Cu–O bonding occurs [112], again confirming that Si wins out over Cu when competing for oxygen. Based on these observations and our μ-XAS measurements, it is concluded that Cu in the presence of Si with [O] ≪ [Si] will not tend to form stable chemical bonds with oxygen, and thus will likely either form non-oxidized precipitates, out-diffuse, or remain dissolved if solubility permits. This treatment can be generalized to other metal and metal-oxide or metal-silicate species. For example, the values in Table 5.3 suggest that Hf would form strong bonds to oxygen even if the heat of formation of hafnium silicide were considered, as was experimentally observed by Murarka and Chang [113]. On the other hand, it appears unlikely that iron will form oxidized precipitates within the silicon bulk if [Si] ≫ [O] > [Fe], as confirmed by the multiple identifications of sub-micron FeSi2 precipitates in mc-Si in this study. Alternative pathways are likely to exist for the introduction of oxidized metal species into mc-Si material.
Large oxidized iron particles could be introduced into the melt, survive the crystal growth process, and become trapped, e.g., at grain boundaries during mc-Si growth. For example, iron in stainless steel readily oxidizes at temperatures above 1000°C [75], and the melting temperature of Fe2O3 is 150°C higher than that of silicon. While molten silicon would invariably attack these foreign particles and reduce their size, the observation by McHugo et al. [36] of partially-oxidized Fe inside an Fe+Ni+Cr particle cluster >15 μm in diameter suggests that this contamination pathway may indeed occur. The same pathway is not predicted to occur for oxidized Cu particles, as the low melting temperatures of both Cu2O (1235°C) and CuO (1326°C) imply that such particles would quickly dissolve in molten silicon (melting temperature 1414°C). Experimental evidence up to this point has shown no sign of oxidized Cu-rich particles inside silicon crystals. Following from the above discussion, metal-rich particles over 1 μm in size, especially those containing oxidized and/or slowly-diffusing species, are believed to be incorporated directly from the melt during crystal growth.
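The competitive-oxidation argument of this section can be checked directly against the Table 5.3 values; a small sketch ranking each species by its per-oxygen enthalpy of formation:

```python
# Enthalpies of formation per mole of oxygen atoms (kJ/mol at 298.15 K),
# transcribed from Table 5.3; more negative = stronger claim on oxygen.
dh_per_o = {
    "1/2 HfO2": -572.4, "1/2 ZrO2": -550.3, "1/2 TiO2": -472.0,
    "1/2 SiO2": -455.4, "1/4 Fe2SiO4": -370.0, "1/4 Fe3O4": -279.6,
    "1/3 Fe2O3": -274.7, "Cu2O": -168.6, "CuO": -157.3,
}

si_ref = dh_per_o["1/2 SiO2"]
for name, dh in sorted(dh_per_o.items(), key=lambda kv: kv[1]):
    verdict = "out-competes Si for O" if dh < si_ref else "loses to Si"
    print(f"{name:12s} {dh:8.1f}  {verdict}")
```

Only Hf, Zr, and Ti sit below the SiO2 reference, while every iron and copper oxide loses the competition, consistent with the observation of metal silicides, rather than oxides, forming within the silicon bulk.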
5.8 Discussion: Interactions between Metals and Structural Defects

In almost all materials analyzed in our experiments, metal silicide precipitates were detected at structural defects in multicrystalline silicon or multilayer structures, especially two-dimensional defect surfaces such as grain boundaries and voids, in agreement with previous studies [60, 97, 114, 115]. The appearance of metal-rich particles at structural defects is likely due to a combination of processes. First, impurity atoms in solution (i.e., dissolved) in the crystal at high temperatures supersaturate upon cooling and seek the most energetically favorable sites [116] for second-phase (e.g., metal silicide) precipitate nucleation. Second, impurity atoms such as iron [99] and copper [98] have been observed to segregate to structural defects even at elevated temperatures; upon cooling, these metals will form precipitates when they reach a supersaturated state. Third, metals can precipitate directly from the melt at structural defects that perturb the solidification front, e.g., grain boundaries, when a locally high metal content exists at the liquid–solid interface [60, 97]. Such perturbations of the solidification front are also believed to favor the incorporation of foreign particles (e.g., oxidized Ti or Fe) from the melt as inclusions [94]. The end result of all these processes is the accumulation of metals at structural defects in mc-Si. Different growth techniques result in materials with different predominant structural defects: different grain sizes, different preferred orientations of grain and twin boundaries, different dislocation densities, and different densities of intragranular defects. All these factors affect the availability and spatial distribution of preferred precipitation sites, thus helping to determine the size and spatial distribution of metal-rich particles.
Several studies [117–119] indicate that each type of structural defect has its own capacity for transition metals, which affects the ability of segregated metals to aggregate at the defect and eventually form precipitates during crystal cool-down. Evidence for this is shown in Fig. 5.17, which compares grain boundary decoration in ribbon and in cast mc-Si samples. The grain boundary locations were determined from the intensity of the elastically scattered X-ray beam in the direction of the detector (a function of grain orientation). In the cast mc-Si sample, metal silicide precipitates
Fig. 5.17. μ-XRF maps of iron distributions in ribbon and directionally-solidified mc-Si, using the elastically scattered X-ray beam peak intensity to determine grain structure. Metal silicide nanoprecipitates are detected along certain structural defects in the ingot mc-Si, whereas in ribbon materials no metal-rich particles are detected. This difference can stem from two possible (non-exclusive) phenomena: (a) the faster cooling rate of ribbon mc-Si and a low metal concentration favor the formation of smaller precipitates below the current detection limits of μ-XRF, and (b) the structural defects in ribbon (especially 60° twin boundaries) have less capacity for metals than the high-angle twin and random grain boundaries of cast mc-Si and sheet materials
are detected at some grain boundaries but not at others, while no metal-rich particles are detected in ribbon. While differences in grain boundary metal decoration between cast and ribbon materials are also influenced by other factors (namely crystal cooling rates), differences within the same material (e.g., cast) can be explained only by differences in grain boundary character. The spatial density and character of structural defects are influenced by crystal growth variables, and thus vary considerably between different mc-Si materials.
Likewise, the nature and degree of metal–structural defect interactions also vary. Large-angle and random grain boundaries, common to sheet [120, 121] and cast mc-Si, are the most favorable sites for precipitate nucleation, according to model defect studies [117, 119]. This is supported by the greater abundance of metal nanoprecipitates detected by μ-XRF at these defect types (e.g., Fig. 5.10). Such structural defects with large capacities for impurities will also act as internal gettering sites, reducing the metal point defect concentration [81] and increasing the minority carrier diffusion length within the grains. On the other hand, 60° twin boundaries, common to both cast mc-Si and ribbon [29, 30, 122], offer fewer segregation and nucleation sites for metals. Metal precipitates are seldom observed by μ-XRF at such locations in these materials.
5.9 Discussion: Engineering of Metal-Related Nanodefects by Altering the Distributions and Chemical States of Metals in mc-Si

The diffusivity and solubility of interstitial transition metals increase exponentially with temperature [22]. Thus, any process during which high temperatures are used has the potential to change the distribution, and even the chemical state, of metals within the wafer. This same exponential temperature dependence also means that metal atoms quickly become supersaturated as the sample is cooled. The degree of supersaturation and the temperature can influence the precipitation behavior of these metals. Thus, the temperature profile during cooling, both during crystal growth and during subsequent device processing, has a strong impact on the final distribution of metals within mc-Si. Very fast cooling rates result in a homogeneous distribution of predominantly dissolved metals and their complexes, and lower the minority carrier diffusion length. Slower cooling results in lower spatial densities of larger particles, enhancing the minority carrier diffusion length. In as-grown mc-Si material, one must conceive of iron precipitates at grain boundaries and intragranular structural defects as effective reservoirs of metals. When metal atoms accumulate at these reservoirs, the metal defect concentration elsewhere is reduced, improving the diffusion lengths in those regions. On the other hand, when metal silicide precipitates are partially dissolved during processing, metals can diffuse from these reservoirs and contaminate neighboring regions, effectively increasing the bulk minority carrier trap density and reducing the bulk diffusion lengths, as seen in Fig. 5.14. This effect is most pronounced for cast mc-Si, in which slow cooling during crystal growth promotes the formation of larger metal silicide precipitates (and a reduction in metal point defects) in as-grown material.
With a better understanding of the chemical nature and distribution of metals in mc-Si, as well as their evolution during processing, it becomes clear that the following four paths are available to reduce their impact on minority carrier lifetime:

(1) Decrease the total impurity content. This is an obvious solution widely used in IC technology. However, extensive purification of silicon feedstock and growth surfaces, and elaborate cleanroom facilities, may not always be economically viable for the solar cell industry.
5 Structural, Elemental, and Chemical Complex Defects in Silicon
(2) Remove impurities after crystal growth via gettering. While gettering has proven effective in this study and numerous others, and while there is certainly room for further improvement and optimization that takes new understanding [123] into account, there are physical [29, 124] and economic limitations to gettering which make it difficult to remove all metals.

(3) Passivate the defects. Although hydrogen passivation is generally agreed to be very effective overall [29, 124–126], some regions of mc-Si do not improve with passivation [29, 124], especially those with high densities of structural defects [127, 128].

(4) Reduce the density of metal-related defects via intelligent crystal growth, processing, and defect engineering. Although often overlooked, this alternative can provide another boost to material performance, and can be employed in combination with the other alternatives listed previously.

While it is not uncommon for the majority of metals to be contained in micron-sized inclusions, the average distances between these particles are very large, so they cannot have a significant direct impact on the minority carrier diffusion length. Unlike these large inclusions, smaller metal-rich particles and interstitial metals are present in significantly higher spatial densities and are much more detrimental to solar cell device performance. These observations lead to the conclusion that, to maximize the minority carrier diffusion length without changing the total metal concentration, all metals should be contained in large, micron-sized clusters separated by several hundreds of microns, thus minimizing the interaction between metal atoms and charge carriers. To test this hypothesis, we purposely engineered different metal distributions within heavily contaminated mc-Si material and compared these directly to the minority carrier diffusion length.
We found that metal impurity distributions can be predictably engineered, for example, by controlling the sample cooling rate from high temperatures [129]. Three high-purity mc-Si samples grown by the floating-zone (FZ) technique were heavily contaminated with copper, nickel, and iron at 1200°C (the metal concentrations were determined by their solubilities at the diffusion temperature) to mimic the high metal content likely to be found in future solar-grade Si material, and were subjected to different cooling rates: quench (200°C/s cooling rate), slow cool (3–8°C/s), and quench to room temperature followed by a re-anneal at 655°C terminated by a slow cool. The metal nanodefect distributions in these three samples were mapped using μ-XRF, and the minority carrier diffusion lengths were determined by SR-XBIC. Very fast cooling rates result in a homogeneous distribution of predominantly dissolved metals and their complexes. The histogram labeled "quench" in Fig. 5.18 shows a narrow distribution of minority carrier diffusion lengths under 10 μm, unacceptable for solar cell devices. The sample quenched and subsequently re-annealed exhibits a fine distribution of nanodefect clusters, tens of nanometers in size, with lower spatial densities than in the quenched sample. Simultaneously, the minority carrier diffusion length increases almost two-fold (Fig. 5.18). Finally, a low density of micron-sized defect clusters was observed in the slowly cooled sample, which
A.A. Istratov et al.
Fig. 5.18. Effect of the distribution of metal defects on material performance. Material performance (minority-carrier diffusion length histograms, left) in three differently cooled samples (quench, quench and re-anneal, slow cool) is compared with the size and spatial distributions of metal defects (high-resolution μ-XRF maps, right; XRF copper counts per second plotted against x and y coordinates in μm). The material with microdefects in lower spatial densities clearly outperforms materials with smaller nanodefects in higher spatial densities, despite the fact that all materials contain the same total amount of metals
has a maximum minority carrier diffusion length higher than the quenched sample by a factor of four. These results provide direct evidence for the correlation between changes in metal defect distribution and enhancement of material performance.
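The inverse relation between dissolved-metal trap density and diffusion length that underlies these results can be sketched with the standard Shockley-Read-Hall capture estimate, τ = 1/(σ·v_th·N_t) and L = sqrt(D·τ). The capture cross-section, thermal velocity, and diffusion coefficient below are generic textbook-scale values chosen for illustration, not numbers from this chapter.

```python
import math


def srh_lifetime_s(trap_density_cm3: float,
                   sigma_cm2: float = 1e-14,       # assumed capture cross-section
                   v_th_cm_s: float = 1e7) -> float:  # assumed thermal velocity
    """Capture-limited minority carrier lifetime: tau = 1 / (sigma * v_th * N_t)."""
    return 1.0 / (sigma_cm2 * v_th_cm_s * trap_density_cm3)


def diffusion_length_um(trap_density_cm3: float,
                        d_cm2_s: float = 30.0) -> float:  # assumed electron D in p-Si
    """Minority carrier diffusion length L = sqrt(D * tau), in micrometers."""
    return math.sqrt(d_cm2_s * srh_lifetime_s(trap_density_cm3)) * 1e4


# Collecting the same total metal content into a few large precipitates
# (low effective point-defect trap density N_t) lengthens L; redistributing
# it as dissolved point defects (high N_t) shortens L.
```

This is why the engineered "few large clusters" distribution outperforms the quenched one even though the total metal content is identical: only the dissolved fraction enters N_t.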
5.10 Summary and Conclusions

Silicon wafers are complex environments, with varying concentrations of intrinsic point defects, non-metallic impurities such as oxygen, metallic impurities such as iron and copper, and structural defects such as dislocations and grain boundaries. These defect types cannot be viewed in isolation; they interact with one another and with the crystal growth and device processing conditions. Thus, understanding the physical properties of transition metal precipitates in silicon, the interaction of metals with structural defects, and the effects of processing opens ways both to mitigate their detrimental impact on device performance and to engineer metal distributions in solar cells and integrated circuits.
References

1. A. Seeger, K.P. Chik, Phys. Stat. Sol. A 29, 455 (1968)
2. W. Bergholz, D. Gilles, Phys. Stat. Sol. B 222, 5 (2000)
3. V.V. Voronkov, J. Cryst. Growth 59, 625 (1982)
4. R. Falster, V.V. Voronkov, F. Quast, Phys. Stat. Sol. A 222, 219 (2000)
5. R. Falster, V.V. Voronkov, Mater. Sci. Eng. B 73, 87 (2000)
6. D. Gilles, E.R. Weber, S.K. Hahn, Phys. Rev. Lett. 64, 196 (1990)
7. R. Falster, D. Gambaro, M. Olmo, M. Cornara, H. Korb, Mater. Res. Soc. Symp. Proc. 510, 27 (1998)
8. A. Goetzberger, W. Shockley, J. Appl. Phys. 31, 1821 (1960)
9. M. Miyazaki, M. Sano, S. Sumita, N. Fujino, Jpn. J. Appl. Phys. Lett. 30, L295 (1991)
10. J. Baumann, C. Kaufmann, M. Rennau, T. Werner, T. Gessner, Microelectron. Eng. 33, 283 (1997)
11. O. Breitenstein, J.P. Rakotoniaina, M. Hejjo Al Rifai, M. Werner, Progr. Photovolt. Res. Appl. 12, 529 (2004)
12. T. Buonassisi, O.F. Vyvenko, A.A. Istratov, E.R. Weber, G. Hahn, D. Sontag, J.P. Rakotoniaina, O. Breitenstein, J. Isenberg, R. Schindler, J. Appl. Phys. 95, 1556 (2004)
13. A.A. Istratov, H. Hieslmair, E.R. Weber, Appl. Phys. A: Mater. Sci. Process. 70, 489 (2000)
14. A.A. Istratov, E.R. Weber, J. Electrochem. Soc. 149, G21 (2002)
15. E.P. Burte, W. Aderhold, Solid-State Electron. 41, 1021 (1997)
16. W.B. Henley, L. Jastrzebski, N.F. Haddad, J. Non-Cryst. Solids 187, 134 (1995)
17. K. Hiramoto, M. Sano, S. Sadamitsu, N. Fujino, Jpn. J. Appl. Phys. Lett. 28, L2109 (1989)
18. K. Lee, A. Nussbaum, Solid-State Electron. 23, 655 (1980)
19. M. Morita, Y. Muramatsu, K. Watanabe, N. Nishio, T. Taketomi, T. Shimono, in Diagnostic Techniques for Semiconductor Materials and Devices, ed. by J.L. Benton, G.N. Maracas, P. Rai-Choudhury (The Electrochem. Soc., Pennington, 1992), p. 152
20. S.S. Simeonov, M.D. Ivanovich, Phys. Stat. Sol. (a) 82, 275 (1984)
21. K. Graff, Metal Impurities in Silicon-Device Fabrication (Springer, Berlin, 2000)
22. E.R. Weber, Appl. Phys. A: Solids Surf. 30, 1 (1983)
23. S.M. Myers, M. Seibt, W. Schröter, J. Appl. Phys. 88, 3795 (2000)
24. A.A. Istratov, W. Huber, E.R. Weber, J. Electrochem. Soc. 150, G244 (2003)
25. T.S. Fell, P.R. Wilshaw, J. Phys. IV 1, C6 (1991)
26. M. Kittler, W. Seifert, V. Higgs, Phys. Stat. Sol. A 137, 327 (1993)
27. V. Kveder, M. Kittler, W. Schröter, Phys. Rev. B 63, 115208 (2001)
28. H.J. Möller, C. Funke, M. Rinio, S. Scholz, Thin Solid Films 487, 179–187 (2005)
29. G. Hahn, A. Schönecker, J. Phys.: Condens. Matter 16, R1615 (2004)
30. J.P. Kalejs, Solid State Phenom. 95–96, 159 (2004)
31. D. Macdonald, A. Cuevas, A. Kinomura, Y. Nakano, L.J. Geerligs, J. Appl. Phys. 97, 033523 (2005)
32. D. Macdonald, A. Cuevas, A. Kinomura, Y. Nakano, in 29th IEEE Photovoltaic Specialists Conference, New Orleans, USA, 2002
33. A.A. Istratov, T. Buonassisi, R.J. McDonald, A.R. Smith, R. Schindler, J.A. Rand, J.P. Kalejs, E.R. Weber, J. Appl. Phys. 94, 6552 (2003)
34. S.A. McHugo, Appl. Phys. Lett. 71, 1984 (1997)
35. S.A. McHugo, A.C. Thompson, I. Périchaud, S. Martinuzzi, Appl. Phys. Lett. 72, 3482 (1998)
36. S.A. McHugo, A.C. Thompson, G. Lamble, C. Flink, E.R. Weber, Physica B 273–274, 371 (1999)
37. S.A. McHugo, A.C. Thompson, C. Flink, E.R. Weber, G. Lamble, B. Gunion, A. MacDowell, R. Celestre, H.A. Padmore, Z. Hussain, J. Cryst. Growth 210, 395 (2000)
38. S.A. McHugo, A.C. Thompson, A. Mohammed, G. Lamble, I. Périchaud, S. Martinuzzi, M. Werner, M. Rinio, W. Koch, H.-U. Höfs, C. Häßler, J. Appl. Phys. 89, 4282 (2001)
39. S.A. McHugo, A. Mohammed, A.C. Thompson, B. Lai, Z. Cai, J. Appl. Phys. 91, 6396 (2002)
40. O.F. Vyvenko, T. Buonassisi, A.A. Istratov, H. Hieslmair, A.C. Thompson, R. Schindler, E.R. Weber, J. Appl. Phys. 91, 3614 (2002)
41. T. Buonassisi, A.A. Istratov, M.D. Pickett, M.A. Marcus, G. Hahn, S. Riepe, J. Isenberg, W. Warta, G. Willeke, T. Ciszek, E.R. Weber, Appl. Phys. Lett. 87, 044101 (2005)
42. S.A. McHugo, A.C. Thompson, C. Flink, E.R. Weber, G. Lamble, B. Gunion, A. MacDowell, R. Celestre, H.A. Padmore, Z. Hussain, J. Cryst. Growth 210, 395 (2000)
43. O.F. Vyvenko, T. Buonassisi, A.A. Istratov, E.R. Weber, J. Phys.: Condens. Matter 16, S141 (2004)
44. T. Buonassisi, A.A. Istratov, M.A. Marcus, M. Heuer, M.D. Pickett, B. Lai, Z. Cai, S.M. Heald, E.R. Weber, Solid State Phenom. 108–109, 577 (2005)
45. R.N. Hall, J.H. Racette, J. Appl. Phys. 35, 379 (1964)
46. A.A. Istratov, C. Flink, H. Hieslmair, E.R. Weber, T. Heiser, Phys. Rev. Lett. 81, 1243 (1998)
47. S.K. Estreicher, Phys. Rev. B 60, 5375 (1999)
48. J.-L. Maurice, C. Colliex, Appl. Phys. Lett. 55, 241 (1989)
49. A. Broniatowski, Philos. Mag. B 66, 767 (1992)
50. A.A. Istratov, H. Hedemann, M. Seibt, O.F. Vyvenko, W. Schröter, T. Heiser, C. Flink, H. Hieslmair, E.R. Weber, J. Electrochem. Soc. 145, 3889 (1998)
51. M. Seibt, H. Hedemann, A.A. Istratov, F. Riedel, A. Sattler, W. Schröter, Phys. Stat. Sol. (a) 171, 301 (1999)
52. R. Sachdeva, A.A. Istratov, E.R. Weber, Appl. Phys. Lett. 79, 2937 (2001)
53. M. Seibt, in Crystalline Defects and Contamination: Their Impact and Control in Device Manufacturing II, ed. by B.O. Kolbesen et al. (The Electrochem. Soc., Pennington, 1997), p. 243
54. M. Seibt, K. Graff, J. Appl. Phys. 63, 4444 (1988)
55. C. Flink, H. Feick, S.A. McHugo, W. Seifert, H. Hieslmair, T. Heiser, A.A. Istratov, E.R. Weber, Phys. Rev. Lett. 85, 4900 (2000)
56. M. Seibt, in Semiconductor Silicon 1990, ed. by H.R. Huff, K.G. Barraclough, J.I. Chikawa (The Electrochem. Soc., Pennington, 1990), p. 663
57. B. Shen, T. Sekiguchi, J. Jablonski, K. Sumino, J. Appl. Phys. 76, 4540 (1994)
58. B. Shen, J. Jablonski, T. Sekiguchi, K. Sumino, Jpn. J. Appl. Phys. 35, 4187 (1996)
59. B. Shen, T. Sekiguchi, R. Zhang, Y. Shi, H. Shi, K. Yang, Y. Zheng, K. Sumino, Jpn. J. Appl. Phys. 35, 3301 (1996)
60. J.P. Kalejs, B. Bathey, C. Dubé, J. Cryst. Growth 109, 174 (1991)
61. A.A. Istratov, H. Hieslmair, O.F. Vyvenko, E.R. Weber, R. Schindler, Sol. Energy Mater. Sol. Cells 72, 441 (2002)
62. T. Buonassisi, O.F. Vyvenko, A.A. Istratov, E.R. Weber, G. Hahn, D. Sontag, J.P. Rakotoniaina, O. Breitenstein, J. Isenberg, R. Schindler, J. Appl. Phys. 95, 1556 (2004)
63. S.A. McHugo, A.C. Thompson, A. Mohammed, G. Lamble, I. Périchaud, S. Martinuzzi, M. Werner, M. Rinio, W. Koch, H.-U. Höfs, C. Häßler, J. Appl. Phys. 89, 4282 (2001)
64. E. Nes, J. Washburn, J. Appl. Phys. 43, 2005 (1972)
65. G. Das, J. Appl. Phys. 44, 4459 (1973)
66. E. Nes, G. Lunde, J. Appl. Phys. 43, 1835 (1972)
67. J.K. Solberg, Acta Cryst. A34, 684 (1978)
68. R. Rizk, X. Portier, G. Allais, G. Nouet, J. Appl. Phys. 76, 952 (1994)
69. M. Seibt, M. Griess, A.A. Istratov, H. Hedemann, A. Sattler, W. Schröter, Phys. Stat. Sol. (a) 166, 171 (1998)
70. T.F. Ciszek, T.H. Wang, J. Cryst. Growth 237–239, 1685 (2002)
71. D.M. Lee, G.A. Rozgonyi, Appl. Phys. Lett. 65, 350 (1994)
72. G. Kissinger, G. Morgenstern, H. Richter, J. Appl. Phys. 75, 4994 (1994)
73. M. Kittler, C. Ulhaq-Bouillet, V. Higgs, J. Appl. Phys. 78, 4573 (1995)
74. O.F. Vyvenko, T. Buonassisi, A.A. Istratov, E.R. Weber, M. Kittler, W. Seifert, J. Phys. Condens. Matter 14, 13079 (2002)
75. N.T. Barrett, P.N. Gibson, G.N. Greaves, P. Mackle, K.J. Roberts, M. Sacchi, J. Phys. D: Appl. Phys. 22, 542 (1989)
76. L. Magaud, S. Guillet, T. Lopez-Rios, Physica B 225, 225 (1996)
77. A. Cros, M.O. Aboelfotoh, K.N. Tu, J. Appl. Phys. 67, 3328 (1990)
78. Z. An, C. Kamezawa, M. Hirai, M. Kusaka, M. Iwami, J. Phys. Soc. Jpn. 71, 2948 (2002)
79. A.A. Istratov, H. Hieslmair, E.R. Weber, Appl. Phys. A: Mater. Sci. Process. 70, 489 (2000)
80. D.H. Macdonald, L.J. Geerligs, A. Azzizi, J. Appl. Phys. 95, 1021 (2004)
81. J. Lu, M. Wagener, G. Rozgonyi, J. Rand, R. Jonczyk, J. Appl. Phys. 94, 140 (2003)
82. R.A. Sinton, T. Mankad, S. Bowden, N. Enjalbert, in 19th European Photovoltaic Solar Energy Conference, Paris, France, 2004
83. L.J. Geerligs, in 14th Workshop on Crystalline Silicon Solar Cells & Modules: Materials and Processes, Winter Park, CO, 2004
84. A. Schönecker, L.J. Geerligs, A. Müller, Solid State Phenom. 95–96, 149 (2003)
85. R.B. Hall, A.M. Barnett, et al., US patents 6,420,643; 6,414,237; 6,326,021; 6,211,455; 6,207,891; 6,111,191 (2000–2002)
86. J.S. Culik, I.S. Goncharovsky, J.A. Rand, A.M. Barnett, Progr. Photovolt. 10, 119 (2002)
87. J. Rand, G.A. Rozgonyi, R. Jonczyk, S. Batta, J. Lu, R. Reedy, R. Zhang, in 12th Workshop on Crystalline Silicon Solar Cell Materials and Processes, Breckenridge, CO, 2002
88. T.F. Ciszek, in Crystal Growth Technology, ed. by H.J. Scheel, T. Fukuda (Wiley, Chichester, 2003), p. 267
89. G.A. Rozgonyi, J. Lu, R. Zhang, J. Rand, R. Jonczyk, Diffusion & Defect Data Pt. B: Solid State Phenom. 95–96, 211 (2004)
90. R. Zhang, G. Duscher, J. Rand, G.A. Rozgonyi, in 12th Workshop on Crystalline Silicon Solar Cell Materials and Processes, ed. by B. Sopori, Breckenridge, CO, 2002, p. 206
91. M. Rinio, C. Ballif, T. Buonassisi, D. Borchert, in 19th European Photovoltaic Solar Energy Conference and Exhibition, Paris, France, 2004
92. S. Mahajan, K.S.S. Harsha, Principles of Growth and Processing of Semiconductors (WCB/McGraw-Hill, New York, 1999)
93. T.B. Massalski, H. Okamoto, P.R. Subramanian, L. Kacprzak, Binary Alloy Phase Diagrams (ASM International, Materials Park, 1990)
94. D.R. Uhlmann, B. Chalmers, K.A. Jackson, J. Appl. Phys. 35, 2986 (1964)
95. P.S. Ravishankar, J.P. Dismukes, W.R. Wilcox, J. Cryst. Growth 71, 579 (1985)
96. B. Chalmers, J. Cryst. Growth 82, 70 (1987)
97. N.V. Abrosimov, A.V. Bazhenov, V.A. Tatarchenko, J. Cryst. Growth 82, 203 (1987)
98. R.C. Dorward, J.S. Kirkaldy, J. Mater. Sci. 3, 502 (1968)
99. A.A. Istratov, W. Huber, E.R. Weber, Appl. Phys. Lett. 85, 4472 (2004)
100. D. West, S.K. Estreicher, S. Knack, J. Weber, Phys. Rev. B 68, 035310 (2003)
101. A.H. Cottrell, B.A. Bilby, Proc. Phys. Soc. Sect. A 62, 49 (1949)
102. D. Blavette, E. Cadel, A. Fraczkiewicz, A. Menand, Science 286, 2317 (1999)
103. A.A. Istratov, T. Buonassisi, W. Huber, E.R. Weber, in 14th NREL Workshop on Crystalline Silicon Solar Cell Materials and Processes, Winter Park, CO, USA, 2004
104. W.C. Dash, J. Appl. Phys. 27, 1193 (1956)
105. W.T. Stacy, D.F. Allison, T.C. Wu, J. Electrochem. Soc. 129, 1128 (1982)
106. A.A. Istratov, T. Buonassisi, M.A. Marcus, T.F. Ciszek, E.R. Weber, in 14th NREL Workshop on Crystalline Silicon Solar Cell Materials and Processes, 2004
107. B. Sopori, L. Jastrzebski, T. Tan, in 25th Photovoltaic Specialists Conference, Washington, DC, 1996
108. R. Einhaus, F. Deuerinckx, E. Van Kerscaver, J. Szlufcik, F. Durand, P.J. Ribeyron, J.C. Duby, D. Sarti, G. Goaer, G.N. Le, I. Perichaud, L. Clerc, S. Martinuzzi, Mater. Sci. Eng. B 58, 81 (1999)
109. S. Peters, J.Y. Lee, C. Ballif, C. Borchert, S.W. Glunz, W. Warta, G. Willeke, in 29th IEEE Photovoltaic Specialists Conference, New Orleans, 2002
110. A. Rohatgi, D.S. Kim, K. Nakayashiki, V. Yelundur, B. Rounsaville, Appl. Phys. Lett. 84, 145 (2004)
111. J. Knobloch, B. Voss, K. Leo, in 18th IEEE Photovoltaic Specialists Conference, Las Vegas, USA, 1985
112. S.K. Estreicher, D. West, in 12th Workshop on Crystalline Silicon Solar Cell Materials and Processes, Golden, CO, 2002
113. S.P. Murarka, C.C. Chang, Appl. Phys. Lett. 37, 639 (1980)
114. S.A. McHugo, E.R. Weber, S.M. Myers, G.A. Petersen, Appl. Phys. Lett. 69, 3060 (1996)
115. L.L. Kazmerski, P.J. Ireland, T.F. Ciszek, Appl. Phys. Lett. 36, 323 (1980)
116. C.H. Seager, Annu. Rev. Mater. Sci. 15, 271 (1985)
117. A. Ihlal, R. Rizk, O.B.M. Hardouin Duparc, J. Appl. Phys. 80, 2665 (1996)
118. J. Chen, D. Yang, X. Zhenqiang, T. Sekiguchi, J. Appl. Phys. 97, 033701 (2005)
119. J. Chen, T. Sekiguchi, D. Yang, F. Yin, K. Kido, S. Tsurekawa, J. Appl. Phys. 96, 5490 (2004)
120. G.A. Rozgonyi, J. Lu, R. Zhang, J. Rand, R. Jonczyk, Solid State Phenom. 95–96, 211 (2004)
121. J. Rand, G.A. Rozgonyi, J. Lu, R. Reedy, in 29th IEEE Photovoltaic Specialists Conference, New Orleans, USA, 2002
122. J.P. Kalejs, Sol. Energy Mater. Sol. Cells 72, 139 (2002)
123. W. Schröter, V. Kveder, M. Seibt, A. Sattler, E. Spiecker, Sol. Energy Mater. Sol. Cells 72, 299 (2002)
124. P. Geiger, G. Kragler, G. Hahn, P. Fath, Sol. Energy Mater. Sol. Cells 85, 559 (2005)
125. A. Rohatgi, in 15th Workshop on Crystalline Silicon Solar Cells and Modules: Materials and Processes, Vail, Colorado, USA, 2005
126. J.I. Hanoka, Sol. Energy Mater. Sol. Cells 65, 231 (2001)
127. S. Pizzini, Phys. Stat. Sol. A 171, 123 (1999)
128. M. Kittler, W. Seifert, Solid State Phenom. 95–96, 197 (2004)
129. T. Buonassisi, A.A. Istratov, M.A. Marcus, B. Lai, Z. Cai, S.M. Heald, E.R. Weber, Nat. Mater. 4, 676 (2005)
130. CRC Handbook of Chemistry and Physics, 84th edn., ed. by D.R. Lide (CRC Press, Boca Raton, 2003)
6 Surface and Interface Chemistry for Gate Stacks on Silicon

M.M. Frank and Y.J. Chabal
6.1 Introduction: The Silicon/Silicon Oxide Interface at the Heart of Electronics

The silicon metal-oxide-semiconductor field-effect transistor (MOSFET) has been at the heart of the information technology revolution. During the past four decades, we have witnessed an exponential increase in the integration density of logic circuits ("Moore's law" [1]), powered by exponential reductions in MOSFET device size. Despite these changes, silicon oxide, usually grown by simple exposure to oxygen gas or water vapor at elevated temperatures, until recently remained the gate insulator of choice. Its success has been based on the excellent properties of the silicon/silicon dioxide interface. This interface has only about 10¹² cm⁻² electrically active defects, and after a simple passivating hydrogen exposure, only 10¹⁰ cm⁻² defects remain – one defect for every 100,000 interface atoms! If silicon oxide (SiO2) is such an excellent insulator and has served us well for decades, why would we even consider replacing it? Continued "scaling" of the MOSFET gate length has required simultaneous scaling of other geometrical and electronic device parameters, such as gate insulator thickness, threshold and supply voltages, and body doping ("Dennard's scaling theory" [2]). During the past decade, sustaining this historical complementary MOS (CMOS) scaling trend has become increasingly challenging [3, 4]. Voltage scaling slowed down dramatically as supply voltages approached 1 V, due to constraints on the threshold voltage needed to limit source-to-drain leakage in the transistor "off" state. In addition, SiO2 or silicon oxynitride (SiON) gate dielectric scaling has all but stopped at thicknesses close to 10 Å, as the gate leakage currents due to quantum mechanical tunneling through this "insulator" have reached values of >100 A/cm² in transistors aimed at high-performance applications, e.g., in servers.
This causes computer chips to generate heat with power densities of 100 W/cm² or more, making chip cooling technologically challenging and costly. Scaling of the gate oxide in low-power circuits, e.g., for
cell phones, has already reached its limits near 20 Å, to ensure substantially lower gate leakage and thereby sufficient battery life. To sustain continued density and performance increases, much attention has turned to modified device structures [4] and to new materials, among them high-permittivity ("high-k") gate dielectrics such as hafnium oxide (HfO2) to replace SiO2 or SiON [5–7], and metal gate electrodes to replace doped polycrystalline silicon (poly-Si) [5]. High-k gate insulators offer two main benefits, enabled by their increased relative permittivity compared to SiO2 (e.g., k_SiO2 = 3.9; k_HfO2 ≈ 20–25). This becomes apparent when visualizing the gate stack as a parallel-plate capacitor. First, in a short-channel transistor, a high-k dielectric more strongly confines the electric field generated by the gate electrode to within the gate dielectric than an SiO2 layer of similar physical thickness and therefore similar gate leakage. This improves the control of the gate electrode over the band bending in the channel. In the transistor off state, that translates into lower source-to-drain leakage, which in turn allows further shrinking of the gate length at a given off-state current specification. High-k dielectrics thus enable continued scaling according to Moore's and Dennard's laws. Second, the switching speed of a high-k device is expected to be higher than with an SiO2 insulator of the same thickness t. This is due to the higher inversion charge Q_inv in the channel at a given gate voltage V_gate when the permittivity k, and therefore the capacitance C_gate of the gate stack, is increased: Q_inv = C_gate · V_gate, with C_gate ∝ k/t. A higher inversion charge results in higher on-state currents that can flow in the channel [8].
It is convenient to define a "capacitance equivalent thickness" (CET) of the high-k layer as the physical thickness of an SiO2 layer with the same areal capacitance, i.e., C_gate ∝ k_high-k/t_high-k = k_SiO2/CET, so that CET = t_high-k · k_SiO2/k_high-k. Maximizing the gate capacitance is thus equivalent to minimizing the CET. With the advent of high-k dielectrics, it may seem that SiO2 will soon be a material of the past. However, all material changes must be made in such a way that the outstanding degree of electrical perfection achieved with SiO2 is preserved. With continued scaling to CET values near and below 10 Å, the starting surface plays an ever more critical role in terms of perfection, flatness, and cleanliness. Growth of the gate dielectric is becoming an increasingly delicate process requiring atomic control, and the high-k/silicon interface structure and electrical quality in the fully processed device must be tuned to an optimum. Since the silicon/silicon oxide interface is of such remarkably high quality, it will likely survive even in high-k gate stacks as a subnanometer layer between the silicon channel and the high-k dielectric, either grown intentionally or formed during high-k oxide deposition. It is thus clear that we need to gain an ever deeper understanding of silicon oxide films and the silicon/silicon oxide interface on the atomic scale. This chapter therefore describes the fundamental silicon surface science associated with the continued progress of nanoelectronics. We shall focus on the ultrathin oxide films encountered during silicon cleaning and high-k gate stack fabrication. In Sect. 6.2, we review the current practices and understanding of silicon cleaning. In Sect. 6.3, we describe the fabrication of high-k gate stacks, focusing on the continued importance of the silicon/silicon oxide interface.
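The CET arithmetic above is easy to sketch numerically. The permittivities below follow the values quoted in the text (k_SiO2 = 3.9, k_HfO2 in the ~20–25 range); the layer thicknesses are hypothetical examples, not device data.

```python
K_SIO2 = 3.9    # relative permittivity of SiO2 (value from the text)
K_HFO2 = 22.0   # HfO2, within the ~20-25 range quoted in the text


def cet_layer_A(thickness_A: float, k: float) -> float:
    """CET of a single dielectric layer: t * k_SiO2 / k (in Angstrom)."""
    return thickness_A * K_SIO2 / k


def cet_stack_A(t_interfacial_sio2_A: float, t_highk_A: float,
                k_highk: float = K_HFO2) -> float:
    """For plate capacitors in series, CETs simply add; an interfacial
    SiO2 layer contributes its full physical thickness."""
    return t_interfacial_sio2_A + cet_layer_A(t_highk_A, k_highk)


# A 30 A HfO2 film alone is electrically equivalent to only ~5.3 A of SiO2,
# but adding a hypothetical 5 A interfacial oxide roughly doubles the stack
# CET -- which is why control of the silicon/silicon oxide interface matters.
highk_only = cet_layer_A(30.0, K_HFO2)
with_interface = cet_stack_A(5.0, 30.0)
```

This series-capacitor view also makes the second benefit from the text explicit: at fixed physical thickness, a larger k gives a smaller CET, a larger C_gate, and hence a larger inversion charge Q_inv = C_gate · V_gate.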
6.2 Current Practices and Understanding of Silicon Cleaning

6.2.1 Introduction

Aqueous treatment of silicon surfaces is the first step in controlling the interfaces for further processing [9]. It typically involves treatment in a basic and/or an acidic solution mixed with H2O2, the basis of the RCA Standard Clean developed by Kern in 1965 [10], sometimes with an HF etch in between (modified RCA clean). This cleaning procedure is effective in removing (1) particles, by basic solutions that simultaneously oxidize and etch the surface (e.g., NH4OH/H2O2/H2O, referred to as SC-1), and (2) metal and hydrocarbon impurities, by acid/peroxide solutions (e.g., HCl/H2O2/H2O, referred to as SC-2, and H2SO4/H2O2/H2O, referred to as the piranha clean). This process leaves 6–15 Å of hydroxylated oxide on the Si surface, which prevents carbon recontamination of the Si. For applications requiring a stable surface with no interfacial oxide, an "HF-last" step is performed to dissolve the surface oxide completely in HF acid. The early electronic measurements of Buck and McKim (1958) [11] demonstrated the high degree of passivation of HF-treated Si surfaces, which are now known to be oxide-free and passivated with H. The H-terminated surfaces are hydrophobic in nature and are not wetted by aqueous solutions. Given the enormous practical importance of wet chemical treatment of silicon wafers during processing, each aspect of the process has been studied extensively [9] and will be described in more detail in the next sections. The chemically grown oxides exhibit a more complex chemical composition than high-quality stoichiometric, thermally grown SiO2 (the standard for gate oxides). The importance of understanding these chemical oxides cannot be overstated, since they provide the foundation for SiO2-based gate oxides as well as for high-k gate stacks.
As the CET of high-k gate stacks decreases to ∼10 Å or less, the chemical oxides grown during the preclean make up a large fraction of the total thickness and can, therefore, influence the overall performance of the dielectric. In addition, the deposition of high-k materials by atomic layer deposition (ALD) can be influenced by the nature of the chemical oxide growth (see Sect. 6.3). The mechanism leading to hydrogen passivation of silicon surfaces by HF etching is now well understood [12]. As discussed in the next section, the strong polarization of the Si–Si back-bonds of a surface Si–F species leads to removal of that surface atom as SiF4, leaving behind a more neutral (less polar, i.e., more stable) hydrogen-terminated surface. Hydrogen-passivated surfaces are remarkably stable in oxygen, nitrogen, and water vapor [13]. They are only slowly oxidized in air, via radical-mediated reactions. In solutions, hydrogen-passivated surfaces can also remain stable unless dissolved oxygen is present or more aggressive oxidizing agents, such as peroxides, are used. The degree and completeness of oxidation varies depending on the exact reagents. Among the four main methods (SC-1, SC-2, piranha, and nitric acid) [9], the piranha (sulfuric acid/peroxide mixture, SPM) treatment leads to the most homogeneous, thinnest, and hydrogen-free oxide [14]. Starting with hydrogen-free oxides is important because hydrogen within device structures can cause problems (e.g., reliability), although in other cases hydrogen provides surface-state passivation.
An important aspect of wet processing is the potential control of the surface morphology via preferential etching. Indeed, the microstructural state of the surface has been shown to affect subsequent device properties [15, 16]. For instance, surface roughness can degrade the breakdown field strengths of thin gate oxides [15, 17], as well as decrease channel mobilities [15, 16]. Both hydrophilic and hydrophobic surface cleans can affect the morphology of the Si surface. Chemically grown oxides are non-crystalline, limiting the information obtained from diffraction and imaging techniques. In contrast, HF etching leads to H-termination of the crystal surfaces, which can be studied in greater detail by spectroscopic techniques. For instance, interesting differences are observed between the surface structures of Si(100) and Si(111) wafers after HF etching. Under specific pH conditions (e.g., pH ∼8), preferential etching of the Si itself takes place, favoring the formation of (111) facets and resulting in atomically flat, monohydride-terminated Si(111) surfaces [18] and atomically rough, multi-hydride-terminated Si(100) surfaces (see Sect. 6.2.3). Similar behavior is observed when etching H-terminated silicon in hot water [19] and in KOH solutions [20]. Contamination is also an important issue for any cleaning or passivation process, because trace amounts of impurities can drastically influence the properties of subsequently fabricated devices. Contaminants can be intrinsic (e.g., H, OH, H2O, O2, S, Cl, F) or extrinsic (e.g., C, Fe, Ni, Cr, Cu) to the solutions used. The contamination and the associated loss of passivation of H-terminated surfaces are briefly discussed below.

6.2.2 Silicon Cleans Leading to Oxidized Silicon Surfaces

All oxidizing aqueous cleans, including the RCA Standard Clean [21], leave 6–15 Å of hydroxylated oxide on the Si surface that prevents recontamination of the Si.
Such surfaces are hydrophilic in nature and are easily wetted by aqueous solutions. The chemical composition of the Si surface after a clean is fundamental to its passivation. The chemically grown oxides exhibit a more complex chemical composition than the high-quality stoichiometric SiO2 grown thermally.

Chemical Composition of Oxidized Surfaces

The properties of chemically grown oxides produced by the various integrated circuit (IC) cleaning techniques are quite similar and have been reviewed by Deal [22]. These techniques include the sulfuric acid-hydrogen peroxide mixture (SPM), standard clean 1 (SC-1), and standard clean 2 (SC-2). The oxides tend to be ∼6–15 Å thick, depending on the process temperature as well as the solution chemistry used [23, 24]. These films are largely stoichiometric but, because they are so thin, exhibit many of the characteristics of the interfacial transition regions of thicker, thermally grown oxides. The large suboxide content characteristic of these chemically grown oxides is shown in the Si 2p X-ray photoelectron spectroscopy (XPS) data of Sugiyama et al. [25] in Fig. 6.1. For two different chemical preparations, the spectra are dominated by the Si crystal substrate and by stoichiometric SiO2, Si⁴⁺. There
Fig. 6.1. X-ray photoelectron spectra, with spin-orbit splitting removed, of the Si 2p3/2 core level associated with native oxides on Si formed by immersion in (top) HNO3 at 45–50°C for 5 min, and (bottom) 4:1:1 H2O:H2O2:NH4OH at 63–80°C for 10 min. The dashed lines are the result of a spectral deconvolution performed by assuming that the chemical shifts and the FWHM (full width at half maximum) values of the various components are the same as for Si⁴⁺, which is associated with stoichiometric SiO2 [25]
is, however, a relatively strong and unambiguous Si2+ contribution corresponding to an interfacial transition region similar to that observed for thermal oxides. In this case, the Si2+ contribution is 10–20% of the Si4+ contribution, indicating that the transition region is a large fraction of the surface layer. Besides Si2+, varying amounts of the other suboxides are present, depending on the exact surface treatment used [25, 26]; they are not discussed here. Dangling-bond defects at the Si/SiO2 interface have been quantified using minority carrier lifetime measurements to extract the surface recombination velocity and surface defect density [27]. Defect densities on the order of 10¹³ cm⁻² are typical for these clean, unannealed native oxides, placing these surfaces one to two orders of magnitude above thermal oxides. Chemical oxidation involves species other than Si and O. All aqueous solutions, of course, are predominantly composed of H2O and are therefore sources of H2O, OH, and H. These chemical species must thus be incorporated in the oxide layer to some extent. Hydrogen bonding is common to all aqueous solutions and is intimately related to the heat of solvation, as well as to the hydrophilic nature of oxide-covered Si surfaces. Hydrogen-bonded O–H is observed on all chemically grown oxides, as shown for instance by the infrared (IR) absorption spectrum in Fig. 6.2, where an H-terminated Si surface was chemically oxidized using the SC-2 step of the RCA clean. The spectra presented in Fig. 6.2 are obtained by subtracting the spectrum
118
M.M. Frank and Y.J. Chabal
Fig. 6.2. Infrared absorption spectra of a Si wafer chemically oxidized in a 4:1:1 solution of H2O:H2O2:HCl at 80°C for 10 min. The spectra are referenced to the corresponding spectra of the H-terminated Si surface prepared by etching in a buffered HF solution [29]. A multiple internal reflection geometry was used with 75 reflections at a 45° internal angle of incidence, as shown in the inset
of H-terminated Si surfaces, and therefore display a negative absorption for the Si–H stretch bands (2100 cm−1). The H-bonded OH stretching vibration peak is at ∼3300 cm−1 and is ∼400 cm−1 wide (FWHM), with an asymmetric line shape. It is quite difficult to distinguish between Si–O–H and H2O in these spectra without access to the 1600–1700 cm−1 spectral region, where the characteristic scissor (i.e., deformation) mode of the H2O molecule is located. The intensity difference observed between the spectra taken in s- and p-polarization indicates that the OH groups must reside in or on the oxide layer. Although these spectra alone cannot resolve the question, it is known in very general terms that most of the IR signal comes from H2O on the surface of the oxide, since gentle heating (100°C) decreases the OH/H2O absorption substantially. The surface OH concentration is extremely important in determining the initial reaction rate for many high-k growth reactions [28] and can be varied through the choice of chemical treatment. Ozone (O3) treatments are the most oxidizing, producing the lowest amount of OH on the surface. SC-1 solutions, on the other hand, produce larger quantities of OH units on the surface and within the films. The OH concentration appears to be related to the degree to which the cleaning solutions produce SiO2: more OH groups are formed when oxide formation is less complete, and vice versa. It is also apparent that Si–H units remain after the growth of chemical oxides [25, 26]. The first convincing evidence came from the IR spectra of Ogawa et al. (1992) [24] shown in Fig. 6.3, where Si–H stretching vibrations are identified at ∼2260 cm−1. Si–H stretches in that region of the spectrum originate from Si–H where the surface Si atom is back-bonded to O atoms [30]. This evidence clearly indicates that the Si–H resides within the oxide matrix. The area density is estimated
6 Surface and Interface Chemistry for Gate Stacks on Silicon
Fig. 6.3. Infrared absorption spectra of six different native oxides on Si wafers: (a) "H2 SO4," a 10-min treatment in 4:1 H2SO4:H2O2 at 85–90°C, (b) "HCl," 10 min in 4:1:1 H2O:H2O2:HCl at 37–65°C, (c) "NH4OH," 10 min in 4:1:1 H2O:H2O2:NH4OH at 63–80°C, (d) "NH4OH + hot HNO3," treatment in "NH4OH" followed by "hot HNO3," (e) "boil HNO3," treatment in HNO3 at 115–125°C, and (f) "hot HNO3," 5 min in HNO3 at 45–60°C. The absorption (indicated by the arrow), which peaks at ∼2260 cm−1, arises from Si–H stretches where the Si is back-bonded to O (i.e., Si–H inside SiO2 or on the upper surface of the oxide) [24]
to be 2–3 × 10¹³ cm⁻². X-ray photoelectron spectroscopy data from the same group suggest that these Si–H units are actually localized near the upper surface of the oxide. If this is indeed the case, these units may be residual Si–H bonds from the original H-terminated hydrophobic surface before the chemical oxidation. Such a picture agrees well with the idea that oxidation proceeds via O-atom insertion into the Si–Si back-bonds of the surface, and is consistent with the observations of Nagasawa et al. (1990) on the initial stages of oxidation of hydrophobic surfaces [31]. Also observed in the spectra of Fig. 6.3 are Si–H stretches in the range of 2080 cm−1, which are best explained by H atoms bonded to unoxidized parts of the surface (i.e., to Si back-bonded to Si atoms rather than to O atoms). The high-frequency shoulder of this band (∼2140 cm−1) is most likely associated with Si–H stretches where some of the Si back-bonds are attached to only one O atom. If the mode at 2080 cm−1 does arise from Si–H at the Si/SiO2 interface, an interesting direction for future studies will be to determine its formation mechanism.
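For scale, the estimated Si–H area density can be compared with the areal density of Si atoms on an ideal Si(100) plane. A minimal sketch, in which the lattice constant and the derived surface density are standard textbook values rather than numbers taken from this chapter:

```python
# Fraction of a monolayer corresponding to the reported Si-H area density.
# The Si lattice constant (5.431 Angstrom) and the Si(100) surface density
# derived from it are standard values, assumed here for illustration.
a = 5.431e-8                 # Si lattice constant in cm
n_si100 = 2.0 / a**2         # Si(100) plane: 2 atoms per a^2 -> ~6.8e14 cm^-2
n_sih = 2.5e13               # mid-range of the 2-3 x 10^13 cm^-2 estimate
fraction = n_sih / n_si100
print(f"Si(100) surface density: {n_si100:.2e} cm^-2")
print(f"Si-H coverage: {fraction:.1%} of a monolayer")  # a few percent
```

The result, a few percent of a monolayer, is consistent with the picture of sparse residual Si–H rather than a continuous hydride layer.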
Structure and Morphology of Oxidized Surfaces

The work of Hahn and Henzler [32], Heyns et al. [17], and Ohmi et al. [15] has correlated electronic device properties with surface structural properties. While surface roughness is intuitively detrimental to semiconductor devices at some scale, the main contribution of this early work was to define the scale at which surface roughness matters for device yield and reliability. Degradation of thin gate oxide breakdown field strengths and of channel mobilities with microscopic surface roughness is now a parameter taken into account by the industry. The morphology of a surface tends to be a function of the complete processing history experienced by the wafer and is therefore quite complicated: the initial surface polish, chemical cleaning, thermal oxidation, and etching processes all influence the surface morphology. This section begins with a discussion of substrate wafers, including chemical mechanical polishing (CMP) and epitaxy. It then considers the effects of chemical cleaning on surface morphology. Finally, future trends in controlling oxidation and interfacial structure are briefly discussed. Near-atomic perfection is achieved by CMP: commercial wafers exhibit a typical surface roughness on the order of 2 Å RMS (root mean square), and surface finishes produced in the laboratory have approached 1 Å RMS [33]. A scanning tunneling microscope (STM) image of such a surface is shown in Fig. 6.4. Although STM images can characterize surface roughness on length scales from 1 to 1000 Å, these surfaces have also been characterized with a variety of other techniques, such as diffraction methods, spanning lengths up to 1 mm, with good correlation observed on all scales [33]. The post-CMP STM image shown in Fig. 6.4 was taken after HF removal of the surface oxide; the surface is hydrophobic and H-terminated [34].
The process to produce H-terminated surfaces is now well understood and will be discussed briefly in Sect. 6.2.3.
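Roughness figures such as the 1–2 Å RMS values quoted here are root-mean-square deviations of the measured height map about its mean plane. A minimal sketch of that figure of merit, in which the synthetic Gaussian height data merely stand in for real STM/AFM data:

```python
import numpy as np

# RMS roughness of a height map, as quoted throughout this section.
# The synthetic height data below are illustrative, not measured data.
rng = np.random.default_rng(0)
height = rng.normal(loc=0.0, scale=2.0, size=(256, 256))  # heights in Angstrom

def rms_roughness(h):
    """Root-mean-square deviation of heights about their mean plane."""
    return np.sqrt(np.mean((h - h.mean()) ** 2))

print(f"RMS roughness: {rms_roughness(height):.2f} Angstrom")  # ~2 Angstrom
```

In practice the same statistic is computed over a background-subtracted (plane-fitted) scan, and its value can depend on the scan length, which is why the text stresses comparisons across length scales.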
Fig. 6.4. Scanning tunneling microscope image of a polished Si(100) surface exhibiting 1.2 Å RMS roughness. The image was taken immediately following an HF dip. Courtesy of P.O. Hahn, Wacker-Chemitronic GmbH, Germany [35]
Fig. 6.5. Typical scanning tunneling microscope images of Si(100) surfaces taken before (top) and after (bottom) an RCA Standard Clean (SC-1 and SC-2). Images taken after a newly developed buffered HF (BHF) treatment showed minimal increases in Si surface roughness due to the BHF [15]
Hydrogen-terminated surfaces are not as stable as oxide-terminated surfaces, so it is not surprising that wafer suppliers ship wafers in the hydrophilic (oxide-covered) state. Supplier polish and clean recipes are proprietary, but presumably the wafers receive something akin to an RCA clean before they are shipped. Another technique that provides atomically "perfect" surfaces is Si molecular beam epitaxy [36], although this technique has not yet been commercialized. Under proper conditions, surfaces formed during commercial Si epitaxial growth by chemical vapor deposition (CVD) are also smoother than chemically cleaned surfaces. In 2005, as-received CZ (Czochralski) Si and epitaxial Si substrates exhibited a surface roughness of ∼2 Å RMS [37]. The consensus is that the acidic peroxide cleans (SPM etch or SC-2) do not cause a substantial increase in the microscopic roughness of as-received wafers. The SC-1 clean (typically 5:1:1 H2O:H2O2:NH4OH at 80°C), on the other hand, has been found to substantially increase the surface roughness [38]. A comparison of the surface roughness of a control wafer and a wafer cleaned in a standard SC-1 process is shown in the STM images of Fig. 6.5 [15]. Control wafers exhibit an RMS roughness of
2 Å. The SC-1 treatment more than doubles the observed roughness, and repeated SC-1 cycles can increase it by as much as a factor of 5, approaching 10 Å RMS. This roughness has been shown to decrease breakdown field strengths by as much as 30% [39] and to degrade channel mobilities by factors of 2–3 [40]. The mechanism leading to roughening in the basic peroxide solution is not completely understood, but it is related to the slow but finite Si etch rate in the SC-1 solution (∼8 Å/min at full strength at 80°C) [15]. The acidic peroxides, on the other hand, do not etch SiO2 and hence do not roughen the oxide surface. A proposed solution to the basic peroxide roughening problem is to reduce the etch rate by reducing the concentration of NH4OH in the SC-1 solution [38]; the etch rate drops to 1 Å/min when the NH4OH concentration is decreased by a factor of 100. Figure 6.6 shows a plot of the measured RMS roughness as a function of NH4OH concentration [41]. One might wonder why the industry works so hard to keep the standard SC-1 solution when it is clearly detrimental to the surfaces. The reason is simple: SC-1 is one of the most efficient particle removal agents known. Indeed, the fact that the basic peroxide solution slightly etches both SiO2 and Si may be precisely why it is such an efficient particle remover. This phenomenon is being investigated in the hope that an optimum concentration can be found that minimizes damage while retaining particle removal efficiency [15]. The microscopic mechanism by which etching roughens the surface is also under investigation. It should be noted that etching alone does not necessarily increase surface roughness; non-uniform etching is the true culprit. For example, Verhaverbeke et al. (1991) found that the Ca concentration in SC-1 dramatically changes the degree to which the surface roughens [42].
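The etch rates quoted above imply a simple material budget per cleaning cycle. A rough sketch, using the rates from the text (∼8 Å/min at full strength, ∼1 Å/min at 1/100 the NH4OH concentration); the 10-min cycle time and the cycle counts are illustrative:

```python
# Material budget for SC-1 cleaning, using the etch rates quoted in the
# text. Cycle duration and cycle count are illustrative assumptions.
def silicon_removed(rate_a_per_min, minutes, cycles=1):
    """Total Si thickness etched, in Angstrom."""
    return rate_a_per_min * minutes * cycles

full = silicon_removed(8.0, 10)    # one 10-min full-strength SC-1 cycle
dilute = silicon_removed(1.0, 10)  # same cycle with 1/100 NH4OH
print(f"Full-strength SC-1: {full:.0f} A of Si removed per 10-min cycle")
print(f"Diluted SC-1:       {dilute:.0f} A of Si removed per 10-min cycle")
```

Tens of angstroms removed per full-strength cycle is large compared with the 2 Å RMS roughness of a control wafer, which is why non-uniformity in such an etch can roughen the surface appreciably.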
Similar studies are needed to fully understand the mechanism of surface roughening and represent a direction for future work. The exact molecular structure of these chemically grown Si/SiO2 interfaces is very difficult to deduce. Native oxide growth has been shown to occur in an extremely controlled manner [43, 44], leading to atomically ordered interfaces under the right conditions. This phenomenon appears, in fact, to be more general: in very careful XPS studies of surface O concentration as a function of time, layer-by-layer initial oxidation of Si has been observed [45, 46]. Layer-by-layer oxidation, of course, requires that one layer finish oxidizing before the next layer begins, and leads by necessity to the conclusion that some form of order must exist at the Si/SiO2 interface.

Contamination Issues Associated with Oxidized Surfaces

One of the main objectives of the development of the RCA clean was to remove organic and metal contaminants from the surface of Si wafers [21]. Although the RCA clean was developed over 40 years ago, it has functioned extremely well; it is still the main clean used prior to gate oxidation, as well as the initial clean in some manufacturing lines. Residual trace metal contamination at the 10¹⁰ cm⁻² level is observed after RCA cleaning and depends on the quality of the chemicals used. These metals can lead to surface roughening through more than one mechanism. Metals can enhance oxidation rates and therefore increase surface etching rates non-uniformly, leading to surface roughening. Another mechanism is related to bubble formation
Fig. 6.6. Surface roughness plotted as a function of NH4OH concentration in a 10-min NH4OH/H2O2 solution treatment at 85°C [41]
that blocks surface reactions, again leading to increases in surface topography. Metals can also become trapped in the oxide formed during the SC-1/SC-2 cleaning process, subsequently leading to leaky junctions and to yield and reliability problems in gate oxides [46, 47]. Ozone oxidation, sometimes with the addition of HCl, avoids metal contamination from the RCA chemicals and has been found to be an excellent way to grow passivating chemical oxides. Ozonated H2O can be produced in extremely high purity and has been shown to result in high-quality gate dielectrics. Another common contaminant on these oxide-covered surfaces is carbon. It is most likely incorporated in or on these surfaces in the form of hydrocarbons and can come from the chemicals, from the H2O used to rinse the wafers, or from the air in the laboratory environment. Trace hydrocarbons have not proven detrimental to gate oxides; the predominant sentiment in the industry is that hydrocarbons are burnt off in the O-rich, high-temperature environment of the oxidation furnace. If handled improperly, however, SiC precipitates can cause weak spots in the oxides being grown [48]. Hydrocarbon contamination is much more of a concern for surface preparation prior to epitaxial growth of Si. In this case, surfaces that are completely free of contamination are needed to grow defect-free Si, and C contamination is of critical concern. The technique of desorbing the oxide at elevated temperature prior to epitaxy was first discussed by Henderson [49], whose results showed that atomically clean surfaces with only a small amount of C residue could be obtained after the RCA standard clean. Ishizaka, Nakagawa, and Shiraki [50] reduced the level of C entrained in the oxide by repetitively immersing the wafers in boiling HNO3 followed by HF, ending with a concentrated SC-2 type of clean (4:1:1 HCl:H2O2:H2O at 90–100°C).
Another efficient technique to remove hydrocarbons is exposure to UV/O3 (ultraviolet/ozone) [51].
Some contaminants, such as S and Cl, can originate directly from the solution used during the chemical oxidation. Sulfur and Cl have been observed after the SPM and SC-2 cleaning solutions, respectively. Fluorine, on the other hand, has been observed when the chemical oxidation is preceded by an HF treatment. In this case, F is found to segregate at the Si/SiO2 interface [52].

6.2.3 Si Cleans Leading to Hydrogen-Terminated Silicon Surfaces

Mechanism of Hydrogen Termination

The original belief that HF etching leads to F termination of the Si was based on the stability of the Si–F bond and on the accepted mechanism of SiO2 dissolution, which would leave F-terminated Si. In its simplest form, the dissolution of SiO2 by HF can be depicted by the following reaction:

SiO2 + 4HF → SiF4 + 2H2O.    (6.1)
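The stoichiometry of (6.1) also shows how little HF the dissolution of a thin chemical oxide actually consumes. A rough sketch, in which the oxide thickness, wafer size, SiO2 density, and molar masses are illustrative standard values rather than numbers from the text:

```python
import math

# Stoichiometry of reaction (6.1): SiO2 + 4 HF -> SiF4 + 2 H2O.
# Illustrative case: a 10 A chemical oxide on a 300 mm wafer.
RHO_SIO2 = 2.2    # g/cm^3, amorphous SiO2 (standard value)
M_SIO2 = 60.08    # g/mol
M_HF = 20.01      # g/mol

thickness_cm = 10e-8                  # 10 Angstrom
area_cm2 = math.pi * 15.0 ** 2        # 300 mm wafer: radius 15 cm
mol_sio2 = thickness_cm * area_cm2 * RHO_SIO2 / M_SIO2
mass_hf_mg = 4 * mol_sio2 * M_HF * 1e3   # 4 HF per SiO2, in mg
print(f"HF consumed: {mass_hf_mg:.2f} mg")  # a fraction of a milligram
```

The sub-milligram result explains why even very dilute HF solutions strip native oxides quickly: the bath is never close to being depleted of HF.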
Notice that the above reaction involves HF molecules, not F− ions in solution. HF is a weak acid with an equilibrium constant such that it does not dissociate readily in concentrated solutions [53]. Moreover, Judge [53] showed that, even when F− ions are available, they give an etching rate that is negligible compared to those of the HF and HF2− species. Thus, only HF in its associated form needs to be considered in the dissolution mechanism. HF molecules attack Si–O bonds by inserting themselves between the Si and O atoms. This reaction is depicted schematically in Fig. 6.7(a), as if it were the last Si–O bond to be broken before reaching the Si substrate. This insertion occurs with a low activation barrier because the reaction is highly exothermic and conserves the number of broken and reformed bonds. The reaction is also greatly facilitated by the highly polar nature of the Si–O bond, which the highly polar HF molecule can exploit during attack: the Coulomb attraction naturally associates the positively charged H with the negatively charged O, and the negatively charged F with the positively charged Si of the Si–O bond. This liberates H2O into the solution and leaves Si–F in its place on the surface (Fig. 6.7(b)). The Si–F bond (∼6 eV) is the strongest single bond known in chemistry. In comparison, the Si–H bond strength is only ∼3.5 eV, so that, on these thermodynamic grounds, the F-terminated surface should be the more stable one. Ubara, Imura, and Hiraki (1984) [54] were the first to recognize that the Si–F bond must be highly polar because of the large electronegativity difference between the two atoms. The Si–F bond therefore polarizes the associated Si–Si back-bond, allowing HF attack of that back-bond, as illustrated in Fig. 6.7(c).
This kinetically favorable pathway results in the release of stable SiFx species into the solution, leaving Si–H behind on the surface, as shown in Fig. 6.7(d). The validity of this proposed pathway was confirmed by first-principles molecular orbital calculations of the activation energies of these reactions on model compounds by Trucks et al. [12]. In these calculations, an activation energy of ∼1.0 eV was found for the reaction shown in Fig. 6.7(c). The reaction barrier is lowered by the charge transfer
Fig. 6.7. Schematic representation of Si etching and H passivation by HF
between the Si and F atoms, as originally suggested by Ubara et al. [54]. In the absence of charge transfer, as is the case for the nonpolar Si–H bond, the activation energy for Si–Si back-bond attack is 1.6 eV, which is 0.6 eV higher than that for the fluorinated Si species. The impact of the Coulomb interaction can also be seen by inverting the HF molecule so that the attack occurs in opposition to the Coulomb force; in that case, an activation energy of 1.4 eV is obtained. In summary, HF attacks polar species very effectively but is much less effective against nonpolar species. The attack requires a specific orientation of the reactant atoms to take advantage of the Coulomb interaction between the positively and negatively charged atoms. These concepts provide a basis for understanding why H (and not F) terminates the Si dangling bonds after HF solution etching, and why HF dissolves oxide so readily yet leaves the Si relatively untouched. The preceding arguments give a basic understanding of HF etching. In reality, however, the situation is much more complex because (a) HF, HF2−, F−, H3O+, OH−, and NH4F species may coexist in solution, in chemical equilibrium with one another; (b) steric constraints can play a role at the surface; and (c) solvation effects can affect reaction kinetics. The calculations mentioned above were performed for molecules in "free space" and thus can only accurately describe gas-phase reactions. In vapor processes, however, H2O vapor is needed to initiate SiO2 etching reactions with anhydrous HF [55]. In general, the main effect of placing the polar HF molecule into H2O is to surround it, on average, with H2O molecules oriented so as to minimize the Coulomb energy. This, in turn, weakens the H–F bond, facilitating all HF reactions that must break that bond.
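Assuming simple Arrhenius behavior with comparable prefactors (an approximation the chapter does not itself invoke), the barrier differences quoted here translate into enormous rate ratios at room temperature:

```python
import math

# Boltzmann-factor sketch of how activation-barrier differences translate
# into relative rates, assuming equal prefactors (an approximation).
KT_300K = 8.617e-5 * 300   # Boltzmann constant (eV/K) times 300 K

def rate_ratio(delta_ea_ev, kT=KT_300K):
    """Rate enhancement for a barrier lowered by delta_ea_ev."""
    return math.exp(delta_ea_ev / kT)

# Polarized vs. nonpolar Si-Si back-bond attack: 1.0 eV vs. 1.6 eV barriers
print(f"0.6 eV lower barrier: x{rate_ratio(0.6):.1e}")
# HF2- vs. HF dissolution of SiO2 in solution: 0.31 eV vs. 0.35 eV [53]
print(f"0.04 eV lower barrier: x{rate_ratio(0.04):.1f}")  # ~4.7
```

The 0.6 eV difference gives a factor of order 10¹⁰, which is why HF attack is effectively confined to polarized bonds, while the 0.04 eV difference between the HF2− and HF barriers for SiO2 dissolution yields a factor of ∼4.7, in line with the measured factor of 4–5 in etch rate.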
Therefore, it is reasonable to assume that solvation simply lowers the activation energy barriers found for the gas-phase reactions described above. For instance, HF dissolution of SiO2 has a measured activation energy of approximately 0.35 eV [53], compared with the 0.55 eV calculated for the gas-phase reaction [12]. The heat of solvation for placing an HF molecule into solution is ∼0.4 eV, which is consistent with the observed 0.2 eV lowering of the energy barrier. This reasoning also rationalizes the HF and HF2− etching behavior observed by Judge (1971) [53]: HF2− can be thought of as a more highly solvated form of HF, whose weaker bond strength explains the lower activation energy for SiO2 dissolution (0.31 eV) as well as the increased rate of dissolution
(factor of 4–5). The role of steric hindrance at the surface is also important. Consequently, the chemical trends discussed above provide a guide but cannot be used to obtain exact activation energies, since modifications due to steric constraints or solvation are to be expected. HF solution chemistry is also affected by the OH− concentration [18]. Experiments show that H2O rinsing alone can remove dihydride species at steps [18, 56], leading to monohydride termination on Si(111) surfaces [56]. This observation is consistent with the fact that samples etched in HF remain hydrophobic even after boiling in H2O for extended periods of time [57]. Similarly, CMP Si wafers (polished in slurries with pH ∼13) are hydrophobic and H-terminated [34]. These observations suggest that Si surface reactions with OH− can also lead to hydrophobic, H-terminated Si surfaces once the surface oxide is removed, and that both HF and OH− chemistries can remove Si atoms bonded to electronegative elements by back-bond attack of the polarized Si–Si bond. It is also interesting to note that HF and OH− in solution may follow similar reaction pathways at the surface.

Structure and Morphology of Hydrogen-Terminated Silicon Surfaces

In 1984, Hahn and Henzler [32] first provided information on the morphology of H-terminated Si(100) and Si(111) using low energy electron diffraction (LEED). Grundner and Schulz [58] then used electron energy loss spectroscopy (EELS) on HF-treated Si(111) and Si(100) to investigate the nature of the H termination. They found that Si(100) was dihydride-terminated (with a characteristic SiH2 scissor vibration at 900 cm−1) and Si(111) was monohydride-terminated. This finding, together with the observation of a LEED pattern [35], led to the conclusion that a uniform dihydride phase was obtained on Si(100).
For HF-etched Si(111) surfaces, the strong Si–H stretch loss at 2080 cm−1, together with the high-quality 1 × 1 LEED pattern [35], suggested that the surface was ideally monohydride-terminated. A weak loss at 900 cm−1 was attributed to dihydrides at steps. Soon thereafter, high-resolution IR reflection absorption spectroscopy revealed that, contrary to the conclusions drawn from EELS, Si(100) and Si(111) surfaces are atomically rough after such etching treatments, as evidenced by complex IR absorption spectra with contributions from mono-, di-, and trihydrides, as schematically shown in Fig. 6.8. The spectra of Si(100) surfaces also show that the HF-etched surfaces are much more complex than atomically flat, H-terminated surfaces. Structural information was extracted by providing complete assignments of the observed bands, using isotopic substitution experiments combined with force-constant normal-mode analyses on model compounds [59]. In summary, etching in dilute HF leads to atomically rough surfaces. Mono-, di-, and trihydrides coexist on both Si(100) and Si(111) surfaces. STM images of Si(111) [60] show structures 10–20 Å in diameter and 3 Å in height, accounting for about 50% of the surface (i.e., ∼50% remains monohydride-terminated), consistent with the IR data.
Fig. 6.8. Schematic representation of possible surface structures on the Si(111) surface with their associated H termination. The ideal monohydride and trihydride terminations are possible for an atomically flat (111) plane. The "horizontal" dihydride (D) terminates the corner of a small adstructure where an isolated monohydride (M′) may exist. Both the "vertical" dihydride (D′) and the coupled monohydride (M) can terminate larger structures of the type shown here. These are all the possible structures that do not involve surface reconstruction [59]
Si(100) Etched in Buffered HF Solutions

Buffered HF (BHF) is composed of various mixtures of 50 wt% HF in H2O and 40 wt% NH4F in H2O. A common mixture used in the industry is 7:1 buffered HF, which has a pH of 4.5 and is composed of 7 parts NH4F and 1 part HF. The main difference between aqueous HF and buffered HF is the solution pH, which is the subject of the following discussion. Raising the pH of the HF solution increases the etch rate of H-terminated Si surfaces. Infrared absorption data clearly show that the morphology of chemically prepared Si(100) surfaces changes as the pH of the etching solution varies from 2 to 8 [61]. At pH = 2, the IR absorption spectra are dominated by dihydrides, with other hydrides present, consistent with an atomically rough surface. In buffered HF (pH ∼5), the spectrum sharpens and is dominated by coupled monohydrides, suggesting the formation of (111) microfacets on the Si(100) surface [18, 61]. At higher pH values, etching proceeds quickly, as evidenced by gas bubbles forming at the sample surface. After etching in an NH4F solution (pH = 7.8), dihydride contributions are again dominant. However, the polarization of this mode is quite different from that in the pH = 2 spectra: the symmetric stretch (2105 cm−1) is polarized normal to the surface, and the antisymmetric stretch (2112 cm−1) is polarized parallel to it. Although these polarizations would be correct for terrace dihydrides, these surfaces are not believed to be atomically flat because of strong spectral contributions from mono- and trihydrides. Furthermore, the monohydride spectrum is now centered at ∼2085 cm−1, indicating the growth of (111) facets. After several etching cycles, the evolution of the IR spectra suggests that (111) facets develop in solutions of high pH. Both isolated and coupled monohydrides
Fig. 6.9. Atomic force microscope images of (a) a CMP Si(100) control wafer (∼2 Å RMS) and (b) a Si(100) wafer etched in a 7:1 buffered HF solution for 10 min (∼5 Å RMS) [J. Sapjeta, unpublished]
have symmetric stretches pointing away from the normal of the macroscopic surface plane. In contrast, the dihydride modes are characteristic of dihydrides with their axes pointing along the surface normal. The simplest atomic arrangement consistent with these observations is a distribution of tent-like structures with a row of dihydrides at the rooftop, (111) facets terminated with ideal monohydrides on the sides, and coupled monohydrides at the periphery of the facets. Since the facets are small, the concentration of coupled monohydrides is as high as that of ideal monohydrides. The use of buffered HF may therefore be ill-advised for preparing atomically flat (100) surfaces, since (111) facets develop upon etching. Increased surface roughness has been directly observed after buffered HF etching using atomic force microscopy (AFM) [41]. As shown in Fig. 6.9, a control wafer is relatively smooth, with ∼2 Å RMS roughness, whereas a wafer treated in buffered HF is characterized by ∼5 Å RMS roughness. To improve the atomic flatness of Si(100) surfaces, a thermal oxidation treatment is highly desirable, since it is known to result in high-quality Si/SiO2 interfaces. In summary, Si(100) surfaces are microscopically rough when treated in either dilute or concentrated HF, and are macroscopically roughened by buffered HF solutions due to (111) facet formation. To date, little is known about the nature of such surfaces and their impact on IC device performance. The potential impact on the quality of interfaces formed during further processing will motivate future work in this area.

Si(111) Etched in Buffered HF Solutions

For Si(111) surfaces, increasing the pH of the solution makes it possible to flatten the surface on an atomic scale, thanks to preferential etching of the H-terminated surfaces [18], as reflected in the IR absorption spectra. For instance, Fig.
6.10 shows the difference between Si(111) surfaces etched in dilute HF and in a buffered (pH ∼8) HF solution (i.e., 40 wt% NH4F). While the dilute-HF-etched surface is atomically rough (Fig. 6.10(a)), with all forms of hydrides present, the surface etched in a 40 wt% NH4F
Fig. 6.10. P-polarized IR absorption spectra of Si(111) after (a) etching in dilute HF (pH = 2), and (b) a 40 wt% NH4F solution (pH = 7.8) [61, 65, 66]
solution is characterized by a single sharp infrared absorption line at 2083.7 cm−1, polarized perpendicular to the surface (Fig. 6.10(b)). The obvious implication is that atomically flat surfaces have been obtained with ideal monohydride termination. The measured linewidth, Δν ≈ 0.9 cm−1, is the narrowest line ever measured for a chemisorbed atom or molecule on a surface at room temperature [18]. A substantial contribution to the width is thermal broadening: low-temperature measurements have shown that most of the linewidth measured at room temperature is thermally induced, due to anharmonic coupling of the Si–H stretch mode to surface Si phonons [62]. At present, the best samples are characterized by an extremely small (0.05 cm−1) inhomogeneous broadening [63, 64]. The LEED patterns obtained after careful introduction into UHV show a 1 × 1 pattern with resolution-limited integral-order spots and a background below the detection limit of conventional LEED systems (Fig. 6.11). This unreconstructed and ideally H-terminated surface is often referred to as H/Si(111) (1 × 1). STM images [67, 68], such as the one shown in Fig. 6.12, have confirmed that the surface is nearly contamination free (